Patents by Inventor Gautham J. Mysore

Gautham J. Mysore has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11138989
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for sound quality prediction and real-time feedback about sound quality, such as room acoustics quality and background noise. Audio data can be sampled from a live sound source and stored in an audio buffer. The audio data in the buffer is analyzed to calculate a stream of values of one or more sound quality measures, such as speech transmission index and signal-to-noise ratio. Speech transmission index can be calculated using a convolutional neural network configured to predict speech transmission index from reverberant speech. The stream of values can be used to provide real-time feedback about sound quality of the audio data. For example, a visual indicator on a graphical user interface can be updated based on consistency of the values over time. The real-time feedback about sound quality can help users optimize their recording setup.
    Type: Grant
    Filed: March 7, 2019
    Date of Patent: October 5, 2021
    Assignee: Adobe Inc.
    Inventors: Prem Seetharaman, Gautham J. Mysore, Bryan A. Pardo
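As an illustration of the feedback loop described in this abstract (a minimal sketch, not code from the patent; the class and parameter names are invented for this example), a monitor can keep a short stream of recent sound-quality values, such as SNR in dB, and report consistency only once the values stay within a tolerance:

```python
import math
from collections import deque

def snr_db(signal, noise):
    """Signal-to-noise ratio in dB from two lists of samples."""
    p_sig = sum(x * x for x in signal) / len(signal)
    p_noise = sum(x * x for x in noise) / len(noise)
    return 10.0 * math.log10(p_sig / p_noise)

class QualityMonitor:
    """Keeps a stream of recent sound-quality values and reports
    whether they are consistent enough to drive a visual indicator."""
    def __init__(self, window=10, tolerance_db=3.0):
        self.values = deque(maxlen=window)
        self.tolerance_db = tolerance_db

    def update(self, value_db):
        self.values.append(value_db)
        return self.is_consistent()

    def is_consistent(self):
        # Only trust the stream once the window is full and stable.
        if len(self.values) < self.values.maxlen:
            return False
        return max(self.values) - min(self.values) <= self.tolerance_db
```

A UI could call `update()` once per buffer and turn an indicator green when it returns True.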
  • Publication number: 20200286504
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for sound quality prediction and real-time feedback about sound quality, such as room acoustics quality and background noise. Audio data can be sampled from a live sound source and stored in an audio buffer. The audio data in the buffer is analyzed to calculate a stream of values of one or more sound quality measures, such as speech transmission index and signal-to-noise ratio. Speech transmission index can be calculated using a convolutional neural network configured to predict speech transmission index from reverberant speech. The stream of values can be used to provide real-time feedback about sound quality of the audio data. For example, a visual indicator on a graphical user interface can be updated based on consistency of the values over time. The real-time feedback about sound quality can help users optimize their recording setup.
    Type: Application
    Filed: March 7, 2019
    Publication date: September 10, 2020
    Inventors: Prem Seetharaman, Gautham J. Mysore, Bryan A. Pardo
  • Patent number: 10770063
Abstract: Techniques for a recursive deep-learning approach to speech synthesis using a repeatable structure that splits an input tensor into a left half and a right half, similar to the operation of the Fast Fourier Transform, performs a 1-D convolution on each half, sums the results, and then applies a post-processing function. The repeatable structure may be utilized in a series configuration to operate as a vocoder or to perform other speech processing functions.
    Type: Grant
    Filed: August 22, 2018
    Date of Patent: September 8, 2020
    Assignees: Adobe Inc., The Trustees of Princeton University
    Inventors: Zeyu Jin, Gautham J. Mysore, Jingwan Lu, Adam Finkelstein
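The repeatable split/convolve/sum/post-process structure described above can be sketched in plain Python (an illustrative toy, not the patented implementation; kernels, the ReLU post-function, and the halving behavior are assumptions for the example). Each block halves the length of its input, so blocks can be chained in series as the abstract suggests:

```python
def conv1d(x, kernel):
    """Same-length 1-D convolution with zero padding (stride 1)."""
    k = len(kernel)
    pad = k // 2
    padded = [0.0] * pad + list(x) + [0.0] * pad
    return [sum(kernel[j] * padded[i + j] for j in range(k))
            for i in range(len(x))]

def fft_like_block(x, kernel_left, kernel_right, post=lambda v: max(v, 0.0)):
    """Split the input into left/right halves (as the FFT does),
    convolve each half with its own kernel, sum the results,
    then apply a post-processing nonlinearity (here a ReLU)."""
    half = len(x) // 2
    left = conv1d(x[:half], kernel_left)
    right = conv1d(x[half:], kernel_right)
    return [post(a + b) for a, b in zip(left, right)]
```

Because the output is half the input length, log2(N) such blocks in series reduce an N-sample frame to a single value, mirroring the FFT's recursion depth.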
  • Patent number: 10638221
    Abstract: Time interval sound alignment techniques are described. In one or more implementations, one or more inputs are received via interaction with a user interface that indicate that a first time interval in a first representation of sound data generated from a first sound signal corresponds to a second time interval in a second representation of sound data generated from a second sound signal. A stretch value is calculated based on an amount of time represented in the first and second time intervals, respectively. Aligned sound data is generated from the sound data for the first and second time intervals based on the calculated stretch value.
    Type: Grant
    Filed: November 13, 2012
    Date of Patent: April 28, 2020
    Assignee: Adobe Inc.
    Inventors: Brian John King, Gautham J. Mysore, Paris Smaragdis
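The stretch value described in this abstract is, at its core, a ratio of the two marked interval durations. A minimal sketch (function name and interval representation are illustrative, not from the patent):

```python
def stretch_value(interval_a, interval_b):
    """Ratio of the durations of two user-marked time intervals.
    Time-stretching interval_b's sound data by this factor makes it
    the same duration as interval_a, aligning the two intervals."""
    (a_start, a_end), (b_start, b_end) = interval_a, interval_b
    return (a_end - a_start) / (b_end - b_start)
```

For example, aligning a 4-second interval to a 2-second one yields a stretch factor of 0.5, i.e. the second clip is played at double speed over that interval.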
  • Patent number: 10460763
    Abstract: Methods and systems for automatic audio loop generation from an audio track identify suitable portions of the audio track for generating audio loops. One or more embodiments identify portions of the audio track that include a beginning beat and an ending beat that have similar audio features that provide for seamless transitions when generating the audio loops. One or more embodiments generate scores for the portions based on the similarity of the audio features of the corresponding beginning and ending beats. Additionally, one or more embodiments use the generated scores to determine whether each portion is a suitable audio loop candidate. One or more embodiments then generate one or more audio loops using one or more suitable portions of the audio track.
    Type: Grant
    Filed: April 26, 2017
    Date of Patent: October 29, 2019
    Assignee: Adobe Inc.
    Inventors: Zhengshan Shi, Gautham J. Mysore
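The scoring step in this abstract can be illustrated with a small sketch (assumptions: cosine similarity as the feature-similarity measure and a fixed threshold; the patent does not specify these choices). Each candidate loop is a pair of beats whose beginning and ending feature vectors are similar enough for a seamless transition:

```python
import math

def cosine_similarity(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def score_loop_candidates(beat_features, min_beats=4, threshold=0.9):
    """Score every (start, end) beat pair by how similar the beginning
    and ending beats' feature vectors are; keep pairs above threshold,
    best-scoring first."""
    candidates = []
    n = len(beat_features)
    for start in range(n):
        for end in range(start + min_beats, n):
            score = cosine_similarity(beat_features[start],
                                      beat_features[end])
            if score >= threshold:
                candidates.append((start, end, score))
    return sorted(candidates, key=lambda c: c[2], reverse=True)
```

The surviving (start, end) pairs are the portions of the track from which loops would then be rendered.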
  • Publication number: 20190318726
Abstract: Techniques for a recursive deep-learning approach to speech synthesis using a repeatable structure that splits an input tensor into a left half and a right half, similar to the operation of the Fast Fourier Transform, performs a 1-D convolution on each half, sums the results, and then applies a post-processing function. The repeatable structure may be utilized in a series configuration to operate as a vocoder or to perform other speech processing functions.
    Type: Application
    Filed: August 22, 2018
    Publication date: October 17, 2019
    Applicants: Adobe Inc., The Trustees of Princeton University
    Inventors: Zeyu Jin, Gautham J. Mysore, Jingwan Lu, Adam Finkelstein
  • Patent number: 10347238
Abstract: Systems and techniques are disclosed for synthesizing a new word or short phrase such that it blends seamlessly in the context of insertion or replacement in an existing narration. In one such embodiment, a text-to-speech synthesizer is utilized to say the word or phrase in a generic voice. Voice conversion is then performed on the generic voice to convert it into a voice that matches the narration. An editor and interface are described that support fully automatic synthesis, selection among a candidate set of alternative pronunciations, fine control over edit placements and pitch profiles, and guidance by the editor's own voice.
    Type: Grant
    Filed: October 27, 2017
    Date of Patent: July 9, 2019
    Assignees: Adobe Inc., The Trustees of Princeton University
    Inventors: Zeyu Jin, Gautham J. Mysore, Stephen DiVerdi, Jingwan Lu, Adam Finkelstein
  • Publication number: 20190130894
Abstract: Systems and techniques are disclosed for synthesizing a new word or short phrase such that it blends seamlessly in the context of insertion or replacement in an existing narration. In one such embodiment, a text-to-speech synthesizer is utilized to say the word or phrase in a generic voice. Voice conversion is then performed on the generic voice to convert it into a voice that matches the narration. An editor and interface are described that support fully automatic synthesis, selection among a candidate set of alternative pronunciations, fine control over edit placements and pitch profiles, and guidance by the editor's own voice.
    Type: Application
    Filed: October 27, 2017
    Publication date: May 2, 2019
    Applicants: Adobe Inc., The Trustees of Princeton University
    Inventors: Zeyu Jin, Gautham J. Mysore, Stephen DiVerdi, Jingwan Lu, Adam Finkelstein
  • Patent number: 10262680
Abstract: Variable sound decomposition masking techniques are described. In one or more implementations, a mask is generated that incorporates a user input; the user input defines, at least in part, a threshold that is variable and configured for use in performing a sound decomposition process. The sound decomposition process is performed using the mask to assign portions of sound data to respective ones of a plurality of sources of the sound data.
    Type: Grant
    Filed: June 28, 2013
    Date of Patent: April 16, 2019
    Assignee: Adobe Inc.
    Inventors: Gautham J. Mysore, Paris Smaragdis
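A user-controlled variable threshold of the kind this abstract describes can be sketched as a per-bin soft mask (illustrative only; the magnitude-ratio mask and the zeroing rule are assumptions, not the patented formulation). Bins where the target source's share of the energy falls below the threshold are excluded from the target:

```python
def variable_soft_mask(target_mag, other_mag, threshold):
    """Soft mask per time-frequency bin: the target source's share of
    the total magnitude, zeroed where it falls below the user-controlled
    threshold (0.0 keeps every bin, 1.0 keeps almost none)."""
    mask = []
    for t, o in zip(target_mag, other_mag):
        ratio = t / (t + o) if (t + o) > 0 else 0.0
        mask.append(ratio if ratio >= threshold else 0.0)
    return mask

def apply_mask(mixture_mag, mask):
    """Assign masked portions of the mixture to the target source."""
    return [m * w for m, w in zip(mixture_mag, mask)]
```

Dragging a threshold slider then directly trades off how aggressively bins are assigned to the target source versus the rest of the mixture.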
  • Patent number: 10249321
    Abstract: Sound rate modification techniques are described. In one or more implementations, an indication is received of an amount that a rate of output of sound data is to be modified. One or more sound rate rules are applied to the sound data that, along with the received indication, are usable to calculate different rates at which different portions of the sound data are to be modified, respectively. The sound data is then output such that the calculated rates are applied.
    Type: Grant
    Filed: November 20, 2012
    Date of Patent: April 2, 2019
    Assignee: Adobe Inc.
    Inventors: Brian John King, Gautham J. Mysore, Paris Smaragdis
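The rule-based per-portion rates in this abstract can be sketched as follows (a toy example; the portion labels, sensitivity values, and linear interpolation rule are all assumptions made for illustration). Silence tolerates more rate change than speech, so each portion gets its own rate derived from the user's overall request:

```python
def portion_rates(portions, global_rate):
    """Given labeled portions of sound data and a user-requested overall
    rate change, apply simple rules: silence may be stretched freely,
    speech less aggressively so it stays intelligible."""
    rules = {"silence": 1.0, "speech": 0.5, "music": 0.75}  # sensitivity
    rates = []
    for label, duration in portions:
        sensitivity = rules.get(label, 1.0)
        # Interpolate between no change (1.0) and the requested rate.
        rates.append((label, 1.0 + sensitivity * (global_rate - 1.0)))
    return rates
```

Output is then rendered by time-stretching each portion at its calculated rate rather than applying one uniform rate everywhere.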
  • Patent number: 10176818
    Abstract: Sound processing using a product-of-filters model is described. In one or more implementations, a model is formed by one or more computing devices for a time frame of sound data as a product of filters. The model is utilized by the one or more computing devices to perform one or more sound processing techniques on the time frame of the sound data.
    Type: Grant
    Filed: November 15, 2013
    Date of Patent: January 8, 2019
    Assignee: Adobe Inc.
    Inventors: Dawen Liang, Matthew Douglas Hoffman, Gautham J. Mysore
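The product-of-filters idea can be made concrete with a short sketch (illustrative; the per-filter activation weights are an assumption of this example): a frame's magnitude spectrum is modeled as an elementwise product of filters, which becomes a weighted sum in the log domain.

```python
import math

def product_of_filters(filters, activations):
    """Model one frame's magnitude spectrum as an elementwise product
    of filters, each raised to a per-frame activation weight.
    In the log domain the product becomes a weighted sum."""
    n_bins = len(filters[0])
    log_spec = [0.0] * n_bins
    for filt, a in zip(filters, activations):
        for k in range(n_bins):
            log_spec[k] += a * math.log(filt[k])
    return [math.exp(v) for v in log_spec]
```

Because the model is additive in the log domain, fitting filters to data reduces to a linear-style decomposition of log-spectra, which is what makes the model convenient for sound processing tasks.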
  • Publication number: 20180315452
    Abstract: Methods and systems for automatic audio loop generation from an audio track identify suitable portions of the audio track for generating audio loops. One or more embodiments identify portions of the audio track that include a beginning beat and an ending beat that have similar audio features that provide for seamless transitions when generating the audio loops. One or more embodiments generate scores for the portions based on the similarity of the audio features of the corresponding beginning and ending beats. Additionally, one or more embodiments use the generated scores to determine whether each portion is a suitable audio loop candidate. One or more embodiments then generate one or more audio loops using one or more suitable portions of the audio track.
    Type: Application
    Filed: April 26, 2017
    Publication date: November 1, 2018
    Inventors: Zhengshan Shi, Gautham J. Mysore
  • Patent number: 10002622
    Abstract: Pattern identification using convolution is described. In one or more implementations, a representation of a pattern is obtained that is described using data points that include frequency coordinates, time coordinates, and energy values. An identification is made as to whether sound data described using irregularly positioned data points includes the pattern, the identifying including use of a convolution of the frequency or time coordinates to determine correspondence with the representation of the pattern.
    Type: Grant
    Filed: November 20, 2013
    Date of Patent: June 19, 2018
    Assignee: Adobe Systems Incorporated
    Inventors: Minje Kim, Paris Smaragdis, Gautham J. Mysore
  • Patent number: 9966088
    Abstract: Online source separation may include receiving a sound mixture that includes first audio data from a first source and second audio data from a second source. Online source separation may further include receiving pre-computed reference data corresponding to the first source. Online source separation may also include performing online separation of the second audio data from the first audio data based on the pre-computed reference data.
    Type: Grant
    Filed: December 22, 2011
    Date of Patent: May 8, 2018
Assignee: Adobe Systems Incorporated
    Inventors: Gautham J. Mysore, Paris Smaragdis, Zhiyao Duan
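The role of the pre-computed reference data can be illustrated with a heavily simplified sketch (this projection-and-subtraction scheme is an assumption for the example, not the patented method): the known source's pre-computed spectrum is scaled to fit each incoming frame and subtracted, leaving the second source, frame by frame, without waiting for the whole recording.

```python
def separate_online(mixture_frames, reference_spectrum):
    """For each incoming frame, estimate the known source's gain by
    projecting the frame onto the pre-computed reference spectrum,
    then subtract that contribution to recover the second source."""
    ref_energy = sum(r * r for r in reference_spectrum)
    separated = []
    for frame in mixture_frames:
        gain = sum(f * r for f, r in zip(frame, reference_spectrum))
        gain = max(gain / ref_energy, 0.0)
        # Keep only the nonnegative residual as the second source.
        residual = [max(f - gain * r, 0.0)
                    for f, r in zip(frame, reference_spectrum)]
        separated.append(residual)
    return separated
```

Because each frame is processed as it arrives, this style of separation can run online rather than in batch.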
  • Patent number: 9866954
    Abstract: Performance metric based stopping criteria for iterative algorithm techniques are described. In one or more implementations, a training dataset is processed by one or more computing devices using an iterative algorithm having a cost function. The processing includes, for a plurality of iterations of the iterative algorithm, computing a cost for the iterative algorithm using the cost function and a value for each of a plurality of performance metrics that are usable to infer accuracy of the iterative algorithm for a respective one of the iterations. Responsive to the processing, a particular one of the plurality of iterations is identified as a stopping criterion based at least in part on the computed values for the plurality of performance metrics and the stopping criterion is output to configure the iterative algorithm to use the stopping criterion for subsequent processing of data by the iterative algorithm.
    Type: Grant
    Filed: July 7, 2014
    Date of Patent: January 9, 2018
Assignee: Adobe Systems Incorporated
    Inventors: Francois G. Germain, Gautham J. Mysore
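The selection step in this abstract can be sketched simply (assumptions: higher metric values are better and metrics are combined by a plain average; the patent does not commit to either choice). The stopping criterion is the iteration whose performance metrics peak, which may come well before the cost function itself has converged:

```python
def pick_stopping_iteration(metric_history):
    """metric_history: one dict per iteration mapping metric name to
    value (higher = better). Returns the index of the iteration whose
    average metric value is highest, to be used as the stopping
    criterion for subsequent runs of the iterative algorithm."""
    def avg(metrics):
        return sum(metrics.values()) / len(metrics)
    return max(range(len(metric_history)),
               key=lambda i: avg(metric_history[i]))
```

That index is then fixed as the iteration count when the algorithm is rerun on new data.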
  • Patent number: 9852743
    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media directed towards automatic emphasis of spoken words. In one embodiment, a process may begin by identifying, within an audio recording, a word that is to be emphasized. Once identified, contextual and lexical information relating to the emphasized word can be extracted from the audio recording. This contextual and lexical information can be utilized in conjunction with a predictive model to determine a set of emphasis parameters for the identified word. These emphasis parameters can then be applied to the identified word to cause the word to be emphasized. Other embodiments may be described and/or claimed.
    Type: Grant
    Filed: November 20, 2015
    Date of Patent: December 26, 2017
    Assignee: Adobe Systems Incorporated
    Inventors: Yang Zhang, Gautham J. Mysore, Floraine Berthouzoz
  • Patent number: 9721202
    Abstract: Sound processing techniques using recurrent neural networks are described. In one or more implementations, temporal dependencies are captured in sound data that are modeled through use of a recurrent neural network (RNN). The captured temporal dependencies are employed as part of feature extraction performed using nonnegative matrix factorization (NMF). One or more sound processing techniques are performed on the sound data based at least in part on the feature extraction.
    Type: Grant
    Filed: February 21, 2014
    Date of Patent: August 1, 2017
    Assignee: Adobe Systems Incorporated
    Inventors: Nicolas Maurice Boulanger-Lewandowski, Gautham J. Mysore, Matthew Douglas Hoffman
  • Publication number: 20170148464
    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media directed towards automatic emphasis of spoken words. In one embodiment, a process may begin by identifying, within an audio recording, a word that is to be emphasized. Once identified, contextual and lexical information relating to the emphasized word can be extracted from the audio recording. This contextual and lexical information can be utilized in conjunction with a predictive model to determine a set of emphasis parameters for the identified word. These emphasis parameters can then be applied to the identified word to cause the word to be emphasized. Other embodiments may be described and/or claimed.
    Type: Application
    Filed: November 20, 2015
    Publication date: May 25, 2017
    Inventors: Yang Zhang, Gautham J. Mysore, Floraine Berthouzoz
  • Patent number: 9607627
    Abstract: Sound enhancement techniques through dereverberation are described. In one or more implementations, a method is described of enhancing sound data through removal of reverberation from the sound data by one or more computing devices. The method includes obtaining a model that describes primary sound data that is to be utilized as a prior that assumes no prior knowledge about specifics of the sound data from which the reverberation is to be removed. A reverberation kernel is computed having parameters that, when applied to the model that describes the primary sound data, corresponds to the sound data from which the reverberation is to be removed. The reverberation is removed from the sound data using the reverberation kernel.
    Type: Grant
    Filed: February 5, 2015
    Date of Patent: March 28, 2017
    Assignee: Adobe Systems Incorporated
    Inventors: Dawen Liang, Matthew Douglas Hoffman, Gautham J. Mysore
  • Patent number: 9601124
    Abstract: Acoustic matching and splicing of sound tracks is described. In one or more implementations, a method to acoustically match and splice first and second sound tracks by one or more computing devices is described. The method includes source separating the first and second sound tracks into first track primary and background sound data and second track primary and background sound data. Features extracted from the first and second primary sound data are matched, one to another, to generate first and second primary matching masks. Features extracted from the first and second background sound data are matched, one to another, to generate first and second background matching masks, which are applied to respective separated sound data. The applied first track primary and background sound data and the applied second track primary and background sound data are spliced to generate a spliced sound track.
    Type: Grant
    Filed: January 7, 2015
    Date of Patent: March 21, 2017
    Assignee: Adobe Systems Incorporated
    Inventors: François G. Germain, Gautham J. Mysore