Patents by Inventor Gautham J. Mysore
Gautham J. Mysore has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11138989
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for sound quality prediction and real-time feedback about sound quality, such as room acoustics quality and background noise. Audio data can be sampled from a live sound source and stored in an audio buffer. The audio data in the buffer is analyzed to calculate a stream of values of one or more sound quality measures, such as speech transmission index and signal-to-noise ratio. Speech transmission index can be calculated using a convolutional neural network configured to predict speech transmission index from reverberant speech. The stream of values can be used to provide real-time feedback about sound quality of the audio data. For example, a visual indicator on a graphical user interface can be updated based on consistency of the values over time. The real-time feedback about sound quality can help users optimize their recording setup.
Type: Grant
Filed: March 7, 2019
Date of Patent: October 5, 2021
Assignee: Adobe Inc.
Inventors: Prem Seetharaman, Gautham J. Mysore, Bryan A. Pardo
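As a rough illustration of the feedback loop this abstract describes, here is a minimal Python sketch. The `snr_db` estimate, the consistency window, and the tolerance are assumptions for illustration only; the patent's speech transmission index predictor is a trained convolutional network and is not reproduced here.

```python
import numpy as np

def snr_db(signal, noise):
    """Estimate signal-to-noise ratio in dB from a speech segment
    and a noise-only segment of the audio buffer."""
    p_signal = np.mean(signal ** 2)
    p_noise = np.mean(noise ** 2)
    return 10.0 * np.log10(p_signal / p_noise)

def feedback_indicator(values, window=5, tolerance=3.0):
    """Map a stream of sound-quality values to a coarse indicator:
    'stable' when recent values are consistent, 'unstable' otherwise."""
    recent = values[-window:]
    return "stable" if (max(recent) - min(recent)) <= tolerance else "unstable"

rng = np.random.default_rng(0)
speech = rng.normal(0.0, 1.0, 16000)   # stand-in for buffered speech
noise = rng.normal(0.0, 0.1, 16000)    # stand-in for background noise
stream = [snr_db(speech, noise) for _ in range(6)]
print(feedback_indicator(stream))       # identical values -> "stable"
```

In a real recording assistant, `stream` would be refilled continuously from the live audio buffer and the indicator redrawn on each update.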
-
Publication number: 20200286504
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media for sound quality prediction and real-time feedback about sound quality, such as room acoustics quality and background noise. Audio data can be sampled from a live sound source and stored in an audio buffer. The audio data in the buffer is analyzed to calculate a stream of values of one or more sound quality measures, such as speech transmission index and signal-to-noise ratio. Speech transmission index can be calculated using a convolutional neural network configured to predict speech transmission index from reverberant speech. The stream of values can be used to provide real-time feedback about sound quality of the audio data. For example, a visual indicator on a graphical user interface can be updated based on consistency of the values over time. The real-time feedback about sound quality can help users optimize their recording setup.
Type: Application
Filed: March 7, 2019
Publication date: September 10, 2020
Inventors: Prem Seetharaman, Gautham J. Mysore, Bryan A. Pardo
-
Patent number: 10770063
Abstract: Techniques for a recursive deep-learning approach for performing speech synthesis using a repeatable structure that splits an input tensor into a left half and right half similar to the operation of the Fast Fourier Transform, performs a 1-D convolution on each respective half, performs a summation and then applies a post-processing function. The repeatable structure may be utilized in a series configuration to operate as a vocoder or perform other speech processing functions.
Type: Grant
Filed: August 22, 2018
Date of Patent: September 8, 2020
Assignees: Adobe Inc., The Trustees of Princeton University
Inventors: Zeyu Jin, Gautham J. Mysore, Jingwan Lu, Adam Finkelstein
-
Patent number: 10638221
Abstract: Time interval sound alignment techniques are described. In one or more implementations, one or more inputs are received via interaction with a user interface that indicate that a first time interval in a first representation of sound data generated from a first sound signal corresponds to a second time interval in a second representation of sound data generated from a second sound signal. A stretch value is calculated based on an amount of time represented in the first and second time intervals, respectively. Aligned sound data is generated from the sound data for the first and second time intervals based on the calculated stretch value.
Type: Grant
Filed: November 13, 2012
Date of Patent: April 28, 2020
Assignee: Adobe Inc.
Inventors: Brian John King, Gautham J. Mysore, Paris Smaragdis
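A minimal sketch of the stretch-value idea: the factor is the ratio of the two marked interval durations, and the first interval's samples are stretched by that factor. The linear-interpolation stretch below is an assumption for illustration; a production tool would use a pitch-preserving method such as a phase vocoder.

```python
import numpy as np

def stretch_value(interval_a, interval_b):
    """Stretch factor from the durations (in seconds) of two user-marked
    intervals: how much the first must be stretched to match the second."""
    (a_start, a_end), (b_start, b_end) = interval_a, interval_b
    return (b_end - b_start) / (a_end - a_start)

def stretch_samples(samples, factor):
    """Naive time stretch by linear interpolation (changes pitch;
    shown only to make the stretch factor concrete)."""
    n_out = int(round(len(samples) * factor))
    src = np.linspace(0, len(samples) - 1, n_out)
    return np.interp(src, np.arange(len(samples)), samples)

factor = stretch_value((0.0, 1.0), (0.0, 1.5))
aligned = stretch_samples(np.sin(np.linspace(0, 2 * np.pi, 100)), factor)
print(factor, len(aligned))   # 1.5 150
```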
-
Patent number: 10460763
Abstract: Methods and systems for automatic audio loop generation from an audio track identify suitable portions of the audio track for generating audio loops. One or more embodiments identify portions of the audio track that include a beginning beat and an ending beat that have similar audio features that provide for seamless transitions when generating the audio loops. One or more embodiments generate scores for the portions based on the similarity of the audio features of the corresponding beginning and ending beats. Additionally, one or more embodiments use the generated scores to determine whether each portion is a suitable audio loop candidate. One or more embodiments then generate one or more audio loops using one or more suitable portions of the audio track.
Type: Grant
Filed: April 26, 2017
Date of Patent: October 29, 2019
Assignee: Adobe Inc.
Inventors: Zhengshan Shi, Gautham J. Mysore
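The scoring step can be illustrated with a toy version: compare the feature vectors of each candidate's beginning and ending beats and keep high-similarity pairs. Cosine similarity, the minimum loop length, and the threshold below are assumptions for illustration, not the patented scoring function.

```python
import numpy as np

def loop_score(begin_feat, end_feat):
    """Cosine similarity between a candidate loop's beginning-beat and
    ending-beat features; higher suggests a more seamless transition."""
    num = float(np.dot(begin_feat, end_feat))
    den = float(np.linalg.norm(begin_feat) * np.linalg.norm(end_feat))
    return num / den

def loop_candidates(beat_feats, min_len=4, threshold=0.9):
    """Score every (start, end) beat pair at least min_len beats apart and
    keep pairs whose boundary beats are similar enough to loop."""
    out = []
    for i in range(len(beat_feats)):
        for j in range(i + min_len, len(beat_feats)):
            s = loop_score(beat_feats[i], beat_feats[j])
            if s >= threshold:
                out.append((i, j, s))
    return sorted(out, key=lambda t: -t[2])

# Toy 2-D "audio features" per beat; beats 0 and 4 are nearly identical.
feats = np.array([[1, 0], [0, 1], [0.7, 0.7], [0.2, 0.9], [1, 0.05]])
print(loop_candidates(feats))
```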
-
Publication number: 20190318726
Abstract: Techniques for a recursive deep-learning approach for performing speech synthesis using a repeatable structure that splits an input tensor into a left half and right half similar to the operation of the Fast Fourier Transform, performs a 1-D convolution on each respective half, performs a summation and then applies a post-processing function. The repeatable structure may be utilized in a series configuration to operate as a vocoder or perform other speech processing functions.
Type: Application
Filed: August 22, 2018
Publication date: October 17, 2019
Applicants: Adobe Inc., The Trustees of Princeton University
Inventors: Zeyu Jin, Gautham J. Mysore, Jingwan Lu, Adam Finkelstein
-
Patent number: 10347238
Abstract: Systems and techniques are disclosed for synthesizing a new word or short phrase such that it blends seamlessly in the context of insertion or replacement in an existing narration. In one such embodiment, a text-to-speech synthesizer is utilized to say the word or phrase in a generic voice. Voice conversion is then performed on the generic voice to convert it into a voice that matches the narration. An editor and interface are described that support fully automatic synthesis, selection among a candidate set of alternative pronunciations, fine control over edit placements and pitch profiles, and guidance by the editor's own voice.
Type: Grant
Filed: October 27, 2017
Date of Patent: July 9, 2019
Assignees: Adobe Inc., The Trustees of Princeton University
Inventors: Zeyu Jin, Gautham J. Mysore, Stephen DiVerdi, Jingwan Lu, Adam Finkelstein
-
Publication number: 20190130894
Abstract: Systems and techniques are disclosed for synthesizing a new word or short phrase such that it blends seamlessly in the context of insertion or replacement in an existing narration. In one such embodiment, a text-to-speech synthesizer is utilized to say the word or phrase in a generic voice. Voice conversion is then performed on the generic voice to convert it into a voice that matches the narration. An editor and interface are described that support fully automatic synthesis, selection among a candidate set of alternative pronunciations, fine control over edit placements and pitch profiles, and guidance by the editor's own voice.
Type: Application
Filed: October 27, 2017
Publication date: May 2, 2019
Applicants: Adobe Inc., The Trustees of Princeton University
Inventors: Zeyu Jin, Gautham J. Mysore, Stephen DiVerdi, Jingwan Lu, Adam Finkelstein
-
Patent number: 10262680
Abstract: Variable sound decomposition masking techniques are described. In one or more implementations, a mask is generated that incorporates a user input. The user input is usable, at least in part, to define a threshold that is variable based on the user input and configured for use in performing a sound decomposition process. The sound decomposition process is performed using the mask to assign portions of sound data to respective ones of a plurality of sources of the sound data.
Type: Grant
Filed: June 28, 2013
Date of Patent: April 16, 2019
Assignee: Adobe Inc.
Inventors: Gautham J. Mysore, Paris Smaragdis
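A minimal sketch of threshold-based masking for two sources: bins of a soft mask at or above a user-controlled threshold are assigned to one source, the rest to the other. The two-source setup and the specific threshold value are assumptions for illustration.

```python
import numpy as np

def variable_mask(soft_mask, threshold):
    """Turn a soft separation mask into a hard assignment using a
    user-controlled threshold."""
    return soft_mask >= threshold

def decompose(spectrogram, soft_mask, threshold):
    """Assign each time-frequency bin of the magnitude spectrogram to one
    of two sources according to the thresholded mask."""
    mask = variable_mask(soft_mask, threshold)
    return spectrogram * mask, spectrogram * ~mask

spec = np.array([[1.0, 2.0], [3.0, 4.0]])   # toy magnitude spectrogram
soft = np.array([[0.9, 0.2], [0.6, 0.4]])   # toy soft mask for source A
src_a, src_b = decompose(spec, soft, threshold=0.5)
print(src_a.tolist(), src_b.tolist())
```

Raising or lowering `threshold` interactively shifts ambiguous bins between the two sources, which is the user-controllable behavior the abstract describes.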
-
Patent number: 10249321
Abstract: Sound rate modification techniques are described. In one or more implementations, an indication is received of an amount that a rate of output of sound data is to be modified. One or more sound rate rules are applied to the sound data that, along with the received indication, are usable to calculate different rates at which different portions of the sound data are to be modified, respectively. The sound data is then output such that the calculated rates are applied.
Type: Grant
Filed: November 20, 2012
Date of Patent: April 2, 2019
Assignee: Adobe Inc.
Inventors: Brian John King, Gautham J. Mysore, Paris Smaragdis
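To make "different rates for different portions" concrete, here is a toy rule set in Python. The rule itself (speed up silence twice as aggressively as speech) is an assumption for illustration and not the patented rules.

```python
def segment_rates(labels, overall_rate):
    """A toy sound-rate rule (an assumption, not the patented rules):
    speed up silence more aggressively than speech so the overall rate
    change is less audible on the spoken parts."""
    return [overall_rate * (2.0 if lab == "silence" else 1.0) for lab in labels]

def modified_duration(durations, rates):
    """Output duration when each portion plays back at its own rate."""
    return sum(d / r for d, r in zip(durations, rates))

labels = ["speech", "silence", "speech"]
durations = [2.0, 1.0, 2.0]   # seconds per portion
rates = segment_rates(labels, overall_rate=1.25)
print(rates, modified_duration(durations, rates))
```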
-
Patent number: 10176818
Abstract: Sound processing using a product-of-filters model is described. In one or more implementations, a model is formed by one or more computing devices for a time frame of sound data as a product of filters. The model is utilized by the one or more computing devices to perform one or more sound processing techniques on the time frame of the sound data.
Type: Grant
Filed: November 15, 2013
Date of Patent: January 8, 2019
Assignee: Adobe Inc.
Inventors: Dawen Liang, Matthew Douglas Hoffman, Gautham J. Mysore
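The core modeling idea, a spectrum formed as a product of filters, is equivalent to a sum in the log-spectral domain. The toy filters and unit activations below are assumptions for illustration.

```python
import numpy as np

def product_of_filters(activations, filters):
    """Model one time frame's magnitude spectrum as a product of filters,
    each raised to a frame-specific activation (a weighted sum in the
    log domain, exponentiated back to the linear domain)."""
    log_spectrum = activations @ np.log(filters)
    return np.exp(log_spectrum)

# Two toy filters over a 4-bin spectrum: a low-pass shape and a tilt.
filters = np.array([[2.0, 2.0, 0.5, 0.5],
                    [0.5, 1.0, 1.0, 2.0]])
frame = product_of_filters(np.array([1.0, 1.0]), filters)
print(frame)
```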
-
Publication number: 20180315452
Abstract: Methods and systems for automatic audio loop generation from an audio track identify suitable portions of the audio track for generating audio loops. One or more embodiments identify portions of the audio track that include a beginning beat and an ending beat that have similar audio features that provide for seamless transitions when generating the audio loops. One or more embodiments generate scores for the portions based on the similarity of the audio features of the corresponding beginning and ending beats. Additionally, one or more embodiments use the generated scores to determine whether each portion is a suitable audio loop candidate. One or more embodiments then generate one or more audio loops using one or more suitable portions of the audio track.
Type: Application
Filed: April 26, 2017
Publication date: November 1, 2018
Inventors: Zhengshan Shi, Gautham J. Mysore
-
Patent number: 10002622
Abstract: Pattern identification using convolution is described. In one or more implementations, a representation of a pattern is obtained that is described using data points that include frequency coordinates, time coordinates, and energy values. An identification is made as to whether sound data described using irregularly positioned data points includes the pattern, the identifying including use of a convolution of the frequency or time coordinates to determine correspondence with the representation of the pattern.
Type: Grant
Filed: November 20, 2013
Date of Patent: June 19, 2018
Assignee: Adobe Systems Incorporated
Inventors: Minje Kim, Paris Smaragdis, Gautham J. Mysore
-
Patent number: 9966088
Abstract: Online source separation may include receiving a sound mixture that includes first audio data from a first source and second audio data from a second source. Online source separation may further include receiving pre-computed reference data corresponding to the first source. Online source separation may also include performing online separation of the second audio data from the first audio data based on the pre-computed reference data.
Type: Grant
Filed: December 22, 2011
Date of Patent: May 8, 2018
Assignee: Adobe Systems Incorporated
Inventors: Gautham J. Mysore, Paris Smaragdis, Zhiyao Duan
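Separation against a pre-computed reference is often realized with semi-supervised nonnegative matrix factorization: a fixed dictionary models the known source while a second dictionary is learned for the unknown one. The sketch below uses standard KL-divergence multiplicative updates; the dictionary sizes, iteration count, and toy data are assumptions for illustration, not the patented method.

```python
import numpy as np

rng = np.random.default_rng(1)

def separate_online(mixture, W_ref, n_new=2, iters=200):
    """Semi-supervised NMF sketch: W_ref is a pre-computed dictionary for
    the known source and stays fixed; W_new is learned for the unknown
    source; each source is reconstructed from its own dictionary."""
    F, T = mixture.shape
    n_ref = W_ref.shape[1]
    W_new = rng.random((F, n_new)) + 0.1
    H = rng.random((n_ref + n_new, T)) + 0.1
    for _ in range(iters):
        W = np.hstack([W_ref, W_new])
        V = W @ H + 1e-9
        H *= (W.T @ (mixture / V)) / (W.T @ np.ones_like(mixture))
        V = W @ H + 1e-9
        W_new *= ((mixture / V) @ H[n_ref:].T) / (np.ones_like(mixture) @ H[n_ref:].T)
    known = W_ref @ H[:n_ref]
    unknown = W_new @ H[n_ref:]
    return known, unknown

# Known source lives in the low bins, unknown source in the high bins.
W_ref = np.array([[1.0], [1.0], [0.0], [0.0]])
mixture = np.array([[2.0, 2.0], [2.0, 2.0], [3.0, 1.0], [3.0, 1.0]])
known, unknown = separate_online(mixture, W_ref)
print(np.round(known + unknown, 2))   # the two estimates sum back to the mixture
```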
-
Patent number: 9866954
Abstract: Performance metric based stopping criteria for iterative algorithm techniques are described. In one or more implementations, a training dataset is processed by one or more computing devices using an iterative algorithm having a cost function. The processing includes, for a plurality of iterations of the iterative algorithm, computing a cost for the iterative algorithm using the cost function and a value for each of a plurality of performance metrics that are usable to infer accuracy of the iterative algorithm for a respective one of the iterations. Responsive to the processing, a particular one of the plurality of iterations is identified as a stopping criterion based at least in part on the computed values for the plurality of performance metrics and the stopping criterion is output to configure the iterative algorithm to use the stopping criterion for subsequent processing of data by the iterative algorithm.
Type: Grant
Filed: July 7, 2014
Date of Patent: January 9, 2018
Assignee: Adobe Systems Incorporated
Inventors: Francois G. Germain, Gautham J. Mysore
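The key point, stopping where task metrics peak rather than where the cost bottoms out, can be shown in a few lines. Averaging the metrics is one simple way to combine them and is an assumption for illustration.

```python
def pick_stopping_iteration(costs, metrics):
    """Choose the iteration to stop at: not where the training cost is
    lowest, but where the performance metrics (averaged here) peak."""
    avg = [sum(m) / len(m) for m in metrics]
    return max(range(len(avg)), key=avg.__getitem__)

costs = [10.0, 6.0, 4.0, 3.5, 3.4]   # cost keeps decreasing
metrics = [(0.5, 0.4), (0.7, 0.6), (0.8, 0.7), (0.75, 0.65), (0.7, 0.6)]
best = pick_stopping_iteration(costs, metrics)
print(best)   # 2: metrics peak at iteration 2 even though cost still falls
```

The chosen iteration count would then be baked in as the stopping criterion for later runs on new data, as the abstract describes.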
-
Patent number: 9852743
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media directed towards automatic emphasis of spoken words. In one embodiment, a process may begin by identifying, within an audio recording, a word that is to be emphasized. Once identified, contextual and lexical information relating to the emphasized word can be extracted from the audio recording. This contextual and lexical information can be utilized in conjunction with a predictive model to determine a set of emphasis parameters for the identified word. These emphasis parameters can then be applied to the identified word to cause the word to be emphasized. Other embodiments may be described and/or claimed.
Type: Grant
Filed: November 20, 2015
Date of Patent: December 26, 2017
Assignee: Adobe Systems Incorporated
Inventors: Yang Zhang, Gautham J. Mysore, Floraine Berthouzoz
-
Patent number: 9721202
Abstract: Sound processing techniques using recurrent neural networks are described. In one or more implementations, temporal dependencies are captured in sound data that are modeled through use of a recurrent neural network (RNN). The captured temporal dependencies are employed as part of feature extraction performed using nonnegative matrix factorization (NMF). One or more sound processing techniques are performed on the sound data based at least in part on the feature extraction.
Type: Grant
Filed: February 21, 2014
Date of Patent: August 1, 2017
Assignee: Adobe Systems Incorporated
Inventors: Nicolas Maurice Boulanger-Lewandowski, Gautham J. Mysore, Matthew Douglas Hoffman
-
Publication number: 20170148464
Abstract: Embodiments of the present invention provide systems, methods, and computer storage media directed towards automatic emphasis of spoken words. In one embodiment, a process may begin by identifying, within an audio recording, a word that is to be emphasized. Once identified, contextual and lexical information relating to the emphasized word can be extracted from the audio recording. This contextual and lexical information can be utilized in conjunction with a predictive model to determine a set of emphasis parameters for the identified word. These emphasis parameters can then be applied to the identified word to cause the word to be emphasized. Other embodiments may be described and/or claimed.
Type: Application
Filed: November 20, 2015
Publication date: May 25, 2017
Inventors: Yang Zhang, Gautham J. Mysore, Floraine Berthouzoz
-
Patent number: 9607627
Abstract: Sound enhancement techniques through dereverberation are described. In one or more implementations, a method is described of enhancing sound data through removal of reverberation from the sound data by one or more computing devices. The method includes obtaining a model that describes primary sound data that is to be utilized as a prior that assumes no prior knowledge about specifics of the sound data from which the reverberation is to be removed. A reverberation kernel is computed having parameters that, when applied to the model that describes the primary sound data, corresponds to the sound data from which the reverberation is to be removed. The reverberation is removed from the sound data using the reverberation kernel.
Type: Grant
Filed: February 5, 2015
Date of Patent: March 28, 2017
Assignee: Adobe Systems Incorporated
Inventors: Dawen Liang, Matthew Douglas Hoffman, Gautham J. Mysore
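The reverberation-kernel idea can be sketched with a per-band envelope model: reverberation smears each frequency band's energy envelope by a decaying kernel, and inverting that kernel recovers the dry envelope. The exponentially decaying kernel and the sequential deconvolution below are assumptions for illustration; the patent estimates the kernel jointly with a prior model of clean speech.

```python
import numpy as np

def apply_reverb(clean, kernel):
    """Reverberant magnitude envelope per frequency band: the clean
    envelope convolved with a decaying reverberation kernel."""
    return np.array([np.convolve(band, kernel)[:len(band)] for band in clean])

def dereverberate(reverberant, kernel):
    """Remove reverberation by inverting the kernel band by band
    via naive sequential deconvolution."""
    out = np.zeros_like(reverberant)
    for b, band in enumerate(reverberant):
        for t in range(len(band)):
            tail = sum(kernel[k] * out[b, t - k]
                       for k in range(1, min(t + 1, len(kernel))))
            out[b, t] = (band[t] - tail) / kernel[0]
    return out

kernel = np.array([1.0, 0.5, 0.25])   # assumed exponentially decaying kernel
clean = np.array([[1.0, 0.0, 0.0, 2.0], [0.0, 3.0, 0.0, 0.0]])
wet = apply_reverb(clean, kernel)
dry = dereverberate(wet, kernel)
print(np.round(dry, 6).tolist())   # recovers the clean envelopes exactly
```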
-
Patent number: 9601124
Abstract: Acoustic matching and splicing of sound tracks is described. In one or more implementations, a method to acoustically match and splice first and second sound tracks by one or more computing devices is described. The method includes source separating the first and second sound tracks into first track primary and background sound data and second track primary and background sound data. Features extracted from the first and second primary sound data are matched, one to another, to generate first and second primary matching masks. Features extracted from the first and second background sound data are matched, one to another, to generate first and second background matching masks, which are applied to respective separated sound data. The applied first track primary and background sound data and the applied second track primary and background sound data are spliced to generate a spliced sound track.
Type: Grant
Filed: January 7, 2015
Date of Patent: March 21, 2017
Assignee: Adobe Systems Incorporated
Inventors: François G. Germain, Gautham J. Mysore