Patents by Inventor Woojay Jeon
Woojay Jeon has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230386478
Abstract: Systems and processes for speech recognition for multiple users are provided. For example, in response to receiving speech input from a user, a combined speech profile is obtained from a plurality of speech profiles. The speech input is interpreted based on the combined speech profile to obtain a plurality of speech recognition results. The plurality of speech recognition results includes a first speech recognition result corresponding to a first speech profile of the plurality of speech profiles, wherein the first speech profile corresponds to a first user, and a second speech recognition result corresponding to a second speech profile of the plurality of speech profiles, wherein the second speech profile corresponds to a second user different from the first user. A respective speech recognition result based on an identified voice profile is then selected from the plurality of speech recognition results.
Type: Application
Filed: September 7, 2022
Publication date: November 30, 2023
Inventors: Leo Liu, Yaqiao Deng, Thiago Fraga da Silva, Woojay Jeon, Mahesh Krishnamoorthy, Mary K. Young
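The selection step described above can be sketched as follows. This is an illustrative toy, not the patented implementation: profile embeddings, the cosine similarity measure, and the `select_result` helper are all hypothetical stand-ins for whatever voice-profile matching the actual system uses.

```python
# Hypothetical sketch: keep one recognition result per enrolled speaker
# profile, then return the result whose profile best matches the
# identified voice (here, by cosine similarity of toy embeddings).

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

def select_result(voice_embedding, results):
    """results: list of (profile_embedding, recognized_text) pairs,
    one per speech profile in the combined profile."""
    best = max(results, key=lambda r: cosine(voice_embedding, r[0]))
    return best[1]

# Two hypothetical profiles and their candidate transcriptions.
results = [
    ([1.0, 0.0], "call mom"),  # result under user A's profile
    ([0.0, 1.0], "call tom"),  # result under user B's profile
]
print(select_result([0.9, 0.1], results))  # voice closest to user A
```

The point of the structure is that recognition runs once against the combined profile, and disambiguation between per-user results happens afterward, using the identified voice.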
-
Patent number: 9697820
Abstract: Systems and processes for performing unit-selection text-to-speech synthesis are provided. In one example process, a sequence of target units can represent a spoken pronunciation of text. A set of predicted acoustic model parameters of a second target unit can be determined using a set of acoustic features of a first candidate speech segment of a first target unit and a set of linguistic features of the second target unit. A likelihood score of the second candidate speech segment with respect to the first candidate speech segment can be determined using the set of predicted acoustic model parameters of the second target unit and a set of acoustic features of the second candidate speech segment of the second target unit. The second candidate speech segment can be selected for speech synthesis based on the determined likelihood score. Speech corresponding to the received text can be generated using the selected second candidate speech segment.
Type: Grant
Filed: December 7, 2015
Date of Patent: July 4, 2017
Assignee: Apple Inc.
Inventor: Woojay Jeon
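A minimal sketch of the scoring step, under assumptions not stated in the abstract: suppose the predicted acoustic model parameters are a diagonal Gaussian (mean and variance per feature dimension), so each candidate segment can be scored by its log-likelihood under the parameters predicted from the previously chosen segment. The Gaussian form and both helper functions are illustrative, not the patent's exact model.

```python
import math

# Assumed formulation: predicted parameters = (means, variances) of a
# diagonal Gaussian over the next unit's acoustic features; pick the
# candidate segment with the highest log-likelihood under them.

def log_likelihood(features, means, variances):
    ll = 0.0
    for x, m, v in zip(features, means, variances):
        ll += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return ll

def pick_candidate(predicted, candidates):
    """predicted: (means, variances); candidates: feature vectors,
    one per candidate speech segment of the target unit."""
    means, variances = predicted
    return max(range(len(candidates)),
               key=lambda i: log_likelihood(candidates[i], means, variances))

predicted = ([0.0, 1.0], [1.0, 1.0])    # hypothetical predicted parameters
candidates = [[2.0, -1.0], [0.1, 0.9]]  # two candidate segments
print(pick_candidate(predicted, candidates))  # index of the closer one -> 1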
-
Publication number: 20170092259
Abstract: Systems and processes for performing unit-selection text-to-speech synthesis are provided. In one example process, a sequence of target units can represent a spoken pronunciation of text. A set of predicted acoustic model parameters of a second target unit can be determined using a set of acoustic features of a first candidate speech segment of a first target unit and a set of linguistic features of the second target unit. A likelihood score of the second candidate speech segment with respect to the first candidate speech segment can be determined using the set of predicted acoustic model parameters of the second target unit and a set of acoustic features of the second candidate speech segment of the second target unit. The second candidate speech segment can be selected for speech synthesis based on the determined likelihood score. Speech corresponding to the received text can be generated using the selected second candidate speech segment.
Type: Application
Filed: December 7, 2015
Publication date: March 30, 2017
Inventor: Woojay Jeon
-
Patent number: 8442823
Abstract: A method of performing a search of a database of speakers includes: receiving a query speech sample spoken by a query speaker; deriving a query utterance from the query speech sample; extracting query utterance statistics from the query utterance; performing Kernelized Locality-Sensitive Hashing (KLSH) using a kernel function, the KLSH using as input the query utterance statistics and utterance statistics extracted from a plurality of utterances included in a database of speakers in order to select a subset of the plurality of utterances; and comparing, using an utterance comparison equation, the query utterance statistics to the utterance statistics for each utterance in the subset to generate a list of speakers from the database of utterances having a highest similarity to the query speaker.
Type: Grant
Filed: October 19, 2010
Date of Patent: May 14, 2013
Assignee: Motorola Solutions, Inc.
Inventors: Woojay Jeon, Yan-Ming Cheng, Changxue Ma, Dusan Macho
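The prune-then-compare structure of this method can be sketched as below. Note the simplification: the patent uses Kernelized LSH with a kernel function over utterance statistics, while this sketch uses plain random-hyperplane LSH (effectively the linear-kernel special case) and cosine similarity as the stand-in comparison; the data and function names are hypothetical.

```python
import random

# Sketch of hash-then-compare speaker search (assumed, simplified):
# 1) coarse stage: keep only utterances whose LSH signature is within a
#    small Hamming distance of the query's signature;
# 2) fine stage: rank the surviving subset by an exact comparison.

random.seed(0)
DIM, BITS = 4, 8
planes = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(BITS)]

def signature(v):
    # One bit per random hyperplane: which side of the plane v lies on.
    return tuple(int(sum(p_i * v_i for p_i, v_i in zip(p, v)) >= 0)
                 for p in planes)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / ((sum(x * x for x in a) ** 0.5) *
                  (sum(x * x for x in b) ** 0.5))

def search(query, database, max_hamming=2, top=3):
    q_sig = signature(query)
    # Coarse stage: cheap hash comparison prunes most of the database.
    subset = [(spk, vec) for spk, vec in database
              if sum(a != b for a, b in zip(q_sig, signature(vec))) <= max_hamming]
    # Fine stage: exact comparison only on the surviving subset.
    subset.sort(key=lambda sv: cosine(query, sv[1]), reverse=True)
    return [spk for spk, _ in subset[:top]]

db = [("alice", [2.0, 0.0, 0.2, 0.0]),   # same direction as the query
      ("bob",  [-1.0, 0.0, -0.1, 0.0])]  # opposite direction, pruned
print(search([1.0, 0.0, 0.1, 0.0], db))  # → ['alice']
```

The practical payoff is that the expensive per-utterance comparison runs only on the hash-selected subset rather than on every utterance in the database.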
-
Patent number: 8380506
Abstract: Disclosed are apparatus and methods that employ a modified version of a computational model of the human peripheral and central auditory system, and that provide for automatic pattern recognition using category dependent feature selection. The validity of the output of the model is examined by deriving feature vectors from the dimension expanded cortical response of the central auditory system for use in a conventional phoneme recognition task. In addition, the cortical response may be a place-coded data set where sounds are categorized according to the regions containing their most distinguishing features. This provides for a novel category-dependent feature selection apparatus and methods in which this mechanism may be utilized to better simulate robust human pattern (speech) recognition.
Type: Grant
Filed: November 29, 2007
Date of Patent: February 19, 2013
Assignee: Georgia Tech Research Corporation
Inventors: Woojay Jeon, Biing-Hwang Juang
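The core idea of category-dependent feature selection can be illustrated in miniature: each category keeps only the feature dimensions that best distinguish it, and a candidate is scored against each category using that category's own subset. Everything below (the categories, templates, and selected dimensions) is hypothetical and far simpler than the cortical-response features in the patent.

```python
# Hypothetical illustration of category-dependent feature selection:
# each category scores samples using only its own most distinctive
# feature dimensions, instead of one shared feature set for all classes.

TEMPLATES = {
    "vowel":     {"features": [0.9, 0.1, 0.5], "select": [0, 1]},
    "fricative": {"features": [0.2, 0.8, 0.5], "select": [1, 2]},
}

def score(sample, category):
    tpl = TEMPLATES[category]
    idx = tpl["select"]  # dimensions most distinctive for this category
    return -sum((sample[i] - tpl["features"][i]) ** 2 for i in idx)

def classify(sample):
    # Each category competes using its own feature subset.
    return max(TEMPLATES, key=lambda c: score(sample, c))

print(classify([0.85, 0.15, 0.4]))  # → vowel
```

The design choice being illustrated is that "which features matter" is itself class-dependent, mirroring the abstract's place-coded view where each sound category is defined by the regions containing its most distinguishing features.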
-
Publication number: 20120095764
Abstract: A method of performing a search of a database of speakers includes: receiving a query speech sample spoken by a query speaker; deriving a query utterance from the query speech sample; extracting query utterance statistics from the query utterance; performing Kernelized Locality-Sensitive Hashing (KLSH) using a kernel function, the KLSH using as input the query utterance statistics and utterance statistics extracted from a plurality of utterances included in a database of speakers in order to select a subset of the plurality of utterances; and comparing, using an utterance comparison equation, the query utterance statistics to the utterance statistics for each utterance in the subset to generate a list of speakers from the database of utterances having a highest similarity to the query speaker.
Type: Application
Filed: October 19, 2010
Publication date: April 19, 2012
Applicant: Motorola, Inc.
Inventors: Woojay Jeon, Yan-Ming Cheng, Changxue Ma, Dusan Macho
-
Patent number: 8049093
Abstract: During operation, a “coarse search” stage applies variable-scale windowing on the query pitch contours to compare them with fixed-length segments of target pitch contours to find matching candidates while efficiently scanning over variable tempo differences and target locations. Because the target segments are of fixed length, this has the effect of drastically reducing the storage space required in a prior-art method. Furthermore, by breaking the query contours into parts, rhythmic inconsistencies can be more flexibly handled. Normalization is also applied to the contours to allow comparisons independent of differences in musical key. In a “fine search” stage, a “segmental” dynamic time warping (DTW) method is applied that calculates a more accurate similarity score between the query and each candidate target with more explicit consideration toward rhythmic inconsistencies.
Type: Grant
Filed: December 30, 2009
Date of Patent: November 1, 2011
Assignee: Motorola Solutions, Inc.
Inventors: Woojay Jeon, Changxue Ma
-
Publication number: 20110154977
Abstract: During operation, a “coarse search” stage applies variable-scale windowing on the query pitch contours to compare them with fixed-length segments of target pitch contours to find matching candidates while efficiently scanning over variable tempo differences and target locations. Because the target segments are of fixed length, this has the effect of drastically reducing the storage space required in a prior-art method. Furthermore, by breaking the query contours into parts, rhythmic inconsistencies can be more flexibly handled. Normalization is also applied to the contours to allow comparisons independent of differences in musical key. In a “fine search” stage, a “segmental” dynamic time warping (DTW) method is applied that calculates a more accurate similarity score between the query and each candidate target with more explicit consideration toward rhythmic inconsistencies.
Type: Application
Filed: December 30, 2009
Publication date: June 30, 2011
Applicant: Motorola, Inc.
Inventors: Woojay Jeon, Changxue Ma
-
Publication number: 20080147402
Abstract: Disclosed are apparatus and methods that employ a modified version of a computational model of the human peripheral and central auditory system, and that provide for automatic pattern recognition using category dependent feature selection. The validity of the output of the model is examined by deriving feature vectors from the dimension expanded cortical response of the central auditory system for use in a conventional phoneme recognition task. In addition, the cortical response may be a place-coded data set where sounds are categorized according to the regions containing their most distinguishing features. This provides for a novel category-dependent feature selection apparatus and methods in which this mechanism may be utilized to better simulate robust human pattern (speech) recognition.
Type: Application
Filed: November 29, 2007
Publication date: June 19, 2008
Inventors: Woojay Jeon, Biing-Hwang Juang