Patents by Inventor Woojay Jeon
Woojay Jeon has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230386478
Abstract: Systems and processes for speech recognition for multiple users are provided. For example, in response to receiving speech input from a user, a combined speech profile is obtained from a plurality of speech profiles. The speech input is interpreted based on the combined speech profile to obtain a plurality of speech recognition results. The plurality of speech recognition results includes a first speech recognition result corresponding to a first speech profile of the plurality of speech profiles, wherein the first speech profile corresponds to a first user, and a second speech recognition result corresponding to a second speech profile of the plurality of speech profiles, wherein the second speech profile corresponds to a second user different from the first user. A respective speech recognition result based on an identified voice profile is then selected from the plurality of speech recognition results.
Type: Application
Filed: September 7, 2022
Publication date: November 30, 2023
Inventors: Leo Liu, Yaqiao Deng, Thiago Fraga da Silva, Woojay Jeon, Mahesh Krishnamoorthy, Mary K. Young
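The selection step described above can be sketched as follows. This is an illustrative toy, not the patented implementation: profile embeddings, the cosine similarity measure, and the `select_result` helper are all hypothetical stand-ins for whatever voice-profile matching the actual system uses.

```python
# Hypothetical sketch: keep one recognition result per enrolled speaker
# profile, then return the result whose profile best matches the
# identified voice (here, by cosine similarity of toy embeddings).

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = sum(x * x for x in a) ** 0.5
    nb = sum(x * x for x in b) ** 0.5
    return dot / (na * nb)

def select_result(voice_embedding, results):
    """results: list of (profile_embedding, recognized_text) pairs,
    one per speech profile in the combined profile."""
    best = max(results, key=lambda r: cosine(voice_embedding, r[0]))
    return best[1]

# Two hypothetical profiles and their candidate transcriptions.
results = [
    ([1.0, 0.0], "call mom"),  # result under user A's profile
    ([0.0, 1.0], "call tom"),  # result under user B's profile
]
print(select_result([0.9, 0.1], results))  # voice closest to user A
```

The point of the structure is that recognition runs once against the combined profile, and disambiguation between per-user results happens afterward, using the identified voice.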
-
Patent number: 9697820
Abstract: Systems and processes for performing unit-selection text-to-speech synthesis are provided. In one example process, a sequence of target units can represent a spoken pronunciation of text. A set of predicted acoustic model parameters of a second target unit can be determined using a set of acoustic features of a first candidate speech segment of a first target unit and a set of linguistic features of the second target unit. A likelihood score of the second candidate speech segment with respect to the first candidate speech segment can be determined using the set of predicted acoustic model parameters of the second target unit and a set of acoustic features of the second candidate speech segment of the second target unit. The second candidate speech segment can be selected for speech synthesis based on the determined likelihood score. Speech corresponding to the received text can be generated using the selected second candidate speech segment.
Type: Grant
Filed: December 7, 2015
Date of Patent: July 4, 2017
Assignee: Apple Inc.
Inventor: Woojay Jeon
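A minimal sketch of the scoring step, under assumptions not stated in the abstract: suppose the predicted acoustic model parameters are a diagonal Gaussian (mean and variance per feature dimension), so each candidate segment can be scored by its log-likelihood under the parameters predicted from the previously chosen segment. The Gaussian form and both helper functions are illustrative, not the patent's exact model.

```python
import math

# Assumed formulation: predicted parameters = (means, variances) of a
# diagonal Gaussian over the next unit's acoustic features; pick the
# candidate segment with the highest log-likelihood under them.

def log_likelihood(features, means, variances):
    ll = 0.0
    for x, m, v in zip(features, means, variances):
        ll += -0.5 * (math.log(2 * math.pi * v) + (x - m) ** 2 / v)
    return ll

def pick_candidate(predicted, candidates):
    """predicted: (means, variances); candidates: feature vectors,
    one per candidate speech segment of the target unit."""
    means, variances = predicted
    return max(range(len(candidates)),
               key=lambda i: log_likelihood(candidates[i], means, variances))

predicted = ([0.0, 1.0], [1.0, 1.0])    # hypothetical predicted parameters
candidates = [[2.0, -1.0], [0.1, 0.9]]  # two candidate segments
print(pick_candidate(predicted, candidates))  # index of the closer one -> 1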
-
Publication number: 20170092259
Abstract: Systems and processes for performing unit-selection text-to-speech synthesis are provided. In one example process, a sequence of target units can represent a spoken pronunciation of text. A set of predicted acoustic model parameters of a second target unit can be determined using a set of acoustic features of a first candidate speech segment of a first target unit and a set of linguistic features of the second target unit. A likelihood score of the second candidate speech segment with respect to the first candidate speech segment can be determined using the set of predicted acoustic model parameters of the second target unit and a set of acoustic features of the second candidate speech segment of the second target unit. The second candidate speech segment can be selected for speech synthesis based on the determined likelihood score. Speech corresponding to the received text can be generated using the selected second candidate speech segment.
Type: Application
Filed: December 7, 2015
Publication date: March 30, 2017
Inventor: Woojay Jeon
-
Patent number: 8442823
Abstract: A method of performing a search of a database of speakers includes: receiving a query speech sample spoken by a query speaker; deriving a query utterance from the query speech sample; extracting query utterance statistics from the query utterance; performing Kernelized Locality-Sensitive Hashing (KLSH) using a kernel function, the KLSH using as input the query utterance statistics and utterance statistics extracted from a plurality of utterances included in a database of speakers in order to select a subset of the plurality of utterances; and comparing, using an utterance comparison equation, the query utterance statistics to the utterance statistics for each utterance in the subset to generate a list of speakers from the database of utterances having a highest similarity to the query speaker.
Type: Grant
Filed: October 19, 2010
Date of Patent: May 14, 2013
Assignee: Motorola Solutions, Inc.
Inventors: Woojay Jeon, Yan-Ming Cheng, Changxue Ma, Dusan Macho
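The prune-then-compare structure of this method can be sketched as below. Note the simplification: the patent uses Kernelized LSH with a kernel function over utterance statistics, while this sketch uses plain random-hyperplane LSH (effectively the linear-kernel special case) and cosine similarity as the stand-in comparison; the data and function names are hypothetical.

```python
import random

# Sketch of hash-then-compare speaker search (assumed, simplified):
# 1) coarse stage: keep only utterances whose LSH signature is within a
#    small Hamming distance of the query's signature;
# 2) fine stage: rank the surviving subset by an exact comparison.

random.seed(0)
DIM, BITS = 4, 8
planes = [[random.gauss(0, 1) for _ in range(DIM)] for _ in range(BITS)]

def signature(v):
    # One bit per random hyperplane: which side of the plane v lies on.
    return tuple(int(sum(p_i * v_i for p_i, v_i in zip(p, v)) >= 0)
                 for p in planes)

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    return dot / ((sum(x * x for x in a) ** 0.5) *
                  (sum(x * x for x in b) ** 0.5))

def search(query, database, max_hamming=2, top=3):
    q_sig = signature(query)
    # Coarse stage: cheap hash comparison prunes most of the database.
    subset = [(spk, vec) for spk, vec in database
              if sum(a != b for a, b in zip(q_sig, signature(vec))) <= max_hamming]
    # Fine stage: exact comparison only on the surviving subset.
    subset.sort(key=lambda sv: cosine(query, sv[1]), reverse=True)
    return [spk for spk, _ in subset[:top]]

db = [("alice", [2.0, 0.0, 0.2, 0.0]),   # same direction as the query
      ("bob",  [-1.0, 0.0, -0.1, 0.0])]  # opposite direction, pruned
print(search([1.0, 0.0, 0.1, 0.0], db))  # → ['alice']
```

The practical payoff is that the expensive per-utterance comparison runs only on the hash-selected subset rather than on every utterance in the database.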
-
Patent number: 8380506
Abstract: Disclosed are apparatus and methods that employ a modified version of a computational model of the human peripheral and central auditory system, and that provide for automatic pattern recognition using category dependent feature selection. The validity of the output of the model is examined by deriving feature vectors from the dimension expanded cortical response of the central auditory system for use in a conventional phoneme recognition task. In addition, the cortical response may be a place-coded data set where sounds are categorized according to the regions containing their most distinguishing features. This provides for a novel category-dependent feature selection apparatus and methods in which this mechanism may be utilized to better simulate robust human pattern (speech) recognition.
Type: Grant
Filed: November 29, 2007
Date of Patent: February 19, 2013
Assignee: Georgia Tech Research Corporation
Inventors: Woojay Jeon, Biing-Hwang Juang
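The core idea of category-dependent feature selection can be illustrated in miniature: each category keeps only the feature dimensions that best distinguish it, and a candidate is scored against each category using that category's own subset. Everything below (the categories, templates, and selected dimensions) is hypothetical and far simpler than the cortical-response features in the patent.

```python
# Hypothetical illustration of category-dependent feature selection:
# each category scores samples using only its own most distinctive
# feature dimensions, instead of one shared feature set for all classes.

TEMPLATES = {
    "vowel":     {"features": [0.9, 0.1, 0.5], "select": [0, 1]},
    "fricative": {"features": [0.2, 0.8, 0.5], "select": [1, 2]},
}

def score(sample, category):
    tpl = TEMPLATES[category]
    idx = tpl["select"]  # dimensions most distinctive for this category
    return -sum((sample[i] - tpl["features"][i]) ** 2 for i in idx)

def classify(sample):
    # Each category competes using its own feature subset.
    return max(TEMPLATES, key=lambda c: score(sample, c))

print(classify([0.85, 0.15, 0.4]))  # → vowel
```

The design choice being illustrated is that "which features matter" is itself class-dependent, mirroring the abstract's place-coded view where each sound category is defined by the regions containing its most distinguishing features.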
-
Publication number: 20120095764
Abstract: A method of performing a search of a database of speakers includes: receiving a query speech sample spoken by a query speaker; deriving a query utterance from the query speech sample; extracting query utterance statistics from the query utterance; performing Kernelized Locality-Sensitive Hashing (KLSH) using a kernel function, the KLSH using as input the query utterance statistics and utterance statistics extracted from a plurality of utterances included in a database of speakers in order to select a subset of the plurality of utterances; and comparing, using an utterance comparison equation, the query utterance statistics to the utterance statistics for each utterance in the subset to generate a list of speakers from the database of utterances having a highest similarity to the query speaker.
Type: Application
Filed: October 19, 2010
Publication date: April 19, 2012
Applicant: Motorola, Inc.
Inventors: Woojay Jeon, Yan-Ming Cheng, Changxue Ma, Dusan Macho
-
Patent number: 8049093
Abstract: During operation, a “coarse search” stage applies variable-scale windowing on the query pitch contours to compare them with fixed-length segments of target pitch contours to find matching candidates while efficiently scanning over variable tempo differences and target locations. Because the target segments are of fixed length, this has the effect of drastically reducing the storage space required in a prior-art method. Furthermore, by breaking the query contours into parts, rhythmic inconsistencies can be more flexibly handled. Normalization is also applied to the contours to allow comparisons independent of differences in musical key. In a “fine search” stage, a “segmental” dynamic time warping (DTW) method is applied that calculates a more accurate similarity score between the query and each candidate target with more explicit consideration toward rhythmic inconsistencies.
Type: Grant
Filed: December 30, 2009
Date of Patent: November 1, 2011
Assignee: Motorola Solutions, Inc.
Inventors: Woojay Jeon, Changxue Ma
-
Publication number: 20110154977
Abstract: During operation, a “coarse search” stage applies variable-scale windowing on the query pitch contours to compare them with fixed-length segments of target pitch contours to find matching candidates while efficiently scanning over variable tempo differences and target locations. Because the target segments are of fixed length, this has the effect of drastically reducing the storage space required in a prior-art method. Furthermore, by breaking the query contours into parts, rhythmic inconsistencies can be more flexibly handled. Normalization is also applied to the contours to allow comparisons independent of differences in musical key. In a “fine search” stage, a “segmental” dynamic time warping (DTW) method is applied that calculates a more accurate similarity score between the query and each candidate target with more explicit consideration toward rhythmic inconsistencies.
Type: Application
Filed: December 30, 2009
Publication date: June 30, 2011
Applicant: Motorola, Inc.
Inventors: Woojay Jeon, Changxue Ma
-
Publication number: 20080147402
Abstract: Disclosed are apparatus and methods that employ a modified version of a computational model of the human peripheral and central auditory system, and that provide for automatic pattern recognition using category dependent feature selection. The validity of the output of the model is examined by deriving feature vectors from the dimension expanded cortical response of the central auditory system for use in a conventional phoneme recognition task. In addition, the cortical response may be a place-coded data set where sounds are categorized according to the regions containing their most distinguishing features. This provides for a novel category-dependent feature selection apparatus and methods in which this mechanism may be utilized to better simulate robust human pattern (speech) recognition.
Type: Application
Filed: November 29, 2007
Publication date: June 19, 2008
Inventors: Woojay Jeon, Biing-Hwang Juang