Patents by Inventor Biing-Hwang Juang
Biing-Hwang Juang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 8380506
Abstract: Disclosed are apparatus and methods that employ a modified version of a computational model of the human peripheral and central auditory system, and that provide for automatic pattern recognition using category dependent feature selection. The validity of the output of the model is examined by deriving feature vectors from the dimension expanded cortical response of the central auditory system for use in a conventional phoneme recognition task. In addition, the cortical response may be a place-coded data set where sounds are categorized according to the regions containing their most distinguishing features. This provides for a novel category-dependent feature selection apparatus and methods in which this mechanism may be utilized to better simulate robust human pattern (speech) recognition.
Type: Grant
Filed: November 29, 2007
Date of Patent: February 19, 2013
Assignee: Georgia Tech Research Corporation
Inventors: Woojay Jeon, Biing-Hwang Juang
-
Patent number: 8290170
Abstract: Speech dereverberation is achieved by accepting an observed signal for initialization (1000) and performing likelihood maximization (2000) which includes Fourier Transforms (4000).
Type: Grant
Filed: May 1, 2006
Date of Patent: October 16, 2012
Assignees: Nippon Telegraph and Telephone Corporation, Georgia Tech Research Corporation
Inventors: Tomohiro Nakatani, Biing-Hwang Juang
-
Patent number: 8135860
Abstract: A content interpolating web proxy server is configured in a computer network for processing retrieved web content so as to place it in a format suitable for presentation on a particular client device such as, e.g., a computer, personal digital assistant (PDA), wireless telephone or voice browser-equipped device. The server processes a client request generated by a client device to determine a particular client type associated with the client device, retrieves web content identified in the client request, retrieves one or more augmentation files associated with the web content and the particular client type, and alters the retrieved web content in accordance with the one or more augmentation files. The altered web content is then delivered to the client device. The one or more augmentation files may be co-located with the web content at a site remote from the proxy server, such that the content owner need not own, maintain or otherwise control the proxy server.
Type: Grant
Filed: July 20, 2000
Date of Patent: March 13, 2012
Assignee: Alcatel Lucent
Inventors: Michael Kenneth Brown, Biing-Hwang Juang
-
Patent number: 8064969
Abstract: The present invention is a desktop speakerphone having a base-station and a detachable microphone pod. The base-station includes standard telephone components, as well as a wireless receiver and a housing for a detachable microphone pod. The detachable pod contains at least one microphone and a wireless transmitter. When the pod is attached to the base-station, and the conference mode of operation is activated, the pod microphone's audio signal goes directly to base-station audio circuitry via a wired connection. When the pod is detached and the conference mode activated, the pod microphone's audio signal now goes via the pod's wireless transmitter to the base-station's wireless receiver. This detached, wireless mode allows the microphone to be positioned anywhere in the room, thereby improving the quality of transmitted speech by increasing the speech-signal-to-room-noise ratio, and lessening the potential for room echo by reducing the acoustic coupling between base-station loudspeaker and pod microphone.
Type: Grant
Filed: August 15, 2003
Date of Patent: November 22, 2011
Assignee: Avaya Inc.
Inventors: Eric J. Diethorn, Gary W. Elko, Biing-Hwang Juang, James E. West
-
Publication number: 20090110207
Abstract: Speech dereverberation is achieved by accepting an observed signal for initialization (1000) and performing likelihood maximization (2000) which includes Fourier Transforms (4000).
Type: Application
Filed: May 1, 2006
Publication date: April 30, 2009
Applicants: NIPPON TELEGRAPH AND TELEPHONE COMPANY, GEORGIA TECH RESEARCH CORPORATION
Inventors: Tomohiro Nakatani, Biing-Hwang Juang
-
Publication number: 20080147402
Abstract: Disclosed are apparatus and methods that employ a modified version of a computational model of the human peripheral and central auditory system, and that provide for automatic pattern recognition using category dependent feature selection. The validity of the output of the model is examined by deriving feature vectors from the dimension expanded cortical response of the central auditory system for use in a conventional phoneme recognition task. In addition, the cortical response may be a place-coded data set where sounds are categorized according to the regions containing their most distinguishing features. This provides for a novel category-dependent feature selection apparatus and methods in which this mechanism may be utilized to better simulate robust human pattern (speech) recognition.
Type: Application
Filed: November 29, 2007
Publication date: June 19, 2008
Inventors: Woojay Jeon, Biing-Hwang Juang
-
Publication number: 20050071168
Abstract: A method and apparatus are provided for authenticating a user using verbal information verification techniques. The user is challenged with one or more questions that the user has previously answered. A user's spoken utterances are first processed using automatic speech recognition techniques, and optionally utterance verification techniques. The recognized text that has been extracted from the user's spoken words is compared with the information recorded in a user profile corresponding to the answers provided by the user during the enrollment phase, using word spotting techniques. If the user's spoken answer is correct, the user may obtain access to a protected resource. If the user's spoken answer provided during verification deviates from the answer that was provided during enrollment, the disclosed verbal input verification server can still correctly recognize the answer.
Type: Application
Filed: September 29, 2003
Publication date: March 31, 2005
Inventors: Biing-Hwang Juang, Padma Ramesh
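The word-spotting comparison described in this abstract can be sketched as a tolerant containment check: the enrolled answer's words are spotted anywhere inside the recognized utterance, so filler words do not cause a rejection. The function name and matching rule below are illustrative assumptions, not the patent's actual algorithm.

```python
def answer_matches(recognized_text: str, enrolled_answer: str) -> bool:
    """Word-spotting-style check (simplified sketch): accept when every word
    of the enrolled answer appears somewhere in the recognized utterance,
    ignoring case and surrounding filler words."""
    recognized = recognized_text.lower().split()
    return all(word in recognized for word in enrolled_answer.lower().split())
```

This is deliberately looser than exact string equality, mirroring the abstract's point that a spoken answer deviating from the enrolled phrasing can still be recognized as correct.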
-
Publication number: 20050037782
Abstract: The present invention is a desktop speakerphone having a base-station and a detachable microphone pod. The base-station includes standard telephone components, as well as a wireless receiver and a housing for a detachable microphone pod. The detachable pod contains at least one microphone and a wireless transmitter. When the pod is attached to the base-station, and the conference mode of operation is activated, the pod microphone's audio signal goes directly to base-station audio circuitry via a wired connection. When the pod is detached and the conference mode activated, the pod microphone's audio signal now goes via the pod's wireless transmitter to the base-station's wireless receiver. This detached, wireless mode allows the microphone to be positioned anywhere in the room, thereby improving the quality of transmitted speech by increasing the speech-signal-to-room-noise ratio, and lessening the potential for room echo by reducing the acoustic coupling between base-station loudspeaker and pod microphone.
Type: Application
Filed: August 15, 2003
Publication date: February 17, 2005
Inventors: Eric Diethorn, Gary Elko, Biing-Hwang Juang, James West
-
Patent number: 6715125
Abstract: A repetitive transmission technique with time diversity which provides improved signal-to-noise ratio (SNR) in the presence of packet loss. Time shifts are introduced between N versions of a particular block of information to be transmitted, and the time-shifted versions are encoded in a set of N encoders and transmitted as N packets. The time shift introduced between a given pair of the N versions corresponds to approximately 1/N of the time duration of a particular one of the versions. The SNR of a composite reconstructed signal generated from the N packets with the introduced time shift in a receiver of the system is approximately the same as would be obtained using a set of N independent encoders to generate the plurality of packets without the introduced time shifts. The gain in the SNR of the composite reconstructed signal attributable to the introduction of the time shifts is 10 log10 N′, where N′=1, . . .
Type: Grant
Filed: October 18, 1999
Date of Patent: March 30, 2004
Assignee: Agere Systems Inc.
Inventor: Biing-Hwang Juang
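The two quantities the abstract states in closed form, the per-copy time offset of roughly 1/N of the block duration and the SNR gain of 10 log10 N′ dB, can be computed directly. The function names and arguments below are illustrative, not from the patent.

```python
import math

def copy_offsets(block_duration: float, n_packets: int) -> list[float]:
    """Start-time offsets for the N time-shifted copies of one block;
    adjacent copies are spaced by ~1/N of the block duration, as the
    abstract describes."""
    return [i * block_duration / n_packets for i in range(n_packets)]

def diversity_snr_gain_db(n_received: int) -> float:
    """SNR gain (dB) of the composite reconstructed signal when n_received
    copies survive, matching the abstract's 10*log10(N') expression."""
    if n_received < 1:
        raise ValueError("need at least one received copy")
    return 10 * math.log10(n_received)
```

For example, four copies of a 60 ms block would start at 0, 15, 30, and 45 ms, and combining two surviving copies yields roughly a 3 dB gain.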
-
Publication number: 20030225719
Abstract: Techniques for fast and robust data object classifier training are described. A process of classifier training creates a set of Gaussian mixture models, one model for each class to which data objects are to be assigned. Initial estimates of model parameters are made using training data. The model parameters are then optimized to maximize an aggregate a posteriori probability that data objects in the set of training data will be correctly classified. Optimization of parameters for each model is performed through the process of a number of iterations in which the closed form solutions are computed for the model parameters of each model, the model performance is tested to determine if the newly computed parameters improve the model performance, and the model is updated with the newly computed parameters if performance has improved. At each new iteration, the parameters computed in the previous iteration are used as initial estimates.
Type: Application
Filed: May 31, 2002
Publication date: December 4, 2003
Applicant: Lucent Technologies, Inc.
Inventors: Biing-Hwang Juang, Qi P. Li
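The core idea of one-Gaussian-model-per-class classification can be sketched in a heavily simplified form. The patent trains full Gaussian mixture models with iterative closed-form updates; the version below fits a single one-dimensional Gaussian per class and classifies by maximum log-likelihood under equal priors. All names and the single-Gaussian simplification are assumptions for illustration.

```python
import math

def fit_gaussians(samples_by_class: dict[str, list[float]]) -> dict[str, tuple[float, float]]:
    """Closed-form maximum-likelihood fit of one (mean, variance) Gaussian
    per class from labeled 1-D training data. The patent uses Gaussian
    *mixtures*; a single Gaussian per class is a simplified sketch."""
    models = {}
    for label, xs in samples_by_class.items():
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        models[label] = (mean, max(var, 1e-9))  # floor variance for stability
    return models

def classify(models: dict[str, tuple[float, float]], x: float) -> str:
    """Assign x to the class whose Gaussian gives the highest log-likelihood
    (equal class priors assumed)."""
    def loglik(mean: float, var: float) -> float:
        return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
    return max(models, key=lambda c: loglik(*models[c]))
```

The patent's iterative scheme would wrap updates like these in a loop, re-testing classification performance each round and keeping a new parameter set only when it improves.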
-
Publication number: 20030171932
Abstract: A method and apparatus for automatically controlling the operation of a speech recognition system without requiring unusual or unnatural activity of the speaker by passively determining if received sound is speech of the user before activating the speech recognition system. A video camera and microphone are located in a hand-held device. The video camera records a video image of the speaker's face, i.e., of speech articulators of the user such as the lips and/or mouth. The recorded characteristics of the articulators are analyzed to identify the sound that the articulators would be expected to make, as in "lip reading". A microphone concurrently records the acoustic properties of received sound proximate the user. The recorded acoustic properties of the received sound are then compared to the characteristics of speech that would be expected to be generated by the recorded speech articulators to determine whether they match.
Type: Application
Filed: March 7, 2002
Publication date: September 11, 2003
Inventors: Biing-Hwang Juang, Jialin Zhong
-
Patent number: 6076053
Abstract: A speech recognition method comprises the steps of using given speech data and the N-best algorithm to generate alternative pronunciations and then merging the obtained pronunciations into a pronunciation networks structure; using additional parameters to characterize a pronunciation network for a particular word; optimizing the parameters of the pronunciation networks using a minimum classification error criterion that maximizes a discrimination between different pronunciation networks; and adapting parameters of the pronunciation networks by, first, adjusting probabilities of the possible pronunciations that may be generated by the pronunciation network for a word claimed to be a true one and, second, correcting the weights for all of the pronunciation networks by using the adjusted probabilities.
Type: Grant
Filed: May 21, 1998
Date of Patent: June 13, 2000
Assignee: Lucent Technologies Inc.
Inventors: Biing-Hwang Juang, Filipp E. Korkmazskiy
-
Patent number: 5812972
Abstract: The present invention provides a speech recognizer that creates and updates the equalization vector as input speech is provided to the recognizer. The present invention includes a speech analyzer which transforms an input speech signal into a series of feature vectors or observation sequence. Each feature vector is then provided to a speech recognizer which modifies the feature vector by subtracting a previously determined equalization vector therefrom. The recognizer then performs segmentation and matches the modified feature vector to a stored model vector which is defined as the segmentation vector. The recognizer then, from time to time, determines a new equalization vector, the new equalization vector being defined based on the difference between one or more input feature vectors and their respective segmentation vectors.
Type: Grant
Filed: December 30, 1994
Date of Patent: September 22, 1998
Assignee: Lucent Technologies Inc.
Inventors: Biing-Hwang Juang, David Mansour, Jay Gordon Wilpon
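The two operations this abstract describes, subtracting the equalization vector from each incoming feature vector and periodically re-estimating it from feature-minus-segmentation-vector differences, can be sketched as follows. The function names and the exponential smoothing factor `alpha` are assumptions for illustration; the patent does not specify this particular update rule.

```python
def equalize(feature: list[float], eq: list[float]) -> list[float]:
    """Subtract the current equalization vector from an input feature vector
    before matching against stored models."""
    return [f - e for f, e in zip(feature, eq)]

def update_equalization(eq: list[float],
                        features: list[list[float]],
                        segmentation_vectors: list[list[float]],
                        alpha: float = 0.9) -> list[float]:
    """Re-estimate the equalization vector from the average difference between
    recent input feature vectors and their matched model (segmentation)
    vectors, smoothed against the previous estimate. alpha is an assumed
    smoothing factor, not a value from the patent."""
    dims = len(eq)
    diffs = [
        sum(f[d] - s[d] for f, s in zip(features, segmentation_vectors)) / len(features)
        for d in range(dims)
    ]
    return [alpha * e + (1 - alpha) * d for e, d in zip(eq, diffs)]
```

Running these on-line, subtract, segment, then periodically update, gives the self-adjusting channel equalization behavior the abstract describes.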
-
Patent number: 5805772
Abstract: Disclosed are systems, methods and articles of manufacture for performing high resolution N-best string hypothesization during speech recognition. A received input signal, representing a speech utterance, is processed utilizing a plurality of recognition models to generate one or more string hypotheses of the received input signal. The plurality of recognition models preferably include one or more inter-word context dependent models and one or more language models. A forward partial path map is produced according to the allophonic specifications of at least one of the inter-word context dependent models and the language models. The forward partial path map is traversed in the backward direction as a function of the allophonic specifications to generate the one or more string hypotheses. One or more of the recognition models may represent one-phone words.
Type: Grant
Filed: December 30, 1994
Date of Patent: September 8, 1998
Assignee: Lucent Technologies Inc.
Inventors: Wu Chou, Biing-Hwang Juang, Chin-Hui Lee, Tatsuo Matsuoka
-
Patent number: 5797123
Abstract: A key-phrase detection and verification method that can be advantageously used to realize understanding of flexible (i.e., unconstrained) speech. A "multiple pass" procedure is applied to a spoken utterance comprising a sequence of words (i.e., a "sentence"). First, a plurality of key-phrases are detected (i.e., recognized) based on a set of phrase sub-grammars which may, for example, be specific to the state of the dialogue. These key-phrases are then verified by assigning confidence measures thereto and comparing these confidence measures to a threshold, resulting in a set of verified key-phrase candidates. Next, the verified key-phrase candidates are connected into sentence hypotheses based upon the confidence measures and predetermined (e.g., task-specific) semantic information. And, finally, one or more of these sentence hypotheses are verified to produce a verified sentence hypothesis and, from that, a resultant understanding of the spoken utterance.
Type: Grant
Filed: December 20, 1996
Date of Patent: August 18, 1998
Assignee: Lucent Technologies Inc.
Inventors: Wu Chou, Biing-Hwang Juang, Tatsuya Kawahara, Chin-Hui Lee
-
Patent number: 5781887
Abstract: A method for revising at least a portion of a sequence of speech data segments recognized by an automated speech recognition system. A user is prompted to vocalize the speech data segments sequentially, one speech data segment at a time. When each speech data segment is recognized it is stored as a data element and a confirmation of recognition is issued to the user. The user may then issue a verbal command to delete the last recognized data element if the confirmation indicates that a recognition error has occurred, and then repeat the last speech data element for a second recognition attempt. The user may also issue another verbal command to delete all thus-far recognized data elements in the sequence and to restart the recognition process from the beginning. If no such verbal commands are issued by the user, then the user may continue to vocalize the next sequential speech data segment.
Type: Grant
Filed: October 9, 1996
Date of Patent: July 14, 1998
Assignee: Lucent Technologies Inc.
Inventor: Biing-Hwang Juang
-
Patent number: 5737489
Abstract: In a speech recognition system, a recognition processor receives an unknown utterance signal as input. The recognition processor in response to the unknown utterance signal input accesses a recognition database and scores the utterance signal against recognition models in the recognition database to classify the unknown utterance and to generate a hypothesis speech signal. A verification processor receives the hypothesis speech signal as input to be verified. The verification processor accesses a verification database to test the hypothesis speech signal against verification models reflecting a preselected type of training stored in the verification database. Based on the verification test, the verification processor generates a confidence measure signal. The confidence measure signal can be compared against a verification threshold to determine the accuracy of the recognition decision made by the recognition processor.
Type: Grant
Filed: September 15, 1995
Date of Patent: April 7, 1998
Assignee: Lucent Technologies Inc.
Inventors: Wu Chou, Biing-Hwang Juang, Chin-Hui Lee, Mazin G. Rahim
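The two-stage decision flow this abstract describes, classify first, then accept the hypothesis only if its confidence measure clears a verification threshold, can be sketched generically. The function names, the dictionary-of-scores interface, and the threshold value in the example are illustrative assumptions.

```python
from typing import Callable

def recognize_and_verify(recognition_scores: dict[str, float],
                         confidence: Callable[[str], float],
                         threshold: float) -> tuple[str, bool]:
    """Two-stage decision sketch: pick the best-scoring recognition
    hypothesis, then accept it only when its confidence measure meets the
    verification threshold; otherwise flag it as rejected."""
    hypothesis = max(recognition_scores, key=lambda h: recognition_scores[h])
    accepted = confidence(hypothesis) >= threshold
    return hypothesis, accepted
```

Separating recognition from verification this way lets the threshold trade off false acceptances against false rejections without retraining the recognizer.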
-
Patent number: 5710864
Abstract: Systems, methods and articles of manufacture are provided for adjusting the parameters of ones of a plurality of recognition models. The recognition models collectively represent a vocabulary. The recognition models are utilized to identify a known word represented within a received input signal. The received input signal may include within vocabulary and out of vocabulary words. An output signal representing a confidence measure corresponding to the relative accuracy of the identity of the known word is generated. Particular ones of the plurality of recognition models are adjusted as a function of the output signal to improve the confidence measure. The systems, methods and articles of manufacture are preferably implemented in accordance with discriminative techniques, and the adjustment process is used during either a preferred training or a recognition mode.
Type: Grant
Filed: December 29, 1994
Date of Patent: January 20, 1998
Assignee: Lucent Technologies Inc.
Inventors: Biing-Hwang Juang, Chin-Hui Lee, Richard Cameron Rose
-
Patent number: 5675704
Abstract: A facility is provided for allowing a caller to place a telephone call by merely uttering a label identifying a desired called destination and to charge the telephone call to a particular billing account by merely uttering a label identifying that account. Alternatively, the caller may place the call by dialing or uttering the telephone number of the called destination or by entering a speed dial code associated with that telephone number. The facility includes a speaker verification system which employs cohort normalized scoring. Cohort normalized scoring provides a dynamic threshold for the verification process making the process more robust to variation in training and verification utterances. Such variation may be caused by, e.g., changes in communication channel characteristics or speaker loudness level.
Type: Grant
Filed: April 26, 1996
Date of Patent: October 7, 1997
Assignee: Lucent Technologies Inc.
Inventors: Biing-Hwang Juang, Chin-Hui Lee, Aaron Edward Rosenberg, Frank Kao-Ping Soong
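Cohort normalized scoring, as named in this abstract, subtracts a statistic of the scores of a "cohort" of similar speakers from the claimed speaker's score, so that channel or loudness shifts that raise or lower all scores cancel out. The sketch below uses the cohort mean and a fixed decision threshold; both the mean (rather than, say, the max) and the threshold value are assumptions for illustration.

```python
def cohort_normalized_score(target_score: float,
                            cohort_scores: list[float]) -> float:
    """Normalize the claimed speaker's log-likelihood score by the mean score
    of a cohort of similar speakers, giving a decision statistic that is
    robust to score shifts affecting all models equally."""
    return target_score - sum(cohort_scores) / len(cohort_scores)

def verify_speaker(target_score: float,
                   cohort_scores: list[float],
                   threshold: float = 0.0) -> bool:
    """Accept the identity claim when the normalized score exceeds the
    threshold. The threshold value here is an assumed example."""
    return cohort_normalized_score(target_score, cohort_scores) > threshold
```

Because the cohort scores move with the channel, the effective acceptance criterion adapts per utterance, which is the "dynamic threshold" behavior the abstract credits for robustness.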
-
Patent number: 5606644
Abstract: A method of making a speech recognition model database is disclosed. The database is formed based on a training string utterance signal and a plurality of sets of current speech recognition models. The sets of current speech recognition models may include acoustic models, language models, and other knowledge sources. In accordance with an illustrative embodiment of the invention, a set of confusable string models is generated, each confusable string model comprising speech recognition models from two or more sets of speech recognition models (such as acoustic and language models). A first scoring signal is generated based on the training string utterance signal and a string model for that utterance, wherein the string model for the utterance comprises speech recognition models from two or more sets of speech recognition models. One or more second scoring signals are also generated, wherein a second scoring signal is based on the training string utterance signal and a confusable string model.
Type: Grant
Filed: April 26, 1996
Date of Patent: February 25, 1997
Assignee: Lucent Technologies Inc.
Inventors: Wu Chou, Biing-Hwang Juang, Chin-Hui Lee