Patents by Inventor Biing-Hwang Juang
Biing-Hwang Juang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 8380506
Abstract: Disclosed are apparatus and methods that employ a modified version of a computational model of the human peripheral and central auditory system, and that provide for automatic pattern recognition using category dependent feature selection. The validity of the output of the model is examined by deriving feature vectors from the dimension expanded cortical response of the central auditory system for use in a conventional phoneme recognition task. In addition, the cortical response may be a place-coded data set where sounds are categorized according to the regions containing their most distinguishing features. This provides for a novel category-dependent feature selection apparatus and methods in which this mechanism may be utilized to better simulate robust human pattern (speech) recognition.
Type: Grant
Filed: November 29, 2007
Date of Patent: February 19, 2013
Assignee: Georgia Tech Research Corporation
Inventors: Woojay Jeon, Biing-Hwang Juang
-
Patent number: 8290170
Abstract: Speech dereverberation is achieved by accepting an observed signal for initialization (1000) and performing likelihood maximization (2000) which includes Fourier Transforms (4000).
Type: Grant
Filed: May 1, 2006
Date of Patent: October 16, 2012
Assignees: Nippon Telegraph and Telephone Corporation, Georgia Tech Research Corporation
Inventors: Tomohiro Nakatani, Biing-Hwang Juang
-
Patent number: 8135860
Abstract: A content interpolating web proxy server is configured in a computer network for processing retrieved web content so as to place it in a format suitable for presentation on a particular client device such as, e.g., a computer, personal digital assistant (PDA), wireless telephone or voice browser-equipped device. The server processes a client request generated by a client device to determine a particular client type associated with the client device, retrieves web content identified in the client request, retrieves one or more augmentation files associated with the web content and the particular client type, and alters the retrieved web content in accordance with the one or more augmentation files. The altered web content is then delivered to the client device. The one or more augmentation files may be co-located with the web content at a site remote from the proxy server, such that the content owner need not own, maintain or otherwise control the proxy server.
Type: Grant
Filed: July 20, 2000
Date of Patent: March 13, 2012
Assignee: Alcatel Lucent
Inventors: Michael Kenneth Brown, Biing-Hwang Juang
-
Patent number: 8064969
Abstract: The present invention is a desktop speakerphone having a base-station and a detachable microphone pod. The base-station includes standard telephone components, as well as a wireless receiver and a housing for a detachable microphone pod. The detachable pod contains at least one microphone and a wireless transmitter. When the pod is attached to the base-station, and the conference mode of operation is activated, the pod microphone's audio signal goes directly to base-station audio circuitry via a wired connection. When the pod is detached and the conference mode activated, the pod microphone's audio signal now goes via the pod's wireless transmitter to the base-station's wireless receiver. This detached, wireless mode allows the microphone to be positioned anywhere in the room, thereby improving the quality of transmitted speech by increasing the speech-signal-to-room-noise ratio, and lessening the potential for room echo by reducing the acoustic coupling between base-station loudspeaker and pod microphone.
Type: Grant
Filed: August 15, 2003
Date of Patent: November 22, 2011
Assignee: Avaya Inc.
Inventors: Eric J. Diethorn, Gary W. Elko, Biing-Hwang Juang, James E. West
-
Publication number: 20090110207
Abstract: Speech dereverberation is achieved by accepting an observed signal for initialization (1000) and performing likelihood maximization (2000) which includes Fourier Transforms (4000).
Type: Application
Filed: May 1, 2006
Publication date: April 30, 2009
Applicants: NIPPON TELEGRAPH AND TELEPHONE COMPANY, GEORGIA TECH RESEARCH CORPORATION
Inventors: Tomohiro Nakatani, Biing-Hwang Juang
-
Publication number: 20080147402
Abstract: Disclosed are apparatus and methods that employ a modified version of a computational model of the human peripheral and central auditory system, and that provide for automatic pattern recognition using category dependent feature selection. The validity of the output of the model is examined by deriving feature vectors from the dimension expanded cortical response of the central auditory system for use in a conventional phoneme recognition task. In addition, the cortical response may be a place-coded data set where sounds are categorized according to the regions containing their most distinguishing features. This provides for a novel category-dependent feature selection apparatus and methods in which this mechanism may be utilized to better simulate robust human pattern (speech) recognition.
Type: Application
Filed: November 29, 2007
Publication date: June 19, 2008
Inventors: Woojay Jeon, Biing-Hwang Juang
-
Publication number: 20050071168
Abstract: A method and apparatus are provided for authenticating a user using verbal information verification techniques. The user is challenged with one or more questions that the user has previously answered. A user's spoken utterances are first processed using automatic speech recognition techniques, and optionally utterance verification techniques. The recognized text that has been extracted from the user's spoken words is compared with the information recorded in a user profile corresponding to the answers provided by the user during the enrollment phase, using word spotting techniques. If the user's spoken answer is correct, the user may obtain access to a protected resource. If the user's spoken answer provided during verification deviates from the answer that was provided during enrollment, the disclosed verbal input verification server can still correctly recognize the answer.
Type: Application
Filed: September 29, 2003
Publication date: March 31, 2005
Inventors: Biing-Hwang Juang, Padma Ramesh
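The word-spotting comparison described in this abstract can be sketched as a tolerant containment check: the enrolled answer's words are spotted anywhere inside the recognized utterance, so filler words do not cause a rejection. The function name and matching rule below are illustrative assumptions, not the patent's actual algorithm.

```python
def answer_matches(recognized_text: str, enrolled_answer: str) -> bool:
    """Word-spotting-style check (simplified sketch): accept when every word
    of the enrolled answer appears somewhere in the recognized utterance,
    ignoring case and surrounding filler words."""
    recognized = recognized_text.lower().split()
    return all(word in recognized for word in enrolled_answer.lower().split())
```

This is deliberately looser than exact string equality, mirroring the abstract's point that a spoken answer deviating from the enrolled phrasing can still be recognized as correct.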
-
Publication number: 20050037782
Abstract: The present invention is a desktop speakerphone having a base-station and a detachable microphone pod. The base-station includes standard telephone components, as well as a wireless receiver and a housing for a detachable microphone pod. The detachable pod contains at least one microphone and a wireless transmitter. When the pod is attached to the base-station, and the conference mode of operation is activated, the pod microphone's audio signal goes directly to base-station audio circuitry via a wired connection. When the pod is detached and the conference mode activated, the pod microphone's audio signal now goes via the pod's wireless transmitter to the base-station's wireless receiver. This detached, wireless mode allows the microphone to be positioned anywhere in the room, thereby improving the quality of transmitted speech by increasing the speech-signal-to-room-noise ratio, and lessening the potential for room echo by reducing the acoustic coupling between base-station loudspeaker and pod microphone.
Type: Application
Filed: August 15, 2003
Publication date: February 17, 2005
Inventors: Eric Diethorn, Gary Elko, Biing-Hwang Juang, James West
-
Patent number: 6715125
Abstract: A repetitive transmission technique with time diversity which provides improved signal-to-noise ratio (SNR) in the presence of packet loss. Time shifts are introduced between N versions of a particular block of information to be transmitted, and the time-shifted versions are encoded in a set of N encoders and transmitted as N packets. The time shift introduced between a given pair of the N versions corresponds to approximately 1/N of the time duration of a particular one of the versions. The SNR of a composite reconstructed signal generated from the N packets with the introduced time shift in a receiver of the system is approximately the same as would be obtained using a set of N independent encoders to generate the plurality of packets without the introduced time shifts. The gain in the SNR of the composite reconstructed signal attributable to the introduction of the time shifts is 10 log10 N′, where N′=1, . . .
Type: Grant
Filed: October 18, 1999
Date of Patent: March 30, 2004
Assignee: Agere Systems Inc.
Inventor: Biing-Hwang Juang
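The two quantities the abstract states in closed form, the per-copy time offset of roughly 1/N of the block duration and the SNR gain of 10 log10 N′ dB, can be computed directly. The function names and arguments below are illustrative, not from the patent.

```python
import math

def copy_offsets(block_duration: float, n_packets: int) -> list[float]:
    """Start-time offsets for the N time-shifted copies of one block;
    adjacent copies are spaced by ~1/N of the block duration, as the
    abstract describes."""
    return [i * block_duration / n_packets for i in range(n_packets)]

def diversity_snr_gain_db(n_received: int) -> float:
    """SNR gain (dB) of the composite reconstructed signal when n_received
    copies survive, matching the abstract's 10*log10(N') expression."""
    if n_received < 1:
        raise ValueError("need at least one received copy")
    return 10 * math.log10(n_received)
```

For example, four copies of a 60 ms block would start at 0, 15, 30, and 45 ms, and combining two surviving copies yields roughly a 3 dB gain.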
-
Publication number: 20030225719
Abstract: Techniques for fast and robust data object classifier training are described. A process of classifier training creates a set of Gaussian mixture models, one model for each class to which data objects are to be assigned. Initial estimates of model parameters are made using training data. The model parameters are then optimized to maximize an aggregate a posteriori probability that data objects in the set of training data will be correctly classified. Optimization of parameters for each model is performed through the process of a number of iterations in which the closed form solutions are computed for the model parameters of each model, the model performance is tested to determine if the newly computed parameters improve the model performance, and the model is updated with the newly computed parameters if performance has improved. At each new iteration, the parameters computed in the previous iteration are used as initial estimates.
Type: Application
Filed: May 31, 2002
Publication date: December 4, 2003
Applicant: Lucent Technologies, Inc.
Inventors: Biing-Hwang Juang, Qi P. Li
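The core idea of one-Gaussian-model-per-class classification can be sketched in a heavily simplified form. The patent trains full Gaussian mixture models with iterative closed-form updates; the version below fits a single one-dimensional Gaussian per class and classifies by maximum log-likelihood under equal priors. All names and the single-Gaussian simplification are assumptions for illustration.

```python
import math

def fit_gaussians(samples_by_class: dict[str, list[float]]) -> dict[str, tuple[float, float]]:
    """Closed-form maximum-likelihood fit of one (mean, variance) Gaussian
    per class from labeled 1-D training data. The patent uses Gaussian
    *mixtures*; a single Gaussian per class is a simplified sketch."""
    models = {}
    for label, xs in samples_by_class.items():
        mean = sum(xs) / len(xs)
        var = sum((x - mean) ** 2 for x in xs) / len(xs)
        models[label] = (mean, max(var, 1e-9))  # floor variance for stability
    return models

def classify(models: dict[str, tuple[float, float]], x: float) -> str:
    """Assign x to the class whose Gaussian gives the highest log-likelihood
    (equal class priors assumed)."""
    def loglik(mean: float, var: float) -> float:
        return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)
    return max(models, key=lambda c: loglik(*models[c]))
```

The patent's iterative scheme would wrap updates like these in a loop, re-testing classification performance each round and keeping a new parameter set only when it improves.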
-
Publication number: 20030171932
Abstract: A method and apparatus for automatically controlling the operation of a speech recognition system without requiring unusual or unnatural activity of the speaker by passively determining if received sound is speech of the user before activating the speech recognition system. A video camera and microphone are located in a hand-held device. The video camera records a video image of the speaker's face, i.e., of speech articulators of the user such as the lips and/or mouth. The recorded characteristics of the articulators are analyzed to identify the sound that the articulators would be expected to make, as in "lip reading". A microphone concurrently records the acoustic properties of received sound proximate the user. The recorded acoustic properties of the received sound are then compared to the characteristics of speech that would be expected to be generated by the recorded speech articulators to determine whether they match.
Type: Application
Filed: March 7, 2002
Publication date: September 11, 2003
Inventors: Biing-Hwang Juang, Jialin Zhong
-
Patent number: 6076053
Abstract: A speech recognition method comprises the steps of using given speech data and the N-best algorithm to generate alternative pronunciations and then merging the obtained pronunciations into a pronunciation networks structure; using additional parameters to characterize a pronunciation network for a particular word; optimizing the parameters of the pronunciation networks using a minimum classification error criterion that maximizes a discrimination between different pronunciation networks; and adapting parameters of the pronunciation networks by, first, adjusting probabilities of the possible pronunciations that may be generated by the pronunciation network for a word claimed to be a true one and, second, correcting the weights for all of the pronunciation networks by using the adjusted probabilities.
Type: Grant
Filed: May 21, 1998
Date of Patent: June 13, 2000
Assignee: Lucent Technologies Inc.
Inventors: Biing-Hwang Juang, Filipp E. Korkmazskiy
-
Patent number: 5812972
Abstract: The present invention provides a speech recognizer that creates and updates the equalization vector as input speech is provided to the recognizer. The present invention includes a speech analyzer which transforms an input speech signal into a series of feature vectors or observation sequence. Each feature vector is then provided to a speech recognizer which modifies the feature vector by subtracting a previously determined equalization vector therefrom. The recognizer then performs segmentation and matches the modified feature vector to a stored model vector which is defined as the segmentation vector. The recognizer then, from time to time, determines a new equalization vector, the new equalization vector being defined based on the difference between one or more input feature vectors and their respective segmentation vectors.
Type: Grant
Filed: December 30, 1994
Date of Patent: September 22, 1998
Assignee: Lucent Technologies Inc.
Inventors: Biing-Hwang Juang, David Mansour, Jay Gordon Wilpon
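The two operations this abstract describes, subtracting the equalization vector from each incoming feature vector and periodically re-estimating it from feature-minus-segmentation-vector differences, can be sketched as follows. The function names and the exponential smoothing factor `alpha` are assumptions for illustration; the patent does not specify this particular update rule.

```python
def equalize(feature: list[float], eq: list[float]) -> list[float]:
    """Subtract the current equalization vector from an input feature vector
    before matching against stored models."""
    return [f - e for f, e in zip(feature, eq)]

def update_equalization(eq: list[float],
                        features: list[list[float]],
                        segmentation_vectors: list[list[float]],
                        alpha: float = 0.9) -> list[float]:
    """Re-estimate the equalization vector from the average difference between
    recent input feature vectors and their matched model (segmentation)
    vectors, smoothed against the previous estimate. alpha is an assumed
    smoothing factor, not a value from the patent."""
    dims = len(eq)
    diffs = [
        sum(f[d] - s[d] for f, s in zip(features, segmentation_vectors)) / len(features)
        for d in range(dims)
    ]
    return [alpha * e + (1 - alpha) * d for e, d in zip(eq, diffs)]
```

Running these on-line, subtract, segment, then periodically update, gives the self-adjusting channel equalization behavior the abstract describes.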
-
Patent number: 5805772
Abstract: Disclosed are systems, methods and articles of manufacture for performing high resolution N-best string hypothesization during speech recognition. A received input signal, representing a speech utterance, is processed utilizing a plurality of recognition models to generate one or more string hypotheses of the received input signal. The plurality of recognition models preferably include one or more inter-word context dependent models and one or more language models. A forward partial path map is produced according to the allophonic specifications of at least one of the inter-word context dependent models and the language models. The forward partial path map is traversed in the backward direction as a function of the allophonic specifications to generate the one or more string hypotheses. One or more of the recognition models may represent one-phone words.
Type: Grant
Filed: December 30, 1994
Date of Patent: September 8, 1998
Assignee: Lucent Technologies Inc.
Inventors: Wu Chou, Biing-Hwang Juang, Chin-Hui Lee, Tatsuo Matsuoka
-
Patent number: 5797123
Abstract: A key-phrase detection and verification method that can be advantageously used to realize understanding of flexible (i.e., unconstrained) speech. A "multiple pass" procedure is applied to a spoken utterance comprising a sequence of words (i.e., a "sentence"). First, a plurality of key-phrases are detected (i.e., recognized) based on a set of phrase sub-grammars which may, for example, be specific to the state of the dialogue. These key-phrases are then verified by assigning confidence measures thereto and comparing these confidence measures to a threshold, resulting in a set of verified key-phrase candidates. Next, the verified key-phrase candidates are connected into sentence hypotheses based upon the confidence measures and predetermined (e.g., task-specific) semantic information. And, finally, one or more of these sentence hypotheses are verified to produce a verified sentence hypothesis and, from that, a resultant understanding of the spoken utterance.
Type: Grant
Filed: December 20, 1996
Date of Patent: August 18, 1998
Assignee: Lucent Technologies Inc.
Inventors: Wu Chou, Biing-Hwang Juang, Tatsuya Kawahara, Chin-Hui Lee
-
Patent number: 5781887
Abstract: A method for revising at least a portion of a sequence of speech data segments recognized by an automated speech recognition system. A user is prompted to vocalize the speech data segments sequentially, one speech data segment at a time. When each speech data segment is recognized it is stored as a data element and a confirmation of recognition is issued to the user. The user may then issue a verbal command to delete the last recognized data element if the confirmation indicates that a recognition error has occurred, and then repeat the last speech data element for a second recognition attempt. The user may also issue another verbal command to delete all thus-far recognized data elements in the sequence and to restart the recognition process from the beginning. If no such verbal commands are issued by the user, then the user may continue to vocalize the next sequential speech data segment.
Type: Grant
Filed: October 9, 1996
Date of Patent: July 14, 1998
Assignee: Lucent Technologies Inc.
Inventor: Biing-Hwang Juang
-
Patent number: 5737489
Abstract: In a speech recognition system, a recognition processor receives an unknown utterance signal as input. The recognition processor in response to the unknown utterance signal input accesses a recognition database and scores the utterance signal against recognition models in the recognition database to classify the unknown utterance and to generate a hypothesis speech signal. A verification processor receives the hypothesis speech signal as input to be verified. The verification processor accesses a verification database to test the hypothesis speech signal against verification models reflecting a preselected type of training stored in the verification database. Based on the verification test, the verification processor generates a confidence measure signal. The confidence measure signal can be compared against a verification threshold to determine the accuracy of the recognition decision made by the recognition processor.
Type: Grant
Filed: September 15, 1995
Date of Patent: April 7, 1998
Assignee: Lucent Technologies Inc.
Inventors: Wu Chou, Biing-Hwang Juang, Chin-Hui Lee, Mazin G. Rahim
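The two-stage decision flow this abstract describes, classify first, then accept the hypothesis only if its confidence measure clears a verification threshold, can be sketched generically. The function names, the dictionary-of-scores interface, and the threshold value in the example are illustrative assumptions.

```python
from typing import Callable

def recognize_and_verify(recognition_scores: dict[str, float],
                         confidence: Callable[[str], float],
                         threshold: float) -> tuple[str, bool]:
    """Two-stage decision sketch: pick the best-scoring recognition
    hypothesis, then accept it only when its confidence measure meets the
    verification threshold; otherwise flag it as rejected."""
    hypothesis = max(recognition_scores, key=lambda h: recognition_scores[h])
    accepted = confidence(hypothesis) >= threshold
    return hypothesis, accepted
```

Separating recognition from verification this way lets the threshold trade off false acceptances against false rejections without retraining the recognizer.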
-
Patent number: 5710864
Abstract: Systems, methods and articles of manufacture are provided for adjusting the parameters of ones of a plurality of recognition models. The recognition models collectively represent a vocabulary. The recognition models are utilized to identify a known word represented within a received input signal. The received input signal may include within vocabulary and out of vocabulary words. An output signal representing a confidence measure corresponding to the relative accuracy of the identity of the known word is generated. Particular ones of the plurality of recognition models are adjusted as a function of the output signal to improve the confidence measure. The systems, methods and articles of manufacture are preferably implemented in accordance with discriminative techniques, and the adjustment process is used during either a preferred training or a recognition mode.
Type: Grant
Filed: December 29, 1994
Date of Patent: January 20, 1998
Assignee: Lucent Technologies Inc.
Inventors: Biing-Hwang Juang, Chin-Hui Lee, Richard Cameron Rose
-
Patent number: 5675704
Abstract: A facility is provided for allowing a caller to place a telephone call by merely uttering a label identifying a desired called destination and to charge the telephone call to a particular billing account by merely uttering a label identifying that account. Alternatively, the caller may place the call by dialing or uttering the telephone number of the called destination or by entering a speed dial code associated with that telephone number. The facility includes a speaker verification system which employs cohort normalized scoring. Cohort normalized scoring provides a dynamic threshold for the verification process making the process more robust to variation in training and verification utterances. Such variation may be caused by, e.g., changes in communication channel characteristics or speaker loudness level.
Type: Grant
Filed: April 26, 1996
Date of Patent: October 7, 1997
Assignee: Lucent Technologies Inc.
Inventors: Biing-Hwang Juang, Chin-Hui Lee, Aaron Edward Rosenberg, Frank Kao-Ping Soong
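Cohort normalized scoring, as named in this abstract, subtracts a statistic of the scores of a "cohort" of similar speakers from the claimed speaker's score, so that channel or loudness shifts that raise or lower all scores cancel out. The sketch below uses the cohort mean and a fixed decision threshold; both the mean (rather than, say, the max) and the threshold value are assumptions for illustration.

```python
def cohort_normalized_score(target_score: float,
                            cohort_scores: list[float]) -> float:
    """Normalize the claimed speaker's log-likelihood score by the mean score
    of a cohort of similar speakers, giving a decision statistic that is
    robust to score shifts affecting all models equally."""
    return target_score - sum(cohort_scores) / len(cohort_scores)

def verify_speaker(target_score: float,
                   cohort_scores: list[float],
                   threshold: float = 0.0) -> bool:
    """Accept the identity claim when the normalized score exceeds the
    threshold. The threshold value here is an assumed example."""
    return cohort_normalized_score(target_score, cohort_scores) > threshold
```

Because the cohort scores move with the channel, the effective acceptance criterion adapts per utterance, which is the "dynamic threshold" behavior the abstract credits for robustness.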
-
Patent number: 5606644
Abstract: A method of making a speech recognition model database is disclosed. The database is formed based on a training string utterance signal and a plurality of sets of current speech recognition models. The sets of current speech recognition models may include acoustic models, language models, and other knowledge sources. In accordance with an illustrative embodiment of the invention, a set of confusable string models is generated, each confusable string model comprising speech recognition models from two or more sets of speech recognition models (such as acoustic and language models). A first scoring signal is generated based on the training string utterance signal and a string model for that utterance, wherein the string model for the utterance comprises speech recognition models from two or more sets of speech recognition models. One or more second scoring signals are also generated, wherein a second scoring signal is based on the training string utterance signal and a confusable string model.
Type: Grant
Filed: April 26, 1996
Date of Patent: February 25, 1997
Assignee: Lucent Technologies Inc.
Inventors: Wu Chou, Biing-Hwang Juang, Chin-Hui Lee