Patents by Inventor Biing-Hwang Juang

Biing-Hwang Juang has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 8380506
    Abstract: Disclosed are apparatus and methods that employ a modified version of a computational model of the human peripheral and central auditory system, and that provide for automatic pattern recognition using category dependent feature selection. The validity of the output of the model is examined by deriving feature vectors from the dimension expanded cortical response of the central auditory system for use in a conventional phoneme recognition task. In addition, the cortical response may be a place-coded data set where sounds are categorized according to the regions containing their most distinguishing features. This provides for a novel category-dependent feature selection apparatus and methods in which this mechanism may be utilized to better simulate robust human pattern (speech) recognition.
    Type: Grant
    Filed: November 29, 2007
    Date of Patent: February 19, 2013
    Assignee: Georgia Tech Research Corporation
    Inventors: Woojay Jeon, Biing-Hwang Juang
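The category-dependent selection idea above can be illustrated with a toy sketch: rank feature dimensions by a Fisher-style discrimination ratio and keep the most distinguishing ones per category pair. The data, dimension counts, and ranking criterion here are illustrative assumptions, not the patent's cortical-response model:

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy "cortical response" features: the two categories differ only in a few dimensions.
X0 = rng.standard_normal((100, 10)); X0[:, 2] += 2.0   # category 0's distinctive region
X1 = rng.standard_normal((100, 10)); X1[:, 7] += 2.0   # category 1's distinctive region

def distinguishing_dims(Xa, Xb, k=2):
    # Fisher-style ratio: squared between-class mean gap over pooled variance, per dimension.
    ratio = (Xa.mean(0) - Xb.mean(0)) ** 2 / (Xa.var(0) + Xb.var(0))
    return np.argsort(ratio)[-k:]          # indices of the k most distinguishing dimensions

dims = distinguishing_dims(X0, X1)
print("most distinguishing dimensions:", sorted(int(d) for d in dims))
```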
  • Patent number: 8290170
    Abstract: Speech dereverberation is achieved by accepting an observed signal for initialization (1000) and performing likelihood maximization (2000), which includes Fourier transforms (4000).
    Type: Grant
    Filed: May 1, 2006
    Date of Patent: October 16, 2012
    Assignees: Nippon Telegraph and Telephone Corporation, Georgia Tech Research Corporation
    Inventors: Tomohiro Nakatani, Biing-Hwang Juang
  • Patent number: 8135860
    Abstract: A content interpolating web proxy server is configured in a computer network for processing retrieved web content so as to place it in a format suitable for presentation on a particular client device such as, e.g., a computer, personal digital assistant (PDA), wireless telephone or voice browser-equipped device. The server processes a client request generated by a client device to determine a particular client type associated with the client device, retrieves web content identified in the client request, retrieves one or more augmentation files associated with the web content and the particular client type, and alters the retrieved web content in accordance with the one or more augmentation files. The altered web content is then delivered to the client device. The one or more augmentation files may be co-located with the web content at a site remote from the proxy server, such that the content owner need not own, maintain or otherwise control the proxy server.
    Type: Grant
    Filed: July 20, 2000
    Date of Patent: March 13, 2012
    Assignee: Alcatel Lucent
    Inventors: Michael Kenneth Brown, Biing-Hwang Juang
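As a rough sketch of the augmentation step described above, the proxy can look up an augmentation rule keyed by client type and apply it to the retrieved content before delivery. The `interpolate` function, the `wrap` template, and the client-type names are all hypothetical; the patent's augmentation files are far more general:

```python
def interpolate(content: str, client_type: str, augmentations: dict) -> str:
    """Apply the augmentation rule for this client type to retrieved web content."""
    aug = augmentations.get(client_type)
    if aug is None:
        return content                             # no augmentation: deliver as retrieved
    return aug["wrap"].format(body=content)        # reformat for the target device

# Hypothetical augmentation table: wrap content in WML for a WAP handset.
augmentations = {"wap-phone": {"wrap": "<wml><card>{body}</card></wml>"}}
print(interpolate("<p>Hello</p>", "wap-phone", augmentations))
```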
  • Patent number: 8064969
    Abstract: The present invention is a desktop speakerphone having a base-station and a detachable microphone pod. The base-station includes standard telephone components, as well as a wireless receiver and a housing for a detachable microphone pod. The detachable pod contains at least one microphone and a wireless transmitter. When the pod is attached to the base-station, and the conference mode of operation is activated, the pod microphone's audio signal goes directly to base-station audio circuitry via a wired connection. When the pod is detached and the conference mode activated, the pod microphone's audio signal now goes via the pod's wireless transmitter to the base-station's wireless receiver. This detached, wireless mode allows the microphone to be positioned anywhere in the room, thereby improving the quality of transmitted speech by increasing the speech-signal-to-room-noise ratio, and lessening the potential for room echo by reducing the acoustic coupling between base-station loudspeaker and pod microphone.
    Type: Grant
    Filed: August 15, 2003
    Date of Patent: November 22, 2011
    Assignee: Avaya Inc.
    Inventors: Eric J. Diethorn, Gary W. Elko, Biing-Hwang Juang, James E. West
  • Publication number: 20090110207
    Abstract: Speech dereverberation is achieved by accepting an observed signal for initialization (1000) and performing likelihood maximization (2000), which includes Fourier transforms (4000).
    Type: Application
    Filed: May 1, 2006
    Publication date: April 30, 2009
    Applicants: Nippon Telegraph and Telephone Corporation, Georgia Tech Research Corporation
    Inventors: Tomohiro Nakatani, Biing-Hwang Juang
  • Publication number: 20080147402
    Abstract: Disclosed are apparatus and methods that employ a modified version of a computational model of the human peripheral and central auditory system, and that provide for automatic pattern recognition using category dependent feature selection. The validity of the output of the model is examined by deriving feature vectors from the dimension expanded cortical response of the central auditory system for use in a conventional phoneme recognition task. In addition, the cortical response may be a place-coded data set where sounds are categorized according to the regions containing their most distinguishing features. This provides for a novel category-dependent feature selection apparatus and methods in which this mechanism may be utilized to better simulate robust human pattern (speech) recognition.
    Type: Application
    Filed: November 29, 2007
    Publication date: June 19, 2008
    Inventors: Woojay Jeon, Biing-Hwang Juang
  • Publication number: 20050071168
    Abstract: A method and apparatus are provided for authenticating a user using verbal information verification techniques. The user is challenged with one or more questions that the user has previously answered. A user's spoken utterances are first processed using automatic speech recognition techniques, and optionally utterance verification techniques. The recognized text that has been extracted from the user's spoken words is compared with the information recorded in a user profile corresponding to the answers provided by the user during the enrollment phase, using word spotting techniques. If the user's spoken answer is correct, the user may obtain access to a protected resource. If the user's spoken answer provided during verification deviates from the answer that was provided during enrollment, the disclosed verbal input verification server can still correctly recognize the answer.
    Type: Application
    Filed: September 29, 2003
    Publication date: March 31, 2005
    Inventors: Biing-Hwang Juang, Padma Ramesh
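The comparison step above amounts to word spotting: checking whether the enrolled answer's words occur somewhere in the recognized reply, so filler words do not cause rejection. A minimal sketch, with a simplified exact-word match standing in for real word-spotting techniques:

```python
def verify_answer(recognized_text: str, enrolled_answer: str) -> bool:
    """Word-spotting stand-in: do all enrolled answer words appear in the reply?"""
    words = recognized_text.lower().split()
    return all(w in words for w in enrolled_answer.lower().split())

# Filler words around the enrolled answer "new york" do not cause rejection.
print(verify_answer("um i was born in new york city", "new york"))   # True
```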
  • Publication number: 20050037782
    Abstract: The present invention is a desktop speakerphone having a base-station and a detachable microphone pod. The base-station includes standard telephone components, as well as a wireless receiver and a housing for a detachable microphone pod. The detachable pod contains at least one microphone and a wireless transmitter. When the pod is attached to the base-station, and the conference mode of operation is activated, the pod microphone's audio signal goes directly to base-station audio circuitry via a wired connection. When the pod is detached and the conference mode activated, the pod microphone's audio signal now goes via the pod's wireless transmitter to the base-station's wireless receiver. This detached, wireless mode allows the microphone to be positioned anywhere in the room, thereby improving the quality of transmitted speech by increasing the speech-signal-to-room-noise ratio, and lessening the potential for room echo by reducing the acoustic coupling between base-station loudspeaker and pod microphone.
    Type: Application
    Filed: August 15, 2003
    Publication date: February 17, 2005
    Inventors: Eric Diethorn, Gary Elko, Biing-Hwang Juang, James West
  • Patent number: 6715125
    Abstract: A repetitive transmission technique with time diversity which provides improved signal-to-noise ratio (SNR) in the presence of packet loss. Time shifts are introduced between N versions of a particular block of information to be transmitted, and the time-shifted versions are encoded in a set of N encoders and transmitted as N packets. The time shift introduced between a given pair of the N versions corresponds to approximately 1/N of the time duration of a particular one of the versions. The SNR of a composite reconstructed signal generated from the N packets with the introduced time shift in a receiver of the system is approximately the same as would be obtained using a set of N independent encoders to generate the plurality of packets without the introduced time shifts. The gain in the SNR of the composite reconstructed signal attributable to the introduction of the time shifts is 10 log10N′, where N′=1, . . .
    Type: Grant
    Filed: October 18, 1999
    Date of Patent: March 30, 2004
    Assignee: Agere Systems Inc.
    Inventor: Biing-Hwang Juang
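The cited gain of 10 log10 N′ dB is the classic averaging gain from combining N noisy copies of the same block. The following toy simulation illustrates that figure by averaging N independently corrupted packets; it is not the patented time-shifting scheme itself:

```python
import numpy as np

rng = np.random.default_rng(0)
n_versions = 4                                           # N packets per block
block = np.sin(2 * np.pi * 5 * np.linspace(0, 1, 1000))  # clean source block

# Each packet carries the same block corrupted by independent channel noise.
packets = [block + 0.5 * rng.standard_normal(block.size) for _ in range(n_versions)]

def snr_db(clean, noisy):
    noise = noisy - clean
    return 10 * np.log10(np.sum(clean ** 2) / np.sum(noise ** 2))

single = snr_db(block, packets[0])                 # one packet alone
composite = snr_db(block, np.mean(packets, axis=0))  # composite reconstruction
print(f"single-packet SNR: {single:.1f} dB")
print(f"composite SNR:     {composite:.1f} dB")
print(f"expected gain 10*log10(N) = {10 * np.log10(n_versions):.1f} dB")
```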
  • Publication number: 20030225719
    Abstract: Techniques for fast and robust data object classifier training are described. A process of classifier training creates a set of Gaussian mixture models, one model for each class to which data objects are to be assigned. Initial estimates of model parameters are made using training data. The model parameters are then optimized to maximize an aggregate a posteriori probability that data objects in the set of training data will be correctly classified. Parameters for each model are optimized over a number of iterations: closed-form solutions are computed for the model parameters, model performance is tested to determine whether the newly computed parameters improve it, and the model is updated with the newly computed parameters if performance has improved. At each new iteration, the parameters computed in the previous iteration are used as initial estimates.
    Type: Application
    Filed: May 31, 2002
    Publication date: December 4, 2003
    Applicant: Lucent Technologies, Inc.
    Inventors: Biing-Hwang Juang, Qi P. Li
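The training scheme above can be approximated with off-the-shelf tools: fit one Gaussian mixture model per class, then assign each object to the class whose model scores it highest. This sketch uses plain maximum-likelihood fitting via scikit-learn's `GaussianMixture`, not the closed-form aggregate a posteriori optimization the abstract describes, and the two-class toy data is invented:

```python
import numpy as np
from sklearn.mixture import GaussianMixture

rng = np.random.default_rng(0)
# Toy two-class training data (the patent's data objects would be feature vectors).
X0 = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
X1 = rng.normal(loc=3.0, scale=1.0, size=(200, 2))
X, y = np.vstack([X0, X1]), np.array([0] * 200 + [1] * 200)

# One Gaussian mixture model per class, as in the abstract.
models = {c: GaussianMixture(n_components=2, random_state=0).fit(X[y == c])
          for c in (0, 1)}

def classify(x):
    # Assign the object to the class whose model gives the highest log-likelihood.
    scores = {c: m.score_samples(np.atleast_2d(x))[0] for c, m in models.items()}
    return max(scores, key=scores.get)

preds = np.array([classify(x) for x in X])
accuracy = np.mean(preds == y)
print(f"training accuracy: {accuracy:.2f}")
```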
  • Publication number: 20030171932
    Abstract: A method and apparatus for automatically controlling the operation of a speech recognition system without requiring unusual or unnatural activity of the speaker by passively determining if received sound is speech of the user before activating the speech recognition system. A video camera and microphone are located in a hand-held device. The video camera records a video image of the speaker's face, i.e., of speech articulators of the user such as the lips and/or mouth. The recorded characteristics of the articulators are analyzed to identify the sound that the articulators would be expected to make, as in “lip reading”. A microphone concurrently records the acoustic properties of received sound proximate the user. The recorded acoustic properties of the received sound are then compared to the characteristics of speech that would be expected to be generated by the recorded speech articulators to determine whether they match.
    Type: Application
    Filed: March 7, 2002
    Publication date: September 11, 2003
    Inventors: Biing-Hwang Juang, Jialin Zhong
  • Patent number: 6076053
    Abstract: A speech recognition method comprises the steps of using given speech data and the N-best algorithm to generate alternative pronunciations and then merging the obtained pronunciations into a pronunciation network structure; using additional parameters to characterize a pronunciation network for a particular word; optimizing the parameters of the pronunciation networks using a minimum classification error criterion that maximizes the discrimination between different pronunciation networks; and adapting the parameters of the pronunciation networks by, first, adjusting the probabilities of the possible pronunciations that may be generated by the pronunciation network for a word claimed to be the true one and, second, correcting the weights for all of the pronunciation networks using the adjusted probabilities.
    Type: Grant
    Filed: May 21, 1998
    Date of Patent: June 13, 2000
    Assignee: Lucent Technologies Inc.
    Inventors: Biing-Hwang Juang, Filipp E. Korkmazskiy
  • Patent number: 5812972
    Abstract: The present invention provides a speech recognizer that creates and updates an equalization vector as input speech is provided to the recognizer. The present invention includes a speech analyzer which transforms an input speech signal into a series of feature vectors or observation sequence. Each feature vector is then provided to a speech recognizer which modifies the feature vector by subtracting a previously determined equalization vector therefrom. The recognizer then performs segmentation and matches the modified feature vector to a stored model vector which is defined as the segmentation vector. The recognizer then, from time to time, determines a new equalization vector, the new equalization vector being defined based on the difference between one or more input feature vectors and their respective segmentation vectors.
    Type: Grant
    Filed: December 30, 1994
    Date of Patent: September 22, 1998
    Assignee: Lucent Technologies Inc.
    Inventors: Biing-Hwang Juang, David Mansour, Jay Gordon Wilpon
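The equalization loop above resembles running bias removal: subtract the current equalization vector, segment the result against stored model vectors, and re-estimate the equalization vector from the accumulated input-minus-segmentation differences. A toy sketch with invented model vectors and channel bias:

```python
import numpy as np

rng = np.random.default_rng(1)
model_vectors = np.array([[0.0, 0.0], [4.0, 4.0]])    # stored model (codebook) vectors
true_bias = np.array([1.5, -0.8])                     # unknown channel bias to recover

eq = np.zeros(2)              # equalization vector, refined as input speech arrives
diffs = []
for _ in range(300):
    clean = model_vectors[rng.integers(2)]
    feat = clean + true_bias + 0.1 * rng.standard_normal(2)   # observed feature vector

    mod = feat - eq                                   # subtract current equalization
    seg = model_vectors[np.argmin(np.linalg.norm(model_vectors - mod, axis=1))]
    diffs.append(feat - seg)                          # input minus segmentation vector

    eq = np.mean(diffs, axis=0)                       # re-estimate equalization vector

print("estimated equalization vector:", np.round(eq, 2))
```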
  • Patent number: 5805772
    Abstract: Disclosed are systems, methods and articles of manufacture for performing high resolution N-best string hypothesization during speech recognition. A received input signal, representing a speech utterance, is processed utilizing a plurality of recognition models to generate one or more string hypotheses of the received input signal. The plurality of recognition models preferably include one or more inter-word context dependent models and one or more language models. A forward partial path map is produced according to the allophonic specifications of at least one of the inter-word context dependent models and the language models. The forward partial path map is traversed in the backward direction as a function of the allophonic specifications to generate the one or more string hypotheses. One or more of the recognition models may represent one-phone words.
    Type: Grant
    Filed: December 30, 1994
    Date of Patent: September 8, 1998
    Assignee: Lucent Technologies Inc.
    Inventors: Wu Chou, Biing-Hwang Juang, Chin-Hui Lee, Tatsuo Matsuoka
  • Patent number: 5797123
    Abstract: A key-phrase detection and verification method that can be advantageously used to realize understanding of flexible (i.e., unconstrained) speech. A "multiple pass" procedure is applied to a spoken utterance comprising a sequence of words (i.e., a "sentence"). First, a plurality of key-phrases are detected (i.e., recognized) based on a set of phrase sub-grammars which may, for example, be specific to the state of the dialogue. These key-phrases are then verified by assigning confidence measures thereto and comparing these confidence measures to a threshold, resulting in a set of verified key-phrase candidates. Next, the verified key-phrase candidates are connected into sentence hypotheses based upon the confidence measures and predetermined (e.g., task-specific) semantic information. And, finally, one or more of these sentence hypotheses are verified to produce a verified sentence hypothesis and, from that, a resultant understanding of the spoken utterance.
    Type: Grant
    Filed: December 20, 1996
    Date of Patent: August 18, 1998
    Assignee: Lucent Technologies Inc.
    Inventors: Wu Chou, Biing-Hwang Juang, Tatsuya Kawahara, Chin-Hui Lee
  • Patent number: 5781887
    Abstract: A method for revising at least a portion of a sequence of speech data segments recognized by an automated speech recognition system. A user is prompted to vocalize the speech data segments sequentially, one speech data segment at a time. When each speech data segment is recognized it is stored as a data element and a confirmation of recognition is issued to the user. The user may then issue a verbal command to delete the last recognized data element if the confirmation indicates that a recognition error has occurred, and then repeat the last speech data element for a second recognition attempt. The user may also issue another verbal command to delete all thus-far recognized data elements in the sequence and to restart the recognition process from the beginning. If no such verbal commands are issued by the user, then the user may continue to vocalize the next sequential speech data segment.
    Type: Grant
    Filed: October 9, 1996
    Date of Patent: July 14, 1998
    Assignee: Lucent Technologies Inc.
    Inventor: Biing-Hwang Juang
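The revision logic above reduces to a small state machine over the stream of recognized segments. The command words below ("scratch", "start-over") are hypothetical; the patent does not fix a vocabulary:

```python
# Hypothetical command words; the patent does not specify the exact vocabulary.
DELETE_LAST, RESTART = "scratch", "start-over"

def revise(recognized_segments):
    """Apply delete-last and restart commands to a stream of recognized segments."""
    elements = []
    for seg in recognized_segments:
        if seg == DELETE_LAST:
            if elements:
                elements.pop()        # drop the last (mis)recognized element
        elif seg == RESTART:
            elements.clear()          # discard everything and start over
        else:
            elements.append(seg)      # confirm and store the segment
    return elements

print(revise(["5", "3", "scratch", "2", "7"]))   # → ['5', '2', '7']
```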
  • Patent number: 5737489
    Abstract: In a speech recognition system, a recognition processor receives an unknown utterance signal as input. The recognition processor in response to the unknown utterance signal input accesses a recognition database and scores the utterance signal against recognition models in the recognition database to classify the unknown utterance and to generate a hypothesis speech signal. A verification processor receives the hypothesis speech signal as input to be verified. The verification processor accesses a verification database to test the hypothesis speech signal against verification models reflecting a preselected type of training stored in the verification database. Based on the verification test, the verification processor generates a confidence measure signal. The confidence measure signal can be compared against a verification threshold to determine the accuracy of the recognition decision made by the recognition processor.
    Type: Grant
    Filed: September 15, 1995
    Date of Patent: April 7, 1998
    Assignee: Lucent Technologies Inc.
    Inventors: Wu Chou, Biing-Hwang Juang, Chin-Hui Lee, Mazin G. Rahim
  • Patent number: 5710864
    Abstract: Systems, methods and articles of manufacture are provided for adjusting the parameters of ones of a plurality of recognition models. The recognition models collectively represent a vocabulary. The recognition models are utilized to identify a known word represented within a received input signal. The received input signal may include within vocabulary and out of vocabulary words. An output signal representing a confidence measure corresponding to the relative accuracy of the identity of the known word is generated. Particular ones of the plurality of recognition models are adjusted as a function of the output signal to improve the confidence measure. The systems, methods and articles of manufacture are preferably implemented in accordance with discriminative techniques, and the adjustment process is used during either a preferred training or a recognition mode.
    Type: Grant
    Filed: December 29, 1994
    Date of Patent: January 20, 1998
    Assignee: Lucent Technologies Inc.
    Inventors: Biing-Hwang Juang, Chin-Hui Lee, Richard Cameron Rose
  • Patent number: 5675704
    Abstract: A facility is provided for allowing a caller to place a telephone call by merely uttering a label identifying a desired called destination and to charge the telephone call to a particular billing account by merely uttering a label identifying that account. Alternatively, the caller may place the call by dialing or uttering the telephone number of the called destination or by entering a speed dial code associated with that telephone number. The facility includes a speaker verification system which employs cohort normalized scoring. Cohort normalized scoring provides a dynamic threshold for the verification process, making it more robust to variation in training and verification utterances. Such variation may be caused by, e.g., changes in communication channel characteristics or speaker loudness level.
    Type: Grant
    Filed: April 26, 1996
    Date of Patent: October 7, 1997
    Assignee: Lucent Technologies Inc.
    Inventors: Biing-Hwang Juang, Chin-Hui Lee, Aaron Edward Rosenberg, Frank Kao-Ping Soong
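Cohort normalized scoring, as used above, subtracts the average score of a cohort of acoustically similar speakers from the claimed speaker's score, so a fixed threshold on the normalized score behaves like a dynamic threshold on the raw one. The scores and threshold below are invented for illustration:

```python
import numpy as np

def cohort_normalized_score(target_score, cohort_scores):
    """Claimed-speaker log-likelihood normalized against a cohort of similar speakers."""
    return target_score - np.mean(cohort_scores)

# Hypothetical log-likelihood scores for one verification utterance.
target = -42.0                       # claimed speaker's model
cohort = [-55.0, -58.0, -53.0]       # models of acoustically similar speakers

score = cohort_normalized_score(target, cohort)
threshold = 3.0                      # fixed threshold on the *normalized* score
print(f"normalized score: {score:.1f} -> {'accept' if score > threshold else 'reject'}")
```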
  • Patent number: 5606644
    Abstract: A method of making a speech recognition model database is disclosed. The database is formed based on a training string utterance signal and a plurality of sets of current speech recognition models. The sets of current speech recognition models may include acoustic models, language models, and other knowledge sources. In accordance with an illustrative embodiment of the invention, a set of confusable string models is generated, each confusable string model comprising speech recognition models from two or more sets of speech recognition models (such as acoustic and language models). A first scoring signal is generated based on the training string utterance signal and a string model for that utterance, wherein the string model for the utterance comprises speech recognition models from two or more sets of speech recognition models. One or more second scoring signals are also generated, wherein a second scoring signal is based on the training string utterance signal and a confusable string model.
    Type: Grant
    Filed: April 26, 1996
    Date of Patent: February 25, 1997
    Assignee: Lucent Technologies Inc.
    Inventors: Wu Chou, Biing-Hwang Juang, Chin-Hui Lee