Patents by Inventor Ruxin Chen

Ruxin Chen has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20100121640
    Abstract: The present invention relates to a method for modeling a common-language speech recognition, by a computer, under the influence of multiple dialects and concerns a technical field of speech recognition by a computer. In this method, a triphone standard common-language model is first generated based on training data of standard common language, and first and second monophone dialectal-accented common-language models are based on development data of dialectal-accented common languages of first kind and second kind, respectively. Then a temporary merged model is obtained in a manner that the first dialectal-accented common-language model is merged into the standard common-language model according to a first confusion matrix obtained by recognizing the development data of first dialectal-accented common language using the standard common-language model.
    Type: Application
    Filed: October 29, 2009
    Publication date: May 13, 2010
    Applicants: SONY COMPUTER ENTERTAINMENT INC., TSINGHUA UNIVERSITY
    Inventors: Fang Zheng, Xi Xiao, Linquan Liu, Zhan You, Wenxiao Cao, Makoto Akabane, Ruxin Chen, Yoshikazu Takahashi
  • Publication number: 20090252344
    Abstract: An audio headset may comprise a case, near field microphone and far field microphone. A speaker, processor, memory, battery, charging interface and cradle detection circuit may be mounted to the case. Processor-executable instructions embodied in the memory, may be configured to implement a battery charging method. The headset may be shut off in response to placement of the headset in a charging cradle. The far-field microphone is turned on but not the near-field microphone. The battery may then be charged from the cradle. A headset having near-field and far-field microphones may be used to distinguish between user speech and competing sounds by generating signals from the sounds detected by each microphone and comparing the strengths of the signals. The signals may be processed as user speech if they are of comparable strength. Otherwise, the near-field signal may be processed as user speech and the far-field signal as competing sounds.
    Type: Application
    Filed: April 7, 2008
    Publication date: October 8, 2009
    Applicant: Sony Computer Entertainment Inc.
    Inventors: Xiaodong Mao, Ruxin Chen, Seth C.H. Luisi
  • Publication number: 20070198261
    Abstract: Methods and apparatus for voice recognition are disclosed. A voice signal is obtained and two or more voice recognition analyses are performed on the voice signal. Each voice recognition analysis uses a filter bank defined by a different maximum frequency and a different minimum frequency and wherein each voice recognition analysis produces a recognition probability ri of recognition of one or more speech units, whereby there are two or more recognition probabilities ri. The maximum frequency and the minimum frequency may be adjusted every time speech is windowed and analyzed. A final recognition probability Pƒ is determined based on the two or more recognition probabilities ri.
    Type: Application
    Filed: February 21, 2006
    Publication date: August 23, 2007
    Applicant: Sony Computer Entertainment Inc.
    Inventor: Ruxin Chen
  • Publication number: 20070198263
    Abstract: Voice recognition methods and systems are disclosed. A voice signal is obtained for an utterance of a speaker. A runtime pitch is determined from the voice signal for the utterance. The speaker is categorized based on the runtime pitch and one or more acoustic model parameters are adjusted based on a categorization of the speaker. The parameter adjustment may be performed at any instance of time during the recognition. A voice recognition analysis of the utterance is then performed based on the acoustic model.
    Type: Application
    Filed: February 21, 2006
    Publication date: August 23, 2007
    Applicant: Sony Computer Entertainment Inc.
    Inventor: Ruxin Chen
  • Publication number: 20070139443
    Abstract: A method of moving objects in a graphical user interface, includes obtaining a video image of a user of the interface; displaying the video image on a display such that the video image is superposed with one or more objects displayed on the display; and moving one or more objects displayed on the display based on recognition of motions of the video image of the user. Recognition of motions of the video image may include recognition of motions of an image of the user's hand.
    Type: Application
    Filed: December 12, 2005
    Publication date: June 21, 2007
    Applicant: Sony Computer Entertainment Inc.
    Inventors: Richard Marks, Ruxin Chen
  • Publication number: 20070112566
    Abstract: Use of runtime memory may be reduced in a data processing algorithm that uses one or more probability distribution functions. Each probability distribution function may be characterized by one or more uncompressed mean values and one or more variance values. The uncompressed mean and variance values may be represented by α-bit floating point numbers, where α is an integer greater than 1. The probability distribution functions are converted to compressed probability functions having compressed mean and/or variance values represented as β-bit integers, where β is less than α, whereby the compressed mean and/or variance values occupy less memory space than the uncompressed mean and/or variance values. Portions of the data processing algorithm can be performed with the compressed mean and variance values.
    Type: Application
    Filed: November 12, 2005
    Publication date: May 17, 2007
    Applicant: Sony Computer Entertainment Inc.
    Inventor: Ruxin Chen
  • Publication number: 20070061413
    Abstract: A system and method of displaying content to a user depending on whether the user's speech indicates the user is sufficiently mature to view the content.
    Type: Application
    Filed: April 10, 2006
    Publication date: March 15, 2007
    Inventors: Eric Larsen, Ruxin Chen
  • Publication number: 20070061142
    Abstract: Consumer electronic devices have been developed with enormous information processing capabilities, high quality audio and video outputs, large amounts of memory, and may also include wired and/or wireless networking capabilities. Additionally, relatively unsophisticated and inexpensive sensors, such as microphones, video cameras, GPS or other position sensors, when coupled with devices having these enhanced capabilities, can be used to detect subtle features about users and their environments. A variety of audio, video, simulation and user interface paradigms have been developed to utilize the enhanced capabilities of these devices. These paradigms can be used separately or together in any combination. One paradigm includes automatically creating user identities using speaker identification. Another paradigm includes a control button with 3-axis pressure sensitivity for use with game controllers and other input devices.
    Type: Application
    Filed: September 15, 2006
    Publication date: March 15, 2007
    Applicant: Sony Computer Entertainment Inc.
    Inventors: Gustavo Hernandez-Abrego, Xavier Menendez-Pidal, Steven Osman, Ruxin Chen, Rishi Deshpande, Care Michaud-Wideman, Richard Marks, Eric Larsen, Xiaodong Mao
  • Publication number: 20070061851
    Abstract: A system and method for conditioning execution of a control function on a determination of whether or not a person's attention is directed toward a predetermined device. The method involves acquiring data concerning the activity of a person who is in the proximity of the device, the data being in the form of one or more temporal samples. One or more of the temporal samples is then analyzed to determine if the person's activity during the time of the analyzed samples indicates that the person's attention is not directed toward the device. The results of the determination are used to ascertain whether or not the control function should be performed.
    Type: Application
    Filed: March 6, 2006
    Publication date: March 15, 2007
    Applicant: Sony Computer Entertainment Inc.
    Inventors: Hrishikesh Deshpande, Ruxin Chen
  • Publication number: 20060277032
    Abstract: Methods for optimizing grammar structure for a set of phrases to be used in speech recognition during a computing event are provided. One method includes receiving a set of phrases, the set of phrases being relevant for the computing event and the set of phrases having a node and link structure. Also included is identifying redundant nodes by examining the node and link structures of each of the set of phrases so as to generate a single node for the redundant nodes. The method further includes examining the node and link structures to identify nodes that are capable of being vertically grouped and grouping the identified nodes to define vertical word groups. The method continues with fusing nodes of the set of phrases that are not vertically grouped into fused word groups, wherein the vertical word groups and the fused word groups are linked to define an optimized grammar structure.
    Type: Application
    Filed: May 19, 2006
    Publication date: December 7, 2006
    Applicant: SONY COMPUTER ENTERTAINMENT INC.
    Inventors: Gustavo Hernandez-Abrego, Ruxin Chen
  • Patent number: 6778959
    Abstract: A system and method for speech verification using out-of-vocabulary models includes a speech recognizer that has a model bank with system vocabulary word models, a garbage model, and one or more noise models. The model bank may reject an utterance or other sound as an invalid vocabulary word when the model bank identifies the utterance or other sound as corresponding to the garbage model or the noise models. Initial noise models may be selectively combined into a pre-determined number of final noise model clusters to effectively reduce the number of noise models that are utilized by the model bank of the speech recognizer to verify system vocabulary words.
    Type: Grant
    Filed: October 18, 2000
    Date of Patent: August 17, 2004
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Duanpei Wu, Lex Olorenshaw, Xavier Menendez-Pidal, Ruxin Chen
  • Patent number: 6768979
    Abstract: The noise suppressor utilizes statistical characteristics of the noise signal to attenuate amplitude values of the noisy speech signal that have a probability of containing noise. In one embodiment, the noise suppressor utilizes an attenuation function having a shape determined in part by a noise average and a noise standard deviation. In a further embodiment, the noise suppressor also utilizes an adaptive attenuation coefficient that depends on signal-to-noise conditions in the speech recognition system.
    Type: Grant
    Filed: March 31, 1999
    Date of Patent: July 27, 2004
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Xavier Menéndez-Pidal, Miyuki Tanaka, Ruxin Chen
  • Patent number: 6718302
    Abstract: A method for utilizing validity constraints in a speech endpoint detector comprises a validity manager that may utilize a pulse width module to validate utterances that include a plurality of energy pulses during a certain time period. The validity manager also may utilize a minimum power module to ensure that speech energy below a pre-determined level is not classified as a valid utterance. In addition the validity manager may use a duration module to ensure that valid utterances fall within a specified duration. Finally, the validity manager may utilize a short-utterance minimum power module to specifically distinguish an utterance of short duration from background noise based on the energy level of the short utterance.
    Type: Grant
    Filed: January 12, 2000
    Date of Patent: April 6, 2004
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Duanpei Wu, Miyuki Tanaka, Ruxin Chen, Lex Olorenshaw
  • Patent number: 6473735
    Abstract: The present invention comprises a system and method for speech verification using a confidence measure that includes a speech verifier which compares a differential score for a recognized word to a predetermined threshold value, where a recognized word is the word model that produced the highest recognition score. In one embodiment, a single threshold is used for each word in a vocabulary. In another embodiment, each word model has an associated threshold, so that a differential score for a recognized word is compared to a unique threshold associated with that word. In a further embodiment, pairs of confused words in the vocabulary are dealt with separately. If a confused word is the recognized word, the speech verifier compares the differential score to a threshold that depends on the word model that produced the next-highest recognition score. Different values for the various thresholds may maximize rejection accuracy or recognition accuracy.
    Type: Grant
    Filed: April 20, 2000
    Date of Patent: October 29, 2002
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Duanpei Wu, Xavier Menendez-Pidal, Lex Olorenshaw, Ruxin Chen
  • Patent number: 6216103
    Abstract: A method for implementing a speech recognition system for use during conditions with background noise includes the steps of calculating, in real-time, sequential short-term delta energy parameters for speech energy from a spoken utterance, determining threshold values in the speech energy, and identifying a beginning point and an ending point for the spoken utterance based on the relationship between the threshold values and the short-term delta energy parameters.
    Type: Grant
    Filed: October 20, 1997
    Date of Patent: April 10, 2001
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Duanpei Wu, Miyuki Tanaka, Ruxin Chen, Lex Olorenshaw
  • Patent number: 6173258
    Abstract: A method for reducing noise distortions in a speech recognition system comprises a feature extractor that includes a noise-suppressor, one or more time cosine transforms, and a normalizer. The noise-suppressor preferably performs a spectral subtraction process early in the feature extraction procedure. The time cosine transforms preferably operate in a centered-mode to each perform a transformation in the time domain. The normalizer calculates and utilizes normalization values to generate normalized features for speech recognition. The calculated normalization values preferably include mean values, left variances and right variances.
    Type: Grant
    Filed: October 22, 1998
    Date of Patent: January 9, 2001
    Assignees: Sony Corporation, Sony Electronics Inc.
    Inventors: Xavier Menendez-Pidal, Miyuki Tanaka, Ruxin Chen, Duanpei Wu
  • Patent number: 6006186
    Abstract: A method and an apparatus for a parameter sharing speech recognition system are provided. Speech signals are received into a processor of a speech recognition system. The speech signals are processed using a speech recognition system hosting a shared hidden Markov model (HMM) produced by generating a number of phoneme models, some of which are shared. The phoneme models are generated by retaining as a separate phoneme model any triphone model having a number of trained frames available that exceeds a prespecified threshold. A shared phoneme model is generated to represent each of the groups of triphone phoneme models for which the number of trained frames having a common biphone exceed the prespecified threshold. A shared phoneme model is generated to represent each of the groups of triphone phoneme models for which the number of trained frames having an equivalent effect on a phonemic context exceed the prespecified threshold.
    Type: Grant
    Filed: October 16, 1997
    Date of Patent: December 21, 1999
    Assignees: Sony Corporation, Sony Electronics, Inc.
    Inventors: Ruxin Chen, Miyuki Tanaka, Duanpei Wu, Lex S. Olorenshaw
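
The model-sharing rule in the abstract for patent 6006186 can be illustrated with a minimal sketch. This is not the patented implementation: the function name `share_triphones`, the representation of a triphone as a `(left, center, right)` tuple, the choice of the left biphone `(left, center)` as the grouping key, and the use of pooled frame counts are all assumptions made for illustration. The third rule in the abstract (grouping by equivalent effect on a phonemic context) is not modeled here.

```python
from collections import defaultdict


def share_triphones(frame_counts, threshold):
    """Split triphones into separate, shared, and unassigned models.

    frame_counts maps a hypothetical (left, center, right) triphone tuple
    to the number of trained frames available for it.
    """
    separate = {}
    biphone_groups = defaultdict(list)

    for triphone, frames in frame_counts.items():
        if frames > threshold:
            # Enough data: keep this triphone as its own model.
            separate[triphone] = frames
        else:
            # Assumption: group sparse triphones by their left biphone.
            left, center, _right = triphone
            biphone_groups[(left, center)].append(triphone)

    shared = {}
    unassigned = []
    for biphone, members in biphone_groups.items():
        pooled = sum(frame_counts[t] for t in members)
        if pooled > threshold:
            # The pooled frames exceed the threshold: one shared model.
            shared[biphone] = sorted(members)
        else:
            # Left for a further sharing rule not sketched here.
            unassigned.extend(members)

    return separate, shared, sorted(unassigned)


if __name__ == "__main__":
    counts = {
        ("k", "a", "t"): 120,
        ("k", "a", "p"): 30,
        ("k", "a", "b"): 40,
        ("s", "a", "t"): 10,
    }
    separate, shared, unassigned = share_triphones(counts, threshold=50)
    print(separate)
    print(shared)
    print(unassigned)
```

In this toy input, one triphone has enough frames to stand alone, two sparse triphones with a common biphone pool enough frames to share a model, and one remains unassigned.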