Patents Examined by Justin W. Rider

Text structure for voice synthesis, voice synthesis method, voice synthesis apparatus, and computer program thereof

Patent number: 7487093

Abstract: In a voice synthesis apparatus, by bounding a desired range of input text to be output by, e.g., a start tag “<morphing type=“emotion” start=“happy” end=“angry”>” and end tag </morphing>, a feature of synthetic voice is continuously changed while gradually changing voice from a happy voice to an angry voice upon outputting synthetic voice.

Type: Grant

Filed: August 10, 2004

Date of Patent: February 3, 2009

Assignee: Canon Kabushiki Kaisha

Inventors: Masahiro Mutsuno, Toshiaki Fukada
Eliminating interference of noisy modality in a multimodal application

Patent number: 7480618

Abstract: Aspects of the present invention provide for ranking various input modalities relative to each other and processing recognition results received through these input modalities based in part on the ranking.

Type: Grant

Filed: September 2, 2004

Date of Patent: January 20, 2009

Assignee: Microsoft Corporation

Inventors: Zhan Ding, David K. Burton, Yun-Cheng Ju
Speech decoder that detects stationary noise signal regions

Patent number: 7478042

Abstract: A first determiner 121 tentatively determines whether the current processing unit represents a stationary noise period, based on stationary properties of a decoded signal. Based on the tentative determination result and a determination result of the periodicity of the decoded signal, a second determiner 124 determines whether the current processing unit represents a stationary noise period, thereby distinguishing a decoded signal including a stationary speech signal such as a stationary vowel from stationary noise and correctly identifying the stationary noise period.

Type: Grant

Filed: November 30, 2001

Date of Patent: January 13, 2009

Assignee: Panasonic Corporation

Inventors: Hiroyuki Ehara, Kazutoshi Yasunaga, Kazunori Mano, Yusuke Hiwasaki
Method and system for tracking signal sources with wrapped-phase hidden markov models

Patent number: 7475014

Abstract: A method models trajectories of a signal source. Training signals generated by a signal source moving along known trajectories are acquired by each sensor in an array of sensors. Phase differences between all unique pairs of the training signals are determined. A wrapped-phase hidden Markov model is constructed from the phase differences. The wrapped-phase hidden Markov model includes multiple Gaussian distributions to model the known trajectories of the signal source.

Type: Grant

Filed: July 25, 2005

Date of Patent: January 6, 2009

Assignee: Mitsubishi Electric Research Laboratories, Inc.

Inventors: Paris Smaragdis, Petros Boufounos
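
The abstract for patent 7475014 above mentions phase differences between all unique pairs of sensor signals. The following is only a minimal sketch of that one step, assuming each sensor has already been reduced to a single phase value in radians. The function names `wrap_phase` and `pairwise_phase_differences` are hypothetical and are not taken from the patent. The hidden Markov model itself is not shown.

```python
import math
from itertools import combinations


def wrap_phase(angle):
    """Wrap an angle in radians into the interval [-pi, pi)."""
    return (angle + math.pi) % (2 * math.pi) - math.pi


def pairwise_phase_differences(phases):
    """Return the wrapped phase difference for every unique sensor pair (i < j).

    `phases` holds one phase value per sensor (an assumption of this sketch).
    """
    return {
        (i, j): wrap_phase(phases[i] - phases[j])
        for i, j in combinations(range(len(phases)), 2)
    }
```

Wrapping keeps every difference inside one period, which is why a raw difference of 6.0 radians comes out as 6.0 - 2π.
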
Adaptive and scalable method for resolving natural language ambiguities

Patent number: 7475010

Abstract: A method for resolving ambiguities in natural language by organizing the task into multiple iterations of analysis done in successive levels of depth. The processing is adaptive to the users' need for accuracy and efficiency. At each level of processing the most accurate disambiguation is made based on the available information. As more analysis is done, additional knowledge is incorporated in a systematic manner to improve disambiguation accuracy. Associated with each level of processing is a measure of confidence, used to gauge the confidence of a process in its disambiguation accuracy. An overall confidence measure is also used to reflect the level of the analysis done.

Type: Grant

Filed: September 2, 2004

Date of Patent: January 6, 2009

Assignee: Lingospot, Inc.

Inventor: Gerald CheShun Chao
Speaker recognition using local models

Patent number: 7475013

Abstract: A system and method for voice recognition is disclosed. The system enrolls speakers using enrollment voice samples and identification information. An extraction module characterizes enrollment voice samples with high-dimensional feature vectors or speaker data points. A data structuring module organizes data points into a high-dimensional data structure, such as a kd-tree, in which similarity between data points dictates a distance, such as a Euclidean distance, a Minkowski distance, or a Manhattan distance. The system recognizes a speaker using an unidentified voice sample. A data querying module searches the data structure to generate a subset of approximate nearest neighbors based on an extracted high-dimensional feature vector. A data modeling module uses Parzen windows to estimate a probability density function representing how closely characteristics of the unidentified speaker match enrolled speakers, in real-time, without extensive training data or parametric assumptions about data distribution.

Type: Grant

Filed: March 26, 2004

Date of Patent: January 6, 2009

Assignee: Honda Motor Co., Ltd.

Inventor: Ryan Rifkin
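
The abstract for patent 7475013 above combines a nearest-neighbor query with a Parzen-window estimate. The following is a rough sketch of that idea under several assumptions: a brute-force search stands in for the kd-tree, the kernel is Gaussian, and the names `nearest_neighbors`, `parzen_scores`, and `bandwidth` are hypothetical rather than taken from the patent.

```python
import math
from collections import defaultdict


def euclidean(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def nearest_neighbors(enrolled, query, k):
    """Return the k enrolled (speaker, vector) pairs closest to the query.

    Brute-force search; a kd-tree would replace this for large data sets.
    """
    return sorted(enrolled, key=lambda item: euclidean(item[1], query))[:k]


def parzen_scores(neighbors, query, bandwidth):
    """Sum a Gaussian kernel per speaker and normalize the totals to sum to 1."""
    totals = defaultdict(float)
    for speaker, vector in neighbors:
        d = euclidean(vector, query)
        totals[speaker] += math.exp(-(d * d) / (2 * bandwidth ** 2))
    norm = sum(totals.values())
    if norm == 0:
        return {}
    return {speaker: value / norm for speaker, value in totals.items()}
```

Usage: with enrolled points for two speakers, `parzen_scores(nearest_neighbors(enrolled, query, 3), query, 1.0)` gives one relative score per speaker found among the neighbors, and the highest score suggests the most likely match.
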
Text input support system and method

Patent number: 7475009

Abstract: Methods and apparatus, including computer program products, featuring techniques for text input support that, among other capabilities, can automatically convert a string of characters lacking spaces into a sentence that includes spaces. The apparatus and methods may find applicability, for example, when an operator inputs characters sequentially for generating text composed of plural words separated from each other with spaces, like English or French text.

Type: Grant

Filed: June 11, 2001

Date of Patent: January 6, 2009

Inventor: Hiroshi Ishikura
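
The abstract for patent 7475009 above describes converting a string of characters lacking spaces into a sentence with spaces. One common way to approach such a task is dictionary-based dynamic programming. The sketch below assumes a small lexicon and prefers the split with the fewest words. It is not the patented method, and the name `segment` is hypothetical.

```python
def segment(text, lexicon):
    """Split `text` into lexicon words, preferring the fewest words.

    Returns the spaced sentence, or None if no complete split exists.
    """
    n = len(text)
    max_len = max((len(word) for word in lexicon), default=0)
    # best[i] holds the shortest word list that exactly covers text[:i].
    best = [None] * (n + 1)
    best[0] = []
    for end in range(1, n + 1):
        for start in range(max(0, end - max_len), end):
            word = text[start:end]
            if best[start] is not None and word in lexicon:
                candidate = best[start] + [word]
                if best[end] is None or len(candidate) < len(best[end]):
                    best[end] = candidate
    return None if best[n] is None else " ".join(best[n])
```

Usage: `segment("thisisapen", {"this", "is", "a", "pen"})` returns `"this is a pen"`, while an input that the lexicon cannot cover returns `None`.
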
Apparatus and methods for pronunciation lexicon compression

Patent number: 7469205

Abstract: A compressed pronunciation lexicon file is generated from a source pronunciation lexicon using a pronunciation prediction algorithm in a multi-output mode. The pronunciation prediction algorithm may generate a deterministic ordered list of phoneme strings from the textual representation of a particular word. The compressed pronunciation lexicon file may include a sorted list of records of compressed textual representations of words and compressed phonetic representations of the words.

Type: Grant

Filed: June 30, 2004

Date of Patent: December 23, 2008

Assignee: Marvell International Ltd.

Inventors: Hagai Aronowitz, Adoram Erell
Method and apparatus for frame classification and rate determination in voice transcoders for telecommunications

Patent number: 7469209

Abstract: A method and apparatus for frame classification and rate determination in voice transcoders. The apparatus includes a classifier input parameter preparation module that unpacks the bitstream from the source codec and selects the codec parameters to be used for classification, parameter buffers that store previous input and output parameters of previous frames, and a frame classification and rate decision module that uses the source codec parameters from the current frame and zero or more frames to determine the frame class, rate, and classification feature parameters for the destination codec. The classifier input parameter preparation module separates the bitstream code and unquantizes the sub-codes into the codec parameters. The frame classification and rate decision module comprises M sub-classifiers and a final decision module.

Type: Grant

Filed: August 14, 2003

Date of Patent: December 23, 2008

Assignee: Dilithium Networks Pty Ltd.

Inventors: Nicola Chong-White, Jianwei Wang, Marwan A. Jabri
Telephone pathology assessment

Patent number: 7457753

Abstract: A system for remote assessment of a user is disclosed. The system comprises application software resident on a server and arranged to interact across a network with a user operating a client device to obtain one or more sample signals of the user's speech. A datastore is arranged to store the user speech samples in association with details of the user. A feature extraction engine is arranged to extract one or more first features from respective speech samples. A comparator is arranged to compare the first features extracted from a speech sample with second features extracted from one or more reference samples and to provide a measure of any differences between the first and second features for assessment of the user.

Type: Grant

Filed: June 29, 2005

Date of Patent: November 25, 2008

Assignee: University College Dublin National University of Ireland

Inventors: Rosalyn Moran, Richard Reilly, Philip De Chazal, Brian O'Mullane, Peter Lacy
Language model architecture

Patent number: 7454344

Abstract: An architectural design is disclosed wherein a single reusable language model component is shared by multiple applications. The language model component is loaded once for a plurality of applications, thereby reducing the amount of memory consumed by the applications independently.

Type: Grant

Filed: August 13, 2004

Date of Patent: November 18, 2008

Assignee: Microsoft Corporation

Inventor: William Ramsey
Separating multiple audio signals recorded as a single mixed signal

Patent number: 7454333

Abstract: A method according to the invention separates multiple audio signals recorded as a mixed signal via a single channel. The mixed signal is A/D converted and sampled. A sliding window is applied to the samples to obtain frames. The logarithms of the power spectra of the frames are determined. From the spectra, the a posteriori probabilities of pairs of spectra are determined. The probabilities are used to obtain Fourier spectra for each individual signal in each frame. The invention provides a minimum-mean-squared error method or a soft mask method for making this determination. The Fourier spectra are inverted to obtain corresponding signals, which are concatenated to recover the individual signals.

Type: Grant

Filed: September 13, 2004

Date of Patent: November 18, 2008

Assignee: Mitsubishi Electric Research Lab, Inc.

Inventors: Bhiksha Ramakrishnan, Aarthi M. Reddy
Compound word breaker and spell checker

Patent number: 7447627

Abstract: A method of determining the component words of a compound word is disclosed. The method identifies the component words by comparing the word with a list of words found in a lexicon. If the word is not found in the lexicon, the method proceeds to analyze the word on a character-by-character basis. After each character, the method identifies any potential matches to the selected characters in the lexicon. If a match is found, it is added to a hypothesis trace in a lattice. Next, the method checks to see whether the remaining characters form a valid entry in the lexicon, and whether the entry is allowed to be a final segment.

Type: Grant

Filed: March 19, 2004

Date of Patent: November 4, 2008

Assignee: Microsoft Corporation

Inventors: Andrea Maria Jessee, Miriam R. Eckert, Kevin R. Powell
Acoustic signal encoding method and apparatus, acoustic signal decoding method and apparatus and recording medium

Patent number: 7447640

Abstract: In an acoustic signal encoding apparatus (100), a tonal noise verification unit (110) verifies whether the input acoustic time-domain signals are tonal or noisy. If the input acoustic time-domain signals are tonal, tonal component signals are extracted by a tonal component extraction unit (121), and tonal component parameters are normalized and quantized in a normalization/quantization unit (122). The residual time-domain signals, obtained on extracting the tonal component signals from the acoustic time-domain signals, are transformed by an orthogonal transforming unit (131) into the spectral information, which spectral information is normalized and quantized by a normalization/quantization unit (132). A code string generating unit (140) generates a code string from the quantized tonal component parameters and the quantized residual component spectral information.

Type: Grant

Filed: June 11, 2002

Date of Patent: November 4, 2008

Assignee: Sony Corporation

Inventors: Minoru Tsuji, Shiro Suzuki, Keisuke Toyama
Method and system for synchronizing the user interface language between a software application and a web site

Patent number: 7444278

Abstract: A method and system are generally directed to synchronizing the language used for applications and the language used for information provided across a network. The client language that is associated with the language used for applications on a computing device is detected along with the services language. The services language corresponds to the language used for the user interface of network materials, such as web sites and help pages. The services language is stored online so that the services language and the client language may be made to correspond to one another despite the user moving from online to offline or from one computing device to another, allowing for a consistent user experience.

Type: Grant

Filed: March 19, 2004

Date of Patent: October 28, 2008

Assignee: Microsoft Corporation

Inventor: James Andrew Bennett
Dialog system for a man-machine interaction having cooperating dialog devices

Patent number: 7437292

Abstract: A system and method for a dialog system which provides for a user to carry on a dialog with dialog experts having different capabilities, during an existing dialog session. A terminal device may be connected to a special switching device, which, in dependence upon a communicated user statement, may provide the user with the optimal system statement from a multiplicity of system statements received from various dialog devices.

Type: Grant

Filed: October 31, 2001

Date of Patent: October 14, 2008

Assignee: Deutsche Telekom AG

Inventors: Stefan Feldes, Karlheinz Schuhmacher
Preprocessing of digital audio data for improving perceptual sound quality on a mobile phone

Patent number: 7430506

Abstract: Since music signals are encoded by a voice encoding method optimized for human voice signals, such as EVRC (Enhanced Variable Rate Coding), in a cellular communication system, the music signals are often distorted by such an encoding method, and listeners experience pauses in music caused by the voice-optimized encoding. To improve the perceptual sound quality of music, a method for preprocessing digital audio data is provided in order to prevent the problem of pauses in music signals in a cellular phone. In particular, AGC (Automatic Gain Control) preprocessing and PHE (Pitch Harmonics Enhancement) are performed on digital audio data having a low dynamic range. By this method, the number of pauses in the music signal is reduced, and the perceptual sound quality of the music is improved.

Type: Grant

Filed: January 8, 2004

Date of Patent: September 30, 2008

Assignee: RealNetworks Asia Pacific Co., Ltd.

Inventors: Young Han Nam, Seop Hyeong Park, Yun Ho Jeon
Encoding method and apparatus, and decoding method and apparatus

Patent number: 7428489

Abstract: In a decoding apparatus (30), power compensation spectrum generation/composition units (371 to 374) adjust power of power compensation spectrums PCSP based on quantization accuracy information, normalization coefficients, gain control information, and power adjustment information. Then, power of the spectrums SP is compensated by replacing spectrums SP being equal to or smaller than a threshold with the power-adjusted power compensation spectrums PCSP, or by adding the power-adjusted power compensation spectrums PCSP to the spectrums SP.

Type: Grant

Filed: April 30, 2003

Date of Patent: September 23, 2008

Assignee: Sony Corporation

Inventors: Keisuke Touyama, Shiro Suzuki, Minoru Tsuji
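
The abstract for patent 7428489 above names two ways of applying a power compensation spectrum: replacing spectrum values at or below a threshold, or adding the compensation spectrum to the spectrum. The sketch below illustrates only those two options on plain lists of numbers. Comparing the absolute value against the threshold is an assumption of this sketch, and the name `compensate` and its `mode` parameter are hypothetical. The power adjustment that precedes this step is not shown.

```python
def compensate(spectrum, compensation, threshold, mode="replace"):
    """Apply a power compensation spectrum to a decoded spectrum.

    mode="replace": swap in the compensation value wherever the magnitude
                    of the spectrum value is at or below the threshold.
    mode="add":     add the compensation value to every spectrum value.
    """
    if len(spectrum) != len(compensation):
        raise ValueError("spectrum and compensation must have the same length")
    if mode == "replace":
        return [c if abs(s) <= threshold else s
                for s, c in zip(spectrum, compensation)]
    if mode == "add":
        return [s + c for s, c in zip(spectrum, compensation)]
    raise ValueError("mode must be 'replace' or 'add'")
```

Usage: `compensate([0.0, 1.0, 0.25], [0.5, 0.5, 0.5], 0.25)` replaces the first and third values and leaves the second untouched.
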
Received voice processing apparatus

Patent number: 7428488

Abstract: A received voice processing apparatus is provided, in which the received voice processing apparatus includes: a target spectrum calculation part for calculating, for each frequency band, a target spectrum on the basis of a compression ratio for a voice spectrum; a gain calculation part for calculating a gain value for amplifying the voice spectrum to the target spectrum; a filter coefficient calculation part for calculating a filter coefficient from the gain value; and a filter part for processing a received voice signal by using the filter coefficient.

Type: Grant

Filed: January 16, 2003

Date of Patent: September 23, 2008

Assignee: Fujitsu Limited

Inventor: Mutsumi Saito
Information processing apparatus, information processing method, program, and storage medium

Patent number: 7424429

Abstract: The correspondence between input fields and grammars is obtained (S102), and a speech utterance example is displayed using a grammar corresponding to a portion (field) designated by an input instruction (S106). Also, a speech recognition process is executed using this grammar (S108). The speech recognition result is displayed in the field designated by the input instruction (S109). Upon reception of an instruction for transmitting input data to an application, the input data is transmitted to the application (S110).

Type: Grant

Filed: June 13, 2003

Date of Patent: September 9, 2008

Assignee: Canon Kabushiki Kaisha

Inventors: Kenichiro Nakagawa, Hiroki Yamamoto
