Dynamic Time Warping Patents (Class 704/241)
  • Patent number: 8838441
    Abstract: A representation of an audio signal having a first, a second and a third frame is derived by estimating first warp information for the first and second frames and second warp information for the second and third frames, the warp information describing pitch information of the audio signal. First or second spectral coefficients for first and second frames or second and third frames are derived using first or second warp information and a first or second weighted representation of the first and second frames or second and third frames, the first or second weighted representation derived by applying a first or second window function to the first and second frames or second and third frames, wherein the first or second window function depends on the first or second warp information. The representation of the audio signal is generated including the first and the second spectral coefficients.
    Type: Grant
    Filed: February 14, 2013
    Date of Patent: September 16, 2014
    Assignee: Dolby International AB
    Inventor: Lars Villemoes
  • Patent number: 8775189
    Abstract: A wireless communication device is disclosed that accepts recorded audio data from an end-user. The audio data can be in the form of a command requesting user action. Likewise, the audio data can be converted into a text file. The audio data is reduced to a digital file in a format that is supported by the device hardware, such as a .wav, .mp3, .vnf file, or the like. The digital file is sent via secured or unsecured wireless communication to one or more server computers for further processing. In accordance with an important aspect of the invention, the system evaluates the confidence level of the speech recognition process. If the confidence level is high, the system automatically builds the application command or creates the text file for transmission to the communication device.
    Type: Grant
    Filed: August 9, 2006
    Date of Patent: July 8, 2014
    Assignee: Nuance Communications, Inc.
    Inventors: Stephen S. Burns, Mickey W. Kowitz
  • Patent number: 8775179
    Abstract: The illustrative embodiments described herein provide systems and methods for authenticating a speaker. In one embodiment, a method includes receiving reference speech input including a reference passphrase to form a reference recording, and receiving test speech input including a test passphrase to form a test recording. The method includes determining whether the test passphrase matches the reference passphrase, and determining whether one or more voice features of the speaker of the test passphrase matches one or more voice features of the speaker of the reference passphrase. The method authenticates the speaker of the test speech input in response to determining that the reference passphrase matches the test passphrase and that one or more voice features of the speaker of the test passphrase matches one or more voice features of the speaker of the reference passphrase.
    Type: Grant
    Filed: May 6, 2010
    Date of Patent: July 8, 2014
    Assignee: Senam Consulting, Inc.
    Inventor: Serge Olegovich Seyfetdinov
  • Patent number: 8706489
    Abstract: A system and method for selecting audio contents by using the speech recognition to obtain a textual phrase from a series of audio contents are provided. The system includes an output module outputting the audio contents, an input module receiving a speech input from a user, a buffer temporarily storing the audio contents within a desired period and the speech input, and a recognizing module performing a speech recognition between the audio contents within the desired period and the speech input to generate an audio phrase and the corresponding textual phrase matching with the speech input.
    Type: Grant
    Filed: August 8, 2006
    Date of Patent: April 22, 2014
    Assignee: Delta Electronics Inc.
    Inventors: Jia-lin Shen, Chien-Chou Hung
  • Patent number: 8700388
    Abstract: A processed representation of an audio signal having a sequence of frames is generated by sampling the audio signal within first and second frames of the sequence of frames, the second frame following the first frame, the sampling using information on a pitch contour of the first and second frames to derive a first sampled representation. The audio signal is sampled within the second and third frames, the third frame following the second frame in the sequence of frames. The sampling uses the information on the pitch contour of the second frame and information on a pitch contour of the third frame to derive a second sampled representation. A first scaling window is derived for the first sampled representation, and a second scaling window is derived for the second sampled representation, the scaling windows depending on the samplings applied to derive the first sampled representations or the second sampled representation.
    Type: Grant
    Filed: March 23, 2009
    Date of Patent: April 15, 2014
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Bernd Edler, Sascha Disch, Ralf Geiger, Stefan Bayer, Ulrich Kraemer, Guillaume Fuchs, Max Neuendorf, Markus Multrus, Gerald Schuller, Harald Popp
  • Patent number: 8639506
    Abstract: Method, system and computer program for determining the matching between a first and a second sampled signal using an improved Dynamic Time Warping algorithm, called Unbounded DTW. It uses a dynamic programming algorithm to find exact start-end alignment points, unknown a priori; the initial subsampling of the similarity matrix is performed via the definition of optimal synchronization points, allowing a very fast process.
    Type: Grant
    Filed: December 10, 2010
    Date of Patent: January 28, 2014
    Assignee: Telefonica, S.A.
    Inventors: Xavier Anguera Miro, Robert Macrae
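The patented Unbounded DTW variant above finds start-end alignment points via synchronization points on a subsampled similarity matrix; the details of that variant are not in the abstract. As background, the standard bounded DTW recurrence on which such methods build can be sketched as follows (function and parameter names are illustrative, not from the patent):

```python
def dtw(a, b, dist=lambda x, y: abs(x - y)):
    """Classic dynamic time warping between two 1-D sequences.

    Returns the minimal cumulative alignment cost between a and b,
    with both endpoints fixed (unlike the Unbounded variant).
    """
    n, m = len(a), len(b)
    INF = float("inf")
    # cost[i][j] = best cost of aligning a[:i] with b[:j]
    cost = [[INF] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j],      # stretch a
                                 cost[i][j - 1],      # stretch b
                                 cost[i - 1][j - 1])  # advance both
    return cost[n][m]
```

Because the warping path may repeat elements of either sequence, `dtw([1, 2, 3], [1, 2, 2, 3])` aligns at zero cost.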
  • Patent number: 8626508
    Abstract: Provided are a speech search device and speech search method that perform fuzzy search with very fast search speed and excellent search performance. In addition to fuzzy search, the distance between phoneme discrimination features included in the speech data is calculated to determine similarity to the speech, using both a suffix array and dynamic programming. The search space is narrowed by dividing the search keyword at phoneme boundaries and applying search thresholds to the plurality of divided search keywords; the search is repeated while increasing the thresholds in order, and whether to divide the keyword is determined from the length of the search keyword. This yields speech search whose search speed is very fast and whose search performance is also excellent.
    Type: Grant
    Filed: February 10, 2010
    Date of Patent: January 7, 2014
    Assignee: National University Corporation TOYOHASHI UNIVERSITY OF TECHNOLOGY
    Inventors: Koichi Katsurada, Tsuneo Nitta, Shigeki Teshima
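The progressive-threshold narrowing described above can be sketched as a loop over increasingly loose thresholds; the function, its parameters, and the all-sub-keywords acceptance rule are assumptions for illustration, since the abstract does not specify them:

```python
def narrowed_search(query_parts, candidates, distance, thresholds):
    """Search candidates with progressively looser thresholds.

    query_parts: sub-keywords after keyword division;
    candidates: indexed items to search;
    distance(part, cand): dissimilarity (lower = closer match);
    thresholds: increasing sequence, e.g. [0.1, 0.2, 0.4].
    Returns (threshold, hits) for the tightest threshold that
    yields any hits, or (None, []) if none does.
    """
    for t in thresholds:
        hits = [c for c in candidates
                if all(distance(p, c) <= t for p in query_parts)]
        if hits:
            return t, hits
    return None, []
```

Stopping at the first non-empty hit set is what makes the narrowing fast: most queries resolve at a tight threshold without scanning looser ones.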
  • Patent number: 8583439
    Abstract: Improved methods of presenting speech prompts to a user as part of an automated system that employs speech recognition or other voice input are described. The invention improves the user interface by providing in combination with at least one user prompt seeking a voice response, an enhanced user keyword prompt intended to facilitate the user selecting a keyword to speak in response to the user prompt. The enhanced keyword prompts may be the same words as those a user can speak as a reply to the user prompt but presented using a different audio presentation method, e.g., speech rate, audio level, or speaker voice, than used for the user prompt. In some cases, the user keyword prompts are different words from the expected user response keywords, or portions of words, e.g., truncated versions of keywords.
    Type: Grant
    Filed: January 12, 2004
    Date of Patent: November 12, 2013
    Assignee: Verizon Services Corp.
    Inventor: James Mark Kondziela
  • Patent number: 8560324
    Abstract: A mobile terminal including an input unit configured to receive an input to activate a voice recognition function on the mobile terminal, a memory configured to store information related to operations performed on the mobile terminal, and a controller configured to activate the voice recognition function upon receiving the input to activate the voice recognition function, to determine a meaning of an input voice instruction based on at least one prior operation performed on the mobile terminal and a language included in the voice instruction, and to provide operations related to the determined meaning of the input voice instruction based on the at least one prior operation performed on the mobile terminal and the language included in the voice instruction and based on a probability that the determined meaning of the input voice instruction matches the information related to the operations of the mobile terminal.
    Type: Grant
    Filed: January 31, 2012
    Date of Patent: October 15, 2013
    Assignee: LG Electronics Inc.
    Inventors: Jong-Ho Shin, Jae-Do Kwak, Jong-Keun Youn
  • Patent number: 8538746
    Abstract: A method of providing a quality measure for an output voice signal generated to reproduce an input voice signal, the method comprising: partitioning the input and output signals into frames; for each frame of the input signal, determining a disturbance relative to each of a plurality of frames of the output signal; determining a subset of the determined disturbances comprising one disturbance for each input frame such that a sum of the disturbances in the subset is a minimum; and using the subset of disturbances to provide the measure of quality.
    Type: Grant
    Filed: September 27, 2012
    Date of Patent: September 17, 2013
    Assignee: AudioCodes Ltd.
    Inventors: Ilan D. Shallom, Nitay Shiran, Felix Flomen
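The minimum-sum subset selection in this abstract can be sketched as a DTW-style dynamic program over a disturbance matrix. The monotonic-order constraint (chosen output frames non-decreasing in time) is an assumption for illustration; the abstract does not state how frame order is enforced:

```python
def min_disturbance(D):
    """D[i][j] = disturbance of input frame i vs. output frame j.

    Choose one output frame per input frame, preserving frame order
    (chosen output indices non-decreasing), minimising the total
    disturbance. Runs in O(n*m) via a running prefix minimum.
    """
    n, m = len(D), len(D[0])
    INF = float("inf")
    best = [[INF] * m for _ in range(n)]
    best[0] = list(D[0])
    for i in range(1, n):
        run = INF
        for j in range(m):
            run = min(run, best[i - 1][j])  # best over indices <= j
            best[i][j] = D[i][j] + run
    return min(best[n - 1])
```

For `[[1, 5], [5, 1]]` the optimum pairs both input frames with output frame 1 (cost 1 + 1 = 2), since the order constraint forbids nothing here that would do better.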
  • Patent number: 8521529
    Abstract: An input signal is converted to a feature-space representation. The feature-space representation is projected onto a discriminant subspace using a linear discriminant analysis transform to enhance the separation of feature clusters. Dynamic programming is used to find global changes to derive optimal cluster boundaries. The cluster boundaries are used to identify the segments of the audio signal.
    Type: Grant
    Filed: April 18, 2005
    Date of Patent: August 27, 2013
    Assignee: Creative Technology Ltd
    Inventors: Michael M. Goodwin, Jean Laroche
  • Patent number: 8478587
    Abstract: A sound analysis device comprises: a sound parameter calculation unit operable to acquire an audio signal and calculate a sound parameter for each of partial audio signals, the partial audio signals each being the acquired audio signal in a unit of time; a category determination unit operable to determine, from among a plurality of environmental sound categories, which environmental sound category each of the partial audio signals belongs to, based on a corresponding one of the calculated sound parameters; a section setting unit operable to sequentially set judgement target sections on a time axis as time elapses, each of the judgment target sections including two or more of the units of time, the two or more of the units of time being consecutive; and an environment judgment unit operable to judge, based on a number of partial audio signals in each environmental sound category determined in at least a most recent judgment target section, an environment that surrounds the sound analysis device in at least the most recent judgment target section.
    Type: Grant
    Filed: March 13, 2008
    Date of Patent: July 2, 2013
    Assignee: Panasonic Corporation
    Inventors: Takashi Kawamura, Ryouichi Kawanishi
  • Patent number: 8412518
    Abstract: A representation of an audio signal having a first frame, a second frame following the first frame, and a third frame following the second frame, is derived by estimating first warp information for the first and the second frame and second warp information for the second frame and the third frame, the warp information describing a pitch information of the audio signal. First spectral coefficients for the first and the second frame are derived using the first warp information and a first weighted representation of the first and the second frame, the first weighted representation derived by applying a first window function to the first and the second frames, wherein the first window function depends on the first warp information.
    Type: Grant
    Filed: January 29, 2010
    Date of Patent: April 2, 2013
    Assignee: Dolby International AB
    Inventor: Lars Villemoes
  • Patent number: 8407051
    Abstract: A speech recognizing apparatus includes a speech start instructing section 3 for instructing to start speech recognition; a speech input section 1 for receiving uttered speech and converting to a speech signal; a speech recognizing section 2 for recognizing the speech on the basis of the speech signal; an utterance start time detecting section 4 for detecting duration from the time when the speech start instructing section instructs to the time when the speech input section delivers the speech signal; an utterance timing deciding section 5 for deciding utterance timing indicating whether the utterance start is quick or slow by comparing the duration detected by the utterance start time detecting section with a prescribed threshold; an interaction control section 6 for determining a content, which is to be shown when exhibiting a recognition result of the speech recognizing section, in accordance with the utterance timing decided; a system response generating section 7 for generating a system response on the basis of the determined content.
    Type: Grant
    Filed: March 27, 2008
    Date of Patent: March 26, 2013
    Assignee: Mitsubishi Electric Corporation
    Inventors: Yuzuru Inoue, Tadashi Suzuki, Fumitaka Sato, Takayoshi Chikuri
  • Patent number: 8374869
    Abstract: An utterance verification method for an isolated word N-best speech recognition result includes: calculating log likelihoods of a context-dependent phoneme and an anti-phoneme model based on an N-best speech recognition result for an input utterance; measuring a confidence score of an N-best speech-recognized word using the log likelihoods; calculating distance between phonemes for the N-best speech-recognized word; comparing the confidence score with a threshold and the distance with a predetermined mean of distances; and accepting the N-best speech-recognized word when the compared results for the confidence score and the distance correspond to acceptance.
    Type: Grant
    Filed: August 4, 2009
    Date of Patent: February 12, 2013
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Jeom Ja Kang, Yunkeun Lee, Jeon Gue Park, Ho-Young Jung, Hyung-Bae Jeon, Hoon Chung, Sung Joo Lee, Euisok Chung, Ji Hyun Wang, Byung Ok Kang, Ki-young Park, Jong Jin Kim
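The dual acceptance test in this abstract (confidence score against a threshold, inter-phoneme distance against a mean) can be sketched as below; the tuple layout and the direction of each comparison are assumptions for illustration, not taken from the patent:

```python
def verify_nbest(words, threshold, mean_distance):
    """Accept N-best hypotheses that pass both acceptance criteria.

    words: list of (word, confidence, distance) tuples from the
    N-best recognizer -- layout is illustrative.
    A word is accepted only when its confidence score meets the
    threshold AND its inter-phoneme distance meets the mean.
    """
    accepted = []
    for word, confidence, distance in words:
        if confidence >= threshold and distance >= mean_distance:
            accepted.append(word)
    return accepted
```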
  • Patent number: 8355907
    Abstract: In one embodiment, the present invention comprises a vocoder having at least one input and at least one output, an encoder comprising a filter having at least one input operably connected to the input of the vocoder and at least one output, a decoder comprising a synthesizer having at least one input operably connected to the at least one output of the encoder, and at least one output operably connected to the at least one output of the vocoder, wherein the decoder comprises a memory and the decoder is adapted to execute instructions stored in the memory comprising phase matching and time-warping a speech frame.
    Type: Grant
    Filed: July 27, 2005
    Date of Patent: January 15, 2013
    Assignee: QUALCOMM Incorporated
    Inventors: Rohit Kapoor, Serafin Diaz Spindola
  • Patent number: 8332212
    Abstract: A method and system for improving the efficiency of real-time and non-real-time speech transcription by machine speech recognizers, human dictation typists, and human voicewriters using speech recognizers. In particular, the pacing with which recorded speech is presented to transcriptionists is automatically adjusted by monitoring the transcriptionists' output by comparing the output acoustically or phonetically to the presented recorded speech as well as monitoring the resulting transcription, and accordingly adjusting the pacing.
    Type: Grant
    Filed: June 17, 2009
    Date of Patent: December 11, 2012
    Assignee: Cogi, Inc.
    Inventors: Andreas Wittenstein, Mark Cromack
  • Patent number: 8296131
    Abstract: A method of providing a quality measure for an output voice signal generated to reproduce an input voice signal, the method comprising: partitioning the input and output signals into frames; for each frame of the input signal, determining a disturbance relative to each of a plurality of frames of the output signal; determining a subset of the determined disturbances comprising one disturbance for each input frame such that a sum of the disturbances in the subset is a minimum; and using the subset of disturbances to provide the measure of quality.
    Type: Grant
    Filed: December 30, 2008
    Date of Patent: October 23, 2012
    Assignee: AudioCodes Ltd.
    Inventors: Ilan D. Shallom, Nitay Shiran
  • Patent number: 8271291
    Abstract: A method for identifying a frame type is disclosed. The present invention includes receiving current frame type information, obtaining previously received previous frame type information, generating frame identification information of a current frame using the current frame type information and the previous frame type information, and identifying the current frame using the frame identification information. And, a method for identifying a frame type is disclosed. The present invention includes receiving a backward type bit corresponding to current frame type information, obtaining a forward type bit corresponding to previous frame type information, generating frame identification information of a current frame by placing the backward type bit at a first position and placing the forward type bit at a second position.
    Type: Grant
    Filed: May 8, 2009
    Date of Patent: September 18, 2012
    Assignee: LG Electronics Inc.
    Inventors: Sang Bae Chon, Lae Hoon Kim, Koeng Mo Sung
  • Patent number: 8258947
    Abstract: Embodiments of the present invention provide a method, system and computer program product for translation verification of source strings for controls in a target application graphical user interface (GUI). In an embodiment of the invention, a method for translation verification of source strings for controls in a target application GUI can include loading a target GUI for an application under test in a functional testing tool executing in memory by a processor of a computing system, retrieving different translated source strings in a target spoken language for respectively different control elements of the target GUI and, determining a score for each one of the translated source strings. Thereafter, an alert can be provided in the functional testing tool for each translated source string corresponding to a determined score failing to meet a threshold value, such as a score that falls below a threshold value, or a score that exceeds a threshold value.
    Type: Grant
    Filed: September 29, 2009
    Date of Patent: September 4, 2012
    Assignee: International Business Machines Corporation
    Inventors: Jennifer G. Becker, Kenneth Lee McClamroch, VinodKumar Raghavan, Peter Sun
  • Patent number: 8255216
    Abstract: A method of and a system for processing speech. A spoken utterance of a plurality of characters can be received. A plurality of known character sequences that potentially correspond to the spoken utterance can be selected. Each selected known character sequence can be scored based on, at least in part, a weighting of individual characters that comprise the known character sequence.
    Type: Grant
    Filed: October 30, 2006
    Date of Patent: August 28, 2012
    Assignee: Nuance Communications, Inc.
    Inventor: Kenneth D. White
  • Patent number: 8239190
    Abstract: A method of communicating speech comprising time-warping a residual low band speech signal to an expanded or compressed version of the residual low band speech signal, time-warping a high band speech signal to an expanded or compressed version of the high band speech signal, and merging the time-warped low band and high band speech signals to give an entire time-warped speech signal. In the low band, the residual low band speech signal is synthesized after time-warping of the residual low band signal while in the high band, an unwarped high band signal is synthesized before time-warping of the high band speech signal. The method may further comprise classifying speech segments and encoding the speech segments. The encoding of the speech segments may be one of code-excited linear prediction, noise-excited linear prediction or 1/8 frame (silence) coding.
    Type: Grant
    Filed: August 22, 2006
    Date of Patent: August 7, 2012
    Assignee: QUALCOMM Incorporated
    Inventors: Rohit Kapoor, Serafin Diaz Spindola
  • Patent number: 8238351
    Abstract: The process of traversing a K may involve determining a match between a root node and a Result node of a node on the asCase list of a current K node. When learning is off and a match is not found, the procedure may ignore the particle being processed. An alternative solution determines which node on the asCase list is the most likely to be the next node. While the K Engine is traversing and events are being recorded into a K structure, a count field may be added to each K node to contain a record of how many times each K path has been traversed. The count field may be updated according to the processes traversing the K. Typically, the count is incremented only for learning functions. This count field may be used in determining which node may be the most (or least) probable.
    Type: Grant
    Filed: April 4, 2006
    Date of Patent: August 7, 2012
    Assignee: Unisys Corporation
    Inventor: Jane Campbell Mazzagatti
  • Patent number: 8234411
    Abstract: Methods, systems, computer readable media, and apparatuses for providing enhanced content are presented. Data including a first program, a first caption stream associated with the first program, and a second caption stream associated with the first program may be received. The second caption stream may be extracted from the data, and a second program may be encoded with the second caption stream. The first program may be transmitted with the first caption stream including first captions and may include first content configured to be played back at a first speed. In response to receiving an instruction to change play back speed, the second program may be transmitted with the second caption stream. The second program may include the first content configured to be played back at a second speed different from the first speed, and the second caption stream may include second captions different from the first captions.
    Type: Grant
    Filed: September 2, 2010
    Date of Patent: July 31, 2012
    Assignee: Comcast Cable Communications, LLC
    Inventor: Ross Gilson
  • Patent number: 8165870
    Abstract: The method and apparatus utilize a filter to remove a variety of non-dictated words from data based on probability and improve the effectiveness of creating a language model.
    Type: Grant
    Filed: February 10, 2005
    Date of Patent: April 24, 2012
    Assignee: Microsoft Corporation
    Inventors: Alejandro Acero, Dong Yu, Julian J. Odell, Milind V. Mahajan, Peter K. L. Mau
  • Patent number: 8140330
    Abstract: Embodiments of a method and system for detecting repeated patterns in dialog systems are described. The system includes a dynamic time warping (DTW) based pattern comparison algorithm that is used to find the best matching parts between a correction utterance and an original utterance. Reference patterns are generated from the correction utterance by an unsupervised segmentation scheme. No significant information about the position of the repeated parts in the correction utterance is assumed, as each reference pattern is compared with the original utterance from the beginning of the utterance to the end. A pattern comparison process with DTW is executed without knowledge of fixed end-points. A recursive DTW computation is executed to find the best matching parts that are considered as the repeated parts as well as the end-points of the utterance.
    Type: Grant
    Filed: June 13, 2008
    Date of Patent: March 20, 2012
    Assignee: Robert Bosch GmbH
    Inventors: Mert Cevik, Fuliang Weng
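The key property named in this abstract, DTW "without knowledge of fixed end-points", corresponds to subsequence DTW: the pattern may begin and end anywhere in the utterance. A minimal sketch of that variant follows (illustrative only, not the Bosch unsupervised segmentation scheme):

```python
def subsequence_dtw(pattern, utterance, dist=lambda x, y: abs(x - y)):
    """DTW with free end-points on the utterance side.

    Returns (cost, end_index): minimal cost of warping the whole
    pattern onto some contiguous stretch of the utterance, and the
    index just past where that stretch ends.
    """
    n, m = len(pattern), len(utterance)
    INF = float("inf")
    prev = [0.0] * (m + 1)          # free start: zero cost along row 0
    for i in range(1, n + 1):
        cur = [INF] * (m + 1)
        for j in range(1, m + 1):
            d = dist(pattern[i - 1], utterance[j - 1])
            cur[j] = d + min(prev[j], cur[j - 1], prev[j - 1])
        prev = cur
    end = min(range(1, m + 1), key=lambda j: prev[j])  # free end
    return prev[end], end
```

Matching the pattern `[1, 2]` inside `[9, 1, 2, 9]` costs 0 and ends just after position 3, even though the surrounding 9s would make full-sequence DTW expensive.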
  • Patent number: 8131550
    Abstract: An apparatus for providing improved voice conversion includes a sub-feature generator and a transformation element. The sub-feature generator may be configured to define sub-feature units with respect to a feature of source speech. The transformation element may be configured to perform voice conversion of the source speech to target speech based on the conversion of the sub-feature units to corresponding target speech sub-feature units using a conversion model trained with respect to converting training source speech sub-feature units to training target speech sub-feature units.
    Type: Grant
    Filed: October 4, 2007
    Date of Patent: March 6, 2012
    Assignee: Nokia Corporation
    Inventors: Jani Nurminen, Elina Helander
  • Patent number: 8121833
    Abstract: The exemplary embodiments of the invention provide at least a method and an apparatus to perform operations including dividing a sound signal into a series of successive frames, dividing each frame into a number of subframes, producing a residual signal by filtering the sound signal through a linear prediction analysis filter, locating a last pitch pulse of the sound signal of a previous frame from the residual signal, extracting a pitch pulse prototype of given length around a position of the last pitch pulse of the previous frame using the residual signal, and locating pitch pulses in a current frame using the pitch pulse prototype.
    Type: Grant
    Filed: October 21, 2008
    Date of Patent: February 21, 2012
    Assignee: Nokia Corporation
    Inventors: Mikko Tammi, Milan Jelinek, Claude LaFlamme, Vesa Ruoppila
  • Patent number: 8060365
    Abstract: A dialog processing system which includes a target expression data extraction unit for extracting a plurality of target expression data each including a pattern matching portion which matches an utterance pattern, which are inputted by an utterance pattern input unit and is an utterance structure derived from contents of field-independent general conversations, among a plurality of utterance data which are inputted by an utterance data input unit and obtained by converting contents of a plurality of conversations in one field; a feature extraction unit for retrieving the pattern matching portions, respectively, from the plurality of target expression data extracted, and then for extracting feature quantity common to the plurality of pattern matching portions; and a mandatory data extraction unit for extracting mandatory data in the one field included in the plurality of utterance data by use of the feature quantities extracted.
    Type: Grant
    Filed: July 3, 2008
    Date of Patent: November 15, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Nobuyasu Itoh, Shiho Negishi, Hironori Takeuchi
  • Publication number: 20110224984
    Abstract: Method, system and computer program for determining the matching between a first and a second sampled signal using an improved Dynamic Time Warping algorithm, called Unbounded DTW. It uses a dynamic programming algorithm to find exact start-end alignment points, unknown a priori; the initial subsampling of the similarity matrix is performed via the definition of optimal synchronization points, allowing a very fast process.
    Type: Application
    Filed: December 10, 2010
    Publication date: September 15, 2011
    Applicant: TELEFONICA, S.A.
    Inventors: Xavier Anguera Miro, Robert Macrae
  • Patent number: 7996213
    Abstract: A similarity degree estimation method is performed by two processes. In a first process, an inter-band correlation matrix is created from spectral data of an input voice such that the spectral data are divided into a plurality of discrete bands which are separated from each other with spaces therebetween along a frequency axis, a plurality of envelope components of the spectral data are obtained from the plurality of the discrete bands, and elements of the inter-band correlation matrix are correlation values between the respective envelope components of the input voice. In a second process, a degree of similarity is calculated between a pair of input voices to be compared with each other by using respective inter-band correlation matrices obtained for the pair of the input voices through the inter-band correlation matrix creation process.
    Type: Grant
    Filed: March 20, 2007
    Date of Patent: August 9, 2011
    Assignee: Yamaha Corporation
    Inventors: Mikio Tohyama, Michiko Kazama, Satoru Goto, Takehiko Kawahara, Yasuo Yoshioka
  • Patent number: 7962336
    Abstract: The present invention provides a method and apparatus for enrollment and evaluation of speaker authentication. The method for enrollment of speaker authentication, comprising: generating a plurality of acoustic feature vector sequences respectively based on a plurality of utterances of the same content spoken by a speaker; generating a reference template from said plurality of acoustic feature vector sequences; generating a corresponding pseudo-impostor feature vector sequence for each of said plurality of acoustic feature vector sequences based on a code book that includes a plurality of codes and their corresponding feature vectors; and selecting an optimal acoustic feature subset based on said plurality of acoustic feature vector sequences, said reference template and said plurality of pseudo-impostor feature vector sequences.
    Type: Grant
    Filed: September 21, 2007
    Date of Patent: June 14, 2011
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Jian Luan, Jie Hao
  • Patent number: 7945446
    Abstract: Spectrum envelope of an input sound is detected. In the meantime, a converting spectrum is acquired which is a frequency spectrum of a converting sound comprising a plurality of sounds, such as unison sounds. Output spectrum is generated by imparting the detected spectrum envelope of the input sound to the acquired converting spectrum. Sound signal is synthesized on the basis of the generated output spectrum. Further, a pitch of the input sound may be detected, and frequencies of peaks in the acquired converting spectrum may be varied in accordance with the detected pitch of the input sound. In this manner, the output spectrum can have the pitch and spectrum envelope of the input sound and spectrum frequency components of the converting sound comprising a plurality of sounds, and thus, unison sounds can be readily generated with simple arrangements.
    Type: Grant
    Filed: March 9, 2006
    Date of Patent: May 17, 2011
    Assignee: Yamaha Corporation
    Inventors: Hideki Kemmochi, Yasuo Yoshioka, Jordi Bonada
  • Publication number: 20110066434
    Abstract: The invention can recognize all languages and input words. It needs m unknown voices to represent m categories of known words with similar pronunciations. Words can be pronounced in any languages, dialects or accents. Each will be classified into one of m categories represented by its most similar unknown voice. When a user pronounces a word, the invention finds its F most similar unknown voices. All words in the F categories represented by those F unknown voices will be arranged according to their pronunciation similarity and alphabetic letters. The pronounced word should be among the top words. Since we only find the F most similar unknown voices from m (=500) unknown voices, and since the same word can be classified into several categories, our recognition method is stable for all users and can quickly and accurately recognize all languages (English, Chinese, etc.) and accept many more input words without using samples.
    Type: Application
    Filed: September 29, 2009
    Publication date: March 17, 2011
    Inventors: Tze-Fen LI, Tai-Jan Lee Li, Shih-Tzung Li, Shih-Hon Li, Li-Chuan Liao
  • Patent number: 7860718
    Abstract: Provided are an apparatus and method for speech segment detection, and a system for speech recognition. The apparatus is equipped with a sound receiver and an image receiver and includes: a lip motion signal detector for detecting a motion region from image frames output from the image receiver, applying lip motion image feature information to the detected motion region, and detecting a lip motion signal; and a speech segment detector for detecting a speech segment using sound frames output from the sound receiver and the lip motion signal detected from the lip motion signal detector. Since lip motion image information is checked in a speech segment detection process, it is possible to prevent dynamic noise from being misrecognized as speech.
    Type: Grant
    Filed: December 4, 2006
    Date of Patent: December 28, 2010
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Soo Jong Lee, Sang Hun Kim, Young Jik Lee, Eung Kyeu Kim
  • Patent number: 7860714
    Abstract: The present invention is a detection system of a segment including specific sound signal which detects a segment in a stored sound signal similar to a reference sound signal, including: a reference signal spectrogram division portion which divides a reference signal spectrogram into spectrograms of small-regions; a small-region reference signal spectrogram coding portion which encodes the small-region reference signal spectrogram to a reference signal small-region code; a small-region stored signal spectrogram coding portion which encodes a small-region stored signal spectrogram to a stored signal small-region code; a similar small-region spectrogram detection portion which detects a small-region spectrogram similar to the small-region reference signal spectrograms based on a degree of similarity of a code; and a degree of segment similarity calculation portion which uses a degree of small-region similarity and calculates a degree of similarity between the segment of the stored signal and the reference signal.
    Type: Grant
    Filed: July 1, 2005
    Date of Patent: December 28, 2010
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Hidehisa Nagano, Takayuki Kurozumi, Kunio Kashino
  • Patent number: 7809561
    Abstract: The present invention provides a method and apparatus for verification of speaker authentication. A method for verification of speaker authentication, comprising: inputting an utterance containing a password that is spoken by a speaker; extracting an acoustic feature vector sequence from said inputted utterance; DTW-matching said extracted acoustic feature vector sequence and a speaker template enrolled by an enrolled speaker; calculating each of a plurality of local distances between said DTW-matched acoustic feature vector sequence and said speaker template; nonlinear-transforming said each local distance calculated to give more weights on small local distances; calculating a DTW-matching score based on said plurality of local distances nonlinear-transformed; and comparing said matching score with a predefined discriminating threshold to determine whether said inputted utterance is an utterance containing a password spoken by the enrolled speaker.
    Type: Grant
    Filed: March 28, 2007
    Date of Patent: October 5, 2010
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Jian Luan, Jie Hao
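A minimal Python sketch of this kind of DTW-based verification scoring, assuming a Euclidean local distance and an exponential transform exp(-alpha*d) as the nonlinearity that emphasizes small local distances (both choices are illustrative; the patent does not fix them):

```python
import numpy as np

def dtw_path(a, b):
    """Classic DTW between feature sequences a (n,d) and b (m,d).
    Returns the optimal warping path as a list of (i, j) index pairs."""
    n, m = len(a), len(b)
    cost = np.full((n + 1, m + 1), np.inf)
    cost[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(a[i - 1] - b[j - 1])
            cost[i, j] = d + min(cost[i - 1, j], cost[i, j - 1],
                                 cost[i - 1, j - 1])
    # Backtrack from the end, preferring the diagonal step on ties.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        step = np.argmin([cost[i - 1, j - 1], cost[i - 1, j], cost[i, j - 1]])
        if step == 0:
            i, j = i - 1, j - 1
        elif step == 1:
            i -= 1
        else:
            j -= 1
    return path[::-1]

def verification_score(test, template, alpha=1.0):
    """Mean of nonlinearly transformed local distances along the DTW path;
    exp(-alpha*d) maps small distances near 1 and large ones near 0."""
    path = dtw_path(test, template)
    local = [np.linalg.norm(test[i] - template[j]) for i, j in path]
    return float(np.mean([np.exp(-alpha * d) for d in local]))
```

The resulting score lies in (0, 1]; an utterance identical to the template scores 1.0, so it can be compared directly against a fixed discriminating threshold.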
  • Patent number: 7774202
    Abstract: A speech activated control system for controlling aerial vehicle components, program product, and associated methods are provided. The system can include a host processor adapted to develop speech recognition models and to provide speech command recognition. The host processor can be positioned in communication with a database for storing and retrieving speech recognition models. The system can include an avionic computer in communication with the host processor and adapted to provide command function management, a display and control processor in communication with the avionic computer adapted to provide a user interface between a user and the avionic computer, and a data interface positioned in communication with the avionic computer and the host processor provided to divorce speech command recognition functionality from vehicle or aircraft-related speech-command functionality.
    Type: Grant
    Filed: June 12, 2006
    Date of Patent: August 10, 2010
    Assignee: Lockheed Martin Corporation
    Inventors: Richard P. Spengler, Jon C. Russo, Gregory W. Barnett, Kermit L. Armbruster
  • Patent number: 7769566
    Abstract: A method is provided for utilizing the human perceptual system by providing a spectrum of event log data for listening. Event log data is received. Events of the event log data are mapped to an x-axis of a spectrum based on time, where each event of the event log data corresponds to a time slot on the x-axis. Categories for the events are mapped to a y-axis of the spectrum, where the y-axis is a frequency axis, and where each of the categories respectively corresponds to a frequency of the multiple frequencies. The significance of the events of the event log data is mapped to a z-axis of the spectrum, where the z-axis is a magnitude axis. The time from the x-axis, the multiple frequencies from the y-axis, and the magnitude from the z-axis of the spectrum are translated into sound.
    Type: Grant
    Filed: March 4, 2008
    Date of Patent: August 3, 2010
    Assignee: International Business Machines Corporation
    Inventor: Xiaoming Zhang
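The x/y/z mapping described above can be sketched in a few lines of Python: each event is a (time slot, category, significance) triple, each category gets a fixed frequency, and each event contributes a sine of that frequency, scaled by its significance, in its time slot. The frequency assignment, sample rate, and slot duration below are illustrative assumptions, not taken from the patent:

```python
import numpy as np

def sonify(events, categories, sr=8000, slot_dur=0.25):
    """events: list of (time_slot, category, significance) triples.
    Returns a mono audio buffer where time slots are laid out along the
    x-axis, category frequencies along the y-axis, and significance as
    the z-axis magnitude."""
    # Hypothetical frequency ladder: one fixed tone per category.
    freqs = {c: 220.0 * 2 ** (k / 2) for k, c in enumerate(categories)}
    n_slots = max(t for t, _, _ in events) + 1
    n = int(slot_dur * sr)                # samples per time slot
    out = np.zeros(n_slots * n)
    t = np.arange(n) / sr
    for slot, cat, mag in events:
        # Significance scales the amplitude of the category's tone.
        out[slot * n:(slot + 1) * n] += mag * np.sin(2 * np.pi * freqs[cat] * t)
    return out
```

A slot with no events stays silent, so gaps in the log are audible as gaps in the sound.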
  • Patent number: 7739111
    Abstract: A pattern matching method for matching between a first symbol sequence and a second symbol sequence which is shorter than the first symbol sequence is provided. The method includes the steps of performing DP matching between the first and second symbol sequences to create a matrix of the DP matching transition, detecting the maximum length of lengths of consecutive correct answers based on the matrix of the DP matching transition, and calculating similarity based on the maximum length.
    Type: Grant
    Filed: August 9, 2006
    Date of Patent: June 15, 2010
    Assignee: Canon Kabushiki Kaisha
    Inventor: Kazue Kaneko
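The three steps in the abstract (DP matching, detecting the maximum run of consecutive correct answers in the transition, computing a similarity from that run) can be sketched as follows. The edit-distance recursion and the normalization by the shorter sequence's length are assumptions for illustration; the patent does not commit to either:

```python
def dp_align(long_seq, short_seq):
    """Edit-distance DP matching between two symbol sequences.
    Returns the backtracked transition as a list of booleans, one per
    alignment step, True where the symbols match."""
    n, m = len(long_seq), len(short_seq)
    cost = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        cost[i][0] = i
    for j in range(m + 1):
        cost[0][j] = j
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            sub = cost[i - 1][j - 1] + (long_seq[i - 1] != short_seq[j - 1])
            cost[i][j] = min(sub, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Backtrack the matrix of the DP matching transition.
    steps, i, j = [], n, m
    while i > 0 and j > 0:
        if cost[i][j] == cost[i - 1][j - 1] + (long_seq[i - 1] != short_seq[j - 1]):
            steps.append(long_seq[i - 1] == short_seq[j - 1])
            i, j = i - 1, j - 1
        elif cost[i][j] == cost[i - 1][j] + 1:
            steps.append(False)
            i -= 1
        else:
            steps.append(False)
            j -= 1
    return steps[::-1]

def similarity(long_seq, short_seq):
    """Similarity = longest run of consecutive matches, normalized by
    the shorter sequence's length (normalization is an assumption)."""
    best = run = 0
    for ok in dp_align(long_seq, short_seq):
        run = run + 1 if ok else 0
        best = max(best, run)
    return best / len(short_seq)
```

Counting the longest consecutive run, rather than the total number of matches, rewards a contiguous hit of the short sequence inside the long one over the same symbols matched in scattered positions.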
  • Patent number: 7733988
    Abstract: A plurality of decoding metrics for a current frame may be generated based on a correlation set for a current frame and a correlation set for at least one previous frame. Whether a signal is present on a control channel may then be determined based on the generated decoding metrics.
    Type: Grant
    Filed: October 28, 2005
    Date of Patent: June 8, 2010
    Assignee: Alcatel-Lucent USA Inc.
    Inventors: Rainer Bachl, Francis Dominique, Hongwei Kong, Walid E. Nabhane
  • Patent number: 7720677
    Abstract: A spectral representation of an audio signal having consecutive audio frames can be derived more efficiently, when a common time warp is estimated for any two neighboring frames, such that a following block transform can additionally use the warp information. Thus, window functions required for successful application of an overlap and add procedure during reconstruction can be derived and applied, the window functions already anticipating the re-sampling of the signal due to the time warping. Therefore, the increased efficiency of block-based transform coding of time-warped signals can be used without introducing audible discontinuities.
    Type: Grant
    Filed: August 11, 2006
    Date of Patent: May 18, 2010
    Assignee: Coding Technologies AB
    Inventor: Lars Villemoes
  • Patent number: 7707033
    Abstract: Training a consumer-oriented application device is based on a plurality of user-presented speech items. A progress measure is reported regarding a training status reached for a particular user person. In particular, the training status is visually represented by an animated character creature which has a plurality of training status representative maturity statuses that are each associated to a corresponding training level.
    Type: Grant
    Filed: June 18, 2002
    Date of Patent: April 27, 2010
    Assignee: Koninklijke Philips Electronics N.V.
    Inventor: Lucas Jacobus Franciscus Geurts
  • Patent number: 7698136
    Abstract: The present invention is directed to a computer implemented method and apparatus for flexibly recognizing meaningful data items within an arbitrary user utterance. According to one example embodiment of the invention, a set of one or more key phrases and a set of one or more filler phrases are defined, probabilities are assigned to the key phrases and/or the filler phrases, and the user utterance is evaluated against the set of key phrases and the set of filler phrases using the probabilities.
    Type: Grant
    Filed: January 28, 2003
    Date of Patent: April 13, 2010
    Assignee: Voxify, Inc.
    Inventors: Patrick T. M. Nguyen, Adeeb W. M. Shana'a, Amit V. Desai
  • Patent number: 7680651
    Abstract: In accordance with the exemplary embodiments of the invention there is disclosed at least a method and apparatus for determining a long-term-prediction delay parameter characterizing a long term prediction in a technique using signal modification for digitally encoding a sound signal, the sound signal is divided into a series of successive frames, a feature of the sound signal is located in a previous frame, a corresponding feature of the sound signal is located in a current frame, and the long-term-prediction delay parameter is determined for the current frame while mapping, with the long term prediction, the signal feature of the previous frame with the corresponding signal feature of the current frame. Each divided frame of the sound signal is partitioned into a plurality of signal segments, and at least a part of the signal segments of the frame are warped while constraining the warped signal segments inside the frame.
    Type: Grant
    Filed: December 13, 2002
    Date of Patent: March 16, 2010
    Assignee: Nokia Corporation
    Inventors: Mikko Tammi, Milan Jelinek, Claude LaFlamme, Vesa Ruoppila
  • Publication number: 20090313016
    Abstract: Embodiments of a method and system for detecting repeated patterns in dialog systems are described. The system includes a dynamic time warping (DTW) based pattern comparison algorithm that is used to find the best matching parts between a correction utterance and an original utterance. Reference patterns are generated from the correction utterance by an unsupervised segmentation scheme. No significant information about the position of the repeated parts in the correction utterance is assumed, as each reference pattern is compared with the original utterance from the beginning of the utterance to the end. A pattern comparison process with DTW is executed without knowledge of fixed end-points. A recursive DTW computation is executed to find the best matching parts that are considered as the repeated parts as well as the end-points of the utterance.
    Type: Application
    Filed: June 13, 2008
    Publication date: December 17, 2009
    Applicant: ROBERT BOSCH GMBH
    Inventors: Mert Cevik, Fuliang Weng
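Comparing a reference pattern against an utterance "from the beginning of the utterance to the end" without fixed end-points is the classic subsequence-DTW setup: zeroing the first row lets the match start at any utterance frame, and the best end-point is read off the last row. The sketch below uses that common open-end formulation with a Euclidean local distance; the patent's exact recursion may differ:

```python
import numpy as np

def subsequence_dtw(ref, utt):
    """Match reference pattern ref anywhere inside utt (both (n,d) arrays).
    Returns (cost of the best match, utterance frame index where it ends)."""
    n, m = len(ref), len(utt)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, :] = 0.0                        # free start anywhere in the utterance
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = np.linalg.norm(ref[i - 1] - utt[j - 1])
            D[i, j] = d + min(D[i - 1, j], D[i, j - 1], D[i - 1, j - 1])
    end = int(np.argmin(D[n, 1:]))       # free end: best-matching end frame
    return D[n, 1 + end], end
```

Running this once per reference pattern generated from the correction utterance yields, for each pattern, both a match cost and the end-point of the best-matching part of the original utterance.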
  • Patent number: 7509257
    Abstract: A method and apparatus for adapting reference templates is provided. The method includes adapting one or more reference templates using a stored test utterance by replacing data within the reference templates with a weighted interpolation of that data and corresponding data within the test utterance.
    Type: Grant
    Filed: December 24, 2002
    Date of Patent: March 24, 2009
    Assignee: Marvell International Ltd.
    Inventor: Hagai Aronowitz
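The weighted interpolation itself reduces to one line per frame. The sketch below assumes the test utterance has already been aligned to the template (e.g., by DTW) so that frames correspond one-to-one; the 0.2 adaptation weight is an illustrative value, not taken from the patent:

```python
import numpy as np

def adapt_template(template, test, weight=0.2):
    """Replace each template frame with a weighted interpolation of that
    frame and the corresponding frame of the (pre-aligned) test utterance."""
    template = np.asarray(template, dtype=float)
    test = np.asarray(test, dtype=float)
    assert template.shape == test.shape, "frames must already correspond"
    return (1.0 - weight) * template + weight * test
```

A small weight nudges the template toward each new utterance gradually, so one atypical utterance cannot overwrite the enrolled model.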
  • Patent number: 7398208
    Abstract: A method models voice units and produces reference segments for modeling voice units. The reference segments describe voice modules by characteristic vectors, the characteristic vectors being stored in the order in which they are found in a training voice signal. Alternative characteristic vectors are associated with each characteristic vector. The reference segments for describing the voice modules are combined during the modeling of larger voice units. During recognition, the respectively best-adapted characteristic vector alternatives are used to determine the distance between a test utterance and the larger voice units.
    Type: Grant
    Filed: October 1, 2002
    Date of Patent: July 8, 2008
    Assignee: Siemens Aktiengesellschaft
    Inventor: Bernhard Kämmerer
  • Publication number: 20080162134
    Abstract: The present invention provides a speech processing apparatus arranged for the input or output of a speech data signal and including a function generating means arranged for producing a representation of a vocal-tract potential function representative of a speech source. As an example, a speaker identification process can comprise means to capture an incoming voice signal, for example from a microphone or telephone line; means to process the signal electronically to generate a time-varying series of binary vocal-tract potentials and associated non-vowel binary parameters; means to refine the signal to remove the speaker-independent speech components; and means to compare the residual signal with a database of such residual features of known individuals.
    Type: Application
    Filed: January 7, 2008
    Publication date: July 3, 2008
    Applicant: King's College London
    Inventors: Barbara Janey Forbes, Edward Roy Pike
  • Patent number: 7394833
    Abstract: A device is disclosed that makes packetized and encoded speech data audible to a listener, as is a method for operating the device. The device includes a unit for generating a synchronization request for reducing an amount of synchronization delay, and further includes a speech decoder that is responsive to the synchronization delay adjustment request for executing a time-warping operation for one of lengthening or shortening a duration of a speech frame. In one embodiment the speech decoder comprises a code excited linear prediction (CELP) speech decoder, and the CELP decoder time-warping operation is applied to a reconstructed excitation signal u(k) to derive a time-warped reconstructed signal u_w(k). The time-warped reconstructed signal u_w(k) is input to a Linear Predictor (LP) synthesis filter to derive a CELP decoder time-warped output signal ŷ_w(k).
    Type: Grant
    Filed: February 11, 2003
    Date of Patent: July 1, 2008
    Assignee: Nokia Corporation
    Inventors: Ari Heikkinen, Ari Lakaniemi