Patents Examined by Pierre-Louis Desir
  • Patent number: 9570093
    Abstract: In accordance with an embodiment of the present invention, a method for speech processing includes determining an unvoicing/voicing parameter reflecting a characteristic of unvoiced/voiced speech in a current frame of a speech signal comprising a plurality of frames. A smoothed unvoicing/voicing parameter is determined to include information of the unvoicing/voicing parameter in a frame prior to the current frame of the speech signal. A difference between the unvoicing/voicing parameter and the smoothed unvoicing/voicing parameter is computed. The method further includes generating an unvoiced/voiced decision point for determining whether the current frame comprises unvoiced speech or voiced speech using the computed difference as a decision parameter.
    Type: Grant
    Filed: September 3, 2014
    Date of Patent: February 14, 2017
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Yang Gao
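    Example (illustrative, not from the patent): a minimal Python sketch of the decision rule, assuming the unvoicing/voicing parameter is a normalized autocorrelation peak, the smoothing is a first-order recursive average, and the threshold value is arbitrary.

      import numpy as np

      def voicing_parameter(frame):
          # Crude voicing measure: peak normalized autocorrelation over pitch-like lags
          # (an assumed stand-in for the patent's unvoicing/voicing parameter).
          frame = frame - np.mean(frame)
          energy = np.dot(frame, frame) + 1e-12
          lags = range(20, 160)  # roughly 50-400 Hz at 8 kHz sampling, an assumption
          return max(np.dot(frame[:-lag], frame[lag:]) for lag in lags) / energy

      def unvoiced_voiced_decisions(frames, alpha=0.9, threshold=0.2):
          # Compare each frame's parameter with its smoothed history; the difference
          # is the decision parameter described in the abstract.
          smoothed, decisions = None, []
          for frame in frames:
              p = voicing_parameter(frame)
              smoothed = p if smoothed is None else alpha * smoothed + (1 - alpha) * p
              decisions.append("voiced" if p - smoothed > threshold else "unvoiced")
          return decisions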
  • Patent number: 9570068
    Abstract: Embodiments of the present invention provide an approach for estimating the accuracy of a transcription of a voice recording. Specifically, in a typical embodiment, each word of a transcription of a voice recording is checked against a customer-specific dictionary and/or a common language dictionary. The number of words not found in either dictionary is determined. An accuracy number for the transcription is calculated from the number of words not found and the total number of words in the transcription.
    Type: Grant
    Filed: June 3, 2016
    Date of Patent: February 14, 2017
    Assignee: International Business Machines Corporation
    Inventors: James E. Bostick, John M. Ganci, Jr., John P. Kaemmerer, Craig M. Trim
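    Example (illustrative): a short Python sketch of the accuracy estimate, assuming the accuracy number is simply one minus the fraction of words found in neither dictionary; the patent's exact formula may differ.

      def transcription_accuracy(transcript, customer_dictionary, common_dictionary):
          # Count words found in neither dictionary and turn the count into an accuracy number.
          words = transcript.lower().split()
          if not words:
              return 0.0
          not_found = sum(1 for w in words
                          if w not in customer_dictionary and w not in common_dictionary)
          return 1.0 - not_found / len(words)

      # Hypothetical usage:
      # transcription_accuracy("please reset my acme router",
      #                        customer_dictionary={"acme"},
      #                        common_dictionary={"please", "reset", "my", "router"})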
  • Patent number: 9564127
    Abstract: The present invention relates to a speech recognition method and system based on user personalized information. The method comprises the following steps: receiving a speech signal; decoding the speech signal according to a basic static decoding network to obtain a decoding path on each active node in the basic static decoding network, wherein the basic static decoding network is a decoding network associated with a basic name language model; if a decoding path enters a name node in the basic static decoding network, network extension is carried out on the name node according to an affiliated static decoding network of a user, wherein the affiliated static decoding network is a decoding network associated with a name language model of a particular user; and returning a recognition result after the decoding is completed. Using the present invention, the recognition accuracy for user personalized information in continuous speech recognition can be improved.
    Type: Grant
    Filed: December 20, 2013
    Date of Patent: February 7, 2017
    Assignee: iFLYTEK Co., Ltd.
    Inventors: Qinghua Pan, Tingting He, Guoping Hu, Yu Hu, Qingfeng Liu
  • Patent number: 9563619
    Abstract: On a text display screen displayed on a touch panel color display unit, after a plurality of desired words are specified by a touch operation and it is detected that the touched points are moved downward, an example sentence including each of the specified words is searched for in dictionary data corresponding to the character type of each of the words, and displayed. When it is detected that the touched points are moved upward, a phrase including each of the specified words is searched for in the dictionary data corresponding to the character type of each of the words, and displayed.
    Type: Grant
    Filed: December 17, 2013
    Date of Patent: February 7, 2017
    Assignee: CASIO COMPUTER CO., LTD.
    Inventor: Kazuhisa Nakamura
  • Patent number: 9564149
    Abstract: Provided is a method for user communications with an information dialog system, which may be used for organizing user interactions with the information dialog system based on a natural language. The method may include activating a user input subsystem in response to a user entering a request; receiving and converting the request of the user into text by the user input subsystem; sending the text obtained as a result of the conversion of the request to a dialog module; processing, by the dialog module, the text; forming, by the dialog module, the response to the request; sending the response to the user; and displaying and/or reproducing the formed response, where, after the displaying and/or the reproducing of the formed response, the user input subsystem is automatically activated upon entering a further request or a clarification request by the user.
    Type: Grant
    Filed: May 26, 2015
    Date of Patent: February 7, 2017
    Assignee: Google Inc.
    Inventors: Ilya Genadevich Gelfenbeyn, Artem Goncharuk, Ilya Andreevich Platonov, Olga Aleksandrovna Gelfenbeyn, Pavel Aleksandrovich Sirotin
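    Example (illustrative): a Python sketch of the interaction cycle only; the four callables are hypothetical placeholders, not interfaces defined by the patent.

      def dialog_session(recognize_request, form_response, present, session_active):
          # After every displayed/reproduced response, the input subsystem is activated
          # again so the user can immediately enter a further or clarifying request.
          while session_active():
              text = recognize_request()      # activate input subsystem, convert request to text
              response = form_response(text)  # dialog module processes the text and forms a response
              present(response)               # display and/or reproduce the response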
  • Patent number: 9564147
    Abstract: An audio communication system includes a generation unit that superimposes an addition sound having a volume level determined on the basis of a voice acquired by a voice acquisition unit on an input voice acquired by the voice acquisition unit of a transmission terminal and generates a synthesis sound and a transmission unit that transmits a signal of the synthesis sound generated by the generation unit to a reception terminal.
    Type: Grant
    Filed: April 30, 2013
    Date of Patent: February 7, 2017
    Assignee: Rakuten, Inc.
    Inventor: Hisanori Yamahara
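    Example (illustrative): a numpy sketch that scales the addition sound by a fixed fraction of the acquired voice level; the fixed-fraction rule stands in for the patent's volume determination and is an assumption.

      import numpy as np

      def synthesis_sound(voice, addition_sound, relative_level=0.1):
          # Superimpose an addition sound whose volume tracks the acquired voice level.
          n = min(len(voice), len(addition_sound))
          voice, addition = voice[:n], addition_sound[:n]
          voice_rms = np.sqrt(np.mean(voice ** 2)) + 1e-12
          addition_rms = np.sqrt(np.mean(addition ** 2)) + 1e-12
          gain = relative_level * voice_rms / addition_rms
          return voice + gain * addition  # synthesis sound sent to the reception terminal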
  • Patent number: 9558744
    Abstract: An audio processing apparatus and an audio processing method are described. In one embodiment, the audio processing apparatus includes an audio masker separator for separating from a first audio signal, as an audio masker candidate, audio material comprising a sound other than stationary noise and semantically meaningful utterance. The apparatus also includes a first context analyzer for obtaining statistics regarding contextual information of detected audio masker candidates, and a masker library builder for building a masker library or updating an existing masker library by adding, based on the statistics, at least one audio masker candidate as an audio masker into the masker library, wherein the audio maskers in the masker library are to be inserted into a target position in a second audio signal to conceal defects in the second audio signal.
    Type: Grant
    Filed: November 27, 2013
    Date of Patent: January 31, 2017
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Xuejing Sun, Shen Huang, Poppy Crum, Hannes Muesch, Glenn N. Dickins, Michael Eckert
  • Patent number: 9558743
    Abstract: In one implementation, a computer-implemented method includes receiving, at a computer system, a request to predict a next word in a dialog being uttered by a speaker; accessing, by the computer system, a neural network comprising i) an input layer, ii) one or more hidden layers, and iii) an output layer; identifying the local context for the dialog of the speaker; selecting, by the computer system and using a semantic model, at least one vector that represents the semantic context for the dialog; applying input to the input layer of the neural network, the input comprising i) the local context of the dialog and ii) the values for the at least one vector; generating probability values for at least a portion of the candidate words; and providing, by the computer system and based on the probability values, information that identifies one or more of the candidate words.
    Type: Grant
    Filed: April 16, 2013
    Date of Patent: January 31, 2017
    Assignee: Google Inc.
    Inventors: Noah B. Coccaro, Patrick An Nguyen
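    Example (illustrative): a numpy sketch of the forward pass, assuming the local context is represented by averaged word embeddings concatenated with the semantic-context vector; the layer sizes, training, and embedding scheme are assumptions, not the patent's specification.

      import numpy as np

      def next_word_probabilities(context_ids, semantic_vector, embeddings, W_hidden, W_out, vocab):
          local = embeddings[context_ids].mean(axis=0)    # local context of the dialog
          x = np.concatenate([local, semantic_vector])    # input layer
          h = np.tanh(x @ W_hidden)                       # hidden layer
          logits = h @ W_out                              # output layer over candidate words
          probs = np.exp(logits - logits.max())
          probs /= probs.sum()
          ranked = np.argsort(probs)[::-1]
          return [(vocab[i], float(probs[i])) for i in ranked[:5]]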
  • Patent number: 9558755
    Abstract: Noise suppression information is used to optimize or improve automatic speech recognition performed for a signal. Noise suppression can be performed on a noisy speech signal using a gain value. The gain to apply to the noisy speech signal is selected to optimize speech recognition analysis of the resulting signal. The gain may be selected based on one or more features for a current sub band and time frame, as well as one or more features for other sub bands and/or time frames. Noise suppression information can be provided to a speech recognition module to improve the robustness of the speech recognition analysis. Noise suppression information can also be used to encode and identify speech.
    Type: Grant
    Filed: December 7, 2010
    Date of Patent: January 31, 2017
    Assignee: Knowles Electronics, LLC
    Inventors: Jean Laroche, Carlo Murgia
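    Example (illustrative): a numpy sketch using a Wiener-style per-sub-band gain with a floor; the patent selects gains specifically to optimize recognition, so this simple SNR rule is only a stand-in.

      import numpy as np

      def suppression_gains(noisy_power, noise_power, min_gain=0.1):
          # Per sub-band, per time-frame gains from the estimated signal-to-noise ratio.
          snr = np.maximum(noisy_power - noise_power, 0.0) / (noise_power + 1e-12)
          gain = snr / (1.0 + snr)
          return np.maximum(gain, min_gain)  # the floor preserves residual noise information

      # The gains can be applied to the noisy sub-band signal and also passed to the
      # speech recognition module as noise suppression information.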
  • Patent number: 9552213
    Abstract: In one aspect, there is provided a system having a processor and a data storage device coupled to the processor. The data storage device stores instructions executable by the processor to receive a software module, the software module having an interface adapted to display a plurality of first graphemes in a first language, provide at least one look-up table having at least some of the first graphemes and a plurality of second graphemes in a second language associated therewith, said association being based on a phonetic similarity between the first and second graphemes when the first graphemes are vocalized in the first language and the second graphemes are vocalized in the second language, and replace at least one of the first graphemes in the interface with the associated second graphemes such that the interface is adapted to display the second graphemes in the second language, the second graphemes being understandable in the first language when the second graphemes are vocalized.
    Type: Grant
    Filed: May 16, 2011
    Date of Patent: January 24, 2017
    Assignee: D2L Corporation
    Inventors: Dariusz Grabka, Ali Ghassemi
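    Example (illustrative): a Python sketch of the look-up-table replacement, using a tiny hypothetical Latin-to-Cyrillic table built on phonetic similarity; longest-match-first replacement is an assumption.

      def replace_graphemes(text, lookup_table):
          out, i = [], 0
          keys = sorted(lookup_table, key=len, reverse=True)   # try longer graphemes first
          while i < len(text):
              for key in keys:
                  if text.startswith(key, i):
                      out.append(lookup_table[key])
                      i += len(key)
                      break
              else:
                  out.append(text[i])  # no association: keep the first-language grapheme
                  i += 1
          return "".join(out)

      # replace_graphemes("start", {"s": "с", "t": "т", "a": "а", "r": "р"})  ->  "старт"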
  • Patent number: 9548056
    Abstract: The present document relates to coding. In particular, the present document relates to coding using linear prediction in combination with entropy encoding. A method (600) for determining a general prediction filter for a frame of an input signal (111) is described. The z-transform of the general prediction filter comprises an approximation to the z-transform of a finite impulse response (FIR) filter with the z variable of the FIR filter being replaced by the z-transform of an allpass filter. The FIR filter comprises a plurality of FIR coefficients (412). The allpass filter exhibits a pole defined by an adjustable pole parameter. The method (600) comprises determining the pole parameter and the plurality of FIR coefficients such that an entropy of a frame of a prediction error signal (414), which is derived from the frame of the input signal (111) using the general prediction filter defined by the pole parameter and the plurality of FIR coefficients (412), is reduced.
    Type: Grant
    Filed: December 19, 2013
    Date of Patent: January 17, 2017
    Assignee: Dolby International AB
    Inventor: Arijit Biswas
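    Formalization (illustrative): one way to write the general prediction filter, assuming a first-order allpass section; the patent only requires an allpass filter with an adjustable pole parameter, so the exact form below is an assumption.

      % FIR coefficients c_k with the delay variable replaced by an allpass section A(z)
      \[
        A(z) = \frac{z^{-1} - a}{1 - a\,z^{-1}}, \qquad
        P(z) = \sum_{k=1}^{K} c_k \,\bigl[A(z)\bigr]^{k},
      \]
      \[
        (a^{*}, c_1^{*}, \dots, c_K^{*})
          = \arg\min_{a,\,c_1,\dots,c_K} H\!\bigl(e[n]\bigr),
        \qquad e[n] = x[n] - \bigl(P(z)\,x\bigr)[n],
      \]
      % where x[n] is the frame of the input signal, e[n] the prediction error signal,
      % and H(.) an entropy estimate computed over the frame.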
  • Patent number: 9547644
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting additional information for text depicted by an image. In one aspect, a method includes receiving an image. Text depicted in the image is identified. The identified text can be in one or more text blocks. A prominence presentation context is selected for the image based on the relative prominence of the one or more text blocks. Each prominence presentation context corresponds to a relative prominence of each text block in which text is presented within images. Each prominence presentation context has a corresponding user interface for presenting additional information related to the identified text depicted in the image. A user interface is identified that corresponds to the selected prominence presentation context. Additional information is presented for at least a portion of the text depicted in the image using the identified user interface.
    Type: Grant
    Filed: November 8, 2013
    Date of Patent: January 17, 2017
    Assignee: Google Inc.
    Inventors: Alexander J. Cuthbert, Joshua J. Estelle
  • Patent number: 9542937
    Abstract: A sound processing device includes a noise suppression unit configured to suppress a noise component included in an input sound signal, an auxiliary noise addition unit configured to add auxiliary noise to the input sound signal, whose noise component has been suppressed by the noise suppression unit, to generate an auxiliary noise-added signal, a distortion calculation unit configured to calculate a degree of distortion of the auxiliary noise-added signal, and a control unit configured to control an addition amount by which the auxiliary noise addition unit adds the auxiliary noise based on the degree of distortion calculated by the distortion calculation unit.
    Type: Grant
    Filed: January 7, 2014
    Date of Patent: January 10, 2017
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Kazuhiro Nakadai, Keisuke Nakamura, Daisuke Kimoto
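    Example (illustrative): a numpy sketch in which the degree of distortion is a log-spectral distance and the addition amount is nudged up or down toward a target distortion; both choices are assumptions, not the patent's definitions.

      import numpy as np

      def add_auxiliary_noise(suppressed, aux_noise, target_distortion=1.0,
                              amount=0.05, step=0.01, iterations=20):
          def distortion(a, b):
              # Log-spectral distance between the noise-added and the suppressed signal.
              A = np.abs(np.fft.rfft(a)) + 1e-12
              B = np.abs(np.fft.rfft(b)) + 1e-12
              return float(np.sqrt(np.mean((20.0 * np.log10(A / B)) ** 2)))

          noise = aux_noise[:len(suppressed)]
          for _ in range(iterations):
              d = distortion(suppressed + amount * noise, suppressed)
              if d > target_distortion:
                  amount = max(amount - step, 0.0)  # too distorted: add less auxiliary noise
              else:
                  amount += step                    # distortion acceptable: amount can grow
          return suppressed + amount * noise        # auxiliary noise-added signal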
  • Patent number: 9536523
    Abstract: A system for distinguishing and identifying speech segments originating from speech of one or more relevant speakers in a predefined detection area. The system includes an optical system which outputs optical patterns, each representing audio signals as detected by the optical system in the area within a specific time frame; and a computer processor which receives each of the outputted optical patterns and analyzes each respective optical pattern to identify speech segments by locating blank spaces in the optical pattern, which define the beginning or ending of each respective speech segment.
    Type: Grant
    Filed: June 21, 2012
    Date of Patent: January 3, 2017
    Assignee: VOCALZOOM SYSTEMS LTD.
    Inventors: Tal Bakish, Gavriel Horowitz, Yekutiel Avargel, Yechiel Kurtz
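    Example (illustrative): a numpy sketch that treats runs of low-magnitude samples in a one-dimensional optical pattern as the blank spaces; the threshold and minimum gap length are assumptions.

      import numpy as np

      def speech_segments(optical_pattern, blank_threshold=0.05, min_blank=10):
          active = np.abs(optical_pattern) >= blank_threshold
          segments, start, blank_run = [], None, 0
          for i, is_active in enumerate(active):
              if is_active:
                  if start is None:
                      start = i         # a speech segment begins
                  blank_run = 0
              else:
                  blank_run += 1
                  if start is not None and blank_run >= min_blank:
                      segments.append((start, i - blank_run + 1))  # segment ends where the blank began
                      start = None
          if start is not None:
              segments.append((start, len(active)))
          return segments               # (begin, end) sample indices of each speech segment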
  • Patent number: 9536519
    Abstract: Methods and apparatus to generate a speech recognition library for use by a speech recognition system are disclosed. An example method comprises identifying a plurality of video segments having closed caption data corresponding to a phrase, the plurality of video segments associated with respective ones of a plurality of audio data segments, computing a plurality of difference metrics between a baseline audio data segment associated with the phrase and respective ones of the plurality of audio data segments, selecting a set of the plurality of audio data segments based on the plurality of difference metrics, identifying a first one of the audio data segments in the set as a representative audio data segment, determining a first phonetic transcription of the representative audio data segment, and adding the first phonetic transcription to a speech recognition library when the first phonetic transcription differs from a second phonetic transcription associated with the phrase in the speech recognition library.
    Type: Grant
    Filed: October 29, 2015
    Date of Patent: January 3, 2017
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventor: Hisao Chang
  • Patent number: 9536567
    Abstract: In an aspect, in general, a method for aligning an audio recording and a transcript includes receiving a transcript including a plurality of terms, each term of the plurality of terms associated with a time location within a different version of the audio recording, forming a plurality of search terms from the terms of the transcript, determining possible time locations of the search terms in the audio recording, determining a correspondence between time locations within the different version of the audio recording associated with the search terms and the possible time locations of the search terms in the audio recording, and aligning the audio recording and the transcript including updating the time location associated with terms of the transcript based on the determined correspondence.
    Type: Grant
    Filed: September 4, 2012
    Date of Patent: January 3, 2017
    Assignee: NEXIDIA INC.
    Inventors: Jacob B. Garland, Drew Lanham, Daryl Kip Watters, Marsal Gavalda, Mark Finlay, Kenneth K. Griggs
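    Example (illustrative): a numpy sketch that fits a simple linear (scale plus offset) correspondence between the two sets of time locations and uses it to update the transcript; the patent does not prescribe this particular correspondence model.

      import numpy as np

      def fit_time_correspondence(times_in_other_version, times_found_in_recording):
          # Least-squares fit of recording_time = scale * other_version_time + offset.
          scale, offset = np.polyfit(np.asarray(times_in_other_version, dtype=float),
                                     np.asarray(times_found_in_recording, dtype=float), 1)
          return scale, offset

      def updated_time(original_time, scale, offset):
          # Map a transcript term's original time location into the audio recording.
          return scale * original_time + offset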
  • Patent number: 9535895
    Abstract: Techniques are described for predicting the language of a text excerpt. The language prediction is accomplished by comparing n-grams of the text excerpt with n-grams of different language references. A probability is calculated for each n-gram of the text excerpt with respect to each of the language references. The calculated probabilities corresponding to a single language are then averaged to yield an overall probability corresponding to that language, and the resulting overall probabilities are compared to find the most likely language of the sample text.
    Type: Grant
    Filed: March 17, 2011
    Date of Patent: January 3, 2017
    Assignee: Amazon Technologies, Inc.
    Inventor: Eugene Gershnik
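    Example (illustrative): a Python sketch using character trigrams and a small probability floor for unseen n-grams; both choices are assumptions.

      from collections import Counter

      def char_ngrams(text, n=3):
          text = " " + text.lower() + " "
          return [text[i:i + n] for i in range(len(text) - n + 1)]

      def language_reference(reference_text, n=3):
          counts = Counter(char_ngrams(reference_text, n))
          total = sum(counts.values())
          return {gram: count / total for gram, count in counts.items()}

      def most_likely_language(sample_text, references, n=3, floor=1e-6):
          # Average the per-n-gram probabilities under each language reference and
          # return the language with the highest overall probability.
          grams = char_ngrams(sample_text, n)
          if not grams:
              return None
          scores = {language: sum(ref.get(g, floor) for g in grams) / len(grams)
                    for language, ref in references.items()}
          return max(scores, key=scores.get)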
  • Patent number: 9530417
    Abstract: Methods and systems of text independent speaker recognition provide complexity comparable to that of a text dependent speaker recognition system. These methods and systems exploit the fact that speech is a quasi-stationary signal and simplify the recognition process on this basis. The speaker modeling allows a speaker profile to be updated progressively with new speech samples that are acquired during usage over time by the speaker.
    Type: Grant
    Filed: April 1, 2013
    Date of Patent: December 27, 2016
    Assignee: STMicroelectronics Asia Pacific Pte Ltd.
    Inventors: Evelyn Kurniawati, Sapna George
  • Patent number: 9530406
    Abstract: An apparatus and a method for recognizing a voice include a plurality of array microphones configured to have at least one microphone, and a seat controller configured to check a position of a seat provided in a vehicle. A microphone controller is configured to set a beam forming region based on the checked position of the seat and to control an array microphone so as to obtain sound source data from the set beam forming region.
    Type: Grant
    Filed: April 3, 2014
    Date of Patent: December 27, 2016
    Assignee: HYUNDAI MOTOR COMPANY
    Inventor: Seok Min Oh
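    Example (illustrative): a numpy sketch that turns the checked seat position into delay-and-sum steering delays for the array microphone; the free-field geometry and the coordinate units are assumptions.

      import numpy as np

      SPEED_OF_SOUND = 343.0  # metres per second

      def steering_delays(mic_positions_m, seat_position_m):
          # Relative per-microphone delays that point the beam forming region at the seat.
          mics = np.asarray(mic_positions_m, dtype=float)
          seat = np.asarray(seat_position_m, dtype=float)
          distances = np.linalg.norm(mics - seat, axis=1)
          delays = distances / SPEED_OF_SOUND
          return delays - delays.min()

      # Hypothetical usage after the seat controller reports a new seat position:
      # steering_delays([[0.00, 0.0], [0.05, 0.0]], [0.40, 1.20])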
  • Patent number: 9529796
    Abstract: A method for translation using a translation tree structure in a portable terminal includes inputting, by a user speaking in a first language, one or more first language words to a first portable terminal; translating the inputted first language words into one or more second language words using a database of the first portable terminal; displaying the translated second language words according to the translation tree structure; selecting one of the second language words displayed in the translation tree structure; and transmitting the selected second language words to a server and a second portable terminal which uses the second language.
    Type: Grant
    Filed: August 31, 2012
    Date of Patent: December 27, 2016
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hyung-Sun Ryu, Kil-Su Eo, Young-Cheol Kang, Byeong-Yong Jeon