Patents Examined by Pierre-Louis Desir
  • Patent number: 9570093
    Abstract: In accordance with an embodiment of the present invention, a method for speech processing includes determining an unvoicing/voicing parameter reflecting a characteristic of unvoiced/voiced speech in a current frame of a speech signal comprising a plurality of frames. A smoothed unvoicing/voicing parameter is determined to include information of the unvoicing/voicing parameter in a frame prior to the current frame of the speech signal. A difference between the unvoicing/voicing parameter and the smoothed unvoicing/voicing parameter is computed. The method further includes generating an unvoiced/voiced decision point for determining whether the current frame comprises unvoiced speech or voiced speech using the computed difference as a decision parameter.
    Type: Grant
    Filed: September 3, 2014
    Date of Patent: February 14, 2017
    Assignee: Huawei Technologies Co., Ltd.
    Inventor: Yang Gao
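    Example (illustrative, not from the patent): a minimal Python sketch of the decision rule, assuming the unvoicing/voicing parameter is a normalized autocorrelation peak, the smoothing is a first-order recursive average, and the threshold value is arbitrary.

      import numpy as np

      def voicing_parameter(frame):
          # Crude voicing measure: peak normalized autocorrelation over pitch-like lags
          # (an assumed stand-in for the patent's unvoicing/voicing parameter).
          frame = frame - np.mean(frame)
          energy = np.dot(frame, frame) + 1e-12
          lags = range(20, 160)  # roughly 50-400 Hz at 8 kHz sampling, an assumption
          return max(np.dot(frame[:-lag], frame[lag:]) for lag in lags) / energy

      def unvoiced_voiced_decisions(frames, alpha=0.9, threshold=0.2):
          # Compare each frame's parameter with its smoothed history; the difference
          # is the decision parameter described in the abstract.
          smoothed, decisions = None, []
          for frame in frames:
              p = voicing_parameter(frame)
              smoothed = p if smoothed is None else alpha * smoothed + (1 - alpha) * p
              decisions.append("voiced" if p - smoothed > threshold else "unvoiced")
          return decisions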
  • Patent number: 9570068
    Abstract: Embodiments of the present invention provide an approach for estimating the accuracy of a transcription of a voice recording. Specifically, in a typical embodiment, each word of a transcription of a voice recording is checked against a customer-specific dictionary and/or a common language dictionary. The number of words not found in either dictionary is determined. An accuracy number for the transcription is calculated from the number of words not found and the total number of words in the transcription.
    Type: Grant
    Filed: June 3, 2016
    Date of Patent: February 14, 2017
    Assignee: International Business Machines Corporation
    Inventors: James E. Bostick, John M. Ganci, Jr., John P. Kaemmerer, Craig M. Trim
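    Example (illustrative): a short Python sketch of the accuracy estimate, assuming the accuracy number is simply one minus the fraction of words found in neither dictionary; the patent's exact formula may differ.

      def transcription_accuracy(transcript, customer_dictionary, common_dictionary):
          # Count words found in neither dictionary and turn the count into an accuracy number.
          words = transcript.lower().split()
          if not words:
              return 0.0
          not_found = sum(1 for w in words
                          if w not in customer_dictionary and w not in common_dictionary)
          return 1.0 - not_found / len(words)

      # Hypothetical usage:
      # transcription_accuracy("please reset my acme router",
      #                        customer_dictionary={"acme"},
      #                        common_dictionary={"please", "reset", "my", "router"})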
  • Patent number: 9564127
    Abstract: The present invention relates to a speech recognition method and system based on user personalized information. The method comprises the following steps: receiving a speech signal; decoding the speech signal according to a basic static decoding network to obtain a decoding path on each active node in the basic static decoding network, wherein the basic static decoding network is a decoding network associated with a basic name language model; if a decoding path enters a name node in the basic static decoding network, network extension is carried out on the name node according to an affiliated static decoding network of a user, wherein the affiliated static decoding network is a decoding network associated with a name language model of a particular user; and returning a recognition result after the decoding is completed. Using the present invention, the recognition accuracy for user personalized information in continuous speech recognition can be improved.
    Type: Grant
    Filed: December 20, 2013
    Date of Patent: February 7, 2017
    Assignee: iFLYTEK Co., Ltd.
    Inventors: Qinghua Pan, Tingting He, Guoping Hu, Yu Hu, Qingfeng Liu
  • Patent number: 9563619
    Abstract: On a text display screen displayed on a touch panel color display unit, after a plurality of desired words are specified by a touch operation and it is detected that the touched points are moved downward, an example sentence including each of the specified words is searched for in dictionary data corresponding to the character type of each of the words, and displayed. When it is detected that the touched points are moved upward, a phrase including each of the specified words is searched for in the dictionary data corresponding to the character type of each of the words, and displayed.
    Type: Grant
    Filed: December 17, 2013
    Date of Patent: February 7, 2017
    Assignee: CASIO COMPUTER CO., LTD.
    Inventor: Kazuhisa Nakamura
  • Patent number: 9564149
    Abstract: Provided is a method for user communications with an information dialog system, which may be used for organizing user interactions with the information dialog system based on a natural language. The method may include activating a user input subsystem in response to a user entering a request; receiving and converting the request of the user into text by the user input subsystem; sending the text obtained as a result of the conversion of the request to a dialog module; processing, by the dialog module, the text; forming, by the dialog module, the response to the request; sending the response to the user; and displaying and/or reproducing the formed response, where, after the displaying and/or the reproducing of the formed response, the user input subsystem is automatically activated upon entering a further request or a clarification request by the user.
    Type: Grant
    Filed: May 26, 2015
    Date of Patent: February 7, 2017
    Assignee: Google Inc.
    Inventors: Ilya Genadevich Gelfenbeyn, Artem Goncharuk, Ilya Andreevich Platonov, Olga Aleksandrovna Gelfenbeyn, Pavel Aleksandrovich Sirotin
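    Example (illustrative): a Python sketch of the interaction cycle only; the four callables are hypothetical placeholders, not interfaces defined by the patent.

      def dialog_session(recognize_request, form_response, present, session_active):
          # After every displayed/reproduced response, the input subsystem is activated
          # again so the user can immediately enter a further or clarifying request.
          while session_active():
              text = recognize_request()      # activate input subsystem, convert request to text
              response = form_response(text)  # dialog module processes the text and forms a response
              present(response)               # display and/or reproduce the response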
  • Patent number: 9564147
    Abstract: An audio communication system includes a generation unit that superimposes an addition sound having a volume level determined on the basis of a voice acquired by a voice acquisition unit on an input voice acquired by the voice acquisition unit of a transmission terminal and generates a synthesis sound and a transmission unit that transmits a signal of the synthesis sound generated by the generation unit to a reception terminal.
    Type: Grant
    Filed: April 30, 2013
    Date of Patent: February 7, 2017
    Assignee: Rakuten, Inc.
    Inventor: Hisanori Yamahara
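    Example (illustrative): a numpy sketch that scales the addition sound by a fixed fraction of the acquired voice level; the fixed-fraction rule stands in for the patent's volume determination and is an assumption.

      import numpy as np

      def synthesis_sound(voice, addition_sound, relative_level=0.1):
          # Superimpose an addition sound whose volume tracks the acquired voice level.
          n = min(len(voice), len(addition_sound))
          voice, addition = voice[:n], addition_sound[:n]
          voice_rms = np.sqrt(np.mean(voice ** 2)) + 1e-12
          addition_rms = np.sqrt(np.mean(addition ** 2)) + 1e-12
          gain = relative_level * voice_rms / addition_rms
          return voice + gain * addition  # synthesis sound sent to the reception terminal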
  • Patent number: 9558744
    Abstract: An audio processing apparatus and an audio processing method are described. In one embodiment, the audio processing apparatus includes an audio masker separator for separating from a first audio signal, as an audio masker candidate, audio material comprising a sound other than stationary noise and semantically meaningful utterance. The apparatus also includes a first context analyzer for obtaining statistics regarding contextual information of detected audio masker candidates, and a masker library builder for building a masker library or updating an existing masker library by adding, based on the statistics, at least one audio masker candidate as an audio masker into the masker library, wherein the audio maskers in the masker library are to be inserted into a target position in a second audio signal to conceal defects in the second audio signal.
    Type: Grant
    Filed: November 27, 2013
    Date of Patent: January 31, 2017
    Assignee: Dolby Laboratories Licensing Corporation
    Inventors: Xuejing Sun, Shen Huang, Poppy Crum, Hannes Muesch, Glenn N. Dickins, Michael Eckert
  • Patent number: 9558743
    Abstract: In one implementation, a computer-implemented method includes receiving, at a computer system, a request to predict a next word in a dialog being uttered by a speaker; accessing, by the computer system, a neural network comprising i) an input layer, ii) one or more hidden layers, and iii) an output layer; identifying the local context for the dialog of the speaker; selecting, by the computer system and using a semantic model, at least one vector that represents the semantic context for the dialog; applying input to the input layer of the neural network, the input comprising i) the local context of the dialog and ii) the values for the at least one vector; generating probability values for at least a portion of the candidate words; and providing, by the computer system and based on the probability values, information that identifies one or more of the candidate words.
    Type: Grant
    Filed: April 16, 2013
    Date of Patent: January 31, 2017
    Assignee: Google Inc.
    Inventors: Noah B. Coccaro, Patrick An Nguyen
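    Example (illustrative): a numpy sketch of the forward pass, assuming the local context is represented by averaged word embeddings concatenated with the semantic-context vector; the layer sizes, training, and embedding scheme are assumptions, not the patent's specification.

      import numpy as np

      def next_word_probabilities(context_ids, semantic_vector, embeddings, W_hidden, W_out, vocab):
          local = embeddings[context_ids].mean(axis=0)    # local context of the dialog
          x = np.concatenate([local, semantic_vector])    # input layer
          h = np.tanh(x @ W_hidden)                       # hidden layer
          logits = h @ W_out                              # output layer over candidate words
          probs = np.exp(logits - logits.max())
          probs /= probs.sum()
          ranked = np.argsort(probs)[::-1]
          return [(vocab[i], float(probs[i])) for i in ranked[:5]]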
  • Patent number: 9558755
    Abstract: Noise suppression information is used to optimize or improve automatic speech recognition performed for a signal. Noise suppression can be performed on a noisy speech signal using a gain value. The gain to apply to the noisy speech signal is selected to optimize speech recognition analysis of the resulting signal. The gain may be selected based on one or more features for a current sub band and time frame, as well as one or more features for other sub bands and/or time frames. Noise suppression information can be provided to a speech recognition module to improve the robustness of the speech recognition analysis. Noise suppression information can also be used to encode and identify speech.
    Type: Grant
    Filed: December 7, 2010
    Date of Patent: January 31, 2017
    Assignee: Knowles Electronics, LLC
    Inventors: Jean Laroche, Carlo Murgia
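    Example (illustrative): a numpy sketch using a Wiener-style per-sub-band gain with a floor; the patent selects gains specifically to optimize recognition, so this simple SNR rule is only a stand-in.

      import numpy as np

      def suppression_gains(noisy_power, noise_power, min_gain=0.1):
          # Per sub-band, per time-frame gains from the estimated signal-to-noise ratio.
          snr = np.maximum(noisy_power - noise_power, 0.0) / (noise_power + 1e-12)
          gain = snr / (1.0 + snr)
          return np.maximum(gain, min_gain)  # the floor preserves residual noise information

      # The gains can be applied to the noisy sub-band signal and also passed to the
      # speech recognition module as noise suppression information.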
  • Patent number: 9552213
    Abstract: In one aspect, there is provided a system having a processor and a data storage device coupled to the processor. The data storage device stores instructions executable by the processor to receive a software module, the software module having an interface adapted to display a plurality of first graphemes in a first language, provide at least one look-up table having at least some of the first graphemes and a plurality of second graphemes in a second language associated therewith, said association being based on a phonetic similarity between the first and second graphemes when the first graphemes are vocalized in the first language and the second graphemes are vocalized in the second language, and replace at least one of the first graphemes in the interface with the associated second graphemes such that the interface is adapted to display the second graphemes in the second language, the second graphemes being understandable in the first language when the second graphemes are vocalized.
    Type: Grant
    Filed: May 16, 2011
    Date of Patent: January 24, 2017
    Assignee: D2L Corporation
    Inventors: Dariusz Grabka, Ali Ghassemi
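    Example (illustrative): a Python sketch of the look-up-table replacement, using a tiny hypothetical Latin-to-Cyrillic table built on phonetic similarity; longest-match-first replacement is an assumption.

      def replace_graphemes(text, lookup_table):
          out, i = [], 0
          keys = sorted(lookup_table, key=len, reverse=True)   # try longer graphemes first
          while i < len(text):
              for key in keys:
                  if text.startswith(key, i):
                      out.append(lookup_table[key])
                      i += len(key)
                      break
              else:
                  out.append(text[i])  # no association: keep the first-language grapheme
                  i += 1
          return "".join(out)

      # replace_graphemes("start", {"s": "с", "t": "т", "a": "а", "r": "р"})  ->  "старт"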
  • Patent number: 9548056
    Abstract: The present document relates to coding. In particular, the present document relates to coding using linear prediction in combination with entropy encoding. A method (600) for determining a general prediction filter for a frame of an input signal (111) is described. The z-transform of the general prediction filter comprises an approximation to the z-transform of a finite impulse response (FIR) filter with the z variable of the FIR filter being replaced by the z-transform of an allpass filter. The FIR filter comprises a plurality of FIR coefficients (412). The allpass filter exhibits a pole defined by an adjustable pole parameter. The method (600) comprises determining the pole parameter and the plurality of FIR coefficients such that an entropy of a frame of a prediction error signal (414), which is derived from the frame of the input signal (111) using the general prediction filter defined by the pole parameter and the plurality of FIR coefficients (412), is reduced.
    Type: Grant
    Filed: December 19, 2013
    Date of Patent: January 17, 2017
    Assignee: Dolby International AB
    Inventor: Arijit Biswas
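    Formalization (illustrative): one way to write the general prediction filter, assuming a first-order allpass section; the patent only requires an allpass filter with an adjustable pole parameter, so the exact form below is an assumption.

      % FIR coefficients c_k with the delay variable replaced by an allpass section A(z)
      \[
        A(z) = \frac{z^{-1} - a}{1 - a\,z^{-1}}, \qquad
        P(z) = \sum_{k=1}^{K} c_k \,\bigl[A(z)\bigr]^{k},
      \]
      \[
        (a^{*}, c_1^{*}, \dots, c_K^{*})
          = \arg\min_{a,\,c_1,\dots,c_K} H\!\bigl(e[n]\bigr),
        \qquad e[n] = x[n] - \bigl(P(z)\,x\bigr)[n],
      \]
      % where x[n] is the frame of the input signal, e[n] the prediction error signal,
      % and H(.) an entropy estimate computed over the frame.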
  • Patent number: 9547644
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for presenting additional information for text depicted by an image. In one aspect, a method includes receiving an image. Text depicted in the image is identified. The identified text can be in one or more text blocks. A prominence presentation context is selected for the image based on the relative prominence of the one or more text blocks. Each prominence presentation context corresponds to a relative prominence of each text block in which text is presented within images. Each prominence presentation context has a corresponding user interface for presenting additional information related to the identified text depicted in the image. A user interface is identified that corresponds to the selected prominence presentation context. Additional information is presented for at least a portion of the text depicted in the image using the identified user interface.
    Type: Grant
    Filed: November 8, 2013
    Date of Patent: January 17, 2017
    Assignee: Google Inc.
    Inventors: Alexander J. Cuthbert, Joshua J. Estelle
  • Patent number: 9542937
    Abstract: A sound processing device includes a noise suppression unit configured to suppress a noise component included in an input sound signal, an auxiliary noise addition unit configured to add auxiliary noise to the input sound signal, whose noise component has been suppressed by the noise suppression unit, to generate an auxiliary noise-added signal, a distortion calculation unit configured to calculate a degree of distortion of the auxiliary noise-added signal, and a control unit configured to control an addition amount by which the auxiliary noise addition unit adds the auxiliary noise based on the degree of distortion calculated by the distortion calculation unit.
    Type: Grant
    Filed: January 7, 2014
    Date of Patent: January 10, 2017
    Assignee: HONDA MOTOR CO., LTD.
    Inventors: Kazuhiro Nakadai, Keisuke Nakamura, Daisuke Kimoto
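    Example (illustrative): a numpy sketch in which the degree of distortion is a log-spectral distance and the addition amount is nudged up or down toward a target distortion; both choices are assumptions, not the patent's definitions.

      import numpy as np

      def add_auxiliary_noise(suppressed, aux_noise, target_distortion=1.0,
                              amount=0.05, step=0.01, iterations=20):
          def distortion(a, b):
              # Log-spectral distance between the noise-added and the suppressed signal.
              A = np.abs(np.fft.rfft(a)) + 1e-12
              B = np.abs(np.fft.rfft(b)) + 1e-12
              return float(np.sqrt(np.mean((20.0 * np.log10(A / B)) ** 2)))

          noise = aux_noise[:len(suppressed)]
          for _ in range(iterations):
              d = distortion(suppressed + amount * noise, suppressed)
              if d > target_distortion:
                  amount = max(amount - step, 0.0)  # too distorted: add less auxiliary noise
              else:
                  amount += step                    # distortion acceptable: amount can grow
          return suppressed + amount * noise        # auxiliary noise-added signal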
  • Patent number: 9536523
    Abstract: A system for distinguishing and identifying speech segments originating from speech of one or more relevant speakers in a predefined detection area. The system includes an optical system which outputs optical patterns, each representing audio signals as detected by the optical system in the area within a specific time frame; and a computer processor which receives each of the outputted optical patterns and analyzes each respective optical pattern to identify speech segments by locating blank spaces in the optical pattern, which define the beginning or ending of each respective speech segment.
    Type: Grant
    Filed: June 21, 2012
    Date of Patent: January 3, 2017
    Assignee: VOCALZOOM SYSTEMS LTD.
    Inventors: Tal Bakish, Gavriel Horowitz, Yekutiel Avargel, Yechiel Kurtz
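    Example (illustrative): a numpy sketch that treats runs of low-magnitude samples in a one-dimensional optical pattern as the blank spaces; the threshold and minimum gap length are assumptions.

      import numpy as np

      def speech_segments(optical_pattern, blank_threshold=0.05, min_blank=10):
          active = np.abs(optical_pattern) >= blank_threshold
          segments, start, blank_run = [], None, 0
          for i, is_active in enumerate(active):
              if is_active:
                  if start is None:
                      start = i         # a speech segment begins
                  blank_run = 0
              else:
                  blank_run += 1
                  if start is not None and blank_run >= min_blank:
                      segments.append((start, i - blank_run + 1))  # segment ends where the blank began
                      start = None
          if start is not None:
              segments.append((start, len(active)))
          return segments               # (begin, end) sample indices of each speech segment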
  • Patent number: 9536519
    Abstract: Methods and apparatus to generate a speech recognition library for use by a speech recognition system are disclosed. An example method comprises identifying a plurality of video segments having closed caption data corresponding to a phrase, the plurality of video segments associated with respective ones of a plurality of audio data segments, computing a plurality of difference metrics between a baseline audio data segment associated with the phrase and respective ones of the plurality of audio data segments, selecting a set of the plurality of audio data segments based on the plurality of difference metrics, identifying a first one of the audio data segments in the set as a representative audio data segment, determining a first phonetic transcription of the representative audio data segment, and adding the first phonetic transcription to a speech recognition library when the first phonetic transcription differs from a second phonetic transcription associated with the phrase in the speech recognition library.
    Type: Grant
    Filed: October 29, 2015
    Date of Patent: January 3, 2017
    Assignee: AT&T INTELLECTUAL PROPERTY I, L.P.
    Inventor: Hisao Chang
  • Patent number: 9536567
    Abstract: In an aspect, in general, a method for aligning an audio recording and a transcript includes receiving a transcript including a plurality of terms, each term of the plurality of terms associated with a time location within a different version of the audio recording, forming a plurality of search terms from the terms of the transcript, determining possible time locations of the search terms in the audio recording, determining a correspondence between time locations within the different version of the audio recording associated with the search terms and the possible time locations of the search terms in the audio recording, and aligning the audio recording and the transcript including updating the time location associated with terms of the transcript based on the determined correspondence.
    Type: Grant
    Filed: September 4, 2012
    Date of Patent: January 3, 2017
    Assignee: NEXIDIA INC.
    Inventors: Jacob B. Garland, Drew Lanham, Daryl Kip Watters, Marsal Gavalda, Mark Finlay, Kenneth K. Griggs
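    Example (illustrative): a numpy sketch that fits a simple linear (scale plus offset) correspondence between the two sets of time locations and uses it to update the transcript; the patent does not prescribe this particular correspondence model.

      import numpy as np

      def fit_time_correspondence(times_in_other_version, times_found_in_recording):
          # Least-squares fit of recording_time = scale * other_version_time + offset.
          scale, offset = np.polyfit(np.asarray(times_in_other_version, dtype=float),
                                     np.asarray(times_found_in_recording, dtype=float), 1)
          return scale, offset

      def updated_time(original_time, scale, offset):
          # Map a transcript term's original time location into the audio recording.
          return scale * original_time + offset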
  • Patent number: 9535895
    Abstract: Techniques are described for predicting the language of a text excerpt. The language prediction is accomplished by comparing n-grams of the text excerpt with n-grams of different language references. A probability is calculated for each n-gram of the text excerpt with respect to each of the language references. The calculated probabilities corresponding to a single language are then averaged to yield an overall probability corresponding to that language, and the resulting overall probabilities are compared to find the most likely language of the sample text.
    Type: Grant
    Filed: March 17, 2011
    Date of Patent: January 3, 2017
    Assignee: Amazon Technologies, Inc.
    Inventor: Eugene Gershnik
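    Example (illustrative): a Python sketch using character trigrams and a small probability floor for unseen n-grams; both choices are assumptions.

      from collections import Counter

      def char_ngrams(text, n=3):
          text = " " + text.lower() + " "
          return [text[i:i + n] for i in range(len(text) - n + 1)]

      def language_reference(reference_text, n=3):
          counts = Counter(char_ngrams(reference_text, n))
          total = sum(counts.values())
          return {gram: count / total for gram, count in counts.items()}

      def most_likely_language(sample_text, references, n=3, floor=1e-6):
          # Average the per-n-gram probabilities under each language reference and
          # return the language with the highest overall probability.
          grams = char_ngrams(sample_text, n)
          if not grams:
              return None
          scores = {language: sum(ref.get(g, floor) for g in grams) / len(grams)
                    for language, ref in references.items()}
          return max(scores, key=scores.get)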
  • Patent number: 9530417
    Abstract: Methods and systems of text independent speaker recognition provide complexity comparable to that of a text dependent speaker recognition system. These methods and systems exploit the fact that speech is a quasi-stationary signal and simplify the recognition process on this basis. The speaker modeling allows a speaker profile to be updated progressively with new speech samples that are acquired during usage over time by the speaker.
    Type: Grant
    Filed: April 1, 2013
    Date of Patent: December 27, 2016
    Assignee: STMicroelectronics Asia Pacific Pte Ltd.
    Inventors: Evelyn Kurniawati, Sapna George
  • Patent number: 9530406
    Abstract: An apparatus and a method for recognizing a voice include a plurality of array microphones configured to have at least one microphone, and a seat controller configured to check a position of a seat provided in a vehicle. A microphone controller is configured to set a beam forming region based on the checked position of the seat and to control an array microphone so as to obtain sound source data from the set beam forming region.
    Type: Grant
    Filed: April 3, 2014
    Date of Patent: December 27, 2016
    Assignee: HYUNDAI MOTOR COMPANY
    Inventor: Seok Min Oh
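    Example (illustrative): a numpy sketch that turns the checked seat position into delay-and-sum steering delays for the array microphone; the free-field geometry and the coordinate units are assumptions.

      import numpy as np

      SPEED_OF_SOUND = 343.0  # metres per second

      def steering_delays(mic_positions_m, seat_position_m):
          # Relative per-microphone delays that point the beam forming region at the seat.
          mics = np.asarray(mic_positions_m, dtype=float)
          seat = np.asarray(seat_position_m, dtype=float)
          distances = np.linalg.norm(mics - seat, axis=1)
          delays = distances / SPEED_OF_SOUND
          return delays - delays.min()

      # Hypothetical usage after the seat controller reports a new seat position:
      # steering_delays([[0.00, 0.0], [0.05, 0.0]], [0.40, 1.20])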
  • Patent number: 9529796
    Abstract: A method for translation using a translation tree structure in a portable terminal includes inputting, by a user speaking in a first language, one or more first language words to a first portable terminal; translating the inputted first language words into one or more second language words using a database of the first portable terminal; displaying the translated second language words according to the translation tree structure; selecting one of the second language words displayed in the translation tree structure; and transmitting the selected second language words to a server and a second portable terminal which uses the second language.
    Type: Grant
    Filed: August 31, 2012
    Date of Patent: December 27, 2016
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hyung-Sun Ryu, Kil-Su Eo, Young-Cheol Kang, Byeong-Yong Jeon