Patents Examined by James Wozniak
  • Patent number: 9934791
    Abstract: Provided are a method, non-transitory computer program product, and system for an improved noise suppression technique for speech enhancement. It operates on speech signals from a single source, such as the output of a single microphone or the reconstructed speech signal at the receiving end of a communication application. The system monitors the background noise of an incoming speech signal, determines its level, and performs a time-domain gain calculation. The noise-suppressed output signal is the gain-shaped original speech signal.
    Type: Grant
    Filed: September 27, 2016
    Date of Patent: April 3, 2018
    Assignee: QOSOUND, INC.
    Inventor: Huan-Yu Su
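The time-domain gain approach described in the abstract can be illustrated with a minimal sketch: track the background level as a running minimum of frame energies and shape each frame with an SNR-derived gain. The framing, the minimum-statistics tracker, and the gain floor below are illustrative assumptions, not the patent's actual algorithm.

```python
import numpy as np

def suppress_noise(signal, frame_len=160, floor=0.1):
    """Frame-wise time-domain gain shaping (illustrative sketch).

    The noise level is tracked as the running minimum of frame RMS
    values (a simple stand-in for a background noise monitor), and
    each frame is scaled by a gain derived from its estimated SNR.
    """
    frames = [signal[i:i + frame_len] for i in range(0, len(signal), frame_len)]
    noise_rms = None
    out = []
    for f in frames:
        rms = np.sqrt(np.mean(f ** 2)) + 1e-12
        noise_rms = rms if noise_rms is None else min(noise_rms, rms)
        # Gain approaches 1 for frames well above the estimated noise
        # level and is clamped at `floor` for frames at the noise level.
        gain = max(floor, 1.0 - noise_rms / rms)
        out.append(gain * f)
    return np.concatenate(out)
```

A frame containing only background noise is attenuated to the gain floor, while frames far above the noise level pass nearly unchanged.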
  • Patent number: 9911434
    Abstract: An audio processing system (100) accepts an audio bitstream having one of a plurality of predefined audio frame rates. The system comprises a front-end component (110), which receives a variable number of quantized spectral components, corresponding to one audio frame in any of the predefined audio frame rates, and performs an inverse quantization according to predetermined, frequency-dependent quantization levels. The front-end component may be agnostic of the audio frame rate. The audio processing system further comprises a frequency-domain processing stage (120) and a sample rate converter (130), which provide a reconstructed audio signal sampled at a target sampling frequency independent of the audio frame rate. By its frame-rate adaptability, the system can be configured to operate frame-synchronously in parallel with a video processing system that accepts plural video frame rates.
    Type: Grant
    Filed: April 4, 2014
    Date of Patent: March 6, 2018
    Assignee: Dolby International AB
    Inventors: Heiko Purnhagen, Kristopher Kjoerling, Alexander Groeschel, Jens Popp, Karl Jonas Roeden
  • Patent number: 9892745
    Abstract: Disclosed herein are systems, methods, and computer-readable storage media for detecting voice activity in a media signal in an augmented, multi-tier classifier architecture. A system configured to practice the method can receive, from a first classifier, a first voice activity indicator detected in a first modality for a human subject. Then, the system can receive, from a second classifier, a second voice activity indicator detected in a second modality for the human subject, wherein the first voice activity indicator and the second voice activity indicator are based on the human subject at a same time, and wherein the first modality and the second modality are different. The system can concatenate, via a third classifier, the first voice activity indicator and the second voice activity indicator with original features of the human subject, to yield a classifier output, and determine voice activity based on the classifier output.
    Type: Grant
    Filed: August 23, 2013
    Date of Patent: February 13, 2018
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Dimitrios Dimitriadis, Eric Zavesky, Matthew Burlick
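The third-tier fusion step, concatenating the two modality indicators with the original features and classifying the result, can be sketched as a linear decision rule. The weights and threshold here are placeholders; the patent does not commit to a particular classifier family.

```python
def fuse_vad(audio_score, visual_score, features, weights, bias=0.0):
    """Third-tier classifier sketch: concatenate the two modality
    scores with the original feature vector and apply a linear
    decision rule. Weights and bias are illustrative, not from the
    patent."""
    x = [audio_score, visual_score] + list(features)
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return z > 0.0  # True -> voice activity detected
```

In practice the third classifier would be trained on labeled data; the linear rule simply makes the concatenation step concrete.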
  • Patent number: 9886954
    Abstract: One or more context aware processing parameters and an ambient audio stream are received. One or more sound characteristics associated with the ambient audio stream are identified using a machine learning model. One or more actions to perform are determined using the machine learning model and based on the one or more context aware processing parameters and the identified one or more sound characteristics. The one or more actions are performed.
    Type: Grant
    Filed: September 30, 2016
    Date of Patent: February 6, 2018
    Assignee: Doppler Labs, Inc.
    Inventors: Jacob Meacham, Matthew Sills, Richard Fritz Lanman, III, Jeffrey Baker
  • Patent number: 9875740
    Abstract: Approaches provide for using voice information to influence the importance of search result categories for a search query. For example, various embodiments may provide search results for a search query based on a most relevant search result category to the search query. Voice information associated with a subsequent user interaction may be analyzed to identify whether the search result category is correct or if search results from a different category should be provided. Additionally, the voice information may be used to update the relevance score of the search result category to the search query to improve the category matching of future queries.
    Type: Grant
    Filed: June 20, 2016
    Date of Patent: January 23, 2018
    Assignee: A9.com, Inc.
    Inventors: Mukul Raj Kumar, Balpreet Singh Pankaj
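The relevance-score update driven by voice feedback can be sketched as an exponential moving average toward 1 (implicit confirmation) or 0 (spoken correction). The learning rate and keying scheme are illustrative assumptions.

```python
def update_relevance(scores, query, category, correct, lr=0.2):
    """Nudge the (query, category) relevance score toward 1 on
    confirmation and toward 0 on a spoken correction. All parameter
    choices here are illustrative."""
    key = (query, category)
    s = scores.get(key, 0.5)          # neutral prior for unseen pairs
    target = 1.0 if correct else 0.0
    scores[key] = s + lr * (target - s)
    return scores[key]
```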
  • Patent number: 9877071
    Abstract: This disclosure relates to systems and methods for proactively determining identification information for a plurality of audio segments within a plurality of broadcast media streams, and providing identification information associated with specific audio portions of a broadcast media stream automatically or upon request.
    Type: Grant
    Filed: April 30, 2013
    Date of Patent: January 23, 2018
    Assignee: GOOGLE INC.
    Inventors: Matthew Sharifi, Ant Oztaskent, Yaroslav Volovich
  • Patent number: 9865247
    Abstract: A device may receive a speech signal. The device may determine acoustic feature parameters for the speech signal. The acoustic feature parameters may include phase data. The device may determine circular space representations for the phase data based on an alignment of the phase data with given axes of the circular space representations. The device may map the phase data to linguistic features based on the circular space representations. The linguistic features may be associated with linguistic content that includes phonemic content or text content. The device may provide a synthetic audio pronunciation of the linguistic content based on the mapping.
    Type: Grant
    Filed: February 25, 2015
    Date of Patent: January 9, 2018
    Assignee: Google Inc.
    Inventors: Ioannis Agiomyrgiannakis, Byung Ha Chun
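The idea of a circular-space representation of phase can be sketched by embedding each wrapped phase value on the unit circle, so that phases differing by 2π map to the same point. This sketch omits the patent's axis-alignment step.

```python
import math

def circular_phase(phases):
    """Embed wrapped phase values on the unit circle as (cos, sin)
    pairs, removing the 2*pi wrapping discontinuity (a sketch of the
    circular-space idea, not the patent's exact representation)."""
    return [(math.cos(p), math.sin(p)) for p in phases]
```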
  • Patent number: 9858939
    Abstract: Method and decoder for processing of audio signals. The method and decoder relate to deriving a processed vector d̂ by applying a post-filter directly on a vector d comprising quantized MDCT-domain coefficients of a time segment of an audio signal. The post-filter is configured to have a transfer function H which is a compressed version of the envelope of the vector d. A signal waveform is reconstructed by performing an inverse MDCT transform on the processed vector d̂.
    Type: Grant
    Filed: May 10, 2011
    Date of Patent: January 2, 2018
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Volodya Grancharov, Sigurdur Sverrisson
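A rough sketch of such a post-filter: estimate the envelope of the coefficient vector d, compress its dynamic range, and apply it as a gain before the inverse transform. The moving-average envelope and compression exponent below are illustrative choices, not the patent's definition of H.

```python
import numpy as np

def postfilter(d, gamma=0.5, smooth=5):
    """Apply a gain that is a compressed version of the coefficient
    envelope. gamma < 1 compresses the envelope's dynamic range; the
    moving-average envelope estimate is an illustrative stand-in."""
    env = np.convolve(np.abs(d), np.ones(smooth) / smooth, mode="same") + 1e-12
    H = (env / env.max()) ** gamma  # compressed, normalized envelope
    return H * d
```

Since H is normalized to at most 1, low-envelope regions (often dominated by quantization noise) are attenuated relative to strong spectral regions.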
  • Patent number: 9847082
    Abstract: A system includes a speech recognition processor, a depth sensor coupled to the speech recognition processor, and an array of microphones coupled to the speech recognition processor. The depth sensor is operable to calculate a distance and a direction from the array of microphones to a source of audio data. The speech recognition processor is operable to select an acoustic model as a function of the distance and the direction from the array of microphones to the source of audio data. The speech recognition processor is operable to apply the distance measure in microphone-array beamforming so as to boost portions of the signals originating from the source of audio data and to suppress portions of the signals resulting from noise.
    Type: Grant
    Filed: August 23, 2013
    Date of Patent: December 19, 2017
    Assignee: Honeywell International Inc.
    Inventors: Mohammed Ibrahim Mohideen, Yadhunandan Us
  • Patent number: 9812130
    Abstract: The technology of the present application provides a method and apparatus to manage speech resources. The method includes using a text recognizer to detect a change in a speech application that requires the use of different resources. On detection of the change, the method loads the different resources without the user needing to exit the currently executing speech application.
    Type: Grant
    Filed: March 4, 2015
    Date of Patent: November 7, 2017
    Assignee: nVoq Incorporated
    Inventor: Charles Corfield
  • Patent number: 9805715
    Abstract: A method of recognizing speech commands includes generating a background acoustic model for a sound using a first sound sample, the background acoustic model characterized by a first precision metric. A foreground acoustic model is generated for the sound using a second sound sample, the foreground acoustic model characterized by a second precision metric. A third sound sample is received and decoded by assigning a weight to the third sound sample corresponding to a probability that the sound sample originated in a foreground using the foreground acoustic model and the background acoustic model. The method further includes determining if the weight meets predefined criteria for assigning the third sound sample to the foreground and, when the weight meets the predefined criteria, interpreting the third sound sample as a portion of a speech command. Otherwise, recognition of the third sound sample as a portion of a speech command is forgone.
    Type: Grant
    Filed: December 13, 2013
    Date of Patent: October 31, 2017
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Shuai Yue, Li Lu, Xiang Zhang, Dadong Xie, Haibo Liu, Bo Chen, Jian Liu
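Under an equal-priors assumption, the weight assigned to a sound sample can be read as the posterior probability that it originated in the foreground, computed from the two models' likelihoods. A minimal sketch (the threshold and the equal-priors assumption are illustrative):

```python
def foreground_weight(p_fg, p_bg):
    """Posterior that the sample came from the foreground model,
    assuming equal priors (an illustrative reading of the two-model
    scheme; p_fg and p_bg are the models' likelihoods)."""
    return p_fg / (p_fg + p_bg)

def classify(p_fg, p_bg, threshold=0.5):
    """Interpret the sample as part of a speech command only when the
    foreground weight clears the predefined criterion."""
    w = foreground_weight(p_fg, p_bg)
    return ("command", w) if w >= threshold else ("background", w)
```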
  • Patent number: 9800453
    Abstract: A method and apparatus for providing signal processing coefficients for processing an input signal at a predetermined signal processing sampling rate, wherein the input signal is received at an input signal sampling rate, the method comprising the steps of computing a correlation or covariance function based on the received input signal at the input signal sampling rate to provide correlation or covariance coefficients at the input signal sampling rate, re-sampling the computed correlation or covariance coefficients having the input signal sampling rate to provide correlation or covariance coefficients at the predetermined signal processing sampling rate, and calculating the signal processing coefficients based on the correlation or covariance coefficients at the predetermined signal processing sampling rate.
    Type: Grant
    Filed: August 1, 2013
    Date of Patent: October 24, 2017
    Assignee: Huawei Technologies Co., Ltd.
    Inventors: Anisse Taleb, Jianfeng Xu, David Virette
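The pipeline, autocorrelation at the input rate, re-sampling of the correlation lags, then coefficient calculation at the processing rate, can be sketched with linear interpolation of the lags followed by a Levinson-Durbin recursion. Linear interpolation is one possible re-sampling choice, not the patent's only method.

```python
import numpy as np

def levinson(r, order):
    """Levinson-Durbin recursion: solve for LPC coefficients a
    (with a[0] = 1) from autocorrelation values r[0..order]."""
    a = np.zeros(order + 1)
    a[0] = 1.0
    err = r[0]
    for i in range(1, order + 1):
        acc = sum(a[j] * r[i - j] for j in range(i))
        k = -acc / err                      # reflection coefficient
        new_a = a.copy()
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err

def resample_corr(r, ratio):
    """Re-sample autocorrelation lags to the target rate by linear
    interpolation: target lag m corresponds to input lag m * ratio."""
    lags = np.arange(len(r), dtype=float)
    new_lags = np.arange(len(r)) * ratio
    new_lags = new_lags[new_lags <= lags[-1]]
    return np.interp(new_lags, lags, r)
```

For an AR(1) signal with pole rho, the order-1 predictor recovered from the re-sampled correlation corresponds to rho raised to the rate ratio, as expected from decimation.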
  • Patent number: 9798799
    Abstract: A vehicle personal assistant to engage a user in a conversational dialog about vehicle-related topics, such as those commonly found in a vehicle owner's manual, includes modules to interpret spoken natural language input, search a vehicle knowledge base and/or other data sources for pertinent information, and respond to the user's input in a conversational fashion. The dialog may be initiated by the user or more proactively by the vehicle personal assistant based on events that may be currently happening in relation to the vehicle. The vehicle personal assistant may use real-time inputs obtained from the vehicle and/or non-verbal inputs from the user to enhance its understanding of the dialog and assist the user in a variety of ways.
    Type: Grant
    Filed: November 15, 2012
    Date of Patent: October 24, 2017
    Assignee: SRI INTERNATIONAL
    Inventors: Michael J. Wolverton, William S. Mark, Harry Bratt, Douglas A. Bercow
  • Patent number: 9779744
    Abstract: A linear prediction coefficient of a signal represented in a frequency domain is obtained by performing linear prediction analysis in a frequency direction by using a covariance method or an autocorrelation method. After the filter strength of the obtained linear prediction coefficient is adjusted, filtering may be performed in the frequency direction on the signal by using the adjusted coefficient, whereby the temporal envelope of the signal is shaped. This reduces the occurrence of pre-echo and post-echo and improves the subjective quality of the decoded signal, without significantly increasing the bit rate in a bandwidth extension technique in the frequency domain represented by SBR.
    Type: Grant
    Filed: August 18, 2016
    Date of Patent: October 3, 2017
    Assignee: NTT Docomo, Inc.
    Inventors: Kosuke Tsujino, Kei Kikuiri, Nobuhiko Naka
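The filter-strength adjustment step is reminiscent of bandwidth expansion, in which prediction coefficient k is scaled by rho to the power k, with rho in (0, 1] weakening the filter. The scaling rule is an illustrative assumption about how the strength adjustment might be realized.

```python
def adjust_strength(a, rho):
    """Bandwidth-expansion style strength control: scale linear
    prediction coefficient k by rho**k. rho = 1 leaves the filter
    unchanged; smaller rho flattens it (illustrative choice)."""
    return [c * rho ** k for k, c in enumerate(a)]
```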
  • Patent number: 9773499
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for recognizing names of entities in speech. In one aspect, a method includes actions of receiving an utterance that includes (i) a first term that indicates a particular entity type, and (ii) a second term that indicates an entity name. Additional actions include obtaining a phonetic representation of the second term and determining that the phonetic representation of the second term matches a particular phonetic representation of a particular canonical name of a set of canonical names associated with a particular entity. Further actions include outputting a reference name associated with the particular entity as a transcription of the second term.
    Type: Grant
    Filed: March 5, 2015
    Date of Patent: September 26, 2017
    Assignee: Google Inc.
    Inventors: Gleb Skobeltsyn, Behshad Behzadi
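Matching a spoken term against canonical names via phonetic representations can be sketched with a crude phonetic key (drop non-initial vowels, collapse repeated letters). A real system would use a grapheme-to-phoneme front end; this key is purely illustrative.

```python
def phonetic_key(name):
    """Crude phonetic normalization: lowercase, keep letters only,
    drop non-initial vowels, collapse adjacent repeats. A stand-in
    for a real phonetic representation."""
    s = "".join(ch for ch in name.lower() if ch.isalpha())
    if not s:
        return ""
    key = s[0]
    for ch in s[1:]:
        if ch in "aeiou":
            continue
        if key[-1] != ch:
            key += ch
    return key

def match_canonical(spoken, canonical_names):
    """Return canonical names whose phonetic key matches the spoken
    term's key."""
    k = phonetic_key(spoken)
    return [c for c in canonical_names if phonetic_key(c) == k]
```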
  • Patent number: 9767801
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for intelligently cancelling user inputs. In one aspect, a request input by a user is received by a dialog engine. A prompt or notification regarding the request is output by the dialog engine. That the user has taken an action in response to the prompt or notification is determined by the dialog engine. Based on the action taken by the user, that the response corresponds to a potential cancellation command is determined by the dialog engine.
    Type: Grant
    Filed: December 30, 2016
    Date of Patent: September 19, 2017
    Assignee: Google Inc.
    Inventors: Jason Sanders, Gabriel Taubman
  • Patent number: 9754601
    Abstract: A very coarse quantization, exceeding the measure determined by the masking threshold with no or only very little quality loss, is enabled by quantizing not the prefiltered signal directly, but a prediction error obtained by forward-adaptive prediction of the prefiltered signal. Due to the forward adaptivity, the quantization error has no negative effect on the prediction on the decoder side.
    Type: Grant
    Filed: February 28, 2007
    Date of Patent: September 5, 2017
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Jens Hirschfeld, Gerald Schuller, Manfred Lutzky, Ulrich Kraemer, Stefan Wabnik
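A toy round trip illustrates residual quantization with a forward-transmitted predictor: the encoder predicts from the original samples (open loop), so the quantizer error never enters the encoder's prediction. The first-order predictor, its coefficient, and the step size are all illustrative assumptions.

```python
def encode(x, a=0.9, step=0.5):
    """Quantize the prediction residual instead of the signal itself.
    The predictor coefficient `a` is assumed to be derived from the
    original signal and sent as side information (forward adaptive)."""
    prev = 0.0
    q = []
    for s in x:
        r = s - a * prev            # residual w.r.t. original samples
        q.append(round(r / step))   # coarse uniform quantizer
        prev = s                    # open loop: predict from original
    return q

def decode(q, a=0.9, step=0.5):
    """Reconstruct by running the same predictor on decoded samples."""
    prev = 0.0
    out = []
    for idx in q:
        s = a * prev + idx * step
        out.append(s)
        prev = s
    return out
```

The per-sample reconstruction error obeys e[n] = a*e[n-1] + eps[n] with |eps| bounded by half the step size, so the total error stays within step / (2*(1 - a)).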
  • Patent number: 9747893
    Abstract: A computer-based, unsupervised training method for an N-gram language model includes reading, by a computer, recognition results obtained as a result of speech recognition of speech data; acquiring, by the computer, a reliability for each of the read recognition results; referring, by the computer, to the recognition result and the acquired reliability to select an N-gram entry; and training, by the computer, the N-gram language model on the selected one or more N-gram entries using all recognition results.
    Type: Grant
    Filed: October 6, 2016
    Date of Patent: August 29, 2017
    Assignee: International Business Machines Corporation
    Inventors: Nobuyasu Itoh, Gakuto Kurata, Masafumi Nishimura
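The selection/training split, selecting entries only from high-reliability recognition results and then training on counts gathered over all results, can be sketched as threshold-filtered n-gram counting. The reliability threshold and the data layout are assumptions.

```python
from collections import Counter

def train_ngrams(results, n=2, min_reliability=0.8):
    """Select n-gram entries only from recognition results whose
    reliability clears the threshold, then count those entries over
    ALL results. `results` is a list of (word_list, reliability)
    pairs (illustrative data layout)."""
    selected = set()
    for words, rel in results:
        if rel >= min_reliability:
            selected.update(tuple(words[i:i + n])
                            for i in range(len(words) - n + 1))
    counts = Counter()
    for words, _ in results:
        for i in range(len(words) - n + 1):
            g = tuple(words[i:i + n])
            if g in selected:
                counts[g] += 1
    return counts
```

Note that a low-reliability result still contributes counts for entries that were selected elsewhere, matching the "using all recognition results" phrasing.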
  • Patent number: 9741360
    Abstract: A method of speech enhancement for target speakers is presented. A blind source separation (BSS) module is used to separate a plurality of microphone-recorded audio mixtures into statistically independent audio components. At least one of a plurality of speaker profiles is used to score and weight each audio component, and a speech mixer is used to first mix the weighted audio components, then align the mixed signals, and finally add the aligned signals to generate an extracted speech signal. Similarly, a noise mixer is used to first weight the audio components, then mix the weighted signals, and finally add the mixed signals to generate an extracted noise signal. Post-processing further enhances the extracted speech signal with a Wiener filtering or spectral subtraction procedure, subtracting the shaped power spectrum of the extracted noise signal from that of the extracted speech signal.
    Type: Grant
    Filed: October 9, 2016
    Date of Patent: August 22, 2017
    Assignee: Spectimbre Inc.
    Inventors: Xi-Lin Li, Yan-Chen Lu
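The spectral-subtraction post-processing step can be sketched as: subtract the noise power spectrum from the speech power spectrum, floor the negative bins, and resynthesize with the speech phase. The floor value and the single-frame treatment are simplifications of a real frame-by-frame implementation.

```python
import numpy as np

def spectral_subtract(speech, noise, floor=0.01):
    """Subtract the extracted-noise power spectrum from the
    extracted-speech power spectrum, flooring negative bins at a
    fraction of the speech power, and resynthesize using the speech
    phase. Parameters are illustrative."""
    S = np.fft.rfft(speech)
    N = np.fft.rfft(noise)
    power = np.abs(S) ** 2 - np.abs(N) ** 2
    power = np.maximum(power, floor * np.abs(S) ** 2)  # spectral floor
    return np.fft.irfft(np.sqrt(power) * np.exp(1j * np.angle(S)),
                        n=len(speech))
```

Because every output bin magnitude is at most the input magnitude, the enhanced signal's energy never exceeds the input's.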
  • Patent number: 9710819
    Abstract: A computing system accepts audio from one or more sources, parses the audio into chunks, and transcribes the chunks in substantially real time. Some transcription is performed automatically, while other transcription is performed by humans who listen to the audio and enter the words spoken and/or the intent of the caller (such as directions given to the system). The system provides for participants a user interface that is updated in substantially real time with the transcribed text from the audio stream(s). A single audio line can be used for simple transcription, and multiple audio lines are used to provide a real-time transcript of a conference call, deposition, or the like. A pool of analysts creates, checks, and/or corrects transcription, and callers/observers can even assist in the correction process through their respective user interfaces. Ads derived from the transcript are displayed together with the text in substantially real time.
    Type: Grant
    Filed: November 15, 2009
    Date of Patent: July 18, 2017
    Assignee: Interactions LLC
    Inventors: Michael Eric Cloran, David Paul Heitzman, Mitchell Gregory Shields, Jeromey Russell Goetz