Patents Examined by Shaun Roberts
  • Patent number: 9881620
    Abstract: Provided are, among other things, systems, methods and techniques for compressing an audio signal. According to one representative embodiment, an audio signal that includes quantization indexes, identification of segments of such quantization indexes, and indexes of entropy codebooks that have been assigned to such segments is obtained, with a single entropy codebook index having been assigned to each such segment. Potential merging operations in which adjacent ones of the segments potentially would be merged with each other are identified, and bit penalties for the potential merging operations are estimated. At least one of the potential merging operations is performed based on the estimated bit penalties, thereby obtaining a smaller updated set of segments of quantization indexes and corresponding assigned codebooks. The quantization indexes in each of the segments in the smaller updated set are then entropy encoded by using the corresponding assigned entropy codebooks, thereby compressing the audio signal.
    Type: Grant
    Filed: December 4, 2016
    Date of Patent: January 30, 2018
    Assignee: Digital Rise Technology Co., Ltd.
    Inventor: Yuli You
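The merging idea in the abstract above can be sketched greedily: estimate the bit penalty of merging each adjacent pair of segments, perform the cheapest merge while it still saves bits, and repeat. This is only a toy sketch under an invented cost model (the codebook table and per-segment overhead are assumptions, not the patented design):

```python
# Greedy segment-merging sketch (hypothetical cost model). Each segment of
# quantization indexes has an assigned codebook; merging adjacent segments
# saves per-segment signalling overhead but may force a costlier codebook
# over the merged range.

SEGMENT_OVERHEAD_BITS = 9              # assumed cost of signalling one segment
CODEBOOKS = [(3, 2), (7, 3), (15, 4)]  # (max index value, bits per index)

def best_codebook(indexes):
    """Cheapest codebook able to represent every index in the segment."""
    for cb, (max_val, bits) in enumerate(CODEBOOKS):
        if max(indexes) <= max_val:
            return cb, bits * len(indexes)
    raise ValueError("index out of range for all codebooks")

def merge_penalty(seg_a, seg_b):
    """Bit penalty (negative = saving) of merging two adjacent segments."""
    _, cost_a = best_codebook(seg_a)
    _, cost_b = best_codebook(seg_b)
    _, cost_ab = best_codebook(seg_a + seg_b)
    return cost_ab - (cost_a + cost_b) - SEGMENT_OVERHEAD_BITS

def merge_segments(segments):
    """Repeatedly perform the most beneficial merge until none saves bits."""
    segs = [list(s) for s in segments]
    while len(segs) > 1:
        penalties = [merge_penalty(segs[i], segs[i + 1])
                     for i in range(len(segs) - 1)]
        i = min(range(len(penalties)), key=penalties.__getitem__)
        if penalties[i] >= 0:
            break
        segs[i:i + 2] = [segs[i] + segs[i + 1]]
    return segs

merged = merge_segments([[1, 2, 3], [2, 0, 1], [14, 9]])
```

Here the first two low-valued segments merge (the overhead saving outweighs the codebook cost), while merging in the high-valued segment would force the 4-bit codebook over all indexes and is declined.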
  • Patent number: 9875747
    Abstract: A sensor device may include a computing device in communication with multiple microphones. A neural network executing on the computing device may receive audio signals from each microphone. One microphone signal may serve as a reference signal. The neural network may extract differences in signal characteristics of the other microphone signals as compared to the reference signal. The neural network may combine these signal differences into a lossy compressed signal. The sensor device may transmit the lossy compressed signal and the lossless reference signal to a remote neural network executing in a cloud computing environment for decompression and sound recognition analysis.
    Type: Grant
    Filed: July 15, 2016
    Date of Patent: January 23, 2018
    Assignee: GOOGLE LLC
    Inventors: Chanwoo Kim, Rajeev Conrad Nongpiur, Tara Sainath
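The reference-plus-differences structure above can be illustrated without the neural network: keep one microphone channel lossless and encode the others as coarsely quantized differences against it. The quantization step is an assumption, and the patent learns the difference representation rather than quantizing raw sample differences:

```python
# Simplified sketch of the reference-plus-differences idea (no neural
# network; this toy just quantizes raw sample differences coarsely).

QUANT_STEP = 8  # assumed coarse step for the lossy difference channels

def compress(channels, ref_index=0):
    """Keep one channel lossless; store the rest as quantized differences."""
    ref = channels[ref_index]
    diffs = []
    for i, ch in enumerate(channels):
        if i == ref_index:
            continue
        diffs.append([round((s - r) / QUANT_STEP) for s, r in zip(ch, ref)])
    return ref, diffs

def decompress(ref, diffs, ref_index=0):
    """Reconstruct all channels from the reference and lossy differences."""
    out, it = [], iter(diffs)
    for i in range(len(diffs) + 1):
        if i == ref_index:
            out.append(list(ref))
        else:
            out.append([r + q * QUANT_STEP for r, q in zip(ref, next(it))])
    return out

mics = [[100, 120, 90], [104, 117, 95], [92, 128, 90]]
ref, diffs = compress(mics)
restored = decompress(ref, diffs)
```

The reference channel survives exactly, while the other channels are reconstructed only to within the quantization step, mirroring the lossless/lossy split described in the abstract.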
  • Patent number: 9866938
    Abstract: A microphone system includes a first transducer deployed at a first microphone; a second transducer deployed at a second microphone, the first microphone being physically distinct from the second microphone; a decimator deployed at the second microphone that receives first pulse density modulation (PDM) data from the first transducer and second PDM data from the second transducer and decimates and combines the first PDM data and the second PDM data into combined pulse code modulation (PCM) data; and an interpolator deployed at the second microphone for converting the combined PCM data to combined PDM data and transmitting the combined PDM data to an external processing device.
    Type: Grant
    Filed: February 4, 2016
    Date of Patent: January 9, 2018
    Assignee: Knowles Electronics, LLC
    Inventors: Robert Popper, Dibyendu Nandy, Ramanujapuram Raghuvir, Sarmad Qutub, Oddy Khamharn
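A drastically simplified decimate-and-combine step might look as follows. Real decimators use multi-stage filters; this sketch just averages each window of 1-bit PDM samples into one PCM value and interleaves the two microphones' words (the decimation factor and interleaved framing are assumptions):

```python
# Toy decimate-and-combine sketch: a PDM bit-stream is turned into PCM by
# simple averaging over a fixed window, and the two mics' PCM words are
# interleaved into one combined stream.

DECIMATION = 8  # assumed decimation factor

def pdm_to_pcm(bits):
    """Average each window of 1-bit samples into one PCM value in [-1, 1]."""
    return [sum(bits[i:i + DECIMATION]) * 2 / DECIMATION - 1
            for i in range(0, len(bits) - DECIMATION + 1, DECIMATION)]

def combine(pdm_a, pdm_b):
    """Decimate both PDM streams and interleave them into one PCM stream."""
    pcm_a, pcm_b = pdm_to_pcm(pdm_a), pdm_to_pcm(pdm_b)
    out = []
    for a, b in zip(pcm_a, pcm_b):
        out += [a, b]
    return out

stream_a = [1, 1, 1, 0, 1, 1, 0, 1]   # mostly ones -> positive PCM value
stream_b = [0, 0, 1, 0, 0, 1, 0, 0]   # mostly zeros -> negative PCM value
combined = combine(stream_a, stream_b)
```

The interpolator stage (PCM back to PDM, e.g. via sigma-delta remodulation) is omitted here.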
  • Patent number: 9858945
    Abstract: The present document relates to audio source coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR), as well as to digital effect processors, e.g. exciters, where generation of harmonic distortion adds brightness to the processed signal, and to time stretchers where a signal duration is prolonged with maintained spectral content. A system and method configured to generate a time stretched and/or frequency transposed signal from an input signal is described. The system comprises an analysis filterbank configured to provide an analysis subband signal from the input signal; wherein the analysis subband signal comprises a plurality of complex valued analysis samples, each having a phase and a magnitude. Furthermore, the system comprises a subband processing unit configured to determine a synthesis subband signal from the analysis subband signal using a subband transposition factor Q and a subband stretch factor S.
    Type: Grant
    Filed: July 10, 2017
    Date of Patent: January 2, 2018
    Assignee: Dolby International AB
    Inventor: Lars Villemoes
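The core phase operation behind harmonic transposition can be sketched on the complex analysis samples the abstract mentions: multiply each sample's phase by the transposition factor Q while keeping its magnitude, which maps a sinusoid onto its Q-th harmonic. This is only the kernel of the idea; the subband stretch factor S and the synthesis filterbank are omitted:

```python
# Minimal sketch of subband transposition: in each complex analysis sample
# the phase is multiplied by the transposition factor Q while the magnitude
# is kept. Time stretching by a factor S would additionally redistribute
# the resulting samples in time; that part is not shown.

import cmath

def transpose_subband(samples, Q):
    """Multiply the phase of each complex subband sample by Q."""
    return [cmath.rect(abs(z), cmath.phase(z) * Q) for z in samples]

# A unit-magnitude sample at 30 degrees becomes one at 60 degrees for Q=2.
z = cmath.rect(1.0, cmath.pi / 6)
(out,) = transpose_subband([z], Q=2)
```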
  • Patent number: 9858263
    Abstract: A method for predicting a canonical form for an input text sequence includes predicting the canonical form with a neural network model. The model includes an encoder, which generates a first representation of the input text sequence based on a representation of n-grams in the text sequence and a second representation of the input text sequence generated by a first neural network. The model also includes a decoder which sequentially predicts terms of the canonical form based on the first and second representations and a predicted prefix of the canonical form. The canonical form can be used, for example, to query a knowledge base or to generate a next utterance in a discourse.
    Type: Grant
    Filed: May 5, 2016
    Date of Patent: January 2, 2018
    Assignees: Conduent Business Services, LLC, Centre National De La Recherche Scientifique
    Inventors: Chunyang Xiao, Marc Dymetman, Claire Gardent
  • Patent number: 9852743
    Abstract: Embodiments of the present invention provide systems, methods, and computer storage media directed towards automatic emphasis of spoken words. In one embodiment, a process may begin by identifying, within an audio recording, a word that is to be emphasized. Once identified, contextual and lexical information relating to the emphasized word can be extracted from the audio recording. This contextual and lexical information can be utilized in conjunction with a predictive model to determine a set of emphasis parameters for the identified word. These emphasis parameters can then be applied to the identified word to cause the word to be emphasized. Other embodiments may be described and/or claimed.
    Type: Grant
    Filed: November 20, 2015
    Date of Patent: December 26, 2017
    Assignee: Adobe Systems Incorporated
    Inventors: Yang Zhang, Gautham J. Mysore, Floraine Berthouzoz
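The final step of the process above, applying emphasis parameters to the identified word, can be sketched with two invented parameters (gain and a crude duration scale); the patent predicts such parameters from contextual and lexical features, which is not modeled here:

```python
# Toy sketch of applying a set of emphasis parameters to the samples of an
# identified word. Gain scaling and sample-repetition stretching are
# stand-ins for the real prosodic modifications.

def apply_emphasis(samples, gain, duration_scale):
    """Scale amplitude and crudely stretch duration by sample repetition."""
    repeat = max(1, round(duration_scale))
    out = []
    for s in samples:
        out.extend([s * gain] * repeat)
    return out

word = [0.1, -0.2, 0.3]
emphasized = apply_emphasis(word, gain=1.5, duration_scale=2)
```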
  • Patent number: 9842588
    Abstract: A method and a device of voice recognition are provided. The method involves receiving a voice signal, identifying a first voice recognition model in which context information associated with a situation at reception of the voice signal is not reflected and a second voice recognition model in which the context information is reflected, determining a weighted value of the first voice recognition model and a weighted value of the second voice recognition model, and recognizing a word in the voice signal by applying the determined weighted values to the first voice recognition model and the second voice recognition model.
    Type: Grant
    Filed: February 6, 2015
    Date of Patent: December 12, 2017
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Hyun-Jun Kim, Young Sang Choi
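The weighted combination of the two recognition models can be illustrated as score interpolation. The word scores and weights below are invented; the patent determines the weights from context information at the time the voice signal is received:

```python
# Hedged sketch: combine a context-free model and a context-aware model by
# a weighted sum of their per-word scores, then pick the best word.

def recognize(scores_plain, scores_context, w_plain, w_context):
    """Pick the word maximizing the weighted sum of both models' scores."""
    combined = {w: w_plain * scores_plain.get(w, 0.0)
                   + w_context * scores_context.get(w, 0.0)
                for w in set(scores_plain) | set(scores_context)}
    return max(combined, key=combined.get)

# In a navigation context the context model lifts "route" over "root".
plain = {"root": 0.6, "route": 0.4}
contextual = {"root": 0.2, "route": 0.8}
word = recognize(plain, contextual, w_plain=0.3, w_context=0.7)
```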
  • Patent number: 9830315
    Abstract: A system and method are provided which employ a neural network model which has been trained to predict a sequentialized form for an input text sequence. The sequentialized form includes a sequence of symbols. The neural network model includes an encoder which generates a representation of the input text sequence based on a representation of n-grams in the text sequence and a decoder which sequentially predicts a next symbol of the sequentialized form based on the representation and a predicted prefix of the sequentialized form. Given an input text sequence, a sequentialized form is predicted with the trained neural network model. The sequentialized form is converted to a structured form and information based on the structured form is output.
    Type: Grant
    Filed: July 13, 2016
    Date of Patent: November 28, 2017
    Assignees: XEROX CORPORATION, Centre National de la Recherche Scientifique
    Inventors: Chunyang Xiao, Marc Dymetman, Claire Gardent
  • Patent number: 9830920
    Abstract: A method, device, and apparatus provide the ability to predict a portion of a polyphonic audio signal for compression and networking applications. The solution involves a framework of a cascade of long term prediction filters, which by design is tailored to account for all periodic components present in a polyphonic signal. This framework is complemented with a design method to optimize the system parameters. Specialization may include specific techniques for coding and networking scenarios, where the potential of each enhanced prediction is realized to considerably improve the overall system performance for that application. One specific technique provides enhanced inter-frame prediction for the compression of polyphonic audio signals, particularly at low delay. Another specific technique provides improved frame loss concealment capabilities to combat packet loss in audio communications.
    Type: Grant
    Filed: June 29, 2016
    Date of Patent: November 28, 2017
    Assignee: The Regents of the University of California
    Inventors: Kenneth Rose, Tejaswi Nanjundaswamy
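The cascade of long term prediction filters can be sketched directly: each stage predicts the current sample from one pitch period back (its own lag and gain) and passes the residual on, so each periodic component of the polyphonic signal is removed in turn. The lags and gains are given here; the patent's contribution includes optimizing them jointly, which this toy skips:

```python
# Sketch of a cascade of long-term prediction (LTP) filters. With two
# stages tuned to periods 4 and 6, a signal built from those two periodic
# components leaves an (essentially) zero residual after the transient.

def ltp_stage(signal, lag, gain):
    """Residual after subtracting gain * signal[n - lag]."""
    return [s - (gain * signal[n - lag] if n >= lag else 0.0)
            for n, s in enumerate(signal)]

def cascade(signal, stages):
    residual = list(signal)
    for lag, gain in stages:
        residual = ltp_stage(residual, lag, gain)
    return residual

# Two superposed periodic components with periods 4 and 6.
N = 24
x = [(1.0 if n % 4 == 0 else 0.0) + (0.5 if n % 6 == 0 else 0.0)
     for n in range(N)]
r = cascade(x, [(4, 1.0), (6, 1.0)])
```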
  • Patent number: 9817818
    Abstract: A method and computer system for translating sentences between languages from an intermediate language-independent semantic representation is provided. On the basis of comprehensive understanding about languages and semantics, exhaustive linguistic descriptions are used to analyze sentences, to build syntactic structures and language independent semantic structures and representations, and to synthesize one or more sentences in a natural or artificial language. A computer system is also provided to analyze and synthesize various linguistic structures and to perform translation of a wide spectrum of various sentence types. As result, a generalized data structure, such as a semantic structure, is generated from a sentence of an input language and can be transformed into a natural sentence expressing its meaning correctly in an output language.
    Type: Grant
    Filed: May 21, 2012
    Date of Patent: November 14, 2017
    Assignee: ABBYY PRODUCTION LLC
    Inventors: Konstantin Anisimovich, Vladimir Selegey, Konstantin Zuev
  • Patent number: 9805732
    Abstract: Embodiments of the present application propose a frequency envelope vector quantization method and apparatus, where the method includes: dividing N frequency envelopes in one frame into N1 vectors; quantizing a first vector in the N1 vectors by using a first codebook, to obtain a code word corresponding to the quantized first vector, where the first codebook is divided into 2^B1 portions; determining, according to the code word corresponding to the quantized first vector, an i-th portion of the 2^B1 portions; determining a second codebook according to the codebook of the i-th portion; and quantizing a second vector in the N1 vectors based on the second codebook. In the embodiments of the present application, vector quantization can be performed on frequency envelope vectors by using a codebook with a smaller quantity of bits. Therefore, complexity of vector quantization can be reduced, and an effect of vector quantization can also be ensured.
    Type: Grant
    Filed: December 29, 2015
    Date of Patent: October 31, 2017
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Chen Hu, Lei Miao, Zexin Liu
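The two-stage selection above can be sketched with a toy codebook: quantize the first vector against the full first codebook, use the winning index to select one of the 2^B1 portions, and search only that portion for the second vector. The codebook contents are invented, and using the portion directly as the second codebook is a simplification:

```python
# Toy sketch of the two-stage envelope VQ: the index of the quantized first
# vector selects one of 2**B1 portions of the first codebook, which here
# directly serves as the smaller second codebook for the next vector.

B1 = 1  # first codebook split into 2**B1 = 2 portions

def nearest(codebook, vec):
    """Index of the code word with minimum squared error to vec."""
    def err(cw):
        return sum((a - b) ** 2 for a, b in zip(cw, vec))
    return min(range(len(codebook)), key=lambda i: err(codebook[i]))

first_codebook = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]

def quantize_pair(v1, v2):
    idx1 = nearest(first_codebook, v1)
    # Select the portion of the first codebook that contains idx1.
    portion_size = len(first_codebook) // (2 ** B1)
    start = (idx1 // portion_size) * portion_size
    second_codebook = first_codebook[start:start + portion_size]
    idx2 = nearest(second_codebook, v2)
    return idx1, start + idx2

i1, i2 = quantize_pair((0.9, 0.1), (0.8, 0.9))
```

The second search visits only half the code words, which is where the complexity reduction claimed in the abstract comes from.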
  • Patent number: 9807217
    Abstract: A computer-implemented method of determining when an audio notification should be generated includes detecting receipt of a triggering event that occurs on a user device; generating, based on detecting, the audio notification for the triggering event; receiving, from the user device, a user voice command responding to the audio notification; and generating a response to the user voice command based on one or more of (i) information associated with the audio notification, and (ii) information associated with the user voice command.
    Type: Grant
    Filed: April 21, 2016
    Date of Patent: October 31, 2017
    Assignee: Google Inc.
    Inventors: Michael J. LeBeau, John Nicholas Jitkoff
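The control flow described above can be sketched minimally: a triggering event produces an audio notification, and the user's voice reply is answered using context from both the notification and the command. Event contents and response wording below are invented:

```python
# Control-flow sketch of the notification dialogue described in the
# abstract (event structure and phrasing are illustrative only).

def on_trigger(event):
    """Generate the audio notification for a triggering event."""
    return f"Notification: {event['summary']}"

def on_voice_command(notification, command):
    """Answer the voice command using the notification's context."""
    if "read" in command:
        return notification.replace("Notification: ", "")
    return f"Sorry, I can't {command!r} for this notification."

note = on_trigger({"summary": "New message from Alice"})
reply = on_voice_command(note, "read it")
```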
  • Patent number: 9807473
    Abstract: Video description generation using neural network training based on relevance and coherence is described. In some examples, long short-term memory with visual-semantic embedding (LSTM-E) can maximize the probability of generating the next word given previous words and visual content and can create a visual-semantic embedding space for enforcing the relationship between the semantics of an entire sentence and visual content. LSTM-E can include 2-D and/or 3-D deep convolutional neural networks for learning powerful video representation, a deep recurrent neural network for generating sentences, and a joint embedding model for exploring the relationships between visual content and sentence semantics.
    Type: Grant
    Filed: November 20, 2015
    Date of Patent: October 31, 2017
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Tao Mei, Ting Yao, Yong Rui
  • Patent number: 9792911
    Abstract: A method of operating a speech recognition system includes converting a spoken utterance by a user into an electrical voice signal by use of a local microphone associated with a local electronic device. The electrical voice signal is transmitted to a remote voice recognizer. The remote voice recognizer is used to transcribe the electrical voice signal and to produce a confidence score. The confidence score indicates a level of confidence that the transcription of the electrical voice signal substantially matches the words of the spoken utterance. The transcription of the electrical voice signal and the confidence score are transmitted from the remote voice recognizer to the local electronic device. The electrical voice signal, the transcription of the electrical voice signal, and the confidence score are used at the local device to train a local voice recognizer.
    Type: Grant
    Filed: March 24, 2015
    Date of Patent: October 17, 2017
    Assignee: Panasonic Automotive Systems Company of America, Division of Panasonic Corporation of North America
    Inventors: Ilya Veksler, Ambuj Kumar, Naveen Reddy Korupol
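The data flow above lends itself to a simple sketch: utterances whose remote transcription arrives with a high confidence score become supervised training pairs for the local recognizer, while low-confidence ones are discarded. The threshold value and data shapes are assumptions:

```python
# Minimal sketch of turning remote recognition results into local training
# data, gated by the confidence score transmitted with each transcription.

CONFIDENCE_THRESHOLD = 0.8  # assumed cutoff for usable training pairs

def collect_training_pairs(results):
    """results: iterable of (voice_signal, transcription, confidence)."""
    return [(signal, text) for signal, text, conf in results
            if conf >= CONFIDENCE_THRESHOLD]

remote_results = [
    ("sig1", "turn on the radio", 0.95),
    ("sig2", "call home", 0.42),        # too uncertain to train on
    ("sig3", "navigate to work", 0.88),
]
pairs = collect_training_pairs(remote_results)
```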
  • Patent number: 9792907
    Abstract: Techniques related to key phrase detection for applications such as wake on voice are discussed. Such techniques may include updating a start state based rejection model and a key phrase model based on scores of sub-phonetic units from an acoustic model to generate a rejection likelihood score and a key phrase likelihood score and determining whether received audio input is associated with a predetermined key phrase based on the rejection likelihood score and the key phrase likelihood score.
    Type: Grant
    Filed: November 24, 2015
    Date of Patent: October 17, 2017
    Assignee: Intel IP Corporation
    Inventors: Tobias Bocklet, Joachim Hofer
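The decision rule above can be sketched with toy log scores: per frame, a rejection model accumulates the best sub-phonetic-unit score while the key phrase model accumulates scores along the phrase's expected unit sequence, and the phrase is detected when its likelihood beats rejection within a margin. Scores and the margin are illustrative, not from the patent:

```python
# Sketch of key phrase vs. rejection scoring (log domain, invented values).

MARGIN = 1.0  # assumed detection margin (log-likelihood)

def detect(frames, key_units):
    """frames: list of {unit: log score}; key_units: expected unit sequence."""
    rejection = sum(max(f.values()) for f in frames)
    key_score = sum(f.get(u, float("-inf"))
                    for f, u in zip(frames, key_units))
    return key_score > rejection - MARGIN

frames = [{"h": -0.1, "e": -2.0}, {"e": -0.2, "h": -1.5},
          {"y": -0.3, "e": -1.0}]
assumed = detect(frames, ["h", "e", "y"])   # matching phrase "h-e-y"
```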
  • Patent number: 9779752
    Abstract: A system for cloud acoustic enhancement is disclosed. In particular, the system may leverage metadata and cloud-computing network resources to mitigate the impact of noisy environments that may potentially interfere with user communications. In order to do so, the system may receive an audio stream including an audio signal associated with a user, and determine if the audio stream also includes an interference signal. The system may determine that the audio stream includes the interference signal if a portion of the audio stream correlates with metadata that identifies the interference signal. If the audio stream is determined to include the interference signal, the system may cancel the interference signal from the audio stream by utilizing the metadata and the cloud-computing network resources. Once the interference signal is cancelled, the system may transmit the audio stream including the audio signal associated with the user to an intended destination.
    Type: Grant
    Filed: October 31, 2014
    Date of Patent: October 3, 2017
    Assignee: AT&T Intellectual Property I, L.P.
    Inventors: Donald J. Bowen, Dimitrios B. Dimitriadis, Lusheng Ji, Horst J. Schroeter
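A rough single-template version of the cancellation step might look as follows: if the incoming stream correlates strongly with a known interference template (the kind of signal the metadata would identify), subtract the least-squares-scaled template. The template, threshold, and correlation test are all invented stand-ins for the metadata-driven cloud processing:

```python
# Sketch of metadata-driven interference cancellation using one known
# template and a normalized-correlation gate (illustrative values).

CORR_THRESHOLD = 0.9  # assumed correlation needed to declare interference

def cancel_interference(stream, template):
    dot_st = sum(s * t for s, t in zip(stream, template))
    dot_tt = sum(t * t for t in template)
    dot_ss = sum(s * s for s in stream)
    if dot_tt == 0 or dot_ss == 0:
        return stream
    corr = dot_st / (dot_tt ** 0.5 * dot_ss ** 0.5)
    if corr < CORR_THRESHOLD:
        return stream  # stream does not contain this interference
    scale = dot_st / dot_tt  # least-squares gain of the template
    return [s - scale * t for s, t in zip(stream, template)]

voice = [0.1, -0.2, 0.15, 0.0]
hum = [1.0, 1.0, 1.0, 1.0]                    # constant "hum" interference
mixed = [v + 0.5 * h for v, h in zip(voice, hum)]
cleaned = cancel_interference(mixed, hum)
```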
  • Patent number: 9767808
    Abstract: A method and apparatus for suppressing vocoder noise are provided. In the method, first information and second information are received from a channel decoder, the first information indicating whether a decoded data frame has an error and the second information being a channel quality metric, error concealment voice decoding is performed on the decoded data frame if the first information indicates that no channel decoding error has been generated and the second information is smaller than a predetermined first threshold, and normal voice decoding is performed on the decoded data frame if the first information indicates that no channel decoding error has been generated and the second information is equal to or larger than the first threshold.
    Type: Grant
    Filed: July 26, 2013
    Date of Patent: September 19, 2017
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Yong-Won Shin, Joon-Sang Ryu, Jung-In Kim
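The decision logic in the abstract maps directly to a small dispatcher: even when the channel decoder reports no frame error, a channel quality metric below a threshold routes the frame to error concealment decoding instead of normal decoding. The threshold value and the behavior on a frame error are assumptions:

```python
# Direct sketch of the vocoder decoding-mode decision described above.

QUALITY_THRESHOLD = 0.5  # assumed "first threshold" for the quality metric

def choose_decoding(frame_has_error, channel_quality):
    if frame_has_error:
        return "frame_error_handling"        # assumed path, not in abstract
    if channel_quality < QUALITY_THRESHOLD:
        return "error_concealment_decoding"  # no error, but poor channel
    return "normal_decoding"

mode = choose_decoding(frame_has_error=False, channel_quality=0.3)
```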
  • Patent number: 9753918
    Abstract: A speech translation system and methods for cross-lingual communication that enable users to easily improve and customize content and usage of the system. The methods include, in response to receiving an utterance including a first term associated with a field, translating the utterance into a second language. In response to receiving an indication to add the first term associated with the field to a first recognition lexicon, adding the first term associated with the field and the determined translation to a first machine translation module and to a shared database for a community associated with the field of the first term, wherein the first term added to the shared database is accessible by the community.
    Type: Grant
    Filed: January 5, 2015
    Date of Patent: September 5, 2017
    Assignee: Facebook, Inc.
    Inventors: Alexander Waibel, Ian R. Lane
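The customization flow above reduces to a small bookkeeping sketch: when the user confirms a field-specific term, the term and its translation go both into the local machine translation lexicon and into a shared per-field community lexicon. The data structures are invented stand-ins for the system's modules:

```python
# Sketch of adding a confirmed term to the local MT lexicon and to the
# shared community database keyed by field (toy dictionaries only).

mt_lexicon = {}   # local recognition/translation lexicon
shared_db = {}    # field -> {term: translation}

def add_term(term, translation, field):
    mt_lexicon[term] = translation
    shared_db.setdefault(field, {})[term] = translation

add_term("stent", "Stent", field="medicine")
visible_to_community = shared_db["medicine"]["stent"]
```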
  • Patent number: 9747916
    Abstract: In a CELP-type speech coding apparatus, switching between an orthogonal search of a fixed codebook and a non-orthogonal search is performed in a practical and effective manner. The CELP-type speech coding apparatus includes a parameter quantizer that selects an adaptive codebook vector and a fixed codebook vector so as to minimize an error between a synthesized speech signal and an input speech signal. The parameter quantizer includes a fixed codebook searcher that switches between the orthogonal fixed codebook search and the non-orthogonal fixed codebook search based on a correlation value between a target vector for the fixed codebook search and the adaptive codebook vector obtained as a result of a synthesis filtering process.
    Type: Grant
    Filed: January 20, 2016
    Date of Patent: August 29, 2017
    Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
    Inventors: Hiroyuki Ehara, Takako Hori
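The switching criterion above can be sketched as a correlation test: when the normalized correlation between the fixed codebook search target and the (filtered) adaptive codebook vector is high, orthogonalizing the fixed codebook search pays off; otherwise the cheaper non-orthogonal search is used. The threshold is an assumed value:

```python
# Sketch of orthogonal vs. non-orthogonal fixed codebook search switching
# based on the correlation named in the abstract (threshold is invented).

import math

THRESHOLD = 0.5

def normalized_correlation(x, y):
    num = sum(a * b for a, b in zip(x, y))
    den = math.sqrt(sum(a * a for a in x) * sum(b * b for b in y))
    return num / den if den else 0.0

def choose_search(target, adaptive_vec):
    corr = abs(normalized_correlation(target, adaptive_vec))
    return "orthogonal" if corr >= THRESHOLD else "non_orthogonal"

# A target nearly parallel to the adaptive contribution triggers the
# orthogonal search.
mode = choose_search([1.0, 0.9, 1.1, 1.0], [1.0, 1.0, 1.0, 1.0])
```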
  • Patent number: 9747896
    Abstract: In certain implementations, follow-up responses may be provided for prior natural language inputs of a user. As an example, a natural language input associated with a user may be received at a computer system. A determination of whether information sufficient for providing an adequate response to the natural language input is currently accessible to the computer system may be effectuated. A first response to the natural language input (that indicates that a follow-up response will be provided) may be provided based on a determination that information sufficient for providing an adequate response to the natural language input is not currently accessible. Information sufficient for providing an adequate response to the natural language input may be received. A second response to the natural language input may then be provided based on the received sufficient information.
    Type: Grant
    Filed: October 15, 2015
    Date of Patent: August 29, 2017
    Assignee: VoiceBox Technologies Corporation
    Inventors: Michael R. Kennewick, Jr., Michael R. Kennewick, Sr.
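The two-response flow above can be sketched as a small state machine: if the information needed for an adequate answer is not currently accessible, immediately acknowledge with a first response promising a follow-up, and deliver the second response once the information arrives. The knowledge store and wording are invented:

```python
# Control-flow sketch of the deferred follow-up response described in the
# abstract (toy in-memory knowledge base, illustrative phrasing).

class FollowUpAssistant:
    def __init__(self, knowledge):
        self.knowledge = knowledge
        self.pending = []

    def ask(self, query):
        """First response: answer now, or promise a follow-up."""
        if query in self.knowledge:
            return self.knowledge[query]
        self.pending.append(query)
        return "I don't know yet -- I'll follow up shortly."

    def on_information(self, query, answer):
        """Second response, once sufficient information is received."""
        self.knowledge[query] = answer
        if query in self.pending:
            self.pending.remove(query)
            return answer
        return None

bot = FollowUpAssistant({"capital of France": "Paris"})
first = bot.ask("tomorrow's weather")
second = bot.on_information("tomorrow's weather", "Sunny, 22 C")
```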