Patents Examined by Bryan S Blankenagel
  • Patent number: 11315584
    Abstract: The present disclosure relates to an apparatus for decoding an encoded Unified Audio and Speech stream. The apparatus comprises a core decoder for decoding the encoded Unified Audio and Speech stream. The core decoder includes an eSBR unit for extending a bandwidth of an input signal, the eSBR unit including a QMF based harmonic transposer. The QMF based harmonic transposer is configured to process the input signal in the QMF domain, in each of a plurality of synthesis subbands, to extend the bandwidth of the input signal. The QMF based harmonic transposer is configured to operate at least in part based on pre-computed information. The present disclosure further relates to corresponding methods and storage media.
    Type: Grant
    Filed: December 19, 2018
    Date of Patent: April 26, 2022
    Assignee: Dolby International AB
    Inventors: Rajat Kumar, Ramesh Katuri, Saketh Sathuvalli, Reshma Rai
  • Patent number: 11315580
    Abstract: An assignment of one of a set of different loss concealment tools of an audio decoder to a portion of the audio signal to be decoded from a data stream, which portion is affected by loss, that is, the selection out of the set of different loss concealment tools, may be made in a manner leading to a more pleasant loss concealment if the assignment/selection is done based on two measures: a first measure that indicates a spectral position of a spectral centroid of a spectrum of the audio signal, and a second measure that indicates a temporal predictability of the audio signal. The assigned or selected loss concealment tool may then be used to recover the portion of the audio signal.
    Type: Grant
    Filed: May 6, 2020
    Date of Patent: April 26, 2022
    Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
    Inventors: Adrian Tomasek, Emmanuel Ravelli, Markus Schnell, Alexander Tschekalinskij, Michael Schnabel, Ralph Sperschneider
  • Patent number: 11315581
    Abstract: Techniques for encoding audio data with metadata are described. In an example, a device receives audio data corresponding to audio detected by a microphone and receives metadata associated with the audio. The device generates encoded data based at least in part on encoding the audio data with the metadata. The encoding involves replacing a portion of the audio data with the metadata, such that the encoded data includes the metadata and a remaining portion of the audio data. The device sends the encoded data to an audio processing application.
    Type: Grant
    Filed: August 17, 2020
    Date of Patent: April 26, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Aditya Sharadchandra Joshi, Carlo Murgia, Michael Thomas Peterson
  • Patent number: 11302343
    Abstract: A signal analysis device includes an estimation unit that models a sound source position occurrence probability matrix Q using a product of a sound source position probability matrix B and a sound source existence probability matrix A, and estimates at least one of the sound source position probability matrix B and the sound source existence probability matrix A based on the modeling, the sound source position occurrence probability matrix Q being composed of probabilities of arrival of a signal from each sound source position candidate per frame, which is a time section, with respect to a plurality of sound source position candidates. The sound source position probability matrix B is composed of probabilities of arrival of a signal from each sound source position candidate per sound source with respect to a plurality of sound sources.
    Type: Grant
    Filed: April 4, 2019
    Date of Patent: April 12, 2022
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Nobutaka Ito, Tomohiro Nakatani, Shoko Araki
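The product model in the abstract above can be illustrated with a small forward computation. This is only a sketch under stated assumptions: the abstract does not give the shape of A, so A is assumed here to hold one column per frame and one row per sound source, and all numbers are invented. The patent concerns estimating B and A; the sketch only shows that multiplying them yields a per-frame position probability matrix Q.

```python
def matmul(b, a):
    """Plain-Python matrix product b @ a for lists of rows."""
    inner = len(a)
    return [
        [sum(b[i][k] * a[k][j] for k in range(inner)) for j in range(len(a[0]))]
        for i in range(len(b))
    ]

# B: rows = position candidates, columns = sound sources (each column sums to 1).
B = [
    [0.8, 0.1],
    [0.1, 0.2],
    [0.1, 0.7],
]

# A (assumed layout): rows = sound sources, columns = frames (each column sums to 1).
A = [
    [1.0, 0.25],
    [0.0, 0.75],
]

# Q: rows = position candidates, columns = frames.
Q = matmul(B, A)

# Each column of Q is again a probability distribution over position candidates.
column_sums = [sum(Q[i][j] for i in range(len(Q))) for j in range(len(Q[0]))]
```

Because every column of B and of A sums to one in this example, every column of Q does as well, which is what lets Q be read as per-frame probabilities.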
  • Patent number: 11289106
    Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag.
    Type: Grant
    Filed: January 28, 2019
    Date of Patent: March 29, 2022
    Assignee: Dolby International AB
    Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
  • Patent number: 11282502
    Abstract: A computer-implemented method for utterance generation, a smart device, and a non-transitory computer readable storage medium are provided. The method includes: obtaining a first utterance to be answered, generating at least one random semantic vector, inputting the at least one random semantic vector and the first utterance into a trained generator, and obtaining at least one first answer outputted by the trained generator, wherein the trained generator is obtained based on a preset generative adversarial network. Due to the random semantic vector, even for the same utterance, the smart device can generate different answers corresponding to the different random semantic vectors, the possibility of generating too many identical answers during the human-machine conversation is reduced, and the fun during the human-machine conversation is enhanced.
    Type: Grant
    Filed: August 31, 2020
    Date of Patent: March 22, 2022
    Assignee: UBTECH ROBOTICS CORP LTD
    Inventors: Rixing Huang, Youjun Xiong
  • Patent number: 11276412
    Abstract: A method and device allocate a bit-budget to a plurality of first parts of a CELP core module of (a) an encoder for encoding a sound signal or (b) a decoder for decoding the sound signal. In the method and device, bit-budget allocation tables assign, for each of a plurality of intermediate bit rates, respective bit-budgets to the first CELP core module parts. A CELP core module bit rate is determined and one of the intermediate bit rates is selected based on the determined CELP core module bit rate. The respective bit-budgets assigned by the bit-budget allocation tables for the selected intermediate bit rate are allocated to the first CELP core module parts.
    Type: Grant
    Filed: September 20, 2018
    Date of Patent: March 15, 2022
    Assignee: VOICEAGE CORPORATION
    Inventor: Vaclav Eksler
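The table lookup described in the abstract above might be sketched as follows. Everything concrete here is hypothetical: the part names, the bit counts, the three intermediate bit rates, and the selection rule (the highest intermediate rate not exceeding the core bit rate) are assumptions for illustration, since the abstract does not specify them.

```python
from bisect import bisect_right

# Hypothetical allocation table: intermediate bit rate (bits/s) -> bits per frame
# for each "first part" of the CELP core. Names and numbers are invented.
BIT_BUDGET_TABLE = {
    5000: {"lpc": 30, "pitch": 20, "gains": 12},
    8000: {"lpc": 36, "pitch": 26, "gains": 20},
    13200: {"lpc": 42, "pitch": 30, "gains": 28},
}

def select_intermediate_rate(core_rate, table=BIT_BUDGET_TABLE):
    """Pick the highest tabulated rate <= core_rate (assumed rule).

    Falls back to the lowest tabulated rate when core_rate is below all of them.
    """
    rates = sorted(table)
    idx = bisect_right(rates, core_rate) - 1
    return rates[max(idx, 0)]

def allocate_bit_budget(core_rate, table=BIT_BUDGET_TABLE):
    """Return the selected intermediate rate and a copy of its per-part budgets."""
    rate = select_intermediate_rate(core_rate, table)
    return rate, dict(table[rate])
```

For example, `allocate_bit_budget(9000)` selects the 8000 bits/s row of this hypothetical table.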
  • Patent number: 11275854
    Abstract: A method, computer program product, and computing system for defining a conversation print for each of a plurality of known entities, thus defining a plurality of conversation prints. Voice-based content is received from a third-party. The voice-based content is compared to at least one of the plurality of conversation prints to identify the third party.
    Type: Grant
    Filed: July 18, 2018
    Date of Patent: March 15, 2022
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventor: Haydar Talib
  • Patent number: 11275855
    Abstract: A method, computer program product, and computing system for defining a conversation print for each of a plurality of known fraudsters, thus defining a plurality of fraudster conversation prints. The plurality of fraudster conversation prints is processed to identify one or more fraudster commonalities. A fraudster conversation template is generated based, at least in part, upon the one or more fraudster commonalities.
    Type: Grant
    Filed: July 18, 2018
    Date of Patent: March 15, 2022
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventor: Haydar Talib
  • Patent number: 11276411
    Abstract: A method and device for allocating a bit-budget to a plurality of first parts and to a second part of a CELP core module of (a) an encoder for encoding a sound signal or (b) a decoder for decoding the sound signal. In a frame of the sound signal comprising sub-frames, respective bit-budgets are allocated to the first CELP core module parts and a bit-budget remaining after allocating to the first CELP core module parts their respective bit-budgets is allocated to the second CELP core module part. According to an alternative, the second CELP core module part bit-budget is distributed between the sub-frames of the frame and a larger bit-budget is allocated to at least one of the sub-frames of the frame. The at least one sub-frame may be the first sub-frame of the frame, at least one sub-frame following the first sub-frame, or the sub-frame using a glottal-impulse-shape codebook.
    Type: Grant
    Filed: September 20, 2018
    Date of Patent: March 15, 2022
    Assignee: VOICEAGE CORPORATION
    Inventor: Vaclav Eksler
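The "remaining budget" idea in the abstract above can be sketched as below. This is an illustration under assumptions, not the patented procedure: the even split, the choice to give all leftover bits to one favored sub-frame, and the part names are hypothetical.

```python
def allocate_second_part(frame_bits, first_part_bits, n_subframes, favored=0):
    """Distribute the bits left after the first parts across sub-frames.

    frame_bits      -- total bit-budget of the frame
    first_part_bits -- dict of bits already assigned to the first parts
    n_subframes     -- number of sub-frames in the frame
    favored         -- index of the sub-frame that receives any leftover bits
                       (assumed rule; when the split is exact no sub-frame is larger)
    """
    remaining = frame_bits - sum(first_part_bits.values())
    if remaining < 0:
        raise ValueError("first parts exceed the frame bit-budget")
    base, extra = divmod(remaining, n_subframes)
    budgets = [base] * n_subframes
    budgets[favored] += extra
    return budgets
```

With a 100-bit frame and 62 bits taken by the first parts, four sub-frames share the remaining 38 bits as `[11, 9, 9, 9]` when the first sub-frame is favored.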
  • Patent number: 11275853
    Abstract: Conversation Print: A method, computer program product, and computing system for receiving voice-based content from a third-party. The voice-based content is processed to define a text-based transcript for the voice-based content. The voice-based content is processed to define speech-pattern indicia for the voice-based content. A conversation print for the voice-based content is generated based, at least in part, upon the text-based transcript and the speech-pattern indicia.
    Type: Grant
    Filed: July 18, 2018
    Date of Patent: March 15, 2022
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: Haydar Talib, Peter Stubley
  • Patent number: 11270714
    Abstract: Encoding a sequence of digital speech samples into a bit stream includes dividing the digital speech samples into frames including N subframes (where N is an integer greater than 1); computing model parameters for the subframes, the model parameters including spectral parameters; and generating a representation of the frame. The representation includes information representing the spectral parameters of P subframes (where P is an integer and P<N) and information identifying the P subframes. The representation excludes information representing the spectral parameters of the N−P subframes not included in the P subframes.
    Type: Grant
    Filed: January 8, 2020
    Date of Patent: March 8, 2022
    Assignee: Digital Voice Systems, Inc.
    Inventor: Thomas Clark
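The frame representation described in the abstract above can be pictured as a small data structure. This is a hedged sketch: how the P subframes are chosen and how the information is packed into bits are not stated in the abstract, so the function simply takes the chosen indices as input, and the dictionary layout is hypothetical.

```python
def build_frame_representation(spectral_params, selected):
    """Keep spectral parameters for the selected subframes only.

    spectral_params -- list of N per-subframe parameter lists
    selected        -- indices of the P subframes to keep (P < N)
    """
    n = len(spectral_params)
    chosen = sorted(set(selected))
    if len(chosen) >= n:
        raise ValueError("P must be smaller than N")
    if any(i < 0 or i >= n for i in chosen):
        raise ValueError("subframe index out of range")
    return {
        # Information identifying the P subframes.
        "subframe_ids": chosen,
        # Spectral parameters of those P subframes; the other N-P are omitted.
        "spectral": [list(spectral_params[i]) for i in chosen],
    }
```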
  • Patent number: 11270261
    Abstract: A method, computer program product, and computer system for mapping, by a computing device, an automatic speech recognition output of a conversation to a concept marker and a verbalized version of a value associated with the concept marker based upon, at least in part, the automatic speech recognition output of the conversation. The concept marker and the verbalized version of the value associated with the concept marker may be replaced with a formatted version. A plurality of user selectable format configurations of the formatted version may be provided as a textual output in a user interface.
    Type: Grant
    Filed: February 8, 2019
    Date of Patent: March 8, 2022
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventor: Paul Joseph Vozila
  • Patent number: 11257498
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword trigger suppression are disclosed. In one aspect, a method includes the actions of receiving, by a microphone of a computing device, audio corresponding to playback of an item of media content, the audio including an utterance of a predefined hotword that is associated with performing an operation on the computing device. The actions further include processing the audio. The actions further include in response to processing the audio, suppressing performance of the operation on the computing device.
    Type: Grant
    Filed: November 20, 2020
    Date of Patent: February 22, 2022
    Assignee: Google LLC
    Inventors: Alexander H. Gruenstein, Johan Schalkwyk, Matthew Sharifi
  • Patent number: 11244668
    Abstract: A method for generating speech animation from an audio signal includes: receiving the audio signal; transforming the received audio signal into frequency-domain audio features; performing neural-network processing on the frequency-domain audio features to recognize phonemes, wherein the neural-network processing is performed using a neural network trained with a phoneme dataset comprising audio signals with corresponding ground-truth phoneme labels; and generating the speech animation from the recognized phonemes.
    Type: Grant
    Filed: May 29, 2020
    Date of Patent: February 8, 2022
    Assignee: TCL RESEARCH AMERICA INC.
    Inventors: Zixiao Yu, Haohong Wang
  • Patent number: 11238877
    Abstract: Proposed are a generative adversarial network-based speech bandwidth extender and extension method. A generative adversarial network-based speech bandwidth extension method, according to an embodiment, comprises the steps of: extracting feature vectors from a narrowband (NB) signal and a wideband (WB) signal of a speech; estimating the feature vector of the wideband signal from the feature vector of the narrowband signal; and learning a deep neural network classification model for discriminating the estimated feature vector of the wideband signal from the actually extracted feature vector of the wideband signal and the actually extracted feature vector of the narrowband signal.
    Type: Grant
    Filed: May 17, 2018
    Date of Patent: February 1, 2022
    Assignee: IUCF-HYU (INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY)
    Inventors: Joon-Hyuk Chang, Kyoungjin Noh
  • Patent number: 11238881
    Abstract: A method of decomposing digital signals using non-negative matrix factorization by generating an initial set of values in a row in the weight matrix from a ratio of a first function of a first signal of a plurality of digital signals divided by a second function of at least two other signals of the plurality of the digital signals, wherein the row in the weight matrix determines a decomposition of the plurality of digital signals into signal components.
    Type: Grant
    Filed: July 25, 2019
    Date of Patent: February 1, 2022
    Assignee: ACCUSONUS, INC.
    Inventors: Elias Kokkinis, Alexandros Tsilfidis
  • Patent number: 11227614
    Abstract: A system and method of recording and transmitting compressed audio signals over a network is disclosed. The end node device first converts the audio signal to a spectrogram, which is commonly used by machine learning algorithms to perform speech recognition. The end node device then compresses the spectrogram prior to transmission. In certain embodiments, the compression is performed using Discrete Cosine Transforms (DCT). Furthermore, in some embodiments, the DCT is performed on the difference between two columns of the spectrogram. Further, in some embodiments, a function that replaces values below a predetermined threshold with zeroes in the Encoded Spectrogram is utilized. These functions may be performed in hardware or software.
    Type: Grant
    Filed: June 11, 2020
    Date of Patent: January 18, 2022
    Assignee: Silicon Laboratories Inc.
    Inventors: Antonio Torrini, Sebastian Ahmed
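The compression steps named in the abstract above (column differences, DCT, zeroing small values) can be sketched in plain Python. Several details are assumptions, not taken from the patent: the first column is differenced against zeros, each difference uses the original previous column, the DCT is an unnormalized DCT-II, and "below a threshold" is interpreted as magnitude below the threshold.

```python
import math

def dct2(x):
    """Unnormalized DCT-II of a list of numbers."""
    n = len(x)
    return [
        sum(x[k] * math.cos(math.pi * (k + 0.5) * u / n) for k in range(n))
        for u in range(n)
    ]

def compress_spectrogram(columns, threshold):
    """Encode each spectrogram column as the thresholded DCT of its
    difference from the previous column (assumed scheme)."""
    encoded = []
    prev = [0.0] * len(columns[0])
    for col in columns:
        diff = [c - p for c, p in zip(col, prev)]
        coeffs = dct2(diff)
        # Replace small coefficients with zero so the result compresses well.
        encoded.append([c if abs(c) >= threshold else 0.0 for c in coeffs])
        prev = col
    return encoded
```

When consecutive columns are similar, the difference is close to zero and most coefficients fall under the threshold, which is where the saving would come from in this sketch.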
  • Patent number: 11222644
    Abstract: The purpose of the present invention is to estimate, with a small amount of computation, a linear prediction synthesis filter after conversion of an internal sampling frequency. A linear prediction coefficient conversion device is a device that converts first linear prediction coefficients calculated at a first sampling frequency to second linear prediction coefficients at a second sampling frequency different from the first sampling frequency, which includes a means for calculating, on the real axis of the unit circle, a power spectrum corresponding to the second linear prediction coefficients at the second sampling frequency based on the first linear prediction coefficients or an equivalent parameter, a means for calculating, on the real axis of the unit circle, autocorrelation coefficients from the power spectrum, and a means for converting the autocorrelation coefficients to the second linear prediction coefficients at the second sampling frequency.
    Type: Grant
    Filed: June 9, 2020
    Date of Patent: January 11, 2022
    Assignee: NTT DOCOMO, INC.
    Inventors: Nobuhiko Naka, Vesa Ruoppila
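Only the last step of the abstract above, turning autocorrelation coefficients into linear prediction coefficients, is sketched here. The Levinson-Durbin recursion is used as a standard way to do this; the abstract does not say which method the patent uses, so treat the choice as an assumption. The power-spectrum and autocorrelation steps are not shown.

```python
def levinson_durbin(r, order):
    """Solve for prediction-error filter coefficients from autocorrelations.

    r     -- autocorrelation values r[0..order], with r[0] > 0
    order -- prediction order
    Returns (a, err) where A(z) = 1 + a[1] z^-1 + ... + a[order] z^-order
    and err is the final prediction error energy.
    """
    a = [1.0] + [0.0] * order
    err = r[0]
    for i in range(1, order + 1):
        # Correlation of the current predictor's residual with lag i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        k = -acc / err  # reflection coefficient
        new_a = a[:]
        for j in range(1, i):
            new_a[j] = a[j] + k * a[i - j]
        new_a[i] = k
        a = new_a
        err *= (1.0 - k * k)
    return a, err
```

For the autocorrelation sequence of a first-order process, for example `r = [1.0, 0.5, 0.25]`, the recursion recovers a single non-zero coefficient of about -0.5 and leaves the second-order coefficient at zero.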
  • Patent number: 11217261
    Abstract: In methods and apparatus and non-transitory memory units for encoding/decoding audio signal information, the encoder side may determine if a signal frame is useful for long-term post filtering (LTPF) and/or packet loss concealment (PLC) and may encode information in accordance with the results of the determination, and the decoder side may apply the LTPF and/or PLC in accordance with the information obtained from the encoder.
    Type: Grant
    Filed: May 6, 2020
    Date of Patent: January 4, 2022
    Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
    Inventors: Emmanuel Ravelli, Adrian Tomasek, Manfred Lutzky, Conrad Benndorf