Patents Examined by Bryan S Blankenagel

Methods and apparatus for unified speech and audio decoding QMF based harmonic transposer improvements

Patent number: 11315584

Abstract: The present disclosure relates to an apparatus for decoding an encoded Unified Audio and Speech stream. The apparatus comprises a core decoder for decoding the encoded Unified Audio and Speech stream. The core decoder includes an eSBR unit for extending a bandwidth of an input signal, the eSBR unit including a QMF based harmonic transposer. The QMF based harmonic transposer is configured to process the input signal in the QMF domain, in each of a plurality of synthesis subbands, to extend the bandwidth of the input signal. The QMF based harmonic transposer is configured to operate at least in part based on pre-computed information. The present disclosure further relates to corresponding methods and storage media.

Type: Grant

Filed: December 19, 2018

Date of Patent: April 26, 2022

Assignee: Dolby International AB

Inventors: Rajat Kumar, Ramesh Katuri, Saketh Sathuvalli, Reshma Rai
Audio decoder supporting a set of different loss concealment tools

Patent number: 11315580

Abstract: An assignment of one of a set of different loss concealment tools of an audio decoder to a portion of the audio signal to be decoded from a data stream, which portion is affected by loss (that is, the selection out of the set of different loss concealment tools), may be made in a manner leading to a more pleasant loss concealment if the assignment/selection is based on two measures: a first measure of the spectral position of a spectral centroid of a spectrum of the audio signal, and a second measure of the temporal predictability of the audio signal. The assigned or selected loss concealment tool may then be used to recover the portion of the audio signal.
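
As a rough illustration of the two measures named in this abstract, the sketch below computes a spectral centroid and a simple temporal-predictability score, then picks a tool from them. The function names, the tool names, the normalized-correlation definition of predictability, and both thresholds are assumptions made for illustration; the abstract does not specify any of them.

```python
import math

def spectral_centroid(magnitudes, sample_rate):
    """Magnitude-weighted mean frequency; assumes bins span 0 Hz..Nyquist evenly."""
    total = sum(magnitudes)
    if total == 0:
        return 0.0
    bin_hz = sample_rate / (2 * (len(magnitudes) - 1))
    return sum(i * bin_hz * m for i, m in enumerate(magnitudes)) / total

def temporal_predictability(samples, lag):
    """Normalized correlation between the signal and itself delayed by `lag` (assumed measure)."""
    a, b = samples[lag:], samples[:-lag]
    num = sum(x * y for x, y in zip(a, b))
    den = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
    return num / den if den else 0.0

def select_tool(centroid_hz, predictability):
    """Hypothetical decision rule: thresholds and tool names are illustrative only."""
    if predictability > 0.8 and centroid_hz < 4000.0:
        return "pitch_based_concealment"
    return "spectral_concealment"

# Toy inputs: a single spectral peak and a sine with a period of 8 samples.
centroid = spectral_centroid([0, 0, 1, 0, 0], 16000)
tone = [math.sin(2 * math.pi * k / 8) for k in range(64)]
predictability = temporal_predictability(tone, 8)
chosen = select_tool(1000.0, predictability)
```

A strongly periodic, low-centroid signal lands on the pitch-based tool in this toy rule; a real decoder would presumably tune or learn the decision boundary.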

Type: Grant

Filed: May 6, 2020

Date of Patent: April 26, 2022

Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.

Inventors: Adrian Tomasek, Emmanuel Ravelli, Markus Schnell, Alexander Tschekalinskij, Michael Schnabel, Ralph Sperschneider
Encoding audio metadata in an audio frame

Patent number: 11315581

Abstract: Techniques for encoding audio data with metadata are described. In an example, a device receives audio data corresponding to audio detected by a microphone and receives metadata associated with the audio. The device generates encoded data based at least in part on encoding the audio data with the metadata. The encoding involves replacing a portion of the audio data with the metadata, such that the encoded data includes the metadata and a remaining portion of the audio data. The device sends the encoded data to an audio processing application.

Type: Grant

Filed: August 17, 2020

Date of Patent: April 26, 2022

Assignee: Amazon Technologies, Inc.

Inventors: Aditya Sharadchandra Joshi, Carlo Murgia, Michael Thomas Peterson
Signal analysis device, signal analysis method, and signal analysis program

Patent number: 11302343

Abstract: A signal analysis device includes an estimation unit that models a sound source position occurrence probability matrix Q using a product of a sound source position probability matrix B and a sound source existence probability matrix A, and estimates at least one of the sound source position probability matrix B and the sound source existence probability matrix A based on the modeling, the sound source position occurrence probability matrix Q being composed of probabilities of arrival of a signal from each sound source position candidate per frame, which is a time section, with respect to a plurality of sound source position candidates. The sound source position probability matrix B being composed of probabilities of arrival of a signal from each sound source position candidate per sound source with respect to a plurality of sound sources.
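
The factorization described in this abstract, Q modeled as the product of B and A, can be sketched with a toy example. The matrix sizes and values below are hypothetical, and normalizing each column of A to sum to 1 is an assumption of this example rather than something the abstract states. Only the forward model is shown, not the estimation step.

```python
def matmul(b, a):
    """Multiply B (positions x sources) by A (sources x frames)."""
    return [[sum(b[i][k] * a[k][j] for k in range(len(a)))
             for j in range(len(a[0]))]
            for i in range(len(b))]

# Hypothetical toy values: 3 position candidates, 2 sources, 2 frames.
# Each column of B is a distribution over positions for one source.
B = [[0.8, 0.1],
     [0.1, 0.2],
     [0.1, 0.7]]
# Each column of A weights the sources within one frame (assumed normalized).
A = [[1.0, 0.25],
     [0.0, 0.75]]

# Q[i][j]: modeled probability that a signal arrives from position i in frame j.
Q = matmul(B, A)
column_sums = [sum(Q[i][j] for i in range(len(Q))) for j in range(len(Q[0]))]
```

With both B and A column-normalized, each column of Q is again a probability distribution over the position candidates.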

Type: Grant

Filed: April 4, 2019

Date of Patent: April 12, 2022

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Nobutaka Ito, Tomohiro Nakatani, Shoko Araki
Backward-compatible integration of high frequency reconstruction techniques for audio signals

Patent number: 11289106

Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag.

Type: Grant

Filed: January 28, 2019

Date of Patent: March 29, 2022

Assignee: Dolby International AB

Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
Method for utterance generation, smart device, and computer readable storage medium

Patent number: 11282502

Abstract: A computer-implemented method for utterance generation, a smart device, and a non-transitory computer readable storage medium are provided. The method includes: obtaining a first utterance to be answered, generating at least one random semantic vector, inputting the at least one random semantic vector and the first utterance into a trained generator, and obtaining at least one first answer outputted by the trained generator, wherein the trained generator is obtained based on a preset generative adversarial network. Due to the random semantic vector, even for the same utterance, the smart device can generate different answers corresponding to the different random semantic vectors, the possibility of generating too many identical answers during the human-machine conversation is reduced, and the fun during the human-machine conversation is enhanced.

Type: Grant

Filed: August 31, 2020

Date of Patent: March 22, 2022

Assignee: UBTECH ROBOTICS CORP LTD

Inventors: Rixing Huang, Youjun Xiong
Method and device for efficiently distributing a bit-budget in a CELP codec

Patent number: 11276412

Abstract: A method and device allocates a bit-budget to a plurality of first parts of a CELP core module of (a) an encoder for encoding a sound signal or (b) a decoder for decoding the sound signal. In the method and device, bit-budget allocation tables assign, for each of a plurality of intermediate bit rates, respective bit-budgets to the first CELP core module parts. A CELP core module bit rate is determined and one of the intermediate bit rates is selected based on the determined CELP core module bit rate. The respective bit-budgets assigned by the bit-budget allocation tables for the selected intermediate bit rate are allocated to the first CELP core module parts.
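
The table lookup described in this abstract might look like the sketch below. The table contents, the part names (`lpc`, `pitch`, `gains`), and the selection rule (the highest intermediate rate not exceeding the core bit rate, clamped to the lowest entry) are all assumptions for illustration; the abstract says only that an intermediate rate is selected based on the core module bit rate.

```python
from bisect import bisect_right

# Hypothetical table: intermediate bit rate (bit/s) -> bits per frame for each part.
ALLOCATION_TABLE = {
    5000: {"lpc": 25, "pitch": 20, "gains": 14},
    8000: {"lpc": 36, "pitch": 26, "gains": 20},
    13200: {"lpc": 41, "pitch": 30, "gains": 24},
}
INTERMEDIATE_RATES = sorted(ALLOCATION_TABLE)

def select_intermediate_rate(core_bit_rate):
    """Assumed rule: highest intermediate rate <= core rate, else the lowest entry."""
    idx = bisect_right(INTERMEDIATE_RATES, core_bit_rate) - 1
    return INTERMEDIATE_RATES[max(idx, 0)]

def allocate(core_bit_rate):
    """Return the selected intermediate rate and the per-part bit-budgets."""
    rate = select_intermediate_rate(core_bit_rate)
    return rate, dict(ALLOCATION_TABLE[rate])

selected_rate, budgets = allocate(9600)
```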

Type: Grant

Filed: September 20, 2018

Date of Patent: March 15, 2022

Assignee: VOICEAGE CORPORATION

Inventor: Vaclav Eksler
Conversation print system and method

Patent number: 11275854

Abstract: A method, computer program product, and computing system for defining a conversation print for each of a plurality of known entities, thus defining a plurality of conversation prints. Voice-based content is received from a third-party. The voice-based content is compared to at least one of the plurality of conversation prints to identify the third party.

Type: Grant

Filed: July 18, 2018

Date of Patent: March 15, 2022

Assignee: NUANCE COMMUNICATIONS, INC.

Inventor: Haydar Talib
Conversation print system and method

Patent number: 11275855

Abstract: A method, computer program product, and computing system for defining a conversation print for each of a plurality of known fraudsters, thus defining a plurality of fraudster conversation prints. The plurality of fraudster conversation prints is processed to identify one or more fraudster commonalities. A fraudster conversation template is generated based, at least in part, upon the one or more fraudster commonalities.

Type: Grant

Filed: July 18, 2018

Date of Patent: March 15, 2022

Assignee: NUANCE COMMUNICATIONS, INC.

Inventor: Haydar Talib
Method and device for allocating a bit-budget between sub-frames in a CELP CODEC

Patent number: 11276411

Abstract: A method and device for allocating a bit-budget to a plurality of first parts and to a second part of a CELP core module of (a) an encoder for encoding a sound signal or (b) a decoder for decoding the sound signal. In a frame of the sound signal comprising sub-frames, respective bit-budgets are allocated to the first CELP core module parts and a bit-budget remaining after allocating to the first CELP core module parts their respective bit-budgets is allocated to the second CELP core module part. According to an alternative, the second CELP core module part bit-budget is distributed between the sub-frames of the frame and a larger bit-budget is allocated to at least one of the sub-frames of the frame. The at least one sub-frame may be the first sub-frame of the frame, at least one sub-frame following the first sub-frame, or the sub-frame using a glottal-impulse-shape codebook.

Type: Grant

Filed: September 20, 2018

Date of Patent: March 15, 2022

Assignee: VOICEAGE CORPORATION

Inventor: Vaclav Eksler
Conversation print system and method

Patent number: 11275853

Abstract: Conversation Print: A method, computer program product, and computing system for receiving voice-based content from a third-party. The voice-based content is processed to define a text-based transcript for the voice-based content. The voice-based content is processed to define speech-pattern indicia for the voice-based content. A conversation print for the voice-based content is generated based, at least in part, upon the text-based transcript and the speech-pattern indicia.

Type: Grant

Filed: July 18, 2018

Date of Patent: March 15, 2022

Assignee: NUANCE COMMUNICATIONS, INC.

Inventors: Haydar Talib, Peter Stubley
Speech coding using time-varying interpolation

Patent number: 11270714

Abstract: Encoding a sequence of digital speech samples into a bit stream includes dividing the digital speech samples into frames including N subframes (where N is an integer greater than 1); computing model parameters for the subframes, the model parameters including spectral parameters; and generating a representation of the frame. The representation includes information representing the spectral parameters of P subframes (where P is an integer and P<N) and information identifying the P subframes. The representation excludes information representing the spectral parameters of the N−P subframes not included in the P subframes.
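
A minimal sketch of the frame representation described in this abstract follows: only P of the N subframes carry spectral parameters, together with the indices identifying them. For simplicity each subframe has a single scalar parameter. The decoder side is not described in the abstract; the linear interpolation used here to fill in the omitted subframes is an assumption suggested only by the patent title, and all names are hypothetical.

```python
def encode_frame(spectral_params, kept):
    """Keep the parameters of the P subframes listed in `kept`, plus their indices."""
    kept = sorted(kept)
    assert 0 < len(kept) < len(spectral_params), "requires 0 < P < N"
    return {"kept": kept, "params": [spectral_params[i] for i in kept]}

def decode_frame(rep, n):
    """Rebuild N values; linear interpolation between kept subframes is an assumption."""
    kept, params = rep["kept"], rep["params"]
    out = []
    for i in range(n):
        if i <= kept[0]:
            out.append(params[0])        # hold the first kept value
        elif i >= kept[-1]:
            out.append(params[-1])       # hold the last kept value
        else:
            hi = next(j for j, k in enumerate(kept) if k >= i)
            lo = hi - 1
            t = (i - kept[lo]) / (kept[hi] - kept[lo])
            out.append(params[lo] + t * (params[hi] - params[lo]))
    return out

# N = 4 subframes, P = 2 kept (the first and the last).
rep = encode_frame([1.0, 2.0, 4.0, 8.0], kept=[0, 3])
rebuilt = decode_frame(rep, 4)
```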

Type: Grant

Filed: January 8, 2020

Date of Patent: March 8, 2022

Assignee: Digital Voice Systems, Inc.

Inventor: Thomas Clark
System and method for concept formatting

Patent number: 11270261

Abstract: A method, computer program product, and computer system for mapping, by a computing device, an automatic speech recognition output of a conversation to a concept marker and a verbalized version of a value associated with the concept marker based upon, at least in part, the automatic speech recognition output of the conversation. The concept marker and the verbalized version of the value associated with the concept marker may be replaced with a formatted version. A plurality of user selectable format configurations of the formatted version may be provided as a textual output in a user interface.

Type: Grant

Filed: February 8, 2019

Date of Patent: March 8, 2022

Assignee: NUANCE COMMUNICATIONS, INC.

Inventor: Paul Joseph Vozila
Recorded media hotword trigger suppression

Patent number: 11257498

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for hotword trigger suppression are disclosed. In one aspect, a method includes the actions of receiving, by a microphone of a computing device, audio corresponding to playback of an item of media content, the audio including an utterance of a predefined hotword that is associated with performing an operation on the computing device. The actions further include processing the audio. The actions further include in response to processing the audio, suppressing performance of the operation on the computing device.

Type: Grant

Filed: November 20, 2020

Date of Patent: February 22, 2022

Assignee: Google LLC

Inventors: Alexander H. Gruenstein, Johan Schalkwyk, Matthew Sharifi
Device and method for generating speech animation

Patent number: 11244668

Abstract: A method for generating speech animation from an audio signal includes: receiving the audio signal; transforming the received audio signal into frequency-domain audio features; performing neural-network processing on the frequency-domain audio features to recognize phonemes, wherein the neural-network processing is performed using a neural network trained with a phoneme dataset comprising audio signals with corresponding ground-truth phoneme labels; and generating the speech animation from the recognized phonemes.

Type: Grant

Filed: May 29, 2020

Date of Patent: February 8, 2022

Assignee: TCL RESEARCH AMERICA INC.

Inventors: Zixiao Yu, Haohong Wang
Generative adversarial network-based speech bandwidth extender and extension method

Patent number: 11238877

Abstract: Proposed are a generative adversarial network-based speech bandwidth extender and extension method. A generative adversarial network-based speech bandwidth extension method, according to an embodiment, comprises the steps of: extracting feature vectors from a narrowband (NB) signal and a wideband (WB) signal of a speech; estimating the feature vector of the wideband signal from the feature vector of the narrowband signal; and learning a deep neural network classification model for discriminating the estimated feature vector of the wideband signal from the actually extracted feature vector of the wideband signal and the actually extracted feature vector of the narrowband signal.

Type: Grant

Filed: May 17, 2018

Date of Patent: February 1, 2022

Assignee: IUCF-HYU (INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY)

Inventors: Joon-Hyuk Chang, Kyoungjin Noh
Weight matrix initialization method to improve signal decomposition

Patent number: 11238881

Abstract: A method of decomposing digital signals using non-negative matrix factorization by generating an initial set of values in a row in the weight matrix from a ratio of a first function of a first signal of a plurality of digital signals divided by a second function of at least two other signals of the plurality of the digital signals, wherein the row in the weight matrix determines a decomposition of the plurality of digital signals into signal components.

Type: Grant

Filed: July 25, 2019

Date of Patent: February 1, 2022

Assignee: ACCUSONUS, INC.

Inventors: Elias Kokkinis, Alexandros Tsilfidis
End node spectrogram compression for machine learning speech recognition

Patent number: 11227614

Abstract: A system and method of recording and transmitting compressed audio signals over a network is disclosed. The end node device first converts the audio signal to a spectrogram, which is commonly used by machine learning algorithms to perform speech recognition. The end node device then compresses the spectrogram prior to transmission. In certain embodiments, the compression is performed using Discrete Cosine Transforms (DCT). Furthermore, in some embodiments, the DCT is performed on the difference between two columns of the spectrogram. Further, in some embodiments, a function that replaces values below a predetermined threshold with zeroes in the Encoded Spectrogram is utilized. These functions may be performed in hardware or software.
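
The compression steps mentioned in this abstract (DCT of the difference between spectrogram columns, then zeroing small values) can be sketched as follows. Several details are assumptions not stated in the abstract: the unnormalized DCT-II, differencing each column against its predecessor, treating the first column's predecessor as all zeros, and comparing absolute coefficient values against the threshold.

```python
import math

def dct2(x):
    """Unnormalized DCT-II of a sequence (direct O(n^2) form)."""
    n = len(x)
    return [sum(x[k] * math.cos(math.pi * (k + 0.5) * i / n) for k in range(n))
            for i in range(n)]

def compress(spectrogram_columns, threshold):
    """DCT each column-to-column difference, then zero coefficients below threshold."""
    encoded = []
    previous = [0.0] * len(spectrogram_columns[0])  # assumed reference for column 0
    for column in spectrogram_columns:
        diff = [c - p for c, p in zip(column, previous)]
        coeffs = dct2(diff)
        encoded.append([c if abs(c) >= threshold else 0.0 for c in coeffs])
        previous = column
    return encoded

# Two flat columns: each difference is constant, so only the DC term survives.
encoded = compress([[1.0, 1.0, 1.0, 1.0], [2.0, 2.0, 2.0, 2.0]], threshold=0.5)
```

Slowly changing spectrograms produce small differences, so most coefficients fall under the threshold and the encoded output is sparse.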

Type: Grant

Filed: June 11, 2020

Date of Patent: January 18, 2022

Assignee: Silicon Laboratories Inc.

Inventors: Antonio Torrini, Sebastian Ahmed
Linear prediction coefficient conversion device and linear prediction coefficient conversion method

Patent number: 11222644

Abstract: The purpose of the present invention is to estimate, with a small amount of computation, a linear prediction synthesis filter after conversion of an internal sampling frequency. A linear prediction coefficient conversion device is a device that converts first linear prediction coefficients calculated at a first sampling frequency to second linear prediction coefficients at a second sampling frequency different from the first sampling frequency, which includes a means for calculating, on the real axis of the unit circle, a power spectrum corresponding to the second linear prediction coefficients at the second sampling frequency based on the first linear prediction coefficients or an equivalent parameter, a means for calculating, on the real axis of the unit circle, autocorrelation coefficients from the power spectrum, and a means for converting the autocorrelation coefficients to the second linear prediction coefficients at the second sampling frequency.

Type: Grant

Filed: June 9, 2020

Date of Patent: January 11, 2022

Assignee: NTT DOCOMO, INC.

Inventors: Nobuhiko Naka, Vesa Ruoppila
Encoding and decoding audio signals

Patent number: 11217261

Abstract: In methods and apparatus and non-transitory memory units for encoding/decoding audio signal information, the encoder side may determine if a signal frame is useful for long term post filtering (LTPF) and/or packet loss concealment (PLC) and may encode information in accordance with the results of the determination, and the decoder side may apply the LTPF and/or PLC in accordance with the information obtained from the encoder.

Type: Grant

Filed: May 6, 2020

Date of Patent: January 4, 2022

Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.

Inventors: Emmanuel Ravelli, Adrian Tomasek, Manfred Lutzky, Conrad Benndorf
