Patents Examined by Shaun Roberts
  • Patent number: 11664040
    Abstract: An apparatus for processing an audio signal includes an audio signal analyzer and a filter. The audio signal analyzer is configured to analyze an audio signal to determine a plurality of noise suppression filter values for a plurality of bands of the audio signal, wherein the analyzer is configured to determine a noise suppression filter value so that a noise suppression filter value is greater than or equal to a minimum noise suppression filter value and so that the minimum noise suppression value depends on a characteristic of the audio signal. The filter is configured for filtering the audio signal, wherein the filter is adjusted based on the noise suppression filter values.
    Type: Grant
    Filed: March 23, 2021
    Date of Patent: May 30, 2023
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Anthony Lombard, Bernhard Birzer, Dirk Mahne, Edwin Mabande, Fabian Kuech, Emanuel Habets, Paolo Annibale
  • Patent number: 11664021
    Abstract: A method of biasing speech recognition includes receiving audio data encoding an utterance and obtaining a set of one or more biasing phrases corresponding to a context of the utterance. Each biasing phrase in the set of one or more biasing phrases includes one or more words. The method also includes processing, using a speech recognition model, acoustic features derived from the audio data and grapheme and phoneme data derived from the set of one or more biasing phrases to generate an output of the speech recognition model. The method also includes determining a transcription for the utterance based on the output of the speech recognition model.
    Type: Grant
    Filed: December 9, 2021
    Date of Patent: May 30, 2023
    Assignee: Google LLC
    Inventors: Rohit Prakash Prabhavalkar, Golan Pundak, Tara N. Sainath, Antoine Jean Bruguier
  • Patent number: 11646047
    Abstract: The present document relates to audio source coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR), as well as to digital effect processors, e.g. exciters, where generation of harmonic distortion adds brightness to the processed signal, and to time stretchers where a signal duration is prolonged with maintained spectral content. A system and method configured to generate a time stretched and/or frequency transposed signal from an input signal is described. The system comprises an analysis filterbank configured to provide an analysis subband signal from the input signal; wherein the analysis subband signal comprises a plurality of complex valued analysis samples, each having a phase and a magnitude. Furthermore, the system comprises a subband processing unit configured to determine a synthesis subband signal from the analysis subband signal using a subband transposition factor Q and a subband stretch factor S.
    Type: Grant
    Filed: May 23, 2022
    Date of Patent: May 9, 2023
    Assignee: Dolby International AB
    Inventor: Lars Villemoes
  • Patent number: 11636858
    Abstract: A language proficiency analyzer automatically evaluates a person's language proficiency by analyzing that person's oral communications with another person. The analyzer first enhances the quality of an audio recording of a conversation between the two people using a neural network that automatically detects loss features in the audio and adds those loss features back into the audio. The analyzer then performs a textual and audio analysis on the improved audio. Through textual analysis, the analyzer uses a multi-attention network to determine how focused one person is on the other and/or how pleased one person is with the other. Through audio analysis, the analyzer uses a neural network to determine how well one person pronounced words during the conversation.
    Type: Grant
    Filed: October 12, 2021
    Date of Patent: April 25, 2023
    Assignee: Bank of America Corporation
    Inventors: Madhusudhanan Krishnamoorthy, Harikrishnan Rajeev
  • Patent number: 11626108
    Abstract: A method of operating a customer utterance analysis system includes obtaining a subset of utterances from among a first set of utterances. The method includes encoding, by a sentence encoder, the subset of utterances into multi-dimensional vectors. The method includes generating reduced-dimensionality vectors by reducing a dimensionality of the multi-dimensional vectors. Each vector of the reduced-dimensionality vectors corresponds to an utterance from among the subset of utterances. The method includes performing clustering on the reduced-dimensionality vectors. The method includes, based on the clustering performed on the reduced-dimensionality vectors, arranging the subset of utterances into clusters. The method includes obtaining labels for at least two clusters from among the clusters. The method includes generating training data based on the obtained labels. The method includes training a neural network model to predict an intent of an utterance based on the training data.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: April 11, 2023
    Assignee: TD Ameritrade IP Company, Inc.
    Inventors: Abhilash Krishnankutty Nair, Amaris Yuseon Sim, Dayanand Narregudem, Drew David Riassetto, Logan Sommers Ahlstrom, Nafiseh Saberian, Stephen Filios, Ravindra Reddy Tappeta Venkata
  • Patent number: 11626101
    Abstract: Systems and methods are described for processing and interpreting audible commands spoken in one or more languages. Speech recognition systems disclosed herein may be used as a stand-alone speech recognition system or comprise a portion of another content consumption system. A requesting user may provide audio input (e.g., command data) to the speech recognition system via a computing device to request an entertainment system to perform one or more operational commands. The speech recognition system may analyze the audio input across a variety of linguistic models, and may parse the audio input to identify a plurality of phrases and corresponding action classifiers. In some embodiments, the speech recognition system may utilize the action classifiers and other information to determine the one or more identified phrases that appropriately match the desired intent and operational command associated with the user's spoken command.
    Type: Grant
    Filed: October 28, 2021
    Date of Patent: April 11, 2023
    Assignee: Comcast Cable Communications, LLC
    Inventors: George Thomas Des Jardins, Vikrant Sagar
  • Patent number: 11626115
    Abstract: Implementations relate to dynamically, and in a context-sensitive manner, biasing voice to text conversion. In some implementations, the biasing of voice to text conversions is performed by a voice to text engine of a local agent, and the biasing is based at least in part on content provided to the local agent by a third-party (3P) agent that is in network communication with the local agent. In some of those implementations, the content includes contextual parameters that are provided by the 3P agent in combination with responsive content generated by the 3P agent during a dialog that is between the 3P agent and a user of a voice-enabled electronic device, and that is facilitated by the local agent. The contextual parameters indicate potential feature(s) of further voice input that is to be provided in response to the responsive content generated by the 3P agent.
    Type: Grant
    Filed: January 24, 2022
    Date of Patent: April 11, 2023
    Assignee: GOOGLE LLC
    Inventors: Barnaby James, Bo Wang, Sunil Vemuri, David Schairer, Ulas Kirazci, Ertan Dogrultan, Petar Aleksic
  • Patent number: 11620990
    Abstract: A method for optimizing speech recognition includes receiving a first acoustic segment characterizing a hotword detected by a hotword detector in streaming audio captured by a user device, extracting one or more hotword attributes from the first acoustic segment, and adjusting, based on the one or more hotword attributes extracted from the first acoustic segment, one or more speech recognition parameters of an automated speech recognition (ASR) model. After adjusting the speech recognition parameters of the ASR model, the method also includes processing, using the ASR model, a second acoustic segment to generate a speech recognition result. The second acoustic segment characterizes a spoken query/command that follows the first acoustic segment in the streaming audio captured by the user device.
    Type: Grant
    Filed: December 11, 2020
    Date of Patent: April 4, 2023
    Assignee: Google LLC
    Inventors: Matthew Sharifi, Aleksandar Kracun
  • Patent number: 11621011
    Abstract: Described herein is a method of decoding an audio or speech signal, the method including the steps of: (a) receiving, by a decoder, a coded bitstream including the audio or speech signal and conditioning information; (b) providing, by a bitstream decoder, decoded conditioning information in a format associated with a first bitrate; (c) converting, by a converter, the decoded conditioning information from the format associated with the first bitrate to a format associated with a second bitrate; and (d) providing, by a generative neural network, a reconstruction of the audio or speech signal according to a probabilistic model conditioned by the conditioning information in the format associated with the second bitrate. Further described are an apparatus for decoding an audio or speech signal, a respective encoder, a system of the encoder and the apparatus for decoding an audio or speech signal, as well as a respective computer program product.
    Type: Grant
    Filed: October 29, 2019
    Date of Patent: April 4, 2023
    Assignee: Dolby International AB
    Inventors: Janusz Klejsa, Per Hedelin
  • Patent number: 11600281
    Abstract: There is disclosed inter alia an apparatus for spatial audio signal encoding comprising means for receiving for each time frequency block of a sub band of an audio frame a spatial audio parameter comprising an azimuth and an elevation; determining a first distortion measure for the audio frame by determining a first distance measure for each time frequency block and summing the first distance measure for each time frequency block; determining a second distortion measure for the audio frame by determining a second distance measure for each time frequency block and summing the second distance measure for each time frequency block; and selecting either the first quantization scheme or the second quantization scheme for quantizing the elevation and the azimuth for all time frequency blocks of the sub band of the audio frame, wherein the selecting is dependent on the first and second distortion measures.
    Type: Grant
    Filed: September 20, 2019
    Date of Patent: March 7, 2023
    Assignee: Nokia Technologies Oy
    Inventor: Adriana Vasilache
  • Patent number: 11594230
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, to facilitate language-independent speaker verification. In one aspect, a method includes actions of receiving, by a user device, audio data representing an utterance of a user. Other actions may include providing, to a neural network stored on the user device, input data derived from the audio data and a language identifier. The neural network may be trained using speech data representing speech in different languages or dialects. The method may include additional actions of generating, based on output of the neural network, a speaker representation and determining, based on the speaker representation and a second representation, that the utterance is an utterance of the user. The method may provide the user with access to the user device based on determining that the utterance is an utterance of the user.
    Type: Grant
    Filed: May 4, 2021
    Date of Patent: February 28, 2023
    Assignee: Google LLC
    Inventors: Ignacio Lopez Moreno, Li Wan, Quan Wang
  • Patent number: 11580981
    Abstract: An in-vehicle apparatus is connectable to a device that includes a voice assistant function. The in-vehicle apparatus includes: a voice detector that performs voice recognition of an audio signal input from a microphone and that controls functions of the in-vehicle apparatus based on a result of the voice recognition; and an interface that communicates with the device. When being informed of a detection of a predetermined word in the audio signal as the result of the voice recognition of the audio signal performed by the voice detector, the interface sends to the device, not via the voice detector, the audio signal input from the microphone. The predetermined word is for activating the voice assistant function of the device.
    Type: Grant
    Filed: March 3, 2021
    Date of Patent: February 14, 2023
    Assignee: DENSO TEN Limited
    Inventors: Katsuaki Hikima, Daisuke Yamasaki, Futoshi Kosuga
  • Patent number: 11568878
    Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance.
    Type: Grant
    Filed: April 16, 2021
    Date of Patent: January 31, 2023
    Assignee: GOOGLE LLC
    Inventors: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw
  • Patent number: 11568888
    Abstract: A terminal control method, a terminal and a non-transitory computer-readable storage medium are provided. The terminal control method includes: receiving, by a microphone, a detection audio signal emitted from a speaker and having a frequency within a pre-set detection frequency range; acquiring actual audio parameters of the detection audio signal when being received by the microphone, and original audio parameters of the detection audio signal when being emitted from the speaker; determining a relative state between the microphone and the speaker according to the actual audio parameters and the original audio parameters; determining a terminal control operation to be performed, according to the relative state and a pre-set correspondence between relative states and terminal control operations; and performing the determined terminal control operation on a terminal where the microphone is located.
    Type: Grant
    Filed: June 3, 2020
    Date of Patent: January 31, 2023
    Assignee: ZTE CORPORATION
    Inventors: Shaowu Shen, Liting Liu
  • Patent number: 11568858
    Abstract: A computer-implemented method of building a multilingual acoustic model for automatic speech recognition in a low resource setting includes training a multilingual network on a set of training languages with original transcribed training data to create a baseline multilingual acoustic model. Transliteration of transcribed training data is performed by processing through the multilingual network a plurality of multilingual data types from the set of languages, and outputting a pool of transliterated data. A filtering metric is applied to the pool of transliterated data output to select one or more portions of the transliterated data for retraining of the acoustic model. Data augmentation is performed by adding one or more selected portions of the output transliterated data back to the original transcribed training data to update training data. The training of a new multilingual acoustic model through the multilingual network is performed using the updated training data.
    Type: Grant
    Filed: October 17, 2020
    Date of Patent: January 31, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Samuel Thomas, Kartik Audhkhasi, Brian E. D. Kingsbury
  • Patent number: 11562764
    Abstract: An apparatus for generating a bandwidth enhanced audio signal from an input audio signal having an input audio signal frequency range includes: a raw signal generator configured for generating a raw signal having an enhancement frequency range, wherein the enhancement frequency range is not included in the input audio signal frequency range; a neural network processor configured for generating a parametric representation for the enhancement frequency range using the input audio frequency range of the input audio signal and a trained neural network; and a raw signal processor for processing the raw signal using the parametric representation for the enhancement frequency range to obtain a processed raw signal having frequency components in the enhancement frequency range, wherein the processed raw signal or the processed raw signal and the input audio signal frequency range of the input audio signal represent the bandwidth enhanced audio signal.
    Type: Grant
    Filed: April 17, 2020
    Date of Patent: January 24, 2023
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Konstantin Schmidt, Christian Uhle, Bernd Edler
  • Patent number: 11562737
    Abstract: Speech recognition may be improved by generating and using a topic specific language model. A topic specific language model may be created by performing an initial pass on an audio signal using a generic or basis language model. A speech recognition device may then determine topics relating to the audio signal based on the words identified in the initial pass and retrieve a corpus of text relating to those topics. Using the retrieved corpus of text, the speech recognition device may create a topic specific language model. In one example, the speech recognition device may adapt or otherwise modify the generic language model based on the retrieved corpus of text.
    Type: Grant
    Filed: December 27, 2019
    Date of Patent: January 24, 2023
    Assignee: TIVO CORPORATION
    Inventors: David F. Houghton, Seth Michael Murray, Sibley Verbeck Simon
  • Patent number: 11562757
    Abstract: An audio signal encoding method performed by an encoder includes identifying a time-domain audio signal in a unit of blocks, quantizing a linear prediction coefficient extracted from a combined block, in which a current original block of the audio signal and a previous original block chronologically adjacent to the current original block are combined, using frequency-domain linear predictive coding (LPC), generating a temporal envelope by dequantizing the quantized linear prediction coefficient, extracting a residual signal from the combined block based on the temporal envelope, quantizing the residual signal by one of time-domain quantization and frequency-domain quantization, and transforming the quantized residual signal and the quantized linear prediction coefficient into a bitstream.
    Type: Grant
    Filed: July 15, 2021
    Date of Patent: January 24, 2023
    Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
    Inventors: Seung Kwon Beack, Jongmo Sung, Mi Suk Lee, Tae Jin Lee, Woo-taek Lim, Inseon Jang, Jin Soo Choi
  • Patent number: 11562758
    Abstract: An encoder is operable to filter audio signals into a plurality of frequency band components, generate quantized digital components for each band, identify a potential for pre-echo events within the generated quantized digital components, generate an approximate signal by decoding the quantized digital components using inverse pulse code modulation, generate an error signal by comparing the approximate signal with the sampled audio signal, and process the error signal and quantized digital components. The encoder is operable to process the error signal by processing delayed audio signals and Q band values, determining the potential for pre-echo events from the Q band values, and determining scale factors and MDCT block sizes for the potential for pre-echo events.
    Type: Grant
    Filed: March 29, 2022
    Date of Patent: January 24, 2023
    Assignee: IMMERSION NETWORKS, INC.
    Inventors: James David Johnston, Stephen Daniel White, King Wei Hor, Barry M. Genova
  • Patent number: 11557287
    Abstract: Provided is a system which allows a learner who is a non-native speaker of a given language to intuitively improve pronunciation of the language. A pronunciation conversion apparatus includes a conversion section which converts a first feature value corresponding to a first speech signal obtained when a first speaker who speaks a given language as his/her native language speaks another language such that the first feature value approaches a second feature value corresponding to a second speech signal obtained when a second speaker who speaks the other language as his/her native language speaks the other language. Each of the first feature value and the second feature value is a feature value capable of representing a difference in pronunciation, and a speech signal obtained from the first feature value after the conversion is presented to the first speaker.
    Type: Grant
    Filed: April 9, 2019
    Date of Patent: January 17, 2023
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventor: Sadao Hiroya