Patents Examined by Shaun A. Roberts
  • Patent number: 11699439
    Abstract: Method for voice-based interactive communication using a digital assistant, wherein the method comprises: an attention detection step, in which the digital assistant detects a user attention and as a result is set into a listening mode; a speaker detection step, in which the digital assistant detects the user as a current speaker; a speech sound detection step, in which the digital assistant detects and records speech uttered by the current speaker, which speech sound detection step further comprises a lip movement detection step, in which the digital assistant detects a lip movement of the current speaker; a speech analysis step, in which the digital assistant parses said recorded speech and extracts speech-based verbal informational content from said recorded speech; and a subsequent response step, in which the digital assistant provides feedback to the user based on said recorded speech.
    Type: Grant
    Filed: December 21, 2020
    Date of Patent: July 11, 2023
    Assignee: Tobii AB
    Inventors: Erland George-Svahn, Sourabh Pateriya, Onur Kurt, Deepak Akkil
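The claimed sequence of steps amounts to a small state machine. Below is a minimal Python sketch of that flow; the attention, lip-movement, transcription, and response callables are hypothetical stand-ins for the real gaze/vision/speech models, not anything disclosed in the patent:

```python
from enum import Enum, auto

class State(Enum):
    IDLE = auto()       # waiting for user attention
    LISTENING = auto()  # attention detected, recording speech

class DigitalAssistant:
    """Toy state machine mirroring the claimed steps."""

    def __init__(self, detects_attention, detects_lips_moving,
                 transcribe, respond):
        self.state = State.IDLE
        self.detects_attention = detects_attention
        self.detects_lips_moving = detects_lips_moving
        self.transcribe = transcribe
        self.respond = respond
        self.recorded = []

    def process_frame(self, frame):
        # Attention detection step: switch into listening mode.
        if self.state is State.IDLE:
            if self.detects_attention(frame):
                self.state = State.LISTENING
            return None
        # Speech sound detection gated by lip movement detection:
        # record audio only while the current speaker's lips move.
        if self.detects_lips_moving(frame):
            self.recorded.append(frame["audio"])
            return None
        # Lips stopped: speech analysis step, then response step.
        if self.recorded:
            text = self.transcribe(self.recorded)
            self.recorded = []
            self.state = State.IDLE
            return self.respond(text)
        return None
```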
  • Patent number: 11694026
    Abstract: A computer-implemented method includes: receiving, by a computing device, an input file defining correct spellings of one or more transliterated words; generating, by the computing device, suffix outputs based on the one or more transliterated words; generating, by the computing device, a dictionary that maps the suffix outputs to the one or more transliterated words; recognizing, by the computing device, an alternatively spelled transliterated word included in a document as one of the one or more correctly spelled transliterated words using the dictionary; and outputting, by the computing device, information corresponding to the recognized transliterated word.
    Type: Grant
    Filed: September 23, 2021
    Date of Patent: July 4, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: James E. Bostick, John M. Ganci, Jr., Martin G. Keen, Craig M. Trim
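A toy illustration of the suffix-dictionary idea: generate suffix-stripped forms ("suffix outputs") of each correctly spelled transliterated word, map them back to the canonical spelling, then recognize an alternative spelling by stripping its own suffixes. The specific stemming rule (every prefix down to a minimum stem length) is a simplifying assumption, not the patent's actual procedure:

```python
def build_suffix_dictionary(correct_spellings, min_stem=3):
    """Map suffix-stripped forms of each word to its canonical spelling."""
    dictionary = {}
    for word in correct_spellings:
        w = word.lower()
        # Every prefix of the word down to min_stem characters, i.e. the
        # word with progressively longer suffixes removed.
        for cut in range(min_stem, len(w) + 1):
            dictionary.setdefault(w[:cut], word)
    return dictionary

def recognize(token, dictionary, min_stem=3):
    """Match an alternatively spelled token by stripping its suffixes,
    longest remaining stem first."""
    t = token.lower()
    for cut in range(len(t), min_stem - 1, -1):
        if t[:cut] in dictionary:
            return dictionary[t[:cut]]
    return None
```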
  • Patent number: 11694707
    Abstract: A target speech signal extraction method for robust speech recognition includes: initializing a steering vector for a target speech source and an adaptive vector, setting a real output channel of the target speech source as an output by the adaptive vector, initializing adaptive vectors for a noise and setting a dummy channel as an output by the adaptive vectors for the noise; setting a cost function for minimizing dependency between a real output for the target speech source and a dummy output for the noise; setting an auxiliary function to the cost function, and updating the adaptive vector for the target speech source and the adaptive vectors for the noise by using the auxiliary function and the steering vector; estimating the target speech signal by using the adaptive vector thereby extracting the target speech signal from the input signals; and updating the steering vector for the target speech source.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: July 4, 2023
    Assignee: INDUSTRY-UNIVERSITY COOPERATION FOUNDATION SOGANG UNIVERSITY
    Inventors: Hyung Min Park, Uihyeop Shin
  • Patent number: 11694705
    Abstract: A sound signal processing system includes: a sound signal processing apparatus executing non-linear signal processing on a collected sound signal collected by a microphone, and transmitting, to an information processing apparatus, both a pre-execution sound signal before the non-linear signal processing is executed and a post-execution sound signal after the non-linear signal processing is executed; and the information processing apparatus receiving the pre-execution sound signal and the post-execution sound signal from the sound signal processing apparatus, and executing first processing on the pre-execution sound signal and executing second processing on the post-execution sound signal, the second processing being different from the first processing.
    Type: Grant
    Filed: July 16, 2019
    Date of Patent: July 4, 2023
    Assignee: Sony Interactive Entertainment Inc.
    Inventors: Yoshihiko Tamaru, Tetsuro Horikawa
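The dual-path routing in this claim can be sketched in a few lines. Here `nonlinear_fx`, `first_processing`, and `second_processing` are hypothetical placeholders (e.g. an echo suppressor, a recognition front end, and a playback chain); the patent does not name concrete algorithms for them:

```python
def process_sound(raw_frame, nonlinear_fx, first_processing, second_processing):
    """Send both copies of the collected signal and process each differently:
    the pre-execution signal (before the non-linear effect) gets the first
    processing, the post-execution signal gets the second."""
    pre = raw_frame                 # pre-execution sound signal
    post = nonlinear_fx(raw_frame)  # post-execution sound signal
    return first_processing(pre), second_processing(post)
```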
  • Patent number: 11676602
    Abstract: Implementations relate to generating and/or executing a customized interactive dialog application. The customized interactive dialog application may be generated from a state mapping tool that allows a user to generate custom states and custom transitions between the custom states. A customized configuration description is then generated based on the generated custom states and custom transitions. Further, a default configuration description is identified that includes additional or alternative states and transitions. In executing the customized interactive dialog application, dialog turns are generated based on the states and transition information, with the customized configuration description taking precedence and the default configuration description being utilized for any undefined states and/or transitions. Implementations additionally or alternatively relate to generating and/or executing a custom agent based on generated custom states and custom transitions, and a default configuration description.
    Type: Grant
    Filed: May 20, 2022
    Date of Patent: June 13, 2023
    Assignee: GOOGLE LLC
    Inventors: Uri First, Yang Sun
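The precedence rule described above (customized configuration first, default configuration for any undefined state or transition) reduces to an ordered lookup. A minimal sketch, with configurations represented as plain nested dicts purely for illustration:

```python
def next_turn(state, user_input, custom_config, default_config):
    """Resolve a dialog turn: consult the customized configuration
    description first, and fall back to the default configuration for
    any state/transition the custom description leaves undefined."""
    for config in (custom_config, default_config):
        transitions = config.get(state, {})
        if user_input in transitions:
            return transitions[user_input]  # (next_state, dialog_prompt)
    raise KeyError(f"no transition from {state!r} on {user_input!r}")
```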
  • Patent number: 11664040
    Abstract: An apparatus for processing an audio signal includes an audio signal analyzer and a filter. The audio signal analyzer is configured to analyze an audio signal to determine a plurality of noise suppression filter values for a plurality of bands of the audio signal, wherein the analyzer is configured to determine a noise suppression filter value so that a noise suppression filter value is greater than or equal to a minimum noise suppression filter value and so that the minimum noise suppression value depends on a characteristic of the audio signal. The filter is configured for filtering the audio signal, wherein the filter is adjusted based on the noise suppression filter values.
    Type: Grant
    Filed: March 23, 2021
    Date of Patent: May 30, 2023
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Anthony Lombard, Bernhard Birzer, Dirk Mahne, Edwin Mabande, Fabian Kuech, Emanuel Habets, Paolo Annibale
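The key claim element is a per-band gain floor that itself depends on a characteristic of the audio signal. A sketch of that idea, where the Wiener-style gain and the 0-to-1 tonality estimate are illustrative assumptions (the patent does not fix either the gain rule or the characteristic):

```python
def suppression_gains(band_snrs, tonality, floor_tonal=0.5, floor_noisy=0.1):
    """Per-band noise suppression gains clamped to a signal-dependent
    minimum: a more tonal signal gets a higher floor, keeping
    suppression gentle where artifacts would be audible."""
    floor = floor_noisy + (floor_tonal - floor_noisy) * tonality
    gains = []
    for snr in band_snrs:
        g = snr / (snr + 1.0)        # Wiener-style gain for this band
        gains.append(max(g, floor))  # enforce the minimum gain
    return gains
```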
  • Patent number: 11664021
    Abstract: A method of biasing speech recognition includes receiving audio data encoding an utterance and obtaining a set of one or more biasing phrases corresponding to a context of the utterance. Each biasing phrase in the set of one or more biasing phrases includes one or more words. The method also includes processing, using a speech recognition model, acoustic features derived from the audio data and grapheme and phoneme data derived from the set of one or more biasing phrases to generate an output of the speech recognition model. The method also includes determining a transcription for the utterance based on the output of the speech recognition model.
    Type: Grant
    Filed: December 9, 2021
    Date of Patent: May 30, 2023
    Assignee: Google LLC
    Inventors: Rohit Prakash Prabhavalkar, Golan Pundak, Tara N. Sainath, Antoine Jean Bruguier
  • Patent number: 11646047
    Abstract: The present document relates to audio source coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR), as well as to digital effect processors, e.g. exciters, where generation of harmonic distortion adds brightness to the processed signal, and to time stretchers where a signal duration is prolonged with maintained spectral content. A system and method configured to generate a time stretched and/or frequency transposed signal from an input signal is described. The system comprises an analysis filterbank configured to provide an analysis subband signal from the input signal; wherein the analysis subband signal comprises a plurality of complex valued analysis samples, each having a phase and a magnitude. Furthermore, the system comprises a subband processing unit configured to determine a synthesis subband signal from the analysis subband signal using a subband transposition factor Q and a subband stretch factor S.
    Type: Grant
    Filed: May 23, 2022
    Date of Patent: May 9, 2023
    Assignee: Dolby International AB
    Inventor: Lars Villemoes
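The core subband operation behind harmonic transposition is multiplying each complex analysis sample's phase by the transposition factor Q while preserving its magnitude. A bare-bones sketch of just that step (the full system also involves the stretch factor S and filterbank synthesis, omitted here):

```python
import cmath

def transpose_subband(samples, Q):
    """Scale the phase of each complex-valued analysis sample by Q,
    keeping its magnitude, to generate the transposed subband signal."""
    out = []
    for z in samples:
        mag, ph = abs(z), cmath.phase(z)
        out.append(cmath.rect(mag, ph * Q))
    return out
```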
  • Patent number: 11636858
    Abstract: A language proficiency analyzer automatically evaluates a person's language proficiency by analyzing that person's oral communications with another person. The analyzer first enhances the quality of an audio recording of a conversation between the two people using a neural network that automatically detects loss features in the audio and adds those loss features back into the audio. The analyzer then performs a textual and audio analysis on the improved audio. Through textual analysis, the analyzer uses a multi-attention network to determine how focused one person is on the other and/or how pleased one person is with the other. Through audio analysis, the analyzer uses a neural network to determine how well one person pronounced words during the conversation.
    Type: Grant
    Filed: October 12, 2021
    Date of Patent: April 25, 2023
    Assignee: Bank of America Corporation
    Inventors: Madhusudhanan Krishnamoorthy, Harikrishnan Rajeev
  • Patent number: 11626108
    Abstract: A method of operating a customer utterance analysis system includes obtaining a subset of utterances from among a first set of utterances. The method includes encoding, by a sentence encoder, the subset of utterances into multi-dimensional vectors. The method includes generating reduced-dimensionality vectors by reducing a dimensionality of the multi-dimensional vectors. Each vector of the reduced-dimensionality vectors corresponds to an utterance from among the subset of utterances. The method includes performing clustering on the reduced-dimensionality vectors. The method includes, based on the clustering performed on the reduced-dimensionality vectors, arranging the subset of utterances into clusters. The method includes obtaining labels for at least two clusters from among the clusters. The method includes generating training data based on the obtained labels. The method includes training a neural network model to predict an intent of an utterance based on the training data.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: April 11, 2023
    Assignee: TD Ameritrade IP Company, Inc.
    Inventors: Abhilash Krishnankutty Nair, Amaris Yuseon Sim, Dayanand Narregudem, Drew David Riassetto, Logan Sommers Ahlstrom, Nafiseh Saberian, Stephen Filios, Ravindra Reddy Tappeta Venkata
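The encode / reduce / cluster portion of this pipeline can be mimicked with deliberately tiny stand-ins: a bag-of-words "encoder", top-variance coordinate selection in place of a learned dimensionality reduction, and exact-match grouping in place of a real clustering algorithm. All three substitutions are assumptions for illustration only; the labeling and model-training steps are omitted:

```python
def embed(utterance, vocab):
    """Toy sentence encoder: bag-of-words counts over a fixed vocabulary."""
    words = utterance.lower().split()
    return [words.count(v) for v in vocab]

def reduce_dims(vectors, keep):
    """Toy dimensionality reduction: keep the highest-variance coordinates."""
    n = len(vectors)
    dims = len(vectors[0])
    means = [sum(v[i] for v in vectors) / n for i in range(dims)]
    var = [sum((v[i] - means[i]) ** 2 for v in vectors) for i in range(dims)]
    top = sorted(range(dims), key=lambda i: -var[i])[:keep]
    return [[v[i] for i in top] for v in vectors]

def cluster(vectors, utterances):
    """Group utterances whose reduced vectors coincide exactly."""
    groups = {}
    for vec, utt in zip(vectors, utterances):
        groups.setdefault(tuple(vec), []).append(utt)
    return groups
```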
  • Patent number: 11626101
    Abstract: Systems and methods are described for processing and interpreting audible commands spoken in one or more languages. Speech recognition systems disclosed herein may be used as a stand-alone speech recognition system or comprise a portion of another content consumption system. A requesting user may provide audio input (e.g., command data) to the speech recognition system via a computing device to request an entertainment system to perform one or more operational commands. The speech recognition system may analyze the audio input across a variety of linguistic models, and may parse the audio input to identify a plurality of phrases and corresponding action classifiers. In some embodiments, the speech recognition system may utilize the action classifiers and other information to determine the one or more identified phrases that appropriately match the desired intent and operational command associated with the user's spoken command.
    Type: Grant
    Filed: October 28, 2021
    Date of Patent: April 11, 2023
    Assignee: Comcast Cable Communications, LLC
    Inventors: George Thomas Des Jardins, Vikrant Sagar
  • Patent number: 11626115
    Abstract: Implementations relate to dynamically, and in a context-sensitive manner, biasing voice to text conversion. In some implementations, the biasing of voice to text conversions is performed by a voice to text engine of a local agent, and the biasing is based at least in part on content provided to the local agent by a third-party (3P) agent that is in network communication with the local agent. In some of those implementations, the content includes contextual parameters that are provided by the 3P agent in combination with responsive content generated by the 3P agent during a dialog that: is between the 3P agent, and a user of a voice-enabled electronic device; and is facilitated by the local agent. The contextual parameters indicate potential feature(s) of further voice input that is to be provided in response to the responsive content generated by the 3P agent.
    Type: Grant
    Filed: January 24, 2022
    Date of Patent: April 11, 2023
    Assignee: GOOGLE LLC
    Inventors: Barnaby James, Bo Wang, Sunil Vemuri, David Schairer, Ulas Kirazci, Ertan Dogrultan, Petar Aleksic
  • Patent number: 11620990
    Abstract: A method for optimizing speech recognition includes receiving a first acoustic segment characterizing a hotword detected by a hotword detector in streaming audio captured by a user device, extracting one or more hotword attributes from the first acoustic segment, and adjusting, based on the one or more hotword attributes extracted from the first acoustic segment, one or more speech recognition parameters of an automated speech recognition (ASR) model. After adjusting the speech recognition parameters of the ASR model, the method also includes processing, using the ASR model, a second acoustic segment to generate a speech recognition result. The second acoustic segment characterizes a spoken query/command that follows the first acoustic segment in the streaming audio captured by the user device.
    Type: Grant
    Filed: December 11, 2020
    Date of Patent: April 4, 2023
    Assignee: Google LLC
    Inventors: Matthew Sharifi, Aleksandar Kracun
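A sketch of the adjustment step: derive attributes from the hotword audio segment and adapt recognition parameters before decoding the query that follows. The specific attribute (mean absolute level as a loudness proxy) and the specific parameter changes are illustrative assumptions, not the patent's disclosed set:

```python
def adjust_asr_params(hotword_segment, defaults):
    """Extract a simple attribute from the hotword acoustic segment and
    tune ASR parameters for the second segment accordingly."""
    level = sum(abs(s) for s in hotword_segment) / len(hotword_segment)
    params = dict(defaults)
    if level < 0.1:  # softly spoken hotword suggests a quiet environment
        params["gain"] = defaults["gain"] * 2.0            # boost input gain
        params["beam_width"] = defaults["beam_width"] + 4  # widen search beam
    return params
```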
  • Patent number: 11621011
    Abstract: Described herein is a method of decoding an audio or speech signal, the method including the steps of: (a) receiving, by a decoder, a coded bitstream including the audio or speech signal and conditioning information; (b) providing, by a bitstream decoder, decoded conditioning information in a format associated with a first bitrate; (c) converting, by a converter, the decoded conditioning information from the format associated with the first bitrate to a format associated with a second bitrate; and (d) providing, by a generative neural network, a reconstruction of the audio or speech signal according to a probabilistic model conditioned by the conditioning information in the format associated with the second bitrate. Described are further an apparatus for decoding an audio or speech signal, a respective encoder, a system of the encoder and the apparatus for decoding an audio or speech signal as well as a respective computer program product.
    Type: Grant
    Filed: October 29, 2019
    Date of Patent: April 4, 2023
    Assignee: Dolby International AB
    Inventors: Janusz Klejsa, Per Hedelin
  • Patent number: 11600281
    Abstract: There is disclosed inter alia an apparatus for spatial audio signal encoding comprising means for receiving for each time frequency block of a sub band of an audio frame a spatial audio parameter comprising an azimuth and an elevation; determining a first distortion measure for the audio frame by determining a first distance measure for each time frequency block and summing the first distance measure for each time frequency block; determining a second distortion measure for the audio frame by determining a second distance measure for each time frequency block and summing the second distance measure for each time frequency block, and selecting either the first quantization scheme or the second quantization scheme for quantising the elevation and the azimuth for all time frequency blocks of the sub band of the audio frame, wherein the selecting is dependent on the first and second distortion measures.
    Type: Grant
    Filed: September 20, 2019
    Date of Patent: March 7, 2023
    Assignee: Nokia Technologies Oy
    Inventor: Adriana Vasilache
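The selection logic above is a per-frame comparison of two summed distortions. A compact sketch, where the squared angular error distance is a simplifying assumption and each scheme is modeled as a function quantizing one (elevation, azimuth) pair:

```python
def select_quantizer(blocks, scheme_a, scheme_b):
    """Quantize each time-frequency block's (elevation, azimuth) with both
    schemes, sum the per-block distances into per-frame distortion
    measures, and keep the scheme with the smaller total."""
    def distortion(scheme):
        total = 0.0
        for elev, azi in blocks:
            qe, qa = scheme(elev, azi)
            total += (elev - qe) ** 2 + (azi - qa) ** 2
        return total
    return scheme_a if distortion(scheme_a) <= distortion(scheme_b) else scheme_b
```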
  • Patent number: 11594230
    Abstract: Methods, systems, apparatus, including computer programs encoded on computer storage medium, to facilitate language independent-speaker verification. In one aspect, a method includes actions of receiving, by a user device, audio data representing an utterance of a user. Other actions may include providing, to a neural network stored on the user device, input data derived from the audio data and a language identifier. The neural network may be trained using speech data representing speech in different languages or dialects. The method may include additional actions of generating, based on output of the neural network, a speaker representation and determining, based on the speaker representation and a second representation, that the utterance is an utterance of the user. The method may provide the user with access to the user device based on determining that the utterance is an utterance of the user.
    Type: Grant
    Filed: May 4, 2021
    Date of Patent: February 28, 2023
    Assignee: Google LLC
    Inventors: Ignacio Lopez Moreno, Li Wan, Quan Wang
  • Patent number: 11580981
    Abstract: An in-vehicle apparatus is connectable to a device that includes a voice assistant function. The in-vehicle apparatus includes: a voice detector that performs voice recognition of an audio signal input from a microphone and that controls functions of the in-vehicle apparatus based on a result of the voice recognition; and an interface that communicates with the device. When being informed of a detection of a predetermined word in the audio signal as the result of the voice recognition of the audio signal performed by the voice detector, the interface sends to the device, not via the voice detector, the audio signal input from the microphone. The predetermined word is for activating the voice assistant function of the device.
    Type: Grant
    Filed: March 3, 2021
    Date of Patent: February 14, 2023
    Assignee: DENSO TEN Limited
    Inventors: Katsuaki Hikima, Daisuke Yamasaki, Futoshi Kosuga
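The routing described in this claim (raw microphone audio forwarded to the connected device when the activation word is detected, bypassing the in-vehicle recognizer) can be sketched as follows; the detector, device interface, and local command table are hypothetical stand-ins:

```python
def route_audio(frame, voice_detector, device_interface, local_commands):
    """Forward raw audio to the device when the assistant's activation
    word is detected; otherwise let the in-vehicle voice control handle
    any locally recognized command."""
    word = voice_detector(frame)
    if word == "wake_word":
        device_interface.send(frame)  # raw audio, not via the detector
        return "forwarded"
    if word in local_commands:
        return local_commands[word]()
    return "ignored"
```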
  • Patent number: 11568878
    Abstract: Techniques disclosed herein are directed towards streaming keyphrase detection which can be customized to detect one or more particular keyphrases, without requiring retraining of any model(s) for those particular keyphrase(s). Many implementations include processing audio data using a speaker separation model to generate separated audio data which isolates an utterance spoken by a human speaker from one or more additional sounds not spoken by the human speaker, and processing the separated audio data using a text independent speaker identification model to determine whether a verified and/or registered user spoke a spoken utterance captured in the audio data. Various implementations include processing the audio data and/or the separated audio data using an automatic speech recognition model to generate a text representation of the utterance.
    Type: Grant
    Filed: April 16, 2021
    Date of Patent: January 31, 2023
    Assignee: GOOGLE LLC
    Inventors: Rajeev Rikhye, Quan Wang, Yanzhang He, Qiao Liang, Ian C. McGraw
  • Patent number: 11568888
    Abstract: A terminal control method, a terminal and a non-transitory computer-readable storage medium are provided. The terminal control method includes: receiving, by a microphone, a detection audio signal emitted from a speaker and having a frequency within a pre-set detection frequency range; acquiring actual audio parameters of the detection audio signal when being received by the microphone, and original audio parameters of the detection audio signal when being emitted from the speaker; determining a relative state between the microphone and the speaker according to the actual audio parameters and the original audio parameters; determining a terminal control operation to be performed, according to the relative state and a pre-set correspondence between relative states and terminal control operations; and performing the determined terminal control operation on a terminal where the microphone is located.
    Type: Grant
    Filed: June 3, 2020
    Date of Patent: January 31, 2023
    Assignee: ZTE CORPORATION
    Inventors: Shaowu Shen, Liting Liu
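A sketch of the comparison step: the original (emitted) and actual (received) audio parameters of the detection tone are compared, with level loss suggesting distance and a frequency shift suggesting motion. The parameter names and thresholds here are illustrative assumptions, as is mapping the result directly to a state label:

```python
def relative_state(original, actual):
    """Infer the relative state between microphone and speaker from the
    emitted vs. received parameters of the detection audio signal."""
    loss_db = original["level_db"] - actual["level_db"]
    shift_hz = actual["freq_hz"] - original["freq_hz"]
    if shift_hz > 5:       # received pitch higher: closing distance
        return "approaching"
    if shift_hz < -5:      # received pitch lower: opening distance
        return "receding"
    return "near" if loss_db < 10 else "far"
```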
  • Patent number: 11568858
    Abstract: A computer-implemented method of building a multilingual acoustic model for automatic speech recognition in a low resource setting includes training a multilingual network on a set of training languages with an original transcribed training data to create a baseline multilingual acoustic model. Transliteration of transcribed training data is performed by processing through the multilingual network a plurality of multilingual data types from the set of languages, and outputting a pool of transliterated data. A filtering metric is applied to the pool of transliterated data output to select one or more portions of the transliterated data for retraining of the acoustic model. Data augmentation is performed by adding one or more selected portions of the output transliterated data back to the original transcribed training data to update training data. The training of a new multilingual acoustic model through the multilingual network is performed using the updated training data.
    Type: Grant
    Filed: October 17, 2020
    Date of Patent: January 31, 2023
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Samuel Thomas, Kartik Audhkhasi, Brian E. D. Kingsbury