Patents Examined by Michael N. Opsasnick
  • Patent number: 11367450
    Abstract: Systems and methods of diarization using linguistic labeling include receiving a set of diarized textual transcripts. At least one heuristic is automatedly applied to the diarized textual transcripts to select transcripts likely to be associated with an identified group of speakers. The selected transcripts are analyzed to create at least one linguistic model. The linguistic model is applied to transcripted audio data to label a portion of the transcripted audio data as having been spoken by the identified group of speakers. Still further embodiments of diarization using linguistic labeling may serve to label agent speech and customer speech in a recorded and transcripted customer service interaction.
    Type: Grant
    Filed: December 4, 2019
    Date of Patent: June 21, 2022
    Assignee: Verint Systems Inc.
    Inventors: Omer Ziv, Ran Achituv, Ido Shapira, Jeremie Dreyfuss
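    Illustrative sketch: a minimal Python rendering of the select-then-label flow described in the abstract above, assuming made-up agent cue phrases and a simple unigram frequency model as the "linguistic model"; it is not the patented implementation.
      # Hypothetical heuristic selection plus a unigram linguistic model used to
      # label transcript segments as agent or customer speech.
      from collections import Counter

      def select_agent_transcripts(transcripts, cue_phrases=("how can i help", "thank you for calling")):
          # Heuristic: keep transcripts containing scripted agent cue phrases.
          return [t for t in transcripts if any(c in t.lower() for c in cue_phrases)]

      def build_linguistic_model(selected):
          words = Counter(w for t in selected for w in t.lower().split())
          total = sum(words.values())
          return {w: n / total for w, n in words.items()}

      def label_segment(segment, model, threshold=0.02):
          words = segment.lower().split()
          score = sum(model.get(w, 0.0) for w in words) / max(len(words), 1)
          return "agent" if score > threshold else "customer"

      transcripts = ["Thank you for calling, how can I help you today?",
                     "My card was charged twice last month."]
      model = build_linguistic_model(select_agent_transcripts(transcripts))
      print(label_segment("How can I help you with that charge?", model))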
  • Patent number: 11355134
    Abstract: A method, system, and computer readable medium for decomposing an audio signal into different isolated sources. The techniques and mechanisms convert an audio signal into K input spectrogram fragments. The fragments are sent into a deep neural network to isolate for different sources. The isolated fragments are then combined to form full isolated source audio signals.
    Type: Grant
    Filed: October 2, 2020
    Date of Patent: June 7, 2022
    Assignee: AUDIOSHAKE, INC.
    Inventor: Luke Miner
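    Illustrative sketch: a rough Python version of the fragment-separate-recombine flow from the abstract above, with the deep neural network replaced by a placeholder identity function; fragment count and STFT settings are assumptions.
      import numpy as np
      from scipy.signal import stft, istft

      def fake_separation_net(fragment):
          # Placeholder: a real model would emit one masked spectrogram per source.
          return fragment

      def separate(audio, sr, k=4, nperseg=512):
          _, _, spec = stft(audio, fs=sr, nperseg=nperseg)      # full spectrogram
          fragments = np.array_split(spec, k, axis=-1)          # K spectrogram fragments in time
          isolated = [fake_separation_net(frag) for frag in fragments]
          full = np.concatenate(isolated, axis=-1)              # reassemble the isolated source
          _, out = istft(full, fs=sr, nperseg=nperseg)          # back to a time-domain signal
          return out

      sr = 16000
      audio = np.random.randn(sr)      # one second of noise as a stand-in input
      print(separate(audio, sr).shape)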
  • Patent number: 11341972
    Abstract: In one aspect, a method comprises accessing audio data generated by a computing device based on audio input from a user, the audio data encoding one or more user utterances. The method further comprises generating a first transcription of the utterances by performing speech recognition on the audio data using a first speech recognizer that employs a language model based on user-specific data. The method further comprises generating a second transcription of the utterances by performing speech recognition on the audio data using a second speech recognizer that employs a language model independent of user-specific data. The method further comprises determining that the second transcription of the utterances includes a term from a predefined set of one or more terms. The method further comprises, based on determining that the second transcription of the utterance includes the term, providing an output of the first transcription of the utterance.
    Type: Grant
    Filed: October 22, 2020
    Date of Patent: May 24, 2022
    Assignee: Google LLC
    Inventors: Alexander H. Gruenstein, Petar Aleksic
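    Illustrative sketch: the selection logic from the abstract above in a few lines of Python, with both recognizers stubbed out and the predefined term set invented for the example.
      # If the general (non-personalized) transcription contains a watched term,
      # surface the personalized transcription instead.
      PREDEFINED_TERMS = {"call", "text", "navigate"}

      def personalized_recognizer(audio):
          return "call Jakob on mobile"      # stand-in for the user-specific LM output

      def general_recognizer(audio):
          return "call jacob on mobile"      # stand-in for the generic LM output

      def transcribe(audio):
          first = personalized_recognizer(audio)
          second = general_recognizer(audio)
          if any(term in second.lower().split() for term in PREDEFINED_TERMS):
              return first                   # trust the personalized hypothesis
          return second

      print(transcribe(b"...audio bytes..."))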
  • Patent number: 11322154
    Abstract: Systems and methods of diarization using linguistic labeling include receiving a set of diarized textual transcripts. At least one heuristic is automatedly applied to the diarized textual transcripts to select transcripts likely to be associated with an identified group of speakers. The selected transcripts are analyzed to create at least one linguistic model. The linguistic model is applied to transcripted audio data to label a portion of the transcripted audio data as having been spoken by the identified group of speakers. Still further embodiments of diarization using linguistic labeling may serve to label agent speech and customer speech in a recorded and transcribed customer service interaction.
    Type: Grant
    Filed: December 4, 2019
    Date of Patent: May 3, 2022
    Assignee: Verint Systems Inc.
    Inventors: Omer Ziv, Ran Achituv, Ido Shapira, Jeremie Dreyfuss
  • Patent number: 11314481
    Abstract: Systems and methods for enabling voice-based interactions with electronic devices can include a data processing system maintaining a plurality of device action data sets and a respective identifier for each device action data set. The data processing system can receive, from an electronic device, an audio signal representing a voice query and an identifier. The data processing system can identify, using the identifier, a device action data set. The data processing system can identify a device action from the device action data set based on content of the audio signal. The data processing system can then identify, from the device action data set, a command associated with the device action and send the command to the electronic device for execution.
    Type: Grant
    Filed: May 7, 2018
    Date of Patent: April 26, 2022
    Assignee: GOOGLE LLC
    Inventors: Bo Wang, Venkat Kotla, Chad Yoshikawa, Chris Ramsdale, Pravir Gupta, Alfonso Gomez-Jordana, Kevin Yeun, Jae Won Seo, Lantian Zheng, Sang Soo Sung
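    Illustrative sketch: a hypothetical lookup flow matching the abstract above (identifier selects a device action data set, query content selects an action, and the associated command is returned); device names, phrases, and commands are all invented. The query text would normally come from speech recognition on the audio signal.
      DEVICE_ACTION_DATA = {
          "blender-v1": {"start blending": "MOTOR_ON", "stop blending": "MOTOR_OFF"},
          "thermostat-v2": {"raise temperature": "TEMP_UP", "lower temperature": "TEMP_DOWN"},
      }

      def handle_voice_query(identifier, query_text):
          actions = DEVICE_ACTION_DATA[identifier]          # identifier selects the data set
          for phrase, command in actions.items():
              if phrase in query_text.lower():              # match an action by query content
                  return command                            # command sent back for execution
          return None

      print(handle_voice_query("blender-v1", "please start blending now"))  # MOTOR_ON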
  • Patent number: 11315554
    Abstract: Methods, systems, and media for connecting an IoT device to a call are provided. In some embodiments, a method is provided, the method comprising: establishing, at a first end-point device, a telecommunication channel with a second end-point device; subsequent to establishing the telecommunication channel, and prior to a termination of the telecommunication channel, detecting, using the first end-point device, a voice command that includes a keyword; and in response to detecting the voice command, causing information associated with an IoT device that corresponds to the keyword to be transmitted to the second end-point device.
    Type: Grant
    Filed: March 15, 2018
    Date of Patent: April 26, 2022
    Assignee: Google LLC
    Inventors: Saptarshi Bhattacharya, Shreedhar Madhavapeddi
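    Illustrative sketch: a toy Python version of the in-call keyword check described above; while a call is active, a detected voice command naming a known IoT device triggers sending that device's information to the far end. Device names and payloads are made up.
      IOT_DEVICES = {"doorbell": {"stream_url": "rtsp://example.local/door"},
                     "thermostat": {"temperature_c": 21.5}}

      def on_voice_command(command_text, call_active, send_to_peer):
          if not call_active:
              return                                  # only act during an established call
          for keyword, info in IOT_DEVICES.items():
              if keyword in command_text.lower():
                  send_to_peer({keyword: info})       # share device info with the other end-point

      on_voice_command("share my doorbell feed", call_active=True, send_to_peer=print)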
  • Patent number: 11308969
    Abstract: A method performed in an audio decoder for decoding M encoded audio channels representing N audio channels is disclosed. The method includes receiving a bitstream containing the M encoded audio channels and a set of spatial parameters, decoding the M encoded audio channels, and extracting the set of spatial parameters from the bitstream. The method also includes analyzing the M audio channels to detect a location of a transient, decorrelating the M audio channels, and deriving N audio channels from the M audio channels and the set of spatial parameters. A first decorrelation technique is applied to a first subset of each audio channel and a second decorrelation technique is applied to a second subset of each audio channel. The first decorrelation technique represents a first mode of operation of a decorrelator, and the second decorrelation technique represents a second mode of operation of the decorrelator.
    Type: Grant
    Filed: October 5, 2020
    Date of Patent: April 19, 2022
    Assignee: Dolby Laboratories Licensing Corporation
    Inventor: Mark F. Davis
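    Illustrative sketch: a simplified Python view of the mode switching described above, where a crude transient detector decides where one decorrelation technique hands off to another; the detector, filter, and delay values are stand-ins, not the codec's actual decorrelators.
      import numpy as np

      def detect_transient(x, frame=256):
          energy = np.array([np.sum(x[i:i + frame] ** 2) for i in range(0, len(x) - frame, frame)])
          jumps = np.diff(energy)
          return (int(np.argmax(jumps)) + 1) * frame if jumps.size else len(x)

      def decorrelate(x):
          t = detect_transient(x)
          head = np.convolve(x[:t], [0.7, 0.0, -0.7], mode="same")  # mode 1: filter-based decorrelation
          tail = np.roll(x[t:], 8)                                   # mode 2: delay-based decorrelation
          return np.concatenate([head, tail])

      x = np.concatenate([0.01 * np.random.randn(2048), np.random.randn(2048)])
      print(detect_transient(x), decorrelate(x).shape)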
  • Patent number: 11301642
    Abstract: One general aspect includes a system to translate language exhibited on a publicly viewable sign, the system including: a memory configured to include one or more executable instructions and a processor configured to execute the executable instructions, where the executable instructions enable the processor to carry out the steps of: reviewing the sign; translating relevant information conveyed on the sign from a first language to a second language; and producing an output in an interior of a vehicle, the output based on the second language of the relevant information.
    Type: Grant
    Filed: April 17, 2019
    Date of Patent: April 12, 2022
    Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Brunno L. Moretti, Esther Anderson, Luis Goncalves
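    Illustrative sketch: a pipeline-only Python outline of the steps in the abstract above (read the sign, keep relevant information, translate, announce in the cabin); the OCR, translation, and in-vehicle audio steps are stubs standing in for real components.
      def read_sign(frame):
          return "Umleitung 500 m"                 # pretend OCR result (German)

      def is_relevant(text):
          return any(w in text.lower() for w in ("umleitung", "stau", "ausfahrt"))

      def translate(text, source="de", target="en"):
          return {"Umleitung 500 m": "Detour in 500 m"}.get(text, text)   # stub lookup

      def announce_in_cabin(message):
          print(f"[cabin speaker] {message}")      # stand-in for in-vehicle output

      sign_text = read_sign(frame=None)
      if is_relevant(sign_text):
          announce_in_cabin(translate(sign_text))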
  • Patent number: 11302309
    Abstract: A technique for aligning spike timing of models is disclosed. A first model having a first architecture trained with a set of training samples is generated. Each training sample includes an input sequence of observations and an output sequence of symbols having different length from the input sequence. Then, one or more second models are trained with the trained first model by minimizing a guide loss jointly with a normal loss for each second model and a sequence recognition task is performed using the one or more second models. The guide loss evaluates dissimilarity in spike timing between the trained first model and each second model being trained.
    Type: Grant
    Filed: September 13, 2019
    Date of Patent: April 12, 2022
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gakuto Kurata, Kartik Audhkhasi
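    Illustrative sketch: a conceptual Python rendering of the joint objective above, the student's normal training loss plus a "guide" term penalizing spike-timing disagreement with the trained teacher; here the guide term is just a squared distance between per-frame label posteriors, which is an assumption rather than the patent's exact loss.
      import numpy as np

      def guide_loss(teacher_posteriors, student_posteriors):
          # Both arrays: (frames, labels). Spike timing shows up as where label
          # mass peaks over time, so matching posteriors aligns the spikes.
          return float(np.mean((teacher_posteriors - student_posteriors) ** 2))

      def joint_loss(normal_loss, teacher_posteriors, student_posteriors, weight=0.5):
          return normal_loss + weight * guide_loss(teacher_posteriors, student_posteriors)

      frames, labels = 100, 30
      teacher = np.random.dirichlet(np.ones(labels), size=frames)
      student = np.random.dirichlet(np.ones(labels), size=frames)
      print(joint_loss(normal_loss=2.3, teacher_posteriors=teacher, student_posteriors=student))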
  • Patent number: 11282531
    Abstract: A method includes receiving multiple samples of time-domain data that includes noise, computing a first two-dimensional (2D) time-frequency representation of the time domain data, and processing the first time-frequency representation using a time-frequency noise reduction mask to generate a second, noise-reduced time-frequency representation of the time domain data. The method also includes generating a time domain output based on the noise-reduced time-frequency representation.
    Type: Grant
    Filed: February 3, 2020
    Date of Patent: March 22, 2022
    Assignee: Bose Corporation
    Inventors: Ankita D. Jain, Cristian Marius Hera, Elie Bou Daher
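    Illustrative sketch: a bare-bones Python version of the flow in the abstract above (STFT of the noisy samples, multiply by a time-frequency mask, inverse STFT back to the time domain); the toy spectral-gate mask here stands in for the patent's own mask estimator.
      import numpy as np
      from scipy.signal import stft, istft

      def reduce_noise(x, sr, nperseg=512):
          _, _, spec = stft(x, fs=sr, nperseg=nperseg)           # 2D time-frequency representation
          mag = np.abs(spec)
          noise_floor = np.median(mag, axis=1, keepdims=True)
          mask = (mag > 2.0 * noise_floor).astype(float)         # crude time-frequency noise-reduction mask
          _, y = istft(spec * mask, fs=sr, nperseg=nperseg)      # noise-reduced time-domain output
          return y

      sr = 16000
      t = np.arange(sr) / sr
      noisy = np.sin(2 * np.pi * 440 * t) + 0.3 * np.random.randn(sr)
      print(reduce_noise(noisy, sr).shape)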
  • Patent number: 11282524
    Abstract: A device may receive a set of audio data files corresponding to a set of calls, wherein the set of audio data files includes digital representations of one or more segments of respective calls of the set of calls, and wherein the set of calls includes audio data relating to a particular industry. The device may receive a set of transcripts corresponding to the set of audio data files. The device may determine a plurality of text-audio pairs within the set of calls, wherein a text-audio pair, of the plurality of text-audio pairs, comprises: a digital representation of a segment of a call of the set of calls, and a corresponding excerpt of text from the set of transcripts. The device may train, using a machine learning process, an industry-specific text-to-speech model, tailored for the particular industry, based on the plurality of text-audio pairs.
    Type: Grant
    Filed: September 25, 2020
    Date of Patent: March 22, 2022
    Assignee: Capital One Services, LLC
    Inventor: Abhishek Dube
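    Illustrative sketch: assembling (audio segment, transcript excerpt) training pairs in Python before fitting an industry-specific TTS model, as described above; the call data, segment boundaries, and the commented training call are placeholders.
      calls = [
          {"audio": b"<bytes of call 1>", "segments": [(0.0, 2.1), (2.1, 5.0)],
           "transcript": ["thanks for calling acme bank", "how can i help you today"]},
      ]

      def slice_audio(audio_bytes, start, end):
          return (start, end)        # stand-in; a real system cuts the waveform here

      def build_text_audio_pairs(calls, slice_audio):
          pairs = []
          for call in calls:
              for (start, end), text in zip(call["segments"], call["transcript"]):
                  pairs.append((slice_audio(call["audio"], start, end), text))
          return pairs

      pairs = build_text_audio_pairs(calls, slice_audio)
      print(len(pairs), pairs[0][1])
      # train_tts(pairs)             # hypothetical: fit the industry-specific TTS model on the pairs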
  • Patent number: 11276414
    Abstract: An electronic device includes an audio input module, an audio output module, and a processor. The processor is configured to provide a first signal and a second signal into which a first audio signal is processed, output the first audio signal through the audio output module, acquire an external audio signal comprising the first audio signal of the electronic device, acquire a first output value through a first input channel of an audio filter, acquire a second output value through a second input channel of the audio filter, and provide a second audio signal, based at least on a first difference value between the magnitude value corresponding to the first frequency of the external audio signal and the first output value and a second difference value between the magnitude value corresponding to the second frequency of the external audio signal and the second output value.
    Type: Grant
    Filed: August 30, 2018
    Date of Patent: March 15, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Jaemo Yang, Hangil Moon, Soonho Baek, Beak-Kwon Son, Kiho Cho, Chulmin Choi
  • Patent number: 11262975
    Abstract: A soft decision audio decoding system for preserving audio continuity in a digital wireless audio receiver is provided that deduces the likelihood of errors in a received digital signal, based on generated hard bits and soft bits. The soft bits may be utilized by a soft audio decoder to determine whether the digital signal should be decoded or muted. The soft bits may be generated based on a degree of closeness of a detected phase trajectory to known legal phase trajectories, determined by running the phase trajectory through a soft-output Viterbi algorithm. The value of the soft bits may indicate confidence in the strength of the hard bit generation. The soft decision audio decoding system may infer errors and decode perceptually acceptable audio without requiring error detection, as in conventional systems, as well as have low latency and improved granularity.
    Type: Grant
    Filed: June 5, 2020
    Date of Patent: March 1, 2022
    Assignee: Shure Acquisition Holdings, Inc.
    Inventor: Robert Mamola
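    Illustrative sketch: a toy decode-or-mute decision in Python following the abstract above; the soft-bit values, averaging, and confidence threshold are invented for the example, not the system's actual metric.
      def decode_audio(bits):
          return f"decoded {len(bits)} bits"               # stand-in audio decoder

      def decode_or_mute(hard_bits, soft_bits, confidence_threshold=0.6):
          confidence = sum(soft_bits) / len(soft_bits)     # closeness to legal phase trajectories
          if confidence < confidence_threshold:
              return None                                  # mute: frame is likely erroneous
          return decode_audio(hard_bits)

      print(decode_or_mute([1, 0, 1, 1], [0.9, 0.8, 0.95, 0.7]))   # decodes
      print(decode_or_mute([1, 0, 1, 1], [0.2, 0.3, 0.1, 0.4]))    # mutes -> None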
  • Patent number: 11256866
    Abstract: The present disclosure provides systems and methods that perform machine-learned natural language processing. A computing system can include a machine-learned natural language processing model that includes an encoder model trained to receive a natural language text body and output a knowledge graph, and a programmer model trained to receive a natural language question and output a program. The computing system can include a computer-readable medium storing instructions that, when executed, cause the processor to perform operations. The operations can include obtaining the natural language text body, inputting the natural language text body into the encoder model, receiving, as an output of the encoder model, the knowledge graph, obtaining the natural language question, inputting the natural language question into the programmer model, receiving the program as an output of the programmer model, and executing the program on the knowledge graph to produce an answer to the natural language question.
    Type: Grant
    Filed: October 25, 2017
    Date of Patent: February 22, 2022
    Assignee: Google LLC
    Inventors: Ni Lao, Jiazhong Nie, Fan Yang
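    Illustrative sketch: a heavily reduced Python outline of the encoder/programmer split described above, with both learned models replaced by stubs; the graph is a dictionary of triples and the "program" is a single lookup instruction, all invented for illustration.
      def encoder(text_body):
          return {("mount everest", "height_m"): "8848"}          # pretend extracted knowledge graph

      def programmer(question):
          return [("lookup", "mount everest", "height_m")]         # pretend generated program

      def execute(program, graph):
          op, subj, rel = program[0]
          if op == "lookup":
              return graph.get((subj, rel))

      graph = encoder("Mount Everest is 8848 m tall. ...")
      program = programmer("How tall is Mount Everest?")
      print(execute(program, graph))                               # 8848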
  • Patent number: 11257507
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for generating discrete latent representations of input audio data. Only the discrete latent representation needs to be transmitted from an encoder system to a decoder system in order for the decoder system to be able to effectively decode, i.e., reconstruct, the input audio data.
    Type: Grant
    Filed: January 17, 2020
    Date of Patent: February 22, 2022
    Assignee: DeepMind Technologies Limited
    Inventors: Cristina Garbacea, Aaron Gerard Antonius van den Oord, Yazhe Li, Sze Chie Lim, Alejandro Luebs, Oriol Vinyals, Thomas Chadwick Walters
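    Illustrative sketch: the discretization idea from the abstract above in Python, assuming a nearest-neighbor codebook lookup so that only integer indices need to be transmitted; the codebook is random here and the encoder/decoder networks are omitted.
      import numpy as np

      codebook = np.random.randn(256, 64)          # 256 learned code vectors of dimension 64

      def quantize(encoder_outputs):
          # encoder_outputs: (frames, 64) continuous latents -> (frames,) integer codes
          dists = np.linalg.norm(encoder_outputs[:, None, :] - codebook[None, :, :], axis=-1)
          return np.argmin(dists, axis=1)

      def dequantize(indices):
          return codebook[indices]                  # decoder-side lookup before reconstruction

      latents = np.random.randn(50, 64)
      codes = quantize(latents)                     # the discrete representation that gets transmitted
      print(codes.shape, dequantize(codes).shape)   # (50,) (50, 64)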
  • Patent number: 11257490
    Abstract: Particular embodiments described herein provide for an electronic device that can be configured to receive a verbal command to activate a device with an unknown label, derive a probable device and a label for the probable device, activate the probable device, determine that the activated probable device is the same device to be activated by the verbal command, and store the label and a description for the device. In some examples, the label is associated with the description.
    Type: Grant
    Filed: April 1, 2016
    Date of Patent: February 22, 2022
    Assignee: Intel Corporation
    Inventors: Robert James Firby, Jesus Gonzalez Marti, Jose Gabriel De Amores Carredano, Martin Henk Van Den Berg, Maria Pilar Manchon Portillo, Guillermo Perez, Steven Thomas Holmes
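    Illustrative sketch: a simplified Python control flow for the label-resolution steps above (guess the most probable device, activate it, confirm, and remember the new label); the device registry, word-overlap scoring, and confirmation callback are made up.
      known_devices = {"living room lamp": "lamp-01", "ceiling fan": "fan-01"}
      learned_labels = {}

      def activate(device_id):
          print(f"activating {device_id}")

      def resolve_and_activate(command, unknown_label, confirm):
          # Pick the known device whose words overlap the command the most.
          probable = max(known_devices, key=lambda d: len(set(d.split()) & set(command.split())))
          activate(known_devices[probable])
          if confirm(f"Did you mean the {probable}?"):
              learned_labels[unknown_label] = known_devices[probable]   # remember for next time

      resolve_and_activate("turn on the reading light in the living room", "reading light",
                           confirm=lambda q: True)
      print(learned_labels)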
  • Patent number: 11250874
    Abstract: A language proficiency analyzer automatically evaluates a person's language proficiency by analyzing that person's oral communications with another person. The analyzer first enhances the quality of an audio recording of a conversation between the two people using a neural network that automatically detects loss features in the audio and adds those loss features back into the audio. The analyzer then performs a textual and audio analysis on the improved audio. Through textual analysis, the analyzer uses a multi-attention network to determine how focused one person is on the other and how pleased one person is with the other. Through audio analysis, the analyzer uses a neural network to determine how well one person pronounced words during the conversation.
    Type: Grant
    Filed: May 21, 2020
    Date of Patent: February 15, 2022
    Assignee: Bank of America Corporation
    Inventors: MadhuSudhanan Krishnamoorthy, Harikrishnan Rajeev
  • Patent number: 11245646
    Abstract: In one embodiment, a method includes, by one or more computing systems, receiving, from a client system associated with a first user, a first user input from the first user, identifying one or more entities referenced by the first user input, determining a classification of the first user input based on a machine-learning classifier model, generating several candidate conversational fillers based on the classification of the first user input and the one or more identified entities, wherein each candidate conversational filler references at least one of the one or more identified entities, ranking the candidate conversational fillers based on a relevancy of the candidate conversational filler to the first user input and a decay model hysteresis, and sending instructions for presenting a top-ranked candidate conversational filler as an initial response to the first user.
    Type: Grant
    Filed: November 15, 2018
    Date of Patent: February 8, 2022
    Assignee: Facebook, Inc.
    Inventors: Emmanouil Koukoumidis, Michael Robert Hanson, Mohsen M Agsen
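    Illustrative sketch: the ranking step from the abstract above as a small Python function that scores candidate fillers by entity overlap with the user input and by a simple recency decay, then returns the top-ranked one; the scoring formula, decay window, and candidates are invented.
      import time

      def rank_fillers(user_input, candidates, last_used, decay_seconds=300.0):
          now = time.time()
          def score(c):
              relevance = len(set(c["entities"]) & set(user_input["entities"]))
              staleness = min((now - last_used.get(c["text"], 0.0)) / decay_seconds, 1.0)
              return relevance * staleness              # recently used fillers rank lower
          return max(candidates, key=score)

      user_input = {"text": "Book a table at Luigi's tonight", "entities": {"Luigi's"}}
      candidates = [{"text": "Luigi's is a popular spot lately.", "entities": {"Luigi's"}},
                    {"text": "By the way, it may rain tonight.", "entities": set()}]
      print(rank_fillers(user_input, candidates, last_used={})["text"])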
  • Patent number: 11240057
    Abstract: One embodiment provides a method in which an audible command is first received from a user. Subsequent to receipt of the audible command, one or more sensors may detect certain contextual data associated with the user's physical surroundings. An embodiment may then determine, using the data, whether a default output response associated with the audible command is appropriate with respect to the user's physical surroundings. If the default output response is determined not to be appropriate, an embodiment may thereafter provide an alternative output response that is appropriate with respect to the user's surroundings. Other aspects are described and claimed.
    Type: Grant
    Filed: March 15, 2018
    Date of Patent: February 1, 2022
    Assignee: Lenovo (Singapore) Pte. Ltd.
    Inventors: John Carl Mese, Russell Speight VanBlon, Nathan J. Peterson
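    Illustrative sketch: a minimal decision function in Python following the abstract above, where sensor context decides whether the default audible response is appropriate and, if not, an alternative on-screen response is used instead; the sensor fields and thresholds are illustrative only.
      def respond(command, context):
          default_is_audible = True
          quiet_surroundings = context["ambient_db"] < 30 and context["people_nearby"] > 1
          if default_is_audible and quiet_surroundings:
              return {"mode": "display", "text": f"Result for: {command}"}   # alternative output response
          return {"mode": "speak", "text": f"Here is the result for {command}"}

      print(respond("what's on my calendar", {"ambient_db": 25, "people_nearby": 3}))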
  • Patent number: 11227603
    Abstract: Systems and methods of diarization of audio files use an acoustic voiceprint model. A plurality of audio files are analyzed to arrive at an acoustic voiceprint model associated to an identified speaker. Metadata associated with an audio file is used to select an acoustic voiceprint model. The selected acoustic voiceprint model is applied in a diarization to identify audio data of the identified speaker.
    Type: Grant
    Filed: April 14, 2020
    Date of Patent: January 18, 2022
    Assignee: Verint Systems Ltd.
    Inventors: Omer Ziv, Ran Achituv, Ido Shapira, Jeremie Dreyfuss
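    Illustrative sketch: a compact Python view of the selection step above, where call metadata (here, a hypothetical agent ID) picks a stored acoustic voiceprint model that the diarizer then uses to tag that speaker's segments; the voiceprints, similarity function, and threshold are placeholders.
      voiceprints = {"agent_42": [0.1, 0.7, 0.2], "agent_7": [0.6, 0.1, 0.3]}

      def fake_similarity(segment, model):
          return 0.9 if segment == "seg-a" else 0.4            # stand-in for embedding distance

      def diarize(audio_segments, metadata, similarity, threshold=0.8):
          model = voiceprints[metadata["agent_id"]]            # metadata selects the voiceprint model
          return [("agent" if similarity(seg, model) > threshold else "other", seg)
                  for seg in audio_segments]

      print(diarize(["seg-a", "seg-b"], {"agent_id": "agent_42"}, fake_similarity))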