Patents Examined by Shreyans A Patel
-
Patent number: 11676577
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for adapting a language model are disclosed. In one aspect, a method includes the actions of receiving transcriptions of utterances that were received by computing devices operating in a domain and that are in a source language. The actions further include generating translated transcriptions of the transcriptions of the utterances in a target language. The actions further include receiving a language model for the target language. The actions further include biasing the language model for the target language by increasing the likelihood of the language model selecting terms included in the translated transcriptions. The actions further include generating a transcription of an utterance in the target language using the biased language model and while operating in the domain.
Type: Grant
Filed: September 9, 2021
Date of Patent: June 13, 2023
Assignee: Google LLC
Inventors: Petar Aleksic, Benjamin Paul Hillson Haynor
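The biasing step in this abstract can be illustrated with a toy example. The sketch below is not the patented method: it assumes a unigram model stored as a dictionary of probabilities, and the function name `bias_unigram_model` and the `boost` factor are hypothetical. It only shows the general idea of raising the likelihood of terms that appear in the translated transcriptions and then renormalizing.

```python
def bias_unigram_model(base_probs, translated_transcriptions, boost=2.0):
    """Return a renormalized copy of base_probs with in-domain terms boosted.

    base_probs: dict mapping word -> probability (toy unigram model, an assumption).
    translated_transcriptions: list of strings in the target language.
    boost: hypothetical multiplicative factor applied to in-domain terms.
    """
    # Collect every term that occurs in the translated transcriptions.
    domain_terms = {
        word
        for transcription in translated_transcriptions
        for word in transcription.lower().split()
    }
    # Scale the probability of in-domain terms, leave the rest unchanged.
    scaled = {
        word: prob * (boost if word in domain_terms else 1.0)
        for word, prob in base_probs.items()
    }
    # Renormalize so the result is again a probability distribution.
    total = sum(scaled.values())
    return {word: prob / total for word, prob in scaled.items()}


base = {"play": 0.4, "pay": 0.4, "music": 0.2}
biased = bias_unigram_model(base, ["play music"])
```

With these toy numbers, "play" becomes more likely than the acoustically similar "pay", which is the effect the abstract describes for in-domain terms.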
-
Patent number: 11669688
Abstract: A system and a corresponding computer-implemented method identify and classify community-sourced documents as true documents. The community-sourced documents include one or more data objects such as data items, including text, strings, phrases, and words; image items, including still image items, video image items, and icons; and drawing items. The system and corresponding method then report the analysis results.
Type: Grant
Filed: June 7, 2021
Date of Patent: June 6, 2023
Assignee: Architecture Technology Corporation
Inventors: Eric R. Chartier, Andrew Murphy, William Colligan, Paul C. Davis
-
Patent number: 11670300
Abstract: Systems and methods are described that include a robot and/or an associated computing system that can use various cues about an environment of the robot to apply a bias to increase the accuracy of speech transcription. In some implementations, audio data corresponding to a spoken instruction to a robot is received. Candidate transcriptions of the audio data are obtained. A respective action of the robot corresponding to each of the candidate transcriptions of the audio data is determined. One or more scores indicating characteristics of a potential outcome of performing the respective action corresponding to the candidate transcription of the audio data are determined for each of the candidate transcriptions of the audio data. A particular candidate transcription is selected from among the candidate transcriptions based at least on the one or more scores. The action determined for the particular candidate transcription is performed.
Type: Grant
Filed: July 8, 2022
Date of Patent: June 6, 2023
Assignee: X Development LLC
Inventor: Daniel Alex Lam
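The selection step in this abstract can be sketched as a rescoring of candidate transcriptions. Everything below is an illustrative assumption rather than the patented scoring: the `Candidate` fields, the `outcome_score` scale, and the choice to multiply the recognizer confidence by the outcome score are all hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    text: str              # candidate transcription
    action: str            # action the robot would take for this transcription
    asr_confidence: float  # hypothetical recognizer confidence in [0, 1]
    outcome_score: float   # hypothetical plausibility of the action's outcome in [0, 1]


def select_transcription(candidates):
    """Pick the candidate with the best combined score (assumed: a simple product)."""
    return max(candidates, key=lambda c: c.asr_confidence * c.outcome_score)


candidates = [
    Candidate("go to the chicken", "navigate:chicken", 0.55, 0.1),
    Candidate("go to the kitchen", "navigate:kitchen", 0.45, 0.9),
]
chosen = select_transcription(candidates)
```

Here the recognizer slightly prefers "chicken", but the outcome score for navigating to a kitchen is far higher, so the combined score selects the "kitchen" transcription.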
-
Patent number: 11670311
Abstract: A wireless audio system for encoding and decoding an audio signal using spectral bandwidth replication is provided. Bandwidth extension is performed in the time-domain, enabling low-latency audio coding.
Type: Grant
Filed: April 12, 2021
Date of Patent: June 6, 2023
Assignee: Shure Acquisition Holdings, Inc.
Inventors: Wenshun Tian, Michael Ryan Lester
-
Patent number: 11657277
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for performing sequence modeling tasks using insertions. One of the methods includes receiving a system input that includes one or more source elements from a source sequence and zero or more target elements from a target sequence, wherein each source element is selected from a vocabulary of source elements and wherein each target element is selected from a vocabulary of target elements; generating a partial concatenated sequence that includes the one or more source elements from the source sequence and the zero or more target elements from the target sequence, wherein the source and target elements are arranged in the partial concatenated sequence according to a combined order; and generating a final concatenated sequence that includes a finalized source sequence and a finalized target sequence, wherein the finalized target sequence includes one or more target elements.
Type: Grant
Filed: May 26, 2020
Date of Patent: May 23, 2023
Assignee: Google LLC
Inventors: William Chan, Mitchell Thomas Stern, Nikita Kitaev, Kelvin Gu, Jakob D. Uszkoreit
-
Patent number: 11657828
Abstract: Embodiments improve speech data quality through training a neural network for de-noising audio enhancement. One such embodiment creates simulated noisy speech data from high quality speech data. In turn, training, e.g., deep normalizing flow training, is performed on a neural network using the high quality speech data and the simulated noisy speech data to train the neural network to create de-noised speech data given noisy speech data. Performing the training includes minimizing errors in the neural network according to at least one of (i) a decoding error of an Automatic Speech Recognition (ASR) system processing current de-noised speech data results generated by the neural network during the training and (ii) spectral distance between the high quality speech data and the current de-noised speech data results generated by the neural network during the training.
Type: Grant
Filed: January 31, 2020
Date of Patent: May 23, 2023
Assignee: Nuance Communications, Inc.
Inventor: Carl Benjamin Quillen
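The abstract does not say how the simulated noisy speech is created. One common approach, shown here purely as an assumption, is to add noise to the clean signal at a chosen signal-to-noise ratio (SNR). The function name `add_noise_at_snr`, the Gaussian noise source, and the sine wave standing in for speech are all hypothetical.

```python
import math
import random


def add_noise_at_snr(clean, snr_db, seed=0):
    """Return clean + Gaussian noise scaled so the mixture has the requested SNR in dB."""
    rng = random.Random(seed)
    noise = [rng.gauss(0.0, 1.0) for _ in clean]
    # Average power of the clean signal and of the raw noise.
    signal_power = sum(x * x for x in clean) / len(clean)
    noise_power = sum(n * n for n in noise) / len(noise)
    # SNR(dB) = 10 * log10(signal_power / noise_power), solved for the noise power.
    target_noise_power = signal_power / (10 ** (snr_db / 10))
    scale = math.sqrt(target_noise_power / noise_power)
    return [x + scale * n for x, n in zip(clean, noise)]


# A 440 Hz tone sampled at 16 kHz stands in for a clean speech recording.
clean = [math.sin(2 * math.pi * 440 * t / 16000) for t in range(1600)]
noisy = add_noise_at_snr(clean, snr_db=10.0)
```

Pairs of `clean` and `noisy` signals produced this way are the kind of parallel data a de-noising network could be trained on.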
-
Patent number: 11651780
Abstract: A speech recognition system utilizing automatic speech recognition techniques such as end-pointing techniques in conjunction with beamforming and/or signal processing to isolate speech from one or more speaking users from multiple received audio signals and to detect the beginning and/or end of the speech based at least in part on the isolation. Audio capture devices such as microphones may be arranged in a beamforming array to receive the multiple audio signals. Multiple audio sources including speech may be identified in different beams and processed.
Type: Grant
Filed: June 7, 2021
Date of Patent: May 16, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Kenneth John Basye, Jeffrey Penrod Adams
-
Patent number: 11646010
Abstract: A method for estimating an embedding capacity includes receiving, at a deterministic reference encoder, a reference audio signal, and determining a reference embedding corresponding to the reference audio signal, the reference embedding having a corresponding embedding dimensionality. The method also includes measuring a first reconstruction loss as a function of the corresponding embedding dimensionality of the reference embedding and obtaining a variational embedding from a variational posterior. The variational embedding has a corresponding embedding dimensionality and a specified capacity. The method also includes measuring a second reconstruction loss as a function of the corresponding embedding dimensionality of the variational embedding and estimating a capacity of the reference embedding by comparing the first measured reconstruction loss for the reference embedding relative to the second measured reconstruction loss for the variational embedding having the specified capacity.
Type: Grant
Filed: December 9, 2021
Date of Patent: May 9, 2023
Assignee: Google LLC
Inventors: Eric Dean Battenberg, Daisy Stanton, Russell John Wyatt Skerry-Ryan, Soroosh Mariooryad, David Teh-Hwa Kao, Thomas Edward Bagby, Sean Matthew Shannon
-
Patent number: 11630999
Abstract: A method and system for voice emotion identification contained in audio in a call providing customer support between a customer and a service agent by implementing an emotion identification application to identify emotions captured in a voice of the customer from audio received by a media streaming device; receiving, by the emotion identification application, an audio stream of a series of voice samples contained in consecutive frames from audio received; extracting, by the emotion identification application, a set of voice emotion features from each frame in each voice sample of the audio by applying a trained machine learning (ML) model for identifying emotions utilizing a neural network to determine one or more voice emotions by a configured set of voice emotion features captured in each voice sample; and classifying, by the emotion identification application, each emotion determined by the trained ML model based on a set of classifying features to label one or more types of emotions captured in each voic
Type: Grant
Filed: December 19, 2019
Date of Patent: April 18, 2023
Inventors: Arun Lokman Gangotri, Balarama Mathukumilli, Debashish Sahoo
-
Patent number: 11620502
Abstract: The present disclosure provides a method for syncing data of a computing task across a plurality of groups of computing nodes. Each group includes a set of computing nodes A-D, a set of intra-group interconnects that communicatively couple computing node A with computing nodes B and C and computing node D with computing nodes B and C, and a set of inter-group interconnects that communicatively couple each of computing nodes A-D with corresponding computing nodes A-D in each of a plurality of neighboring groups. The method comprises syncing data at a computing node of the plurality of groups of computing nodes using inter-group interconnects and intra-group interconnects along four different directions relative to the node; and broadcasting synced data from the node to the plurality of groups of computing nodes using inter-group interconnects and intra-group interconnects along four different directions relative to the node.
Type: Grant
Filed: January 30, 2020
Date of Patent: April 4, 2023
Assignee: Alibaba Group Holding Limited
Inventors: Liang Han, Yang Jiao
-
Patent number: 11610577
Abstract: Methods and systems for providing a change to a voice interacting with a user are described. Information indicating a change that can be made to the voice can be received. The voice can be changed based on the information.
Type: Grant
Filed: November 19, 2020
Date of Patent: March 21, 2023
Assignee: Capital One Services, LLC
Inventors: Anh Truong, Mark Watson, Jeremy Goodsitt, Vincent Pham, Fardin Abdi Taghi Abad, Kate Key, Austin Walters, Reza Farivar
-
Patent number: 11605370
Abstract: Disclosed are methods and systems for providing audible flight information to an operator of an aircraft. A method, for example, may include receiving flight information detected by one or more sensors positioned on the aircraft, causing an image to be displayed on a display device, the image including a plurality of text items corresponding to the flight information, receiving a first operator selection indicative of one or more of the text items, parsing the one or more text items to generate a set of intermediate data, synthesizing audio data based on the intermediate data, and causing audible content corresponding to the audio data to be emitted by one or more audio emitting devices, wherein the audible content includes speech corresponding to the flight information.
Type: Grant
Filed: August 12, 2021
Date of Patent: March 14, 2023
Assignee: Honeywell International Inc.
Inventor: Dongfang Zhang
-
Patent number: 11605371
Abstract: Embodiments of the present systems and methods may provide techniques for synthesizing speech in any voice in any language in any accent. For example, in an embodiment, a text-to-speech conversion system may comprise a text converter adapted to convert input text to at least one phoneme selected from a plurality of phonemes stored in memory, a machine-learning model storing voice patterns for a plurality of individuals and adapted to receive the at least one phoneme and an identity of a speaker and to generate acoustic features for each phoneme, and a decoder adapted to receive the generated acoustic features and to generate a speech signal simulating a voice of the identified speaker in a language.
Type: Grant
Filed: June 14, 2019
Date of Patent: March 14, 2023
Assignee: Georgetown University
Inventors: Joe Garman, Ophir Frieder
-
Patent number: 11600279
Abstract: A method to transcribe communications may include obtaining audio data originating at a first device during a communication session between the first device and a second device and providing the audio data to an automated speech recognition system configured to transcribe the audio data. The method may further include obtaining multiple hypothesis transcriptions generated by the automated speech recognition system. Each of the multiple hypothesis transcriptions may include one or more words determined by the automated speech recognition system to be a transcription of a portion of the audio data. The method may further include determining one or more consistent words that are included in two or more of the multiple hypothesis transcriptions and in response to determining the one or more consistent words, providing the one or more consistent words to the second device for presentation of the one or more consistent words by the second device.
Type: Grant
Filed: August 26, 2019
Date of Patent: March 7, 2023
Assignee: Sorenson IP Holdings, LLC
Inventors: Brian Chevrier, Shane Roylance, Kenneth Boehme
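The "consistent words" step in this abstract can be sketched as a simple support count across hypotheses. This is a simplified illustration under stated assumptions, not the patented method: it compares lowercase whitespace-separated tokens and ignores word position and timing. The function name `consistent_words` and the `min_support` parameter are hypothetical.

```python
from collections import Counter


def consistent_words(hypotheses, min_support=2):
    """Return words found in at least min_support hypotheses, in first-seen order."""
    # Count each word at most once per hypothesis.
    support = Counter()
    for hypothesis in hypotheses:
        support.update(set(hypothesis.lower().split()))

    seen = set()
    result = []
    for hypothesis in hypotheses:
        for word in hypothesis.lower().split():
            if support[word] >= min_support and word not in seen:
                seen.add(word)
                result.append(word)
    return result


hypotheses = ["please call me back", "please fall me back", "police call me"]
stable = consistent_words(hypotheses)
```

In this toy run, "fall" and "police" each appear in only one hypothesis and are held back, while the words shared by two or more hypotheses would be forwarded for display.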
-
Patent number: 11599791
Abstract: An example embodiment includes a neural network unit to which a plurality of element values based on learning target data are input, and a learning unit that trains the neural network unit. The neural network unit has a plurality of learning cells each including a plurality of input nodes that perform predetermined weighting on each of the plurality of element values and an output node that sums the plurality of weighted element values and outputs the sum, and in accordance with an output value of each of the learning cells, the learning unit updates weighting coefficients of the plurality of input nodes of each of the learning cells or adds a new learning cell to the neural network unit.
Type: Grant
Filed: November 20, 2018
Date of Patent: March 7, 2023
Assignee: NEC Solution Innovators, Ltd.
Inventors: Yoshihito Miyauchi, Akio Uda, Katsuhiro Nakade
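The update-or-add behavior in this abstract can be sketched with a small class. The abstract states only that each cell outputs a weighted sum and that the learning unit either updates weights or adds a cell depending on the output values. The threshold test, the learning rate, and the update rule below are assumptions made for illustration, and the names `LearningCell` and `train_step` are hypothetical.

```python
class LearningCell:
    """One cell: input nodes weight each element value, the output node sums them."""

    def __init__(self, weights):
        self.weights = [float(w) for w in weights]

    def output(self, elements):
        return sum(w * x for w, x in zip(self.weights, elements))


def train_step(cells, elements, threshold=1.5, lr=0.5):
    """Update the best-matching cell, or add a new cell if none responds strongly.

    threshold and lr are hypothetical hyperparameters, not taken from the abstract.
    """
    best = max(cells, key=lambda c: c.output(elements), default=None)
    if best is None or best.output(elements) < threshold:
        # No existing cell responds strongly enough: add a cell seeded from the input.
        cells.append(LearningCell(elements))
        return "added"
    # Otherwise move the best cell's weights part of the way toward the input.
    best.weights = [w + lr * (x - w) for w, x in zip(best.weights, elements)]
    return "updated"


cells = []
steps = [
    train_step(cells, [1, 0, 1]),  # no cells yet, so one is added
    train_step(cells, [1, 0, 1]),  # output 2.0 >= 1.5, so the cell is updated
    train_step(cells, [0, 1, 0]),  # output 0.0 < 1.5, so a second cell is added
    train_step(cells, [1, 1, 1]),  # first cell wins with 2.0 and is updated
]
```

After these four steps the network holds two cells, and the first cell's weights have moved halfway toward the last input.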
-
Patent number: 11593562
Abstract: A smart assistant is disclosed that provides for interfaces to capture requirements for a technical assistance request and then execute actions responsive to the technical assistance request. Example embodiments relate to parsing natural language input defining a technical assistance request to determine a series of instructions responsive to the technical assistance request. The smart assistant may also automatically detect a condition and generate a technical assistance request responsive to the condition. One or more driver applications may control or command one or more computing systems to respond to the technical assistance request.
Type: Grant
Filed: November 11, 2019
Date of Patent: February 28, 2023
Assignee: Affirm, Inc.
Inventors: Adam Smith, Tarak Upadhyaya, Juan Lozano, Daniel Hung
-
Patent number: 11593568
Abstract: An agent system includes a first memory and a first processor coupled to the first memory. The first processor analyzes contents of a verbal question, and carries out pre-processing that replaces vocabulary, which is used in the contents of the question, with homogenized vocabulary, and generates response information based on results of analysis. In a case in which there exists substitution vocabulary that has replaced original vocabulary in the pre-processing, the first processor changes the response information such that it can be recognized that the substitution vocabulary in the response information is synonymous with the original vocabulary, and outputs the response information.
Type: Grant
Filed: November 12, 2020
Date of Patent: February 28, 2023
Assignee: TOYOTA JIDOSHA KABUSHIKI KAISHA
Inventors: Chikage Kubo, Keiko Nakano, Eiichi Maeda, Hiroyuki Nishizawa
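The pre-processing and response-annotation steps in this abstract can be sketched as two small functions. The synonym table `CANONICAL`, the function names, and the parenthetical annotation format are all hypothetical; the abstract does not specify how the synonymy is made recognizable. The sketch splits on whitespace only and does not handle punctuation.

```python
# Hypothetical table mapping user vocabulary to homogenized vocabulary.
CANONICAL = {"auto": "car", "automobile": "car", "petrol": "fuel"}


def preprocess(question):
    """Replace vocabulary with its homogenized form and record each substitution."""
    substitutions = {}  # homogenized word -> original word used by the speaker
    words = []
    for word in question.lower().split():
        canonical = CANONICAL.get(word, word)
        if canonical != word:
            substitutions[canonical] = word
        words.append(canonical)
    return " ".join(words), substitutions


def annotate_response(response, substitutions):
    """Mark substituted vocabulary so the user can see it matches their own wording."""
    annotated = []
    for word in response.split():
        original = substitutions.get(word.lower())
        annotated.append(f"{word} ({original})" if original else word)
    return " ".join(annotated)


normalized, subs = preprocess("how do I lock the automobile")
reply = annotate_response("Press the car key button", subs)
```

The response generator only ever sees the homogenized word "car", while the final reply shows the user's original word alongside it.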
-
Patent number: 11587547
Abstract: An electronic apparatus which acquires input data to be input into a TTS module for outputting a voice through the TTS module, acquires a voice signal corresponding to the input data through the TTS module, detects an error in the acquired voice signal based on the input data, corrects the input data based on the detection result, and acquires a corrected voice signal corresponding to the corrected input data through the TTS module.
Type: Grant
Filed: February 12, 2020
Date of Patent: February 21, 2023
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Hosang Sung, Kyoungbo Min, Seonho Hwang, Doohwa Hong, Eunmi Oh, Jonghoon Jeong, Kihyun Choo
-
Patent number: 11580955
Abstract: A speech-processing system receives input data representing text. A first encoder processes segments of the text to determine embedding data representing the text, and a second encoder processes corresponding audio data to determine prosodic data corresponding to the text. The embedding and prosodic data are processed to create output data including a representation of speech corresponding to the text and prosody.
Type: Grant
Filed: March 31, 2021
Date of Patent: February 14, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Yixiong Meng, Roberto Barra Chicote, Grzegorz Beringer, Zeya Chen, Jie Liang, James Garnet Droppo, Chia-Hao Chang, Oguz Hasan Elibol
-
Patent number: 11580952
Abstract: A method includes receiving an input text sequence to be synthesized into speech in a first language and obtaining a speaker embedding, the speaker embedding specifying specific voice characteristics of a target speaker for synthesizing the input text sequence into speech that clones a voice of the target speaker. The target speaker includes a native speaker of a second language different than the first language. The method also includes generating, using a text-to-speech (TTS) model, an output audio feature representation of the input text by processing the input text sequence and the speaker embedding. The output audio feature representation includes the voice characteristics of the target speaker specified by the speaker embedding.
Type: Grant
Filed: April 22, 2020
Date of Patent: February 14, 2023
Assignee: Google LLC
Inventors: Yu Zhang, Ron J. Weiss, Byungha Chun, Yonghui Wu, Zhifeng Chen, Russell John Wyatt Skerry-Ryan, Ye Jia, Andrew M. Rosenberg, Bhuvana Ramabhadran