Patents Examined by Susan I McFadden
  • Patent number: 12087275
    Abstract: Systems and methods for text-to-speech with novel speakers can obtain text data and output audio data. The input text data may be input along with one or more speaker preferences. The speaker preferences can include speaker characteristics. The speaker preferences can be processed by a machine-learned model conditioned on a learned prior distribution to determine a speaker embedding. The speaker embedding can then be processed with the text data to generate an output that includes audio data descriptive of the text data spoken by a novel speaker.
    Type: Grant
    Filed: February 16, 2022
    Date of Patent: September 10, 2024
    Assignee: GOOGLE LLC
    Inventors: Daisy Antonia Stanton, Sean Matthew Shannon, Soroosh Mariooryad, Russell John-Wyatt Skerry-Ryan, Eric Dean Battenberg, Thomas Edward Bagby, David Teh-Hwa Kao
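The abstract above describes conditioning a learned prior distribution on speaker preferences to sample a novel-speaker embedding. A minimal sketch of that sampling step, assuming a Gaussian prior whose parameters are a learned projection of a preference vector (all names, dimensions, and the linear projection are illustrative assumptions, not from the patent):

```python
import numpy as np

# Hypothetical sketch: condition a learned Gaussian prior on speaker
# preferences and sample a novel-speaker embedding from it.
rng = np.random.default_rng(0)

EMB_DIM = 8     # size of the speaker embedding (assumed)
PREF_DIM = 3    # e.g. (pitch, speaking_rate, brightness) - invented

# Stand-ins for learned projections from preferences to prior parameters.
W_mu = rng.normal(size=(PREF_DIM, EMB_DIM))
W_logvar = rng.normal(size=(PREF_DIM, EMB_DIM))

def sample_speaker_embedding(prefs: np.ndarray) -> np.ndarray:
    """Sample an embedding from a Gaussian prior conditioned on prefs."""
    mu = prefs @ W_mu
    std = np.exp(0.5 * (prefs @ W_logvar))
    return mu + std * rng.normal(size=EMB_DIM)

prefs = np.array([0.2, -0.5, 1.0])   # desired speaker characteristics
embedding = sample_speaker_embedding(prefs)
print(embedding.shape)  # (8,)
```

The sampled embedding would then condition the TTS model alongside the input text.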
  • Patent number: 12067975
    Abstract: Methods, systems, and apparatuses for predicting an end of a command in a voice recognition input are described herein. The system may receive a signal comprising a voice input. The system may detect, in the voice input, data that is associated with a first portion of a command. The system may predict, based on the first portion and while the voice input is being received, a second portion of the command. The prediction may be generated by a machine learning algorithm that is trained based at least in part on historical data comprising user input data. The system may cause execution of the command, based on the first portion and the predicted second portion, prior to an end of the voice input.
    Type: Grant
    Filed: April 18, 2023
    Date of Patent: August 20, 2024
    Assignee: Comcast Cable Communications, LLC
    Inventors: Rui Min, Hongcheng Wang
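The prediction step above (completing a command from its first portion using historical user input) can be sketched with simple frequency counts standing in for the trained model; the history and commands below are made-up examples:

```python
from collections import Counter

# Toy stand-in for the trained predictor: complete a partial spoken
# command from frequency counts over historical commands.
history = [
    "turn on the living room lights",
    "turn on the living room lights",
    "turn on the kitchen lights",
    "play the next episode",
]

def predict_completion(prefix: str):
    """Return the most frequent historical command starting with prefix,
    or None when no historical command matches."""
    matches = Counter(c for c in history if c.startswith(prefix))
    if not matches:
        return None
    return matches.most_common(1)[0][0]

# While the voice input is still arriving, predict the full command so
# execution can begin before the utterance ends.
print(predict_completion("turn on the living"))
```

In the patent's framing, a learned model replaces the counter, but the control flow (predict mid-utterance, then execute early) is the same.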
  • Patent number: 12062363
    Abstract: A recurrent neural network-transducer (RNN-T) model improves speech recognition by processing sequential non-blank symbols at each time step after an initial one. The model's prediction network receives a sequence of symbols from a final Softmax layer and employs a shared embedding matrix to create and map embeddings to each symbol, associating them with unique position vectors. These embeddings are weighted according to their similarity to their matching position vector. Subsequently, a joint network of the RNN-T model uses these weighted embeddings to output a probability distribution for potential speech recognition hypotheses at each time step, enabling more accurate transcriptions of spoken language.
    Type: Grant
    Filed: July 6, 2023
    Date of Patent: August 13, 2024
    Assignee: Google LLC
    Inventors: Rami Botros, Tara Sainath
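The weighting described above (each symbol embedding scaled by its similarity to a matching position vector) can be sketched as follows; the dimensions, cosine similarity, and averaging are assumptions for illustration:

```python
import numpy as np

# Sketch of the prediction-network step: weight each non-blank symbol's
# embedding by similarity to its position vector, then pool them.
rng = np.random.default_rng(1)
VOCAB, DIM, CONTEXT = 10, 4, 3

embed = rng.normal(size=(VOCAB, DIM))   # shared embedding matrix
pos = rng.normal(size=(CONTEXT, DIM))   # one position vector per slot

def weighted_context(symbols):
    """Weight each symbol's embedding by cosine similarity to its
    position vector and average into a single context vector."""
    out = np.zeros(DIM)
    for i, s in enumerate(symbols):
        e, p = embed[s], pos[i]
        w = e @ p / (np.linalg.norm(e) * np.linalg.norm(p))
        out += w * e
    return out / len(symbols)

ctx = weighted_context([3, 7, 1])   # last three non-blank symbols
print(ctx.shape)
```

The joint network would consume this context vector together with the encoder output to produce the per-step probability distribution.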
  • Patent number: 12050880
    Abstract: In a first aspect, a system for creating fact-based content is presented. The system includes an application service provider operating on a network. The application service provider is configured to receive a user prompt and generate a web query for content based on the user prompt. The system includes a fact-based language model in communication with the application service provider. The fact-based language model is configured to receive the web query from the application service provider and retrieve, from an electronic library, relevant fact-based content based on the web query. The electronic library includes proprietary data. The fact-based language model is configured to provide the relevant fact-based content to the application service provider. The application service provider communicates content to a user based on the user prompt. The content includes at least a portion of the relevant fact-based content from the electronic library.
    Type: Grant
    Filed: December 21, 2023
    Date of Patent: July 30, 2024
    Assignee: Cengage Learning, Inc.
    Inventors: James Chilton, Peter Griffiths, Charles Qian
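The retrieve-then-respond flow described above can be sketched with term overlap standing in for the fact-based language model's retrieval; the library contents and matching rule are invented for illustration:

```python
# Toy sketch: a query derived from a user prompt retrieves relevant
# entries from an electronic library, and the response is built from
# the retrieved fact-based content.
library = {
    "photosynthesis": "Photosynthesis converts light energy into chemical energy.",
    "mitosis": "Mitosis is cell division producing two identical daughter cells.",
}

def retrieve(query: str):
    """Return library entries whose topic or text overlaps the query."""
    terms = set(query.lower().split())
    return [text for topic, text in library.items()
            if terms & ({topic} | set(text.lower().split()))]

def respond(prompt: str) -> str:
    facts = retrieve(prompt)
    return facts[0] if facts else "No relevant content found."

print(respond("explain photosynthesis"))
```

The key design point the abstract emphasizes is that the delivered content includes material drawn from the proprietary library rather than being generated unconstrained.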
  • Patent number: 12033636
    Abstract: This relates to an intelligent automated assistant in a video communication environment. An example includes, during a video communication session between at least two devices, receiving a voice input at one device, generating and transmitting to a server a textual representation of the voice input, receiving from the server a shared transcription including both the textual representation of the voice input and one or more additional textual representations generated by another device, and determining and presenting one or more candidate tasks based on the shared transcription.
    Type: Grant
    Filed: August 9, 2023
    Date of Patent: July 9, 2024
    Assignee: Apple Inc.
    Inventors: Niranjan Manjunath, Willem Mattelaer, Jessica Peck, Lily Shuting Zhang
  • Patent number: 12033630
    Abstract: An information processing device includes an input unit, an extracting unit, an output unit, and a specifying unit. The input unit receives a voice operation. The extracting unit extracts a processing detail corresponding to the voice operation received by the input unit. When the processing detail corresponding to the voice operation cannot be specified, the output unit outputs response information prompting the user to select at least one processing detail from a plurality of processing details extracted by the extracting unit. The specifying unit specifies the processing detail selected from among the plurality of processing details contained in the response information as the processing detail corresponding to the voice operation received by the input unit.
    Type: Grant
    Filed: March 2, 2020
    Date of Patent: July 9, 2024
    Assignee: SONY GROUP CORPORATION
    Inventors: Yuhei Taki, Hiro Iwase, Kunihito Sawai, Masaki Takase, Akira Miyashita
  • Patent number: 12032613
    Abstract: To facilitate the search and identification of documents, an information retrieval system is provided for performing a search on a corpus of data objects. The information retrieval system comprises a device and a database. The database is configured to store at least one syntactic search index data structure and at least one semantic search index data structure. The at least one syntactic search index data structure is configured to index and store in the database a plurality of terms from the corpus of data objects along with syntactic annotations indicating syntactic information. The at least one semantic search index data structure is configured to index and store in the database the plurality of terms from the corpus of data objects along with semantic annotations indicating semantic information. The device comprises an input unit, a processing unit, and an output unit. The input unit is configured to receive a syntactic query and a semantic query.
    Type: Grant
    Filed: July 12, 2021
    Date of Patent: July 9, 2024
    Assignee: BASF SE
    Inventors: Henning Schwabe, Arunav Mishra, Juergen Mueller, Michael Schuhmacher
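The parallel syntactic and semantic indexes described above can be sketched as two annotated term maps queried together; the index contents and annotation labels are invented for illustration:

```python
# Toy sketch: answer a combined syntactic + semantic query by
# intersecting hits from two parallel annotated indexes.
syntactic_index = {        # (term, POS annotation) -> doc ids
    ("run", "NOUN"): {1},
    ("run", "VERB"): {2, 3},
}
semantic_index = {         # (term, concept annotation) -> doc ids
    ("run", "EXERCISE"): {2},
    ("run", "EXECUTION"): {3},
}

def search(term, pos, concept):
    """Return documents matching both the syntactic and semantic query."""
    syn = syntactic_index.get((term, pos), set())
    sem = semantic_index.get((term, concept), set())
    return syn & sem

print(search("run", "VERB", "EXERCISE"))   # docs where "run" is a verb
                                           # meaning physical exercise
```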
  • Patent number: 12026470
    Abstract: Various techniques are disclosed, including: receiving, at a multiplatform management system, a communication from a computing device via a groupware platform, the multiplatform management system interfacing with multiple disparate platforms including the groupware platform and an image processing platform; determining an event type based on the communication from the computing device, to identify a cloud platform to be selected from among the plurality of disparate platforms based on detection of an image or text in the communication from the groupware platform; and identifying an action to be performed by the selected cloud platform based on the determined event type.
    Type: Grant
    Filed: July 3, 2023
    Date of Patent: July 2, 2024
    Assignee: Certinia Inc.
    Inventors: Stephen Paul Willcock, Matthew David Wood
  • Patent number: 12026460
    Abstract: An object is to generate, at low cost, dialogue data for generating question sentences that delve deeply into a conversation. For each of a plurality of pieces of data, each including a set of a first utterance sentence uttered by a first user, a second utterance sentence uttered by a second user in response to the first utterance sentence, and a third utterance sentence uttered by the first user in response to the second utterance sentence, a dialogue data generation unit 110 generates the set of the first utterance sentence and the second utterance sentence of the data as dialogue data when the second utterance sentence of the data is a question sentence using an interrogative.
    Type: Grant
    Filed: May 7, 2019
    Date of Patent: July 2, 2024
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Taichi Katayama, Atsushi Otsuka, Ko Mitsuda, Kuniko Saito, Junji Tomita
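The generation rule above (keep an utterance pair when the response is a question using an interrogative) can be sketched directly; the interrogative list and sample triples are illustrative:

```python
# Sketch of the rule: from (utterance1, utterance2, utterance3) triples,
# keep (utterance1, utterance2) as dialogue data when utterance2 is a
# question sentence using an interrogative.
INTERROGATIVES = {"who", "what", "when", "where", "why", "how"}

def is_interrogative_question(sentence: str) -> bool:
    words = set(sentence.lower().rstrip("?").split())
    return sentence.strip().endswith("?") and bool(words & INTERROGATIVES)

def generate_dialogue_data(triples):
    return [(u1, u2) for (u1, u2, u3) in triples
            if is_interrogative_question(u2)]

triples = [
    ("I went hiking yesterday.", "Where did you go?", "Up in the hills."),
    ("I'm tired.", "Get some rest.", "I will."),
]
print(generate_dialogue_data(triples))
```

Only the first triple survives, because its second utterance is an interrogative question that elicited a deeper follow-up.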
  • Patent number: 12020708
    Abstract: Methods and systems for enabling an efficient review of meeting content via a metadata-enriched, speaker-attributed transcript are disclosed. By incorporating speaker diarization and other metadata, the system can provide a structured and effective way to review and/or edit the transcript. One type of metadata can be image or video data to represent the meeting content. Furthermore, the present subject matter utilizes a multimodal diarization model to identify and label different speakers. The system can synchronize various sources of data, e.g., audio channel data, voice feature vectors, acoustic beamforming, image identification, and extrinsic data, to implement speaker diarization.
    Type: Grant
    Filed: October 11, 2021
    Date of Patent: June 25, 2024
    Assignee: SoundHound AI IP, LLC.
    Inventors: Kiersten L. Bradley, Ethan Coeytaux, Ziming Yin
  • Patent number: 12008319
    Abstract: Disclosed are a method and apparatus for selecting answers to idiom fill-in-the-blank questions, a computer device, and a storage medium. The method includes: obtaining a question text of idiom fill-in-the-blank questions, the question text including a fill-in-the-blank text and n candidate idioms, and the fill-in-the-blank text including m fill-in-the-blanks to be filled in with the candidate idioms; obtaining an explanatory text of all the candidate idioms; obtaining, through an idiom selection fill-in-the-blank model, a confidence that each fill-in-the-blank is filled in with each candidate idiom; selecting m idioms from the n candidate idioms to form multiple groups of answers; calculating a sum of the confidences that the fill-in-the-blanks are filled in with the candidate idioms in each group of answers; and obtaining a group of answers with the highest confidence sum as answers to the idiom fill-in-the-blank questions.
    Type: Grant
    Filed: November 30, 2020
    Date of Patent: June 11, 2024
    Assignee: PING AN TECHNOLOGY (SHENZHEN) CO., LTD.
    Inventors: Xiang Liu, Xiuling Chen
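The answer-selection step above (choose m distinct idioms out of n to maximize the sum of per-blank confidences) can be sketched by brute force over assignments; the confidence matrix values are made up:

```python
from itertools import permutations

# Sketch: conf[blank][idiom] is the model's confidence that a blank is
# filled with a candidate idiom; pick the assignment of m distinct
# idioms to m blanks with the highest confidence sum.
conf = [
    [0.9, 0.1, 0.2, 0.1],   # blank 0 vs. 4 candidate idioms
    [0.2, 0.8, 0.1, 0.3],   # blank 1
]

def best_answers(conf):
    m, n = len(conf), len(conf[0])
    best = max(permutations(range(n), m),
               key=lambda assign: sum(conf[b][i] for b, i in enumerate(assign)))
    return list(best)

print(best_answers(conf))  # index of the chosen idiom for each blank
```

Brute force is exponential in m; a real system would likely use a faster assignment method, but the objective (highest confidence sum over a group of answers) is the one the abstract states.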
  • Patent number: 12002475
    Abstract: The present disclosure provides an electronic device and a control method thereof. The electronic device of the present disclosure includes: a memory in which a speaker model including acoustic characteristics and context information of a first user voice is stored; and a processor for comparing a degree of similarity between the acoustic characteristics of the first user voice included in the speaker model and the acoustic characteristics of a second user voice with a threshold value that changes according to a degree of similarity between the context information included in the speaker model and the context information of the second user voice, and then performing authentication on the second user voice.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: June 4, 2024
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventor: Jaesung Kwon
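The variable-threshold check described above can be sketched as follows, assuming the threshold is relaxed as context similarity increases; the base threshold and adjustment are illustrative constants, not from the patent:

```python
# Sketch: the acoustic-similarity threshold for accepting the second
# voice decreases when the new utterance's context closely matches the
# enrolled context.
BASE_THRESHOLD = 0.80
MAX_RELAXATION = 0.10

def authenticate(acoustic_sim: float, context_sim: float) -> bool:
    """Accept when acoustic similarity clears a threshold that is
    relaxed in proportion to context similarity."""
    threshold = BASE_THRESHOLD - MAX_RELAXATION * context_sim
    return acoustic_sim >= threshold

print(authenticate(0.75, 0.9))   # relaxed threshold -> accepted
print(authenticate(0.75, 0.0))   # strict threshold -> rejected
```

The same acoustic score thus passes or fails depending on how familiar the context is, which is the mechanism the abstract claims.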
  • Patent number: 12001809
    Abstract: Machine learning translation models may be selectively tuned to provide custom machine translations. A request to translate input text from an input language to a target language may be received. A tuning data set for translating the input text to the target language may be identified and searched to select pairs of texts in the tuning data according to comparisons with the input text. A machine learning model used to translate into the target language may be tuned using only second texts in the target language in the selected pairs of texts. The tuned machine learning model may then be used to translate the input text into the target language.
    Type: Grant
    Filed: November 18, 2021
    Date of Patent: June 4, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Anna Currey, Dengke Liu, Aakash Upadhyay, Prashant Mathur, Georgiana Dinu, Eric J. Nowell
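The selection step above (compare tuning pairs against the input text, then tune on only the target-side texts of the selected pairs) can be sketched with Jaccard word overlap as the comparison; the similarity measure and data are assumptions:

```python
# Sketch: rank tuning pairs by source-side overlap with the input text
# and return only the target-side texts of the top pairs, which would
# then be used to tune the translation model.
tuning_pairs = [
    ("the cat sat on the mat", "le chat s'est assis sur le tapis"),
    ("stock prices fell sharply", "les cours ont fortement chute"),
    ("a cat sleeps on a mat", "un chat dort sur un tapis"),
]

def jaccard(a: str, b: str) -> float:
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / len(sa | sb)

def select_target_texts(input_text: str, pairs, k: int = 2):
    ranked = sorted(pairs, key=lambda p: jaccard(input_text, p[0]),
                    reverse=True)
    return [tgt for _, tgt in ranked[:k]]   # target-side texts only

print(select_target_texts("the cat is on the mat", tuning_pairs))
```

A production system would presumably use embedding similarity rather than word overlap, but the flow (select by comparison with the input, tune on the target side only) matches the abstract.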
  • Patent number: 11996102
    Abstract: Implementations relate to receiving natural language input that requests an automated assistant to provide information and processing the natural language input to identify the requested information and to identify one or more predicted actions. Those implementations further cause a computing device, at which the natural language input is received, to render the requested information and the one or more predicted actions in response to the natural language input. Yet further, those implementations, in response to the user confirming a rendered predicted action, cause the automated assistant to initialize the predicted action.
    Type: Grant
    Filed: May 25, 2023
    Date of Patent: May 28, 2024
    Assignee: GOOGLE LLC
    Inventors: Lucas Mirelmann, Zaheed Sabur, Bohdan Vlasyuk, Marie Patriarche Bledowski, Sergey Nazarov, Denis Burakov, Behshad Behzadi, Michael Golikov, Steve Cheng, Daniel Cotting, Mario Bertschler
  • Patent number: 11996081
    Abstract: Techniques for generating a visual response to a user input are described. A system may receive a natural language input and use a machine learning model to determine that a first component is to determine a response to the natural language input and that a second component is to determine supplemental content related to the natural language input. The system may receive, from the first component, first image data corresponding to the response. The system may also receive, from the second component, second image data corresponding to the supplemental content. The system may send, to a display, a command to present the first image data and the second image data.
    Type: Grant
    Filed: May 26, 2023
    Date of Patent: May 28, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Vasiliy Radostev, Ruhi Sarikaya, Rekha Seshadrinathan, Abhinav Sethy, Chetan Nagaraj Naik, Anjishnu Kumar
  • Patent number: 11996117
    Abstract: A toxicity moderation system has an input configured to receive speech from a speaker. The system includes a multi-stage toxicity machine learning system having a first stage and a second stage. The first stage is trained to analyze the received speech to determine whether a toxicity level of the speech meets a toxicity threshold. The first stage is also configured to filter-through, to the second stage, speech that meets the toxicity threshold, and is further configured to filter-out speech that does not meet the toxicity threshold.
    Type: Grant
    Filed: October 8, 2021
    Date of Patent: May 28, 2024
    Inventors: William Carter Huffman, Michael Pappas, Henry Howie
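The two-stage design above (a cheap first stage passes only threshold-meeting speech to a second stage) can be sketched with keyword scoring standing in for the trained first-stage model; the word list and threshold are invented:

```python
# Sketch: the first stage scores every utterance and filters through
# only those meeting the toxicity threshold; the rest are filtered out
# and never reach the (more expensive) second stage.
TOXIC_WORDS = {"idiot", "stupid"}
THRESHOLD = 0.5

def first_stage_score(utterance: str) -> float:
    """Fraction of words flagged as toxic; stand-in for a trained model."""
    words = utterance.lower().split()
    return sum(w in TOXIC_WORDS for w in words) / max(len(words), 1)

def filter_to_second_stage(utterances):
    return [u for u in utterances if first_stage_score(u) >= THRESHOLD]

stream = ["nice shot", "you stupid idiot"]
print(filter_to_second_stage(stream))   # only the second passes through
```

The point of the staging is cost: most speech is discarded cheaply, so the second stage only analyzes the small filtered-through fraction.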
  • Patent number: 11975729
    Abstract: A method for generating a voice announcement as feedback to a handwritten user input entered on a control device is disclosed. A list of possible whole words that can be entered by the user input is provided together with a corresponding transcription. A predetermined word end, comprising one or more characters of a whole word, is removed from the end of the whole word in accordance with a predetermined shortening rule; correspondingly, a transcription end matching the word end is determined based on a predetermined assignment rule and removed from the corresponding transcription of the whole word, generating a partial word and an associated partial transcription. The partial word and the partial transcription are added to another list.
    Type: Grant
    Filed: July 29, 2019
    Date of Patent: May 7, 2024
    Assignee: AUDI AG
    Inventor: Jan Dusik
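The shortening rule described above (strip a predetermined word end and the matching transcription end in parallel) can be sketched as follows; the German street-name example and the word-end-to-transcription mapping are illustrative assumptions:

```python
# Sketch: remove a predetermined word end from a whole word and the
# corresponding transcription end from its transcription, yielding a
# partial word and an associated partial transcription.
SHORTENING_RULES = {          # word end -> transcription end (assumed)
    "strasse": "ʃtʁasə",
    "platz": "plats",
}

def shorten(whole_word: str, transcription: str):
    for word_end, trans_end in SHORTENING_RULES.items():
        if whole_word.endswith(word_end) and transcription.endswith(trans_end):
            return (whole_word[: -len(word_end)],
                    transcription[: -len(trans_end)])
    return None   # no shortening rule applies

print(shorten("hauptstrasse", "haʊptʃtʁasə"))
```

The resulting partial word and partial transcription would be added to the second list, so partial handwritten entries can still be announced correctly.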
  • Patent number: 11978440
    Abstract: Techniques for processing input data for a detected user are described. Received image data is processed to identify an indicated user. Based on the identified user, a machine learning model is implemented. The machine learning model is then used to process input data for a user input. An action is performed using the resulting output data.
    Type: Grant
    Filed: May 25, 2023
    Date of Patent: May 7, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Deepak Yavagal, Ajith Prabhakara, John Gray
  • Patent number: 11961526
    Abstract: A method and an apparatus for calculating a downmixed signal and a residual signal are provided. According to the method, if a first target frame (the current frame or a previous frame of the current frame) is a switching frame, a to-be-encoded downmixed signal and a to-be-encoded residual signal of the subband corresponding to the preset frequency band in the current frame are calculated based on a switch fade-in/fade-out factor of a second target frame, an initial downmixed signal, and an initial residual signal of the preset frequency band.
    Type: Grant
    Filed: November 25, 2020
    Date of Patent: April 16, 2024
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Haiting Li, Bin Wang, Zexin Liu
  • Patent number: 11961506
    Abstract: An electronic apparatus includes a memory configured to store first voice recognition information related to a first language and second voice recognition information related to a second language, and a processor configured to obtain a first text corresponding to a received user voice on the basis of the first voice recognition information and, based on an entity name being included in the user voice according to the obtained first text, identify a segment of the user voice in which the entity name is included. The processor is further configured to obtain a second text corresponding to the identified segment of the user voice on the basis of the second voice recognition information, and to obtain control information corresponding to the user voice on the basis of the first text and the second text.
    Type: Grant
    Filed: February 23, 2023
    Date of Patent: April 16, 2024
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Chansik Bok, Jihun Park