Semantic Context, e.g., Disambiguation of the Recognition Hypotheses Based on Word Meaning, etc. (EPO) Patents (Class 704/E15.024)

Speech recognition using neural networks

Patent number: 12243515

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for speech recognition using neural networks. A feature vector that models audio characteristics of a portion of an utterance is received. Data indicative of latent variables of multivariate factor analysis is received. The feature vector and the data indicative of the latent variables are provided as input to a neural network. A candidate transcription for the utterance is determined based on at least an output of the neural network.

Type: Grant

Filed: March 2, 2023

Date of Patent: March 4, 2025

Assignee: Google LLC

Inventors: Andrew W. Senior, Ignacio L. Moreno

Component libraries for voice interaction services

Patent number: 12236163

Abstract: The disclosed embodiments include computerized methods, systems, and devices, including computer programs encoded on a computer storage medium, for integrating voice-based interaction and control into a native graphical user interface (GUI) of an executed application. For example, a communications device may obtain component data identifying a plurality of components of a voice-user interface from a computing system maintained by a voice-service provider, and may execute an application linked to a corresponding one of the components of the voice-user interface. The communications device may generate the native GUI based on an output of the executed application, and may generate an interface element representative of the corresponding one of the components of the voice-user interface. The communications device may present the generated interface element within the native GUI, which may embed the corresponding component of the voice-user interface into the native GUI.

Type: Grant

Filed: August 2, 2021

Date of Patent: February 25, 2025

Assignee: GOOGLE LLC

Inventors: Sang Soo Sung, Lantian Zheng, Haywai Hayward Chan, Chen Liu, Liuyi Sun, David P. Whipp

Multi-model approach to natural language processing and recommendation generation

Patent number: 12112133

Abstract: In some implementations, a device may monitor a set of data sources to generate a set of language models corresponding to the set of data sources. The device may determine a plurality of sets of keyword groups. The device may generate a plurality of sets of skill catalogs. The device may receive a source document for processing. The device may process the source document to extract a key phrase set and to determine a first similarity distance. The device may select a corresponding skill catalog and an associated language model based on a relevancy value. The device may determine second similarity distances between the source document and one or more target documents using the corresponding skill catalog and the associated language model. The device may output information associated with one or more target documents based at least in part on the second similarity distances.

Type: Grant

Filed: August 13, 2021

Date of Patent: October 8, 2024

Assignee: Avanade Holdings LLC

Inventors: Takashi Ogura, Yu Nakahara, Naoki Hirose

Methods and apparatus for hybrid speech recognition processing

Patent number: 11990135

Abstract: Methods and apparatus for selectively performing speech processing in a hybrid speech processing system. The hybrid speech processing system includes at least one mobile electronic device and a network-connected server remotely located from the at least one mobile electronic device. The mobile electronic device is configured to use an embedded speech recognizer to process at least a portion of input audio to produce recognized text. A controller on the mobile electronic device determines whether to send information from the mobile electronic device to the server for speech processing. The determination of whether to send the information is based, at least in part, on an analysis of the input audio, the recognized text, or a semantic category associated with the recognized text.

Type: Grant

Filed: February 9, 2021

Date of Patent: May 21, 2024

Assignee: Microsoft Technology Licensing, LLC

Inventors: Daniel Willett, Joel Pinto, William F. Ganong, III
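
As a rough illustration of the send-or-stay-local determination described in the abstract above, the Python sketch below decides whether to forward input to the server based on the semantic category of the recognized text and on a recognizer confidence score. The `LocalResult` type, the category names, the confidence field, and the 0.8 threshold are all hypothetical and do not come from the patent.

```python
from dataclasses import dataclass

# Hypothetical semantic categories that this sketch always routes to the server.
SERVER_CATEGORIES = {"web_search", "dictation"}


@dataclass
class LocalResult:
    text: str          # text produced by the embedded recognizer
    confidence: float  # assumed confidence score in the range 0.0 to 1.0
    category: str      # assumed semantic category label for the text


def should_send_to_server(result: LocalResult, min_confidence: float = 0.8) -> bool:
    """Return True when the input should be forwarded to the remote server."""
    # Categories in the server set are forwarded regardless of confidence.
    if result.category in SERVER_CATEGORIES:
        return True
    # Otherwise forward only when the local recognizer is not confident enough.
    return result.confidence < min_confidence
```

With these assumed values, a confident local result such as `LocalResult("call mom", 0.95, "voice_dial")` stays on the device, while a low-confidence result or a `web_search` result is forwarded.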

Automatically determining language for speech recognition of spoken utterance received via an automated assistant interface

Patent number: 11798541

Abstract: Determining a language for speech recognition of a spoken utterance received via an automated assistant interface for interacting with an automated assistant. Implementations can enable multilingual interaction with the automated assistant, without requiring a user to explicitly designate a language to be utilized for each interaction. Implementations determine a user profile that corresponds to audio data that captures a spoken utterance, and utilize language(s), and optionally corresponding probabilities, assigned to the user profile in determining a language for speech recognition of the spoken utterance. Some implementations select only a subset of languages, assigned to the user profile, to utilize in speech recognition of a given spoken utterance of the user.

Type: Grant

Filed: November 16, 2020

Date of Patent: October 24, 2023

Assignee: GOOGLE LLC

Inventors: Pu-sen Chao, Diego Melendo Casado, Ignacio Lopez Moreno
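
The subset selection mentioned in the abstract above might look something like the following Python sketch, which ranks the languages assigned to a user profile by probability and keeps only the top few. The function name, the dictionary layout, and both cutoff parameters are assumptions made for illustration; the patent does not specify them.

```python
from typing import Dict, List


def select_languages(
    profile_languages: Dict[str, float],
    max_languages: int = 2,
    min_probability: float = 0.1,
) -> List[str]:
    """Pick a subset of a profile's languages to try during speech recognition."""
    # Rank languages from most to least probable for this user profile.
    ranked = sorted(profile_languages.items(), key=lambda item: item[1], reverse=True)
    # Drop unlikely languages, then cap the number of candidates.
    likely = [lang for lang, prob in ranked if prob >= min_probability]
    return likely[:max_languages]
```

For a hypothetical profile `{"en": 0.7, "es": 0.25, "fr": 0.05}`, the default settings keep English and Spanish and drop French.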

Systems and methods for reading comprehension for a question answering task

Patent number: 11775775

Abstract: Embodiments described herein provide a pipelined natural language question answering system that improves a BERT-based system. Specifically, the natural language question answering system uses a pipeline of neural networks each trained to perform a particular task. The context selection network identifies premium context from context for the question. The question type network identifies the natural language question as a yes, no, or span question and a yes or no answer to the natural language question when the question is a yes or no question. The span extraction model determines an answer span to the natural language question when the question is a span question.

Type: Grant

Filed: November 26, 2019

Date of Patent: October 3, 2023

Assignee: Salesforce.com, Inc.

Inventors: Akari Asai, Kazuma Hashimoto, Richard Socher, Caiming Xiong

Mandarin and dialect mixed modeling and speech recognition

Patent number: 11688391

Abstract: The present disclosure provides a modeling method for speech recognition and a device. The method includes: determining N types of tags; training a neural network according to speech data of Mandarin to generate a recognition model whose outputs are the N types of tags; inputting speech data of each dialect into the recognition model to obtain an output tag of each frame of the speech data of each dialect; determining, according to the output tags and tagged true tags, error rates of the N types of tags for each dialect; generating M types of target tags according to tags with error rates greater than a preset threshold; and training an acoustic model according to third speech data of Mandarin and third speech data of the P dialects, outputs of the acoustic model being the N types of tags and the M types of target tags corresponding to each dialect.

Type: Grant

Filed: April 8, 2020

Date of Patent: June 27, 2023

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO.

Inventor: Shenglong Yuan
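
One step in the abstract above, picking the tags whose error rate for a dialect exceeds a preset threshold, can be sketched in Python as follows. The function name, the per-frame list inputs, and the way the error rate is computed (mismatched frames divided by frames carrying that true tag) are assumptions for illustration, not the patented method.

```python
from collections import Counter
from typing import List, Sequence


def high_error_tags(
    output_tags: Sequence[str],
    true_tags: Sequence[str],
    threshold: float,
) -> List[str]:
    """Return tags whose per-tag error rate exceeds the threshold for one dialect."""
    # Count how many frames carry each true tag.
    totals = Counter(true_tags)
    # Count frames where the model's output tag differs from the true tag.
    errors = Counter(true for out, true in zip(output_tags, true_tags) if out != true)
    # Keep tags whose error rate is strictly greater than the threshold.
    return sorted(tag for tag in totals if errors[tag] / totals[tag] > threshold)
```

In this sketch the returned tags would be the candidates from which dialect-specific target tags are generated.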

Electronic apparatus and controlling method thereof

Patent number: 11586689

Abstract: An electronic apparatus and a controlling method thereof are provided. The electronic apparatus includes a memory configured to store at least one instruction, and a processor configured to execute the at least one instruction to control the electronic apparatus to: determine a keyword from a query based on the query being input, obtain a word related to the keyword based on information on a user preference, and provide a response to the user query based on the keyword and the word. The processor may be configured to control the electronic apparatus to obtain at least one word from among a plurality of candidate words corresponding to the keyword as a word related to the keyword based on the user preference information. For example, at least part of a method of providing a response to a query by the electronic apparatus may use an AI model that is trained using at least one of machine learning, neural network or deep learning algorithm.

Type: Grant

Filed: December 11, 2019

Date of Patent: February 21, 2023

Assignee: SAMSUNG ELECTRONICS CO., LTD.

Inventors: Jaechul Yang, Munjo Kim, Youngbin Shin, Changho Paeon, Inchul Hwang

Methods, apparatuses, devices, and computer-readable storage media for determining category of entity

Patent number: 11526663

Abstract: According to embodiments of the present disclosure, a method, an apparatus, a device, and a computer-readable storage medium for determining a category of an entity are provided. The method includes: based on a suffix of the entity, obtaining a suffix feature associated with the suffix; determining one or more candidate categories of the entity based on a name of the entity; and determining a set of categories of the entity based on the one or more candidate categories and the suffix feature.

Type: Grant

Filed: September 5, 2019

Date of Patent: December 13, 2022

Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.

Inventors: Jianyi Cheng, Min Zhao

Evaluating language models using negative data

Patent number: 11488579

Abstract: A method of evaluating a language model using negative data may include accessing a first language model that is trained using a first training corpus, and accessing a second language model. The second language model may be configured to generate outputs that are less grammatical than outputs generated by the first language model. The method may also include training the second language model using a second training corpus, and generating output text from the second language model. The method may further include testing the first language model using the output text from the second language model.

Type: Grant

Filed: June 2, 2020

Date of Patent: November 1, 2022

Assignee: Oracle International Corporation

Inventors: Michael Louis Wick, Jean-Baptiste Frederic George Tristan, Jason Peck

Evaluating chatbots for knowledge gaps

Patent number: 11416682

Abstract: Knowledge gaps in a chatbot are identified with reference to a domain-specific document and a set of QA pairs of the chatbot. Entities and/or entity values associated with the document are compared to the entities and/or entity values of the QA pairs. Entities of the document not associated with the QA pairs are identified as knowledge gaps. The QA pairs and knowledge gaps are ranked by relevance to the domain.

Type: Grant

Filed: July 1, 2020

Date of Patent: August 16, 2022

Assignee: International Business Machines Corporation

Inventors: Hima Patel, Jayachandu Bandlamudi, Kuntal Dey, Daivik Swarup Oggu Venkata
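
The comparison described in the abstract above amounts to a set difference between the entities found in the document and those covered by the chatbot's QA pairs. A minimal Python sketch follows, assuming entities have already been extracted and that each QA pair is a dictionary with a hypothetical `entities` field; relevance ranking is left out.

```python
from typing import Dict, Iterable, List


def find_knowledge_gaps(
    document_entities: Iterable[str],
    qa_pairs: Iterable[Dict[str, object]],
) -> List[str]:
    """Return document entities that no QA pair mentions."""
    # Collect every entity that appears in at least one QA pair.
    covered = set()
    for pair in qa_pairs:
        covered.update(pair["entities"])
    # Entities present in the document but absent from the QA pairs are gaps.
    return sorted(set(document_entities) - covered)
```

For example, a document that mentions `refund`, `shipping`, and `warranty`, checked against QA pairs covering only `refund` and `shipping`, yields `warranty` as the gap.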

Method and system for prompt construction for selection from a list of acoustically confusable items in spoken dialog systems

Patent number: 8909528

Abstract: A method (and system) of determining confusable list items and resolving this confusion in a spoken dialog system includes receiving user input, processing the user input and determining if a list of items needs to be played back to the user, retrieving the list to be played back to the user, identifying acoustic confusions between items on the list, changing the items on the list as necessary to remove the acoustic confusions, and playing unambiguous list items back to the user.

Type: Grant

Filed: May 9, 2007

Date of Patent: December 9, 2014

Assignee: Nuance Communications, Inc.

Inventors: Ellen Marie Eide, Vaibhava Goel, Ramesh Gopinath, Osamuyimen T. Stewart
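
To make the "identifying acoustic confusions between items on the list" step in the abstract above concrete, the Python sketch below flags pairs of list items that look alike. It uses spelling similarity from the standard library's `difflib.SequenceMatcher` as a crude stand-in for acoustic similarity, which is a substitution made here for illustration; a real system would presumably compare pronunciations. The function name and the 0.8 threshold are likewise assumptions.

```python
from difflib import SequenceMatcher
from itertools import combinations
from typing import List, Sequence, Tuple


def confusable_pairs(items: Sequence[str], threshold: float = 0.8) -> List[Tuple[str, str]]:
    """Return pairs of items whose spelling similarity meets the threshold."""
    pairs = []
    for first, second in combinations(items, 2):
        # Spelling similarity in [0, 1]; a proxy for acoustic similarity only.
        score = SequenceMatcher(None, first.lower(), second.lower()).ratio()
        if score >= threshold:
            pairs.append((first, second))
    return pairs
```

A dialog system could then rephrase or expand any flagged items (for instance by adding a distinguishing detail) before reading the list back to the user.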

REAL-TIME SEMANTIC ANNOTATION SYSTEM AND THE METHOD OF CREATING ONTOLOGY DOCUMENTS ON THE FLY FROM NATURAL LANGUAGE STRING ENTERED BY USER

Publication number: 20100114563

Abstract: Disclosed herein are a real-time semantic annotation system and a method of converting user-entered natural language strings into semantically-readable knowledge structure documents using the system in real time. The real-time semantic annotation system includes a natural language character string input device for enabling a user to enter natural language character strings, a character string pattern triplet-mapping table for storing natural language character string patterns and their corresponding triplets, a triplet extraction device for converting the entered natural language character strings into triplets by analyzing and processing the entered natural language character strings using the pattern-triplet mapping table, an alternative word recommendation device for providing notification that a user should enter an alternative word, and a machine-readable document generation device for generating machine-readable documents from the triplets using a semantically-readable knowledge structure.

Type: Application

Filed: November 2, 2009

Publication date: May 6, 2010

Inventors: Key-Sun Choi, Jinhyun Ahn, Jason J. Jung
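
The pattern-triplet mapping table in the abstract above can be pictured as a list of string patterns, each paired with the predicate of the triplet it produces. The Python sketch below is one possible reading: the two regular-expression patterns, the predicate names, and the function name are invented for illustration and are not taken from the application.

```python
import re
from typing import Optional, Tuple

# Hypothetical mapping table: each pattern maps to a triplet predicate.
PATTERN_TABLE = [
    (re.compile(r"^(?P<s>.+?) is an? (?P<o>.+)$"), "isA"),
    (re.compile(r"^(?P<s>.+?) lives in (?P<o>.+)$"), "livesIn"),
]


def extract_triplet(text: str) -> Optional[Tuple[str, str, str]]:
    """Convert a sentence into a (subject, predicate, object) triplet, if a pattern matches."""
    # Trim surrounding whitespace and a trailing period before matching.
    cleaned = text.strip().rstrip(".")
    for pattern, predicate in PATTERN_TABLE:
        match = pattern.match(cleaned)
        if match:
            return (match.group("s"), predicate, match.group("o"))
    # No pattern matched; a caller could prompt the user for different wording.
    return None
```

Under these assumed patterns, `extract_triplet("Seoul is a city.")` produces the triplet `("Seoul", "isA", "city")`, which a later stage could serialize into a machine-readable document.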

PROJECTING SYNTACTIC INFORMATION USING A BOTTOM-UP PATTERN MATCHING ALGORITHM

Publication number: 20090326925

Abstract: Embodiments for converting a token collection that is derived from a natural language expression into a computational independent model (CIM) syntax tree representation are disclosed. In accordance with one embodiment, the conversion includes deriving a plurality of tokens from a natural language expression, where each of the plurality of tokens includes at least one word. The conversion further includes transforming the plurality of tokens into a CIM syntax tree representation based on a CIM phrase tree model. The conversion also includes providing the CIM syntax tree representation to an application.

Type: Application

Filed: December 15, 2008

Publication date: December 31, 2009

Applicant: MICROSOFT CORPORATION

Inventors: Anthony L. Crider, Donald E. Baisley

SPEECH DIALOG CONTROL BASED ON SIGNAL PRE-PROCESSING

Publication number: 20080147397

Abstract: A speech dialog system interfaces a user to a computer. The system includes a signal pre-processor that processes a speech input to generate an enhanced signal and an analysis signal. A speech recognition unit may generate a recognition result based on the enhanced signal. A control unit may manage an output unit or an external device based on the information within the analysis signal.

Type: Application

Filed: December 6, 2007

Publication date: June 19, 2008

Inventors: Lars Konig, Gerhard Uwe Schmidt, Andreas Low

Semantic retrieval method and computer product

Publication number: 20080082318

Abstract: A dictionary server includes a retrieval-display processing unit. Upon receipt of a request for retrieval of semantic information related to a term from a client PC, the retrieval-display processing unit acquires the semantic information, header information, and link information related to the semantic information from knowledge reference data, dictionary content data, and dictionary data. Based on the acquired information, the retrieval-display processing unit causes the client PC to display items on a webpage related to the semantic information, the header information, and the link information.

Type: Application

Filed: October 11, 2007

Publication date: April 3, 2008

Applicant: Fujitsu Limited

Inventors: Masahiro Kataoka, Takashi Furuta, Koichi Takahashi, Takashi Tsubokura