Patents Examined by Athar N Pasha

Method and apparatus for speech interaction, and computer storage medium

Patent number: 11830482

Abstract: Embodiments of the present disclosure relate to a method and an apparatus for speech interaction, and a computer readable storage medium. The method may include determining text information corresponding to a received speech signal. The method also includes obtaining label information of the text information by labeling elements in the text information. In addition, the method further includes determining first intention information of the text information based on the label information. The method further includes determining a semantic of the text information based on the first intention information and the label information.

Type: Grant

Filed: June 8, 2020

Date of Patent: November 28, 2023

Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD

Inventors: Zhen Wu, Yufang Wu, Hua Liang, Jiaxiang Ge, Xingyuan Peng, Jinfeng Bai, Lei Jia

Method, apparatus and device for training network and storage medium

Patent number: 11823660

Abstract: Embodiments of the present disclosure disclose a method, apparatus and device for training a network, and a storage medium, relate to the field of artificial intelligence technology such as deep learning and speech analysis. A semantic prediction network comprises: an encoder network and at least one decoder network; and a particular solution is: acquiring a first speech feature of a target speech sample; the target speech sample being a synthesized speech sample or a real speech sample, the synthesized speech sample being attached with a sample syllable label and a semantic label comprising a value of the domain, and the real speech sample being attached with a sample syllable label; and jointly training an initial semantic prediction network and a syllable classification network using the first speech feature of the target speech sample, to obtain a trained semantic prediction network.

Type: Grant

Filed: June 21, 2021

Date of Patent: November 21, 2023

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventors: Li Chen, Saisai Zou

Voice transmission compensation apparatus, voice transmission compensation method and program

Patent number: 11806213

Abstract: A speech transmission compensation apparatus that assists discrimination of speech heard by a user, includes: one or more computers each including a memory and a processor configured to: accept input of a speech signal, detect a specific type of sound in the speech signal, analyze an acoustic characteristic of the specific type of sound in the speech signal and output the acoustic characteristic; accept input of the acoustic characteristic being output by the memory and the processor, generate a vibration signal of a duration corresponding to the acoustic characteristic and output the vibration signal; and accept input of the vibration signal being output by the memory and the processor and provide the user with vibration for the duration on the basis of the vibration signal.

Type: Grant

Filed: April 30, 2020

Date of Patent: November 7, 2023

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Asuka Ono, Momoko Nakatani, Ai Nakane, Yoko Ishii

Information processing apparatus and non-transitory computer readable medium

Patent number: 11803705

Abstract: An information processing apparatus includes a memory and a processor. The memory stores, in mutual association, a trouble category into which a trouble occurring in a product is classified and a design-element category into which a design element causing the trouble and included in the product is classified. The memory also stores, in association with each other, a design-element category into which a design element included in a product is classified and a design-requirement category into which a design requirement of the design element is classified.

Type: Grant

Filed: December 7, 2020

Date of Patent: October 31, 2023

Assignee: FUJIFILM Business Innovation Corp.

Inventors: Makoto Fuchigami, Yasuaki Miyazawa, Mari Horie, Hiroshi Murano, Nobukazu Takahashi, Kimihiro Wakabayashi, Masaki Suda, Eiji Ooyama, Hiroaki Murai, Masahiro Ishino

Speech recognition method, apparatus, device and readable storage medium

Patent number: 11798548

Abstract: The present application discloses a speech recognition method, apparatus, device and readable storage medium, and relates to the technical field of artificial intelligence. A specific implementation includes: an electronic device recognizes a speech signal to obtain a first text; if a first pinyin sequence corresponding to the first text exists in the database, the electronic device uses a correct text corresponding to the first pinyin sequence as the speech recognition result; otherwise, the electronic device performs a fuzzy matching on the first pinyin sequence to obtain multiple second pinyin sequences and second texts corresponding to the second pinyin sequences, and selects the speech recognition result from the multiple second texts.

Type: Grant

Filed: December 21, 2020

Date of Patent: October 24, 2023

Assignee: APOLLO INTELLIGENT CONNECTIVITY (BEIJING) TECHNOLOGY CO., LTD.

Inventors: Yi Zhou, Qie Yin, Long Zhang, Zhen Chen
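
The lookup-then-fuzzy-match flow described in this abstract can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: the `DATABASE` contents, the use of `difflib` string similarity as the fuzzy matcher, the choice of the closest candidate as the final result, and the fallback to the raw text when nothing matches are all hypothetical.

```python
import difflib

# Hypothetical database: space-joined pinyin sequence -> correct text.
DATABASE = {
    "da kai kong tiao": "打开空调",
    "da kai che chuang": "打开车窗",
    "guan bi kong tiao": "关闭空调",
}


def recognize(first_pinyin, first_text, database=DATABASE):
    """Return the correct text for a pinyin sequence, with a fuzzy fallback."""
    # Exact hit: use the stored correct text directly.
    if first_pinyin in database:
        return database[first_pinyin]
    # Otherwise collect several similar pinyin sequences (the "second" sequences).
    # String similarity is an assumed stand-in for the patent's fuzzy matching.
    candidates = difflib.get_close_matches(
        first_pinyin, list(database), n=3, cutoff=0.6
    )
    if not candidates:
        # Assumption: keep the recognizer's own text when nothing is close.
        return first_text
    # Assumption: select the text of the most similar candidate.
    return database[candidates[0]]
```

For example, a misrecognized sequence such as `"da kai kong diao"` is absent from the hypothetical database, so the sketch falls through to the fuzzy step and selects the text stored for the nearest sequence.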

Artificial intelligence apparatus for recognizing speech of user and method for the same

Patent number: 11776544

Abstract: An embodiment of the present invention provides an artificial intelligence (AI) apparatus for recognizing a speech of a user. The AI apparatus includes a memory to store a speech recognition model and a processor to obtain a speech signal for a user speech, to convert the speech signal into a text using the speech recognition model, to measure a confidence level for the conversion, to perform a control operation corresponding to the converted text if the measured confidence level is greater than or equal to a reference value, and to provide feedback for the conversion if the measured confidence level is less than the reference value.

Type: Grant

Filed: May 18, 2022

Date of Patent: October 3, 2023

Assignee: LG ELECTRONICS INC.

Inventors: Jaehong Kim, Hyoeun Kim, Hangil Jeong, Heeyeon Choi
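
The confidence gate described in this abstract amounts to a single comparison against a reference value. A minimal sketch, assuming a hypothetical `Recognition` record, a confidence score in [0, 1], and an arbitrary default reference of 0.8 (none of which come from the patent):

```python
from dataclasses import dataclass


@dataclass
class Recognition:
    text: str
    confidence: float  # assumed to lie in [0, 1]


def handle(recognition, reference=0.8):
    """Decide between executing the command and giving feedback."""
    # At or above the reference value: act on the converted text.
    if recognition.confidence >= reference:
        return ("execute", recognition.text)
    # Below the reference value: report back on the conversion instead.
    return ("feedback", recognition.text)
```

Note that the boundary case (confidence exactly equal to the reference value) takes the control-operation branch, matching the "greater than or equal to" wording of the abstract.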

Meaning inference from speech audio

Patent number: 11769488

Abstract: A system and method invoke a virtual assistant action, which may comprise an argument. From audio, a probability of an intent is inferred. A probability of a domain and a plurality of variable values may also be inferred. Invoking the action is in response to the intent probability exceeding a threshold. Invoking the action may also be in response to the domain probability exceeding a threshold, a variable value probability exceeding a threshold, detecting an end of utterance, and a specific amount of time having elapsed. The intent probability may increase when the audio includes speech of words with the same meaning in multiple natural languages. Invoking the action may also be conditional on the variable value exceeding its threshold within a certain period of time of the intent probability exceeding its threshold.

Type: Grant

Filed: March 3, 2022

Date of Patent: September 26, 2023

Assignee: SoundHound AI IP, LLC

Inventors: Sudharsan Krishnaswamy, Maisy Wieman, Jonah Probell

Intent recognition and emotional text-to-speech learning

Patent number: 11727914

Abstract: An example intent-recognition system comprises a processor and memory storing instructions. The instructions cause the processor to receive speech input comprising spoken words. The instructions cause the processor to generate text results based on the speech input and generate acoustic feature annotations based on the speech input. The instructions also cause the processor to apply an intent model to the text result and the acoustic feature annotations to recognize an intent based on the speech input. An example system for adapting an emotional text-to-speech model comprises a processor and memory. The memory stores instructions that cause the processor to receive training examples comprising speech input and receive labelling data comprising emotion information associated with the speech input. The instructions also cause the processor to extract audio signal vectors from the training examples and generate an emotion-adapted voice font model based on the audio signal vectors and the labelling data.

Type: Grant

Filed: December 24, 2021

Date of Patent: August 15, 2023

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Pei Zhao, Kaisheng Yao, Max Leung, Bo Yan, Jian Luan, Yu Shi, Malone Ma, Mei-Yuh Hwang

System and method for automatic testing of conversational assistance

Patent number: 11710476

Abstract: A voice recognition system includes a microphone configured to receive one or more spoken dialogue commands from a user in a voice recognition session. The system also includes a processor in communication with the microphone. The processor is configured to receive one or more audio files associated with one or more audio events associated with the voice recognition system, execute the one or more audio files in a voice recognition session in an audio event, and output a log report indicating a result of the audio events with the voice recognition session.

Type: Grant

Filed: April 27, 2020

Date of Patent: July 25, 2023

Assignee: ROBERT BOSCH GMBH

Inventors: Xiaowei Zhou, Pongtep Angkititrakul

Spoken notifications

Patent number: 11705130

Abstract: An example method includes, at an electronic device: receiving an indication of a notification; in accordance with receiving the indication of the notification: obtaining one or more data streams from one or more sensors; determining, based on the one or more data streams, whether a user associated with the electronic device is speaking; and in accordance with a determination that the user is not speaking: causing an output associated with the notification to be provided.

Type: Grant

Filed: November 12, 2021

Date of Patent: July 18, 2023

Assignee: Apple Inc.

Inventors: William M. York, Rebecca P. Fish, Gagan A. Gupta, Xinyuan Huang, Heriberto Nieto, Benjamin S. Phipps, Kurt Piersol

Automated graph based information extraction

Patent number: 11669680

Abstract: A set of sentences within a natural language text document are parsed, generating a word-level graph corresponding to a sentence in the set of sentences. Within the word-level graph using a trained entity identification model, a set of entity candidates are identified. From a set of graphs modelling relationships between portions of the set of sentences, a set of embeddings is generated. From a set of pairs of embeddings in the set of embeddings using a set of deconvolution layers, a set of links between entity candidates within the set of entity candidates is extracted. From the set of links and the set of entity candidates, an output graph modelling linkages between portions of the set of sentences within the natural language text document is generated.

Type: Grant

Filed: February 2, 2021

Date of Patent: June 6, 2023

Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Lingfei Wu, Tengfei Ma, Tian Gao, Xiaojie Guo

Graph-based labeling rule augmentation for weakly supervised training of machine-learning-based named entity recognition

Patent number: 11669740

Abstract: Systems and methods for training a machine-learning model for named-entity recognition. A rule graph is constructed including a plurality of nodes each corresponding to a different labeling rule of a set of labeling rules (including a set of seeding rules of known labeling accuracy and a plurality of candidate rules of unknown labeling accuracy). The nodes are coupled to other nodes based on which rules exhibit the highest semantic similarity. A labeling accuracy metric is estimated for each candidate rule by propagating a labeling confidence metric through the rule graph from the seeding rules to each candidate rule. A subset of labeling rules is then identified by ranking the rules by their labeling confidence metric. The identified subset of labeling rules is applied to unlabeled data to generate a set of weakly labeled named entities and the machine-learning model is trained based on the set of weakly labeled named entities.

Type: Grant

Filed: February 25, 2021

Date of Patent: June 6, 2023

Assignee: Robert Bosch GmbH

Inventors: Xinyan Zhao, Haibo Ding, Zhe Feng

Suggesting an alternative interface when environmental interference is expected to inhibit certain automated assistant interactions

Patent number: 11657817

Abstract: Implementations set forth relate to suggesting an alternate interface modality when an automated assistant and/or a user is expected to not understand a particular interaction between the user and the automated assistant. In some instances, the automated assistant can pre-emptively determine that a forthcoming and/or ongoing interaction between a user and an automated assistant may experience interference. Based on this determination, the automated assistant can provide an indication that the interaction may not be successful and/or that the user should interact with the automated assistant through a different modality. For example, the automated assistant can render a keyboard interface at a portable computing device when the automated assistant determines that an audio interface of the portable computing device is experiencing interference.

Type: Grant

Filed: November 20, 2020

Date of Patent: May 23, 2023

Assignee: GOOGLE LLC

Inventors: Matthew Sharifi, Victor Carbune

Minimum word error rate training for attention-based sequence-to-sequence models

Patent number: 11646019

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable storage media, for speech recognition using attention-based sequence-to-sequence models. In some implementations, audio data indicating acoustic characteristics of an utterance is received. A sequence of feature vectors indicative of the acoustic characteristics of the utterance is generated. The sequence of feature vectors is processed using a speech recognition model that has been trained using a loss function that uses N-best lists of decoded hypotheses, the speech recognition model including an encoder, an attention module, and a decoder. The encoder and decoder each include one or more recurrent neural network layers. A sequence of output vectors representing distributions over a predetermined set of linguistic units is obtained. A transcription for the utterance is obtained based on the sequence of output vectors. Data indicating the transcription of the utterance is provided.

Type: Grant

Filed: July 27, 2021

Date of Patent: May 9, 2023

Assignee: Google LLC

Inventors: Rohit Prakash Prabhavalkar, Tara N. Sainath, Yonghui Wu, Patrick An Phu Nguyen, Zhifeng Chen, Chung-Cheng Chiu, Anjuli Patricia Kannan

Hybrid conversations with human and virtual assistants

Patent number: 11646013

Abstract: In some examples, a user, either a customer or potential customer of a business, engages in conversations with a virtual assistant (VA) provided by the business. The virtual assistant (VA) is further supported by one or more human assistants (HA), if needed. In embodiments, to facilitate seamless transitions between a VA and a HA, when needed, an intelligent decision maker (IDM) is provided. The IDM receives a user question and a proposed answer to the question from a VA, evaluates the proposed answer in the context of the conversation, and determines if the proposed answer requires further review by an HA. In response to a determination that the proposed answer requires further review, the IDM sends the proposed answer to an HA, and, in response to an indication by the HA, takes further action in the conversation.

Type: Grant

Filed: December 30, 2019

Date of Patent: May 9, 2023

Assignee: International Business Machines Corporation

Inventors: Khoi-Nguyen Dao Tran, Jingshi Li, Mukesh Kumar Mohania, Jaysen Ollerenshaw

Method for embedding and executing audio semantics

Patent number: 11640829

Abstract: Aspects of the subject disclosure may include, for example, a device that includes a processing system having a processor and a memory that stores executable instructions that, when executed by the processing system, facilitate performance of operations, where the operations include determining parameters for adapting audio in the content to the device, wherein the device renders the content, and wherein the parameters are based on semantic metadata embedded in the content, adapting the audio in the content based on the parameters, and rendering the content, as adapted by the parameters, to represent a semantic in the semantic metadata. Other embodiments are disclosed.

Type: Grant

Filed: May 27, 2021

Date of Patent: May 2, 2023

Assignee: AT&T Intellectual Property I, L.P.

Inventors: Eric Zavesky, Jason DeCuir, Robert Gratz

Electronic apparatus and controlling method thereof

Patent number: 11631413

Abstract: An electronic apparatus is provided. The electronic apparatus includes a memory and a processor configured to control the electronic apparatus to: classify a plurality of input data into a plurality of types to store in the memory, determine at least one among the input data of the classified plurality of types based on a voice command being recognized among the input data, and provide response information corresponding to the voice command based on the input data of the determined type.

Type: Grant

Filed: June 29, 2022

Date of Patent: April 18, 2023

Assignee: Samsung Electronics Co., Ltd.

Inventors: Inchul Hwang, Hyeonmok Ko, Munjo Kim, Hojung Lee

Multi-task knowledge distillation for language model

Patent number: 11620515

Abstract: Systems and methods are provided that employ knowledge distillation under a multi-task learning setting. In some embodiments, the systems and methods are implemented with a larger teacher model and a smaller student model, each of which comprise one or more shared layers and a plurality of task layers for performing multiple tasks. During training of the teacher model, its shared layers are initialized, and then the teacher model is multi-task refined. The teacher model predicts teacher logits. During training of the student model, its shared layers are initialized. Knowledge distillation is employed to transfer knowledge from the teacher model to the student model by the student model updating its shared layers and task layers, for example, according to the teacher logits of the teacher model. Other features are also provided.

Type: Grant

Filed: December 16, 2019

Date of Patent: April 4, 2023

Assignee: salesforce.com, inc.

Inventors: Linqing Liu, Caiming Xiong
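
The abstract says the student is updated according to the teacher logits but does not state the loss. One common choice, shown here purely as an assumption, is the KL divergence between temperature-softened teacher and student distributions. The function names and the default temperature of 2.0 are hypothetical.

```python
import math


def softmax(logits, temperature=1.0):
    """Numerically stable softmax over temperature-scaled logits."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(student_logits, teacher_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions (assumed loss form)."""
    p = softmax(teacher_logits, temperature)  # teacher soft targets
    q = softmax(student_logits, temperature)  # student predictions
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
```

The loss is zero when the student reproduces the teacher logits exactly and grows as the two distributions diverge. In a multi-task setting one such term would presumably be computed per task head, though the abstract does not say how the terms are combined.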

Systems and methods for machine learning based modeling

Patent number: 11604984

Abstract: A system comprising a first computing apparatus in communication with multiple second computing apparatuses. The first computing apparatus may obtain a plurality of first trained machine learning models for a task from the multiple second computing apparatuses. At least a portion of parameter values of the plurality of first trained machine learning models may be different from each other. The first computing apparatus may also obtain a plurality of training samples. The first computing apparatus may further determine, based on the plurality of training samples, a second trained machine learning model by learning from the plurality of first trained machine learning models.

Type: Grant

Filed: November 18, 2019

Date of Patent: March 14, 2023

Assignee: SHANGHAI UNITED IMAGING INTELLIGENCE CO., LTD.

Inventors: Abhishek Sharma, Arun Innanje, Ziyan Wu, Shanhui Sun, Terrence Chen

Systems, methods, and computer-readable media for improved real-time audio processing

Patent number: 11593633

Abstract: Systems, methods, and computer-readable storage devices are disclosed for improved real-time audio processing. One method includes: constructing a deep neural network model, including a plurality of at least one-bit neurons, configured to output a predicted label of audio data, the plurality of at least one-bit neurons arranged in a plurality of layers, including at least one hidden layer, and being connected by a plurality of connections, each connection having at least a one-bit weight, wherein one or both of the plurality of at least one-bit neurons and the plurality of connections have a reduced bit precision; receiving a training data set, the training data set including audio data; training the deep neural network model using the training data set; and outputting a trained deep neural network model configured to output a predicted label of real-time audio data.

Type: Grant

Filed: April 13, 2018

Date of Patent: February 28, 2023

Assignee: Microsoft Technology Licensing, LLC

Inventors: Ivan Jelev Tashev, Shuayb M Zarar, Matthai Philipose, Jong Hwan Ko
