Patents Examined by Darioush Agahi

Information processing apparatus and non-transitory computer readable medium storing program

Patent number: 11606629

Abstract: An information processing apparatus includes an acquisition unit that acquires voice data and image data, respectively, a display control unit that performs control to display the image data acquired by the acquisition unit in synchronization with the voice data, a reception unit that receives a display element to be added for display to a specific character in the image data displayed by the display control unit, and a setting unit that sets a playback period in which the specific character in the voice data is played back, as a display period of the display element received by the reception unit in the image data.

Type: Grant

Filed: July 19, 2019

Date of Patent: March 14, 2023

Assignee: FUJIFILM Business Innovation Corp.

Inventor: Mai Suzuki
Transcription of communications

Patent number: 11580985

Abstract: A method to transcribe communications may include obtaining, at a first device, an audio signal that originates at a remote device during a communication session. The audio signal may be shared between the first device and a second device. The method may also include obtaining an indication that the second device is associated with a remote transcription system and in response to the second device being associated with the remote transcription system, directing the audio signal to the remote transcription system by one of the first device and the second device instead of both the first device and the second device directing the audio signal to the remote transcription system when the second device is not associated with the remote transcription system.

Type: Grant

Filed: June 19, 2020

Date of Patent: February 14, 2023

Assignee: Sorenson IP Holdings, LLC

Inventors: Andrew Jesse Spry, David Earl Bergum
Multi-modal spoken language understanding systems

Patent number: 11562735

Abstract: A spoken language understanding (SLU) system may include an automatic speech recognizer (ASR), an audio feature extractor, an optional synchronizer and a language understanding module. The ASR may produce a first set of input data representing transcripts of utterances. The audio feature extractor may produce a second set of input data representing audio features of the utterances, in particular, non-transcript specific characteristics of the speaker in one or more portions of the utterances. The two sets of input data may be provided for the language understanding module to predict intents and slot labels for the utterances. The SLU system may use the optional synchronizer to align the two sets of input data before providing them to the language understanding module.

Type: Grant

Filed: March 31, 2020

Date of Patent: January 24, 2023

Assignee: Amazon Technologies, Inc.

Inventors: Arshit Gupta, Julian E. S. Salazar, Peng Zhang, Katrin Kirchhoff, Yi Zhang
Generating replacement sentences for a particular sentiment

Patent number: 11551010

Abstract: Certain aspects of the present disclosure provide techniques for generating a replacement sentence with the same or similar meaning but a different sentiment than an input sentence. The method generally includes receiving a request for a replacement sentence and iteratively determining a next word of the replacement sentence word-by-word based on an input sentence. Iteratively determining the next word generally includes evaluating a set of words of the input sentence using a language model configured to output candidate sentences and evaluating the candidate sentences using a sentiment model configured to output sentiment scores for the candidate sentences. Iteratively determining the next word further includes calculating convex combinations for the candidate sentences and selecting an ending word of one of the candidate sentences as the next word of the replacement sentence. The method further includes transmitting the replacement sentence in response to the request for the replacement sentence.

Type: Grant

Filed: October 6, 2021

Date of Patent: January 10, 2023

Assignee: INTUIT, INC.

Inventors: Manav Kohli, Cindy Osmon, Nicholas Roberts
Sentence extraction system, sentence extraction method, and information storage medium

Patent number: 11526674

Abstract: A text extracting system includes at least one processor configured to obtain a plurality of texts, specify at least one characteristic expression included in the plurality of texts, and extract, based on the at least one characteristic expression, at least one text to be entered into a question sentence generator from the plurality of texts, where the question sentence generator generates a question sentence from an input sentence.

Type: Grant

Filed: March 1, 2019

Date of Patent: December 13, 2022

Assignee: RAKUTEN GROUP, INC.

Inventors: Masakatsu Hamashita, Takashi Inui, Koji Murakami
Wakeword detection using a neural network

Patent number: 11521599

Abstract: A system and method performs wakeword detection using a feedforward neural network model. A first output of the model indicates when the wakeword appears on a right side of a first window of input audio data. A second output of the model indicates when the wakeword appears in the center of a second window of input audio data. A third output of the model indicates when the wakeword appears on a left side of a third window of input audio data. Using these outputs, the system and method determine a beginpoint and endpoint of the wakeword.

Type: Grant

Filed: September 20, 2019

Date of Patent: December 6, 2022

Assignee: Amazon Technologies, Inc.

Inventors: Christin Jose, Yuriy Mishchenko, Anish N. Shah, Alex Escott, Parind Shah, Shiv Naga Prasad Vitaladevuni, Thibaud Senechal
Machine learning framework for tuning interactive voice response systems

Patent number: 11521596

Abstract: An artificial intelligence (“AI”) system for tuning a machine learning interactive voice response system is provided. The AI system may perform analysis of outputs generated by the machine learning models. The AI system may determine an expected model output for a given test input. The AI system may determine accuracy, precision and recall scores for an actual output generated in response to the test input. The system may determine performance metrics for interim outputs generated by individual machine learning models within the interactive voice response system. The AI system may replace malfunctioning models with replacement models.

Type: Grant

Filed: August 14, 2020

Date of Patent: December 6, 2022

Assignee: Bank of America Corporation

Inventors: Bharathiraja Krishnamoorthy, Emad Noorizadeh, Ravisha Andar
Automatically generating conference minutes

Patent number: 11521603

Abstract: A conference minutes generation method is provided, which relates to the technical field of natural language processing. The conference minutes generation method comprises: acquiring a text conference record; dividing the text conference record into a plurality of conference paragraphs, generating a conference paragraph summary for each conference paragraph, and generating a conference record summary based on the conference paragraph summary of each conference paragraph; extracting conference instructions based on the text conference record; and generating the conference minutes based on the conference record summary and the conference instructions.

Type: Grant

Filed: October 1, 2020

Date of Patent: December 6, 2022

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventors: Ke Sun, Ying Liu, Kai Liu, Lei Han, Chao Wang, Yingzhuo Song, Shuai Gao, Liyan Yang, Qianqian Wang, Jing Liu, Di Wei
Voice context-aware content manipulation

Patent number: 11514893

Abstract: Techniques performed by a data processing system for processing voice content received from a user herein include receiving a first audio input from a user comprising spoken content, analyzing the first audio input using one or more natural language processing models to produce a first textual output comprising a textual representation of the first audio input, analyzing the first textual output using one or more machine learning models to determine first context information of the first textual output, and processing the first textual output in the application based on the first context information.

Type: Grant

Filed: March 13, 2020

Date of Patent: November 29, 2022

Assignee: Microsoft Technology Licensing, LLC

Inventors: Erez Kikin-Gil, Emily Tran, Benjamin David Smith, Alan Liu, Erik Thomas Oveson
Electronic device for outputting response to speech input by using application and operation method thereof

Patent number: 11508364

Abstract: An artificial intelligence (AI) system is provided. The AI system simulates functions of the human brain such as recognition and judgment by utilizing a machine learning algorithm such as deep learning, etc. and an application of the AI system. A method, performed by an electronic device, of outputting a response to a speech input by using an application, includes receiving the speech input, obtaining text corresponding to the speech input by performing speech recognition on the speech input, obtaining metadata for the speech input based on the obtained text, selecting at least one application from among a plurality of applications for outputting the response to the speech input based on the metadata, and outputting the response to the speech input by using the selected at least one application.

Type: Grant

Filed: May 21, 2019

Date of Patent: November 22, 2022

Assignee: Samsung Electronics Co., Ltd.

Inventors: Cheenepalli Srirama Krishna Bhargava, Ankush Gupta
Speech recognition using data analysis and dilation of interlaced audio input

Patent number: 11495216

Abstract: The disclosure includes using dilation of speech content from an interlaced audio input for speech recognition. A learning model is initiated to determine dilation parameters for each of a plurality of audible sounds of speech content from a plurality of speakers received at a computer as an audio input. As part of the learning model, a change of each of a plurality of independent sounds is determined in response to an audio stimulus, the independent sounds being derived from the audio input. The disclosure applies the dilation parameters, respectively, based on the change of each of the independent sounds. A voice print is constructed for each of the speakers based on the independent sounds and the dilation parameters, respectively. Speech content is attributed to each of the plurality of speakers based at least in part on the voice print, respectively, and the independent sounds.

Type: Grant

Filed: September 9, 2020

Date of Patent: November 8, 2022

Assignee: International Business Machines Corporation

Inventors: Aaron K. Baughman, Corey B. Shelton, Stephen C. Hammer, Shikhar Kwatra
Selectively activating on-device speech recognition, and using recognized text in selectively activating on-device NLU and/or on-device fulfillment

Patent number: 11482217

Abstract: Implementations can reduce the time required to obtain responses from an automated assistant by, for example, obviating the need to provide an explicit invocation to the automated assistant, such as by saying a hot-word/phrase or performing a specific user input, prior to speaking a command or query. In addition, the automated assistant can optionally receive, understand, and/or respond to the command or query without communicating with a server, thereby further reducing the time in which a response can be provided. Implementations only selectively initiate on-device speech recognition responsive to determining one or more condition(s) are satisfied. Further, in some implementations, on-device NLU, on-device fulfillment, and/or resulting execution occur only responsive to determining, based on recognized text from the on-device speech recognition, that such further processing should occur.

Type: Grant

Filed: May 31, 2019

Date of Patent: October 25, 2022

Assignee: GOOGLE LLC

Inventors: Michael Golikov, Zaheed Sabur, Denis Burakov, Behshad Behzadi, Sergey Nazarov, Daniel Cotting, Mario Bertschler, Lucas Mirelmann, Steve Cheng, Bohdan Vlasyuk, Jonathan Lee, Lucia Terrenghi, Adrian Zumbrunnen
Low delay voice processing system

Patent number: 11475891

Abstract: Disclosed is a speech processing method. The speech processing method controls activation timing of a microphone based on a response pattern of the microphone from a user in order to implement a natural conversation. The speech processing device and the NLP system of the present disclosure may be associated with an artificial intelligence module, a drone (or unmanned aerial vehicle (UAV)), a robot, an augmented reality (AR) device, a virtual reality (VR) device, a device related to 5G service, etc.

Type: Grant

Filed: October 22, 2020

Date of Patent: October 18, 2022

Assignee: LG ELECTRONICS INC.

Inventors: Soonpil Jang, Seongjae Jeong, Wonkyum Kim, Jonghoon Chae
Document identification device, document identification method, and program

Patent number: 11462212

Abstract: A document identification device that improves class identification precision of multi-stream documents is provided. The document identification device includes: a primary stream expression generation unit that generates a primary stream expression, which is a fixed-length vector of a word sequence corresponding to each speaker's speech recorded in a setting including a plurality of speakers, for each speaker; a primary multi-stream expression generation unit that generates a primary multi-stream expression obtained by integrating the primary stream expression; a secondary stream expression generation unit that generates a secondary stream expression, which is a fixed-length vector generated based on the word sequence of each speaker and the primary multi-stream expression, for each speaker; and a secondary multi-stream expression generation unit that generates a secondary multi-stream expression obtained by integrating the secondary stream expression.

Type: Grant

Filed: May 10, 2018

Date of Patent: October 4, 2022

Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION

Inventors: Ryo Masumura, Hirokazu Masataki
Alternate natural language input generation

Patent number: 11437027

Abstract: Techniques for handling errors during processing of natural language inputs are described. A system may process a natural language input to generate an ASR hypothesis or NLU hypothesis. The system may use more than one data searching technique (e.g., deep neural network searching, convolutional neural network searching, etc.) to generate an alternate ASR hypothesis or NLU hypothesis, depending on the type of hypothesis input for alternate hypothesis processing.

Type: Grant

Filed: December 4, 2019

Date of Patent: September 6, 2022

Assignee: Amazon Technologies, Inc.

Inventors: Chenlei Guo, Xing Fan, Jin Hock Ong, Kai Wei
Method, apparatus, and storage medium for segmenting sentences for speech recognition

Patent number: 11430428

Abstract: The present disclosure describes a method, apparatus, and storage medium for performing speech recognition. The method includes acquiring, by an apparatus, first to-be-processed speech information. The apparatus includes a memory storing instructions and a processor in communication with the memory. The method includes acquiring, by the apparatus, a first pause duration according to the first to-be-processed speech information; and in response to the first pause duration being greater than or equal to a first threshold, performing, by the apparatus, speech recognition on the first to-be-processed speech information to obtain a first result of sentence segmentation of speech, the first result of sentence segmentation of speech being text information, the first threshold being determined according to speech information corresponding to a previous moment.

Type: Grant

Filed: September 10, 2020

Date of Patent: August 30, 2022

Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED

Inventors: Lianwu Chen, Jingliang Bai, Min Luo
Locally grouping voice-enabled device state communications

Patent number: 11429344

Abstract: Devices, systems, and methods are provided for locally grouping voice-enabled device state communications. A device may determine first state information associated with the first device and send the first state information to a second device. The device may receive second state information associated with a second device and third state information associated with a third device. The device may receive an audible command, and may determine, based on the audible command, an indicator to send state data. The device may send the first state information, the second state information, the third state information, and data associated with the audible command. The device may receive fourth state information associated with the audible command.

Type: Grant

Filed: September 27, 2019

Date of Patent: August 30, 2022

Assignee: Amazon Technologies, Inc.

Inventors: Tomas Manuel Fernandez, Mark Lawrence, Charles James Torbert
Dynamic and/or context-specific hot words to invoke automated assistant

Patent number: 11423890

Abstract: Techniques are described herein for enabling the use of “dynamic” or “context-specific” hot words for an automated assistant. In various implementations, an automated assistant may be operated at least in part on a computing device. Audio data captured by a microphone may be monitored for default hot word(s). Detection of one or more of the default hot words may trigger transition of the automated assistant from a limited hot word listening state into a speech recognition state. Transition of the computing device into a given state may be detected, and in response, the audio data captured by the microphone may be monitored for context-specific hot word(s), in addition to or instead of the default hot word(s). Detection of the context-specific hot word(s) may trigger the automated assistant to perform a responsive action associated with the given state, without requiring detection of default hot word(s).

Type: Grant

Filed: August 21, 2018

Date of Patent: August 23, 2022

Assignee: GOOGLE LLC

Inventors: Diego Melendo Casado, Jaclyn Konzelmann
Semantic consistency of explanations in explainable artificial intelligence applications

Patent number: 11423334

Abstract: An explainable artificially intelligent (XAI) application contains an ordered sequence of artificially intelligent software modules. When an input dataset is submitted to the application, each module generates an output dataset and an explanation that represents, as a set of Boolean expressions, reasoning by which each output element was chosen. If any pair of explanations are determined to be semantically inconsistent, and if this determination is confirmed by further determining that an apparent inconsistency was not a correct response to an unexpected characteristic of the input dataset, nonzero inconsistency scores are assigned to inconsistent elements of the pair of explanations.

Type: Grant

Filed: May 8, 2020

Date of Patent: August 23, 2022

Assignee: KYNDRYL, INC.

Inventors: Sreekrishnan Venkateswaran, Debasisha Padhi, Shubhi Asthana, Anuradha Bhamidipaty, Ashish Kundu
Controlling voice recognition sensitivity for voice recognition

Patent number: 11417321

Abstract: A device for changing a speech recognition sensitivity for speech recognition can include a memory and a processor configured to obtain a first plurality of speech data input at different times, apply a pre-trained speech recognition model to the first plurality of speech data at a plurality of different speech recognition sensitivities, obtain a first speech recognition sensitivity from among the plurality of different speech recognition sensitivities based on the pre-trained speech recognition model and the plurality of different speech recognition sensitivities, the first speech recognition sensitivity corresponding to an optimal speech recognition sensitivity at which a speech recognition success rate of the speech recognition model satisfies a set first recognition success rate criterion, and change a setting of the speech recognition sensitivity based on the first speech recognition sensitivity obtained from among the plurality of different speech recognition sensitivities.

Type: Grant

Filed: April 24, 2020

Date of Patent: August 16, 2022

Assignee: LG ELECTRONICS INC.

Inventors: Sang Won Kim, Joonbeom Lee
