Patents Examined by Michelle M Koeth
-
Patent number: 11488617
Abstract: Disclosed are a sound processing apparatus and a sound processing method. The sound processing method includes extracting a desired voice enhanced signal by sound source separation and sound extraction. Using a multi-channel blind source separation method based on independent vector analysis, the desired voice enhanced signal is extracted from the channel having the smallest sum of off-diagonal values of a separation adaptive filter when the power of the desired voice signal is larger than that of other voice signals. According to the present disclosure, a user may build a robust artificial intelligence (AI) speech recognition system by using sound source separation and voice extraction with the eMBB, URLLC, and mMTC techniques of 5G mobile communication.
Type: Grant
Filed: October 18, 2019
Date of Patent: November 1, 2022
Assignee: LG ELECTRONICS INC.
Inventors: Jae Pil Seo, Keun Sang Lee, Jae Woong Jeong
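The channel-selection rule in this abstract (pick the output of the demixing filter whose off-diagonal coefficients sum smallest) can be sketched as follows. This is an illustration only: the abstract does not specify whether the sum runs over rows, columns, or frequency bins, so the row-wise reading and the function name are assumptions.

```python
import numpy as np

def select_target_channel(W):
    """Given a separation (demixing) matrix W from a BSS method such as
    independent vector analysis, score each output channel by the sum of
    the magnitudes of its off-diagonal filter coefficients and return the
    channel with the smallest sum, per the selection rule in the abstract."""
    W = np.asarray(W, dtype=float)
    scores = np.abs(W).sum(axis=1) - np.abs(np.diag(W))  # off-diagonal row sums
    return int(np.argmin(scores))

# Toy 3-channel demixing matrix: channel 1's row is closest to a pure
# diagonal (little leakage from the other sources), so it is selected.
W = [[1.0, 0.4, 0.3],
     [0.1, 1.0, 0.05],
     [0.6, 0.2, 1.0]]
print(select_target_channel(W))  # -> 1
```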
-
Patent number: 11482217
Abstract: Implementations can reduce the time required to obtain responses from an automated assistant by, for example, obviating the need to provide an explicit invocation to the automated assistant, such as saying a hot word/phrase or performing a specific user input, prior to speaking a command or query. In addition, the automated assistant can optionally receive, understand, and/or respond to the command or query without communicating with a server, thereby further reducing the time in which a response can be provided. Implementations only selectively initiate on-device speech recognition responsive to determining that one or more conditions are satisfied. Further, in some implementations, on-device NLU, on-device fulfillment, and/or resulting execution occur only responsive to determining, based on recognized text from the on-device speech recognition, that such further processing should occur.
Type: Grant
Filed: May 31, 2019
Date of Patent: October 25, 2022
Assignee: GOOGLE LLC
Inventors: Michael Golikov, Zaheed Sabur, Denis Burakov, Behshad Behzadi, Sergey Nazarov, Daniel Cotting, Mario Bertschler, Lucas Mirelmann, Steve Cheng, Bohdan Vlasyuk, Jonathan Lee, Lucia Terrenghi, Adrian Zumbrunnen
-
Patent number: 11475891
Abstract: Disclosed is a speech processing method. The speech processing method controls the activation timing of a microphone based on a user's response pattern in order to implement a natural conversation. The speech processing device and the NLP system of the present disclosure may be associated with an artificial intelligence module, a drone (or unmanned aerial vehicle (UAV)), a robot, an augmented reality (AR) device, a virtual reality (VR) device, a device related to 5G service, etc.
Type: Grant
Filed: October 22, 2020
Date of Patent: October 18, 2022
Assignee: LG ELECTRONICS INC.
Inventors: Soonpil Jang, Seongjae Jeong, Wonkyum Kim, Jonghoon Chae
-
Patent number: 11470022
Abstract: Techniques are described related to enabling automated assistants to enter into a “conference mode” in which they can “participate” in meetings between multiple human participants and perform various functions described herein. In various implementations, an automated assistant implemented at least in part on conference computing device(s) may be set to a conference mode in which the automated assistant performs speech-to-text processing on multiple distinct spoken utterances, provided by multiple meeting participants, without requiring explicit invocation prior to each utterance. The automated assistant may perform semantic processing on first text generated from the speech-to-text processing of one or more of the spoken utterances, and generate, based on the semantic processing, data that is pertinent to the first text. The data may be output to the participants at conference computing device(s).
Type: Grant
Filed: March 27, 2020
Date of Patent: October 11, 2022
Assignee: GOOGLE LLC
Inventors: Marcin Nowak-Przygodzki, Jan Lamecki, Behshad Behzadi
-
Patent number: 11468351
Abstract: A brain computer interface (BCI) system predicts text based on input and output signals obtained in relation to an individual that are informative for determining an individual's neurobiological activity. The BCI system applies a first predictive model to the input signal and a second predictive model to the output signal. The first predictive model predicts the forward propagation of the input signal through the individual's head whereas the second predictive model predicts the backward propagation of the output signal through the individual's head. Each of the first predictive model and second predictive model predicts characteristics of their respective signal at a common plane such as the cortical surface of the individual's brain. The BCI system predicts text by applying a third predictive model to the predicted signal characteristics at the common plane outputted by the first predictive model and the second predictive model.
Type: Grant
Filed: May 29, 2019
Date of Patent: October 11, 2022
Assignee: Meta Platforms, Inc.
Inventors: Michael Andrew Choma, Emily Mittag Mugler, Patrick Mineault, Soo Yeon Kim Jennings, Mark Allan Chevillet
-
Patent number: 11462212
Abstract: A document identification device that improves class identification precision of multi-stream documents is provided. The document identification device includes: a primary stream expression generation unit that generates a primary stream expression, which is a fixed-length vector of a word sequence corresponding to each speaker's speech recorded in a setting including a plurality of speakers, for each speaker; a primary multi-stream expression generation unit that generates a primary multi-stream expression obtained by integrating the primary stream expression; a secondary stream expression generation unit that generates a secondary stream expression, which is a fixed-length vector generated based on the word sequence of each speaker and the primary multi-stream expression, for each speaker; and a secondary multi-stream expression generation unit that generates a secondary multi-stream expression obtained by integrating the secondary stream expression.
Type: Grant
Filed: May 10, 2018
Date of Patent: October 4, 2022
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Ryo Masumura, Hirokazu Masataki
-
Patent number: 11445265
Abstract: An AI device is provided. The AI device includes a content output interface to output video data contained in content and voice data contained in the content, and a processor to control the content output interface to acquire a voice recognition result by providing, to a voice recognition model, content extraction information including at least one of video information acquired from the video data in the content or tag information of the content and the voice data, and control the content output interface to output the voice recognition result.
Type: Grant
Filed: October 18, 2019
Date of Patent: September 13, 2022
Assignee: LG ELECTRONICS INC.
Inventors: Taeho Lee, Boseop Kim
-
Patent number: 11437027
Abstract: Techniques for handling errors during processing of natural language inputs are described. A system may process a natural language input to generate an ASR hypothesis or NLU hypothesis. The system may use more than one data searching technique (e.g., deep neural network searching, convolutional neural network searching, etc.) to generate an alternate ASR hypothesis or NLU hypothesis, depending on the type of hypothesis input for alternate hypothesis processing.
Type: Grant
Filed: December 4, 2019
Date of Patent: September 6, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Chenlei Guo, Xing Fan, Jin Hock Ong, Kai Wei
-
Patent number: 11430428
Abstract: The present disclosure describes a method, apparatus, and storage medium for performing speech recognition. The method includes acquiring, by an apparatus, first to-be-processed speech information. The apparatus includes a memory storing instructions and a processor in communication with the memory. The method includes acquiring, by the apparatus, a first pause duration according to the first to-be-processed speech information; and in response to the first pause duration being greater than or equal to a first threshold, performing, by the apparatus, speech recognition on the first to-be-processed speech information to obtain a first result of sentence segmentation of speech, the first result of sentence segmentation of speech being text information, the first threshold being determined according to speech information corresponding to a previous moment.
Type: Grant
Filed: September 10, 2020
Date of Patent: August 30, 2022
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Lianwu Chen, Jingliang Bai, Min Luo
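The pause-gated segmentation described above can be sketched with a toy loop. Note the assumptions: the patent adapts the first threshold from the previous moment's speech, while this sketch takes it as a fixed parameter, and the event representation (pause, text) is invented for illustration.

```python
def segment_on_pause(events, first_threshold):
    """Toy illustration of pause-gated sentence segmentation: each event
    is (pause_before_seconds, text). A segment boundary is emitted only
    when the pause preceding an utterance meets or exceeds the threshold;
    otherwise the text is appended to the current segment."""
    segments, current = [], []
    for pause, text in events:
        if pause >= first_threshold and current:
            segments.append(" ".join(current))  # close the segment
            current = []
        current.append(text)
    if current:
        segments.append(" ".join(current))      # flush the tail
    return segments

events = [(0.0, "book a table"), (0.2, "for two"), (1.5, "then call a taxi")]
print(segment_on_pause(events, first_threshold=0.8))
# -> ['book a table for two', 'then call a taxi']
```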
-
Patent number: 11429344
Abstract: Devices, systems, and methods are provided for locally grouping voice-enabled device state communications. A device may determine first state information associated with the first device and send the first state information to a second device. The device may receive second state information associated with a second device and third state information associated with a third device. The device may receive an audible command, and may determine, based on the audible command, an indicator to send state data. The device may send the first state information, the second state information, the third state information, and data associated with the audible command. The device may receive fourth state information associated with the audible command.
Type: Grant
Filed: September 27, 2019
Date of Patent: August 30, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Tomas Manuel Fernandez, Mark Lawrence, Charles James Torbert
-
Patent number: 11423229
Abstract: Implementations of the subject matter described herein relate to conversational data analysis. After a data analysis request is received from a user, heuristic information may be determined based on the data analysis request. The heuristic information mentioned here is not a result for the data analysis request but information that may be used to lead the conversation forward. Based on such heuristic information, the user may provide supplementary information associated with the data analysis request, for example, to clarify the meaning of the data analysis request, submit a relevant further analysis request, and so on. A truly desired and meaningful data analysis result can then be provided to the user according to the supplementary information. Thus, data analysis becomes more accurate and effective, and the user gains a better experience while obtaining genuinely helpful information.
Type: Grant
Filed: September 22, 2017
Date of Patent: August 23, 2022
Assignee: Microsoft Technology Licensing, LLC
Inventors: Zhitao Hou, Jian-Guang Lou, Bo Zhang, Xiao Liang, Dongmei Zhang, Haidong Zhang
-
Patent number: 11423890
Abstract: Techniques are described herein for enabling the use of “dynamic” or “context-specific” hot words for an automated assistant. In various implementations, an automated assistant may be operated at least in part on a computing device. Audio data captured by a microphone may be monitored for default hot word(s). Detection of one or more of the default hot words may trigger transition of the automated assistant from a limited hot word listening state into a speech recognition state. Transition of the computing device into a given state may be detected, and in response, the audio data captured by the microphone may be monitored for context-specific hot word(s), in addition to or instead of the default hot word(s). Detection of the context-specific hot word(s) may trigger the automated assistant to perform a responsive action associated with the given state, without requiring detection of default hot word(s).
Type: Grant
Filed: August 21, 2018
Date of Patent: August 23, 2022
Assignee: GOOGLE LLC
Inventors: Diego Melendo Casado, Jaclyn Konzelmann
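The state machine this abstract describes can be sketched in a few lines. Everything concrete here is an assumption for illustration: the class name, the "timer_ringing" state, and the hot-word-to-action table are invented, not Google's implementation.

```python
class HotwordListener:
    """Minimal sketch of the dynamic hot-word idea: a default hot word is
    always active, and entering a device state (e.g. a ringing timer)
    temporarily activates extra context-specific hot words that map
    directly to a responsive action, bypassing full invocation."""
    def __init__(self):
        self.default_words = {"ok assistant": "start_full_recognition"}
        self.context_words = {}

    def enter_state(self, state):
        # Hypothetical state -> hot-word table for illustration.
        if state == "timer_ringing":
            self.context_words = {"stop": "cancel_timer", "snooze": "snooze_timer"}

    def exit_state(self):
        self.context_words = {}

    def on_utterance(self, text):
        # Context-specific words are checked alongside the defaults.
        active = {**self.default_words, **self.context_words}
        return active.get(text.lower())

listener = HotwordListener()
listener.enter_state("timer_ringing")
print(listener.on_utterance("Stop"))   # -> cancel_timer
listener.exit_state()
print(listener.on_utterance("Stop"))   # -> None
```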
-
Patent number: 11423880
Abstract: The embodiments of the present application provide a method for updating a speech recognition model, a storage medium, and an electronic device. The method includes: detecting whether the speech recognition algorithm has been updated; and updating the speech recognition model when the speech recognition algorithm has been updated. The voice information is recognized by the electronic device based on the speech recognition algorithm and the speech recognition model, so that when the electronic device detects that the speech recognition algorithm has been updated, it can update the speech recognition model accordingly.
Type: Grant
Filed: August 6, 2019
Date of Patent: August 23, 2022
Assignee: GUANGDONG OPPO MOBILE TELECOMMUNICATIONS CORP., LTD.
Inventor: Yan Chen
-
Patent number: 11423334
Abstract: An explainable artificially intelligent (XAI) application contains an ordered sequence of artificially intelligent software modules. When an input dataset is submitted to the application, each module generates an output dataset and an explanation that represents, as a set of Boolean expressions, the reasoning by which each output element was chosen. If any pair of explanations are determined to be semantically inconsistent, and if this determination is confirmed by further determining that an apparent inconsistency was not a correct response to an unexpected characteristic of the input dataset, nonzero inconsistency scores are assigned to inconsistent elements of the pair of explanations.
Type: Grant
Filed: May 8, 2020
Date of Patent: August 23, 2022
Assignee: KYNDRYL, INC.
Inventors: Sreekrishnan Venkateswaran, Debasisha Padhi, Shubhi Asthana, Anuradha Bhamidipaty, Ashish Kundu
-
Patent number: 11423885
Abstract: Techniques are described herein for selectively processing a user's utterances captured prior to and after an event that invokes an automated assistant to determine the user's intent and/or any parameters required for resolving the user's intent. In various implementations, respective measures of fitness for triggering responsive action by the automated assistant may be determined for the pre-event and post-event input streams. Based on the respective measures of fitness, one or both of the pre-event input stream or post-event input stream may be selected and used to cause the automated assistant to perform one or more responsive actions.
Type: Grant
Filed: February 20, 2019
Date of Patent: August 23, 2022
Assignee: GOOGLE LLC
Inventors: Matthew Sharifi, Tom Hume, Mohamad Hassan Mohamad Rom, Jan Althaus, Diego Melendo Casado
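The selection step above can be sketched as scoring each stream and keeping the fitter one. The fitness function here (word count) is a placeholder assumption; the patent's actual fitness measures are not specified in the abstract, and real implementations could also combine both streams.

```python
def choose_input_stream(pre_event, post_event, fitness):
    """Sketch of fitness-based stream selection: score the utterance
    stream captured before the invoking event and the one captured after,
    then keep whichever is fitter for triggering a responsive action."""
    return pre_event if fitness(pre_event) >= fitness(post_event) else post_event

# Hypothetical fitness: longer, intent-like utterances score higher.
fitness = lambda text: len(text.split())
print(choose_input_stream("turn on the kitchen lights", "um", fitness))
# -> turn on the kitchen lights
```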
-
Patent number: 11417321
Abstract: A device for changing a speech recognition sensitivity for speech recognition can include a memory and a processor configured to obtain a first plurality of speech data input at different times, apply a pre-trained speech recognition model to the first plurality of speech data at a plurality of different speech recognition sensitivities, obtain a first speech recognition sensitivity from among the plurality of different speech recognition sensitivities based on the pre-trained speech recognition model and the plurality of different speech recognition sensitivities, the first speech recognition sensitivity corresponding to an optimal speech recognition sensitivity at which a speech recognition success rate of the speech recognition model satisfies a set first recognition success rate criterion, and change a setting of the speech recognition sensitivity based on the first speech recognition sensitivity obtained from among the plurality of different speech recognition sensitivities.
Type: Grant
Filed: April 24, 2020
Date of Patent: August 16, 2022
Assignee: LG ELECTRONICS INC.
Inventors: Sang Won Kim, Joonbeom Lee
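The search the abstract describes reduces to evaluating the recognizer's success rate at each candidate sensitivity and keeping one that satisfies the set criterion. The sketch below assumes a scalar criterion and a first-match rule; the patent leaves the exact selection rule abstract, and the success-rate curve is invented.

```python
def best_sensitivity(sensitivities, success_rate, criterion):
    """Evaluate each candidate speech-recognition sensitivity with a
    caller-supplied success-rate function and return the first one whose
    rate satisfies the set recognition-success criterion, else None."""
    for s in sensitivities:
        if success_rate(s) >= criterion:
            return s
    return None

# Hypothetical success-rate curve peaking at a middle sensitivity.
rates = {0.2: 0.71, 0.5: 0.93, 0.8: 0.84}
print(best_sensitivity([0.2, 0.5, 0.8], rates.get, criterion=0.9))  # -> 0.5
```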
-
Patent number: 11404046
Abstract: An audio processing device for speech recognition is provided, which includes a memory circuit, a power spectrum transfer circuit, and a feature extraction circuit. The power spectrum transfer circuit is coupled to the memory circuit, reads frequency spectrum coefficients of time-domain audio sample data from the memory circuit, generates compressed power parameters by performing a power spectrum transfer processing and a compressing processing according to the frequency spectrum coefficients, and writes the compressed power parameters into the memory circuit. The feature extraction circuit is coupled to the memory circuit, reads the compressed power parameters from the memory circuit, and generates an audio feature vector by performing mel-filtering and frequency-to-time transfer processing according to the compressed power parameters. The bit width of the compressed power parameters is less than the bit width of the frequency spectrum coefficients.
Type: Grant
Filed: May 6, 2020
Date of Patent: August 2, 2022
Assignee: XSail Technology Co., Ltd
Inventors: Meng-Hao Feng, Chao Chen
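The key claim, compressed power parameters narrower than the spectrum coefficients, can be illustrated in software. The patent does not disclose its compression scheme; log compression followed by uniform 8-bit quantization is an assumption chosen only to make the bit-width reduction concrete.

```python
import numpy as np

def compressed_power(spectrum_coeffs, bits=8):
    """Convert frequency-spectrum coefficients to power, log-compress,
    then quantize to an integer width narrower than the input, mirroring
    the abstract's bit-width claim (scheme assumed, not from the patent)."""
    power = np.abs(spectrum_coeffs) ** 2
    log_power = np.log1p(power)                       # dynamic-range compression
    scale = (2 ** bits - 1) / log_power.max()
    return np.round(log_power * scale).astype(np.uint8)

# 256-sample sine frame -> complex spectrum (16 bytes/coeff) -> 1-byte params.
coeffs = np.fft.rfft(np.sin(2 * np.pi * np.arange(256) / 16))
q = compressed_power(coeffs)
print(q.dtype, q.itemsize < coeffs.itemsize)  # uint8 True
```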
-
Patent number: 11393488
Abstract: Embodiments of the disclosure provide systems and methods for enhancing audio signals. The system may include a communication interface configured to receive multi-channel audio signals acquired from a common signal source. The system may further include at least one processor. The at least one processor may be configured to separate the multi-channel audio signals into a first audio signal and a second audio signal in a time domain. The at least one processor may be further configured to decompose the first audio signal and the second audio signal in a frequency domain to obtain first decomposition data and second decomposition data, respectively. The at least one processor may be also configured to estimate a noise component in the frequency domain based on the first decomposition data and the second decomposition data. The at least one processor may be additionally configured to enhance the first audio signal based on the estimated noise component.
Type: Grant
Filed: April 24, 2020
Date of Patent: July 19, 2022
Assignee: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
Inventors: Yi Zhang, Hui Song, Chengyun Deng, Yongtao Sha
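The decompose/estimate/enhance pipeline above can be sketched generically. The cross-channel magnitude difference as a noise estimate and the magnitude-subtraction step are standard textbook stand-ins, not the patented estimator, which the abstract does not disclose.

```python
import numpy as np

def enhance_primary(ch1, ch2, n_fft=256):
    """Two-channel enhancement in the spirit of the abstract: decompose
    both channels in the frequency domain, estimate a noise magnitude
    from the cross-channel difference, and spectrally subtract it from
    the first channel before transforming back to the time domain."""
    S1, S2 = np.fft.rfft(ch1, n_fft), np.fft.rfft(ch2, n_fft)
    noise_mag = np.abs(np.abs(S1) - np.abs(S2))        # crude noise estimate
    clean_mag = np.maximum(np.abs(S1) - noise_mag, 0)  # magnitude subtraction
    S_hat = clean_mag * np.exp(1j * np.angle(S1))      # keep channel-1 phase
    return np.fft.irfft(S_hat, n_fft)

rng = np.random.default_rng(0)
tone = np.sin(2 * np.pi * np.arange(256) / 8)
out = enhance_primary(tone + 0.1 * rng.standard_normal(256),
                      tone + 0.1 * rng.standard_normal(256))
print(out.shape)  # (256,)
```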
-
Patent number: 11380307
Abstract: A method, computer program, and computer system are provided for automated speech recognition. Audio data corresponding to one or more speakers is received. Covariance matrices of target speech and noise associated with the received audio data are estimated based on a gated recurrent unit-based network. A predicted target waveform corresponding to a target speaker from among the one or more speakers is generated by a minimum variance distortionless response function based on the estimated covariance matrices.
Type: Grant
Filed: September 30, 2020
Date of Patent: July 5, 2022
Assignee: TENCENT AMERICA LLC
Inventors: Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu
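The final step above uses the classical minimum variance distortionless response (MVDR) beamformer, whose weights follow directly from a noise covariance matrix and a steering vector: w = R_n^{-1} d / (d^H R_n^{-1} d). In the patent the covariances come from a GRU-based network; here they are given directly, and the steering vector stands in for the target-speaker direction.

```python
import numpy as np

def mvdr_weights(R_n, steering):
    """Classical MVDR beamformer weights from a noise covariance matrix
    R_n and a steering vector d: w = R_n^{-1} d / (d^H R_n^{-1} d)."""
    Rn_inv_d = np.linalg.solve(R_n, steering)
    return Rn_inv_d / (steering.conj() @ Rn_inv_d)

# Distortionless constraint: the beamformer passes the target direction
# with unit gain, i.e. w^H d == 1, while minimizing noise power.
R_n = np.array([[2.0, 0.5], [0.5, 1.0]])
d = np.array([1.0, 1.0])
w = mvdr_weights(R_n, d)
print(np.isclose(w.conj() @ d, 1.0))  # -> True
```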
-
Patent number: 11373649
Abstract: Techniques are described herein for enabling the use of “dynamic” or “context-specific” hot words for an automated assistant. In various implementations, an automated assistant may be operated at least in part on a computing device. Audio data captured by a microphone may be monitored for default hot word(s). Detection of one or more of the default hot words may trigger transition of the automated assistant from a limited hot word listening state into a speech recognition state. Transition of the computing device into a given state may be detected, and in response, the audio data captured by the microphone may be monitored for context-specific hot word(s), in addition to or instead of the default hot word(s). Detection of the context-specific hot word(s) may trigger the automated assistant to perform a responsive action associated with the given state, without requiring detection of default hot word(s).
Type: Grant
Filed: August 21, 2018
Date of Patent: June 28, 2022
Assignee: GOOGLE LLC
Inventors: Diego Melendo Casado, Jaclyn Konzelmann