Patents Examined by Michelle M Koeth
  • Patent number: 11423334
    Abstract: An explainable artificially intelligent (XAI) application contains an ordered sequence of artificially intelligent software modules. When an input dataset is submitted to the application, each module generates an output dataset and an explanation that represents, as a set of Boolean expressions, the reasoning by which each output element was chosen. If any pair of explanations is determined to be semantically inconsistent, and if this determination is confirmed by further determining that the apparent inconsistency was not a correct response to an unexpected characteristic of the input dataset, nonzero inconsistency scores are assigned to the inconsistent elements of the pair of explanations.
    Type: Grant
    Filed: May 8, 2020
    Date of Patent: August 23, 2022
    Assignee: KYNDRYL, INC.
    Inventors: Sreekrishnan Venkateswaran, Debasisha Padhi, Shubhi Asthana, Anuradha Bhamidipaty, Ashish Kundu
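    A minimal sketch (not the patented implementation) of the pairwise consistency check described in the abstract above, assuming each module's explanation has been reduced to a dictionary mapping Boolean variables to asserted truth values; all names below are illustrative, and the confirmation step against unexpected input is noted only as a comment.
```python
from itertools import combinations

def inconsistency_scores(explanations):
    """explanations: one dict per module output element, mapping Boolean variables
    (the explanation's literals) to the truth value the module asserted.
    Any conflicting pair gets a nonzero score; zero means no inconsistency found."""
    scores = [0.0] * len(explanations)
    for i, j in combinations(range(len(explanations)), 2):
        conflicts = [v for v in explanations[i]
                     if v in explanations[j] and explanations[i][v] != explanations[j][v]]
        if conflicts:
            # The confirmation step (checking the apparent conflict was not a correct
            # reaction to an unexpected input characteristic) would gate this assignment.
            penalty = len(conflicts) / max(len(explanations[i]), len(explanations[j]))
            scores[i] += penalty
            scores[j] += penalty
    return scores

print(inconsistency_scores([{"high_risk": True, "long_tenure": False},
                            {"high_risk": False, "long_tenure": False}]))
```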
  • Patent number: 11423885
    Abstract: Techniques are described herein for selectively processing a user's utterances captured prior to and after an event that invokes an automated assistant, to determine the user's intent and/or any parameters required for resolving the user's intent. In various implementations, respective measures of fitness for triggering responsive action by the automated assistant may be determined for the pre-event and post-event input streams. Based on the respective measures of fitness, one or both of the pre-event input stream and the post-event input stream may be selected and used to cause the automated assistant to perform one or more responsive actions.
    Type: Grant
    Filed: February 20, 2019
    Date of Patent: August 23, 2022
    Assignee: GOOGLE LLC
    Inventors: Matthew Sharifi, Tom Hume, Mohamad Hassan Mohamad Rom, Jan Althaus, Diego Melendo Casado
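    A rough sketch of selecting between the pre-event and post-event streams by a fitness measure; the toy word-overlap fitness function and the 0.5 threshold below are assumptions for illustration, not the scoring model described in the patent.
```python
# Illustrative stream selection between pre- and post-event transcripts.
COMMAND_WORDS = {"set", "turn", "play", "remind", "timer", "lights", "off", "on"}

def fitness(text):
    """Toy fitness measure: fraction of words that look like command vocabulary."""
    words = text.lower().split()
    return sum(w in COMMAND_WORDS for w in words) / max(len(words), 1)

def select_streams(pre_text, post_text, threshold=0.5):
    """Keep whichever transcribed stream(s) look actionable; fall back to post-event."""
    chosen = [(label, text) for label, text in [("pre", pre_text), ("post", post_text)]
              if fitness(text) >= threshold]
    return chosen or [("post", post_text)]

print(select_streams("turn the lights off", "uh what was I saying"))
```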
  • Patent number: 11417321
    Abstract: A device for changing a speech recognition sensitivity can include a memory and a processor. The processor is configured to obtain a first plurality of speech data input at different times, apply a pre-trained speech recognition model to that speech data at a plurality of different speech recognition sensitivities, and obtain, from among those sensitivities, a first speech recognition sensitivity corresponding to an optimal sensitivity at which the speech recognition success rate of the model satisfies a set first recognition success rate criterion. The processor then changes the setting of the speech recognition sensitivity based on the first speech recognition sensitivity so obtained.
    Type: Grant
    Filed: April 24, 2020
    Date of Patent: August 16, 2022
    Assignee: LG ELECTRONICS INC.
    Inventors: Sang Won Kim, Joonbeom Lee
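    A minimal sketch of sweeping candidate sensitivities and keeping the one whose success rate meets the criterion; the recognizer callable, the samples, and the success-rate threshold are stand-ins, not LG's model or data.
```python
def pick_sensitivity(samples, recognize, sensitivities, min_success_rate=0.9):
    """Sweep candidate sensitivities; return the best one whose success rate over
    the recorded samples satisfies the configured criterion (else None)."""
    best = None
    for s in sensitivities:
        hits = sum(recognize(audio, sensitivity=s) == expected for audio, expected in samples)
        rate = hits / len(samples)
        if rate >= min_success_rate and (best is None or rate > best[1]):
            best = (s, rate)
    return best

# Stand-in recognizer: succeeds only near a "sweet spot" sensitivity of 0.6.
fake_recognize = lambda audio, sensitivity: audio if abs(sensitivity - 0.6) < 0.15 else None
samples = [(f"utt{i}", f"utt{i}") for i in range(10)]
print(pick_sensitivity(samples, fake_recognize, [0.2, 0.4, 0.6, 0.8]))
```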
  • Patent number: 11404046
    Abstract: An audio processing device for speech recognition is provided, which includes a memory circuit, a power spectrum transfer circuit, and a feature extraction circuit. The power spectrum transfer circuit is coupled to the memory circuit; it reads frequency spectrum coefficients of time-domain audio sample data from the memory circuit, generates compressed power parameters by performing power spectrum transfer processing and compressing processing according to the frequency spectrum coefficients, and writes the compressed power parameters into the memory circuit. The feature extraction circuit is also coupled to the memory circuit; it reads the compressed power parameters from the memory circuit and generates an audio feature vector by performing mel-filtering and frequency-to-time transfer processing according to the compressed power parameters. The bit width of the compressed power parameters is less than the bit width of the frequency spectrum coefficients.
    Type: Grant
    Filed: May 6, 2020
    Date of Patent: August 2, 2022
    Assignee: XSail Technology Co., Ltd
    Inventors: Meng-Hao Feng, Chao Chen
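    A sketch of the spectrum-to-feature pipeline under stated assumptions: an 8-bit log quantization stands in for the compression stage, and a textbook mel filter bank plus DCT stands in for the circuits described in the claims.
```python
import numpy as np
from scipy.fftpack import dct

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular mel filters over the positive-frequency bins."""
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    inv_mel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    pts = inv_mel(np.linspace(mel(0.0), mel(sr / 2.0), n_filters + 2))
    bins = np.floor((n_fft + 1) * pts / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    return fb

def features_from_spectrum(spectrum_coeffs, sr=16000, n_filters=26, n_ceps=13):
    """Power spectrum -> narrow-bit-width (8-bit log) power -> mel filtering -> DCT,
    mirroring the write/read-back split between the two circuits in the abstract."""
    power = (np.abs(spectrum_coeffs) ** 2) / len(spectrum_coeffs)
    logp = np.log(power + 1e-10)
    # "Compressed power parameters": 8-bit quantization, narrower than the raw coefficients.
    q = np.round((logp - logp.min()) / (np.ptp(logp) + 1e-10) * 255).astype(np.uint8)
    dequant = q / 255.0 * np.ptp(logp) + logp.min()
    fb = mel_filterbank(n_filters, (len(spectrum_coeffs) - 1) * 2, sr)
    mel_energies = fb @ np.exp(dequant)
    return dct(np.log(mel_energies + 1e-10), norm="ortho")[:n_ceps]

frame = np.fft.rfft(np.random.randn(512))            # stand-in frequency spectrum coefficients
print(features_from_spectrum(frame).shape)            # (13,)
```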
  • Patent number: 11393488
    Abstract: Embodiments of the disclosure provide systems and methods for enhancing audio signals. The system may include a communication interface configured to receive multi-channel audio signals acquired from a common signal source. The system may further include at least one processor. The at least one processor may be configured to separate the multi-channel audio signals into a first audio signal and a second audio signal in a time domain. The at least one processor may be further configured to decompose the first audio signal and the second audio signal in a frequency domain to obtain first decomposition data and second decomposition data, respectively. The at least one processor may be also configured to estimate a noise component in the frequency domain based on the first decomposition data and the second decomposition data. The at least one processor may be additionally configured to enhance the first audio signal based on the estimated noise component.
    Type: Grant
    Filed: April 24, 2020
    Date of Patent: July 19, 2022
    Assignee: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
    Inventors: Yi Zhang, Hui Song, Chengyun Deng, Yongtao Sha
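    A minimal two-channel sketch of the decompose / estimate-noise / enhance flow; plain spectral subtraction with the reference channel's average power as the noise estimate is used here as a stand-in for the estimator described in the abstract.
```python
import numpy as np
from scipy.signal import stft, istft

def enhance(primary, reference, fs=16000):
    """Decompose both channels with an STFT, take the reference channel's power as a
    rough noise estimate, and spectrally subtract it from the primary channel."""
    _, _, P = stft(primary, fs=fs, nperseg=512)
    _, _, R = stft(reference, fs=fs, nperseg=512)
    noise_psd = np.mean(np.abs(R) ** 2, axis=1, keepdims=True)         # noise estimate
    gain = np.maximum(1.0 - noise_psd / (np.abs(P) ** 2 + 1e-10), 0.1)  # floored gain
    _, clean = istft(P * gain, fs=fs, nperseg=512)
    return clean

mic1, mic2 = np.random.randn(16000), 0.5 * np.random.randn(16000)       # stand-in channels
print(enhance(mic1, mic2).shape)
```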
  • Patent number: 11380307
    Abstract: A method, computer program, and computer system are provided for automated speech recognition. Audio data corresponding to one or more speakers is received. Covariance matrices of target speech and noise associated with the received audio data are estimated based on a gated recurrent unit-based network. A predicted target waveform corresponding to a target speaker from among the one or more speakers is generated by a minimum variance distortionless response function based on the estimated covariance matrices.
    Type: Grant
    Filed: September 30, 2020
    Date of Patent: July 5, 2022
    Assignee: TENCENT AMERICA LLC
    Inventors: Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu
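    The back end here is a minimum variance distortionless response (MVDR) beamformer driven by estimated covariances; below is a sketch of the standard MVDR weight computation for one frequency bin, with a random noise covariance and an assumed steering vector standing in for the GRU-based estimates.
```python
import numpy as np

def mvdr_weights(noise_cov, steering):
    """Classic MVDR solution w = R_n^{-1} d / (d^H R_n^{-1} d); in the abstract the
    covariances come from a GRU-based estimator rather than being fixed."""
    rinv_d = np.linalg.solve(noise_cov, steering)
    return rinv_d / (steering.conj() @ rinv_d)

def apply_beamformer(weights, multichannel_bin):
    """Per-frequency-bin filter-and-sum: y = w^H x."""
    return weights.conj() @ multichannel_bin

# Toy example for one frequency bin with 4 microphones.
mics = 4
Rn = np.eye(mics) + 0.1 * np.random.randn(mics, mics)
Rn = Rn @ Rn.T                                          # make it symmetric positive definite
d = np.exp(1j * np.linspace(0, np.pi, mics))            # assumed steering vector
w = mvdr_weights(Rn, d)
print(apply_beamformer(w, np.random.randn(mics) + 1j * np.random.randn(mics)))
```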
  • Patent number: 11373649
    Abstract: Techniques are described herein for enabling the use of “dynamic” or “context-specific” hot words for an automated assistant. In various implementations, an automated assistant may be operated at least in part on a computing device. Audio data captured by a microphone may be monitored for default hot word(s). Detection of one or more of the default hot words may trigger transition of the automated assistant from a limited hot word listening state into a speech recognition state. Transition of the computing device into a given state may be detected, and in response, the audio data captured by the microphone may be monitored for context-specific hot word(s), in addition to or instead of the default hot word(s). Detection of the context-specific hot word(s) may trigger the automated assistant to perform a responsive action associated with the given state, without requiring detection of default hot word(s).
    Type: Grant
    Filed: August 21, 2018
    Date of Patent: June 28, 2022
    Assignee: GOOGLE LLC
    Inventors: Diego Melendo Casado, Jaclyn Konzelmann
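    A toy sketch of state-dependent hot word sets, assuming the device exposes its current state as a string; the phrase lists and trigger format are illustrative only.
```python
DEFAULT_HOTWORDS = {"hey assistant"}
CONTEXT_HOTWORDS = {"timer_running": {"stop", "cancel", "add a minute"},
                    "media_playing": {"pause", "next", "volume up"}}

def active_hotwords(device_state):
    """Listen for the default phrase plus any phrases tied to the current state."""
    return DEFAULT_HOTWORDS | CONTEXT_HOTWORDS.get(device_state, set())

def on_transcript(transcript, device_state):
    """Context-specific phrases trigger their action directly, no default hot word needed."""
    text = transcript.lower()
    for phrase in active_hotwords(device_state):
        if phrase in text:
            return f"trigger:{phrase}"
    return None

print(on_transcript("stop", "timer_running"))
```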
  • Patent number: 11367029
    Abstract: A system and method are presented for adaptive skill level assignments of agents in contact center environments. A client and a service collaborate to automatically determine the effectiveness of an agent handling an interaction that has been routed using skills-based routing. Evaluation operations may be performed including emotion detection, transcription of audio to text, keyword analysis, and sentiment analysis. The results of the evaluation are aggregated with other information such as the interaction's duration, agent skills and agent skill levels, and call requirement skills and skill levels, to update the agent's profile which is then used for subsequent routing operations.
    Type: Grant
    Filed: February 26, 2020
    Date of Patent: June 21, 2022
    Inventors: James Murison, Johnson Tse, Gaurav Mehrotra, Anthony Lam
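    A small sketch of folding an interaction's aggregated evaluation back into the agent's skill profile for later routing; the equal evaluation weights and the exponential-moving-average update are assumptions for illustration, not the described system.
```python
def update_skill(profile, skill, evaluation, alpha=0.2):
    """Blend the interaction's aggregated evaluation (0..1) into the agent's stored
    level for the routed skill; subsequent routing reads the updated profile."""
    old = profile.get(skill, 0.5)
    profile[skill] = (1 - alpha) * old + alpha * evaluation
    return profile

# Toy aggregation of per-interaction signals: sentiment, keywords, emotion, duration.
evaluation = 0.25 * 0.8 + 0.25 * 0.6 + 0.25 * 0.9 + 0.25 * 0.7
print(update_skill({"billing": 0.55}, "billing", evaluation))
```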
  • Patent number: 11361677
    Abstract: A computing device, method, and non-transitory computer readable medium for articulation training for hearing-impaired persons are disclosed. The computing device comprises a database including stored mel-frequency cepstral representations of audio recordings associated with text and/or images related to the audio recordings, a microphone configured to receive audible inputs, and a display. The computing device is operatively connected to the database, the microphone, and the display. The computing device includes circuitry and program instructions stored therein which, when executed by one or more processors, cause the computing device to receive an audible input from the microphone, convert the audible input to a mel-frequency cepstral representation, search the database for a match of the mel-frequency cepstral representation to a stored mel-frequency cepstral representation, and display the text and/or images related to the stored mel-frequency cepstral representation when a match is found.
    Type: Grant
    Filed: November 10, 2021
    Date of Patent: June 14, 2022
    Assignee: King Abdulaziz University
    Inventor: Wadee Saleh Ahmed Alhalabi
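    A minimal sketch of matching an utterance's mel-frequency cepstral representation against stored templates, assuming the MFCC matrices are already computed; the time-averaged nearest-neighbour comparison and the distance threshold are illustrative simplifications.
```python
import numpy as np

def best_match(query_mfcc, database, max_distance=25.0):
    """Nearest stored mel-cepstral template (Euclidean over time-averaged frames);
    the caller then displays the text/image linked to the matched entry."""
    q = query_mfcc.mean(axis=0)
    scored = [(np.linalg.norm(q - ref.mean(axis=0)), label) for label, ref in database.items()]
    dist, label = min(scored)
    return label if dist < max_distance else None      # None means "no match found"

# Toy database: label -> (frames x 13) MFCC matrix (random stand-ins here).
db = {"cat": np.random.randn(40, 13), "hat": np.random.randn(40, 13)}
print(best_match(np.random.randn(35, 13), db))
```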
  • Patent number: 11355033
    Abstract: A method comprises inputting an audio signal into a machine learning circuit to compress the audio signal into a sequence of actuator signals. The machine learning circuit is trained by receiving a training set of acoustic signals, pre-processing the training set into pre-processed audio data including at least a spectrogram, and then training the machine learning circuit using the pre-processed audio data. The underlying neural network has a cost function based on a reconstruction error and a plurality of constraints. The machine learning circuit generates a sequence of haptic cues corresponding to the audio input, and the sequence of haptic cues is transmitted to a plurality of cutaneous actuators to generate a sequence of haptic outputs.
    Type: Grant
    Filed: April 10, 2018
    Date of Patent: June 7, 2022
    Assignee: Meta Platforms, Inc.
    Inventors: Brian Alexander Knott, Venkatasiva Prasad Chakkabala
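    A PyTorch sketch of the audio-to-haptic compression idea: an encoder maps spectrogram frames to one cue per actuator, and the cost combines a reconstruction error with a sparsity term standing in for the "plurality of constraints"; the layer sizes and random data are placeholders, not the described system.
```python
import torch
import torch.nn as nn

N_FREQ, N_ACTUATORS = 128, 16

# Encoder compresses each spectrogram frame to one cue per cutaneous actuator;
# the decoder exists only to supply the reconstruction term of the cost.
encoder = nn.Sequential(nn.Linear(N_FREQ, 64), nn.ReLU(), nn.Linear(64, N_ACTUATORS), nn.Sigmoid())
decoder = nn.Sequential(nn.Linear(N_ACTUATORS, 64), nn.ReLU(), nn.Linear(64, N_FREQ))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

spectrogram = torch.rand(256, N_FREQ)                # frames x frequency bins (stand-in data)
for _ in range(100):
    cues = encoder(spectrogram)
    recon = decoder(cues)
    # Reconstruction error plus a sparsity constraint as an example additional constraint.
    loss = nn.functional.mse_loss(recon, spectrogram) + 1e-3 * cues.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

haptic_sequence = encoder(spectrogram).detach()      # frames x actuators, sent to the actuators
print(haptic_sequence.shape)
```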
  • Patent number: 11348576
    Abstract: A system configured to process an incoming spoken utterance and to coordinate among multiple speechlet components to execute an action of the utterance, where a trained model considers user history and preference information to select the primary speechlet to execute the action as well as any intermediate speechlets that may provide input data to the speechlet that will ultimately perform the action. The trained model may also consider current dialog information, feedback data, or other data when determining how to process a dialog.
    Type: Grant
    Filed: December 6, 2017
    Date of Patent: May 31, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Bradford Lynch, Adam D. Baran, Kevindra Pal Singh, Udai Sen Mody
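    A toy sketch of choosing a primary speechlet plus any intermediates that feed it; the dictionary-based speechlet descriptions and the scoring lambda are illustrative stand-ins for the trained model.
```python
def plan_speechlets(utterance_intent, user_history, speechlets, score_fn):
    """Rank candidate speechlets for the intent with a (stand-in) scorer, pick the
    primary, then chain in any intermediates whose output the primary needs."""
    ranked = sorted(speechlets, key=lambda s: score_fn(s, utterance_intent, user_history),
                    reverse=True)
    primary = ranked[0]
    intermediates = [s for s in ranked[1:] if s["provides"] & primary["needs"]]
    return intermediates + [primary]                 # execute intermediates first

speechlets = [
    {"name": "restaurant_booking", "needs": {"location"}, "provides": set()},
    {"name": "geolocation", "needs": set(), "provides": {"location"}},
]
score = lambda s, intent, hist: 1.0 if intent in s["name"] else 0.3 + 0.1 * hist.count(s["name"])
print([s["name"] for s in plan_speechlets("restaurant", ["geolocation"], speechlets, score)])
```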
  • Patent number: 11328709
    Abstract: A system for improving dysarthria speech intelligibility, and a method thereof, are provided. In the system, the user only needs to provide a set of paired corpora including a reference corpus and a patient corpus, and a speech disordering module can automatically generate a new corpus completely synchronous with the reference corpus; the new corpus can be used as a training corpus for training a dysarthria voice conversion model. The present invention does not need conventional corpus alignment technology or manual pre-processing of the training corpus, so that manpower cost and time cost can be reduced and synchronization of the training corpus can be ensured, thereby improving both the training and conversion quality of the voice conversion model.
    Type: Grant
    Filed: August 20, 2019
    Date of Patent: May 10, 2022
    Assignee: NATIONAL CHUNG CHENG UNIVERSITY
    Inventors: Tay-Jyi Lin, Ching-Hau Sung, Che-Chia Pai, Ching-Wei Yeh
  • Patent number: 11328714
    Abstract: Processing data for speech recognition by generating hypotheses from input data, assigning each hypothesis a score according to a confidence level value and hypothesis ranking, executing a pass/fail grammar test against each hypothesis, generating replacement hypotheses according to grammar test failures, assigning each replacement hypothesis a score according to the number of hypothesis changes, and providing a set of hypotheses, wherein the set comprises at least one replacement hypothesis.
    Type: Grant
    Filed: January 2, 2020
    Date of Patent: May 10, 2022
    Assignee: International Business Machines Corporation
    Inventors: Andrew R. Freed, Marco Noel, Victor Povar
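    A compact sketch of the score-then-grammar-test flow, with a regex standing in for the pass/fail grammar and a hypothetical single-token repair function as the replacement generator; scoring weights are illustrative.
```python
import re

def passes_grammar(hypothesis):
    """Toy pass/fail grammar: require a digit-like token for a numeric slot."""
    return bool(re.search(r"\b\d+\b", hypothesis))

def rerank(hypotheses, confidences, replace_fn):
    """Score = confidence adjusted by rank; failed hypotheses get replacements
    penalized by how many tokens were changed."""
    out = []
    ranked = sorted(zip(hypotheses, confidences), key=lambda p: -p[1])
    for rank, (hyp, conf) in enumerate(ranked):
        score = conf - 0.01 * rank
        if passes_grammar(hyp):
            out.append((hyp, score))
        else:
            replacement, n_changes = replace_fn(hyp)
            out.append((replacement, score - 0.05 * n_changes))
    return out

fix_numbers = lambda h: (h.replace("to", "2"), 1)       # hypothetical single-token repair
print(rerank(["set timer for to minutes", "set timer for 2 minutes"], [0.8, 0.7], fix_numbers))
```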
  • Patent number: 11322157
    Abstract: A method of speaker authentication comprises: receiving a speech signal; dividing the speech signal into segments; and, following each segment, obtaining an authentication score based on said segment and previously received segments, wherein the authentication score represents a probability that the speech signal comes from a specific registered speaker. In response to an authentication request, an authentication result is output based on the authentication score.
    Type: Grant
    Filed: June 6, 2017
    Date of Patent: May 3, 2022
    Assignee: Cirrus Logic, Inc.
    Inventors: Carlos Vaquero Avilés-Casco, David Martínez González, Ryan Roberts
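    A minimal sketch of segment-by-segment authentication scoring, assuming a speaker embedding per segment and cosine similarity against an enrolled embedding; the running mean and the threshold are illustrative choices, not the claimed scoring method.
```python
import numpy as np

class IncrementalAuthenticator:
    """Accumulate per-segment scores so an authentication result is available
    whenever a request arrives, based on all segments received so far."""
    def __init__(self, enrolled_embedding, threshold=0.7):
        self.enrolled = enrolled_embedding / np.linalg.norm(enrolled_embedding)
        self.threshold = threshold
        self.scores = []

    def add_segment(self, segment_embedding):
        e = segment_embedding / np.linalg.norm(segment_embedding)
        self.scores.append(float(self.enrolled @ e))     # cosine similarity for this segment
        return np.mean(self.scores)                      # running authentication score

    def authenticate(self):
        return len(self.scores) > 0 and np.mean(self.scores) >= self.threshold

auth = IncrementalAuthenticator(np.random.randn(192))    # stand-in enrolled embedding
for _ in range(3):
    auth.add_segment(np.random.randn(192))               # stand-in segment embeddings
print(auth.authenticate())
```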
  • Patent number: 11315552
    Abstract: This disclosure describes systems and techniques for receiving a request for information from a user and, in response, outputting the requested information along with unsolicited, interesting content that is related to, yet nonresponsive to, the requested information. In some instances, if the requested information is unknown, the techniques may output an indication that the information is unknown, followed by the additional, unsolicited, interesting content.
    Type: Grant
    Filed: March 23, 2018
    Date of Patent: April 26, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Michael Martin George, David Garfield Uffelman, Deepak Maini, Kyle Beyer, Amarpaul Singh Sandhu
  • Patent number: 11302311
    Abstract: An artificial intelligence apparatus for recognizing speech of a user includes a microphone and a processor configured to receive, via the microphone, a sound signal corresponding to the speech of the user, acquire personalized identification information corresponding to the speech, recognize the speech from the sound signal using a global language model, calculate a reliability for the recognition, and, if the calculated reliability exceeds a predetermined first reference value, update a personalized language model corresponding to the personalized identification information using the recognition result.
    Type: Grant
    Filed: August 21, 2019
    Date of Patent: April 12, 2022
    Assignee: LG ELECTRONICS INC.
    Inventors: Boseop Kim, Jaehong Kim
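    A sketch of gating personalized-language-model updates on recognition reliability; the unigram-count "language model" and the stubbed ASR callable are assumptions for illustration, not LG's implementation.
```python
def recognize_and_personalize(audio, speaker_id, global_lm, personal_lms,
                              asr, reliability_threshold=0.85):
    """Run recognition with the global language model; only sufficiently reliable
    results are folded into the speaker's personalized language model."""
    text, reliability = asr(audio, global_lm)
    if reliability > reliability_threshold:
        plm = personal_lms.setdefault(speaker_id, {})
        for word in text.split():
            plm[word] = plm.get(word, 0) + 1             # simple unigram count update
    return text

# Hypothetical ASR stub returning (transcript, reliability).
fake_asr = lambda audio, lm: ("turn on the lights", 0.92)
lms = {}
recognize_and_personalize(b"...", "user-1", None, lms, fake_asr)
print(lms)
```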
  • Patent number: 11302312
    Abstract: A new model is introduced into a particular domain that receives a routing of a dialog from a speech processing component. A method associated with the model includes running a set of test utterances through the speech processing component that enables a spoken language dialog with a user to establish a baseline score associated with processing for the set of test utterances. The speech processing component determines an intent of the user and routes the spoken language dialog to a network-based domain based on the intent. The method includes establishing an automatic test run of the set of test utterances to obtain a current score and, when a threshold associated with the difference between the current score and the baseline score is breached, switching, at the network-based domain, from the false accept detection model to a second model.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: April 12, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Ajay Soni, Xi Chen, Jingqian Zhao, Liu Yang, Prathap Ramachandra, Ruiqi Luo
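    A sketch of the baseline-versus-current regression check that triggers a model switch; the stand-in router, the accuracy metric, and the drop threshold are illustrative assumptions.
```python
def nightly_check(test_utterances, run_pipeline, baseline_score, max_drop=0.05):
    """Re-run the fixed utterance set, compare against the stored baseline, and
    signal a model switch when the drop breaches the threshold."""
    correct = sum(run_pipeline(text) == expected_domain
                  for text, expected_domain in test_utterances)
    current = correct / len(test_utterances)
    return ("switch_model" if baseline_score - current > max_drop else "keep_model", current)

route = lambda text: "music" if "play" in text else "weather"        # stand-in router
tests = [("play jazz", "music"), ("what's the forecast", "weather"), ("play the news", "news")]
print(nightly_check(tests, route, baseline_score=1.0))
```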
  • Patent number: 11295730
    Abstract: A method is described that includes processing text and speech from an input utterance using local overrides of default dictionary pronunciations. Applying this method, a word-level grammar used to process the tokens specifies at least one local word phonetic variant that applies within a specific production rule and, within a local context of the specific production rule, the local word phonetic variant overrides one or more default dictionary phonetic versions of the word. This method can be applied to parsing utterances where the pronunciation of some words depends on their syntactic or semantic context.
    Type: Grant
    Filed: August 1, 2019
    Date of Patent: April 5, 2022
    Assignee: SoundHound, Inc.
    Inventors: Keyvan Mohajer, Christopher Wilson, Bernard Mont-Reynaud
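    A minimal sketch of production-rule-local pronunciation overrides, using ARPAbet-style strings; the rule names and dictionary entries are illustrative, not SoundHound's grammar format.
```python
DEFAULT_DICT = {"read": ["R IY D", "R EH D"], "live": ["L IH V", "L AY V"]}

# Per-production-rule overrides: inside these rules only one pronunciation is legal.
RULE_OVERRIDES = {"PAST_TENSE_VP": {"read": ["R EH D"]},
                  "BROADCAST_NP":  {"live": ["L AY V"]}}

def pronunciations(word, rule=None):
    """The local variant wins within its production rule; elsewhere the default
    dictionary pronunciations apply."""
    return RULE_OVERRIDES.get(rule, {}).get(word, DEFAULT_DICT.get(word, []))

print(pronunciations("read", rule="PAST_TENSE_VP"))    # ['R EH D']
print(pronunciations("read"))                          # both dictionary variants
```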
  • Patent number: 11257483
    Abstract: Spoken language understanding techniques include training a dynamic neural network mask relative to a static neural network using only post-deployment training data such that the mask zeroes out some of the weights of the static neural network and allows some other weights to pass through and applying a dynamic neural network corresponding to the masked static neural network to input queries to identify outputs for the queries.
    Type: Grant
    Filed: March 29, 2019
    Date of Patent: February 22, 2022
    Assignee: Intel Corporation
    Inventors: Krzysztof Czarnowski, Munir Georges
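    A PyTorch sketch of a trainable binary mask over frozen static weights, updated with a straight-through estimator on stand-in post-deployment data; this illustrates the masking idea only, not Intel's implementation.
```python
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    """Frozen static weights; only the mask logits are trained post-deployment.
    A hard threshold zeroes some weights and lets the others pass through."""
    def __init__(self, static_linear):
        super().__init__()
        self.weight = static_linear.weight.detach()                 # frozen static weights
        self.bias = static_linear.bias.detach()
        self.mask_logits = nn.Parameter(torch.ones_like(self.weight))

    def forward(self, x):
        soft = torch.sigmoid(self.mask_logits)
        hard = (soft > 0.5).float()
        mask = hard + soft - soft.detach()                          # straight-through gradient
        return nn.functional.linear(x, self.weight * mask, self.bias)

static = nn.Linear(32, 8)
layer = MaskedLinear(static)
opt = torch.optim.Adam([layer.mask_logits], lr=1e-2)
x, y = torch.randn(16, 32), torch.randint(0, 8, (16,))              # stand-in post-deployment data
loss = nn.functional.cross_entropy(layer(x), y)
loss.backward()
opt.step()
print((torch.sigmoid(layer.mask_logits) > 0.5).float().mean())      # fraction of weights kept
```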
  • Patent number: 11257491
    Abstract: This application relates generally to modifying visual data based on audio commands and, more specifically, to performing complex operations that modify visual data based on one or more audio commands. In some embodiments, a computer system may receive an audio input and identify an audio command based on the audio input. The audio command may be mapped to one or more operations capable of being performed by a multimedia editing application. The computer system may perform the one or more operations to edit the received multimedia data.
    Type: Grant
    Filed: November 29, 2018
    Date of Patent: February 22, 2022
    Assignee: ADOBE INC.
    Inventors: Sarah Kong, Yinglan Ma, Hyunghwan Byun, Chih-Yao Hsieh
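    A toy sketch of mapping a recognized phrase to one or more editing operations and applying them; the phrase table and the editor methods are hypothetical, not Adobe's API.
```python
OPERATION_MAP = {
    "brighten": [("adjust_exposure", {"stops": +0.5})],
    "make it vintage": [("apply_lut", {"name": "sepia"}), ("add_grain", {"amount": 0.2})],
}

def handle_audio_command(transcript, editor):
    """Map a recognized phrase to one or more editing operations and run them."""
    for phrase, operations in OPERATION_MAP.items():
        if phrase in transcript.lower():
            for op_name, params in operations:
                getattr(editor, op_name)(**params)     # hypothetical editor methods
            return operations
    return []

class FakeEditor:                                      # stand-in multimedia editor
    def adjust_exposure(self, stops): print("exposure", stops)
    def apply_lut(self, name): print("lut", name)
    def add_grain(self, amount): print("grain", amount)

print(handle_audio_command("please brighten the photo", FakeEditor()))
```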