Patents Examined by Michelle M Koeth
  • Patent number: 11423334
    Abstract: An explainable artificially intelligent (XAI) application contains an ordered sequence of artificially intelligent software modules. When an input dataset is submitted to the application, each module generates an output dataset and an explanation that represents, as a set of Boolean expressions, the reasoning by which each output element was chosen. If any pair of explanations is determined to be semantically inconsistent, and if this determination is confirmed by further determining that the apparent inconsistency was not a correct response to an unexpected characteristic of the input dataset, nonzero inconsistency scores are assigned to the inconsistent elements of the pair of explanations.
    Type: Grant
    Filed: May 8, 2020
    Date of Patent: August 23, 2022
    Assignee: KYNDRYL, INC.
    Inventors: Sreekrishnan Venkateswaran, Debasisha Padhi, Shubhi Asthana, Anuradha Bhamidipaty, Ashish Kundu
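    A minimal sketch (not the patented implementation) of the pairwise consistency check described in the abstract above, assuming each module's explanation has been reduced to a dictionary mapping Boolean variables to asserted truth values; all names below are illustrative, and the confirmation step against unexpected input is noted only as a comment.
```python
from itertools import combinations

def inconsistency_scores(explanations):
    """explanations: one dict per module output element, mapping Boolean variables
    (the explanation's literals) to the truth value the module asserted.
    Any conflicting pair gets a nonzero score; zero means no inconsistency found."""
    scores = [0.0] * len(explanations)
    for i, j in combinations(range(len(explanations)), 2):
        conflicts = [v for v in explanations[i]
                     if v in explanations[j] and explanations[i][v] != explanations[j][v]]
        if conflicts:
            # The confirmation step (checking the apparent conflict was not a correct
            # reaction to an unexpected input characteristic) would gate this assignment.
            penalty = len(conflicts) / max(len(explanations[i]), len(explanations[j]))
            scores[i] += penalty
            scores[j] += penalty
    return scores

print(inconsistency_scores([{"high_risk": True, "long_tenure": False},
                            {"high_risk": False, "long_tenure": False}]))
```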
  • Patent number: 11423885
    Abstract: Techniques are described herein for selectively processing a user's utterances captured prior to and after an event that invokes an automated assistant, to determine the user's intent and/or any parameters required for resolving the user's intent. In various implementations, respective measures of fitness for triggering responsive action by the automated assistant may be determined for the pre-event and post-event input streams. Based on the respective measures of fitness, one or both of the pre-event input stream and the post-event input stream may be selected and used to cause the automated assistant to perform one or more responsive actions.
    Type: Grant
    Filed: February 20, 2019
    Date of Patent: August 23, 2022
    Assignee: GOOGLE LLC
    Inventors: Matthew Sharifi, Tom Hume, Mohamad Hassan Mohamad Rom, Jan Althaus, Diego Melendo Casado
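    A rough sketch of selecting between the pre-event and post-event streams by a fitness measure; the toy word-overlap fitness function and the 0.5 threshold below are assumptions for illustration, not the scoring model described in the patent.
```python
# Illustrative stream selection between pre- and post-event transcripts.
COMMAND_WORDS = {"set", "turn", "play", "remind", "timer", "lights", "off", "on"}

def fitness(text):
    """Toy fitness measure: fraction of words that look like command vocabulary."""
    words = text.lower().split()
    return sum(w in COMMAND_WORDS for w in words) / max(len(words), 1)

def select_streams(pre_text, post_text, threshold=0.5):
    """Keep whichever transcribed stream(s) look actionable; fall back to post-event."""
    chosen = [(label, text) for label, text in [("pre", pre_text), ("post", post_text)]
              if fitness(text) >= threshold]
    return chosen or [("post", post_text)]

print(select_streams("turn the lights off", "uh what was I saying"))
```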
  • Patent number: 11417321
    Abstract: A device for changing a speech recognition sensitivity can include a memory and a processor. The processor is configured to obtain a first plurality of speech data input at different times, apply a pre-trained speech recognition model to that speech data at a plurality of different speech recognition sensitivities, and obtain, from among those sensitivities, a first speech recognition sensitivity corresponding to an optimal sensitivity at which the speech recognition success rate of the model satisfies a set first recognition success rate criterion. The processor then changes the setting of the speech recognition sensitivity based on the first speech recognition sensitivity so obtained.
    Type: Grant
    Filed: April 24, 2020
    Date of Patent: August 16, 2022
    Assignee: LG ELECTRONICS INC.
    Inventors: Sang Won Kim, Joonbeom Lee
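    A minimal sketch of sweeping candidate sensitivities and keeping the one whose success rate meets the criterion; the recognizer callable, the samples, and the success-rate threshold are stand-ins, not LG's model or data.
```python
def pick_sensitivity(samples, recognize, sensitivities, min_success_rate=0.9):
    """Sweep candidate sensitivities; return the best one whose success rate over
    the recorded samples satisfies the configured criterion (else None)."""
    best = None
    for s in sensitivities:
        hits = sum(recognize(audio, sensitivity=s) == expected for audio, expected in samples)
        rate = hits / len(samples)
        if rate >= min_success_rate and (best is None or rate > best[1]):
            best = (s, rate)
    return best

# Stand-in recognizer: succeeds only near a "sweet spot" sensitivity of 0.6.
fake_recognize = lambda audio, sensitivity: audio if abs(sensitivity - 0.6) < 0.15 else None
samples = [(f"utt{i}", f"utt{i}") for i in range(10)]
print(pick_sensitivity(samples, fake_recognize, [0.2, 0.4, 0.6, 0.8]))
```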
  • Patent number: 11404046
    Abstract: An audio processing device for speech recognition is provided, which includes a memory circuit, a power spectrum transfer circuit, and a feature extraction circuit. The power spectrum transfer circuit is coupled to the memory circuit; it reads frequency spectrum coefficients of time-domain audio sample data from the memory circuit, generates compressed power parameters by performing power spectrum transfer processing and compressing processing according to the frequency spectrum coefficients, and writes the compressed power parameters into the memory circuit. The feature extraction circuit is also coupled to the memory circuit; it reads the compressed power parameters from the memory circuit and generates an audio feature vector by performing mel-filtering and frequency-to-time transfer processing according to the compressed power parameters. The bit width of the compressed power parameters is less than the bit width of the frequency spectrum coefficients.
    Type: Grant
    Filed: May 6, 2020
    Date of Patent: August 2, 2022
    Assignee: XSail Technology Co., Ltd
    Inventors: Meng-Hao Feng, Chao Chen
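    A sketch of the spectrum-to-feature pipeline under stated assumptions: an 8-bit log quantization stands in for the compression stage, and a textbook mel filter bank plus DCT stands in for the circuits described in the claims.
```python
import numpy as np
from scipy.fftpack import dct

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular mel filters over the positive-frequency bins."""
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    inv_mel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    pts = inv_mel(np.linspace(mel(0.0), mel(sr / 2.0), n_filters + 2))
    bins = np.floor((n_fft + 1) * pts / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fb[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fb[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    return fb

def features_from_spectrum(spectrum_coeffs, sr=16000, n_filters=26, n_ceps=13):
    """Power spectrum -> narrow-bit-width (8-bit log) power -> mel filtering -> DCT,
    mirroring the write/read-back split between the two circuits in the abstract."""
    power = (np.abs(spectrum_coeffs) ** 2) / len(spectrum_coeffs)
    logp = np.log(power + 1e-10)
    # "Compressed power parameters": 8-bit quantization, narrower than the raw coefficients.
    q = np.round((logp - logp.min()) / (np.ptp(logp) + 1e-10) * 255).astype(np.uint8)
    dequant = q / 255.0 * np.ptp(logp) + logp.min()
    fb = mel_filterbank(n_filters, (len(spectrum_coeffs) - 1) * 2, sr)
    mel_energies = fb @ np.exp(dequant)
    return dct(np.log(mel_energies + 1e-10), norm="ortho")[:n_ceps]

frame = np.fft.rfft(np.random.randn(512))            # stand-in frequency spectrum coefficients
print(features_from_spectrum(frame).shape)            # (13,)
```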
  • Patent number: 11393488
    Abstract: Embodiments of the disclosure provide systems and methods for enhancing audio signals. The system may include a communication interface configured to receive multi-channel audio signals acquired from a common signal source. The system may further include at least one processor. The at least one processor may be configured to separate the multi-channel audio signals into a first audio signal and a second audio signal in a time domain. The at least one processor may be further configured to decompose the first audio signal and the second audio signal in a frequency domain to obtain first decomposition data and second decomposition data, respectively. The at least one processor may be also configured to estimate a noise component in the frequency domain based on the first decomposition data and the second decomposition data. The at least one processor may be additionally configured to enhance the first audio signal based on the estimated noise component.
    Type: Grant
    Filed: April 24, 2020
    Date of Patent: July 19, 2022
    Assignee: BEIJING DIDI INFINITY TECHNOLOGY AND DEVELOPMENT CO., LTD.
    Inventors: Yi Zhang, Hui Song, Chengyun Deng, Yongtao Sha
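    A minimal two-channel sketch of the decompose / estimate-noise / enhance flow; plain spectral subtraction with the reference channel's average power as the noise estimate is used here as a stand-in for the estimator described in the abstract.
```python
import numpy as np
from scipy.signal import stft, istft

def enhance(primary, reference, fs=16000):
    """Decompose both channels with an STFT, take the reference channel's power as a
    rough noise estimate, and spectrally subtract it from the primary channel."""
    _, _, P = stft(primary, fs=fs, nperseg=512)
    _, _, R = stft(reference, fs=fs, nperseg=512)
    noise_psd = np.mean(np.abs(R) ** 2, axis=1, keepdims=True)         # noise estimate
    gain = np.maximum(1.0 - noise_psd / (np.abs(P) ** 2 + 1e-10), 0.1)  # floored gain
    _, clean = istft(P * gain, fs=fs, nperseg=512)
    return clean

mic1, mic2 = np.random.randn(16000), 0.5 * np.random.randn(16000)       # stand-in channels
print(enhance(mic1, mic2).shape)
```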
  • Patent number: 11380307
    Abstract: A method, computer program, and computer system are provided for automated speech recognition. Audio data corresponding to one or more speakers is received. Covariance matrices of target speech and noise associated with the received audio data are estimated based on a gated recurrent unit-based network. A predicted target waveform corresponding to a target speaker from among the one or more speakers is generated by a minimum variance distortionless response function based on the estimated covariance matrices.
    Type: Grant
    Filed: September 30, 2020
    Date of Patent: July 5, 2022
    Assignee: TENCENT AMERICA LLC
    Inventors: Yong Xu, Meng Yu, Shi-Xiong Zhang, Dong Yu
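    The back end here is a minimum variance distortionless response (MVDR) beamformer driven by estimated covariances; below is a sketch of the standard MVDR weight computation for one frequency bin, with a random noise covariance and an assumed steering vector standing in for the GRU-based estimates.
```python
import numpy as np

def mvdr_weights(noise_cov, steering):
    """Classic MVDR solution w = R_n^{-1} d / (d^H R_n^{-1} d); in the abstract the
    covariances come from a GRU-based estimator rather than being fixed."""
    rinv_d = np.linalg.solve(noise_cov, steering)
    return rinv_d / (steering.conj() @ rinv_d)

def apply_beamformer(weights, multichannel_bin):
    """Per-frequency-bin filter-and-sum: y = w^H x."""
    return weights.conj() @ multichannel_bin

# Toy example for one frequency bin with 4 microphones.
mics = 4
Rn = np.eye(mics) + 0.1 * np.random.randn(mics, mics)
Rn = Rn @ Rn.T                                          # make it symmetric positive definite
d = np.exp(1j * np.linspace(0, np.pi, mics))            # assumed steering vector
w = mvdr_weights(Rn, d)
print(apply_beamformer(w, np.random.randn(mics) + 1j * np.random.randn(mics)))
```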
  • Patent number: 11373649
    Abstract: Techniques are described herein for enabling the use of “dynamic” or “context-specific” hot words for an automated assistant. In various implementations, an automated assistant may be operated at least in part on a computing device. Audio data captured by a microphone may be monitored for default hot word(s). Detection of one or more of the default hot words may trigger transition of the automated assistant from a limited hot word listening state into a speech recognition state. Transition of the computing device into a given state may be detected, and in response, the audio data captured by the microphone may be monitored for context-specific hot word(s), in addition to or instead of the default hot word(s). Detection of the context-specific hot word(s) may trigger the automated assistant to perform a responsive action associated with the given state, without requiring detection of default hot word(s).
    Type: Grant
    Filed: August 21, 2018
    Date of Patent: June 28, 2022
    Assignee: GOOGLE LLC
    Inventors: Diego Melendo Casado, Jaclyn Konzelmann
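    A toy sketch of state-dependent hot word sets, assuming the device exposes its current state as a string; the phrase lists and trigger format are illustrative only.
```python
DEFAULT_HOTWORDS = {"hey assistant"}
CONTEXT_HOTWORDS = {"timer_running": {"stop", "cancel", "add a minute"},
                    "media_playing": {"pause", "next", "volume up"}}

def active_hotwords(device_state):
    """Listen for the default phrase plus any phrases tied to the current state."""
    return DEFAULT_HOTWORDS | CONTEXT_HOTWORDS.get(device_state, set())

def on_transcript(transcript, device_state):
    """Context-specific phrases trigger their action directly, no default hot word needed."""
    text = transcript.lower()
    for phrase in active_hotwords(device_state):
        if phrase in text:
            return f"trigger:{phrase}"
    return None

print(on_transcript("stop", "timer_running"))
```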
  • Patent number: 11367029
    Abstract: A system and method are presented for adaptive skill level assignments of agents in contact center environments. A client and a service collaborate to automatically determine the effectiveness of an agent handling an interaction that has been routed using skills-based routing. Evaluation operations may be performed including emotion detection, transcription of audio to text, keyword analysis, and sentiment analysis. The results of the evaluation are aggregated with other information such as the interaction's duration, agent skills and agent skill levels, and call requirement skills and skill levels, to update the agent's profile which is then used for subsequent routing operations.
    Type: Grant
    Filed: February 26, 2020
    Date of Patent: June 21, 2022
    Inventors: James Murison, Johnson Tse, Gaurav Mehrotra, Anthony Lam
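    A small sketch of folding an interaction's aggregated evaluation back into the agent's skill profile for later routing; the equal evaluation weights and the exponential-moving-average update are assumptions for illustration, not the described system.
```python
def update_skill(profile, skill, evaluation, alpha=0.2):
    """Blend the interaction's aggregated evaluation (0..1) into the agent's stored
    level for the routed skill; subsequent routing reads the updated profile."""
    old = profile.get(skill, 0.5)
    profile[skill] = (1 - alpha) * old + alpha * evaluation
    return profile

# Toy aggregation of per-interaction signals: sentiment, keywords, emotion, duration.
evaluation = 0.25 * 0.8 + 0.25 * 0.6 + 0.25 * 0.9 + 0.25 * 0.7
print(update_skill({"billing": 0.55}, "billing", evaluation))
```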
  • Patent number: 11361677
    Abstract: A computing device, method, and non-transitory computer readable medium for articulation training for hearing-impaired persons are disclosed. The computing device comprises a database including stored mel-frequency cepstral representations of audio recordings associated with text and/or images related to the audio recordings, a microphone configured to receive audible inputs, and a display. The computing device is operatively connected to the database, the microphone, and the display. The computing device includes circuitry and program instructions stored therein which, when executed by one or more processors, cause the computing device to receive an audible input from the microphone, convert the audible input to a mel-frequency cepstral representation, search the database for a match of the mel-frequency cepstral representation to a stored mel-frequency cepstral representation, and display the text and/or images related to the stored mel-frequency cepstral representation when a match is found.
    Type: Grant
    Filed: November 10, 2021
    Date of Patent: June 14, 2022
    Assignee: King Abdulaziz University
    Inventor: Wadee Saleh Ahmed Alhalabi
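    A minimal sketch of matching an utterance's mel-frequency cepstral representation against stored templates, assuming the MFCC matrices are already computed; the time-averaged nearest-neighbour comparison and the distance threshold are illustrative simplifications.
```python
import numpy as np

def best_match(query_mfcc, database, max_distance=25.0):
    """Nearest stored mel-cepstral template (Euclidean over time-averaged frames);
    the caller then displays the text/image linked to the matched entry."""
    q = query_mfcc.mean(axis=0)
    scored = [(np.linalg.norm(q - ref.mean(axis=0)), label) for label, ref in database.items()]
    dist, label = min(scored)
    return label if dist < max_distance else None      # None means "no match found"

# Toy database: label -> (frames x 13) MFCC matrix (random stand-ins here).
db = {"cat": np.random.randn(40, 13), "hat": np.random.randn(40, 13)}
print(best_match(np.random.randn(35, 13), db))
```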
  • Patent number: 11355033
    Abstract: A method comprises inputting an audio signal into a machine learning circuit to compress the audio signal into a sequence of actuator signals. The machine learning circuit is trained by receiving a training set of acoustic signals, pre-processing the training set into pre-processed audio data including at least a spectrogram, and then training the machine learning circuit using the pre-processed audio data. The underlying neural network has a cost function based on a reconstruction error and a plurality of constraints. The machine learning circuit generates a sequence of haptic cues corresponding to the audio input, and the sequence of haptic cues is transmitted to a plurality of cutaneous actuators to generate a sequence of haptic outputs.
    Type: Grant
    Filed: April 10, 2018
    Date of Patent: June 7, 2022
    Assignee: Meta Platforms, Inc.
    Inventors: Brian Alexander Knott, Venkatasiva Prasad Chakkabala
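    A PyTorch sketch of the audio-to-haptic compression idea: an encoder maps spectrogram frames to one cue per actuator, and the cost combines a reconstruction error with a sparsity term standing in for the "plurality of constraints"; the layer sizes and random data are placeholders, not the described system.
```python
import torch
import torch.nn as nn

N_FREQ, N_ACTUATORS = 128, 16

# Encoder compresses each spectrogram frame to one cue per cutaneous actuator;
# the decoder exists only to supply the reconstruction term of the cost.
encoder = nn.Sequential(nn.Linear(N_FREQ, 64), nn.ReLU(), nn.Linear(64, N_ACTUATORS), nn.Sigmoid())
decoder = nn.Sequential(nn.Linear(N_ACTUATORS, 64), nn.ReLU(), nn.Linear(64, N_FREQ))
opt = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)

spectrogram = torch.rand(256, N_FREQ)                # frames x frequency bins (stand-in data)
for _ in range(100):
    cues = encoder(spectrogram)
    recon = decoder(cues)
    # Reconstruction error plus a sparsity constraint as an example additional constraint.
    loss = nn.functional.mse_loss(recon, spectrogram) + 1e-3 * cues.abs().mean()
    opt.zero_grad()
    loss.backward()
    opt.step()

haptic_sequence = encoder(spectrogram).detach()      # frames x actuators, sent to the actuators
print(haptic_sequence.shape)
```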
  • Patent number: 11348576
    Abstract: A system configured to process an incoming spoken utterance and to coordinate among multiple speechlet components to execute an action of the utterance, where a trained model considers user history and preference information to select the primary speechlet to execute the action as well as any intermediate speechlets that may provide input data to the speechlet that will ultimately perform the action. The trained model may also consider current dialog information, feedback data, or other data when determining how to process a dialog.
    Type: Grant
    Filed: December 6, 2017
    Date of Patent: May 31, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Bradford Lynch, Adam D. Baran, Kevindra Pal Singh, Udai Sen Mody
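    A toy sketch of choosing a primary speechlet plus any intermediates that feed it; the dictionary-based speechlet descriptions and the scoring lambda are illustrative stand-ins for the trained model.
```python
def plan_speechlets(utterance_intent, user_history, speechlets, score_fn):
    """Rank candidate speechlets for the intent with a (stand-in) scorer, pick the
    primary, then chain in any intermediates whose output the primary needs."""
    ranked = sorted(speechlets, key=lambda s: score_fn(s, utterance_intent, user_history),
                    reverse=True)
    primary = ranked[0]
    intermediates = [s for s in ranked[1:] if s["provides"] & primary["needs"]]
    return intermediates + [primary]                 # execute intermediates first

speechlets = [
    {"name": "restaurant_booking", "needs": {"location"}, "provides": set()},
    {"name": "geolocation", "needs": set(), "provides": {"location"}},
]
score = lambda s, intent, hist: 1.0 if intent in s["name"] else 0.3 + 0.1 * hist.count(s["name"])
print([s["name"] for s in plan_speechlets("restaurant", ["geolocation"], speechlets, score)])
```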
  • Patent number: 11328709
    Abstract: A system for improving dysarthria speech intelligibility, and a method thereof, are provided. In the system, the user only needs to provide a set of paired corpora including a reference corpus and a patient corpus, and a speech disordering module can automatically generate a new corpus completely synchronous with the reference corpus; the new corpus can be used as a training corpus for training a dysarthria voice conversion model. The present invention does not need conventional corpus alignment technology or manual pre-processing of the training corpus, so that manpower cost and time cost can be reduced and synchronization of the training corpus can be ensured, thereby improving both the training and conversion quality of the voice conversion model.
    Type: Grant
    Filed: August 20, 2019
    Date of Patent: May 10, 2022
    Assignee: NATIONAL CHUNG CHENG UNIVERSITY
    Inventors: Tay-Jyi Lin, Ching-Hau Sung, Che-Chia Pai, Ching-Wei Yeh
  • Patent number: 11328714
    Abstract: Processing data for speech recognition by generating hypotheses from input data, assigning each hypothesis a score according to a confidence level value and hypothesis ranking, executing a pass/fail grammar test against each hypothesis, generating replacement hypotheses according to grammar test failures, assigning each replacement hypothesis a score according to the number of hypothesis changes, and providing a set of hypotheses, wherein the set comprises at least one replacement hypothesis.
    Type: Grant
    Filed: January 2, 2020
    Date of Patent: May 10, 2022
    Assignee: International Business Machines Corporation
    Inventors: Andrew R. Freed, Marco Noel, Victor Povar
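    A compact sketch of the score-then-grammar-test flow, with a regex standing in for the pass/fail grammar and a hypothetical single-token repair function as the replacement generator; scoring weights are illustrative.
```python
import re

def passes_grammar(hypothesis):
    """Toy pass/fail grammar: require a digit-like token for a numeric slot."""
    return bool(re.search(r"\b\d+\b", hypothesis))

def rerank(hypotheses, confidences, replace_fn):
    """Score = confidence adjusted by rank; failed hypotheses get replacements
    penalized by how many tokens were changed."""
    out = []
    ranked = sorted(zip(hypotheses, confidences), key=lambda p: -p[1])
    for rank, (hyp, conf) in enumerate(ranked):
        score = conf - 0.01 * rank
        if passes_grammar(hyp):
            out.append((hyp, score))
        else:
            replacement, n_changes = replace_fn(hyp)
            out.append((replacement, score - 0.05 * n_changes))
    return out

fix_numbers = lambda h: (h.replace("to", "2"), 1)       # hypothetical single-token repair
print(rerank(["set timer for to minutes", "set timer for 2 minutes"], [0.8, 0.7], fix_numbers))
```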
  • Patent number: 11322157
    Abstract: A method of speaker authentication comprises: receiving a speech signal; dividing the speech signal into segments; and, following each segment, obtaining an authentication score based on said segment and previously received segments, wherein the authentication score represents a probability that the speech signal comes from a specific registered speaker. In response to an authentication request, an authentication result is output based on the authentication score.
    Type: Grant
    Filed: June 6, 2017
    Date of Patent: May 3, 2022
    Assignee: Cirrus Logic, Inc.
    Inventors: Carlos Vaquero Avilés-Casco, David Martínez González, Ryan Roberts
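    A minimal sketch of segment-by-segment authentication scoring, assuming a speaker embedding per segment and cosine similarity against an enrolled embedding; the running mean and the threshold are illustrative choices, not the claimed scoring method.
```python
import numpy as np

class IncrementalAuthenticator:
    """Accumulate per-segment scores so an authentication result is available
    whenever a request arrives, based on all segments received so far."""
    def __init__(self, enrolled_embedding, threshold=0.7):
        self.enrolled = enrolled_embedding / np.linalg.norm(enrolled_embedding)
        self.threshold = threshold
        self.scores = []

    def add_segment(self, segment_embedding):
        e = segment_embedding / np.linalg.norm(segment_embedding)
        self.scores.append(float(self.enrolled @ e))     # cosine similarity for this segment
        return np.mean(self.scores)                      # running authentication score

    def authenticate(self):
        return len(self.scores) > 0 and np.mean(self.scores) >= self.threshold

auth = IncrementalAuthenticator(np.random.randn(192))    # stand-in enrolled embedding
for _ in range(3):
    auth.add_segment(np.random.randn(192))               # stand-in segment embeddings
print(auth.authenticate())
```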
  • Patent number: 11315552
    Abstract: This disclosure describes systems and techniques for receiving a request for information from a user and, in response, outputting the requested information along with unsolicited, interesting content that is related to, yet nonresponsive to, the requested information. In some instances, if the requested information is unknown, the techniques may output an indication that the information is unknown, followed by the additional, unsolicited, interesting content.
    Type: Grant
    Filed: March 23, 2018
    Date of Patent: April 26, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Michael Martin George, David Garfield Uffelman, Deepak Maini, Kyle Beyer, Amarpaul Singh Sandhu
  • Patent number: 11302311
    Abstract: An artificial intelligence apparatus for recognizing speech of a user includes a microphone and a processor configured to receive, via the microphone, a sound signal corresponding to the speech of the user, acquire personalized identification information corresponding to the speech, recognize the speech from the sound signal using a global language model, calculate a reliability for the recognition, and, if the calculated reliability exceeds a predetermined first reference value, update a personalized language model corresponding to the personalized identification information using the recognition result.
    Type: Grant
    Filed: August 21, 2019
    Date of Patent: April 12, 2022
    Assignee: LG ELECTRONICS INC.
    Inventors: Boseop Kim, Jaehong Kim
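    A sketch of gating personalized-language-model updates on recognition reliability; the unigram-count "language model" and the stubbed ASR callable are assumptions for illustration, not LG's implementation.
```python
def recognize_and_personalize(audio, speaker_id, global_lm, personal_lms,
                              asr, reliability_threshold=0.85):
    """Run recognition with the global language model; only sufficiently reliable
    results are folded into the speaker's personalized language model."""
    text, reliability = asr(audio, global_lm)
    if reliability > reliability_threshold:
        plm = personal_lms.setdefault(speaker_id, {})
        for word in text.split():
            plm[word] = plm.get(word, 0) + 1             # simple unigram count update
    return text

# Hypothetical ASR stub returning (transcript, reliability).
fake_asr = lambda audio, lm: ("turn on the lights", 0.92)
lms = {}
recognize_and_personalize(b"...", "user-1", None, lms, fake_asr)
print(lms)
```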
  • Patent number: 11302312
    Abstract: A new model is introduced into a particular domain that receives a routing of a dialog from a speech processing component. A method associated with the model includes running a set of test utterances through the speech processing component that enables a spoken language dialog with a user to establish a baseline score associated with processing for the set of test utterances. The speech processing component determines an intent of the user and routes the spoken language dialog to a network-based domain based on the intent. The method includes establishing an automatic test run of the set of test utterances to obtain a current score and, when a threshold associated with the difference between the current score and the baseline score is breached, switching, at the network-based domain, from the false accept detection model to a second model.
    Type: Grant
    Filed: September 27, 2019
    Date of Patent: April 12, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Ajay Soni, Xi Chen, Jingqian Zhao, Liu Yang, Prathap Ramachandra, Ruiqi Luo
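    A sketch of the baseline-versus-current regression check that triggers a model switch; the stand-in router, the accuracy metric, and the drop threshold are illustrative assumptions.
```python
def nightly_check(test_utterances, run_pipeline, baseline_score, max_drop=0.05):
    """Re-run the fixed utterance set, compare against the stored baseline, and
    signal a model switch when the drop breaches the threshold."""
    correct = sum(run_pipeline(text) == expected_domain
                  for text, expected_domain in test_utterances)
    current = correct / len(test_utterances)
    return ("switch_model" if baseline_score - current > max_drop else "keep_model", current)

route = lambda text: "music" if "play" in text else "weather"        # stand-in router
tests = [("play jazz", "music"), ("what's the forecast", "weather"), ("play the news", "news")]
print(nightly_check(tests, route, baseline_score=1.0))
```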
  • Patent number: 11295730
    Abstract: A method is described that includes processing text and speech from an input utterance using local overrides of default dictionary pronunciations. Applying this method, a word-level grammar used to process the tokens specifies at least one local word phonetic variant that applies within a specific production rule and, within a local context of the specific production rule, the local word phonetic variant overrides one or more default dictionary phonetic versions of the word. This method can be applied to parsing utterances where the pronunciation of some words depends on their syntactic or semantic context.
    Type: Grant
    Filed: August 1, 2019
    Date of Patent: April 5, 2022
    Assignee: SoundHound, Inc.
    Inventors: Keyvan Mohajer, Christopher Wilson, Bernard Mont-Reynaud
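    A minimal sketch of production-rule-local pronunciation overrides, using ARPAbet-style strings; the rule names and dictionary entries are illustrative, not SoundHound's grammar format.
```python
DEFAULT_DICT = {"read": ["R IY D", "R EH D"], "live": ["L IH V", "L AY V"]}

# Per-production-rule overrides: inside these rules only one pronunciation is legal.
RULE_OVERRIDES = {"PAST_TENSE_VP": {"read": ["R EH D"]},
                  "BROADCAST_NP":  {"live": ["L AY V"]}}

def pronunciations(word, rule=None):
    """The local variant wins within its production rule; elsewhere the default
    dictionary pronunciations apply."""
    return RULE_OVERRIDES.get(rule, {}).get(word, DEFAULT_DICT.get(word, []))

print(pronunciations("read", rule="PAST_TENSE_VP"))    # ['R EH D']
print(pronunciations("read"))                          # both dictionary variants
```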
  • Patent number: 11257483
    Abstract: Spoken language understanding techniques include training a dynamic neural network mask relative to a static neural network using only post-deployment training data such that the mask zeroes out some of the weights of the static neural network and allows some other weights to pass through and applying a dynamic neural network corresponding to the masked static neural network to input queries to identify outputs for the queries.
    Type: Grant
    Filed: March 29, 2019
    Date of Patent: February 22, 2022
    Assignee: Intel Corporation
    Inventors: Krzysztof Czarnowski, Munir Georges
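    A PyTorch sketch of a trainable binary mask over frozen static weights, updated with a straight-through estimator on stand-in post-deployment data; this illustrates the masking idea only, not Intel's implementation.
```python
import torch
import torch.nn as nn

class MaskedLinear(nn.Module):
    """Frozen static weights; only the mask logits are trained post-deployment.
    A hard threshold zeroes some weights and lets the others pass through."""
    def __init__(self, static_linear):
        super().__init__()
        self.weight = static_linear.weight.detach()                 # frozen static weights
        self.bias = static_linear.bias.detach()
        self.mask_logits = nn.Parameter(torch.ones_like(self.weight))

    def forward(self, x):
        soft = torch.sigmoid(self.mask_logits)
        hard = (soft > 0.5).float()
        mask = hard + soft - soft.detach()                          # straight-through gradient
        return nn.functional.linear(x, self.weight * mask, self.bias)

static = nn.Linear(32, 8)
layer = MaskedLinear(static)
opt = torch.optim.Adam([layer.mask_logits], lr=1e-2)
x, y = torch.randn(16, 32), torch.randint(0, 8, (16,))              # stand-in post-deployment data
loss = nn.functional.cross_entropy(layer(x), y)
loss.backward()
opt.step()
print((torch.sigmoid(layer.mask_logits) > 0.5).float().mean())      # fraction of weights kept
```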
  • Patent number: 11257491
    Abstract: This application relates generally to modifying visual data based on audio commands and, more specifically, to performing complex operations that modify visual data based on one or more audio commands. In some embodiments, a computer system may receive an audio input and identify an audio command based on the audio input. The audio command may be mapped to one or more operations capable of being performed by a multimedia editing application. The computer system may perform the one or more operations to edit the received multimedia data.
    Type: Grant
    Filed: November 29, 2018
    Date of Patent: February 22, 2022
    Assignee: ADOBE INC.
    Inventors: Sarah Kong, Yinglan Ma, Hyunghwan Byun, Chih-Yao Hsieh
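    A toy sketch of mapping a recognized phrase to one or more editing operations and applying them; the phrase table and the editor methods are hypothetical, not Adobe's API.
```python
OPERATION_MAP = {
    "brighten": [("adjust_exposure", {"stops": +0.5})],
    "make it vintage": [("apply_lut", {"name": "sepia"}), ("add_grain", {"amount": 0.2})],
}

def handle_audio_command(transcript, editor):
    """Map a recognized phrase to one or more editing operations and run them."""
    for phrase, operations in OPERATION_MAP.items():
        if phrase in transcript.lower():
            for op_name, params in operations:
                getattr(editor, op_name)(**params)     # hypothetical editor methods
            return operations
    return []

class FakeEditor:                                      # stand-in multimedia editor
    def adjust_exposure(self, stops): print("exposure", stops)
    def apply_lut(self, name): print("lut", name)
    def add_grain(self, amount): print("grain", amount)

print(handle_audio_command("please brighten the photo", FakeEditor()))
```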