Patents Examined by Paras D Shah
  • Patent number: 11806213
    Abstract: A speech transmission compensation apparatus that assists discrimination of speech heard by a user includes: one or more computers each including a memory and a processor configured to: accept input of a speech signal, detect a specific type of sound in the speech signal, analyze an acoustic characteristic of the specific type of sound in the speech signal and output the acoustic characteristic; accept input of the acoustic characteristic being output by the memory and the processor, generate a vibration signal of a duration corresponding to the acoustic characteristic and output the vibration signal; and accept input of the vibration signal being output by the memory and the processor and provide the user with vibration for the duration on the basis of the vibration signal.
    Type: Grant
    Filed: April 30, 2020
    Date of Patent: November 7, 2023
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Asuka Ono, Momoko Nakatani, Ai Nakane, Yoko Ishii
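    A minimal sketch of the signal flow described in patent 11806213 above, in Python: frames in which the target sound type is detected drive a vibration signal of matching duration. The high-frequency-energy detector used here is only an assumed stand-in for the patent's acoustic analysis, and the frame size and threshold are illustrative.

```python
# Hypothetical sketch: map detected frames of a target sound type to a
# vibration signal of the same duration. The detector below is an assumption.
import numpy as np

def vibration_signal(speech, fs=16000, frame=256):
    vib = np.zeros(len(speech))
    for start in range(0, len(speech) - frame, frame):
        x = speech[start:start + frame]
        spec = np.abs(np.fft.rfft(x * np.hanning(frame)))
        freqs = np.fft.rfftfreq(frame, 1.0 / fs)
        # assumed detector: strong energy above 4 kHz marks the target sound
        if spec[freqs > 4000].sum() > 0.5 * spec.sum():
            vib[start:start + frame] = 1.0   # vibrate for the sound's duration
    return vib
```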
  • Patent number: 11804232
    Abstract: An apparatus for processing an audio signal includes a configurable first audio signal processor for processing the audio signal in accordance with different configuration settings to obtain a processed audio signal, wherein the apparatus is adapted so that different configuration settings result in different sampling rates of the processed audio signal. The apparatus furthermore includes an analysis filter bank having a first number of analysis filter bank channels, a synthesis filter bank having a second number of synthesis filter bank channels, a second audio processor being adapted to receive and process an audio signal having a predetermined sampling rate, and a controller for controlling the first number of analysis filter bank channels or the second number of synthesis filter bank channels in accordance with a configuration setting.
    Type: Grant
    Filed: February 10, 2021
    Date of Patent: October 31, 2023
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Markus Lohwasser, Manuel Jander, Max Neuendorf, Ralf Geiger, Markus Schnell, Matthias Hildenbrand, Tobias Chalupka
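    The controller idea in patent 11804232 above can be illustrated with a small, hedged sketch: the number of analysis filter-bank channels is chosen in proportion to the configured sampling rate, so the downstream processor always sees its fixed predetermined rate. The reference channel count and rates below are assumptions, not values from the patent.

```python
# Hedged sketch: choose the analysis filter-bank channel count from the
# configured sampling rate so the second processor's input rate stays fixed.
def analysis_channels(configured_rate_hz, second_processor_rate_hz=16000,
                      channels_at_reference=64):
    # doubling the configured rate doubles the channel count, keeping the
    # per-channel rate seen by the second processor constant
    return round(channels_at_reference * configured_rate_hz / second_processor_rate_hz)

# Example: a 48 kHz configuration uses 192 channels, a 32 kHz one uses 128.
print(analysis_channels(48000), analysis_channels(32000))
```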
  • Patent number: 11790931
    Abstract: A first VAD system outputs a pulse stream for zero crossings in an audio signal. The pulse density of the pulse stream is evaluated to identify speech. The audio signal may have noise added to it before evaluating zero crossings. A second VAD system rectifies each audio signal sample and processes each rectified sample by updating a first statistic and evaluating the rectified sample per a first threshold condition that is a function of the first statistic. Rectified samples meeting the first threshold condition may be used to update a second statistic and the rectified sample evaluated per a second threshold condition that is a function of the second statistic. Rectified samples meeting the second threshold condition may be used to update a third statistic. The audio signal sample may be selected as speech if the second statistic is less than a downscaled third statistic.
    Type: Grant
    Filed: October 27, 2020
    Date of Patent: October 17, 2023
    Assignee: Ambiq Micro, Inc.
    Inventor: Roger David Serwy
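    The second VAD described in patent 11790931 above lends itself to a short sketch. The smoothing factors, the exact form of each threshold condition, and the downscale factor below are illustrative assumptions rather than the patented parameters.

```python
# Illustrative sketch of the cascaded-statistics VAD described above.
import numpy as np

def vad_decisions(samples, alpha=0.01, beta=0.05, downscale=0.5):
    """Return a boolean speech/non-speech flag per audio sample."""
    s1 = s2 = s3 = 0.0                           # first, second, third running statistics
    flags = np.zeros(len(samples), dtype=bool)
    for i, x in enumerate(np.abs(samples)):      # rectify each sample
        s1 += alpha * (x - s1)                   # update first statistic
        if x > s1:                               # first threshold condition (assumed form)
            s2 += beta * (x - s2)                # update second statistic
            if x > s2:                           # second threshold condition (assumed form)
                s3 += beta * (x - s3)            # update third statistic
        # select as speech when the second statistic is below a downscaled third statistic
        flags[i] = s2 < downscale * s3
    return flags
```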
  • Patent number: 11790926
    Abstract: A method and apparatus for processing an audio signal are disclosed. According to an example embodiment, a method of processing an audio signal may include acquiring a final audio signal for an initial audio signal using a plurality of neural network models generating output audio signals by encoding and decoding input audio signals, calculating a difference between the initial audio signal and the final audio signal in a time domain, converting the initial audio signal and the final audio signal into Mel-spectra, calculating a difference between the Mel-spectra of the initial audio signal and the final audio signal in a frequency domain, training the plurality of neural network models based on results calculated in the time domain and the frequency domain, and generating a new final audio signal distinguished from the final audio signal from the initial audio signal using the trained neural network models.
    Type: Grant
    Filed: January 22, 2021
    Date of Patent: October 17, 2023
    Assignees: Electronics and Telecommunications Research Institute, The Trustees of Indiana University
    Inventors: Mi Suk Lee, Seung Kwon Beack, Jongmo Sung, Tae Jin Lee, Jin Soo Choi, Minje Kim, Kai Zhen
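    A hedged sketch of the two-domain training objective described in patent 11790926 above: a time-domain error plus an error between Mel spectra of the initial and final (reconstructed) signals. The loss weight and Mel parameters are illustrative assumptions, and torchaudio is used purely for convenience.

```python
# Sketch of a combined time-domain + Mel-spectrum training loss.
import torch
import torchaudio

mel = torchaudio.transforms.MelSpectrogram(sample_rate=16000, n_mels=80)

def combined_loss(initial, final, mel_weight=1.0):
    time_loss = torch.mean(torch.abs(initial - final))           # time-domain difference
    mel_loss = torch.mean(torch.abs(mel(initial) - mel(final)))  # frequency-domain difference
    return time_loss + mel_weight * mel_loss
```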
  • Patent number: 11790928
    Abstract: An apparatus for processing an audio signal includes a configurable first audio signal processor for processing the audio signal in accordance with different configuration settings to obtain a processed audio signal, wherein the apparatus is adapted so that different configuration settings result in different sampling rates of the processed audio signal. The apparatus furthermore includes an analysis filter bank having a first number of analysis filter bank channels, a synthesis filter bank having a second number of synthesis filter bank channels, a second audio processor being adapted to receive and process an audio signal having a predetermined sampling rate, and a controller for controlling the first number of analysis filter bank channels or the second number of synthesis filter bank channels in accordance with a configuration setting.
    Type: Grant
    Filed: February 10, 2021
    Date of Patent: October 17, 2023
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Markus Lohwasser, Manuel Jander, Max Neuendorf, Ralf Geiger, Markus Schnell, Matthias Hildenbrand, Tobias Chalupka
  • Patent number: 11775763
    Abstract: Systems and methods for weakly-supervised training of a machine-learning model to perform named-entity recognition. All possible entity candidates and all possible rule candidates are automatically identified in an input data set of unlabeled text. An initial training of the machine-learning model is performed using labels assigned to entity candidates by a set of seeding rules as a first set of training data. The trained machine-learning model is then applied to the unlabeled text and a subset of rules from the rule candidates is identified that produces labels that most accurately match the labels assigned by the trained machine-learning model. The machine-learning model is then retrained using the labels assigned by the identified subset of rules as the second set of training data. This process is iteratively repeated to further refine and improve the performance of the machine-learning model for named-entity recognition.
    Type: Grant
    Filed: February 25, 2021
    Date of Patent: October 3, 2023
    Assignee: Robert Bosch GmbH
    Inventors: Jiacheng Li, Haibo Ding, Zhe Feng
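    The iterative weak-supervision loop in patent 11775763 above can be outlined as follows. The rule representation (callables returning a label or None), the agreement score, and the trainer interface are assumptions made for illustration, not the Bosch implementation.

```python
# Schematic outline of the iterative rule-and-model weak-supervision loop.
def weakly_supervised_ner(candidates, seed_rules, rule_candidates, train_fn,
                          n_iters=5, keep_rules=10):
    """candidates: entity-candidate strings; seed_rules/rule_candidates: callables
    mapping a candidate to a label or None; train_fn(candidates, labels) -> model
    with .predict(candidates)."""
    def label_with(rules, items):
        # first rule that fires wins; unmatched candidates get the "O" (non-entity) tag
        return [next((r(c) for r in rules if r(c) is not None), "O") for c in items]

    labels = label_with(seed_rules, candidates)          # labels from the seeding rules
    model = train_fn(candidates, labels)                 # initial training
    for _ in range(n_iters):
        model_labels = model.predict(candidates)

        def agreement(rule):
            # fraction of the rule's labels that match the current model's labels
            pairs = [(rule(c), m) for c, m in zip(candidates, model_labels)
                     if rule(c) is not None]
            return sum(r == m for r, m in pairs) / max(len(pairs), 1)

        best_rules = sorted(rule_candidates, key=agreement, reverse=True)[:keep_rules]
        labels = label_with(best_rules, candidates)      # labels from the best-matching rules
        model = train_fn(candidates, labels)             # retrain on the refined labels
    return model
```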
  • Patent number: 11776544
    Abstract: An embodiment of the present invention provides an artificial intelligence (AI) apparatus for recognizing a speech of a user. The artificial intelligence apparatus includes a memory to store a speech recognition model and a processor to obtain a speech signal for a user speech, to convert the speech signal into a text using the speech recognition model, to measure a confidence level for the conversion, to perform a control operation corresponding to the converted text if the measured confidence level is greater than or equal to a reference value, and to provide feedback for the conversion if the measured confidence level is less than the reference value.
    Type: Grant
    Filed: May 18, 2022
    Date of Patent: October 3, 2023
    Assignee: LG ELECTRONICS INC.
    Inventors: Jaehong Kim, Hyoeun Kim, Hangil Jeong, Heeyeon Choi
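    A minimal sketch of the confidence-gated control flow described in patent 11776544 above. The recognizer and action callbacks are hypothetical placeholders supplied by the caller, and the reference confidence value is an assumption.

```python
# Sketch: act on the recognized text only when the confidence clears a reference value.
def handle_utterance(speech_signal, recognize, perform_control, ask_for_feedback,
                     reference_confidence=0.8):
    text, confidence = recognize(speech_signal)   # convert speech to text + confidence
    if confidence >= reference_confidence:
        perform_control(text)                     # act on the recognized command
    else:
        ask_for_feedback(text)                    # ask the user to confirm or repeat
    return text, confidence
```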
  • Patent number: 11769488
    Abstract: A system and method invoke a virtual assistant action, which may comprise an argument. From audio, a probability of an intent is inferred. A probability of a domain and a plurality of variable values may also be inferred. Invoking the action is in response to the intent probability exceeding a threshold. Invoking the action may also be in response to the domain probability exceeding a threshold, a variable value probability exceeding a threshold, detecting an end of utterance, and a specific amount of time having elapsed. The intent probability may increase when the audio includes speech of words with the same meaning in multiple natural languages. Invoking the action may also be conditional on the variable value probability exceeding its threshold within a certain period of time of the intent probability exceeding its threshold.
    Type: Grant
    Filed: March 3, 2022
    Date of Patent: September 26, 2023
    Assignee: SoundHound AI IP, LLC
    Inventors: Sudharsan Krishnaswamy, Maisy Wieman, Jonah Probell
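    The threshold-and-timing gating in patent 11769488 above reduces to a small predicate. The threshold values and the allowed time gap below are illustrative assumptions.

```python
# Sketch: invoke the action only if both probabilities cross their thresholds
# within a bounded time of one another.
def should_invoke(intent_prob, intent_time, value_prob, value_time,
                  intent_threshold=0.9, value_threshold=0.8, max_gap_s=2.0):
    return (intent_prob > intent_threshold
            and value_prob > value_threshold
            and abs(value_time - intent_time) <= max_gap_s)
```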
  • Patent number: 11763821
    Abstract: Various tools are disclosed for providing assistive or augmentative means to enhance the fluency and accuracy of persons having speech disabilities. These technologies may automatically ascertain and dynamically improve the accuracy with which automatic speech recognition (ASR) systems recognize utterances of persons having impaired speech conditions. In an embodiment, digitized audio information about a speaker’s utterance is processed to determine a set of candidate words matching the utterance. From these candidate words, a set of concepts is determined using a finite state machine model. A pictogram representing each concept is identified and presented to the speaker so that the speaker may select the pictogram corresponding to the best match of his or her intended meaning associated with the utterance. An action corresponding to the speaker’s selection may then be performed, for example, displaying or synthesizing speech from textual information describing the selected concept.
    Type: Grant
    Filed: June 27, 2019
    Date of Patent: September 19, 2023
    Assignee: Cerner Innovation, Inc.
    Inventor: Douglas S. McNair
  • Patent number: 11755841
    Abstract: A mechanism is provided for updating a knowledge base of a sentiment analysis system, the knowledge base being operable for storing natural language terms and a score value related to each natural language term, the score value characterizing the sentiment of the natural language term. Messages comprising natural language are received. Using content of the knowledge base, a decision is made as to whether at least one message of the received messages has a positive sentiment or a negative sentiment. A term is extracted from the message that is not present in the knowledge base. Based on a frequency of occurrence of the term in the received messages and the sentiment of the messages in which the term occurs, a score value of the term is calculated, and the term and the calculated score value are stored into the knowledge base.
    Type: Grant
    Filed: November 19, 2019
    Date of Patent: September 12, 2023
    Assignee: International Business Machines Corporation
    Inventors: Michele Crudele, Antonio Perrone
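    A rough sketch of the score-update idea in patent 11755841 above: a term absent from the knowledge base inherits a score from the sentiment of the messages in which it occurs, weighted by how often it occurs. The particular weighting (relative frequency times average sentiment) is an assumption.

```python
# Sketch: score a new term from the sentiment of the messages containing it.
def score_new_term(term, scored_messages):
    """scored_messages: (message_text, sentiment_score) pairs, where the score
    was decided from terms already in the knowledge base."""
    occurrences = [score for text, score in scored_messages if term in text.lower()]
    if not occurrences:
        return None                       # term never observed, nothing to store
    frequency = len(occurrences) / len(scored_messages)
    avg_sentiment = sum(occurrences) / len(occurrences)
    return frequency * avg_sentiment      # assumed weighting

knowledge_base = {}
messages = [("great battery life", 1.0), ("battery drains too fast", -1.0)]
score = score_new_term("battery", messages)
if score is not None:
    knowledge_base["battery"] = score     # store the term and its score
```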
  • Patent number: 11756529
    Abstract: Proposed are a method and apparatus for speech recognition, and a storage medium. The specific solution includes: obtaining audio data to be recognized; decoding the audio data to obtain a first syllable of a to-be-converted word, in which the first syllable is a combination of at least one phoneme corresponding to the to-be-converted word; obtaining a sentence to which the to-be-converted word belongs and a converted word in the sentence, and obtaining a second syllable of the converted word; encoding the first syllable and the second syllable to generate first encoding information of the first syllable; and decoding the first encoding information to obtain a text corresponding to the to-be-converted word.
    Type: Grant
    Filed: December 16, 2020
    Date of Patent: September 12, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Liao Zhang, Xiaoyin Fu, Zhengxiang Jiang, Mingxin Liang, Junyao Shao, Qi Zhang, Zhijie Chen, Qiguang Zang
  • Patent number: 11756551
    Abstract: An audio processing system is provided. The audio processing system comprises an input interface configured to accept an audio signal. Further, the audio processing system comprises a memory configured to store a neural network trained to determine different types of attributes of multiple concurrent audio events of different origins, wherein the types of attributes include time-dependent and time-agnostic attributes of speech and non-speech audio events. Further, the audio processing system comprises a processor configured to process the audio signal with the neural network to produce metadata of the audio signal, the metadata including one or multiple attributes of one or multiple audio events in the audio signal.
    Type: Grant
    Filed: October 7, 2020
    Date of Patent: September 12, 2023
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Niko Moritz, Gordon Wichern, Takaaki Hori, Jonathan Le Roux
  • Patent number: 11749281
    Abstract: A neural speech-to-meaning system is trained on speech audio expressing specific intents. The system receives speech audio and produces indications of when the speech in the audio matches the intent. Intents may include variables that can have a large range of values, such as the names of places. The neural speech-to-meaning system simultaneously recognizes enumerated values of variables and general intents. Recognized variable values can serve as arguments to API requests made in response to recognized intents. Accordingly, neural speech-to-meaning supports voice virtual assistants that serve users based on API hits.
    Type: Grant
    Filed: December 4, 2019
    Date of Patent: September 5, 2023
    Assignee: SoundHound AI IP, LLC
    Inventors: Sudharsan Krishnaswamy, Maisy Wieman, Jonah Probell
  • Patent number: 11749414
    Abstract: A mathematical model may be trained to diagnose a medical condition of a person by processing acoustic features and language features of speech of the person. The performance of the mathematical model may be improved by appropriately selecting the features to be used with the mathematical model. Features may be selected by computing a feature selection score for each acoustic feature and each language feature, and then selecting features using the scores, such as by selecting features with the highest scores. In some implementations, stability determinations may be computed for each feature and features may be selected using both the feature selection scores and the stability determinations. A mathematical model may then be trained using the selected features and deployed. In some implementations, prompts may be selected using computed prompt selection scores, and the deployed mathematical model may be used with the selected prompts.
    Type: Grant
    Filed: January 18, 2021
    Date of Patent: September 5, 2023
    Assignee: CANARY SPEECH, LLC
    Inventors: Jangwon Kim, Namhee Kwon, Henry O'Connell, Phillip Walstad, Kevin Shengbin Yang
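    The score-plus-stability selection described in patent 11749414 above can be sketched as follows. The particular feature score (absolute correlation with the label) and the bootstrap stability estimate are illustrative choices, not the patented method.

```python
# Sketch: keep features whose selection score is high and stays high under resampling.
import numpy as np

def select_features(X, y, n_keep=10, n_boot=50, stability_min=0.6, rng=None):
    rng = rng or np.random.default_rng(0)
    n_samples, n_features = X.shape

    def scores(Xs, ys):
        # assumed feature-selection score: |correlation between feature and label|
        return np.abs([np.corrcoef(Xs[:, j], ys)[0, 1] for j in range(n_features)])

    def top(s):
        return set(np.argsort(s)[-n_keep:])

    base_scores = scores(X, y)
    counts = np.zeros(n_features)
    for _ in range(n_boot):
        # stability: how often a feature stays in the top set across bootstrap resamples
        idx = rng.integers(0, n_samples, n_samples)
        for j in top(scores(X[idx], y[idx])):
            counts[j] += 1
    stability = counts / n_boot
    return [j for j in top(base_scores) if stability[j] >= stability_min]
```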
  • Patent number: 11741317
    Abstract: The disclosure relates to system and method for processing multilingual user inputs using a Single Natural Language Processing (SNLP) model. The method includes receiving a user input in a source language and translating the user input to generate a plurality of translated user inputs in an intermediate language. The method includes using the SNLP model configured only using the intermediate language to generate a plurality of sets of intermediate input vectors in the intermediate language. The method includes processing the plurality of sets of intermediate input vectors in the intermediate language using at least one of a plurality of predefined mechanisms to identify a predetermined response. The method includes translating the predetermined response to generate a translated response that is rendered to the user.
    Type: Grant
    Filed: May 25, 2021
    Date of Patent: August 29, 2023
    Inventor: Rajiv Trehan
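    A skeleton of the translate, single-language NLP, translate-back pipeline in patent 11741317 above. The translator, SNLP model, and response-selection callables are hypothetical placeholders supplied by the caller.

```python
# Skeleton of the multilingual pipeline around a single-language NLP model.
def respond_multilingual(user_input, source_lang, intermediate_lang,
                         translate, snlp_model, choose_response):
    # several candidate translations into the one language the model was configured with
    candidates = translate(user_input, source_lang, intermediate_lang, n_best=3)
    vectors = [snlp_model.encode(text) for text in candidates]   # intermediate input vectors
    response = choose_response(vectors)                          # predefined response lookup
    return translate(response, intermediate_lang, source_lang, n_best=1)[0]
```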
  • Patent number: 11727925
    Abstract: Techniques are described herein for cross-device data synchronization based on simultaneous hotword triggers.
    Type: Grant
    Filed: December 8, 2020
    Date of Patent: August 15, 2023
    Assignee: GOOGLE LLC
    Inventors: Matthew Sharifi, Victor Carbune
  • Patent number: 11727914
    Abstract: An example intent-recognition system comprises a processor and memory storing instructions. The instructions cause the processor to receive speech input comprising spoken words. The instructions cause the processor to generate text results based on the speech input and generate acoustic feature annotations based on the speech input. The instructions also cause the processor to apply an intent model to the text result and the acoustic feature annotations to recognize an intent based on the speech input. An example system for adapting an emotional text-to-speech model comprises a processor and memory. The memory stores instructions that cause the processor to receive training examples comprising speech input and receive labelling data comprising emotion information associated with the speech input. The instructions also cause the processor to extract audio signal vectors from the training examples and generate an emotion-adapted voice font model based on the audio signal vectors and the labelling data.
    Type: Grant
    Filed: December 24, 2021
    Date of Patent: August 15, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Pei Zhao, Kaisheng Yao, Max Leung, Bo Yan, Jian Luan, Yu Shi, Malone Ma, Mei-Yuh Hwang
  • Patent number: 11727927
    Abstract: Embodiments of the present disclosure disclose a view-based voice interaction method, an apparatus, a server, a terminal and a medium. The method includes: obtaining voice information of a user and voice-action description information of a voice-operable element in a currently displayed view on a terminal; obtaining operational intention of the user by performing semantic recognition on the voice information of the user according to view description information of the voice-operable element; locating a sequence of actions matched with the operational intention of the user in the voice-action list according to the voice-action description information; and delivering the sequence of actions to the terminal for performing.
    Type: Grant
    Filed: May 29, 2020
    Date of Patent: August 15, 2023
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Zhou Shen, Dai Tan, Sheng Lv, Kaifang Wu, Yudong Li
  • Patent number: 11727951
    Abstract: Disclosed are a display apparatus, a voice acquiring apparatus and a voice recognition method thereof, the display apparatus including: a display unit which displays an image; a communication unit which communicates with a plurality of external apparatuses; and a controller which includes a voice recognition engine to recognize a user's voice, receives a voice signal from a voice acquiring unit, and controls the communication unit to receive candidate instruction words from at least one of the plurality of external apparatuses to recognize the received voice signal.
    Type: Grant
    Filed: February 13, 2020
    Date of Patent: August 15, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jong-hyuk Jang, Chan-hee Choi, Hee-seob Ryu, Kyung-mi Park, Seung-kwon Park, Jae-hyun Bae
  • Patent number: 11705135
    Abstract: Detecting a replay attack on a voice biometrics system comprises: receiving a speech signal from a voice source; generating and transmitting an ultrasound signal through a transducer of the device; detecting a reflection of the transmitted ultrasound signal; detecting Doppler shifts in the reflection of the generated ultrasound signal; and identifying whether the received speech signal is indicative of liveness of a speaker based on the detected Doppler shifts. The method further comprises: obtaining information about a position of the device; and adapting the generating and transmitting of the ultrasound signal based on the information about the position of the device.
    Type: Grant
    Filed: October 1, 2020
    Date of Patent: July 18, 2023
    Assignee: Cirrus Logic, Inc.
    Inventor: John Paul Lesso
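    The Doppler-based liveness check in patent 11705135 above can be sketched roughly: emit an ultrasonic tone, then look for reflected energy shifted away from the carrier. The carrier frequency, bandwidths, and ratio threshold below are illustrative assumptions.

```python
# Sketch: flag liveness when sideband energy around the ultrasound carrier is significant.
import numpy as np

def doppler_liveness(mic_signal, fs=48000, carrier_hz=20000,
                     guard_hz=50, band_hz=300, ratio_threshold=0.1):
    spectrum = np.abs(np.fft.rfft(mic_signal * np.hanning(len(mic_signal))))
    freqs = np.fft.rfftfreq(len(mic_signal), d=1.0 / fs)
    carrier = spectrum[np.abs(freqs - carrier_hz) <= guard_hz].sum()
    sidebands = spectrum[(np.abs(freqs - carrier_hz) > guard_hz) &
                         (np.abs(freqs - carrier_hz) <= band_hz)].sum()
    # articulator movement of a live speaker spreads energy into the sidebands
    return sidebands / max(carrier, 1e-12) > ratio_threshold
```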