Patents Examined by Paras D Shah
  • Patent number: 11806213
    Abstract: A speech transmission compensation apparatus that assists discrimination of speech heard by a user includes: one or more computers each including a memory and a processor configured to: accept input of a speech signal, detect a specific type of sound in the speech signal, analyze an acoustic characteristic of the specific type of sound in the speech signal and output the acoustic characteristic; accept input of the acoustic characteristic being output by the memory and the processor, generate a vibration signal of a duration corresponding to the acoustic characteristic and output the vibration signal; and accept input of the vibration signal being output by the memory and the processor and provide the user with vibration for the duration on the basis of the vibration signal.
    Type: Grant
    Filed: April 30, 2020
    Date of Patent: November 7, 2023
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Asuka Ono, Momoko Nakatani, Ai Nakane, Yoko Ishii
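    A minimal sketch of the signal flow described in patent 11806213 above, in Python: frames in which the target sound type is detected drive a vibration signal of matching duration. The high-frequency-energy detector used here is only an assumed stand-in for the patent's acoustic analysis, and the frame size and threshold are illustrative.

```python
# Hypothetical sketch: map detected frames of a target sound type to a
# vibration signal of the same duration. The detector below is an assumption.
import numpy as np

def vibration_signal(speech, fs=16000, frame=256):
    vib = np.zeros(len(speech))
    for start in range(0, len(speech) - frame, frame):
        x = speech[start:start + frame]
        spec = np.abs(np.fft.rfft(x * np.hanning(frame)))
        freqs = np.fft.rfftfreq(frame, 1.0 / fs)
        # assumed detector: strong energy above 4 kHz marks the target sound
        if spec[freqs > 4000].sum() > 0.5 * spec.sum():
            vib[start:start + frame] = 1.0   # vibrate for the sound's duration
    return vib
```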
  • Patent number: 11804232
    Abstract: An apparatus for processing an audio signal includes a configurable first audio signal processor for processing the audio signal in accordance with different configuration settings to obtain a processed audio signal, wherein the apparatus is adapted so that different configuration settings result in different sampling rates of the processed audio signal. The apparatus furthermore includes an analysis filter bank having a first number of analysis filter bank channels, a synthesis filter bank having a second number of synthesis filter bank channels, a second audio processor being adapted to receive and process an audio signal having a predetermined sampling rate, and a controller for controlling the first number of analysis filter bank channels or the second number of synthesis filter bank channels in accordance with a configuration setting.
    Type: Grant
    Filed: February 10, 2021
    Date of Patent: October 31, 2023
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Markus Lohwasser, Manuel Jander, Max Neuendorf, Ralf Geiger, Markus Schnell, Matthias Hildenbrand, Tobias Chalupka
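    The controller idea in patent 11804232 above can be illustrated with a small, hedged sketch: the number of analysis filter-bank channels is chosen in proportion to the configured sampling rate, so the downstream processor always sees its fixed predetermined rate. The reference channel count and rates below are assumptions, not values from the patent.

```python
# Hedged sketch: choose the analysis filter-bank channel count from the
# configured sampling rate so the second processor's input rate stays fixed.
def analysis_channels(configured_rate_hz, second_processor_rate_hz=16000,
                      channels_at_reference=64):
    # doubling the configured rate doubles the channel count, keeping the
    # per-channel rate seen by the second processor constant
    return round(channels_at_reference * configured_rate_hz / second_processor_rate_hz)

# Example: a 48 kHz configuration uses 192 channels, a 32 kHz one uses 128.
print(analysis_channels(48000), analysis_channels(32000))
```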
  • Patent number: 11790931
    Abstract: A first VAD system outputs a pulse stream for zero crossings in an audio signal. The pulse density of the pulse stream is evaluated to identify speech. The audio signal may have noise added to it before evaluating zero crossings. A second VAD system rectifies each audio signal sample and processes each rectified sample by updating a first statistic and evaluating the rectified sample per a first threshold condition that is a function of the first statistic. Rectified samples meeting the first threshold condition may be used to update a second statistic and the rectified sample evaluated per a second threshold condition that is a function of the second statistic. Rectified samples meeting the second threshold condition may be used to update a third statistic. The audio signal sample may be selected as speech if the second statistic is less than a downscaled third statistic.
    Type: Grant
    Filed: October 27, 2020
    Date of Patent: October 17, 2023
    Assignee: Ambiq Micro, Inc.
    Inventor: Roger David Serwy
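    The second VAD described in patent 11790931 above lends itself to a short sketch. The smoothing factors, the exact form of each threshold condition, and the downscale factor below are illustrative assumptions rather than the patented parameters.

```python
# Illustrative sketch of the cascaded-statistics VAD described above.
import numpy as np

def vad_decisions(samples, alpha=0.01, beta=0.05, downscale=0.5):
    """Return a boolean speech/non-speech flag per audio sample."""
    s1 = s2 = s3 = 0.0                           # first, second, third running statistics
    flags = np.zeros(len(samples), dtype=bool)
    for i, x in enumerate(np.abs(samples)):      # rectify each sample
        s1 += alpha * (x - s1)                   # update first statistic
        if x > s1:                               # first threshold condition (assumed form)
            s2 += beta * (x - s2)                # update second statistic
            if x > s2:                           # second threshold condition (assumed form)
                s3 += beta * (x - s3)            # update third statistic
        # select as speech when the second statistic is below a downscaled third statistic
        flags[i] = s2 < downscale * s3
    return flags
```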
  • Patent number: 11790926
    Abstract: A method and apparatus for processing an audio signal are disclosed. According to an example embodiment, a method of processing an audio signal may include acquiring a final audio signal for an initial audio signal using a plurality of neural network models generating output audio signals by encoding and decoding input audio signals, calculating a difference between the initial audio signal and the final audio signal in a time domain, converting the initial audio signal and the final audio signal into Mel-spectra, calculating a difference between the Mel-spectra of the initial audio signal and the final audio signal in a frequency domain, training the plurality of neural network models based on results calculated in the time domain and the frequency domain, and generating a new final audio signal distinguished from the final audio signal from the initial audio signal using the trained neural network models.
    Type: Grant
    Filed: January 22, 2021
    Date of Patent: October 17, 2023
    Assignees: Electronics and Telecommunications Research Institute, The Trustees of Indiana University
    Inventors: Mi Suk Lee, Seung Kwon Beack, Jongmo Sung, Tae Jin Lee, Jin Soo Choi, Minje Kim, Kai Zhen
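    A hedged sketch of the two-domain training objective described in patent 11790926 above: a time-domain error plus an error between Mel spectra of the initial and final (reconstructed) signals. The loss weight and Mel parameters are illustrative assumptions, and torchaudio is used purely for convenience.

```python
# Sketch of a combined time-domain + Mel-spectrum training loss.
import torch
import torchaudio

mel = torchaudio.transforms.MelSpectrogram(sample_rate=16000, n_mels=80)

def combined_loss(initial, final, mel_weight=1.0):
    time_loss = torch.mean(torch.abs(initial - final))           # time-domain difference
    mel_loss = torch.mean(torch.abs(mel(initial) - mel(final)))  # frequency-domain difference
    return time_loss + mel_weight * mel_loss
```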
  • Patent number: 11790928
    Abstract: An apparatus for processing an audio signal includes a configurable first audio signal processor for processing the audio signal in accordance with different configuration settings to obtain a processed audio signal, wherein the apparatus is adapted so that different configuration settings result in different sampling rates of the processed audio signal. The apparatus furthermore includes an analysis filter bank having a first number of analysis filter bank channels, a synthesis filter bank having a second number of synthesis filter bank channels, a second audio processor being adapted to receive and process an audio signal having a predetermined sampling rate, and a controller for controlling the first number of analysis filter bank channels or the second number of synthesis filter bank channels in accordance with a configuration setting.
    Type: Grant
    Filed: February 10, 2021
    Date of Patent: October 17, 2023
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Markus Lohwasser, Manuel Jander, Max Neuendorf, Ralf Geiger, Markus Schnell, Matthias Hildenbrand, Tobias Chalupka
  • Patent number: 11775763
    Abstract: Systems and methods for weakly-supervised training of a machine-learning model to perform named-entity recognition. All possible entity candidates and all possible rule candidates are automatically identified in an input data set of unlabeled text. An initial training of the machine-learning model is performed using labels assigned to entity candidates by a set of seeding rules as a first set of training data. The trained machine-learning model is then applied to the unlabeled text and a subset of rules from the rule candidates is identified that produces labels that most accurately match the labels assigned by the trained machine-learning model. The machine-learning model is then retrained using the labels assigned by the identified subset of rules as the second set of training data. This process is iteratively repeated to further refine and improve the performance of the machine-learning model for named-entity recognition.
    Type: Grant
    Filed: February 25, 2021
    Date of Patent: October 3, 2023
    Assignee: Robert Bosch GmbH
    Inventors: Jiacheng Li, Haibo Ding, Zhe Feng
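    The iterative weak-supervision loop in patent 11775763 above can be outlined as follows. The rule representation (callables returning a label or None), the agreement score, and the trainer interface are assumptions made for illustration, not the Bosch implementation.

```python
# Schematic outline of the iterative rule-and-model weak-supervision loop.
def weakly_supervised_ner(candidates, seed_rules, rule_candidates, train_fn,
                          n_iters=5, keep_rules=10):
    """candidates: entity-candidate strings; seed_rules/rule_candidates: callables
    mapping a candidate to a label or None; train_fn(candidates, labels) -> model
    with .predict(candidates)."""
    def label_with(rules, items):
        # first rule that fires wins; unmatched candidates get the "O" (non-entity) tag
        return [next((r(c) for r in rules if r(c) is not None), "O") for c in items]

    labels = label_with(seed_rules, candidates)          # labels from the seeding rules
    model = train_fn(candidates, labels)                 # initial training
    for _ in range(n_iters):
        model_labels = model.predict(candidates)

        def agreement(rule):
            # fraction of the rule's labels that match the current model's labels
            pairs = [(rule(c), m) for c, m in zip(candidates, model_labels)
                     if rule(c) is not None]
            return sum(r == m for r, m in pairs) / max(len(pairs), 1)

        best_rules = sorted(rule_candidates, key=agreement, reverse=True)[:keep_rules]
        labels = label_with(best_rules, candidates)      # labels from the best-matching rules
        model = train_fn(candidates, labels)             # retrain on the refined labels
    return model
```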
  • Patent number: 11776544
    Abstract: An embodiment of the present invention provides an artificial intelligence (AI) apparatus for recognizing a speech of a user. The artificial intelligence apparatus includes a memory to store a speech recognition model and a processor to obtain a speech signal for a user speech, to convert the speech signal into a text using the speech recognition model, to measure a confidence level for the conversion, to perform a control operation corresponding to the converted text if the measured confidence level is greater than or equal to a reference value, and to provide feedback for the conversion if the measured confidence level is less than the reference value.
    Type: Grant
    Filed: May 18, 2022
    Date of Patent: October 3, 2023
    Assignee: LG ELECTRONICS INC.
    Inventors: Jaehong Kim, Hyoeun Kim, Hangil Jeong, Heeyeon Choi
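    A minimal sketch of the confidence-gated control flow described in patent 11776544 above. The recognizer and action callbacks are hypothetical placeholders supplied by the caller, and the reference confidence value is an assumption.

```python
# Sketch: act on the recognized text only when the confidence clears a reference value.
def handle_utterance(speech_signal, recognize, perform_control, ask_for_feedback,
                     reference_confidence=0.8):
    text, confidence = recognize(speech_signal)   # convert speech to text + confidence
    if confidence >= reference_confidence:
        perform_control(text)                     # act on the recognized command
    else:
        ask_for_feedback(text)                    # ask the user to confirm or repeat
    return text, confidence
```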
  • Patent number: 11769488
    Abstract: A system and method invoke a virtual assistant action, which may comprise an argument. From audio, a probability of an intent is inferred. A probability of a domain and a plurality of variable values may also be inferred. Invoking the action is in response to the intent probability exceeding a threshold. Invoking the action may also be in response to the domain probability exceeding a threshold, a variable value probability exceeding a threshold, detecting an end of utterance, and a specific amount of time having elapsed. The intent probability may increase when the audio includes speech of words with the same meaning in multiple natural languages. Invoking the action may also be conditional on the variable value probability exceeding its threshold within a certain period of time of the intent probability exceeding its threshold.
    Type: Grant
    Filed: March 3, 2022
    Date of Patent: September 26, 2023
    Assignee: SoundHound AI IP, LLC
    Inventors: Sudharsan Krishnaswamy, Maisy Wieman, Jonah Probell
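    The threshold-and-timing gating in patent 11769488 above reduces to a small predicate. The threshold values and the allowed time gap below are illustrative assumptions.

```python
# Sketch: invoke the action only if both probabilities cross their thresholds
# within a bounded time of one another.
def should_invoke(intent_prob, intent_time, value_prob, value_time,
                  intent_threshold=0.9, value_threshold=0.8, max_gap_s=2.0):
    return (intent_prob > intent_threshold
            and value_prob > value_threshold
            and abs(value_time - intent_time) <= max_gap_s)
```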
  • Patent number: 11763821
    Abstract: Various tools are disclosed for providing assistive or augmentative means to enhance the fluency and accuracy of persons having speech disabilities. These technologies may automatically ascertain and dynamically improve the accuracy with which automatic speech recognition (ASR) systems recognize utterances of persons having impaired speech conditions. In an embodiment, digitized audio information about a speaker’s utterance is processed to determine a set of candidate words matching the utterance. From these candidate words, a set of concepts is determined using a finite state machine model. A pictogram representing each concept is identified and presented to the speaker so that the speaker may select the pictogram corresponding to the best match of his or her intended meaning associated with the utterance. An action corresponding to the speaker’s selection may then be performed, for example, displaying or synthesizing speech from textual information describing the selected concept.
    Type: Grant
    Filed: June 27, 2019
    Date of Patent: September 19, 2023
    Assignee: Cerner Innovation, Inc.
    Inventor: Douglas S. McNair
  • Patent number: 11755841
    Abstract: A mechanism is provided for updating a knowledge base of a sentiment analysis system, the knowledge base being operable for storing natural language terms and a score value related to each natural language term, the score value characterizing the sentiment of the natural language term. Messages comprising natural language are received. Using content of the knowledge base, a decision is made as to whether at least one message of the received messages has a positive sentiment or a negative sentiment. A term is extracted from the message that is not present in the knowledge base. Based on a frequency of occurrence of the term in the received messages and the sentiment of the messages in which the term occurs, a score value of the term is calculated, and the term and the calculated score value are stored into the knowledge base.
    Type: Grant
    Filed: November 19, 2019
    Date of Patent: September 12, 2023
    Assignee: International Business Machines Corporation
    Inventors: Michele Crudele, Antonio Perrone
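    A rough sketch of the score-update idea in patent 11755841 above: a term absent from the knowledge base inherits a score from the sentiment of the messages in which it occurs, weighted by how often it occurs. The particular weighting (relative frequency times average sentiment) is an assumption.

```python
# Sketch: score a new term from the sentiment of the messages containing it.
def score_new_term(term, scored_messages):
    """scored_messages: (message_text, sentiment_score) pairs, where the score
    was decided from terms already in the knowledge base."""
    occurrences = [score for text, score in scored_messages if term in text.lower()]
    if not occurrences:
        return None                       # term never observed, nothing to store
    frequency = len(occurrences) / len(scored_messages)
    avg_sentiment = sum(occurrences) / len(occurrences)
    return frequency * avg_sentiment      # assumed weighting

knowledge_base = {}
messages = [("great battery life", 1.0), ("battery drains too fast", -1.0)]
score = score_new_term("battery", messages)
if score is not None:
    knowledge_base["battery"] = score     # store the term and its score
```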
  • Patent number: 11756529
    Abstract: Proposed are a method and apparatus for speech recognition, and a storage medium. The specific solution includes: obtaining audio data to be recognized; decoding the audio data to obtain a first syllable of a to-be-converted word, in which the first syllable is a combination of at least one phoneme corresponding to the to-be-converted word; obtaining a sentence to which the to-be-converted word belongs and a converted word in the sentence, and obtaining a second syllable of the converted word; encoding the first syllable and the second syllable to generate first encoding information of the first syllable; and decoding the first encoding information to obtain a text corresponding to the to-be-converted word.
    Type: Grant
    Filed: December 16, 2020
    Date of Patent: September 12, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Liao Zhang, Xiaoyin Fu, Zhengxiang Jiang, Mingxin Liang, Junyao Shao, Qi Zhang, Zhijie Chen, Qiguang Zang
  • Patent number: 11756551
    Abstract: An audio processing system is provided. The audio processing system comprises an input interface configured to accept an audio signal. Further, the audio processing system comprises a memory configured to store a neural network trained to determine different types of attributes of multiple concurrent audio events of different origins, wherein the types of attributes include time-dependent and time-agnostic attributes of speech and non-speech audio events. Further, the audio processing system comprises a processor configured to process the audio signal with the neural network to produce metadata of the audio signal, the metadata including one or multiple attributes of one or multiple audio events in the audio signal.
    Type: Grant
    Filed: October 7, 2020
    Date of Patent: September 12, 2023
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Niko Moritz, Gordon Wichern, Takaaki Hori, Jonathan Le Roux
  • Patent number: 11749281
    Abstract: A neural speech-to-meaning system is trained on speech audio expressing specific intents. The system receives speech audio and produces indications of when the speech in the audio matches the intent. Intents may include variables that can have a large range of values, such as the names of places. The neural speech-to-meaning system simultaneously recognizes enumerated values of variables and general intents. Recognized variable values can serve as arguments to API requests made in response to recognized intents. Accordingly, neural speech-to-meaning supports voice virtual assistants that serve users based on API hits.
    Type: Grant
    Filed: December 4, 2019
    Date of Patent: September 5, 2023
    Assignee: SoundHound AI IP, LLC
    Inventors: Sudharsan Krishnaswamy, Maisy Wieman, Jonah Probell
  • Patent number: 11749414
    Abstract: A mathematical model may be trained to diagnose a medical condition of a person by processing acoustic features and language features of speech of the person. The performance of the mathematical model may be improved by appropriately selecting the features to be used with the mathematical model. Features may be selected by computing a feature selection score for each acoustic feature and each language feature, and then selecting features using the scores, such as by selecting features with the highest scores. In some implementations, stability determinations may be computed for each feature and features may be selected using both the feature selection scores and the stability determinations. A mathematical model may then be trained using the selected features and deployed. In some implementations, prompts may be selected using computed prompt selection scores, and the deployed mathematical model may be used with the selected prompts.
    Type: Grant
    Filed: January 18, 2021
    Date of Patent: September 5, 2023
    Assignee: CANARY SPEECH, LLC
    Inventors: Jangwon Kim, Namhee Kwon, Henry O'Connell, Phillip Walstad, Kevin Shengbin Yang
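    The score-plus-stability selection described in patent 11749414 above can be sketched as follows. The particular feature score (absolute correlation with the label) and the bootstrap stability estimate are illustrative choices, not the patented method.

```python
# Sketch: keep features whose selection score is high and stays high under resampling.
import numpy as np

def select_features(X, y, n_keep=10, n_boot=50, stability_min=0.6, rng=None):
    rng = rng or np.random.default_rng(0)
    n_samples, n_features = X.shape

    def scores(Xs, ys):
        # assumed feature-selection score: |correlation between feature and label|
        return np.abs([np.corrcoef(Xs[:, j], ys)[0, 1] for j in range(n_features)])

    def top(s):
        return set(np.argsort(s)[-n_keep:])

    base_scores = scores(X, y)
    counts = np.zeros(n_features)
    for _ in range(n_boot):
        # stability: how often a feature stays in the top set across bootstrap resamples
        idx = rng.integers(0, n_samples, n_samples)
        for j in top(scores(X[idx], y[idx])):
            counts[j] += 1
    stability = counts / n_boot
    return [j for j in top(base_scores) if stability[j] >= stability_min]
```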
  • Patent number: 11741317
    Abstract: The disclosure relates to system and method for processing multilingual user inputs using a Single Natural Language Processing (SNLP) model. The method includes receiving a user input in a source language and translating the user input to generate a plurality of translated user inputs in an intermediate language. The method includes using the SNLP model configured only using the intermediate language to generate a plurality of sets of intermediate input vectors in the intermediate language. The method includes processing the plurality of sets of intermediate input vectors in the intermediate language using at least one of a plurality of predefined mechanisms to identify a predetermined response. The method includes translating the predetermined response to generate a translated response that is rendered to the user.
    Type: Grant
    Filed: May 25, 2021
    Date of Patent: August 29, 2023
    Inventor: Rajiv Trehan
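    A skeleton of the translate, single-language NLP, translate-back pipeline in patent 11741317 above. The translator, SNLP model, and response-selection callables are hypothetical placeholders supplied by the caller.

```python
# Skeleton of the multilingual pipeline around a single-language NLP model.
def respond_multilingual(user_input, source_lang, intermediate_lang,
                         translate, snlp_model, choose_response):
    # several candidate translations into the one language the model was configured with
    candidates = translate(user_input, source_lang, intermediate_lang, n_best=3)
    vectors = [snlp_model.encode(text) for text in candidates]   # intermediate input vectors
    response = choose_response(vectors)                          # predefined response lookup
    return translate(response, intermediate_lang, source_lang, n_best=1)[0]
```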
  • Patent number: 11727925
    Abstract: Techniques are described herein for cross-device data synchronization based on simultaneous hotword triggers.
    Type: Grant
    Filed: December 8, 2020
    Date of Patent: August 15, 2023
    Assignee: GOOGLE LLC
    Inventors: Matthew Sharifi, Victor Carbune
  • Patent number: 11727914
    Abstract: An example intent-recognition system comprises a processor and memory storing instructions. The instructions cause the processor to receive speech input comprising spoken words. The instructions cause the processor to generate text results based on the speech input and generate acoustic feature annotations based on the speech input. The instructions also cause the processor to apply an intent model to the text result and the acoustic feature annotations to recognize an intent based on the speech input. An example system for adapting an emotional text-to-speech model comprises a processor and memory. The memory stores instructions that cause the processor to receive training examples comprising speech input and receive labelling data comprising emotion information associated with the speech input. The instructions also cause the processor to extract audio signal vectors from the training examples and generate an emotion-adapted voice font model based on the audio signal vectors and the labelling data.
    Type: Grant
    Filed: December 24, 2021
    Date of Patent: August 15, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Pei Zhao, Kaisheng Yao, Max Leung, Bo Yan, Jian Luan, Yu Shi, Malone Ma, Mei-Yuh Hwang
  • Patent number: 11727927
    Abstract: Embodiments of the present disclosure disclose a view-based voice interaction method, an apparatus, a server, a terminal and a medium. The method includes: obtaining voice information of a user and voice-action description information of a voice-operable element in a currently displayed view on a terminal; obtaining operational intention of the user by performing semantic recognition on the voice information of the user according to view description information of the voice-operable element; locating a sequence of actions matched with the operational intention of the user in the voice-action list according to the voice-action description information; and delivering the sequence of actions to the terminal for performing.
    Type: Grant
    Filed: May 29, 2020
    Date of Patent: August 15, 2023
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD.
    Inventors: Zhou Shen, Dai Tan, Sheng Lv, Kaifang Wu, Yudong Li
  • Patent number: 11727951
    Abstract: Disclosed are a display apparatus, a voice acquiring apparatus and a voice recognition method thereof, the display apparatus including: a display unit which displays an image; a communication unit which communicates with a plurality of external apparatuses; and a controller which includes a voice recognition engine to recognize a user's voice, receives a voice signal from a voice acquiring unit, and controls the communication unit to receive candidate instruction words from at least one of the plurality of external apparatuses to recognize the received voice signal.
    Type: Grant
    Filed: February 13, 2020
    Date of Patent: August 15, 2023
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jong-hyuk Jang, Chan-hee Choi, Hee-seob Ryu, Kyung-mi Park, Seung-kwon Park, Jae-hyun Bae
  • Patent number: 11705135
    Abstract: Detecting a replay attack on a voice biometrics system comprises: receiving a speech signal from a voice source; generating and transmitting an ultrasound signal through a transducer of the device; detecting a reflection of the transmitted ultrasound signal; detecting Doppler shifts in the reflection of the generated ultrasound signal; and identifying whether the received speech signal is indicative of liveness of a speaker based on the detected Doppler shifts. The method further comprises: obtaining information about a position of the device; and adapting the generating and transmitting of the ultrasound signal based on the information about the position of the device.
    Type: Grant
    Filed: October 1, 2020
    Date of Patent: July 18, 2023
    Assignee: Cirrus Logic, Inc.
    Inventor: John Paul Lesso
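    The Doppler-based liveness check in patent 11705135 above can be sketched roughly: emit an ultrasonic tone, then look for reflected energy shifted away from the carrier. The carrier frequency, bandwidths, and ratio threshold below are illustrative assumptions.

```python
# Sketch: flag liveness when sideband energy around the ultrasound carrier is significant.
import numpy as np

def doppler_liveness(mic_signal, fs=48000, carrier_hz=20000,
                     guard_hz=50, band_hz=300, ratio_threshold=0.1):
    spectrum = np.abs(np.fft.rfft(mic_signal * np.hanning(len(mic_signal))))
    freqs = np.fft.rfftfreq(len(mic_signal), d=1.0 / fs)
    carrier = spectrum[np.abs(freqs - carrier_hz) <= guard_hz].sum()
    sidebands = spectrum[(np.abs(freqs - carrier_hz) > guard_hz) &
                         (np.abs(freqs - carrier_hz) <= band_hz)].sum()
    # articulator movement of a live speaker spreads energy into the sidebands
    return sidebands / max(carrier, 1e-12) > ratio_threshold
```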