Patents Examined by Michelle M Koeth

Machine learning-based audio codec switching

Patent number: 11699452

Abstract: Described herein are techniques, devices, and systems for selectively using a music-capable audio codec on-demand during a communication session. A user equipment (UE) may adaptively transition between using a first audio codec that provides a first audio bandwidth and a second audio codec (e.g., the EVS-FB codec) that provides a second audio bandwidth that is greater than the first audio bandwidth. The transition to the second audio codec may occur in response to determining that sound in the environment of the UE includes frequencies outside of a range of frequencies associated with a human voice, such as by determining that music is being played in the environment of the UE, which allows for selectively using a music-capable audio codec when it would be beneficial to do so.

Type: Grant

Filed: December 8, 2020

Date of Patent: July 11, 2023

Assignee: T-Mobile USA, Inc.

Inventors: Hsin Fu Henry Chiang, Yasmin Karimli, Ming Shan Kwok
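
The switching decision in the abstract for patent 11699452 can be illustrated with a minimal sketch. This is not the patented method: the voice band limits, the energy threshold, the placeholder name for the first codec, and the function names are all assumptions made for illustration. Only the EVS-FB codec is named in the abstract.

```python
# Hypothetical values: the abstract gives no numeric band limits or thresholds.
VOICE_BAND_HZ = (80.0, 4000.0)
OUT_OF_BAND_THRESHOLD = 0.2  # fraction of energy allowed outside the voice band

FIRST_CODEC = "first-codec"  # placeholder for the narrower-bandwidth codec
SECOND_CODEC = "EVS-FB"      # example codec named in the abstract


def out_of_band_fraction(spectrum):
    """Return the share of energy outside the voice band.

    spectrum is an iterable of (frequency_hz, magnitude) pairs.
    """
    low, high = VOICE_BAND_HZ
    total = 0.0
    outside = 0.0
    for freq, mag in spectrum:
        energy = mag * mag
        total += energy
        if freq < low or freq > high:
            outside += energy
    return outside / total if total > 0 else 0.0


def choose_codec(spectrum):
    """Pick the wider-bandwidth codec when enough energy is out of band."""
    if out_of_band_fraction(spectrum) > OUT_OF_BAND_THRESHOLD:
        return SECOND_CODEC
    return FIRST_CODEC


speech_like = [(200.0, 1.0), (1000.0, 0.8), (3000.0, 0.3)]
music_like = [(200.0, 1.0), (1000.0, 0.8), (8000.0, 0.9), (12000.0, 0.7)]
print(choose_codec(speech_like))
print(choose_codec(music_like))
```

A real detector would likely smooth this decision over time to avoid switching codecs on brief sounds; the abstract does not say how that is handled.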
Signal processor employing neural network trained using evolutionary feature selection

Patent number: 11694678

Abstract: The evolutionary feature selection algorithm is combined with model evaluation during training to learn feature subsets that maximize speech/non-speech distribution distances. The technique enables ensembling of low-cost models over similar feature subspaces, which increases classification accuracy and has similar computational complexity in practice. Prior to training the models, feature analysis is conducted via an evolutionary feature selection algorithm which measures fitness for each feature subset in the population by its k-fold cross validation score. PCA and LDA based eigen-features are computed for each subset and fitted with a Gaussian Mixture Model from which combinations of feature subsets with Maximum Mean Discrepancy scores are obtained. During inference, the resulting features are extracted from the input signal and given as input to the trained neural networks.

Type: Grant

Filed: October 7, 2020

Date of Patent: July 4, 2023

Assignee: General Dynamics Mission Systems, Inc.

Inventors: David Lee, Scott Blanchard, Nickolas Dodd
Multi-band noise gate

Patent number: 11688409

Abstract: The present disclosure relates to processing a plurality of audio signals. A device receives the plurality of audio signals in the frequency domain and determines an overall attenuation multiplier based on the plurality of audio signals and an overall lookup table that relates decibel values to different overall attenuation multipliers. The device determines an attenuation vector comprising a plurality of bin-specific attenuation multipliers, each bin-specific attenuation multiplier respectively corresponding to a different frequency bin of the plurality of frequency bins. The device scales each bin-specific attenuation value in the attenuation vector with the overall attenuation multiplier, and edits each of the audio signals based on the scaled bin-specific attenuation values in the attenuation vector.

Type: Grant

Filed: November 9, 2020

Date of Patent: June 27, 2023

Assignee: GoPro, Inc.

Inventors: Joyce Gorny, Erich Tisch, Per Magnus Fredrik Hansson
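
The two-stage attenuation in the abstract for patent 11688409 (an overall multiplier from a decibel lookup table, then per-bin multipliers scaled by it) can be sketched as follows. The table values, the way the level is measured, and the function names are assumptions. The abstract specifies none of them, so this is an illustration rather than the patented implementation.

```python
import bisect
import math

# Hypothetical overall lookup table: (level in dB, overall attenuation multiplier).
OVERALL_TABLE = [(-60.0, 0.0), (-40.0, 0.5), (-20.0, 1.0)]


def level_db(signals):
    """Assumed level measure: mean power over all channels and bins, in dB."""
    power = [abs(x) ** 2 for channel in signals for x in channel]
    mean = sum(power) / len(power)
    return 10.0 * math.log10(mean) if mean > 0 else float("-inf")


def overall_multiplier(db):
    """Return the multiplier of the highest table entry at or below db."""
    levels = [entry_db for entry_db, _ in OVERALL_TABLE]
    idx = bisect.bisect_right(levels, db) - 1
    return OVERALL_TABLE[max(idx, 0)][1]


def gate(signals, bin_multipliers):
    """Scale the per-bin multipliers by the overall one, then apply them.

    signals is a list of channels, each a list of frequency-bin values.
    """
    overall = overall_multiplier(level_db(signals))
    scaled = [m * overall for m in bin_multipliers]
    return [[x * s for x, s in zip(channel, scaled)] for channel in signals]


signals = [[0.02, 0.01, 0.03], [0.02, 0.02, 0.01]]
bin_multipliers = [1.0, 0.5, 0.0]  # hypothetical attenuation vector
print(gate(signals, bin_multipliers))
```

Here the same scaled vector is applied to every channel, which matches the abstract's statement that each audio signal is edited based on the scaled bin-specific values.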
Electronic device with wakeup word determined multi-mode operation

Patent number: 11657800

Abstract: An artificial intelligence device is provided. The artificial intelligence device according to an embodiment of the present disclosure includes: an input unit configured to receive a speech input; and a processor configured to operate in an interaction mode if a second wakeup word for setting an operation mode is recognized after a first wakeup word for calling the artificial intelligence device is recognized, and process one or more commands received after the second wakeup word according to the operation mode indicated by the second wakeup word.

Type: Grant

Filed: April 26, 2019

Date of Patent: May 23, 2023

Assignee: LG ELECTRONICS INC.

Inventors: Jaehong Kim, Hyoeun Kim
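
The two-step wakeup sequence in the abstract for patent 11657800 can be read as a small state machine. The sketch below is one possible reading. The wakeup phrases, the mode name mapping, and the choice to ignore any other utterance before a mode is set are all assumptions, not details from the patent.

```python
FIRST_WAKEUP = "hey device"                    # hypothetical first wakeup word
MODE_WAKEUPS = {"start chat": "interaction"}   # hypothetical second wakeup words


def process(utterances):
    """Return (mode, command) pairs for commands heard after a mode is set."""
    called = False   # True once the first wakeup word has been recognized
    mode = None      # operation mode indicated by the second wakeup word
    handled = []
    for utterance in utterances:
        if mode is not None:
            # Commands after the second wakeup word follow the selected mode.
            handled.append((mode, utterance))
        elif called and utterance in MODE_WAKEUPS:
            mode = MODE_WAKEUPS[utterance]
        elif utterance == FIRST_WAKEUP:
            called = True
        # Anything else is ignored in this sketch.
    return handled


print(process(["hey device", "start chat", "lights on", "play music"]))
print(process(["start chat", "lights on"]))
```

The second call returns an empty list because the mode-setting word only counts after the first wakeup word has been recognized.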
Ultra-low-power speech feature extraction circuit based on non-overlapping framing and serial FFT

Patent number: 11651766

Abstract: The present invention discloses an ultra-low-power speech feature extraction circuit based on non-overlapping framing and serial fast Fourier transform (FFT), and belongs to the technical field of computation, calculation or counting. The circuit is oriented to the field of intelligence, and is integrally composed of a pre-process module, a windowing module, a Fourier transform module, a Mel filtering module, an adjacent frame merging module, a discrete cosine transform (DCT) module and other modules by optimizing the architecture of a Mel-frequency Cepstral Coefficients (MFCC) algorithm. Large-scale storage caused by framing is avoided in a non-overlapping framing mode, storage contained in the MFCC algorithm is further reduced, and the circuit area and the power consumption are greatly reduced.

Type: Grant

Filed: February 22, 2021

Date of Patent: May 16, 2023

Assignee: SOUTHEAST UNIVERSITY

Inventors: Weiwei Shan, Lixuan Zhu, Jun Yang, Longxing Shi
Speech recognition method, device, and computer-readable storage medium

Patent number: 11620983

Abstract: The disclosure provides a speech recognition method, a device and a computer-readable storage medium. The method includes obtaining a first voice signal collected from a first microphone in a microphone array and a second voice signal collected from a second microphone in the microphone array, the microphone array including at least two microphones, such as two, three or six microphones. The method further includes extracting enhanced features associated with the first voice signal and the second voice signal through a neural network, and obtaining a speech recognition result based on the enhanced features extracted.

Type: Grant

Filed: August 10, 2020

Date of Patent: April 4, 2023

Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD

Inventors: Ce Zhang, Bin Huang, Xin Li, Jinfeng Bai, Xu Chen, Lei Jia
Enabling autonomous agents to discriminate between questions and requests

Patent number: 11599724

Abstract: Systems, devices, and methods of the present invention relate to text classification. A text classification system accesses an utterance of text. The utterance includes at least one word. The text classification system generates a parse tree for the utterance. The parse tree includes at least one terminal node with a word type. The terminal node represents a word of the utterance. The text classification system applies one or more rules to the text. The text classification system then classifies the utterance as a question or a request for an autonomous agent to perform an action.

Type: Grant

Filed: August 13, 2020

Date of Patent: March 7, 2023

Assignee: Oracle International Corporation

Inventors: Boris Galitsky, Vishal Vishnoi, Anfernee Xu
Conditional teacher-student learning for model training

Patent number: 11586930

Abstract: Embodiments are associated with conditional teacher-student model training. A trained teacher model configured to perform a task may be accessed and an untrained student model may be created. A model training platform may provide training data labeled with ground truths to the teacher model to produce teacher posteriors representing the training data. When it is determined that a teacher posterior matches the associated ground truth label, the platform may conditionally use the teacher posterior to train the student model. When it is determined that a teacher posterior does not match the associated ground truth label, the platform may conditionally use the ground truth label to train the student model. The models might be associated with, for example, automatic speech recognition (e.g., in connection with domain adaptation and/or speaker adaptation).

Type: Grant

Filed: May 13, 2019

Date of Patent: February 21, 2023

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Zhong Meng, Jinyu Li, Yong Zhao, Yifan Gong
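
The conditional target selection in the abstract for patent 11586930 can be sketched in a few lines. The sketch assumes that a teacher posterior "matches" the ground truth when its highest-probability class equals the label; that reading, the one-hot encoding of labels, and the function names are assumptions for illustration.

```python
def argmax(values):
    """Index of the largest value."""
    return max(range(len(values)), key=values.__getitem__)


def one_hot(label, num_classes):
    """Hard target vector for a ground truth label."""
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


def conditional_targets(teacher_posteriors, labels):
    """Choose a training target for the student, sample by sample.

    Use the teacher posterior when the teacher is correct (assumed to mean
    its argmax equals the label); otherwise fall back to the ground truth.
    """
    targets = []
    for posterior, label in zip(teacher_posteriors, labels):
        if argmax(posterior) == label:
            targets.append(list(posterior))
        else:
            targets.append(one_hot(label, len(posterior)))
    return targets


teacher_posteriors = [[0.7, 0.2, 0.1], [0.1, 0.6, 0.3]]
labels = [0, 2]
print(conditional_targets(teacher_posteriors, labels))
# → [[0.7, 0.2, 0.1], [0.0, 0.0, 1.0]]
```

The student would then be trained against these targets with an ordinary loss such as cross-entropy, so it learns from the teacher's soft outputs only where the teacher is right.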
Dynamic voice accentuation and reinforcement

Patent number: 11581004

Abstract: Systems and methods for dynamic voice accentuation and reinforcement are presented herein. One embodiment comprises one or more audio input sources; one or more audio output sources; one or more band pass filters; and a processing control unit that includes an audio processing unit, and which executes a method: differentiating between audio input sources as vocal sound audio input sources and ambient noise audio input sources; increasing the gain of the vocal sound audio input sources; inverting a polarity of an ambient noise signal received by each of the ambient noise audio input sources; and adding the inverted polarity to either an output signal of at least one of the one or more audio output sources, or to an input signal of at least one of the vocal sound audio input sources, to reduce ambient noise.

Type: Grant

Filed: November 24, 2021

Date of Patent: February 14, 2023

Inventors: Richard Pivnicka, Michael Klasco
Methods and apparatus to improve disambiguation and interpretation in automated text analysis using transducers applied on a structured language space

Patent number: 11568150

Abstract: Methods and apparatus for automated processing of natural language text are described. The text can be preprocessed to produce language-space data that includes descriptive data elements for words. Source code that includes linguistic expressions, and that may be written in a programming language that is user-friendly to linguists, can be compiled to produce finite-state transducers and bi-machine transducers that may be applied directly to the language-space data by a language-processing virtual machine. The language-processing virtual machine can select and execute code segments identified in the finite-state and/or bi-machine transducers to disambiguate meanings of words in the text.

Type: Grant

Filed: July 7, 2020

Date of Patent: January 31, 2023

Assignee: CLRV Technologies, LLC

Inventor: Emmanuel Roche
Re-translation for simultaneous, spoken-language machine translation

Patent number: 11562152

Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for re-translation for simultaneous, spoken-language machine translation. In some implementations, a stream of audio data comprising speech in a first language is received. A transcription for the speech in the stream of audio data is generated using an automated speech recognizer through a series of updates. A translation of the transcription into a second language is generated using a machine translation module. The translation is generated with translation iterations that translate increasing amounts of the transcription, including re-translating previously translated portions of the transcription. A series of translation updates are provided to a client device based on the translation iterations.

Type: Grant

Filed: September 23, 2020

Date of Patent: January 24, 2023

Assignee: Google LLC

Inventors: Naveen Arivazhagan, Colin Andrew Cherry, Wolfgang Macherey, Te I, George Foster, Pallavi N Baljekar
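
The re-translation loop in the abstract for patent 11562152 can be illustrated with a toy example: every transcription update is translated again from the start, so earlier output can change when the recognizer revises its text. The word-for-word lexicon stands in for a real machine translation module, and all names here are hypothetical.

```python
# Toy stand-in for a machine translation module (English to Spanish).
TOY_LEXICON = {"hello": "hola", "world": "mundo", "good": "buen", "day": "día"}


def toy_translate(text):
    """Translate word by word, leaving unknown words unchanged."""
    return " ".join(TOY_LEXICON.get(word, word) for word in text.split())


def retranslation_updates(transcription_updates):
    """Yield one translation update per transcription update.

    Each update re-translates the whole transcription so far, including
    portions that were already translated in earlier iterations.
    """
    for transcript in transcription_updates:
        yield toy_translate(transcript)


# The recognizer first hears "word", then revises it to "world".
asr_updates = ["hello", "hello word", "hello world good day"]
for update in retranslation_updates(asr_updates):
    print(update)
```

Because each iteration starts over, the second update's untranslated "word" is replaced by "mundo" in the third update once the transcription is corrected.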
Artificial intelligence system for capturing context by dilated self-attention

Patent number: 11557283

Abstract: An artificial intelligence (AI) system is disclosed. The AI system includes a processor that processes a sequence of input frames with a neural network including a dilated self-attention module trained to compute a sequence of outputs by transforming each input frame into a corresponding query frame, a corresponding key frame, and a corresponding value frame leading to a sequence of key frames, a sequence of value frames, and a sequence of query frames of same ordering and by performing attention calculations for each query frame with respect to a combination of a portion of the sequences of key and value frames restricted based on a location of the query frame and a dilation sequence of the key frames and a dilation sequence of value frames extracted by processing different frames of the sequences of key and value frames with a predetermined extraction function. Further, the processor renders the sequence of outputs.

Type: Grant

Filed: March 26, 2021

Date of Patent: January 17, 2023

Assignee: Mitsubishi Electric Research Laboratories, Inc.

Inventors: Niko Moritz, Takaaki Hori, Jonathan Le Roux
Device, method and computer program for acoustic monitoring of a monitoring area

Patent number: 11557279

Abstract: A device for acoustic monitoring of a monitoring area includes first and second sensor systems which have first and second acoustic sensors, processors, and transmitters, respectively, and which may be mounted at different locations of the monitoring area. The first and second processors may be configured to classify first and second audio signals detected by the first and second acoustic sensors so as to obtain first and second classification results, respectively. The first and second transmitters may be configured to transmit the first and second classification results to a central evaluator, respectively. In addition, the device may include the central evaluator, which may be configured to receive the first classification result and to receive the second classification result, and to generate a monitoring output for the monitoring area as a function of the first classification result and the second classification result.

Type: Grant

Filed: October 28, 2020

Date of Patent: January 17, 2023

Assignee: FRAUNHOFER-GESELLSCHAFT ZUR FÖRDERUNG DER ANGEWANDTEN FORSCHUNG E.V.

Inventors: Jakob Abesser, Hanna Lukashevich, Steffen Holly, Yvette Körber, Reto Ruch
Method for generating acoustic model

Patent number: 11551672

Abstract: A method for generating an acoustic model is disclosed. The method can generate the acoustic model with high accuracy through learning data including various dialects by training the acoustic model using text data, to which regional information is tagged, and changing a parameter of the acoustic model based on the tagged regional information. The acoustic model can be associated with an artificial intelligence module, an unmanned aerial vehicle (UAV), a robot, an augmented reality (AR) device, a virtual reality (VR) device, devices related to 5G services, and the like.

Type: Grant

Filed: October 1, 2020

Date of Patent: January 10, 2023

Assignee: LG ELECTRONICS INC.

Inventors: Jeehye Lee, Seonyeong Park
Sentence extraction system, sentence extraction method, and information storage medium

Patent number: 11526674

Abstract: A text extracting system includes at least one processor configured to obtain a plurality of texts, specify at least one characteristic expression included in the plurality of texts, and extract, based on the at least one characteristic expression, at least one text to be entered into a question sentence generator from the plurality of texts, where the question sentence generator generates a question sentence from an input sentence.

Type: Grant

Filed: March 1, 2019

Date of Patent: December 13, 2022

Assignee: RAKUTEN GROUP, INC.

Inventors: Masakatsu Hamashita, Takashi Inui, Koji Murakami
Automatically generating conference minutes

Patent number: 11521603

Abstract: A conference minutes generation method is provided, which relates to the technical field of natural language processing. The conference minutes generation method comprises: acquiring a text conference record; dividing the text conference record into a plurality of conference paragraphs, generating a conference paragraph summary for each conference paragraph, and generating a conference record summary based on the conference paragraph summary of each conference paragraph; extracting conference instructions based on the text conference record; and generating the conference minutes based on the conference record summary and the conference instructions.

Type: Grant

Filed: October 1, 2020

Date of Patent: December 6, 2022

Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.

Inventors: Ke Sun, Ying Liu, Kai Liu, Lei Han, Chao Wang, Yingzhuo Song, Shuai Gao, Liyan Yang, Qianqian Wang, Jing Liu, Di Wei
Wakeword detection using a neural network

Patent number: 11521599

Abstract: A system and method performs wakeword detection using a feedforward neural network model. A first output of the model indicates when the wakeword appears on a right side of a first window of input audio data. A second output of the model indicates when the wakeword appears in the center of a second window of input audio data. A third output of the model indicates when the wakeword appears on a left side of a third window of input audio data. Using these outputs, the system and method determine a beginpoint and endpoint of the wakeword.

Type: Grant

Filed: September 20, 2019

Date of Patent: December 6, 2022

Assignee: Amazon Technologies, Inc.

Inventors: Christin Jose, Yuriy Mishchenko, Anish N. Shah, Alex Escott, Parind Shah, Shiv Naga Prasad Vitaladevuni, Thibaud Senechal
Electronic device for outputting response to speech input by using application and operation method thereof

Patent number: 11508364

Abstract: An artificial intelligence (AI) system and an application of the AI system are provided. The AI system simulates functions of the human brain, such as recognition and judgment, by utilizing a machine learning algorithm such as deep learning. A method, performed by an electronic device, of outputting a response to a speech input by using an application, includes receiving the speech input, obtaining text corresponding to the speech input by performing speech recognition on the speech input, obtaining metadata for the speech input based on the obtained text, selecting at least one application from among a plurality of applications for outputting the response to the speech input based on the metadata, and outputting the response to the speech input by using the selected at least one application.

Type: Grant

Filed: May 21, 2019

Date of Patent: November 22, 2022

Assignee: Samsung Electronics Co., Ltd.

Inventors: Cheenepalli Srirama Krishna Bhargava, Ankush Gupta
Apparatus for media entity pronunciation using deep learning

Patent number: 11501764

Abstract: Methods, systems, and related products for voice-enabled computer systems are described. A machine-learning model is trained to produce pronunciation output based on text input. The trained machine-learning model is used to produce pronunciation data for text input even where the text input includes numbers, punctuation, emoji, or other non-letter characters. The machine-learning model is further trained based on real-world data from users to improve pronunciation output.

Type: Grant

Filed: May 10, 2019

Date of Patent: November 15, 2022

Assignee: Spotify AB

Inventor: Daniel Bromand
Speech recognition using data analysis and dilation of interlaced audio input

Patent number: 11495216

Abstract: The disclosure includes using dilation of speech content from an interlaced audio input for speech recognition. A learning model is initiated to determine dilation parameters for each of a plurality of audible sounds of speech content from a plurality of speakers received at a computer as an audio input. As part of the learning model, a change of each of a plurality of independent sounds is determined in response to an audio stimulus, the independent sounds being derived from the audio input. The disclosure applies the dilation parameters, respectively, based on the change of each of the independent sounds. A voice print is constructed for each of the speakers based on the independent sounds and the dilation parameters, respectively. Speech content is attributed to each of the plurality of speakers based at least in part on the voice print, respectively, and the independent sounds.

Type: Grant

Filed: September 9, 2020

Date of Patent: November 8, 2022

Assignee: International Business Machines Corporation

Inventors: Aaron K. Baughman, Corey B. Shelton, Stephen C. Hammer, Shikhar Kwatra
