Patents Examined by Michelle M Koeth
  • Patent number: 11699452
    Abstract: Described herein are techniques, devices, and systems for selectively using a music-capable audio codec on-demand during a communication session. A user equipment (UE) may adaptively transition between using a first audio codec that provides a first audio bandwidth and a second audio codec (e.g., the EVS-FB codec) that provides a second audio bandwidth that is greater than the first audio bandwidth. The transition to the second audio codec may occur in response to determining that sound in the environment of the UE includes frequencies outside of a range of frequencies associated with a human voice, such as by determining that music is being played in the environment of the UE, which allows for selectively using a music-capable audio codec when it would be beneficial to do so.
    Type: Grant
    Filed: December 8, 2020
    Date of Patent: July 11, 2023
    Assignee: T-Mobile USA, Inc.
    Inventors: Hsin Fu Henry Chiang, Yasmin Karimli, Ming Shan Kwok
  • Patent number: 11694678
    Abstract: The evolutionary feature selection algorithm is combined with model evaluation during training to learn feature subsets that maximize speech/non-speech distribution distances. The technique enables ensembling of low-cost models over similar feature subspaces, which increases classification accuracy and has similar computational complexity in practice. Prior to training the models, feature analysis is conducted via an evolutionary feature selection algorithm which measures fitness for each feature subset in the population by its k-fold cross validation score. PCA and LDA based eigen-features are computed for each subset and fitted with a Gaussian Mixture Model from which combinations of feature subsets with Maximum Mean Discrepancy scores are obtained. During inference, the resulting features are extracted from the input signal and given as input to the trained neural networks.
    Type: Grant
    Filed: October 7, 2020
    Date of Patent: July 4, 2023
    Assignee: General Dynamics Mission Systems, Inc.
    Inventors: David Lee, Scott Blanchard, Nickolas Dodd
  • Patent number: 11688409
    Abstract: The present disclosure relates to processing a plurality of audio signals. A device receives the plurality of audio signals in the frequency domain and determines an overall attenuation multiplier based on the plurality of audio signals and an overall lookup table that relates decibel values to different overall attenuation multipliers. The device determines an attenuation vector comprising a plurality of bin-specific attenuation multipliers, each bin-specific attenuation multiplier respectively corresponding to a different frequency bin of the plurality of frequency bins. The device scales each bin-specific attenuation value in the attenuation vector with the overall attenuation multiplier, and edits each of the audio signals based on the scaled bin-specific attenuation values in the attenuation vector.
    Type: Grant
    Filed: November 9, 2020
    Date of Patent: June 27, 2023
    Assignee: GoPro, Inc.
    Inventors: Joyce Gorny, Erich Tisch, Per Magnus Fredrik Hansson
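Illustrative sketch (not taken from the patent): the scaling step described in the abstract above can be pictured as a table lookup followed by two element-wise multiplications. The lookup-table values, the function names, and the assumption that a single decibel level has already been computed from the input signals are all hypothetical; the abstract does not specify them.

```python
import bisect

# Hypothetical lookup table: (decibel threshold, overall attenuation multiplier).
# These values are illustrative assumptions, not values from the patent.
OVERALL_LOOKUP = [(-60.0, 1.0), (-30.0, 0.8), (-10.0, 0.5), (0.0, 0.25)]


def overall_multiplier(level_db):
    """Return the multiplier of the highest threshold that is <= level_db."""
    thresholds = [db for db, _ in OVERALL_LOOKUP]
    index = bisect.bisect_right(thresholds, level_db) - 1
    index = max(index, 0)  # levels below the table fall back to the first entry
    return OVERALL_LOOKUP[index][1]


def scale_attenuation(attenuation_vector, level_db):
    """Scale every bin-specific multiplier by the overall multiplier."""
    overall = overall_multiplier(level_db)
    return [a * overall for a in attenuation_vector]


def apply_attenuation(signals, scaled_vector):
    """Multiply each frequency bin of each signal by its scaled multiplier."""
    return [[x * a for x, a in zip(signal, scaled_vector)] for signal in signals]


# Usage: two signals with two frequency bins each, at an assumed level of -10 dB.
scaled = scale_attenuation([1.0, 0.5], -10.0)
edited = apply_attenuation([[2.0, 4.0], [1.0, 1.0]], scaled)
```

Frequency-domain bins are usually complex numbers; the same multiplication works unchanged on Python `complex` values.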
  • Patent number: 11657800
    Abstract: An artificial intelligence device is provided. The artificial intelligence device according to an embodiment of the present disclosure includes: an input unit configured to receive a speech input; and a processor configured to operate in an interaction mode if a second wakeup word for setting an operation mode is recognized after a first wakeup word for calling the artificial intelligence device is recognized, and process one or more commands received after the second wakeup word according to the operation mode indicated by the second wakeup word.
    Type: Grant
    Filed: April 26, 2019
    Date of Patent: May 23, 2023
    Assignee: LG ELECTRONICS INC.
    Inventors: Jaehong Kim, Hyoeun Kim
  • Patent number: 11651766
    Abstract: The present invention discloses an ultra-low-power speech feature extraction circuit based on non-overlapping framing and serial fast Fourier transform (FFT), and belongs to the technical field of computation, calculation or counting. The circuit is oriented to the field of intelligence, and is integrally composed of a pre-process module, a windowing module, a Fourier transform module, a Mel filtering module, an adjacent frame merging module, a discrete cosine transform (DCT) module and other modules by optimizing the architecture of a Mel-frequency Cepstral Coefficients (MFCC) algorithm. Large-scale storage caused by framing is avoided in a non-overlapping framing mode, storage contained in the MFCC algorithm is further reduced, and the circuit area and the power consumption are greatly reduced.
    Type: Grant
    Filed: February 22, 2021
    Date of Patent: May 16, 2023
    Assignee: SOUTHEAST UNIVERSITY
    Inventors: Weiwei Shan, Lixuan Zhu, Jun Yang, Longxing Shi
  • Patent number: 11620983
    Abstract: The disclosure provides a speech recognition method, a device and a computer-readable storage medium. The method includes obtaining a first voice signal collected from a first microphone in a microphone array and a second voice signal collected from a second microphone in the microphone array, the microphone array including at least two microphones, such as two, three or six microphones. The method further includes extracting enhanced features associated with the first voice signal and the second voice signal through a neural network, and obtaining a speech recognition result based on the enhanced features extracted.
    Type: Grant
    Filed: August 10, 2020
    Date of Patent: April 4, 2023
    Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD
    Inventors: Ce Zhang, Bin Huang, Xin Li, Jinfeng Bai, Xu Chen, Lei Jia
  • Patent number: 11599724
    Abstract: Systems, devices, and methods of the present invention relate to text classification. A text classification system accesses an utterance of text. The utterance includes at least one word. The text classification system generates a parse tree for the utterance. The parse tree includes at least one terminal node with a word type. The terminal node represents a word of the utterance. The text classification system applies one or more rules to the text. The text classification system then classifies the utterance as a question or a request for an autonomous agent to perform an action.
    Type: Grant
    Filed: August 13, 2020
    Date of Patent: March 7, 2023
    Assignee: Oracle International Corporation
    Inventors: Boris Galitsky, Vishal Vishnoi, Anfernee Xu
  • Patent number: 11586930
    Abstract: Embodiments are associated with conditional teacher-student model training. A trained teacher model configured to perform a task may be accessed and an untrained student model may be created. A model training platform may provide training data labeled with ground truths to the teacher model to produce teacher posteriors representing the training data. When it is determined that a teacher posterior matches the associated ground truth label, the platform may conditionally use the teacher posterior to train the student model. When it is determined that a teacher posterior does not match the associated ground truth label, the platform may conditionally use the ground truth label to train the student model. The models might be associated with, for example, automatic speech recognition (e.g., in connection with domain adaptation and/or speaker adaptation).
    Type: Grant
    Filed: May 13, 2019
    Date of Patent: February 21, 2023
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Zhong Meng, Jinyu Li, Yong Zhao, Yifan Gong
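Illustrative sketch (not taken from the patent): the conditional rule in the abstract above amounts to choosing, per training example, between the teacher's soft posterior and the hard ground-truth label. The sketch assumes that "matches" means the teacher's most probable class equals the label, and that the hard label is encoded as a one-hot vector; the function name is hypothetical.

```python
def select_training_target(teacher_posterior, label):
    """Pick the target distribution used to train the student on one example.

    teacher_posterior: list of class probabilities produced by the teacher.
    label: integer index of the ground-truth class.
    """
    num_classes = len(teacher_posterior)
    # Assumption: the teacher "matches" when its top-scoring class is the label.
    predicted = max(range(num_classes), key=lambda i: teacher_posterior[i])
    if predicted == label:
        # Teacher is correct: keep its soft posterior as the target.
        return list(teacher_posterior)
    # Teacher is wrong: fall back to the one-hot ground-truth label.
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


# Usage: the same posterior yields different targets depending on the label.
soft_target = select_training_target([0.1, 0.7, 0.2], 1)
hard_target = select_training_target([0.1, 0.7, 0.2], 2)
```

The student would then be trained against the selected target with an ordinary cross-entropy or KL-divergence loss, which is outside the scope of this sketch.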
  • Patent number: 11581004
    Abstract: Systems and methods for dynamic voice accentuation and reinforcement are presented herein. One embodiment comprises one or more audio input sources; one or more audio output sources; one or more band pass filters; and a processing control unit that includes an audio processing unit, and which executes a method: differentiating between audio input sources as vocal sound audio input sources and ambient noise audio input sources; increasing the gain of the vocal sound audio input sources; inverting a polarity of an ambient noise signal received by each of the ambient noise audio input sources; and adding the inverted polarity to either an output signal of at least one of the one or more audio output sources, or to an input signal of at least one of the vocal sound audio input sources, to reduce ambient noise.
    Type: Grant
    Filed: November 24, 2021
    Date of Patent: February 14, 2023
    Inventors: Richard Pivnicka, Michael Klasco
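Illustrative sketch (not taken from the patent): the polarity-inversion step in the abstract above can be shown on plain sample lists. The sketch is idealized: it assumes the ambient-noise reference is perfectly time-aligned with the noise present in the output signal, which real systems can only approximate. The function names are hypothetical.

```python
def invert_polarity(samples):
    """Flip the sign of every sample in an ambient-noise signal."""
    return [-s for s in samples]


def reduce_ambient_noise(output_signal, ambient_noise):
    """Add the inverted noise reference to the output signal, sample by sample."""
    inverted = invert_polarity(ambient_noise)
    return [o + n for o, n in zip(output_signal, inverted)]


# Usage: a voice signal contaminated by noise that is also captured separately.
voice = [0.5, -0.25, 0.125]
noise = [0.25, 0.25, -0.5]
mixed = [v + n for v, n in zip(voice, noise)]
cleaned = reduce_ambient_noise(mixed, noise)
```

With a perfectly aligned reference the cleaned signal equals the original voice; any delay or level mismatch leaves residual noise.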
  • Patent number: 11568150
    Abstract: Methods and apparatus for automated processing of natural language text are described. The text can be preprocessed to produce language-space data that includes descriptive data elements for words. Source code that includes linguistic expressions, and that may be written in a programming language that is user-friendly to linguists, can be compiled to produce finite-state transducers and bi-machine transducers that may be applied directly to the language-space data by a language-processing virtual machine. The language-processing virtual machine can select and execute code segments identified in the finite-state and/or bi-machine transducers to disambiguate meanings of words in the text.
    Type: Grant
    Filed: July 7, 2020
    Date of Patent: January 31, 2023
    Assignee: CLRV Technologies, LLC
    Inventor: Emmanuel Roche
  • Patent number: 11562152
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for re-translation for simultaneous, spoken-language machine translation. In some implementations, a stream of audio data comprising speech in a first language is received. A transcription for the speech in the stream of audio data is generated using an automated speech recognizer through a series of updates. A translation of the transcription into a second language is generated using a machine translation module. The translation is generated with translation iterations that translate increasing amounts of the transcription, including re-translating previously translated portions of the transcription. A series of translation updates are provided to a client device based on the translation iterations.
    Type: Grant
    Filed: September 23, 2020
    Date of Patent: January 24, 2023
    Assignee: Google LLC
    Inventors: Naveen Arivazhagan, Colin Andrew Cherry, Wolfgang Macherey, Te I, George Foster, Pallavi N Baljekar
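Illustrative sketch (not taken from the patent): the re-translation loop in the abstract above can be pictured as translating the whole transcript from scratch each time the recognizer updates it, so earlier portions are translated again along with the new words. The toy word-for-word lexicon and both function names are hypothetical stand-ins for a real machine translation module.

```python
# Toy German-to-English lexicon; a stand-in for a real translation model.
TOY_LEXICON = {"guten": "good", "morgen": "morning", "tag": "day"}


def toy_translate(text):
    """Translate word by word, leaving unknown words unchanged."""
    return " ".join(TOY_LEXICON.get(word, word) for word in text.lower().split())


def retranslate_stream(transcript_updates, translate):
    """Re-translate the full transcript after every recognizer update.

    transcript_updates: successive, growing versions of the transcription.
    translate: callable mapping source text to target text.
    Returns one translation update per transcript update.
    """
    return [translate(transcript) for transcript in transcript_updates]


# Usage: each update covers more of the transcript than the previous one.
updates = retranslate_stream(["Guten", "Guten Morgen"], toy_translate)
```

Because every iteration starts over, a later update is free to revise words that an earlier update had already emitted.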
  • Patent number: 11557283
    Abstract: An artificial intelligence (AI) system is disclosed. The AI system includes a processor that processes a sequence of input frames with a neural network including a dilated self-attention module trained to compute a sequence of outputs by transforming each input frame into a corresponding query frame, a corresponding key frame, and a corresponding value frame leading to a sequence of key frames, a sequence of value frames, and a sequence of query frames of same ordering and by performing attention calculations for each query frame with respect to a combination of a portion of the sequences of key and value frames restricted based on a location of the query frame and a dilation sequence of the key frames and a dilation sequence of value frames extracted by processing different frames of the sequences of key and value frames with a predetermined extraction function. Further, the processor renders the sequence of outputs.
    Type: Grant
    Filed: March 26, 2021
    Date of Patent: January 17, 2023
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Niko Moritz, Takaaki Hori, Jonathan Le Roux
  • Patent number: 11557279
    Abstract: A device for acoustic monitoring of a monitoring area includes first and second sensor systems which have first and second acoustic sensors, processors, and transmitters, respectively, and which may be mounted at different locations of the monitoring area. The first and second processors may be configured to classify first and second audio signals detected by the first and second acoustic sensors so as to obtain first and second classification results, respectively. The first and second transmitters may be configured to transmit the first and second classification results to a central evaluator, respectively. In addition, the device may include the central evaluator, which may be configured to receive the first classification result and to receive the second classification result, and to generate a monitoring output for the monitoring area as a function of the first classification result and the second classification result.
    Type: Grant
    Filed: October 28, 2020
    Date of Patent: January 17, 2023
    Assignee: FRAUNHOFER-GESELLSCHAFT ZUR FÖRDERUNG DER ANGEWANDTEN FORSCHUNG E.V.
    Inventors: Jakob Abesser, Hanna Lukashevich, Steffen Holly, Yvette Körber, Reto Ruch
  • Patent number: 11551672
    Abstract: A method for generating an acoustic model is disclosed. The method can generate the acoustic model with high accuracy through learning data including various dialects by training the acoustic model using text data, to which regional information is tagged, and changing a parameter of the acoustic model based on the tagged regional information. The acoustic model can be associated with an artificial intelligence module, an unmanned aerial vehicle (UAV), a robot, an augmented reality (AR) device, a virtual reality (VR) device, devices related to 5G services, and the like.
    Type: Grant
    Filed: October 1, 2020
    Date of Patent: January 10, 2023
    Assignee: LG ELECTRONICS INC.
    Inventors: Jeehye Lee, Seonyeong Park
  • Patent number: 11526674
    Abstract: A text extracting system includes at least one processor configured to obtain a plurality of texts, specify at least one characteristic expression included in the plurality of texts, and extract, based on the at least one characteristic expression, at least one text to be entered into a question sentence generator from the plurality of texts, where the question sentence generator generates a question sentence from an input sentence.
    Type: Grant
    Filed: March 1, 2019
    Date of Patent: December 13, 2022
    Assignee: RAKUTEN GROUP, INC.
    Inventors: Masakatsu Hamashita, Takashi Inui, Koji Murakami
  • Patent number: 11521603
    Abstract: A conference minutes generation method is provided, which relates to the technical field of natural language processing. The conference minutes generation method comprises: acquiring a text conference record; dividing the text conference record into a plurality of conference paragraphs, generating a conference paragraph summary for each conference paragraph, and generating a conference record summary based on the conference paragraph summary of each conference paragraph; extracting conference instructions based on the text conference record; and generating the conference minutes based on the conference record summary and the conference instructions.
    Type: Grant
    Filed: October 1, 2020
    Date of Patent: December 6, 2022
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Ke Sun, Ying Liu, Kai Liu, Lei Han, Chao Wang, Yingzhuo Song, Shuai Gao, Liyan Yang, Qianqian Wang, Jing Liu, Di Wei
  • Patent number: 11521599
    Abstract: A system and method performs wakeword detection using a feedforward neural network model. A first output of the model indicates when the wakeword appears on a right side of a first window of input audio data. A second output of the model indicates when the wakeword appears in the center of a second window of input audio data. A third output of the model indicates when the wakeword appears on a left side of a third window of input audio data. Using these outputs, the system and method determine a beginpoint and endpoint of the wakeword.
    Type: Grant
    Filed: September 20, 2019
    Date of Patent: December 6, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Christin Jose, Yuriy Mishchenko, Anish N. Shah, Alex Escott, Parind Shah, Shiv Naga Prasad Vitaladevuni, Thibaud Senechal
  • Patent number: 11508364
    Abstract: An artificial intelligence (AI) system that simulates functions of the human brain, such as recognition and judgment, by utilizing a machine learning algorithm such as deep learning, and an application of the AI system are provided. A method, performed by an electronic device, of outputting a response to a speech input by using an application, includes receiving the speech input, obtaining text corresponding to the speech input by performing speech recognition on the speech input, obtaining metadata for the speech input based on the obtained text, selecting at least one application from among a plurality of applications for outputting the response to the speech input based on the metadata, and outputting the response to the speech input by using the selected at least one application.
    Type: Grant
    Filed: May 21, 2019
    Date of Patent: November 22, 2022
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Cheenepalli Srirama Krishna Bhargava, Ankush Gupta
  • Patent number: 11501764
    Abstract: Methods, systems, and related products for voice-enabled computer systems are described. A machine-learning model is trained to produce pronunciation output based on text input. The trained machine-learning model is used to produce pronunciation data for text input even where the text input includes numbers, punctuation, emoji, or other non-letter characters. The machine-learning model is further trained based on real-world data from users to improve pronunciation output.
    Type: Grant
    Filed: May 10, 2019
    Date of Patent: November 15, 2022
    Assignee: Spotify AB
    Inventor: Daniel Bromand
  • Patent number: 11495216
    Abstract: The disclosure includes using dilation of speech content from an interlaced audio input for speech recognition. A learning model is initiated to determine dilation parameters for each of a plurality of audible sounds of speech content from a plurality of speakers received at a computer as an audio input. As part of the learning model, a change of each of a plurality of independent sounds is determined in response to an audio stimulus, the independent sounds being derived from the audio input. The disclosure applies the dilation parameters, respectively, based on the change of each of the independent sounds. A voice print is constructed for each of the speakers based on the independent sounds and the dilation parameters, respectively. Speech content is attributed to each of the plurality of speakers based at least in part on the voice print, respectively, and the independent sounds.
    Type: Grant
    Filed: September 9, 2020
    Date of Patent: November 8, 2022
    Assignee: International Business Machines Corporation
    Inventors: Aaron K. Baughman, Corey B. Shelton, Stephen C. Hammer, Shikhar Kwatra