Patents Examined by Michelle M Koeth
-
Patent number: 11699452
Abstract: Described herein are techniques, devices, and systems for selectively using a music-capable audio codec on-demand during a communication session. A user equipment (UE) may adaptively transition between using a first audio codec that provides a first audio bandwidth and a second audio codec (e.g., the EVS-FB codec) that provides a second audio bandwidth that is greater than the first audio bandwidth. The transition to the second audio codec may occur in response to determining that sound in the environment of the UE includes frequencies outside of a range of frequencies associated with a human voice, such as by determining that music is being played in the environment of the UE, which allows for selectively using a music-capable audio codec when it would be beneficial to do so.
Type: Grant
Filed: December 8, 2020
Date of Patent: July 11, 2023
Assignee: T-Mobile USA, Inc.
Inventors: Hsin Fu Henry Chiang, Yasmin Karimli, Ming Shan Kwok
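The switching criterion above (energy at frequencies beyond the voice band triggers the fullband codec) can be illustrated with a minimal Python sketch. This is not the patented implementation: the 8 kHz cutoff, the 10% energy threshold, the codec labels ("EVS-FB" aside, which the abstract names), and the naive DFT are all illustrative assumptions.

```python
import math

VOICE_BAND_HZ = 8000.0  # assumed upper edge of the voice-oriented codec's band

def band_energy_ratio(samples, sample_rate, cutoff_hz=VOICE_BAND_HZ):
    """Fraction of spectral energy above cutoff_hz (naive DFT; fine for short frames)."""
    n = len(samples)
    total = above = 0.0
    for k in range(1, n // 2):
        re = sum(s * math.cos(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        im = sum(-s * math.sin(2 * math.pi * k * i / n) for i, s in enumerate(samples))
        power = re * re + im * im
        total += power
        if k * sample_rate / n > cutoff_hz:
            above += power
    return above / total if total else 0.0

def choose_codec(samples, sample_rate, threshold=0.1):
    """Switch to the fullband codec when significant energy lies outside the voice band."""
    return "EVS-FB" if band_energy_ratio(samples, sample_rate) > threshold else "default"

sr = 48000
music = [math.sin(2 * math.pi * 12000 * i / sr) for i in range(480)]   # 12 kHz tone
speech = [math.sin(2 * math.pi * 300 * i / sr) for i in range(480)]    # 300 Hz tone
```

A 12 kHz component (well above any voice band) selects the fullband codec, while a 300 Hz tone stays on the default codec.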
-
Patent number: 11694678
Abstract: The evolutionary feature selection algorithm is combined with model evaluation during training to learn feature subsets that maximize speech/non-speech distribution distances. The technique enables ensembling of low-cost models over similar feature subspaces, increases classification accuracy, and has similar computational complexity in practice. Prior to training the models, feature analysis is conducted via an evolutionary feature selection algorithm which measures fitness for each feature subset in the population by its k-fold cross validation score. PCA and LDA based eigen-features are computed for each subset and fitted with a Gaussian Mixture Model from which combinations of feature subsets with Maximum Mean Discrepancy scores are obtained. During inference, the resulting features are extracted from the input signal and given as input to the trained neural networks.
Type: Grant
Filed: October 7, 2020
Date of Patent: July 4, 2023
Assignee: General Dynamics Mission Systems, Inc.
Inventors: David Lee, Scott Blanchard, Nickolas Dodd
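The core loop of evolutionary feature selection (a population of feature subsets, a fitness score per subset, survivors mutated into the next generation) can be sketched as follows. To keep the example deterministic, this sketch enumerates all single-feature flips instead of random mutation, and a toy fitness function stands in for the k-fold cross-validation score; both are assumptions, not the patented algorithm.

```python
N_FEATURES = 10
GOOD = {2, 3, 7}  # hypothetical "truly useful" features

def fitness(subset):
    """Stand-in for the k-fold cross-validation score of a feature subset:
    reward useful features, lightly penalize subset size."""
    return len(GOOD & subset) - 0.05 * len(subset)

def evolve(generations=6, pop_size=8):
    """Evolve feature subsets: each generation, every survivor spawns all
    single-feature flips, and the fittest pop_size subsets survive."""
    pop = [frozenset()]
    for _ in range(generations):
        offspring = {s ^ frozenset({f}) for s in pop for f in range(N_FEATURES)}
        pop = sorted(offspring | set(pop), key=fitness, reverse=True)[:pop_size]
    return pop[0]

best = evolve()
```

Under this toy fitness, the search converges to exactly the rewarded subset {2, 3, 7}; in the patented system the same loop would be driven by cross-validation scores over real speech/non-speech data.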
-
Patent number: 11688409
Abstract: The present disclosure relates to processing a plurality of audio signals. A device receives the plurality of audio signals in the frequency domain and determines an overall attenuation multiplier based on the plurality of audio signals and an overall lookup table that relates decibel values to different overall attenuation multipliers. The device determines an attenuation vector comprising a plurality of bin-specific attenuation multipliers, each bin-specific attenuation multiplier respectively corresponding to a different frequency bin of the plurality of frequency bins. The device scales each bin-specific attenuation value in the attenuation vector with the overall attenuation multiplier, and edits each of the audio signals based on the scaled bin-specific attenuation values in the attenuation vector.
Type: Grant
Filed: November 9, 2020
Date of Patent: June 27, 2023
Assignee: GoPro, Inc.
Inventors: Joyce Gorny, Erich Tisch, Per Magnus Fredrik Hansson
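The two-stage attenuation described above (one overall multiplier from a dB lookup table, then per-bin multipliers scaled by it) reduces to a few lines. The lookup-table values and the nearest-entry lookup strategy below are illustrative assumptions, not figures from the patent.

```python
def overall_multiplier(level_db, lookup):
    """Pick the attenuation multiplier for the closest decibel entry in the table."""
    key = min(lookup, key=lambda db: abs(db - level_db))
    return lookup[key]

def apply_attenuation(bins, per_bin_atten, level_db, lookup):
    """Scale each bin-specific attenuation value by the overall multiplier,
    then attenuate each frequency bin accordingly."""
    overall = overall_multiplier(level_db, lookup)
    scaled = [a * overall for a in per_bin_atten]
    return [b * s for b, s in zip(bins, scaled)]

# hypothetical dB -> overall-multiplier table
lookup = {-20.0: 1.0, -6.0: 0.5, 0.0: 0.25}
out = apply_attenuation([1.0, 2.0, 4.0], [1.0, 0.5, 0.25], -6.0, lookup)
```

Here the -6 dB input selects an overall multiplier of 0.5, which scales every bin-specific value before it is applied to its frequency bin.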
-
Patent number: 11657800
Abstract: An artificial intelligence device is provided. The artificial intelligence device according to an embodiment of the present disclosure includes: an input unit configured to receive a speech input; and a processor configured to operate in an interaction mode if a second wakeup word for setting an operation mode is recognized after a first wakeup word for calling the artificial intelligence device is recognized, and process one or more commands received after the second wakeup word according to the operation mode indicated by the second wakeup word.
Type: Grant
Filed: April 26, 2019
Date of Patent: May 23, 2023
Assignee: LG ELECTRONICS INC.
Inventors: Jaehong Kim, Hyoeun Kim
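The two-stage wakeup flow (first wakeup word calls the device, second selects an operation mode, the rest is processed as commands under that mode) is essentially a small state machine. The wakeup words and mode names below are hypothetical placeholders.

```python
FIRST_WAKEUP = "hi-device"                       # hypothetical call word
MODE_WORDS = {"chat-mode": "interaction",        # hypothetical second wakeup words
              "home-mode": "device control"}

def parse(words):
    """Recognize the first wakeup word (device call), then the second wakeup word
    (operation-mode selection); the remaining words are commands for that mode."""
    if not words or words[0] != FIRST_WAKEUP:
        return None                              # device was not called
    if len(words) < 2 or words[1] not in MODE_WORDS:
        return None                              # no operation mode was set
    return MODE_WORDS[words[1]], words[2:]
```

An utterance lacking either wakeup word is ignored; otherwise the commands are routed to the mode the second wakeup word indicates.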
-
Patent number: 11651766
Abstract: The present invention discloses an ultra-low-power speech feature extraction circuit based on non-overlapping framing and serial fast Fourier transform (FFT), and belongs to the technical field of computation, calculation or counting. The circuit is oriented to the field of intelligence, and is integrally composed of a pre-process module, a windowing module, a Fourier transform module, a Mel filtering module, an adjacent frame merging module, a discrete cosine transform (DCT) module and other modules by optimizing the architecture of a Mel-frequency Cepstral Coefficients (MFCC) algorithm. Large-scale storage caused by framing is avoided in a non-overlapping framing mode, storage contained in the MFCC algorithm is further reduced, and the circuit area and the power consumption are greatly reduced.
Type: Grant
Filed: February 22, 2021
Date of Patent: May 16, 2023
Assignee: SOUTHEAST UNIVERSITY
Inventors: Weiwei Shan, Lixuan Zhu, Jun Yang, Longxing Shi
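The storage saving from non-overlapping framing is easy to quantify: with the hop size equal to the frame length, each sample belongs to exactly one frame (no inter-frame buffer is needed) and far fewer frames are produced. The 25 ms frame / 10 ms hop figures below are conventional MFCC defaults used for comparison, not values taken from the patent.

```python
def frame_count(n_samples, frame_len, hop):
    """Number of complete frames obtained with a given hop size."""
    if n_samples < frame_len:
        return 0
    return 1 + (n_samples - frame_len) // hop

FRAME = 400    # 25 ms at 16 kHz
N = 16000      # 1 second of audio at 16 kHz

overlapping = frame_count(N, FRAME, 160)     # conventional 10 ms hop
non_overlap = frame_count(N, FRAME, FRAME)   # non-overlapping: hop == frame length
```

For one second of 16 kHz audio this yields 98 overlapping frames versus 40 non-overlapping ones, and the non-overlapping mode additionally avoids buffering the 240 samples shared between consecutive frames.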
-
Patent number: 11620983
Abstract: The disclosure provides a speech recognition method, a device and a computer-readable storage medium. The method includes obtaining a first voice signal collected from a first microphone in a microphone array and a second voice signal collected from a second microphone in the microphone array, the microphone array including at least two microphones, such as two, three or six microphones. The method further includes extracting enhanced features associated with the first voice signal and the second voice signal through a neural network, and obtaining a speech recognition result based on the enhanced features extracted.
Type: Grant
Filed: August 10, 2020
Date of Patent: April 4, 2023
Assignee: BAIDU ONLINE NETWORK TECHNOLOGY (BEIJING) CO., LTD
Inventors: Ce Zhang, Bin Huang, Xin Li, Jinfeng Bai, Xu Chen, Lei Jia
-
Patent number: 11599724
Abstract: Systems, devices, and methods of the present invention relate to text classification. A text classification system accesses an utterance of text. The utterance includes at least one word. The text classification system generates a parse tree for the utterance. The parse tree includes at least one terminal node with a word type. The terminal node represents a word of the utterance. The text classification system applies one or more rules to the text. The text classification system then classifies the utterance as a question or a request for an autonomous agent to perform an action.
Type: Grant
Filed: August 13, 2020
Date of Patent: March 7, 2023
Assignee: Oracle International Corporation
Inventors: Boris Galitsky, Vishal Vishnoi, Anfernee Xu
-
Patent number: 11586930
Abstract: Embodiments are associated with conditional teacher-student model training. A trained teacher model configured to perform a task may be accessed and an untrained student model may be created. A model training platform may provide training data labeled with ground truths to the teacher model to produce teacher posteriors representing the training data. When it is determined that a teacher posterior matches the associated ground truth label, the platform may conditionally use the teacher posterior to train the student model. When it is determined that a teacher posterior does not match the associated ground truth label, the platform may conditionally use the ground truth label to train the student model. The models might be associated with, for example, automatic speech recognition (e.g., in connection with domain adaptation and/or speaker adaptation).
Type: Grant
Filed: May 13, 2019
Date of Patent: February 21, 2023
Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
Inventors: Zhong Meng, Jinyu Li, Yong Zhao, Yifan Gong
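The conditional target selection described above (use the teacher's soft posterior when it agrees with the ground truth, fall back to the hard label otherwise) can be sketched per training example. Interpreting "matches" as argmax agreement is an assumption; the abstract does not define the matching criterion.

```python
def pick_target(teacher_posterior, ground_truth):
    """Return the training target for the student: the teacher's posterior when
    its argmax agrees with the ground-truth class, else a one-hot label."""
    predicted = max(range(len(teacher_posterior)), key=teacher_posterior.__getitem__)
    if predicted == ground_truth:
        return teacher_posterior              # soft target from the teacher
    one_hot = [0.0] * len(teacher_posterior)
    one_hot[ground_truth] = 1.0
    return one_hot                            # hard target from the label
```

When the teacher is right, the student inherits the teacher's full distribution (and its inter-class similarity information); when the teacher is wrong, the student is steered by the label instead of the teacher's error.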
-
Patent number: 11581004
Abstract: Systems and methods for dynamic voice accentuation and reinforcement are presented herein. One embodiment comprises one or more audio input sources; one or more audio output sources; one or more band pass filters; and a processing control unit that includes an audio processing unit, and which executes a method: differentiating between audio input sources as vocal sound audio input sources and ambient noise audio input sources; increasing the gain of the vocal sound audio input sources; inverting a polarity of an ambient noise signal received by each of the ambient noise audio input sources; and adding the inverted polarity to either an output signal of at least one of the one or more audio output sources, or to an input signal of at least one of the vocal sound audio input sources, to reduce ambient noise.
Type: Grant
Filed: November 24, 2021
Date of Patent: February 14, 2023
Inventors: Richard Pivnicka, Michael Klasco
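The core signal path (add the polarity-inverted ambient signal to the vocal input, then boost the gain) amounts to per-sample arithmetic. This sketch assumes the vocal microphone captures voice plus the same ambient noise the noise microphone captures, which is an idealization; the gain value is illustrative.

```python
def accentuate(vocal_in, noise_in, gain=2.0):
    """Add the inverted polarity of the ambient signal to the vocal input
    (i.e., subtract it), then apply gain to the cleaned vocal signal."""
    return [gain * (v + (-n)) for v, n in zip(vocal_in, noise_in)]

# vocal mic hears voice [0.5, -0.25] plus ambient [0.25, 0.5]
cleaned = accentuate([0.75, 0.25], [0.25, 0.5])
```

With perfect noise capture the ambient component cancels exactly and only the boosted voice remains; in practice the band-pass filters and imperfect correlation between microphones make the cancellation partial.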
-
Patent number: 11568150
Abstract: Methods and apparatus for automated processing of natural language text is described. The text can be preprocessed to produce language-space data that includes descriptive data elements for words. Source code that includes linguistic expressions, and that may be written in a programming language that is user-friendly to linguists, can be compiled to produce finite-state transducers and bi-machine transducers that may be applied directly to the language-space data by a language-processing virtual machine. The language-processing virtual machine can select and execute code segments identified in the finite-state and/or bi-machine transducers to disambiguate meanings of words in the text.
Type: Grant
Filed: July 7, 2020
Date of Patent: January 31, 2023
Assignee: CLRV Technologies, LLC
Inventor: Emmanuel Roche
-
Patent number: 11562152
Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-storage media, for re-translation for simultaneous, spoken-language machine translation. In some implementations, a stream of audio data comprising speech in a first language is received. A transcription for the speech in the stream of audio data is generated using an automated speech recognizer through a series of updates. A translation of the transcription into a second language is generated using a machine translation module. The translation is generated with translation iterations that translate increasing amounts of the transcription, including re-translating previously translated portions of the transcription. A series of translation updates are provided to a client device based on the translation iterations.
Type: Grant
Filed: September 23, 2020
Date of Patent: January 24, 2023
Assignee: Google LLC
Inventors: Naveen Arivazhagan, Colin Andrew Cherry, Wolfgang Macherey, Te I, George Foster, Pallavi N Baljekar
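The re-translation loop (re-translate the whole growing transcript on every ASR update, so earlier output can be revised) is simple to sketch. The word-for-word glossary "translator" below is a toy stand-in for the machine translation module.

```python
def retranslate_stream(transcript_updates, translate):
    """For each ASR transcript update, re-translate the full transcript so far,
    emitting a complete translation that may revise earlier output."""
    return [translate(partial) for partial in transcript_updates]

# toy word-for-word "translator" standing in for the machine translation module
GLOSSARY = {"hello": "hallo", "there": "da", "world": "welt"}
def toy_translate(text):
    return " ".join(GLOSSARY.get(w, w) for w in text.split())

# the ASR first hears "hello there", then revises its transcript to "hello world"
updates = retranslate_stream(["hello", "hello there", "hello world"], toy_translate)
```

Note how the second update's translation ("hallo da") is superseded when the transcript is revised: re-translation trades some output flicker for the freedom to fix earlier mistakes, which prefix-stable streaming translation cannot do.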
-
Patent number: 11557283
Abstract: An artificial intelligence (AI) system is disclosed. The AI system includes a processor that processes a sequence of input frames with a neural network including a dilated self-attention module trained to compute a sequence of outputs by transforming each input frame into a corresponding query frame, a corresponding key frame, and a corresponding value frame leading to a sequence of key frames, a sequence of value frames, and a sequence of query frames of same ordering and by performing attention calculations for each query frame with respect to a combination of a portion of the sequences of key and value frames restricted based on a location of the query frame and a dilation sequence of the key frames and a dilation sequence of value frames extracted by processing different frames of the sequences of key and value frames with a predetermined extraction function. Further, the processor renders the sequence of outputs.
Type: Grant
Filed: March 26, 2021
Date of Patent: January 17, 2023
Assignee: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Niko Moritz, Takaaki Hori, Jonathan Le Roux
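One way to picture the attention pattern described above is as an index set per query: a local window restricted around the query's location, plus a dilated subsampling of the full key/value sequence. This is a rough reading of the abstract; the window size, dilation factor, and simple strided subsampling (in place of the patent's "predetermined extraction function") are all assumptions.

```python
def attended_indices(q, n, window=4, dilation=3):
    """Key/value frame indices a query at position q attends to: a local window
    around q (location-restricted portion) plus every dilation-th frame of the
    whole sequence (dilation sequence)."""
    local = set(range(max(0, q - window), min(n, q + window + 1)))
    dilated = set(range(0, n, dilation))
    return sorted(local | dilated)
```

Compared with full self-attention over all n frames, each query touches only O(window + n/dilation) key/value frames while retaining coarse access to the entire sequence.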
-
Patent number: 11557279
Abstract: A device for acoustic monitoring of a monitoring area includes first and second sensor systems which have first and second acoustic sensors, processors, and transmitters, respectively, and which may be mounted at different locations of the monitoring area. The first and second processors may be configured to classify first and second audio signals detected by the first and second acoustic sensors so as to obtain first and second classification results, respectively. The first and second transmitters may be configured to transmit the first and second classification results to a central evaluator, respectively. In addition, the device may include the central evaluator, which may be configured to receive the first classification result and to receive the second classification result, and to generate a monitoring output for the monitoring area as a function of the first classification result and the second classification result.Type: Grant
Filed: October 28, 2020
Date of Patent: January 17, 2023
Assignee: FRAUNHOFER-GESELLSCHAFT ZUR FÖRDERUNG DER ANGEWANDTEN FORSCHUNG E.V.
Inventors: Jakob Abesser, Hanna Lukashevich, Steffen Holly, Yvette Körber, Reto Ruch
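The central evaluator's job (produce one monitoring output as a function of the per-sensor classification results) can be sketched as a confidence-weighted vote. The abstract does not specify the fusion rule, so summing confidences per label is an illustrative choice, as are the label names.

```python
def fuse(results):
    """results: list of (label, confidence) pairs received from the distributed
    sensor systems. Return the label with the highest summed confidence."""
    totals = {}
    for label, conf in results:
        totals[label] = totals.get(label, 0.0) + conf
    return max(totals, key=totals.get)

# two sensors agree on "machine_on", one dissents with low confidence
output = fuse([("machine_on", 0.9), ("machine_off", 0.4), ("machine_on", 0.7)])
```

Fusing classification results rather than raw audio is what lets each sensor system transmit only a compact label/confidence pair instead of a full audio stream.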
-
Patent number: 11551672
Abstract: A method for generating an acoustic model is disclosed. The method can generate the acoustic model with high accuracy through learning data including various dialects by training the acoustic model using text data, to which regional information is tagged, and changing a parameter of the acoustic model based on the tagged regional information. The acoustic model can be associated with an artificial intelligence module, an unmanned aerial vehicle (UAV), a robot, an augmented reality (AR) device, a virtual reality (VR) device, devices related to 5G services, and the like.
Type: Grant
Filed: October 1, 2020
Date of Patent: January 10, 2023
Assignee: LG ELECTRONICS INC.
Inventors: Jeehye Lee, Seonyeong Park
-
Patent number: 11526674
Abstract: A text extracting system includes at least one processor configured to obtain a plurality of texts, specify at least one characteristic expression included in the plurality of texts, and extract, based on the at least one characteristic expression, at least one text to be entered into a question sentence generator from the plurality of texts, where the question sentence generator generates a question sentence from an input sentence.
Type: Grant
Filed: March 1, 2019
Date of Patent: December 13, 2022
Assignee: RAKUTEN GROUP, INC.
Inventors: Masakatsu Hamashita, Takashi Inui, Koji Murakami
-
Patent number: 11521603
Abstract: A conference minutes generation method is provided, which relates to the technical field of natural language processing. The conference minutes generation method comprises: acquiring a text conference record; dividing the text conference record into a plurality of conference paragraphs, generating a conference paragraph summary for each conference paragraph, and generating a conference record summary based on the conference paragraph summary of each conference paragraph; extracting conference instructions based on the text conference record; and generating the conference minutes based on the conference record summary and the conference instructions.
Type: Grant
Filed: October 1, 2020
Date of Patent: December 6, 2022
Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
Inventors: Ke Sun, Ying Liu, Kai Liu, Lei Han, Chao Wang, Yingzhuo Song, Shuai Gao, Liyan Yang, Qianqian Wang, Jing Liu, Di Wei
-
Patent number: 11521599
Abstract: A system and method performs wakeword detection using a feedforward neural network model. A first output of the model indicates when the wakeword appears on a right side of a first window of input audio data. A second output of the model indicates when the wakeword appears in the center of a second window of input audio data. A third output of the model indicates when the wakeword appears on a left side of a third window of input audio data. Using these outputs, the system and method determine a beginpoint and endpoint of the wakeword.
Type: Grant
Filed: September 20, 2019
Date of Patent: December 6, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Christin Jose, Yuriy Mishchenko, Anish N. Shah, Alex Escott, Parind Shah, Shiv Naga Prasad Vitaladevuni, Thibaud Senechal
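One plausible reading of how the edge-aligned outputs yield endpoints: the "right side" output fires when the sliding window's trailing edge reaches the wakeword's end, and the "left side" output fires once the window has slid a full window length past the wakeword's start. This geometric interpretation, and the timing convention that an output's firing time is the window's end position, are assumptions for illustration only.

```python
def wakeword_span(right_fire_time, left_fire_time, window_len):
    """Estimate wakeword begin/end points from the firing times of the
    'right side' and 'left side' detector outputs (times mark window ends)."""
    endpoint = right_fire_time               # wakeword ends at the window's trailing edge
    beginpoint = left_fire_time - window_len # wakeword began one window length earlier
    return beginpoint, endpoint
```

With a 0.5 s window, a right-side firing at t = 1.25 s and a left-side firing at t = 1.5 s place the wakeword between 1.0 s and 1.25 s.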
-
Patent number: 11508364
Abstract: An artificial intelligence (AI) system is provided. The AI system simulates functions of the human brain, such as recognition and judgment, by utilizing a machine learning algorithm such as deep learning, and applications of the AI system. A method, performed by an electronic device, of outputting a response to a speech input by using an application, includes receiving the speech input, obtaining text corresponding to the speech input by performing speech recognition on the speech input, obtaining metadata for the speech input based on the obtained text, selecting at least one application from among a plurality of applications for outputting the response to the speech input based on the metadata, and outputting the response to the speech input by using the selected at least one application.
Type: Grant
Filed: May 21, 2019
Date of Patent: November 22, 2022
Assignee: Samsung Electronics Co., Ltd.
Inventors: Cheenepalli Srirama Krishna Bhargava, Ankush Gupta
-
Patent number: 11501764
Abstract: Methods, systems, and related products for voice-enabled computer systems are described. A machine-learning model is trained to produce pronunciation output based on text input. The trained machine-learning model is used to produce pronunciation data for text input even where the text input includes numbers, punctuation, emoji, or other non-letter characters. The machine-learning model is further trained based on real-world data from users to improve pronunciation output.
Type: Grant
Filed: May 10, 2019
Date of Patent: November 15, 2022
Assignee: Spotify AB
Inventor: Daniel Bromand
-
Patent number: 11495216
Abstract: The disclosure includes using dilation of speech content from an interlaced audio input for speech recognition. A learning model is initiated to determine dilation parameters for each of a plurality of audible sounds of speech content from a plurality of speakers received at a computer as an audio input. As part of the learning model, a change of each of a plurality of independent sounds is determined in response to an audio stimulus, the independent sounds being derived from the audio input. The disclosure applies the dilation parameters, respectively, based on the change of each of the independent sounds. A voice print is constructed for each of the speakers based on the independent sounds and the dilation parameters, respectively. Speech content is attributed to each of the plurality of speakers based at least in part on the voice print, respectively, and the independent sounds.
Type: Grant
Filed: September 9, 2020
Date of Patent: November 8, 2022
Assignee: International Business Machines Corporation
Inventors: Aaron K. Baughman, Corey B. Shelton, Stephen C. Hammer, Shikhar Kwatra