Patents Examined by Edward Tracy, Jr.
  • Patent number: 11978476
    Abstract: A system and method for detecting anomalous sound are disclosed. The method includes receiving a spectrogram of an audio signal with elements defined by values in a time-frequency domain of the spectrogram. Each of the values corresponds to an element of the spectrogram that is identified by a coordinate in the time-frequency domain. The time-frequency domain of the spectrogram is partitioned into a context region and a target region. The context region and the target region are processed by a neural network using an attentive neural process to recover values of the spectrogram for elements with coordinates in the target region. The recovered values of the elements of the target region are compared with values of elements of the partitioned target region. An anomaly score is determined based on the comparison. The anomaly score is used for performing a control action.
    Type: Grant
    Filed: September 19, 2021
    Date of Patent: May 7, 2024
    Assignee: Mitsubishi Electric Research Laboratories, Inc.
    Inventors: Gordon Wichern, Ankush Chakrabarty, Zhong-Qiu Wang, Jonathan Le Roux
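The mask-recover-compare scoring described in this abstract can be sketched as follows. This is a minimal illustration only: the toy row-mean predictor stands in for the patented attentive neural process, and all function names are invented for this sketch.

```python
import numpy as np

def row_mean_predictor(context, target_mask):
    """Toy stand-in for the attentive neural process: predict each
    masked bin as the mean of the unmasked bins in its frequency row."""
    counts = (~target_mask).sum(axis=1, keepdims=True)
    row_mean = context.sum(axis=1, keepdims=True) / np.maximum(counts, 1)
    return np.broadcast_to(row_mean, context.shape).copy()

def anomaly_score(spec, target_mask, predict_fn):
    """Partition the spectrogram into a context (visible) region and a
    target (hidden) region, recover the target from the context, and
    score the signal by the reconstruction error on the target region."""
    context = np.where(target_mask, 0.0, spec)   # hide the target values
    recovered = predict_fn(context, target_mask)
    return float(((recovered - spec)[target_mask] ** 2).mean())
```

An anomalous sound leaves values in the target region that the context cannot explain, so its reconstruction error, and hence its score, is larger.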
  • Patent number: 11966700
    Abstract: Embodiments of the described technologies are capable of reading a text sequence that includes at least one word; extracting model input data from the text sequence, where the model input data includes, for each word of the text sequence, segment data and non-segment data; using a first machine learning model and at least one second machine learning model, generating, for each word of the text sequence, a multi-level feature set; outputting, by a third machine learning model, in response to input to the third machine learning model of the multi-level feature set, a tagged version of the text sequence; and executing a search based at least in part on the tagged version of the text sequence.
    Type: Grant
    Filed: March 5, 2021
    Date of Patent: April 23, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Yuwei Qiu, Gonzalo Aniano Porcile, Yu Gan, Qin Iris Wang, Haichao Wei, Huiji Gao
  • Patent number: 11966964
    Abstract: A system including one or more processors and one or more non-transitory computer-readable media storing computing instructions configured to run on the one or more processors and perform receiving a voice command from a user; transforming the voice command, using a natural language understanding and rules execution engine, into (a) an intent of the user to add recipe ingredients to a cart and (b) a recipe descriptor; determining a matching recipe from a set of ingested recipes based on the recipe descriptor; determining items and quantities associated with the items that correspond to a set of ingredients included in the matching recipe using a quantity inference algorithm; and automatically adding all of the items and the quantities associated with the items to the cart. Other embodiments are disclosed.
    Type: Grant
    Filed: January 31, 2020
    Date of Patent: April 23, 2024
    Assignee: WALMART APOLLO, LLC
    Inventors: Snehasish Mukherjee, Deepa Mohan, Haoxuan Chen, Phani Ram Sayapaneni, Ghodratollah Aalipour Hafshejani, Shankara Bhargava Subramanya
  • Patent number: 11935518
    Abstract: A joint works production method of a joint works production server using collective intelligence includes receiving a subject of joint works from participants of the joint works production, receiving preference information on the received subject from other participants, determining whether to adopt the subject of the joint works according to the received preference information, and classifying, when the subject of the joint works is adopted, the adopted subject of the joint works by subjects and storing the classified subject.
    Type: Grant
    Filed: March 15, 2021
    Date of Patent: March 19, 2024
    Inventor: Bang Hyeon Kim
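The adoption decision in this abstract, whether a proposed subject is accepted based on other participants' preference information, can be sketched as a simple vote threshold. The threshold value and function name are illustrative assumptions, not details from the patent.

```python
def adopt_subject(preferences, threshold=0.5):
    """Decide whether to adopt a proposed subject of the joint works
    from participants' preference votes (truthy = in favor). The
    subject is adopted when the share of favorable votes meets the
    threshold; 0.5 here is an arbitrary illustrative choice."""
    favorable = sum(1 for p in preferences if p)
    return favorable / len(preferences) >= threshold
```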
  • Patent number: 11935551
    Abstract: The present invention relates to audio coding systems which make use of a harmonic transposition method for high frequency reconstruction (HFR). A system and a method for generating a high frequency component of a signal from a low frequency component of the signal is described. The system comprises an analysis filter bank providing a plurality of analysis subband signals of the low frequency component of the signal. It also comprises a non-linear processing unit to generate a synthesis subband signal with a synthesis frequency by modifying the phase of a first and a second of the plurality of analysis subband signals and by combining the phase-modified analysis subband signals. Finally, it comprises a synthesis filter bank for generating the high frequency component of the signal from the synthesis subband signal.
    Type: Grant
    Filed: May 3, 2023
    Date of Patent: March 19, 2024
    Assignee: DOLBY INTERNATIONAL AB
    Inventors: Lars Villemoes, Per Hedelin
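The core step in this abstract, generating a synthesis subband by phase-modifying two analysis subbands and combining them, can be sketched as a cross-term transposition. This is a hedged toy version: the phase factors and geometric-mean magnitude rule are one plausible instantiation, not the patented filter-bank design.

```python
import numpy as np

def synth_subband(a1, a2, t1, t2):
    """Toy cross-term harmonic transposition: scale the phases of two
    complex analysis subband signals a1, a2 by integer factors t1, t2,
    combine the scaled phases additively, and take a weighted
    geometric mean of the magnitudes (an illustrative choice)."""
    phase = t1 * np.angle(a1) + t2 * np.angle(a2)
    mag = np.abs(a1) ** (t1 / (t1 + t2)) * np.abs(a2) ** (t2 / (t1 + t2))
    return mag * np.exp(1j * phase)
```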
  • Patent number: 11929060
    Abstract: A method for training a speech recognition model includes receiving a set of training utterance pairs each including a non-synthetic speech representation and a synthetic speech representation of a same corresponding utterance. At each of a plurality of output steps for each training utterance pair in the set of training utterance pairs, the method also includes determining a consistent loss term for the corresponding training utterance pair based on a first probability distribution over possible non-synthetic speech recognition hypotheses generated for the corresponding non-synthetic speech representation and a second probability distribution over possible synthetic speech recognition hypotheses generated for the corresponding synthetic speech representation. The first and second probability distributions are generated for output by the speech recognition model.
    Type: Grant
    Filed: February 8, 2021
    Date of Patent: March 12, 2024
    Assignee: Google LLC
    Inventors: Zhehuai Chen, Andrew Rosenberg, Bhuvana Ramabhadran, Pedro Jose Moreno Mengibar
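The consistent loss term described above penalizes disagreement between the recognizer's output distributions for the real and synthesized versions of the same utterance. A symmetric KL divergence is one common way to instantiate such a term; the patent does not commit to this exact formula, so treat the sketch below as illustrative.

```python
import numpy as np

def consistency_loss(p_real, p_synth, eps=1e-9):
    """Symmetric KL divergence between the probability distribution
    over hypotheses for the non-synthetic speech (p_real) and for the
    synthetic speech (p_synth) of the same utterance. A small eps
    keeps the logarithms finite."""
    p = np.asarray(p_real) + eps
    q = np.asarray(p_synth) + eps
    kl_pq = np.sum(p * np.log(p / q))
    kl_qp = np.sum(q * np.log(q / p))
    return float(kl_pq + kl_qp)
```

Minimizing this term pushes the model to recognize synthetic and real speech the same way, which is what lets synthesized data augment real training data.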
  • Patent number: 11914955
    Abstract: A computer implemented method is described for conducting text sequence machine learning, the method comprising: receiving an input sequence x = [x1, x2, …, xn] and encoding it with an encoder machine learning data architecture to produce a feature vector for a series of hidden states hx = [h1, h2, …, hn], wherein the feature vector for the series of hidden states hx is generated by performing pooling over a temporal dimension of all hidden states output by the encoder machine learning data architecture; and extracting from the series of hidden states hx a mean and a variance parameter, and encapsulating the mean and the variance parameter as an approximate posterior data structure.
    Type: Grant
    Filed: May 21, 2020
    Date of Patent: February 27, 2024
    Assignee: ROYAL BANK OF CANADA
    Inventors: Teng Long, Yanshuai Cao, Jackie C. K. Cheung
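The pool-then-parameterize step in this abstract can be sketched directly: mean-pool the hidden states over time, then project the pooled vector to the mean and variance of a Gaussian approximate posterior. The projection matrices `w_mu` and `w_logvar` are illustrative stand-ins for learned parameters, not details from the patent.

```python
import numpy as np

def approximate_posterior(hidden_states, w_mu, w_logvar):
    """Mean-pool hidden states hx (shape: time x dim) over the
    temporal axis, then project the pooled feature to the mean and
    variance of a Gaussian approximate posterior. Exponentiating the
    log-variance projection keeps the variance positive."""
    h = hidden_states.mean(axis=0)           # pooling over time
    return h @ w_mu, np.exp(h @ w_logvar)    # mean, variance
```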
  • Patent number: 11908461
    Abstract: A method of performing speech recognition using a two-pass deliberation architecture includes receiving a first-pass hypothesis and an encoded acoustic frame and encoding the first-pass hypothesis at a hypothesis encoder. The first-pass hypothesis is generated by a recurrent neural network (RNN) decoder model for the encoded acoustic frame. The method also includes generating, using a first attention mechanism attending to the encoded acoustic frame, a first context vector, and generating, using a second attention mechanism attending to the encoded first-pass hypothesis, a second context vector. The method also includes decoding the first context vector and the second context vector at a context vector decoder to form a second-pass hypothesis.
    Type: Grant
    Filed: January 14, 2021
    Date of Patent: February 20, 2024
    Assignee: Google LLC
    Inventors: Ke Hu, Tara N. Sainath, Ruoming Pang, Rohit Prakash Prabhavalkar
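One decode step of the two-pass deliberation architecture above can be sketched with plain dot-product attention: one attention mechanism attends over the encoded acoustic frames, the other over the encoded first-pass hypothesis, and the two context vectors are passed together to the second-pass decoder. Dot-product attention and the concatenation are illustrative simplifications of the patented design.

```python
import numpy as np

def attend(query, keys, values):
    """Dot-product attention: softmax-weighted sum of values,
    producing one context vector for the query."""
    scores = keys @ query
    w = np.exp(scores - scores.max())
    w /= w.sum()
    return w @ values

def deliberation_step(dec_state, acoustic_enc, hyp_enc):
    """Attend separately over the encoded acoustic frames and the
    encoded first-pass hypothesis, then hand both context vectors to
    the context vector decoder (here: just concatenate them)."""
    c_acoustic = attend(dec_state, acoustic_enc, acoustic_enc)
    c_hyp = attend(dec_state, hyp_enc, hyp_enc)
    return np.concatenate([c_acoustic, c_hyp])
```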
  • Patent number: 11908458
    Abstract: A computer-implemented method for customizing a recurrent neural network transducer (RNN-T) is provided. The computer implemented method includes synthesizing first domain audio data from first domain text data, and feeding the synthesized first domain audio data into a trained encoder of the recurrent neural network transducer (RNN-T) having an initial condition, wherein the encoder is updated using the synthesized first domain audio data and the first domain text data. The computer implemented method further includes synthesizing second domain audio data from second domain text data, and feeding the synthesized second domain audio data into the updated encoder of the recurrent neural network transducer (RNN-T), wherein a prediction network of the RNN-T is updated using the synthesized second domain audio data and the second domain text data. The computer implemented method further includes restoring the updated encoder to the initial condition.
    Type: Grant
    Filed: December 29, 2020
    Date of Patent: February 20, 2024
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Gakuto Kurata, George Andrei Saon, Brian E. D. Kingsbury
  • Patent number: 11908457
    Abstract: A method for operating a neural network includes receiving an input sequence at an encoder. The input sequence is encoded to produce a set of hidden representations. Attention-heads of the neural network calculate attention weights based on the hidden representations. A context vector is calculated for each attention-head based on the attention weights and the hidden representations. Each of the context vectors correspond to a portion of the input sequence. An inference is output based on the context vectors.
    Type: Grant
    Filed: July 3, 2020
    Date of Patent: February 20, 2024
    Assignee: QUALCOMM Incorporated
    Inventors: Mingu Lee, Jinkyu Lee, Hye Jin Jang, Kyu Woong Hwang
  • Patent number: 11886233
    Abstract: The present invention relates to a context-based QA generation architecture, and an object of the present invention is to generate diverse QA pairs from a single context. To achieve the object, the present invention includes a latent variable generating network, comprising at least one encoder and an artificial neural network (a multi-layer perceptron, MLP), configured to train the artificial neural network using a first context, a first question, and a first answer, and to generate a second question latent variable and a second answer latent variable by applying the trained artificial neural network to a second context; an answer generating network configured to generate a second answer by decoding the second answer latent variable; and a question generating network configured to generate a second question based on the second context and the second answer.
    Type: Grant
    Filed: November 12, 2020
    Date of Patent: January 30, 2024
    Inventors: Dong Hwan Kim, Sung Ju Hwang, Seanie Lee, Dong Bok Lee, Woo Tae Jeong, Han Su Kim, You Kyung Kwon, Hyun Ok Kim
  • Patent number: 11881210
    Abstract: A method for generating a prosodic representation includes receiving a text utterance having one or more words. Each word has at least one syllable having at least one phoneme. The method also includes generating, using a Bidirectional Encoder Representations from Transformers (BERT) model, a sequence of wordpiece embeddings and selecting an utterance embedding for the text utterance, the utterance embedding representing an intended prosody. Each wordpiece embedding is associated with one of the one or more words of the text utterance. For each syllable, using the selected utterance embedding and a prosody model that incorporates the BERT model, the method also includes generating a corresponding prosodic syllable embedding for the syllable based on the wordpiece embedding associated with the word that includes the syllable and predicting a duration of the syllable by encoding linguistic features of each phoneme of the syllable with the corresponding prosodic syllable embedding for the syllable.
    Type: Grant
    Filed: May 5, 2020
    Date of Patent: January 23, 2024
    Assignee: Google LLC
    Inventors: Tom Marius Kenter, Manish Kumar Sharma, Robert Andrew James Clark, Aliaksei Severyn
  • Patent number: 11875128
    Abstract: Methods and systems for training an intent classifier. For example, a question-intent tuple dataset comprising data samples is received. Each data sample has a question, an intent, and a task. A pre-trained language model is also received and fine-tuned by adjusting values of learnable parameters. Parameter adjustment is performed by generating a plurality of neural network models. Each neural network model is trained to predict at least one intent of the respective question having a same task value of the tasks of the question-intent tuple dataset. Each task represents a source of the question and the respective intent. The fine-tuned language model generates embeddings for training input data, the training input data comprising a plurality of data samples having questions and intents. Further, feature vectors for the data samples of the training input data are generated and used to train an intent classification model for predicting intents.
    Type: Grant
    Filed: June 28, 2021
    Date of Patent: January 16, 2024
    Assignee: Ada Support Inc.
    Inventors: Raheleh Makki Niri, Gordon Gibson
  • Patent number: 11875131
    Abstract: Providing a predictive model for a target language by determining, by one or more computer processors, an instance weight for a labeled source language textual unit according to a set of unlabeled target language textual units; scaling, by the one or more computer processors, an error between a predicted label for the source language textual unit and a ground-truth label for the source language textual unit according to the instance weight; updating, by the one or more computer processors, network parameters of a predictive neural network model for the target language according to the error; and providing, by the one or more computer processors, the predictive neural network model for the target language to a user.
    Type: Grant
    Filed: September 16, 2020
    Date of Patent: January 16, 2024
    Assignee: International Business Machines Corporation
    Inventors: Zihui Li, Yunyao Li, Prithviraj Sen, Huaiyu Zhu
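The instance-weighted error scaling in this abstract amounts to multiplying each source-language example's loss by its weight before the update, so examples that resemble the target language contribute more. A minimal sketch, with a per-example negative log-likelihood as the illustrative error function:

```python
import math

def weighted_nll(probs, label, weight):
    """Negative log-likelihood of one labeled source-language example,
    scaled by its instance weight. Gradients of this scaled loss drive
    the parameter update, so weight = 0 removes the example entirely."""
    return -weight * math.log(probs[label])
```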
  • Patent number: 11875253
    Abstract: Methods, systems, and computer program products for low-resource entity resolution with transfer learning are provided herein. A computer-implemented method includes processing input data via a first entity resolution model, wherein the input data comprise labeled input data and unlabeled input data; identifying one or more portions of the unlabeled input data to be used in training a neural network entity resolution model, wherein said identifying comprises applying one or more active learning algorithms to the first entity resolution model; training, using (i) the one or more portions of the unlabeled input data and (ii) one or more deep learning techniques, the neural network entity resolution model; and performing one or more entity resolution tasks by applying the trained neural network entity resolution model to one or more datasets.
    Type: Grant
    Filed: June 17, 2019
    Date of Patent: January 16, 2024
    Assignee: International Business Machines Corporation
    Inventors: Jungo Kasai, Kun Qian, Sairam Gurajada, Yunyao Li, Lucian Popa
  • Patent number: 11854564
    Abstract: A device capable of autonomous motion may move in an environment and may receive audio data from a microphone. A model may be trained to process the audio data to suppress noise from the audio data. The model may include an encoder that includes one or more convolutional layers, one or more recurrent layers, and a decoder that includes one or more convolutional layers.
    Type: Grant
    Filed: June 16, 2020
    Date of Patent: December 26, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Navin Chatlani, Amit Singh Chhetri
  • Patent number: 11854554
    Abstract: Presented are a combined learning method and device using a transformed loss function and feature enhancement based on a deep neural network for speaker recognition that is robust to a noisy environment.
    Type: Grant
    Filed: March 30, 2020
    Date of Patent: December 26, 2023
    Assignee: IUCF-HYU (INDUSTRY-UNIVERSITY COOPERATION FOUNDATION HANYANG UNIVERSITY)
    Inventors: Joon-Hyuk Chang, Joonyoung Yang
  • Patent number: 11837252
    Abstract: The present invention discloses a speech emotion recognition method and system based on fused population information. The method includes the following steps: S1: acquiring a user's audio data; S2: preprocessing the audio data, and obtaining a Mel spectrogram feature; S3: cutting off a front mute segment and a rear mute segment of the Mel spectrogram feature; S4: obtaining population depth feature information through a population classification network; S5: obtaining Mel spectrogram depth feature information through a Mel spectrogram preprocessing network; S6: fusing the population depth feature information and the Mel spectrogram depth feature information through SENet to obtain fused information; and S7: obtaining an emotion recognition result from the fused information through a classification network.
    Type: Grant
    Filed: June 21, 2022
    Date of Patent: December 5, 2023
    Assignee: Zhejiang Lab
    Inventors: Taihao Li, Shukai Zheng, Yulong Liu, Guanxiong Pei, Shijie Ma
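Step S6 above, fusing population and Mel-spectrogram depth features through SENet, can be sketched as a squeeze-and-excitation gate over the concatenated features. The bottleneck weights `w1`, `w2` are illustrative stand-ins for learned parameters, and this single-vector version omits the convolutional context of the real network.

```python
import numpy as np

def se_fuse(pop_feat, mel_feat, w1, w2):
    """Squeeze-and-excitation style fusion: concatenate the population
    and Mel-spectrogram feature vectors, pass them through a small
    bottleneck (ReLU then sigmoid), and use the resulting per-channel
    weights to rescale the fused feature."""
    f = np.concatenate([pop_feat, mel_feat])
    s = np.maximum(f @ w1, 0.0)                # bottleneck + ReLU
    gate = 1.0 / (1.0 + np.exp(-(s @ w2)))     # sigmoid channel weights
    return f * gate
```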
  • Patent number: 11804214
    Abstract: A system for generating compressed product titles that can be used in conversational transactions includes a computing device configured to obtain product title data characterizing descriptive product titles of products available on an ecommerce marketplace and to determine compressed product titles based on the product title data using a machine learning model that is pre-trained using a replaced-token detection task. The computing device also stores the compressed product titles for use during conversational transactions.
    Type: Grant
    Filed: February 26, 2021
    Date of Patent: October 31, 2023
    Assignee: Walmart Apollo, LLC
    Inventors: Snehasish Mukherjee, Phani Ram Sayapaneni, Shankara Bhargava
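The replaced-token detection pretraining task mentioned above trains a discriminator to flag, token by token, which positions of a corrupted sequence were substituted. The label construction can be sketched in a few lines; the example tokens are invented for illustration.

```python
def replaced_token_labels(original, corrupted):
    """Per-token targets for replaced-token detection: 1 where the
    corrupted sequence differs from the original, 0 elsewhere. A
    discriminator is then trained to predict these labels."""
    return [int(o != c) for o, c in zip(original, corrupted)]
```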
  • Patent number: 11804233
    Abstract: A device includes one or more processors configured to perform signal processing including a linear transformation and a non-linear transformation of an input signal to generate a reference target signal. The reference target signal has a linear component associated with the linear transformation and a non-linear component associated with the non-linear transformation. The one or more processors are also configured to perform linear filtering of the input signal by controlling adaptation of the linear filtering to generate an output signal that substantially matches the linear component of the reference target signal.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: October 31, 2023
    Assignee: QUALCOMM Incorporated
    Inventors: Lae-Hoon Kim, Dongmei Wang, Cheng-Yu Hung, Erik Visser
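The controlled adaptation described in this abstract, a linear filter adapted so its output matches the linear component of a reference target signal, is in the family of normalized-LMS adaptive filtering. The sketch below shows one NLMS update as an illustrative example; the patent's actual adaptation-control scheme is not specified here.

```python
import numpy as np

def nlms_step(w, x, d, mu=0.5, eps=1e-8):
    """One normalized-LMS update: nudge the linear filter w so that
    its output y = w @ x tracks the desired sample d (here, the
    linear component of the reference target signal). Returns the
    updated filter and the residual error."""
    e = d - w @ x
    return w + mu * e * x / (x @ x + eps), e
```

Iterating this step on a noiseless linear reference drives the filter toward the true linear response.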