Patents Examined by Nandini Subramani
-
Patent number: 11984135
Abstract: A system and method for offline embedded abnormal sound fault detection are disclosed, the system comprising a sound acquisition module, a sound audio feature extraction module, and a neural network module. The sound audio feature extraction module uses a fast Fourier transform to process sample data in the frequency domain, and then inputs the sample data to the neural network module to complete anomaly classification. The neural network module comprises at least one CNN feature extraction layer, a long short-term memory (LSTM) layer, at least one fully connected layer, at least one classification layer, and a trigger decision layer. The number of network layers of the at least one CNN feature extraction layer is dynamically adjustable, the network structure of the at least one fully connected layer and the at least one classification layer is dynamically variable, and the trigger decision layer is configured to eliminate generalization errors generated by the neural network.
Type: Grant
Filed: March 2, 2021
Date of Patent: May 14, 2024
Assignee: Espressif Systems (Shanghai) Co., Ltd.
Inventor: Wangwang Wang
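As a rough illustration of the two stages this abstract separates — a frequency-domain front end and a trigger decision layer that suppresses isolated misclassifications — here is a minimal pure-Python sketch. The naive DFT, the score threshold, and the consecutive-frame debouncing rule are illustrative assumptions, not details from the patent:

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Naive DFT magnitude spectrum (a stand-in for the FFT front end)."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n)
                    for t in range(n))) for k in range(n // 2)]

def trigger_decision(frame_scores, threshold=0.8, min_consecutive=3):
    """Raise a fault only after several consecutive anomalous frames,
    filtering out isolated misclassifications from the network."""
    run = 0
    for score in frame_scores:
        run = run + 1 if score > threshold else 0
        if run >= min_consecutive:
            return True
    return False

scores = [0.2, 0.9, 0.95, 0.91, 0.3]
print(trigger_decision(scores))  # three consecutive frames above 0.8 -> True
```

A real deployment would feed the spectrum into the CNN/LSTM stack; here the per-frame anomaly scores are simply given.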
-
Patent number: 11978433
Abstract: An end-to-end automatic speech recognition (ASR) system includes: a first encoder configured for close-talk input captured by a close-talk input mechanism; a second encoder configured for far-talk input captured by a far-talk input mechanism; and an encoder selection layer configured to select at least one of the first and second encoders for use in producing ASR output. The selection is made based on at least one of a short-time Fourier transform (STFT), Mel-frequency cepstral coefficients (MFCC), and a filter bank representation derived from at least one of the close-talk input and the far-talk input. If signals from both the close-talk input mechanism and the far-talk input mechanism are present for a speech segment, the encoder selection layer dynamically selects the encoder that better recognizes the speech segment. An encoder-decoder model is used to produce the ASR output.
Type: Grant
Filed: June 22, 2021
Date of Patent: May 7, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Felix Weninger, Marco Gaudesi, Ralf Leibold, Puming Zhan
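The selection logic — fall back to whichever input is present, otherwise pick the encoder the selection layer scores higher — can be sketched as follows. The `score_fn` argument and the mean-energy toy score stand in for the trained selection layer and are assumptions for illustration:

```python
def select_encoder(close_feats, far_feats, score_fn):
    """Pick the encoder whose input the selection layer scores higher;
    fall back when only one input mechanism produced a signal."""
    if close_feats is None:
        return "far"
    if far_feats is None:
        return "close"
    return "close" if score_fn(close_feats) >= score_fn(far_feats) else "far"

# Toy score: mean feature energy as a stand-in for selection confidence.
def energy(feats):
    return sum(x * x for x in feats) / len(feats)

print(select_encoder([0.9, 0.8], [0.1, 0.2], energy))  # -> close
```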
-
Patent number: 11966711
Abstract: Embodiments of the present disclosure relate to a solution for translation verification and correction. According to the solution, a neural network is trained to determine an association degree among a group of words in a source or target language. The neural network can be used for translation verification and correction. According to the solution, a group of words in a source language and translations of the group of words in a target language are obtained. An association degree among the group of words and an association degree among the translations can be determined by using the trained neural network. Then, whether there is a wrong translation can be determined based on the association degrees. In some embodiments, corresponding methods, systems and computer program products are provided.
Type: Grant
Filed: May 18, 2021
Date of Patent: April 23, 2024
Assignee: International Business Machines Corporation
Inventors: Guang Ming Zhang, Xiaoyang Yang, Hong Wei Jia, Mo Chi Liu, Yun Wang
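The final comparison step — flagging a candidate mistranslation when the target-language association degree diverges from the source-language one — reduces to a simple check. The tolerance value is an illustrative assumption, not from the patent (the network producing the association degrees is assumed to exist upstream):

```python
def flag_wrong_translation(src_assoc, tgt_assoc, tolerance=0.3):
    """Flag a candidate mistranslation when the association degree of the
    target-language word group diverges from that of the source-language
    group by more than a tolerance (tolerance is an assumed value)."""
    return abs(src_assoc - tgt_assoc) > tolerance

print(flag_wrong_translation(0.85, 0.20))  # large gap -> True
```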
-
Patent number: 11961510
Abstract: According to one embodiment, an information processing apparatus includes the following units. The acquisition unit acquires first training data including a combination of a voice feature quantity and a correct phoneme label of the voice feature quantity. The training unit trains an acoustic model using the first training data so as to output the correct phoneme label in response to input of the voice feature quantity. The extraction unit extracts from the first training data second training data including voice feature quantities of at least one of a keyword, a sub-word, a syllable, or a phoneme included in the keyword. The adaptation processing unit adapts the trained acoustic model to a keyword detection model using the second training data.
Type: Grant
Filed: February 28, 2020
Date of Patent: April 16, 2024
Assignee: KABUSHIKI KAISHA TOSHIBA
Inventors: Ning Ding, Hiroshi Fujimura
-
Patent number: 11900922
Abstract: Embodiments of the present invention provide computer implemented methods, computer program products and computer systems. For example, embodiments of the present invention can access one or more intents and associated entities from a limited amount of speech to text training data in a single language. Embodiments of the present invention can use the accessed one or more intents and associated entities to locate speech to text training data in one or more other languages different from the single language. Embodiments of the present invention can then train a neural network based on the limited amount of speech to text training data in the single language and the located speech to text training data in the one or more other languages.
Type: Grant
Filed: November 10, 2020
Date of Patent: February 13, 2024
Assignee: International Business Machines Corporation
Inventors: Samuel Thomas, Hong-Kwang Kuo, Kartik Audhkhasi, Michael Alan Picheny
-
Patent number: 11881211
Abstract: Disclosed are an electronic device and a method of controlling the electronic device. An electronic device according to an embodiment may perform a method comprising: performing natural language understanding for a first text included in learning data, obtaining first information associated with a speech corresponding to the first text being uttered based on a result of the natural language understanding, obtaining second information associated with an acoustic feature corresponding to the speech corresponding to the first text being uttered based on the first information, obtaining a plurality of speech signals corresponding to the first text by converting a first speech signal corresponding to the first text based on the first information and the second information, and training a speech recognition model based on the plurality of obtained speech signals and the first text.
Type: Grant
Filed: March 2, 2021
Date of Patent: January 23, 2024
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Changwoo Han, Kwangyoun Kim, Chanwoo Kim, Kyungmin Lee, Youngho Han
-
Patent number: 11860684
Abstract: A first named entity recognition (NER) system may be adapted to create a second NER system that is able to recognize a new named entity using few-shot learning. The second NER system may process support tokens that provide one or more examples of the new named entity and may process input tokens that may contain the new named entity. The second NER system may use a classifier of the first NER system to compute support token embeddings from the support tokens and input token embeddings from the input tokens. The second NER system may then recognize the new named entity in the input tokens using abstract tag transition probabilities and/or distances between the support token embeddings and the input token embeddings.
Type: Grant
Filed: September 17, 2020
Date of Patent: January 2, 2024
Assignee: ASAPP, INC.
Inventors: Yi Yang, Arzoo Katiyar
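The distance-based part of this scheme — tagging an input token with the tag of its nearest support-token embedding — can be sketched in a few lines. The cosine metric and the toy two-dimensional embeddings are illustrative assumptions; the patent's transition-probability component is omitted:

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def nearest_support_tag(input_emb, support):
    """Tag an input token with the tag of the closest support-token
    embedding. `support` maps tag -> list of few-shot example embeddings."""
    best_tag, best_sim = None, -2.0
    for tag, embeddings in support.items():
        for emb in embeddings:
            sim = cosine(input_emb, emb)
            if sim > best_sim:
                best_tag, best_sim = tag, sim
    return best_tag

support = {"DRUG": [[1.0, 0.1]], "O": [[0.0, 1.0]]}
print(nearest_support_tag([0.9, 0.2], support))  # -> DRUG
```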
-
Patent number: 11848006
Abstract: A method of processing an electrical signal transduced from a voice signal is disclosed. A classification model is applied to the electrical signal to produce a classification indicator. The classification model has been trained using an augmented training dataset. The electrical signal is classified as one of a first class and a second class in a binary classification, the classification being performed as a function of the classification indicator. A trigger signal is provided to a user circuit as a result of the electrical signal being classified in the first class of the binary classification.
Type: Grant
Filed: August 24, 2020
Date of Patent: December 19, 2023
Assignee: STMicroelectronics S.r.l.
Inventors: Nunziata Ivana Guarneri, Filippo Naccari
-
Patent number: 11830478
Abstract: A learning device calculates a feature of each data included in a pair of datasets in which two modalities among a plurality of modalities are combined, using a model that receives data on a corresponding modality among the modalities and outputs a feature obtained by mapping the received data into an embedding space. The learning device then selects similar data similar to each target data that is data on a first modality in a first dataset of the datasets, from data on a second modality included in a second dataset of the datasets. The learning device further updates a parameter of the model such that the features of the data in the pair included in the first and the second datasets are similar to one another, and the feature of data paired with the target data is similar to the feature of data paired with the similar data.
Type: Grant
Filed: April 1, 2021
Date of Patent: November 28, 2023
Assignees: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, MASSACHUSETTS INSTITUTE OF TECHNOLOGY
Inventors: Yasunori Ohishi, Akisato Kimura, Takahito Kawanishi, Kunio Kashino, James R. Glass, David Harwath
-
Patent number: 11823669
Abstract: According to one embodiment, an information processing apparatus includes the following units. The first acquisition unit acquires speech data including frames. The second acquisition unit acquires a model trained to, upon input of a feature amount extracted from the speech data, output information indicative of the likelihood of each of a plurality of classes including a component of a keyword and a component of background noise. The first calculation unit calculates a keyword score indicative of the occurrence probability of the component of the keyword. The second calculation unit calculates a background noise score indicative of the occurrence probability of the component of the background noise. The determination unit determines whether or not the speech data includes the keyword.
Type: Grant
Filed: February 28, 2020
Date of Patent: November 21, 2023
Assignee: KABUSHIKI KAISHA TOSHIBA
Inventors: Ning Ding, Hiroshi Fujimura
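The final determination step — accumulating a keyword score and a background-noise score over the frames and comparing them — can be sketched as below. The per-frame likelihood dictionaries, the max-over-classes aggregation, and the decision margin are illustrative assumptions standing in for the trained model's outputs:

```python
def detect_keyword(frame_likelihoods, keyword_classes, noise_classes, margin=0.0):
    """Accumulate the best keyword-class likelihood and the best
    background-noise likelihood per frame, then compare the totals."""
    kw_score = sum(max(f[c] for c in keyword_classes) for f in frame_likelihoods)
    bg_score = sum(max(f[c] for c in noise_classes) for f in frame_likelihoods)
    return kw_score > bg_score + margin

frames = [{"kw1": 0.7, "noise": 0.2}, {"kw1": 0.6, "noise": 0.3}]
print(detect_keyword(frames, ["kw1"], ["noise"]))  # -> True
```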
-
Patent number: 11804211
Abstract: Implementations are directed to providing a voice bot development platform that enables a third-party developer to train a voice bot based on training instance(s). The training instance(s) can each include training input and training output. The training input can include a portion of a corresponding conversation and a prior context of the corresponding conversation. The training output can include a corresponding ground truth response to the portion of the corresponding conversation. Subsequent to training, the voice bot can be deployed for conducting conversations on behalf of a third-party. In some implementations, the voice bot is further trained based on a corresponding feature emphasis input that directs the voice bot's attention to a particular feature of the portion of the corresponding conversation. In some additional or alternative implementations, the voice bot is further trained to interact with third-party system(s) via remote procedure calls (RPCs).
Type: Grant
Filed: December 4, 2020
Date of Patent: October 31, 2023
Assignee: GOOGLE LLC
Inventors: Asaf Aharoni, Yaniv Leviathan, Eyal Segalis, Gal Elidan, Sasha Goldshtein, Tomer Amiaz, Deborah Cohen
-
Patent number: 11798562
Abstract: A speaker verification method includes receiving audio data corresponding to an utterance and processing the audio data to generate an evaluation attentive d-vector (ad-vector) representing voice characteristics of the utterance, the evaluation ad-vector including n_e style classes, each including a respective value vector concatenated with a corresponding routing vector. The method also includes generating, using a self-attention mechanism, at least one multi-condition attention score that indicates a likelihood that the evaluation ad-vector matches a respective reference ad-vector associated with a respective user. The method also includes identifying the speaker of the utterance as the respective user associated with the respective reference ad-vector based on the multi-condition attention score.
Type: Grant
Filed: May 16, 2021
Date of Patent: October 24, 2023
Assignee: Google LLC
Inventors: Ignacio Lopez Moreno, Quan Wang, Jason Pelecanos, Yiling Huang, Mert Saglam
-
Patent number: 11790892
Abstract: A method includes capturing an event, analyzing the event to generate graphs, receiving a natural language utterance, identifying an entity and a command, modifying the graphs, and emitting an application prototype. An application prototyping server includes a processor and a memory storing instructions that, when executed by the processor, cause the server to capture an event, analyze the captured event to generate graphs, receive a natural language utterance, identify an entity and a command, modify the graphs, and emit an application prototype. A non-transitory computer readable medium contains program instructions that, when executed, cause a computer to: capture an event, analyze the captured event to generate graphs, receive a natural language utterance, identify an entity and a command, modify the graphs, and emit an application prototype.
Type: Grant
Filed: May 27, 2020
Date of Patent: October 17, 2023
Assignee: CDW LLC
Inventor: Joseph Kessler
-
Patent number: 11783810
Abstract: Illustrative embodiments provide a method and system for communicating air traffic control information. An audio signal comprising voice activity is received. Air traffic control information in the voice activity is identified using an artificial intelligence algorithm. A text transcript of the air traffic control information is generated and displayed on a confirmation display. Voice activity in the audio signal may be detected by identifying portions of the audio signal that comprise speech based on a comparison between the power spectrum of the audio signal and the power spectrum of noise, and forming speech segments comprising the portions of the audio signal that comprise speech.
Type: Grant
Filed: July 17, 2020
Date of Patent: October 10, 2023
Assignee: The Boeing Company
Inventors: Stephen Dame, Yu Qiao, Taylor A. Riccetti, David J. Ross, Joshua Welshmeyer, Matthew Sheridan-Smith, Su Ying Li, Zarrin Khiang-Huey Chua, Jose A. Medina, Michelle D. Warren, Simran Pabla, Jasper P. Corleis
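The voice-activity step — comparing signal power against a noise estimate to mark speech frames — can be sketched with a simple per-frame power comparison in decibels. The 6 dB margin and the frame length are illustrative assumptions, not values from the patent:

```python
import math

def frame_power_db(frame):
    """Mean power of a frame of samples, in dB (epsilon avoids log(0))."""
    power = sum(x * x for x in frame) / len(frame)
    return 10 * math.log10(power + 1e-12)

def speech_frames(frames, noise_floor_db, margin_db=6.0):
    """Mark frames whose power exceeds the noise-floor estimate by a
    margin; contiguous True runs would then form speech segments."""
    return [frame_power_db(f) > noise_floor_db + margin_db for f in frames]

frames = [[0.001] * 160, [0.5] * 160]  # a quiet frame, then a loud frame
print(speech_frames(frames, noise_floor_db=-40.0))  # -> [False, True]
```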
-
Patent number: 11776554
Abstract: An audio processor for generating a frequency enhanced audio signal from a source audio signal has: an envelope determiner for determining a temporal envelope of at least a portion of the source audio signal; an analyzer for analyzing the temporal envelope to determine temporal values of certain features of the temporal envelope; a signal synthesizer for generating a synthesis signal, the generating including placing pulses in relation to the determined temporal values, wherein the pulses are weighted using weights derived from amplitudes of the temporal envelope at the temporal values where the pulses are placed; and a combiner for combining at least a band of the synthesis signal that is not included in the source audio signal and the source audio signal to obtain the frequency enhanced audio signal.
Type: Grant
Filed: May 27, 2021
Date of Patent: October 3, 2023
Assignee: FRAUNHOFER-GESELLSCHAFT ZUR FĂ–RDERUNG DER ANGEWANDTEN FORSCHUNG E.V.
Inventors: Sascha Disch, Michael Sturm
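The envelope-then-pulses idea can be sketched as follows. The rectify-and-average envelope and the local-maximum rule for choosing pulse positions are simple stand-ins chosen for illustration; the patent does not specify these particular methods:

```python
def temporal_envelope(signal, win=4):
    """Rectify-and-smooth envelope: absolute value followed by a
    trailing moving average of `win` samples."""
    rect = [abs(x) for x in signal]
    env = []
    for i in range(len(rect)):
        window = rect[max(0, i - win + 1):i + 1]
        env.append(sum(window) / len(window))
    return env

def place_pulses(env):
    """Place a pulse at each local maximum of the envelope, weighted by
    the envelope amplitude at that position."""
    pulses = [0.0] * len(env)
    for i in range(1, len(env) - 1):
        if env[i] > env[i - 1] and env[i] >= env[i + 1]:
            pulses[i] = env[i]
    return pulses
```

A full implementation would band-limit the resulting pulse train and mix only the missing band back into the source signal.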
-
Patent number: 11769491
Abstract: A system configured to perform utterance detection using data processing techniques that are similar to those used for object detection is provided. For example, the system may treat utterances within audio data as analogous to an object represented within an image and employ techniques to separate and identify individual utterances. The system may include one or more trained models that are trained to perform utterance detection. For example, the system may include a first module to process input audio data and identify whether speech is represented in the input audio data, a second module to apply convolution filters, and a third module configured to determine a boundary identifying a beginning and ending of a portion of the input audio data along with an utterance score indicating how closely the portion of the input audio data represents an utterance.
Type: Grant
Filed: September 29, 2020
Date of Patent: September 26, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Abhishek Bafna, Haithem Albadawi
-
Patent number: 11769520
Abstract: Techniques are provided for evaluating multiple machine learning models to identify issues with a communication. One method comprises applying an audio signal associated with a communication to at least two of: (i) a trigger word analysis module that evaluates contextual information to determine if a trigger word is detected in the audio signal; (ii) an audio activity pattern analysis module that determines if a silence pattern anomaly is detected; and (iii) a communication application analysis module that evaluates features provided by a communication application relative to applicable thresholds; and combining results of the at least two of the trigger word analysis module, the audio activity pattern analysis module and the communication application analysis module to identify a communication issue. The combining may evaluate an accuracy of the trigger word analysis module, the audio activity pattern analysis module and/or the communication application analysis module to combine the results.
Type: Grant
Filed: August 17, 2020
Date of Patent: September 26, 2023
Assignee: EMC IP Holding Company LLC
Inventors: Idan Richman Goshen, Shiri Gaber
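The combining step — weighting each module's verdict by its accuracy and voting — can be sketched as a weighted majority vote. The weight values and the 50% decision rule are illustrative assumptions, not details from the patent:

```python
def combine_results(results, weights):
    """Weighted vote over module verdicts (trigger-word, audio-activity,
    app-feature analysis); weights reflect per-module accuracy estimates."""
    total = sum(weights[m] for m in results)
    flagged = sum(weights[m] for m, issue in results.items() if issue)
    return flagged >= 0.5 * total

results = {"trigger_word": True, "audio_activity": False, "app_features": True}
weights = {"trigger_word": 0.9, "audio_activity": 0.6, "app_features": 0.8}
print(combine_results(results, weights))  # two accurate modules agree -> True
```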
-
Patent number: 11741986
Abstract: A method includes obtaining, by an electronic device, an audio segment comprising one or more audio events of a target subject. The method also includes extracting, by the electronic device, audio embeddings from the one or more audio events using an embedding model, the embedding model comprising a trained machine learning model. The method further includes comparing, by the electronic device, the extracted audio embeddings with a match profile of the target subject, the match profile generated during an enrollment stage. The method also includes generating, by the electronic device, a label for the audio segment based on whether or not the extracted audio embeddings match the match profile, wherein the label enables correlation of the audio segment with the target subject for monitoring a health condition of the target subject.
Type: Grant
Filed: August 20, 2020
Date of Patent: August 29, 2023
Assignee: Samsung Electronics Co., Ltd.
Inventors: Korosh Vatanparvar, Tousif Ahmed, Viswam Nathan, Ebrahim Nematihosseinabadi, Md Mahbubur Rahman, Jilong Kuang, Jun Gao
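The comparison-and-labeling step reduces to measuring similarity between a segment's embedding and the enrollment profile. The cosine metric, the threshold value, and the label names are illustrative assumptions standing in for whatever matching rule the embedding model was trained with:

```python
import math

def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def label_segment(embedding, profile, threshold=0.7):
    """Label an audio segment as the target subject when its embedding is
    close enough to the enrollment-stage match profile."""
    return "target" if cosine(embedding, profile) >= threshold else "other"

print(label_segment([1.0, 0.0], [1.0, 0.05]))  # near-identical -> target
```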
-
Patent number: 11736428
Abstract: An approach is provided that receives a message and applies a deep analytic analysis to the message. The deep analytic analysis results in a set of enriched message embedding (EME) data that is passed to a trained neural network. Based on a set of scores received from the trained neural network, a conversation is identified from a number of available conversations to which the received message belongs. The received message is then associated with the identified conversation.
Type: Grant
Filed: June 24, 2019
Date of Patent: August 22, 2023
Assignee: International Business Machines Corporation
Inventors: Devin A. Conley, Priscilla S. Moraes, Lakshminarayanan Krishnamurthy, Oren Sar-Shalom
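The last step — picking the conversation from the network's scores — is an argmax with an optional fallback. The threshold for starting a new conversation is an illustrative assumption; the patent only describes identifying the conversation from the scores:

```python
def assign_conversation(scores, new_threshold=0.5):
    """Attach the message to the best-scoring conversation, or return None
    (start a new conversation) when no score clears the threshold."""
    best = max(scores, key=scores.get)
    return best if scores[best] >= new_threshold else None

print(assign_conversation({"conv-a": 0.2, "conv-b": 0.8}))  # -> conv-b
```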
-
Patent number: 11721319
Abstract: An artificial intelligence device includes a memory and a processor. The memory is configured to store audio data having a predetermined speech style. The processor is configured to generate a condition vector relating to a condition for determining the speech style of the audio data, reduce a dimension of the condition vector to a predetermined reduction dimension, acquire a sparse code vector based on a dictionary vector acquired through sparse dictionary coding with respect to the condition vector having the predetermined reduction dimension, and change a vector element value included in the sparse code vector.
Type: Grant
Filed: February 27, 2020
Date of Patent: August 8, 2023
Assignee: LG ELECTRONICS INC.
Inventors: Minook Kim, Yongchul Park, Sungmin Han, Siyoung Yang, Sangki Kim, Juyeong Jang
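The sparse-dictionary-coding step can be illustrated with greedy matching pursuit, one common way to obtain a sparse code over a dictionary. The patent does not specify the coding algorithm, so this particular method, the unit-norm dictionary assumption, and the sparsity level are all assumptions for illustration:

```python
def sparse_code(x, dictionary, n_nonzero=2):
    """Greedy matching pursuit: approximate x with a few dictionary atoms,
    yielding a sparse code vector. Atoms are assumed to be unit-norm."""
    residual = list(x)
    code = [0.0] * len(dictionary)
    for _ in range(n_nonzero):
        # Pick the atom most correlated with the current residual.
        best = max(range(len(dictionary)),
                   key=lambda k: abs(sum(a * r for a, r in
                                         zip(dictionary[k], residual))))
        coef = sum(a * r for a, r in zip(dictionary[best], residual))
        code[best] += coef
        residual = [r - coef * a for r, a in zip(residual, dictionary[best])]
    return code

D = [[1.0, 0.0], [0.0, 1.0]]  # trivial orthonormal dictionary
print(sparse_code([3.0, 0.5], D, n_nonzero=1))  # -> [3.0, 0.0]
```

Changing individual element values of the resulting sparse code (as the abstract describes) then modifies the speech style the code represents.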