Patents Examined by Nandini Subramani
  • Patent number: 11984135
    Abstract: System and method for offline embedded abnormal sound fault detection are disclosed, the system comprising a sound acquisition module, a sound audio feature extraction module, and a neural network module. The sound audio feature extraction module uses a fast Fourier transform to process sample data in the frequency domain, and then inputs the sample data to the neural network module to complete anomaly classification. The neural network module comprises at least one CNN feature extraction layer, a long short-term memory (LSTM) layer, at least one fully connected layer, at least one classification layer, and a trigger decision layer. The number of network layers of the at least one CNN feature extraction layer is dynamically adjustable, the network structure of the at least one fully connected layer and the at least one classification layer is dynamically variable, and the trigger decision layer is configured to eliminate generalization errors generated by the neural network.
    Type: Grant
    Filed: March 2, 2021
    Date of Patent: May 14, 2024
    Assignee: Espressif Systems (Shanghai) Co., Ltd.
    Inventor: Wangwang Wang
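The frequency-domain preprocessing this abstract describes (FFT the samples, then feed the result to a classifier) can be illustrated with a minimal NumPy sketch. The function name, band count, and band-energy pooling below are assumptions for illustration, not the patented design:

```python
import numpy as np

def fft_band_features(samples, n_bands=8):
    """FFT the frame, then pool the magnitude spectrum into a few coarse
    bands -- a compact frequency-domain vector for a downstream network."""
    spectrum = np.abs(np.fft.rfft(samples))
    return np.array([band.mean() for band in np.array_split(spectrum, n_bands)])

# A 100 Hz tone sampled for one second: its energy lands in a single
# band, which is what makes such features separable for a classifier.
t = np.linspace(0, 1, 1024, endpoint=False)
features = fft_band_features(np.sin(2 * np.pi * 100 * t))
```

In a full pipeline these band energies (or the raw spectrum) would be the input to the CNN/LSTM stack the abstract describes.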
  • Patent number: 11978433
    Abstract: An end-to-end automatic speech recognition (ASR) system includes: a first encoder configured for close-talk input captured by a close-talk input mechanism; a second encoder configured for far-talk input captured by a far-talk input mechanism; and an encoder selection layer configured to select at least one of the first and second encoders for use in producing ASR output. The selection is made based on at least one of short-time Fourier transform (STFT), Mel-frequency Cepstral Coefficient (MFCC) and filter bank derived from at least one of the close-talk input and the far-talk input. If signals from both the close-talk input mechanism and the far-talk input mechanism are present for a speech segment, the encoder selection layer dynamically selects between the close-talk encoder and the far-talk encoder to select the encoder that better recognizes the speech segment. An encoder-decoder model is used to produce the ASR output.
    Type: Grant
    Filed: June 22, 2021
    Date of Patent: May 7, 2024
    Assignee: Microsoft Technology Licensing, LLC.
    Inventors: Felix Weninger, Marco Gaudesi, Ralf Leibold, Puming Zhan
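The dynamic selection between close-talk and far-talk encoders can be sketched as a confidence-based pick among candidate encoders. This is a hypothetical simplification; the encoder stubs, scores, and the `encoder_selection` helper are invented for illustration:

```python
def encoder_selection(segment, encoders):
    """Run every available encoder on the segment and keep the most
    confident hypothesis, mimicking a dynamic encoder selection layer."""
    best_name, best_hyp, best_score = None, None, float("-inf")
    for name, encode in encoders.items():
        hyp, score = encode(segment)
        if score > best_score:
            best_name, best_hyp, best_score = name, hyp, score
    return best_name, best_hyp

# Hypothetical encoders: the close-talk signal is cleaner here, so its
# (stubbed) confidence score is higher and it wins the selection.
encoders = {
    "close_talk": lambda seg: ("hello world", 0.9),
    "far_talk": lambda seg: ("yellow curled", 0.4),
}
name, hyp = encoder_selection(None, encoders)
```

If only one microphone signal is present for a segment, the dictionary simply contains one encoder and the selection is trivial.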
  • Patent number: 11966711
    Abstract: Embodiments of the present disclosure relate to a solution for translation verification and correction. According to the solution, a neural network is trained to determine an association degree among a group of words in a source or target language. The neural network can be used for translation verification and correction. According to the solution, a group of words in a source language and translations of the group of words in a target language are obtained. An association degree among the group of words and an association degree among the translations can be determined by using the trained neural network. Then, whether there is a wrong translation can be determined based on the association degrees. In some embodiments, corresponding methods, systems and computer program products are provided.
    Type: Grant
    Filed: May 18, 2021
    Date of Patent: April 23, 2024
    Assignee: International Business Machines Corporation
    Inventors: Guang Ming Zhang, Xiaoyang Yang, Hong Wei Jia, Mo Chi Liu, Yun Wang
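The verification step (comparing the association degree among the source words with the association degree among their translations) can be sketched as a toy decision rule. The threshold and function name are assumptions; in the described solution both degrees would come from the trained neural network:

```python
def flag_wrong_translation(source_assoc, target_assoc, tolerance=0.3):
    """Flag a likely mistranslation when the words cohere strongly in one
    language but their counterparts do not in the other."""
    return abs(source_assoc - target_assoc) > tolerance

# Association degrees would come from the trained neural network; these
# numbers are made up to show the decision rule.
suspicious = flag_wrong_translation(0.9, 0.2)   # large gap -> flagged
consistent = flag_wrong_translation(0.8, 0.7)   # small gap -> accepted
```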
  • Patent number: 11961510
    Abstract: According to one embodiment, an information processing apparatus includes the following units. The acquisition unit acquires first training data including a combination of a voice feature quantity and a correct phoneme label of the voice feature quantity. The training unit trains an acoustic model using the first training data so as to output the correct phoneme label in response to input of the voice feature quantity. The extraction unit extracts, from the first training data, second training data including voice feature quantities of at least one of a keyword, a sub-word, a syllable, or a phoneme included in the keyword. The adaptation processing unit adapts the trained acoustic model into a keyword detection model using the second training data.
    Type: Grant
    Filed: February 28, 2020
    Date of Patent: April 16, 2024
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Ning Ding, Hiroshi Fujimura
  • Patent number: 11900922
    Abstract: Embodiments of the present invention provide computer implemented methods, computer program products and computer systems. For example, embodiments of the present invention can access one or more intents and associated entities from a limited amount of speech-to-text training data in a single language. Embodiments of the present invention can use the accessed one or more intents and associated entities to locate speech-to-text training data in one or more other languages different from the single language. Embodiments of the present invention can then train a neural network based on the limited amount of speech-to-text training data in the single language and the located speech-to-text training data in the one or more other languages.
    Type: Grant
    Filed: November 10, 2020
    Date of Patent: February 13, 2024
    Assignee: International Business Machines Corporation
    Inventors: Samuel Thomas, Hong-Kwang Kuo, Kartik Audhkhasi, Michael Alan Picheny
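Locating other-language training data via shared intents and entities can be sketched as a set intersection over annotations. The corpus layout below is a hypothetical schema, not the patent's data format:

```python
def locate_matching_data(seed_corpus, candidate_corpora):
    """Collect utterances from other-language corpora whose (intent,
    entity-set) annotation also occurs in the single-language seed data."""
    seed_keys = {(u["intent"], frozenset(u["entities"])) for u in seed_corpus}
    located = []
    for corpus in candidate_corpora.values():
        for utt in corpus:
            if (utt["intent"], frozenset(utt["entities"])) in seed_keys:
                located.append(utt)
    return located

# Hypothetical annotated corpora.
seed = [{"text": "book a flight", "intent": "book_flight",
         "entities": ["city", "date"]}]
candidates = {
    "es": [{"text": "reserva un vuelo", "intent": "book_flight",
            "entities": ["date", "city"]},
           {"text": "pon una cancion", "intent": "play_music",
            "entities": ["song"]}],
}
located = locate_matching_data(seed, candidates)
```

The located utterances would then be pooled with the seed data to train the neural network.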
  • Patent number: 11881211
    Abstract: Disclosed are an electronic device and a method of controlling the electronic device. An electronic device according to an embodiment may perform a method comprising: performing natural language understanding for a first text included in learning data, obtaining first information associated with a speech corresponding to the first text being uttered based on a result of the natural language understanding, obtaining second information associated with an acoustic feature corresponding to the speech corresponding to the first text being uttered based on the first information, obtaining a plurality of speech signals corresponding to the first text by converting a first speech signal corresponding to the first text based on the first information and the second information, and training a speech recognition model based on the plurality of obtained speech signals and the first text.
    Type: Grant
    Filed: March 2, 2021
    Date of Patent: January 23, 2024
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Changwoo Han, Kwangyoun Kim, Chanwoo Kim, Kyungmin Lee, Youngho Han
  • Patent number: 11860684
    Abstract: A first named entity recognition (NER) system may be adapted to create a second NER system that is able to recognize a new named entity using few-shot learning. The second NER system may process support tokens that provide one or more examples of the new named entity and may process input tokens that may contain the new named entity. The second NER system may use a classifier of the first NER system to compute support token embeddings from the support tokens and input token embeddings from the input tokens. The second NER system may then recognize the new named entity in the input tokens using abstract tag transition probabilities and/or distances between the support token embeddings and the input token embeddings.
    Type: Grant
    Filed: September 17, 2020
    Date of Patent: January 2, 2024
    Assignee: ASAPP, INC.
    Inventors: Yi Yang, Arzoo Katiyar
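The distance-based recognition step (comparing input token embeddings with support token embeddings) can be illustrated with a nearest-support-embedding tagger. The tag names, distance metric, and fallback threshold are assumptions, and the abstract tag transition probabilities the patent also mentions are omitted here:

```python
import numpy as np

def tag_tokens(input_embs, support_embs, support_tags, max_dist=1.0):
    """Give each input token the tag of its nearest support-token
    embedding, falling back to "O" when nothing is close enough."""
    tags = []
    for emb in input_embs:
        dists = np.linalg.norm(support_embs - emb, axis=1)
        nearest = int(dists.argmin())
        tags.append(support_tags[nearest] if dists[nearest] <= max_dist else "O")
    return tags

# Two support tokens exemplifying new entity types, plus two inputs:
# one near a support embedding, one far from both.
support_embs = np.array([[1.0, 0.0], [0.0, 1.0]])
support_tags = ["PRODUCT", "PERSON"]
tags = tag_tokens(np.array([[0.9, 0.1], [5.0, 5.0]]),
                  support_embs, support_tags)
```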
  • Patent number: 11848006
    Abstract: A method of processing an electrical signal transduced from a voice signal is disclosed. A classification model is applied to the electrical signal to produce a classification indicator. The classification model has been trained using an augmented training dataset. The electrical signal is classified into either a first class or a second class of a binary classification, the classifying being performed as a function of the classification indicator. A trigger signal is provided to a user circuit as a result of the electrical signal being classified in the first class of the binary classification.
    Type: Grant
    Filed: August 24, 2020
    Date of Patent: December 19, 2023
    Assignee: STMicroelectronics S.r.l.
    Inventors: Nunziata Ivana Guarneri, Filippo Naccari
  • Patent number: 11830478
    Abstract: A learning device calculates a feature of each data included in a pair of datasets in which two modalities among a plurality of modalities are combined, using a model that receives data on a corresponding modality among the modalities and outputs a feature obtained by mapping the received data into an embedding space. The learning device then selects similar data similar to each target data that is data on a first modality in a first dataset of the datasets, from data on a second modality included in a second dataset of the datasets. The learning device further updates a parameter of the model such that the features of the data in the pair included in the first and the second datasets are similar to one another, and the feature of data paired with the target data is similar to the feature of data paired with the similar data.
    Type: Grant
    Filed: April 1, 2021
    Date of Patent: November 28, 2023
    Assignees: NIPPON TELEGRAPH AND TELEPHONE CORPORATION, MASSACHUSETTS INSTITUTE OF TECHNOLOGY
    Inventors: Yasunori Ohishi, Akisato Kimura, Takahito Kawanishi, Kunio Kashino, James R. Glass, David Harwath
  • Patent number: 11823669
    Abstract: According to one embodiment, an information processing apparatus includes the following units. The first acquisition unit acquires speech data including frames. The second acquisition unit acquires a model trained to, upon input of a feature amount extracted from the speech data, output information indicative of the likelihood of each of a plurality of classes including a component of a keyword and a component of background noise. The first calculation unit calculates a keyword score indicative of the occurrence probability of the component of the keyword. The second calculation unit calculates a background noise score indicative of the occurrence probability of the component of the background noise. The determination unit determines, based on the keyword score and the background noise score, whether or not the speech data includes the keyword.
    Type: Grant
    Filed: February 28, 2020
    Date of Patent: November 21, 2023
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventors: Ning Ding, Hiroshi Fujimura
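The comparison between a keyword score and a background noise score can be sketched by accumulating per-frame log-likelihoods for keyword classes versus background classes. The frame format and decision margin below are illustrative assumptions:

```python
import math

def detect_keyword(frames, keyword_classes, margin=0.0):
    """Accumulate the best per-frame log-likelihood over keyword classes
    and over background classes, then compare the two running scores."""
    keyword_score, background_score = 0.0, 0.0
    for frame in frames:
        keyword_score += math.log(
            max(p for c, p in frame.items() if c in keyword_classes))
        background_score += math.log(
            max(p for c, p in frame.items() if c not in keyword_classes))
    return keyword_score - background_score > margin

# Made-up per-frame class likelihoods, as a model might output them.
keyword_frames = [{"hey": 0.8, "noise": 0.1}, {"hi": 0.7, "noise": 0.2}]
noise_frames = [{"hey": 0.1, "noise": 0.8}, {"hi": 0.2, "noise": 0.7}]
hit = detect_keyword(keyword_frames, {"hey", "hi"})
miss = detect_keyword(noise_frames, {"hey", "hi"})
```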
  • Patent number: 11804211
    Abstract: Implementations are directed to providing a voice bot development platform that enables a third-party developer to train a voice bot based on training instance(s). The training instance(s) can each include training input and training output. The training input can include a portion of a corresponding conversation and a prior context of the corresponding conversation. The training output can include a corresponding ground truth response to the portion of the corresponding conversation. Subsequent to training, the voice bot can be deployed for conducting conversations on behalf of a third party. In some implementations, the voice bot is further trained based on a corresponding feature emphasis input that directs the voice bot's attention to a particular feature of the portion of the corresponding conversation. In some additional or alternative implementations, the voice bot is further trained to interact with third-party system(s) via remote procedure calls (RPCs).
    Type: Grant
    Filed: December 4, 2020
    Date of Patent: October 31, 2023
    Assignee: GOOGLE LLC
    Inventors: Asaf Aharoni, Yaniv Leviathan, Eyal Segalis, Gal Elidan, Sasha Goldshtein, Tomer Amiaz, Deborah Cohen
  • Patent number: 11798562
    Abstract: A speaker verification method includes receiving audio data corresponding to an utterance and processing the audio data to generate an evaluation attentive d-vector (ad-vector) representing voice characteristics of the utterance, where the evaluation ad-vector includes n_e style classes each including a respective value vector concatenated with a corresponding routing vector. The method also includes generating, using a self-attention mechanism, at least one multi-condition attention score that indicates a likelihood that the evaluation ad-vector matches a respective reference ad-vector associated with a respective user. The method also includes identifying the speaker of the utterance as the respective user associated with the respective reference ad-vector based on the multi-condition attention score.
    Type: Grant
    Filed: May 16, 2021
    Date of Patent: October 24, 2023
    Assignee: Google LLC
    Inventors: Ignacio Lopez Moreno, Quan Wang, Jason Pelecanos, Yiling Huang, Mert Saglam
  • Patent number: 11790892
    Abstract: A method includes capturing an event, analyzing the event to generate graphs, receiving a natural language utterance, identifying an entity and a command, modifying the graphs; and emitting an application prototype. An application prototyping server includes a processor; and a memory storing instructions that, when executed by the processor, cause the server to capture an event, analyze the captured event to generate graphs, receive a natural language utterance, identify an entity and a command, modify the graphs; and emit an application prototype. A non-transitory computer readable medium containing program instructions that when executed, cause a computer to: capture an event, analyze the captured event to generate graphs, receive a natural language utterance, identify an entity and a command, modify the graphs; and emit an application prototype.
    Type: Grant
    Filed: May 27, 2020
    Date of Patent: October 17, 2023
    Assignee: CDW LLC
    Inventor: Joseph Kessler
  • Patent number: 11783810
    Abstract: Illustrative embodiments provide a method and system for communicating air traffic control information. An audio signal comprising voice activity is received. Air traffic control information in the voice activity is identified using an artificial intelligence algorithm. A text transcript of the air traffic control information is generated and displayed on a confirmation display. Voice activity in the audio signal may be detected by identifying portions of the audio signal that comprise speech based on a comparison between the power spectrum of the audio signal and the power spectrum of noise and forming speech segments comprising the portions of the audio signal that comprise speech.
    Type: Grant
    Filed: July 17, 2020
    Date of Patent: October 10, 2023
    Assignee: The Boeing Company
    Inventors: Stephen Dame, Yu Qiao, Taylor A. Riccetti, David J. Ross, Joshua Welshmeyer, Matthew Sheridan-Smith, Su Ying Li, Zarrin Khiang-Huey Chua, Jose A. Medina, Michelle D. Warren, Simran Pabla, Jasper P. Corleis
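The voice activity detection the abstract describes (comparing a frame's power spectrum against the power spectrum of noise) can be sketched with NumPy. The frame length, decibel margin, and power formulation are assumptions for illustration, not the actual implementation:

```python
import numpy as np

def detect_speech_frames(frames, noise_frame, margin_db=6.0):
    """Mark a frame as speech when its mean spectral power exceeds the
    noise estimate by a margin expressed in decibels."""
    noise_power = np.abs(np.fft.rfft(noise_frame)) ** 2
    noise_level = 10 * np.log10(noise_power.mean() + 1e-12)
    flags = []
    for frame in frames:
        power = np.abs(np.fft.rfft(frame)) ** 2
        level = 10 * np.log10(power.mean() + 1e-12)
        flags.append(bool(level > noise_level + margin_db))
    return flags

rng = np.random.default_rng(0)
noise = 0.01 * rng.standard_normal(256)          # noise-floor estimate
t = np.linspace(0, 1, 256, endpoint=False)
speech_like = np.sin(2 * np.pi * 8 * t)          # strong periodic content
silence = 0.01 * rng.standard_normal(256)        # more noise, no speech
flags = detect_speech_frames([speech_like, silence], noise)
```

Consecutive flagged frames would then be merged into the speech segments the abstract mentions.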
  • Patent number: 11776554
    Abstract: An audio processor for generating a frequency enhanced audio signal from a source audio signal has: an envelope determiner for determining a temporal envelope of at least a portion of the source audio signal; an analyzer for analyzing the temporal envelope to determine temporal values of certain features of the temporal envelope; a signal synthesizer for generating a synthesis signal, the generating comprising placing pulses in relation to the determined temporal values, wherein the pulses are weighted using weights derived from amplitudes of the temporal envelope related to the temporal values where the pulses are placed; and a combiner for combining at least a band of the synthesis signal that is not included in the source audio signal and the source audio signal to obtain the frequency enhanced audio signal.
    Type: Grant
    Filed: May 27, 2021
    Date of Patent: October 3, 2023
    Assignee: FRAUNHOFER-GESELLSCHAFT ZUR FĂ–RDERUNG DER ANGEWANDTEN FORSCHUNG E.V.
    Inventors: Sascha Disch, Michael Sturm
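The envelope-tracking and weighted pulse-placement steps can be illustrated with a crude per-frame sketch. The frame size and peak-picking rule are assumptions, and a real bandwidth-extension synthesizer would do considerably more (filtering, band selection, combination with the source signal):

```python
import numpy as np

def place_pulses(signal, frame=32):
    """Trace a coarse temporal envelope (per-frame peak magnitude) and
    place one pulse per frame, weighted by the envelope amplitude there."""
    pulses = np.zeros_like(signal)
    for start in range(0, len(signal) - frame + 1, frame):
        chunk = np.abs(signal[start:start + frame])
        peak = int(chunk.argmax())
        pulses[start + peak] = chunk[peak]
    return pulses

signal = np.zeros(64)
signal[10] = 0.5    # a weak transient in the first frame
signal[40] = -0.8   # a stronger one in the second frame
pulses = place_pulses(signal)
```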
  • Patent number: 11769491
    Abstract: A system configured to perform utterance detection using data processing techniques that are similar to those used for object detection is provided. For example, the system may treat utterances within audio data as analogous to an object represented within an image and employ techniques to separate and identify individual utterances. The system may include one or more trained models that are trained to perform utterance detection. For example, the system may include a first module to process input audio data and identify whether speech is represented in the input audio data, a second module to apply convolution filters, and a third module configured to determine a boundary identifying a beginning and ending of a portion of the input audio data along with an utterance score indicating how closely the portion of the input audio data represents an utterance.
    Type: Grant
    Filed: September 29, 2020
    Date of Patent: September 26, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Abhishek Bafna, Haithem Albadawi
  • Patent number: 11769520
    Abstract: Techniques are provided for evaluating multiple machine learning models to identify issues with a communication. One method comprises applying an audio signal associated with a communication to at least two of: (i) a trigger word analysis module that evaluates contextual information to determine if a trigger word is detected in the audio signal; (ii) an audio activity pattern analysis module that determines if a silence pattern anomaly is detected; and (iii) a communication application analysis module that evaluates features provided by a communication application relative to applicable thresholds; and combining results of the at least two of the trigger word analysis module, the audio activity pattern analysis module and the communication application analysis module to identify a communication issue. The combining may evaluate an accuracy of the trigger word analysis module, the audio activity pattern analysis module and/or the communication application analysis module to combine the results.
    Type: Grant
    Filed: August 17, 2020
    Date of Patent: September 26, 2023
    Assignee: EMC IP Holding Company LLC
    Inventors: Idan Richman Goshen, Shiri Gaber
  • Patent number: 11741986
    Abstract: A method includes obtaining, by an electronic device, an audio segment comprising one or more audio events of a target subject. The method also includes extracting, by the electronic device, audio embeddings from the one or more audio events using an embedding model, the embedding model comprising a trained machine learning model. The method further includes comparing, by the electronic device, the extracted audio embeddings with a match profile of the target subject, the match profile generated during an enrollment stage. The method also includes generating, by the electronic device, a label for the audio segment based on whether or not the extracted audio embeddings match the match profile, wherein the label enables correlation of the audio segment with the target subject for monitoring a health condition of the target subject.
    Type: Grant
    Filed: August 20, 2020
    Date of Patent: August 29, 2023
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Korosh Vatanparvar, Tousif Ahmed, Viswam Nathan, Ebrahim Nematihosseinabadi, Md Mahbubur Rahman, Jilong Kuang, Jun Gao
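The embedding-versus-profile comparison can be sketched as a cosine-similarity check against an enrollment vector. The threshold, label strings, and single-vector profile are assumptions for illustration:

```python
import numpy as np

def label_segment(embedding, profile, threshold=0.7):
    """Compare an extracted audio embedding against the enrollment-stage
    match profile via cosine similarity and emit a label."""
    similarity = float(np.dot(embedding, profile)
                       / (np.linalg.norm(embedding) * np.linalg.norm(profile)))
    return "target_subject" if similarity >= threshold else "other"

profile = np.array([1.0, 1.0])                      # enrollment profile
same = label_segment(np.array([1.0, 0.9]), profile)
other = label_segment(np.array([1.0, -1.0]), profile)
```

The resulting label is what lets downstream health-monitoring logic attribute the audio segment to the target subject.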
  • Patent number: 11736428
    Abstract: An approach is provided that receives a message and applies a deep analytic analysis to the message. The deep analytic analysis results in a set of enriched message embedding (EME) data that is passed to a trained neural network. Based on a set of scores received from the trained neural network, a conversation is identified from a number of available conversations to which the received message belongs. The received message is then associated with the identified conversation.
    Type: Grant
    Filed: June 24, 2019
    Date of Patent: August 22, 2023
    Assignee: International Business Machines Corporation
    Inventors: Devin A. Conley, Priscilla S. Moraes, Lakshminarayanan Krishnamurthy, Oren Sar-Shalom
  • Patent number: 11721319
    Abstract: An artificial intelligence device includes a memory and a processor. The memory is configured to store audio data having a predetermined speech style. The processor is configured to generate a condition vector relating to a condition for determining the speech style of the audio data, reduce a dimension of the condition vector to a predetermined reduction dimension, acquire a sparse code vector based on a dictionary vector acquired through sparse dictionary coding with respect to the condition vector having the predetermined reduction dimension, and change a vector element value included in the sparse code vector.
    Type: Grant
    Filed: February 27, 2020
    Date of Patent: August 8, 2023
    Assignee: LG ELECTRONICS INC.
    Inventors: Minook Kim, Yongchul Park, Sungmin Han, Siyoung Yang, Sangki Kim, Juyeong Jang
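Sparse dictionary coding of a condition vector can be illustrated with greedy matching pursuit over unit-norm atoms. This generic algorithm is a stand-in sketch, not the patented method; the atom count and dictionary are assumptions:

```python
import numpy as np

def sparse_code(vector, dictionary, n_nonzero=2):
    """Greedy matching pursuit: approximate the condition vector with a
    few unit-norm dictionary atoms, yielding a sparse code vector."""
    residual = vector.astype(float).copy()
    code = np.zeros(dictionary.shape[1])
    for _ in range(n_nonzero):
        scores = dictionary.T @ residual
        k = int(np.abs(scores).argmax())
        code[k] += scores[k]
        residual -= scores[k] * dictionary[:, k]
    return code

# With an identity dictionary, the code just keeps the two largest entries.
code = sparse_code(np.array([0.2, 2.0, -1.0]), np.eye(3))
```

Changing individual elements of such a sparse code vector is then a compact way to manipulate the encoded speech-style condition.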