Patents Examined by Richemond Dorvil
  • Patent number: 12027162
    Abstract: Teacher-student learning can be used to train a keyword spotting (KWS) model using augmented training instance(s). Various implementations include aggressively augmenting (e.g., using spectral augmentation) base audio data to generate augmented audio data, where one or more portions of the base instance of audio data can be masked in the augmented instance of audio data (e.g., one or more time frames can be masked, one or more frequencies can be masked, etc.). Many implementations include processing augmented audio data using a KWS teacher model to generate a soft label, and processing the augmented audio data using a KWS student model to generate predicted output. One or more portions of the KWS student model can be updated based on a comparison of the soft label and the generated predicted output.
    Type: Grant
    Filed: March 3, 2021
    Date of Patent: July 2, 2024
    Assignee: GOOGLE LLC
    Inventors: Hyun Jin Park, Pai Zhu, Ignacio Lopez Moreno, Niranjan Subrahmanya
  • Patent number: 12019999
    Abstract: Implementations relate to determining a well-formed phrase to suggest to a user to submit in lieu of a not well-formed phrase. The suggestion is rendered via an interface that is provided to a client device of the user. Those implementations relate to determining that a phrase is not well-formed, identifying alternate phrases that are related to the not well-formed phrase, and scoring the alternate phrases to select one or more of the alternate phrases to render via the interface. Some of those implementations are related to identifying that the phrase is not well-formed based on occurrences of the phrase in documents that are generated by a source with the language of the phrase as the primary language of the creator.
    Type: Grant
    Filed: June 18, 2021
    Date of Patent: June 25, 2024
    Assignee: GOOGLE LLC
    Inventors: Wangqing Yuan, David Kogan, Vincent Lacey, Guanglei Wang, Shaun Post, Bryan Christopher Horling, Michael Anthony Schuler
  • Patent number: 12019987
    Abstract: Systems, apparatuses, methods, and computer program products are disclosed for distillation of a natural language processing model. An example method includes receiving, by communications circuitry, a set of text data comprising a set of observations and predicting, by processing circuitry and using the NLP model, classifications for each observation in the text data. The example method further includes generating, by model training engine, a balanced sampled data structure based on the predicted classifications for each observation in the text data and training, by the model training engine, a surrogate model using the balanced sampled data structure. The example method further includes identifying, by an interpreter and from the surrogate model, a set of most-influential tokens in the text data.
    Type: Grant
    Filed: April 28, 2021
    Date of Patent: June 25, 2024
    Assignee: Wells Fargo Bank, N.A.
    Inventors: Ye Yu, Harsh Singhal, Wayne B. Shoumaker
  • Patent number: 12020683
    Abstract: A real-time name mispronunciation detection feature can enable a user to receive instant feedback anytime they have mispronounced another person's name in an online meeting. The feature can receive audio input of a speaker and obtain a transcript of the audio input; identify a name from text of the transcript based on names of meeting participants; and extract a portion of the audio input corresponding to the name identified from the text of the transcript. The feature can obtain a reference pronunciation for the name using a user identifier associated with the name; and can obtain a pronunciation score for the name based on a comparison between the reference pronunciation for the name and the portion of the audio input corresponding to the name. The feature can then determine whether the pronunciation score is below a threshold; and in response, notify the speaker of a pronunciation error.
    Type: Grant
    Filed: October 28, 2021
    Date of Patent: June 25, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Tapan Bohra, Akshay Mallipeddi, Amit Srivastava, Ana Karen Parra
  • Patent number: 12014143
    Abstract: In various embodiments, a phrase grounding model automatically performs phrase grounding for a source sentence and a source image. The phrase grounding model determines that a first phrase included in the source sentence matches a first region of the source image based on the first phrase and at least a second phrase included in the source sentence. The phrase grounding model then generates a matched pair that specifies the first phrase and the first region. Subsequently, one or more annotation operations are performed on the source image based on the matched pair. Advantageously, the accuracy of the phrase grounding model is increased relative to prior art solutions where the interrelationships between phrases are typically disregarded.
    Type: Grant
    Filed: February 25, 2019
    Date of Patent: June 18, 2024
    Assignees: DISNEY ENTERPRISES, INC., ETH Zürich (Eidgenössische Technische Hochschule Zürich)
    Inventors: Pelin Dogan, Leonid Sigal, Markus Gross
  • Patent number: 12014730
    Abstract: A voice processing method includes: collecting a voice signal by a microphone of an electronic device, and signal-processing the collected voice signal to obtain a first voice frame segment; performing voice recognition on the first voice frame segment to obtain a first recognition result; in response to the first recognition result not matching a target content and a plurality of tokens in the first recognition result meeting a preset condition, performing frame compensation on the first voice frame segment to obtain a second voice frame segment; and performing voice recognition on the second voice frame segment to obtain a second recognition result. A matching degree between the second recognition result and the target content is greater than a matching degree between the first recognition result and the target content.
    Type: Grant
    Filed: May 17, 2021
    Date of Patent: June 18, 2024
    Assignee: BEIJING XIAOMI MOBILE SOFTWARE CO., LTD.
    Inventor: Xiangyan Xu
  • Patent number: 12001797
    Abstract: A method and system for automatic topic detection in text may include receiving a text document of a corpus of documents and extracting one or more phrases from the document, based on one or more syntactic patterns. For each phrase, embodiments of the invention may: apply a word embedding neural network on one or more words of the phrase, to obtain one or more respective word embedding vectors; calculate a weighted phrase embedding vector, and compute a phrase saliency score, based on the weighted phrase embedding vector. Embodiments of the invention may subsequently produce one or more topic labels, representing one or more respective topics in the document, based on the computed phrase saliency scores, and may select one or more topic labels according to their relevance to the business domain of the corpus.
    Type: Grant
    Filed: May 12, 2021
    Date of Patent: June 4, 2024
    Inventors: Eyal Orbach, Avraham Faizakof, Arnon Mazza, Lev Haikin
  • Patent number: 11983488
    Abstract: Disclosed herein are methods, systems, and computer-readable media for automatically generating and editing text. In an embodiment, a method may include receiving an input text prompt and receiving one or more user instructions. The method may also include accessing a language model based on the input text prompt and the one or more user instructions. The method may also include outputting, using the accessed language model, language model output text. The method may also include editing the input text prompt based on the language model and the one or more user instructions by replacing at least a portion of the input text prompt with the language model output text.
    Type: Grant
    Filed: March 14, 2023
    Date of Patent: May 14, 2024
    Assignee: OpenAI OpCo, LLC
    Inventors: Raul Puri, Qiming Yuan, Alexander Paino, Nikolas Tezak, Nicholas Ryder
  • Patent number: 11972219
    Abstract: This application discloses an intent recognition optimization processing method, apparatus, device and storage medium, and relates to the field of internet technology. The implementation scheme of specific method includes: acquiring a first intent set and at least one original corpus; acquiring a first recognition result of each original corpus, where the first recognition result of any one of the original corpus includes a first intent corresponding to the original corpus recognized by the intent recognition model; acquiring a second recognition result of each original corpus, where the second recognition result of any one of the original corpus includes a second intent corresponding to the original corpus obtained through artificial recognition; and performing optimization processing on the first intent set to obtain a second intent set according to the first recognition result and the second recognition result of each original corpus.
    Type: Grant
    Filed: December 30, 2020
    Date of Patent: April 30, 2024
    Assignees: Beijing Baidu Netcom Science Technology Co., Ltd., BAIDU USA LLC
    Inventors: Zeyu Ning, Xuchen Yao, Wenhao Fang, Bo Fu, Liqin Feng, Xiaomei Chu
  • Patent number: 11967327
    Abstract: A method and a decoder device of generating a concealment audio subframe of an audio signal are provided. The method comprises generating frequency spectra on a subframe basis where consecutive subframes of the audio signal have a property that an applied window shape of first subframe of the consecutive subframes is a mirrored version or a time reversed version of a second subframe of the consecutive subframes. Peaks of a signal spectrum of a previously received audio signal are detected for a concealment subframe, and a phase of each of the peaks is estimated. A time reversed phase adjustment is derived based on the estimated phase and applied to the peaks of the signal spectrum to form time reversed phase adjusted peaks.
    Type: Grant
    Filed: June 4, 2020
    Date of Patent: April 23, 2024
    Assignee: Telefonaktiebolaget LM Ericsson (publ)
    Inventors: Erik Norvell, Chamran Moradi Ashour
  • Patent number: 11961523
    Abstract: Systems and methods are provided for optimizing and securing an enterprise voice service accessed by an external voice assistant device. An enterprise voice assistant installed on a client device acts as an enterprise voice service for an external voice assistant device. The enterprise voice assistant receives a voice query from the external voice assistant device. The voice query is processed using a machine learning model to extract an intent and at least one slot. The extracted intent and at least one slot are used to determine whether a response to the voice query can be generated using local enterprise data that was previously received and stored by the client device from a management server. The response is generated based on the determination by using the local enterprise data or by sending the extracted intent and at least one slot to and receiving the response from the management server.
    Type: Grant
    Filed: September 9, 2020
    Date of Patent: April 16, 2024
    Assignee: VMware, Inc.
    Inventors: Suman Aluvala, Ramani Panchapakesan, Rohit Pradeep Shetty, Arjun Kochhar
  • Patent number: 11948566
    Abstract: The present disclosure describes systems and methods for extensible search, content, and dialog management. Embodiments of the present disclosure provide a dialog system with a trained intent recognition model (e.g., a deep learning model) to receive and understand a natural language query from a user. In cases where intent is not identified for a received query, the dialog system generates one or more candidate responses that may be refined (e.g., using human-in-the-loop curation) to generate a response. The intent recognition model may be updated (e.g., retrained) the accordingly. Upon receiving a subsequent query with similar intent, the dialog system may identify the intent using the updated intent recognition model.
    Type: Grant
    Filed: March 24, 2021
    Date of Patent: April 2, 2024
    Assignee: ADOBE INC.
    Inventors: Oliver Brdiczka, Kyoung Tak Kim, Charat Maheshwari
  • Patent number: 11947925
    Abstract: A user input in a source language is received. A set of contextual data is received. The user input is encoded into a user input feature vector. The set of contextual data is encoded into a context feature vector. The user input feature vector and the context feature vector are used to generate a fusion vector. An adaptive neural network is trained to identify a second context feature vector, based on the fusion vector. A second user input in the source language is received for translation into a target language. The adaptive neural network is used to determine, based on the second context feature vector, a second user input feature vector. The second user input feature vector is decoded, based on the source language and the target language, into a target language output. A user is notified of the target language output.
    Type: Grant
    Filed: May 21, 2020
    Date of Patent: April 2, 2024
    Assignee: International Business Machines Corporation
    Inventors: Lei Mei, Kun Yan Yin, Yan Hu, Qi Ruan, Yan Feng Han
  • Patent number: 11948561
    Abstract: A signal processing method to determine whether or not a detected key-phrase is spoken by a wearer of a headphone. The method receives an accelerometer signal from an accelerometer in a headphone and receives a microphone signal from at least one microphone in the headphone. The method detects a key-phrase using the microphone signal and generates a voice activity detection (VAD) signal based on the accelerometer signal. The method determines whether the VAD signal indicates that the detected key-phrase is spoken by a wearer of the headphone. Responsive to determining that the VAD signal indicates that the detected key-phrase is spoken by the wearer of the headphone, triggering a virtual personal assistant (VPA).
    Type: Grant
    Filed: October 28, 2019
    Date of Patent: April 2, 2024
    Assignee: Apple Inc.
    Inventors: Sorin V. Dusan, Sungyub D. Yoo, Dubravko Biruski
  • Patent number: 11942070
    Abstract: A method, computer system, and a computer program product for speech synthesis is provided. The present invention may include generating one or more final voiceprints. The present invention may include generating one or more voice clones based on the one or more final voiceprints. The present invention may include classifying the one or more voice clones into a grouping using a language model, wherein the language model is trained using manually classified uncloned voice samples. The present invention may include identifying a cluster within the grouping, wherein the cluster is identified by determining a difference between corresponding vectors of the one or more voice clones below a similarity threshold. The present invention may include generating a new archetypal voice by blending the one or more voice clones of the cluster where the difference between the corresponding vectors is below the similarity threshold.
    Type: Grant
    Filed: January 29, 2021
    Date of Patent: March 26, 2024
    Assignee: International Business Machines Corporation
    Inventors: Aaron K. Baughman, Gray Franklin Cannon, Sara Perelman, Gary William Reiss, Corey B. Shelton
  • Patent number: 11935517
    Abstract: A speech decoding method is performed by a computer device, the speech including a current audio frame and a previous audio frame. The method includes: obtaining a target token corresponding to a smallest decoding score from a first token list including first tokens obtained by decoding the previous audio frame, each first token including a state pair and a decoding score, the state pair being used for characterizing a correspondence between a first state of the first token in a first decoding network corresponding to a low-order language model and a second state of the first token in a second decoding network corresponding to a differential language model; determining pruning parameters according to the target token and an acoustic vector of the current audio frame when the current audio frame is decoded; and decoding the current audio frame according to the first token list, the pruning parameters, and the acoustic vector.
    Type: Grant
    Filed: March 3, 2021
    Date of Patent: March 19, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Yiheng Huang, Xiaozheng Jian, Liqiang He
  • Patent number: 11935523
    Abstract: There is provided automatic detection of pronunciation errors in spoken words utilizing a neural network model that is trained for a target phoneme. The target phoneme may be a phoneme in English language. The pronunciation errors may be detected in English words.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: March 19, 2024
    Assignee: Master English Oy
    Inventor: Aleksandr Diment
  • Patent number: 11928440
    Abstract: Systems and methods for handling multilingual queries are provided. One example method includes receiving, at a computing device, an input, wherein the input comprises a multi-lingual query comprising at least a first source language and a second source language. The multi-lingual query is translated, word for word, into a destination language to produce a monolingual query, with the word order of the multilingual query and the word order of the monolingual query being the same. The monolingual query is processed using natural language processing to map the mono-lingual query to a natural language query in the destination language.
    Type: Grant
    Filed: August 25, 2020
    Date of Patent: March 12, 2024
    Assignee: Rovi Guides, Inc.
    Inventors: Ajay Kumar Mishra, Jeffry Copps Robert Jose
  • Patent number: 11929089
    Abstract: An apparatus for processing a multichannel audio signal has a plurality of channel signals. The apparatus performs a time scale modulation of the multichannel audio signal and has a phase adaptor and a separator. The phase adaptor provides a processed signal by modifying a phase of a signal based on a combination of the channel signals. The separator provides separated signals based on the processed signal. A corresponding method is provided.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: March 12, 2024
    Assignee: FRAUNHOFER-GESELLSCHAFT ZUR FÖRDERUNG DER ANGEWANDTEN FORSCHUNG E.V.
    Inventors: Christian Uhle, Michael Kratz, Paul Klose, Timothy Leonard, André Luvizotto, Sebastian Scharrer
  • Patent number: 11922141
    Abstract: Systems and methods are disclosed for a voice/chatbot building system. The voice/chatbot builder may involve receiving an identified intent, receiving a task related to the identified intent, and receiving a response related to both the identified intent and the task. The identified intent, task, and response may form a first conversation. The first conversation may be linked to other conversations to establish contextual relationships among conversations and determine conversation priority. Voice/chatbot building may also train natural language processing machine learning algorithms.
    Type: Grant
    Filed: January 29, 2021
    Date of Patent: March 5, 2024
    Assignee: Walmart Apollo, LLC
    Inventors: John Brian Moss, Don Bambico, Jason Charles Benesch, Snehasish Mukherjee