Patents Examined by Richemond Dorvil
-
Patent number: 12027162
Abstract: Teacher-student learning can be used to train a keyword spotting (KWS) model using augmented training instance(s). Various implementations include aggressively augmenting (e.g., using spectral augmentation) base audio data to generate augmented audio data, where one or more portions of the base instance of audio data can be masked in the augmented instance of audio data (e.g., one or more time frames can be masked, one or more frequencies can be masked, etc.). Many implementations include processing augmented audio data using a KWS teacher model to generate a soft label, and processing the augmented audio data using a KWS student model to generate predicted output. One or more portions of the KWS student model can be updated based on a comparison of the soft label and the generated predicted output.
Type: Grant
Filed: March 3, 2021
Date of Patent: July 2, 2024
Assignee: GOOGLE LLC
Inventors: Hyun Jin Park, Pai Zhu, Ignacio Lopez Moreno, Niranjan Subrahmanya
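The augmentation-and-distillation loop this abstract describes can be sketched as below. The function names, the uniform masking rule, and the cross-entropy distillation loss are illustrative assumptions, not the patented implementation:

```python
import math
import random

def spec_augment(features, n_time_masks=1, n_freq_masks=1, max_width=8, seed=0):
    """Aggressively augment a (time x freq) feature matrix by zeroing out
    random runs of time frames and random frequency bands."""
    rng = random.Random(seed)
    aug = [row[:] for row in features]
    t, f = len(aug), len(aug[0])
    for _ in range(n_time_masks):
        width = rng.randint(1, max_width)
        start = rng.randrange(max(1, t - width))
        for i in range(start, min(t, start + width)):
            aug[i] = [0.0] * f                      # mask whole time frames
    for _ in range(n_freq_masks):
        width = rng.randint(1, max_width)
        start = rng.randrange(max(1, f - width))
        for row in aug:
            for j in range(start, min(f, start + width)):
                row[j] = 0.0                        # mask a frequency band
    return aug

def distillation_loss(teacher_soft_label, student_probs, eps=1e-12):
    """Cross-entropy of the student's predicted distribution against the
    teacher's soft label; the student would be updated to reduce this."""
    return -sum(t * math.log(p + eps)
                for t, p in zip(teacher_soft_label, student_probs))

base = [[1.0] * 40 for _ in range(100)]     # 100 frames x 40 mel bins
augmented = spec_augment(base)
loss = distillation_loss([0.9, 0.1], [0.7, 0.3])
```

The key point is that the same augmented instance is fed to both models, so the student learns to match the teacher's soft label even when parts of the input are masked.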
-
Patent number: 12019999
Abstract: Implementations relate to determining a well-formed phrase to suggest to a user to submit in lieu of a phrase that is not well-formed. The suggestion is rendered via an interface that is provided to a client device of the user. Those implementations relate to determining that a phrase is not well-formed, identifying alternate phrases that are related to the not well-formed phrase, and scoring the alternate phrases to select one or more of them to render via the interface. Some of those implementations identify that the phrase is not well-formed based on occurrences of the phrase in documents whose creators' primary language is the language of the phrase.
Type: Grant
Filed: June 18, 2021
Date of Patent: June 25, 2024
Assignee: GOOGLE LLC
Inventors: Wangqing Yuan, David Kogan, Vincent Lacey, Guanglei Wang, Shaun Post, Bryan Christopher Horling, Michael Anthony Schuler
-
Patent number: 12019987
Abstract: Systems, apparatuses, methods, and computer program products are disclosed for distillation of a natural language processing (NLP) model. An example method includes receiving, by communications circuitry, a set of text data comprising a set of observations and predicting, by processing circuitry and using the NLP model, classifications for each observation in the text data. The example method further includes generating, by a model training engine, a balanced sampled data structure based on the predicted classifications for each observation in the text data and training, by the model training engine, a surrogate model using the balanced sampled data structure. The example method further includes identifying, by an interpreter and from the surrogate model, a set of most-influential tokens in the text data.
Type: Grant
Filed: April 28, 2021
Date of Patent: June 25, 2024
Assignee: Wells Fargo Bank, N.A.
Inventors: Ye Yu, Harsh Singhal, Wayne B. Shoumaker
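One plausible form of the "balanced sampled data structure" step is shown below; the downsample-to-rarest-class rule and all names are assumptions for illustration, not the claimed method:

```python
import random
from collections import defaultdict

def balanced_sample(observations, predicted_labels, seed=0):
    """Downsample each predicted class to the size of the rarest class so a
    surrogate model can be trained on a class-balanced data structure."""
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for obs, label in zip(observations, predicted_labels):
        by_label[label].append(obs)
    n = min(len(items) for items in by_label.values())
    sampled = []
    for label, items in sorted(by_label.items()):
        for obs in rng.sample(items, n):            # equal count per class
            sampled.append((obs, label))
    return sampled

texts = ["good", "bad", "fine", "awful", "great", "poor", "ok"]
labels = ["pos", "neg", "pos", "neg", "pos", "neg", "pos"]
balanced = balanced_sample(texts, labels)
```

A surrogate (e.g., a simple linear model) trained on such a balanced set avoids skew from majority classes when the interpreter later ranks influential tokens.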
-
Patent number: 12020683
Abstract: A real-time name mispronunciation detection feature can give a speaker instant feedback any time they mispronounce another person's name in an online meeting. The feature can receive audio input of a speaker and obtain a transcript of the audio input; identify a name from the text of the transcript based on the names of meeting participants; and extract a portion of the audio input corresponding to the name identified from the text of the transcript. The feature can obtain a reference pronunciation for the name using a user identifier associated with the name, and can obtain a pronunciation score for the name based on a comparison between the reference pronunciation for the name and the portion of the audio input corresponding to the name. The feature can then determine whether the pronunciation score is below a threshold and, in response, notify the speaker of a pronunciation error.
Type: Grant
Filed: October 28, 2021
Date of Patent: June 25, 2024
Assignee: Microsoft Technology Licensing, LLC
Inventors: Tapan Bohra, Akshay Mallipeddi, Amit Srivastava, Ana Karen Parra
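A minimal sketch of the scoring-and-threshold step, assuming the comparison is an edit distance over phoneme sequences (the patent does not specify the comparison; the phoneme strings and threshold here are hypothetical):

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    dp = list(range(len(b) + 1))
    for i, x in enumerate(a, 1):
        prev, dp[0] = dp[0], i
        for j, y in enumerate(b, 1):
            prev, dp[j] = dp[j], min(dp[j] + 1, dp[j - 1] + 1,
                                     prev + (x != y))
    return dp[-1]

def pronunciation_score(reference_phonemes, spoken_phonemes):
    """1.0 for a perfect match; lower as the spoken phonemes diverge
    from the reference pronunciation."""
    dist = edit_distance(reference_phonemes, spoken_phonemes)
    return 1.0 - dist / max(len(reference_phonemes), len(spoken_phonemes), 1)

THRESHOLD = 0.8                       # illustrative cutoff
score = pronunciation_score(["R", "IH", "SH", "M", "OH", "N"],
                            ["R", "IH", "CH", "M", "AH", "N"])
notify_speaker = score < THRESHOLD    # two substituted phonemes -> notify
```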
-
Patent number: 12014143
Abstract: In various embodiments, a phrase grounding model automatically performs phrase grounding for a source sentence and a source image. The phrase grounding model determines that a first phrase included in the source sentence matches a first region of the source image based on the first phrase and at least a second phrase included in the source sentence. The phrase grounding model then generates a matched pair that specifies the first phrase and the first region. Subsequently, one or more annotation operations are performed on the source image based on the matched pair. Advantageously, the accuracy of the phrase grounding model is increased relative to prior-art solutions, where the interrelationships between phrases are typically disregarded.
Type: Grant
Filed: February 25, 2019
Date of Patent: June 18, 2024
Assignees: DISNEY ENTERPRISES, INC.; ETH Zürich (Eidgenössische Technische Hochschule Zürich)
Inventors: Pelin Dogan, Leonid Sigal, Markus Gross
-
Patent number: 12014730
Abstract: A voice processing method includes: collecting a voice signal by a microphone of an electronic device, and signal-processing the collected voice signal to obtain a first voice frame segment; performing voice recognition on the first voice frame segment to obtain a first recognition result; in response to the first recognition result not matching a target content and a plurality of tokens in the first recognition result meeting a preset condition, performing frame compensation on the first voice frame segment to obtain a second voice frame segment; and performing voice recognition on the second voice frame segment to obtain a second recognition result. A matching degree between the second recognition result and the target content is greater than a matching degree between the first recognition result and the target content.
Type: Grant
Filed: May 17, 2021
Date of Patent: June 18, 2024
Assignee: BEIJING XIAOMI MOBILE SOFTWARE CO., LTD.
Inventor: Xiangyan Xu
-
Patent number: 12001797
Abstract: A method and system for automatic topic detection in text may include receiving a text document of a corpus of documents and extracting one or more phrases from the document, based on one or more syntactic patterns. For each phrase, embodiments of the invention may: apply a word embedding neural network on one or more words of the phrase, to obtain one or more respective word embedding vectors; calculate a weighted phrase embedding vector; and compute a phrase saliency score, based on the weighted phrase embedding vector. Embodiments of the invention may subsequently produce one or more topic labels, representing one or more respective topics in the document, based on the computed phrase saliency scores, and may select one or more topic labels according to their relevance to the business domain of the corpus.
Type: Grant
Filed: May 12, 2021
Date of Patent: June 4, 2024
Inventors: Eyal Orbach, Avraham Faizakof, Arnon Mazza, Lev Haikin
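The phrase-embedding and saliency-scoring steps might look like the sketch below, assuming uniform word weights and cosine similarity to a document centroid as the saliency measure (both assumptions; the toy two-dimensional "embeddings" are purely illustrative):

```python
import math

def mean_vector(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    return [sum(component) / len(vectors) for component in zip(*vectors)]

def cosine(u, v):
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def phrase_saliency(phrase, word_vectors, doc_centroid):
    """Embed a phrase as the (uniformly weighted) mean of its word vectors
    and score it by cosine similarity to the document centroid."""
    vecs = [word_vectors[w] for w in phrase.split() if w in word_vectors]
    if not vecs:
        return 0.0
    return cosine(mean_vector(vecs), doc_centroid)

word_vectors = {"topic": [1.0, 0.0], "detection": [0.9, 0.1],
                "banana": [0.0, 1.0]}
doc_centroid = mean_vector(list(word_vectors.values()))
scores = {p: phrase_saliency(p, word_vectors, doc_centroid)
          for p in ["topic detection", "banana"]}
```

A phrase aligned with the document's overall content scores higher than an off-topic one, which is the signal the topic labels are built from.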
-
Patent number: 11983488
Abstract: Disclosed herein are methods, systems, and computer-readable media for automatically generating and editing text. In an embodiment, a method may include receiving an input text prompt and receiving one or more user instructions. The method may also include accessing a language model based on the input text prompt and the one or more user instructions. The method may also include outputting, using the accessed language model, language model output text. The method may also include editing the input text prompt based on the language model and the one or more user instructions by replacing at least a portion of the input text prompt with the language model output text.
Type: Grant
Filed: March 14, 2023
Date of Patent: May 14, 2024
Assignee: OpenAI OpCo, LLC
Inventors: Raul Puri, Qiming Yuan, Alexander Paino, Nikolas Tezak, Nicholas Ryder
-
Patent number: 11972219
Abstract: This application discloses an intent recognition optimization processing method, apparatus, device, and storage medium, and relates to the field of internet technology. The specific implementation of the method includes: acquiring a first intent set and at least one original corpus; acquiring a first recognition result of each original corpus, where the first recognition result of any one of the original corpora includes a first intent corresponding to that corpus as recognized by the intent recognition model; acquiring a second recognition result of each original corpus, where the second recognition result of any one of the original corpora includes a second intent corresponding to that corpus obtained through artificial recognition; and performing optimization processing on the first intent set, according to the first recognition result and the second recognition result of each original corpus, to obtain a second intent set.
Type: Grant
Filed: December 30, 2020
Date of Patent: April 30, 2024
Assignees: Beijing Baidu Netcom Science Technology Co., Ltd.; BAIDU USA LLC
Inventors: Zeyu Ning, Xuchen Yao, Wenhao Fang, Bo Fu, Liqin Feng, Xiaomei Chu
-
Patent number: 11967327
Abstract: A method and a decoder device for generating a concealment audio subframe of an audio signal are provided. The method comprises generating frequency spectra on a subframe basis, where consecutive subframes of the audio signal have the property that the applied window shape of a first subframe of the consecutive subframes is a mirrored version or a time-reversed version of that of a second subframe of the consecutive subframes. Peaks of a signal spectrum of a previously received audio signal are detected for a concealment subframe, and a phase of each of the peaks is estimated. A time-reversed phase adjustment is derived based on the estimated phase and applied to the peaks of the signal spectrum to form time-reversed phase-adjusted peaks.
Type: Grant
Filed: June 4, 2020
Date of Patent: April 23, 2024
Assignee: Telefonaktiebolaget LM Ericsson (publ)
Inventors: Erik Norvell, Chamran Moradi Ashour
-
Patent number: 11961523
Abstract: Systems and methods are provided for optimizing and securing an enterprise voice service accessed by an external voice assistant device. An enterprise voice assistant installed on a client device acts as an enterprise voice service for an external voice assistant device. The enterprise voice assistant receives a voice query from the external voice assistant device. The voice query is processed using a machine learning model to extract an intent and at least one slot. The extracted intent and at least one slot are used to determine whether a response to the voice query can be generated using local enterprise data that was previously received and stored by the client device from a management server. The response is generated based on the determination by using the local enterprise data or by sending the extracted intent and at least one slot to, and receiving the response from, the management server.
Type: Grant
Filed: September 9, 2020
Date of Patent: April 16, 2024
Assignee: VMware, Inc.
Inventors: Suman Aluvala, Ramani Panchapakesan, Rohit Pradeep Shetty, Arjun Kochhar
-
Patent number: 11948566
Abstract: The present disclosure describes systems and methods for extensible search, content, and dialog management. Embodiments of the present disclosure provide a dialog system with a trained intent recognition model (e.g., a deep learning model) to receive and understand a natural language query from a user. In cases where intent is not identified for a received query, the dialog system generates one or more candidate responses that may be refined (e.g., using human-in-the-loop curation) to generate a response. The intent recognition model may be updated (e.g., retrained) accordingly. Upon receiving a subsequent query with similar intent, the dialog system may identify the intent using the updated intent recognition model.
Type: Grant
Filed: March 24, 2021
Date of Patent: April 2, 2024
Assignee: ADOBE INC.
Inventors: Oliver Brdiczka, Kyoung Tak Kim, Charat Maheshwari
-
Patent number: 11947925
Abstract: A user input in a source language is received. A set of contextual data is received. The user input is encoded into a user input feature vector. The set of contextual data is encoded into a context feature vector. The user input feature vector and the context feature vector are used to generate a fusion vector. An adaptive neural network is trained to identify a second context feature vector, based on the fusion vector. A second user input in the source language is received for translation into a target language. The adaptive neural network is used to determine, based on the second context feature vector, a second user input feature vector. The second user input feature vector is decoded, based on the source language and the target language, into a target language output. A user is notified of the target language output.
Type: Grant
Filed: May 21, 2020
Date of Patent: April 2, 2024
Assignee: International Business Machines Corporation
Inventors: Lei Mei, Kun Yan Yin, Yan Hu, Qi Ruan, Yan Feng Han
-
Patent number: 11948561
Abstract: A signal processing method determines whether or not a detected key-phrase is spoken by a wearer of a headphone. The method receives an accelerometer signal from an accelerometer in the headphone and receives a microphone signal from at least one microphone in the headphone. The method detects a key-phrase using the microphone signal and generates a voice activity detection (VAD) signal based on the accelerometer signal. The method determines whether the VAD signal indicates that the detected key-phrase is spoken by the wearer of the headphone. Responsive to determining that the VAD signal indicates that the detected key-phrase is spoken by the wearer of the headphone, the method triggers a virtual personal assistant (VPA).
Type: Grant
Filed: October 28, 2019
Date of Patent: April 2, 2024
Assignee: Apple Inc.
Inventors: Sorin V. Dusan, Sungyub D. Yoo, Dubravko Biruski
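The gating logic can be sketched as follows. The energy-threshold VAD and all thresholds are illustrative assumptions; the idea is simply that a headphone accelerometer picks up vibration only when the wearer themself is speaking:

```python
def vad_from_accelerometer(accel_frames, energy_threshold=0.05):
    """Flag frames whose mean-square accelerometer energy exceeds a
    threshold as wearer voice activity (the wearer's speech vibrates
    the headphone; a nearby talker's does not)."""
    return [sum(s * s for s in frame) / len(frame) > energy_threshold
            for frame in accel_frames]

def should_trigger_vpa(key_phrase_detected, vad_flags, min_active_frames=2):
    """Trigger the assistant only when the microphone detected the
    key-phrase AND the accelerometer VAD says the wearer was speaking."""
    return key_phrase_detected and sum(vad_flags) >= min_active_frames

quiet = [[0.01, -0.02, 0.01]] * 4       # wearer silent, key-phrase from TV
speaking = [[0.4, -0.5, 0.3]] * 4       # wearer actually talking
trigger_silent = should_trigger_vpa(True, vad_from_accelerometer(quiet))
trigger_talking = should_trigger_vpa(True, vad_from_accelerometer(speaking))
```

This is why the combination rejects false triggers: the key-phrase alone is not enough unless the VAD agrees.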
-
Patent number: 11942070
Abstract: A method, computer system, and computer program product for speech synthesis is provided. The present invention may include generating one or more final voiceprints. The present invention may include generating one or more voice clones based on the one or more final voiceprints. The present invention may include classifying the one or more voice clones into a grouping using a language model, wherein the language model is trained using manually classified uncloned voice samples. The present invention may include identifying a cluster within the grouping, wherein the cluster is identified by determining a difference between corresponding vectors of the one or more voice clones below a similarity threshold. The present invention may include generating a new archetypal voice by blending the one or more voice clones of the cluster where the difference between the corresponding vectors is below the similarity threshold.
Type: Grant
Filed: January 29, 2021
Date of Patent: March 26, 2024
Assignee: International Business Machines Corporation
Inventors: Aaron K. Baughman, Gray Franklin Cannon, Sara Perelman, Gary William Reiss, Corey B. Shelton
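The cluster-then-blend step might be sketched as below, assuming Euclidean distance between voiceprint vectors, a greedy grouping rule, and averaging as the blend; all three are illustrative choices, not the claimed method:

```python
def vector_distance(u, v):
    """Euclidean distance between two voiceprint vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5

def cluster_and_blend(voiceprints, similarity_threshold=0.5):
    """Greedily group voiceprint vectors whose pairwise distances are all
    below the threshold, then blend each cluster into one archetypal
    voice by averaging its members component-wise."""
    clusters = []
    for vp in voiceprints:
        for cluster in clusters:
            if all(vector_distance(vp, member) < similarity_threshold
                   for member in cluster):
                cluster.append(vp)
                break
        else:
            clusters.append([vp])          # no close cluster: start a new one
    return [[sum(component) / len(cluster) for component in zip(*cluster)]
            for cluster in clusters]

clones = [[0.0, 0.0], [0.1, 0.1], [5.0, 5.0]]
archetypes = cluster_and_blend(clones)
```

The first two clones sit within the threshold of each other and blend into one archetype; the distant third remains its own.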
-
Patent number: 11935517
Abstract: A speech decoding method is performed by a computer device, the speech including a current audio frame and a previous audio frame. The method includes: obtaining a target token corresponding to a smallest decoding score from a first token list including first tokens obtained by decoding the previous audio frame, each first token including a state pair and a decoding score, the state pair being used for characterizing a correspondence between a first state of the first token in a first decoding network corresponding to a low-order language model and a second state of the first token in a second decoding network corresponding to a differential language model; determining pruning parameters according to the target token and an acoustic vector of the current audio frame when the current audio frame is decoded; and decoding the current audio frame according to the first token list, the pruning parameters, and the acoustic vector.
Type: Grant
Filed: March 3, 2021
Date of Patent: March 19, 2024
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Yiheng Huang, Xiaozheng Jian, Liqiang He
-
Patent number: 11935523
Abstract: There is provided automatic detection of pronunciation errors in spoken words utilizing a neural network model that is trained for a target phoneme. The target phoneme may be a phoneme in the English language. The pronunciation errors may be detected in English words.
Type: Grant
Filed: November 15, 2019
Date of Patent: March 19, 2024
Assignee: Master English Oy
Inventor: Aleksandr Diment
-
Patent number: 11928440
Abstract: Systems and methods for handling multilingual queries are provided. One example method includes receiving, at a computing device, an input, wherein the input comprises a multilingual query comprising at least a first source language and a second source language. The multilingual query is translated, word for word, into a destination language to produce a monolingual query, with the word order of the multilingual query and the word order of the monolingual query being the same. The monolingual query is processed using natural language processing to map the monolingual query to a natural language query in the destination language.
Type: Grant
Filed: August 25, 2020
Date of Patent: March 12, 2024
Assignee: Rovi Guides, Inc.
Inventors: Ajay Kumar Mishra, Jeffry Copps Robert Jose
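The order-preserving word-for-word step can be sketched as below. The tiny bilingual lexicons are hypothetical stand-ins for real dictionaries, and passing through unknown words is an assumption:

```python
# Hypothetical bilingual lexicons; a real system would use full dictionaries.
ES_TO_EN = {"películas": "movies", "de": "of", "acción": "action"}
FR_TO_EN = {"récents": "recent", "films": "movies"}

def translate_word(word, lexicons):
    """Look a word up in each source-language lexicon in turn; words already
    in the destination language (or unknown) pass through unchanged."""
    for lexicon in lexicons:
        if word in lexicon:
            return lexicon[word]
    return word

def to_monolingual(query, lexicons):
    """Translate a mixed-language query word for word, keeping the original
    word order, so NLP can map it to a destination-language query."""
    return " ".join(translate_word(w, lexicons) for w in query.split())

mono = to_monolingual("show películas de acción récents", [ES_TO_EN, FR_TO_EN])
```

The resulting monolingual string ("show movies of action recent") is ungrammatical but order-preserving, which is exactly why the downstream NLP step is needed to map it to a well-formed destination-language query.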
-
Patent number: 11929089
Abstract: An apparatus for processing a multichannel audio signal has a plurality of channel signals. The apparatus performs a time scale modulation of the multichannel audio signal and has a phase adaptor and a separator. The phase adaptor provides a processed signal by modifying a phase of a signal based on a combination of the channel signals. The separator provides separated signals based on the processed signal. A corresponding method is provided.
Type: Grant
Filed: October 31, 2018
Date of Patent: March 12, 2024
Assignee: FRAUNHOFER-GESELLSCHAFT ZUR FÖRDERUNG DER ANGEWANDTEN FORSCHUNG E.V.
Inventors: Christian Uhle, Michael Kratz, Paul Klose, Timothy Leonard, André Luvizotto, Sebastian Scharrer
-
Patent number: 11922141
Abstract: Systems and methods are disclosed for a voice/chatbot building system. The voice/chatbot builder may involve receiving an identified intent, receiving a task related to the identified intent, and receiving a response related to both the identified intent and the task. The identified intent, task, and response may form a first conversation. The first conversation may be linked to other conversations to establish contextual relationships among conversations and determine conversation priority. Voice/chatbot building may also train natural language processing machine learning algorithms.
Type: Grant
Filed: January 29, 2021
Date of Patent: March 5, 2024
Assignee: Walmart Apollo, LLC
Inventors: John Brian Moss, Don Bambico, Jason Charles Benesch, Snehasish Mukherjee
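The intent/task/response triple and the conversation links could be modeled as below; the class shape, the priority field, and all intent names are assumptions for illustration:

```python
from dataclasses import dataclass, field

@dataclass
class Conversation:
    """One bot conversation: an identified intent, a task for that intent,
    and a response. Links to other conversations establish contextual
    relationships and an ordering by priority."""
    intent: str
    task: str
    response: str
    linked: list = field(default_factory=list)

    def link(self, other, priority=0):
        """Attach another conversation; lower priority values come first."""
        self.linked.append((priority, other))
        self.linked.sort(key=lambda pair: pair[0])

order = Conversation("place_order", "collect_items",
                     "What would you like to buy?")
track = Conversation("track_order", "lookup_status",
                     "Your order is on the way.")
cancel = Conversation("cancel_order", "confirm_cancel", "Are you sure?")
order.link(track, priority=1)
order.link(cancel, priority=0)
next_conversation = order.linked[0][1]   # highest-priority linked conversation
```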