Patents Examined by Alexander G Marlow
-
Patent number: 12333268
Abstract: Systems and methods for handling multilingual queries are provided. One example method includes receiving, at a computing device, an input, wherein the input comprises a multi-lingual query comprising at least a first source language and a second source language. The multi-lingual query is translated, word for word, into a destination language to produce a monolingual query, with the word order of the multilingual query and the word order of the monolingual query being the same. The monolingual query is processed using natural language processing to map the mono-lingual query to a natural language query in the destination language.
Type: Grant
Filed: February 8, 2024
Date of Patent: June 17, 2025
Assignee: ADEIA GUIDES INC.
Inventors: Ajay Kumar Mishra, Jeffry Copps Robert Jose
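A minimal sketch of the word-for-word translation step described in this abstract, assuming a hypothetical per-language lookup table and word-level language tagger; the patent does not specify how the translation itself is performed.

```python
# Minimal sketch of word-for-word translation of a mixed-language query into a
# single destination language, preserving the original word order. The lookup
# tables and language tagger below are hypothetical placeholders.

# Hypothetical per-language dictionaries mapping source words to English.
LEXICONS = {
    "es": {"peliculas": "movies", "de": "of"},
    "fr": {"avec": "with"},
}

def detect_language(word: str) -> str:
    """Hypothetical word-level language tagger: return the first lexicon containing the word."""
    for lang, lexicon in LEXICONS.items():
        if word.lower() in lexicon:
            return lang
    return "en"  # assume destination language if the word is not found

def translate_word_for_word(query: str) -> str:
    """Translate each word independently; the output keeps the input word order."""
    translated = []
    for word in query.split():
        lang = detect_language(word)
        translated.append(LEXICONS.get(lang, {}).get(word.lower(), word))
    return " ".join(translated)

# Example: a Spanish/English mixed query becomes a monolingual English query
# with identical word order, ready for ordinary natural language processing.
print(translate_word_for_word("show peliculas de Tom Hanks"))
# -> "show movies of Tom Hanks"
```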
-
Patent number: 12335328
Abstract: This disclosure provides a network call method and apparatus, a computer device, and a storage medium, and belongs to the field of audio data processing. The method includes: performing time-frequency transformation on an acquired audio signal, to obtain a plurality of pieces of frequency domain information of the audio signal; determining a target bit rate corresponding to the audio signal according to the plurality of pieces of frequency domain information; and encoding the audio signal based on the target bit rate, and performing a network call based on the encoded audio signal.
Type: Grant
Filed: October 21, 2021
Date of Patent: June 17, 2025
Assignee: Tencent Technology (Shenzhen) Company Limited
Inventor: Junbin Liang
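A rough sketch of how frequency-domain information could drive a bit-rate choice, assuming NumPy, a simple short-time spectrum as the time-frequency transform, and an invented bandwidth-to-bit-rate table; the patent's actual decision rule is not stated in the abstract.

```python
# Sketch: derive a target bit rate for an audio call from frequency-domain
# information. The bandwidth estimate and the bit-rate table below are
# illustrative assumptions, not the patent's rules.
import numpy as np

def effective_bandwidth(signal: np.ndarray, sample_rate: int, frame_len: int = 512) -> float:
    """Estimate the highest frequency that carries most of the signal energy."""
    # Short-time magnitude spectra (a simple time-frequency transform).
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    spectra = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))
    mean_spectrum = spectra.mean(axis=0)
    # Frequency below which 99% of the spectral energy lies.
    cumulative = np.cumsum(mean_spectrum ** 2)
    cutoff_bin = np.searchsorted(cumulative, 0.99 * cumulative[-1])
    return cutoff_bin * sample_rate / frame_len

def choose_bit_rate(signal: np.ndarray, sample_rate: int) -> int:
    """Map estimated bandwidth to an encoder bit rate (values are illustrative)."""
    bw = effective_bandwidth(signal, sample_rate)
    if bw < 4000:     # narrowband speech
        return 12_000
    if bw < 8000:     # wideband speech
        return 24_000
    return 48_000     # full-band / music-like content

# Example with a synthetic 1 kHz tone sampled at 16 kHz.
t = np.arange(16000) / 16000
print(choose_bit_rate(np.sin(2 * np.pi * 1000 * t), 16000))  # -> 12000
```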
-
Patent number: 12327091
Abstract: A system for translating speech from at least two source languages into another target language provides direct speech-to-target-language translation. The target text is converted to speech in the target language through a TTS system. The system simplifies the speech recognition and translation process by providing direct translation, includes mechanisms described herein that facilitate mixed-language source speech translation, and punctuates output text streams in the target language. In some embodiments it also allows translation of speech into the target language to reflect the voice of the speaker of the source speech, based on characteristics of the source-language speech and the speaker's voice, and to produce subtitled data in the target language corresponding to the source speech. The system uses models trained using (i) encoder-decoder architectures with attention mechanisms and training data using TTS and (ii) parallel text training data in more than two different languages.
Type: Grant
Filed: January 13, 2020
Date of Patent: June 10, 2025
Assignee: Applications Technology (AppTek), LLC
Inventors: Evgeny Matusov, Jintao Jiang, Mudar Yaghi
-
Patent number: 12293150
Abstract: A registree management function receives member (user) registration, carries out a survey upon registration, performs category classification for the registered user, learns the classified categories, and the like. A comment analysis function performs text mining on comments acquired from an SNS posted comment server, determines post origin positions identified by the text mining and the level of credibility thereof, and executes evaluation and the like of a target relating to a theme. An information provision function edits a social heat map generated based on the results of analyzing the comments to be provided to the user, and also performs user category management and the like.
Type: Grant
Filed: June 17, 2019
Date of Patent: May 6, 2025
Assignee: Takenaka Corporation
Inventors: Kuniaki Andou, Rikuto Kunimoto, Takeshi Takai, Kazuo Ohtake
-
Patent number: 12288029
Abstract: Systems, apparatuses, methods, and computer program products are disclosed for distillation of a natural language processing (NLP) model. An example method includes receiving, by communications circuitry, a set of text data comprising a set of observations and predicting, by processing circuitry and using the NLP model, classifications for each observation in the text data. The example method further includes generating, by a model training engine, a balanced sampled data structure based on the predicted classifications for each observation in the text data and training, by the model training engine, a surrogate model using the balanced sampled data structure. The example method further includes identifying, by an interpreter and from the surrogate model, a set of most-influential tokens in the text data.
Type: Grant
Filed: May 14, 2024
Date of Patent: April 29, 2025
Assignee: Wells Fargo Bank, N.A.
Inventors: Ye Yu, Harsh Singhal, Wayne B. Shoumaker
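A hedged sketch of the distillation loop this abstract describes, with the NLP model being distilled stubbed out and scikit-learn assumed for the surrogate; the patent does not name a particular surrogate model or balancing scheme.

```python
# Sketch: label text with a (stubbed) teacher NLP model, balance the sample
# across predicted classes, train an interpretable surrogate, and read off the
# most influential tokens. The teacher, the data, and the choice of a
# logistic-regression surrogate are illustrative assumptions.
import random
from collections import defaultdict
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

def teacher_predict(text: str) -> str:
    """Stand-in for the full NLP model being distilled."""
    return "complaint" if "refund" in text or "charge" in text else "other"

def balanced_sample(texts, labels, seed=0):
    """Downsample every class to the size of the smallest class."""
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)
    n = min(len(v) for v in by_class.values())
    rng = random.Random(seed)
    sampled_texts, sampled_labels = [], []
    for label, items in by_class.items():
        for text in rng.sample(items, n):
            sampled_texts.append(text)
            sampled_labels.append(label)
    return sampled_texts, sampled_labels

texts = [
    "please refund the duplicate charge", "refund my order", "wrong charge on card",
    "great service", "love the new app", "how do I reset my password", "thanks a lot",
]
labels = [teacher_predict(t) for t in texts]
bal_texts, bal_labels = balanced_sample(texts, labels)

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(bal_texts)
surrogate = LogisticRegression().fit(X, bal_labels)

# Most influential tokens: largest absolute surrogate coefficients.
weights = surrogate.coef_[0]
tokens = vectorizer.get_feature_names_out()
top = sorted(zip(tokens, weights), key=lambda tw: abs(tw[1]), reverse=True)[:5]
print(top)
```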
-
Patent number: 12283271
Abstract: A method of providing voice feedback to a listener as part of a user interface of a media playback system may include: storing multiple different voice feedback recordings in at least one computer-readable storage device, where each of the multiple different voice feedback recordings is of a different voice artist; receiving a listener command corresponding to a musical selection; determining an identifying musical characteristic of the musical selection; selecting a first voice feedback recording from the multiple different voice feedback recordings, where the first voice feedback recording corresponds to the identifying musical characteristic; and playing the first voice feedback recording to the listener via the media playback system.
Type: Grant
Filed: May 18, 2021
Date of Patent: April 22, 2025
Assignee: Spotify AB
Inventor: Sten Garmark
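A small sketch of the selection step, assuming the identifying musical characteristic is a genre and that recordings are keyed by it; the genre table and file names are invented for illustration.

```python
# Sketch: pick the stored voice feedback recording whose voice artist matches
# an identifying characteristic (here, genre) of the selected music.

VOICE_FEEDBACK_RECORDINGS = {
    "hip-hop": "feedback_artist_a.ogg",
    "classical": "feedback_artist_b.ogg",
    "pop": "feedback_artist_c.ogg",
}
DEFAULT_RECORDING = "feedback_default.ogg"

TRACK_GENRES = {"Clair de Lune": "classical", "Lose Yourself": "hip-hop"}

def select_voice_feedback(track_title: str) -> str:
    """Return the stored recording that corresponds to the track's characteristic."""
    genre = TRACK_GENRES.get(track_title, "unknown")
    return VOICE_FEEDBACK_RECORDINGS.get(genre, DEFAULT_RECORDING)

print(select_voice_feedback("Clair de Lune"))  # -> feedback_artist_b.ogg
```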
-
Patent number: 12249315
Abstract: A method for training a non-autoregressive TTS model includes obtaining a sequence representation of an encoded text sequence concatenated with a variational embedding. The method also includes using a duration model network to predict a phoneme duration for each phoneme represented by the encoded text sequence. Based on the predicted phoneme durations, the method also includes learning an interval representation and an auxiliary attention context representation. The method also includes upsampling, using the interval representation and the auxiliary attention context representation, the sequence representation into an upsampled output specifying a number of frames. The method also includes generating, based on the upsampled output, one or more predicted mel-frequency spectrogram sequences for the encoded text sequence.
Type: Grant
Filed: October 31, 2023
Date of Patent: March 11, 2025
Assignee: Google LLC
Inventors: Isaac Elias, Byungha Chun, Jonathan Shen, Ye Jia, Yu Zhang, Yonghui Wu
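A simplified sketch of duration-based upsampling in a non-autoregressive TTS model, assuming NumPy; the patent's interval representation, auxiliary attention context, and variational embedding are omitted here, leaving only the basic repeat-by-duration idea.

```python
# Sketch: each encoded phoneme vector is repeated for its predicted number of
# frames, yielding a frame-rate sequence for the spectrogram decoder.
import numpy as np

def upsample_by_duration(encoded: np.ndarray, durations: np.ndarray) -> np.ndarray:
    """encoded: (num_phonemes, dim); durations: (num_phonemes,) integer frame counts."""
    return np.repeat(encoded, durations, axis=0)

encoded_text = np.random.randn(4, 8)          # 4 phonemes, 8-dim representations
predicted_durations = np.array([3, 5, 2, 4])  # frames per phoneme from the duration model

upsampled = upsample_by_duration(encoded_text, predicted_durations)
print(upsampled.shape)  # -> (14, 8): one row per output spectrogram frame
```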
-
Patent number: 12230258
Abstract: A method for contextual biasing for speech recognition includes obtaining a base automatic speech recognition (ASR) model trained on non-biased data and a sub-model trained on biased data representative of a particular domain. The method includes receiving a speech recognition request including audio data characterizing an utterance captured in streaming audio. The method further includes determining whether the speech recognition request includes a contextual indicator indicating the particular domain. When the speech recognition request does not include the contextual indicator, the method includes generating, using the base ASR model, a first speech recognition result of the utterance by processing the audio data.
Type: Grant
Filed: April 19, 2022
Date of Patent: February 18, 2025
Assignee: Google LLC
Inventors: Fadi Biadsy, Pedro J. Moreno Mengibar
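A minimal sketch of the routing decision described here, with both recognizers stubbed; the request format and domain names are assumptions, not the patent's interfaces.

```python
# Sketch: use the biased sub-model only when the speech recognition request
# carries a contextual indicator for its domain; otherwise use the base model.
from dataclasses import dataclass
from typing import Optional

@dataclass
class RecognitionRequest:
    audio_data: bytes
    contextual_indicator: Optional[str] = None  # e.g. "medical", or None

def base_asr(audio: bytes) -> str:
    return "transcript from the general-purpose model"

def biased_asr(audio: bytes, domain: str) -> str:
    return f"transcript from the {domain} sub-model"

def recognize(request: RecognitionRequest) -> str:
    if request.contextual_indicator is None:
        # No domain hint: the non-biased base model handles the utterance.
        return base_asr(request.audio_data)
    # Domain hint present: apply the sub-model trained on in-domain (biased) data.
    return biased_asr(request.audio_data, request.contextual_indicator)

print(recognize(RecognitionRequest(b"...")))
print(recognize(RecognitionRequest(b"...", contextual_indicator="medical")))
```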
-
Patent number: 12198718
Abstract: A method for determining synthetic speech includes receiving audio data, obtained by a user device, characterizing speech. The method also includes generating, using a trained self-supervised model, a plurality of audio feature vectors, each representative of audio features of a portion of the audio data. The method also includes generating, using a shallow discriminator model, a score indicating a presence of synthetic speech in the audio data based on the corresponding audio features of each audio feature vector of the plurality of audio feature vectors. The method also includes determining whether the score satisfies a synthetic speech detection threshold. When the score satisfies the synthetic speech detection threshold, the method includes determining that the speech in the audio data obtained by the user device comprises synthetic speech.
Type: Grant
Filed: August 9, 2023
Date of Patent: January 14, 2025
Assignee: Google LLC
Inventors: Joel Shor, Alanna Foster Slocum
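A hedged sketch of the scoring and thresholding steps, assuming NumPy; the feature vectors are random stand-ins for a self-supervised encoder's output, and the single linear layer, its weights, and the threshold are illustrative choices for a "shallow" discriminator.

```python
# Sketch: a shallow discriminator maps each audio feature vector to a logit,
# the per-chunk outputs are pooled into one score, and the score is compared
# against a synthetic speech detection threshold.
import numpy as np

rng = np.random.default_rng(0)
feature_vectors = rng.normal(size=(20, 64))   # 20 audio chunks, 64-dim features
disc_weights = rng.normal(size=64) * 0.1      # "shallow" discriminator: one linear layer
disc_bias = 0.0

def synthetic_speech_score(features: np.ndarray) -> float:
    """Mean sigmoid output over all feature vectors in the utterance."""
    logits = features @ disc_weights + disc_bias
    return float(np.mean(1.0 / (1.0 + np.exp(-logits))))

DETECTION_THRESHOLD = 0.5
score = synthetic_speech_score(feature_vectors)
is_synthetic = score >= DETECTION_THRESHOLD
print(f"score={score:.3f}, synthetic={is_synthetic}")
```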
-
Patent number: 12124798
Abstract: A method is disclosed for calculating similarity rates between electronic documents. The similarity rate is calculated based on a count of matching phrases between the electronic documents and distances between subsequent matching phrases in each of the electronic documents. A system is also disclosed for comparing the electronic documents to obtain their similarity rates. A computing device determines at least one first proximity parameter based on the number of matched words in a matching phrase and at least one second proximity parameter based on distances between the subsequent matching phrases in each of the electronic documents. The similarity rate is determined based on the first and second proximity parameters.
Type: Grant
Filed: August 30, 2021
Date of Patent: October 22, 2024
Assignee: KYOCERA DOCUMENT SOLUTIONS INC.
Inventor: Oleg Y. Zakharov
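A rough sketch of a phrase-based similarity rate in the spirit of this abstract: shared word n-grams give a first proximity term, and gaps between consecutive matches give a second. The combining formula and constants are illustrative; the abstract does not give the exact calculation.

```python
# Sketch: similarity of two documents from matching phrases and the distances
# between subsequent matches. Both proximity terms below are assumptions.

def matching_phrases(words_a, words_b, n=3):
    """Positions (in document A) of word n-grams that also occur in document B."""
    grams_b = {tuple(words_b[i:i + n]) for i in range(len(words_b) - n + 1)}
    return [i for i in range(len(words_a) - n + 1)
            if tuple(words_a[i:i + n]) in grams_b]

def similarity_rate(text_a: str, text_b: str, n: int = 3) -> float:
    words_a, words_b = text_a.lower().split(), text_b.lower().split()
    positions = matching_phrases(words_a, words_b, n)
    if not positions:
        return 0.0
    # First proximity parameter: share of document A's words covered by matches.
    covered = set()
    for p in positions:
        covered.update(range(p, p + n))
    coverage = len(covered) / max(len(words_a), 1)
    # Second proximity parameter: average gap between subsequent matches
    # (a gap of 1 means back-to-back matching phrases).
    gaps = [b - a for a, b in zip(positions, positions[1:])] or [1]
    closeness = 1.0 / (sum(gaps) / len(gaps))
    return coverage * closeness

a = "the quick brown fox jumps over the lazy dog near the river bank"
b = "a quick brown fox jumps over a sleeping dog near the river bank today"
print(round(similarity_rate(a, b), 3))
```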
-
Patent number: 12080274
Abstract: A system and method for concurrent multi-path processing of audio signals for automatic speech recognition is presented. Audio information defining a set of audio signals may be obtained (502). The audio signals may convey mixed audio content produced by multiple audio sources. A set of source-specific audio signals may be determined by demixing the mixed audio content produced by the multiple audio sources. Determining the set of source-specific audio signals may comprise providing the set of audio signals to both a first signal processing path and a second signal processing path (504). The first signal processing path may determine a value of a demixing parameter for demixing the mixed audio content (506). The second signal processing path may apply the value of the demixing parameter to the individual audio signals of the set of audio signals (508) to generate the individual source-specific audio signals (510).
Type: Grant
Filed: February 28, 2019
Date of Patent: September 3, 2024
Assignee: Beijing DiDi Infinity Technology and Development Co., Ltd.
Inventors: Yi Zhang, Hui Song, Yongtao Sha, Chengyun Deng
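A structural sketch of the two paths, assuming NumPy. The "demixing parameter" here is a simple decorrelating (whitening) matrix rather than the patent's estimator, so this only illustrates the estimate-then-apply split, not real blind source separation.

```python
# Sketch: path 1 estimates a demixing parameter from a buffered block of the
# mixed multichannel audio; path 2 applies that parameter to incoming frames.
import numpy as np

def estimate_demixing_matrix(block: np.ndarray) -> np.ndarray:
    """Path 1: whitening matrix that decorrelates the mixture channels.
    block has shape (channels, samples)."""
    cov = np.cov(block)
    eigvals, eigvecs = np.linalg.eigh(cov)
    return np.diag(1.0 / np.sqrt(eigvals + 1e-8)) @ eigvecs.T

def apply_demixing(frames: np.ndarray, demix: np.ndarray) -> np.ndarray:
    """Path 2: apply the demixing parameter to each incoming frame."""
    return demix @ frames

rng = np.random.default_rng(1)
sources = rng.normal(size=(2, 4000))            # two independent sources
mixing = np.array([[1.0, 0.6], [0.4, 1.0]])     # unknown mixing of the sources
mixture = mixing @ sources                      # what the microphones observe

demix = estimate_demixing_matrix(mixture)       # path 1
separated = apply_demixing(mixture, demix)      # path 2
print(np.round(np.cov(separated), 2))           # near-identity: channels decorrelated
```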
-
Patent number: 12073181
Abstract: Systems, apparatuses, methods, and computer program products are disclosed for determining robustness information for an NLP model. Modification rules, such as replacement rules and/or insertion rules, are used to generate instances of modified test data based on instances of test data that comprise words and have a syntax and a semantic meaning. The instances of test data and modified test data are provided to the NLP model, and the output of the NLP model is analyzed to determine output-changing instances of modified test data, which are instances of modified test data that yielded output from the NLP model that is different from and/or not similar to the output yielded from the NLP model for the corresponding instance of test data. Robustness information for the NLP model is determined based at least in part on the output-changing instances of modified test data.
Type: Grant
Filed: April 21, 2023
Date of Patent: August 27, 2024
Assignee: Wells Fargo Bank, N.A.
Inventors: Tarun Joshi, Rahul Singh, Vijayan Nair, Agus Sudjianto
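A hedged sketch of the robustness probe: replacement rules generate modified instances, the (stubbed) model is re-run, and output-changing instances are collected. The rules, the toy keyword model, and the robustness rate at the end are illustrative assumptions.

```python
# Sketch: apply word-replacement rules to each test instance, re-run the model,
# and keep the modified instances whose output changed.

REPLACEMENT_RULES = {"good": "not bad", "unhappy": "sad"}

def nlp_model(text: str) -> str:
    """Stand-in classifier for the model under test (a brittle keyword model)."""
    return "negative" if any(w in text for w in ("bad", "awful", "sad")) else "positive"

def apply_rules(text: str) -> list[str]:
    """Generate one modified instance per applicable replacement rule."""
    variants = []
    for old, new in REPLACEMENT_RULES.items():
        if old in text.split():
            variants.append(text.replace(old, new))
    return variants

def output_changing_instances(test_data: list[str]) -> list[tuple[str, str]]:
    changing = []
    for text in test_data:
        original_output = nlp_model(text)
        for variant in apply_rules(text):
            if nlp_model(variant) != original_output:
                changing.append((text, variant))
    return changing

tests = ["the food was good", "the service was awful"]
flips = output_changing_instances(tests)
# A simple robustness rate: fraction of test instances with no output change.
print(flips, 1 - len({t for t, _ in flips}) / len(tests))
```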
-
Patent number: 12019999
Abstract: Implementations relate to determining a well-formed phrase to suggest to a user to submit in lieu of a not well-formed phrase. The suggestion is rendered via an interface that is provided to a client device of the user. Those implementations relate to determining that a phrase is not well-formed, identifying alternate phrases that are related to the not well-formed phrase, and scoring the alternate phrases to select one or more of the alternate phrases to render via the interface. Some of those implementations are related to identifying that the phrase is not well-formed based on occurrences of the phrase in documents that are generated by a source with the language of the phrase as the primary language of the creator.
Type: Grant
Filed: June 18, 2021
Date of Patent: June 25, 2024
Assignee: GOOGLE LLC
Inventors: Wangqing Yuan, David Kogan, Vincent Lacey, Guanglei Wang, Shaun Post, Bryan Christopher Horling, Michael Anthony Schuler
-
Patent number: 12019987
Abstract: Systems, apparatuses, methods, and computer program products are disclosed for distillation of a natural language processing (NLP) model. An example method includes receiving, by communications circuitry, a set of text data comprising a set of observations and predicting, by processing circuitry and using the NLP model, classifications for each observation in the text data. The example method further includes generating, by a model training engine, a balanced sampled data structure based on the predicted classifications for each observation in the text data and training, by the model training engine, a surrogate model using the balanced sampled data structure. The example method further includes identifying, by an interpreter and from the surrogate model, a set of most-influential tokens in the text data. (This abstract matches patent 12288029 above; the two patents share the same disclosure.)
Type: Grant
Filed: April 28, 2021
Date of Patent: June 25, 2024
Assignee: Wells Fargo Bank, N.A.
Inventors: Ye Yu, Harsh Singhal, Wayne B. Shoumaker
-
Patent number: 12014730
Abstract: A voice processing method includes: collecting a voice signal by a microphone of an electronic device, and signal-processing the collected voice signal to obtain a first voice frame segment; performing voice recognition on the first voice frame segment to obtain a first recognition result; in response to the first recognition result not matching a target content and a plurality of tokens in the first recognition result meeting a preset condition, performing frame compensation on the first voice frame segment to obtain a second voice frame segment; and performing voice recognition on the second voice frame segment to obtain a second recognition result. A matching degree between the second recognition result and the target content is greater than a matching degree between the first recognition result and the target content.
Type: Grant
Filed: May 17, 2021
Date of Patent: June 18, 2024
Assignee: BEIJING XIAOMI MOBILE SOFTWARE CO., LTD.
Inventor: Xiangyan Xu
-
Patent number: 12001797
Abstract: A method and system for automatic topic detection in text may include receiving a text document of a corpus of documents and extracting one or more phrases from the document, based on one or more syntactic patterns. For each phrase, embodiments of the invention may: apply a word embedding neural network on one or more words of the phrase, to obtain one or more respective word embedding vectors; calculate a weighted phrase embedding vector; and compute a phrase saliency score, based on the weighted phrase embedding vector. Embodiments of the invention may subsequently produce one or more topic labels, representing one or more respective topics in the document, based on the computed phrase saliency scores, and may select one or more topic labels according to their relevance to the business domain of the corpus.
Type: Grant
Filed: May 12, 2021
Date of Patent: June 4, 2024
Inventors: Eyal Orbach, Avraham Faizakof, Arnon Mazza, Lev Haikin
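A small sketch of phrase scoring along the lines of this abstract, assuming NumPy. The tiny word-vector table, the inverse-frequency style weights, and the use of cosine similarity to the document embedding as the saliency score are all illustrative assumptions; the patent uses a trained word embedding network.

```python
# Sketch: each candidate phrase gets a weighted average of its word vectors,
# and its saliency is the cosine similarity to the document's overall embedding.
import numpy as np

WORD_VECTORS = {
    "mortgage": np.array([0.9, 0.1, 0.0]),
    "rate":     np.array([0.8, 0.2, 0.1]),
    "lock":     np.array([0.7, 0.1, 0.3]),
    "the":      np.array([0.1, 0.9, 0.1]),
    "customer": np.array([0.3, 0.2, 0.8]),
}
WORD_WEIGHTS = {"the": 0.1}  # frequent words contribute less; default weight is 1.0

def phrase_embedding(phrase: str) -> np.ndarray:
    words = [w for w in phrase.lower().split() if w in WORD_VECTORS]
    weights = np.array([WORD_WEIGHTS.get(w, 1.0) for w in words])
    vectors = np.stack([WORD_VECTORS[w] for w in words])
    return (weights[:, None] * vectors).sum(axis=0) / weights.sum()

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

document = "the customer asked about the mortgage rate lock"
doc_vec = phrase_embedding(document)
candidate_phrases = ["mortgage rate lock", "the customer"]

# Saliency score per candidate phrase; the highest-scoring phrases become topic labels.
for phrase in candidate_phrases:
    print(phrase, round(cosine(phrase_embedding(phrase), doc_vec), 3))
```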
-
Patent number: 11947925
Abstract: A user input in a source language is received. A set of contextual data is received. The user input is encoded into a user input feature vector. The set of contextual data is encoded into a context feature vector. The user input feature vector and the context feature vector are used to generate a fusion vector. An adaptive neural network is trained to identify a second context feature vector, based on the fusion vector. A second user input in the source language is received for translation into a target language. The adaptive neural network is used to determine, based on the second context feature vector, a second user input feature vector. The second user input feature vector is decoded, based on the source language and the target language, into a target language output. A user is notified of the target language output.
Type: Grant
Filed: May 21, 2020
Date of Patent: April 2, 2024
Assignee: International Business Machines Corporation
Inventors: Lei Mei, Kun Yan Yin, Yan Hu, Qi Ruan, Yan Feng Han
-
Patent number: 11935517
Abstract: A speech decoding method is performed by a computer device, the speech including a current audio frame and a previous audio frame. The method includes: obtaining a target token corresponding to a smallest decoding score from a first token list including first tokens obtained by decoding the previous audio frame, each first token including a state pair and a decoding score, the state pair being used for characterizing a correspondence between a first state of the first token in a first decoding network corresponding to a low-order language model and a second state of the first token in a second decoding network corresponding to a differential language model; determining pruning parameters according to the target token and an acoustic vector of the current audio frame when the current audio frame is decoded; and decoding the current audio frame according to the first token list, the pruning parameters, and the acoustic vector.
Type: Grant
Filed: March 3, 2021
Date of Patent: March 19, 2024
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Yiheng Huang, Xiaozheng Jian, Liqiang He
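A hedged sketch of score-based pruning between frames: the best (smallest-score) token from the previous frame sets a threshold, and only tokens whose expanded scores stay within a beam of it survive. The acoustic cost function and beam width are invented, and the patent's state-pair bookkeeping across the two decoding networks is reduced to a plain tuple.

```python
# Sketch: pruning parameters derived from the target (best) token and the
# current frame's acoustic vector, then applied to the token list.
from dataclasses import dataclass

@dataclass
class Token:
    state_pair: tuple      # (state in low-order network, state in differential network)
    score: float           # accumulated decoding cost (lower is better)

def acoustic_cost(state_pair: tuple, acoustic_vector: list[float]) -> float:
    """Stand-in for the acoustic model score of this state against the frame."""
    return abs(sum(acoustic_vector)) * 0.1 + 0.05 * state_pair[0]

def decode_frame(token_list: list[Token], acoustic_vector: list[float], beam: float = 2.0) -> list[Token]:
    # Target token: the one with the smallest decoding score in the previous frame.
    target = min(token_list, key=lambda t: t.score)
    # Pruning parameter derived from the target token and the current frame.
    threshold = target.score + acoustic_cost(target.state_pair, acoustic_vector) + beam
    survivors = []
    for token in token_list:
        new_score = token.score + acoustic_cost(token.state_pair, acoustic_vector)
        if new_score <= threshold:
            survivors.append(Token(token.state_pair, new_score))
    return survivors

tokens = [Token((0, 0), 1.0), Token((3, 1), 2.5), Token((9, 2), 6.0)]
frame = [0.2, -0.1, 0.4]
print(decode_frame(tokens, frame))  # the highest-cost token is pruned
```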
-
Patent number: 11928440
Abstract: Systems and methods for handling multilingual queries are provided. One example method includes receiving, at a computing device, an input, wherein the input comprises a multi-lingual query comprising at least a first source language and a second source language. The multi-lingual query is translated, word for word, into a destination language to produce a monolingual query, with the word order of the multilingual query and the word order of the monolingual query being the same. The monolingual query is processed using natural language processing to map the mono-lingual query to a natural language query in the destination language. (This abstract matches patent 12333268 above; the two patents share the same disclosure.)
Type: Grant
Filed: August 25, 2020
Date of Patent: March 12, 2024
Assignee: Rovi Guides, Inc.
Inventors: Ajay Kumar Mishra, Jeffry Copps Robert Jose
-
Patent number: 11893993
Abstract: Dynamic interfacing with applications is provided. For example, a system receives a first input audio signal. The system processes, via a natural language processing technique, the first input audio signal to identify an application. The system activates the application for execution on the client computing device. The application declares a function the application is configured to perform. The system modifies the natural language processing technique responsive to the function declared by the application. The system receives a second input audio signal. The system processes, via the modified natural language processing technique, the second input audio signal to detect one or more parameters. The system determines that the one or more parameters are compatible for input into an input field of the application. The system generates an action data structure for the application. The system inputs the action data structure into the application, which executes the action data structure.
Type: Grant
Filed: November 28, 2022
Date of Patent: February 6, 2024
Assignee: GOOGLE LLC
Inventors: Quazi Hussain, Adam Coimbra, Ilya Firman
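A small sketch of the last two steps in this abstract: checking that detected parameters fit the input fields the application declared, then building an action data structure to hand to the application. The declared-function format, field names, and values are illustrative assumptions.

```python
# Sketch: validate detected parameters against the application's declared
# function, then generate the action data structure.

DECLARED_FUNCTION = {
    "name": "book_ride",
    "input_fields": {"pickup": str, "dropoff": str, "passengers": int},
}

def compatible(parameters: dict, declared: dict) -> bool:
    """Every detected parameter must match a declared input field and its type."""
    fields = declared["input_fields"]
    return all(name in fields and isinstance(value, fields[name])
               for name, value in parameters.items())

def build_action(parameters: dict, declared: dict) -> dict:
    if not compatible(parameters, declared):
        raise ValueError("detected parameters do not fit the declared input fields")
    return {"function": declared["name"], "arguments": parameters}

# Parameters detected by the (modified) natural language processing of the
# second input audio signal.
detected = {"pickup": "main street", "dropoff": "airport", "passengers": 2}
print(build_action(detected, DECLARED_FUNCTION))
```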