Patents Examined by Alexander G Marlow
-
Patent number: 12333268
Abstract: Systems and methods for handling multilingual queries are provided. One example method includes receiving, at a computing device, an input, wherein the input comprises a multi-lingual query comprising at least a first source language and a second source language. The multi-lingual query is translated, word for word, into a destination language to produce a monolingual query, with the word order of the multilingual query and the word order of the monolingual query being the same. The monolingual query is processed using natural language processing to map the mono-lingual query to a natural language query in the destination language.
Type: Grant
Filed: February 8, 2024
Date of Patent: June 17, 2025
Assignee: ADEIA GUIDES INC.
Inventors: Ajay Kumar Mishra, Jeffry Copps Robert Jose
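A minimal sketch of the word-for-word translation step described in this abstract, assuming a hypothetical per-language lookup table and word-level language tagger; the patent does not specify how the translation itself is performed.

```python
# Minimal sketch of word-for-word translation of a mixed-language query into a
# single destination language, preserving the original word order. The lookup
# tables and language tagger below are hypothetical placeholders.

# Hypothetical per-language dictionaries mapping source words to English.
LEXICONS = {
    "es": {"peliculas": "movies", "de": "of"},
    "fr": {"avec": "with"},
}

def detect_language(word: str) -> str:
    """Hypothetical word-level language tagger: return the first lexicon containing the word."""
    for lang, lexicon in LEXICONS.items():
        if word.lower() in lexicon:
            return lang
    return "en"  # assume destination language if the word is not found

def translate_word_for_word(query: str) -> str:
    """Translate each word independently; the output keeps the input word order."""
    translated = []
    for word in query.split():
        lang = detect_language(word)
        translated.append(LEXICONS.get(lang, {}).get(word.lower(), word))
    return " ".join(translated)

# Example: a Spanish/English mixed query becomes a monolingual English query
# with identical word order, ready for ordinary natural language processing.
print(translate_word_for_word("show peliculas de Tom Hanks"))
# -> "show movies of Tom Hanks"
```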
-
Patent number: 12335328
Abstract: This disclosure provides a network call method and apparatus, a computer device, and a storage medium, and belongs to the field of audio data processing. The method includes: performing time-frequency transformation on an acquired audio signal, to obtain a plurality of pieces of frequency domain information of the audio signal; determining a target bit rate corresponding to the audio signal according to the plurality of pieces of frequency domain information; and encoding the audio signal based on the target bit rate, and performing a network call based on the encoded audio signal.
Type: Grant
Filed: October 21, 2021
Date of Patent: June 17, 2025
Assignee: Tencent Technology (Shenzhen) Company Limited
Inventor: Junbin Liang
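A rough sketch of how frequency-domain information could drive a bit-rate choice, assuming NumPy, a simple short-time spectrum as the time-frequency transform, and an invented bandwidth-to-bit-rate table; the patent's actual decision rule is not stated in the abstract.

```python
# Sketch: derive a target bit rate for an audio call from frequency-domain
# information. The bandwidth estimate and the bit-rate table below are
# illustrative assumptions, not the patent's rules.
import numpy as np

def effective_bandwidth(signal: np.ndarray, sample_rate: int, frame_len: int = 512) -> float:
    """Estimate the highest frequency that carries most of the signal energy."""
    # Short-time magnitude spectra (a simple time-frequency transform).
    n_frames = len(signal) // frame_len
    frames = signal[: n_frames * frame_len].reshape(n_frames, frame_len)
    spectra = np.abs(np.fft.rfft(frames * np.hanning(frame_len), axis=1))
    mean_spectrum = spectra.mean(axis=0)
    # Frequency below which 99% of the spectral energy lies.
    cumulative = np.cumsum(mean_spectrum ** 2)
    cutoff_bin = np.searchsorted(cumulative, 0.99 * cumulative[-1])
    return cutoff_bin * sample_rate / frame_len

def choose_bit_rate(signal: np.ndarray, sample_rate: int) -> int:
    """Map estimated bandwidth to an encoder bit rate (values are illustrative)."""
    bw = effective_bandwidth(signal, sample_rate)
    if bw < 4000:     # narrowband speech
        return 12_000
    if bw < 8000:     # wideband speech
        return 24_000
    return 48_000     # full-band / music-like content

# Example with a synthetic 1 kHz tone sampled at 16 kHz.
t = np.arange(16000) / 16000
print(choose_bit_rate(np.sin(2 * np.pi * 1000 * t), 16000))  # -> 12000
```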
-
Patent number: 12327091
Abstract: A system for translating speech from at least two source languages into another target language provides direct speech-to-target-language translation. The target text is converted to speech in the target language through a TTS system. The system simplifies the speech recognition and translation process by providing direct translation, includes mechanisms described herein that facilitate mixed-language source speech translation, and punctuates output text streams in the target language. In some embodiments it also allows translation of speech into the target language to reflect the voice of the speaker of the source speech, based on characteristics of the source-language speech and the speaker's voice, and to produce subtitled data in the target language corresponding to the source speech. The system uses models trained using (i) encoder-decoder architectures with attention mechanisms and training data using TTS and (ii) parallel text training data in more than two different languages.
Type: Grant
Filed: January 13, 2020
Date of Patent: June 10, 2025
Assignee: Applications Technology (AppTek), LLC
Inventors: Evgeny Matusov, Jintao Jiang, Mudar Yaghi
-
Patent number: 12293150
Abstract: A registree management function receives member (user) registration, carries out a survey upon registration, performs category classification for the registered user, learns the classified categories, and the like. A comment analysis function performs text mining on comments acquired from an SNS posted comment server, determines post origin positions identified by the text mining and the level of credibility thereof, and executes evaluation and the like of a target relating to a theme. An information provision function edits a social heat map generated based on the results of analyzing the comments to be provided to the user, and also performs user category management and the like.
Type: Grant
Filed: June 17, 2019
Date of Patent: May 6, 2025
Assignee: Takenaka Corporation
Inventors: Kuniaki Andou, Rikuto Kunimoto, Takeshi Takai, Kazuo Ohtake
-
Patent number: 12288029
Abstract: Systems, apparatuses, methods, and computer program products are disclosed for distillation of a natural language processing (NLP) model. An example method includes receiving, by communications circuitry, a set of text data comprising a set of observations and predicting, by processing circuitry and using the NLP model, classifications for each observation in the text data. The example method further includes generating, by a model training engine, a balanced sampled data structure based on the predicted classifications for each observation in the text data and training, by the model training engine, a surrogate model using the balanced sampled data structure. The example method further includes identifying, by an interpreter and from the surrogate model, a set of most-influential tokens in the text data.
Type: Grant
Filed: May 14, 2024
Date of Patent: April 29, 2025
Assignee: Wells Fargo Bank, N.A.
Inventors: Ye Yu, Harsh Singhal, Wayne B. Shoumaker
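A hedged sketch of the distillation loop this abstract describes, with the NLP model being distilled stubbed out and scikit-learn assumed for the surrogate; the patent does not name a particular surrogate model or balancing scheme.

```python
# Sketch: label text with a (stubbed) teacher NLP model, balance the sample
# across predicted classes, train an interpretable surrogate, and read off the
# most influential tokens. The teacher, the data, and the choice of a
# logistic-regression surrogate are illustrative assumptions.
import random
from collections import defaultdict
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression

def teacher_predict(text: str) -> str:
    """Stand-in for the full NLP model being distilled."""
    return "complaint" if "refund" in text or "charge" in text else "other"

def balanced_sample(texts, labels, seed=0):
    """Downsample every class to the size of the smallest class."""
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)
    n = min(len(v) for v in by_class.values())
    rng = random.Random(seed)
    sampled_texts, sampled_labels = [], []
    for label, items in by_class.items():
        for text in rng.sample(items, n):
            sampled_texts.append(text)
            sampled_labels.append(label)
    return sampled_texts, sampled_labels

texts = [
    "please refund the duplicate charge", "refund my order", "wrong charge on card",
    "great service", "love the new app", "how do I reset my password", "thanks a lot",
]
labels = [teacher_predict(t) for t in texts]
bal_texts, bal_labels = balanced_sample(texts, labels)

vectorizer = CountVectorizer()
X = vectorizer.fit_transform(bal_texts)
surrogate = LogisticRegression().fit(X, bal_labels)

# Most influential tokens: largest absolute surrogate coefficients.
weights = surrogate.coef_[0]
tokens = vectorizer.get_feature_names_out()
top = sorted(zip(tokens, weights), key=lambda tw: abs(tw[1]), reverse=True)[:5]
print(top)
```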
-
Patent number: 12283271
Abstract: A method of providing voice feedback to a listener as part of a user interface of a media playback system may include: storing multiple different voice feedback recordings in at least one computer-readable storage device, where each of the multiple different voice feedback recordings is of a different voice artist; receiving a listener command corresponding to a musical selection; determining an identifying musical characteristic of the musical selection; selecting a first voice feedback recording from the multiple different voice feedback recordings, where the first voice feedback recording corresponds to the identifying musical characteristic; and playing the first voice feedback recording to the listener via the media playback system.
Type: Grant
Filed: May 18, 2021
Date of Patent: April 22, 2025
Assignee: Spotify AB
Inventor: Sten Garmark
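A small sketch of the selection step, assuming the identifying musical characteristic is a genre and that recordings are keyed by it; the genre table and file names are invented for illustration.

```python
# Sketch: pick the stored voice feedback recording whose voice artist matches
# an identifying characteristic (here, genre) of the selected music.

VOICE_FEEDBACK_RECORDINGS = {
    "hip-hop": "feedback_artist_a.ogg",
    "classical": "feedback_artist_b.ogg",
    "pop": "feedback_artist_c.ogg",
}
DEFAULT_RECORDING = "feedback_default.ogg"

TRACK_GENRES = {"Clair de Lune": "classical", "Lose Yourself": "hip-hop"}

def select_voice_feedback(track_title: str) -> str:
    """Return the stored recording that corresponds to the track's characteristic."""
    genre = TRACK_GENRES.get(track_title, "unknown")
    return VOICE_FEEDBACK_RECORDINGS.get(genre, DEFAULT_RECORDING)

print(select_voice_feedback("Clair de Lune"))  # -> feedback_artist_b.ogg
```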
-
Patent number: 12249315
Abstract: A method for training a non-autoregressive TTS model includes obtaining a sequence representation of an encoded text sequence concatenated with a variational embedding. The method also includes using a duration model network to predict a phoneme duration for each phoneme represented by the encoded text sequence. Based on the predicted phoneme durations, the method also includes learning an interval representation and an auxiliary attention context representation. The method also includes upsampling, using the interval representation and the auxiliary attention context representation, the sequence representation into an upsampled output specifying a number of frames. The method also includes generating, based on the upsampled output, one or more predicted mel-frequency spectrogram sequences for the encoded text sequence.
Type: Grant
Filed: October 31, 2023
Date of Patent: March 11, 2025
Assignee: Google LLC
Inventors: Isaac Elias, Byungha Chun, Jonathan Shen, Ye Jia, Yu Zhang, Yonghui Wu
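A simplified sketch of duration-based upsampling in a non-autoregressive TTS model, assuming NumPy; the patent's interval representation, auxiliary attention context, and variational embedding are omitted here, leaving only the basic repeat-by-duration idea.

```python
# Sketch: each encoded phoneme vector is repeated for its predicted number of
# frames, yielding a frame-rate sequence for the spectrogram decoder.
import numpy as np

def upsample_by_duration(encoded: np.ndarray, durations: np.ndarray) -> np.ndarray:
    """encoded: (num_phonemes, dim); durations: (num_phonemes,) integer frame counts."""
    return np.repeat(encoded, durations, axis=0)

encoded_text = np.random.randn(4, 8)          # 4 phonemes, 8-dim representations
predicted_durations = np.array([3, 5, 2, 4])  # frames per phoneme from the duration model

upsampled = upsample_by_duration(encoded_text, predicted_durations)
print(upsampled.shape)  # -> (14, 8): one row per output spectrogram frame
```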
-
Patent number: 12230258
Abstract: A method for contextual biasing for speech recognition includes obtaining a base automatic speech recognition (ASR) model trained on non-biased data and a sub-model trained on biased data representative of a particular domain. The method includes receiving a speech recognition request including audio data characterizing an utterance captured in streaming audio. The method further includes determining whether the speech recognition request includes a contextual indicator indicating the particular domain. When the speech recognition request does not include the contextual indicator, the method includes generating, using the base ASR model, a first speech recognition result of the utterance by processing the audio data.
Type: Grant
Filed: April 19, 2022
Date of Patent: February 18, 2025
Assignee: Google LLC
Inventors: Fadi Biadsy, Pedro J. Moreno Mengibar
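A minimal sketch of the routing decision described here, with both recognizers stubbed; the request format and domain names are assumptions, not the patent's interfaces.

```python
# Sketch: use the biased sub-model only when the speech recognition request
# carries a contextual indicator for its domain; otherwise use the base model.
from dataclasses import dataclass
from typing import Optional

@dataclass
class RecognitionRequest:
    audio_data: bytes
    contextual_indicator: Optional[str] = None  # e.g. "medical", or None

def base_asr(audio: bytes) -> str:
    return "transcript from the general-purpose model"

def biased_asr(audio: bytes, domain: str) -> str:
    return f"transcript from the {domain} sub-model"

def recognize(request: RecognitionRequest) -> str:
    if request.contextual_indicator is None:
        # No domain hint: the non-biased base model handles the utterance.
        return base_asr(request.audio_data)
    # Domain hint present: apply the sub-model trained on in-domain (biased) data.
    return biased_asr(request.audio_data, request.contextual_indicator)

print(recognize(RecognitionRequest(b"...")))
print(recognize(RecognitionRequest(b"...", contextual_indicator="medical")))
```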
-
Patent number: 12198718
Abstract: A method for determining synthetic speech includes receiving audio data, obtained by a user device, characterizing speech. The method also includes generating, using a trained self-supervised model, a plurality of audio feature vectors, each representative of audio features of a portion of the audio data. The method also includes generating, using a shallow discriminator model, a score indicating a presence of synthetic speech in the audio data based on the corresponding audio features of each audio feature vector of the plurality of audio feature vectors. The method also includes determining whether the score satisfies a synthetic speech detection threshold. When the score satisfies the synthetic speech detection threshold, the method includes determining that the speech in the audio data obtained by the user device comprises synthetic speech.
Type: Grant
Filed: August 9, 2023
Date of Patent: January 14, 2025
Assignee: Google LLC
Inventors: Joel Shor, Alanna Foster Slocum
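A hedged sketch of the scoring and thresholding steps, assuming NumPy; the feature vectors are random stand-ins for a self-supervised encoder's output, and the single linear layer, its weights, and the threshold are illustrative choices for a "shallow" discriminator.

```python
# Sketch: a shallow discriminator maps each audio feature vector to a logit,
# the per-chunk outputs are pooled into one score, and the score is compared
# against a synthetic speech detection threshold.
import numpy as np

rng = np.random.default_rng(0)
feature_vectors = rng.normal(size=(20, 64))   # 20 audio chunks, 64-dim features
disc_weights = rng.normal(size=64) * 0.1      # "shallow" discriminator: one linear layer
disc_bias = 0.0

def synthetic_speech_score(features: np.ndarray) -> float:
    """Mean sigmoid output over all feature vectors in the utterance."""
    logits = features @ disc_weights + disc_bias
    return float(np.mean(1.0 / (1.0 + np.exp(-logits))))

DETECTION_THRESHOLD = 0.5
score = synthetic_speech_score(feature_vectors)
is_synthetic = score >= DETECTION_THRESHOLD
print(f"score={score:.3f}, synthetic={is_synthetic}")
```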
-
Patent number: 12124798
Abstract: A method is disclosed for calculating similarity rates between electronic documents. The similarity rate is calculated based on a count of matching phrases between the electronic documents and distances between subsequent matching phrases in each of the electronic documents. A system is also disclosed for comparing the electronic documents to obtain their similarity rates. A computing device determines at least one first proximity parameter based on the number of matched words in a matching phrase and at least one second proximity parameter based on distances between the subsequent matching phrases in each of the electronic documents. The similarity rate is determined based on the first and second proximity parameters.
Type: Grant
Filed: August 30, 2021
Date of Patent: October 22, 2024
Assignee: KYOCERA DOCUMENT SOLUTIONS INC.
Inventor: Oleg Y. Zakharov
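A rough sketch of a phrase-based similarity rate in the spirit of this abstract: shared word n-grams give a first proximity term, and gaps between consecutive matches give a second. The combining formula and constants are illustrative; the abstract does not give the exact calculation.

```python
# Sketch: similarity of two documents from matching phrases and the distances
# between subsequent matches. Both proximity terms below are assumptions.

def matching_phrases(words_a, words_b, n=3):
    """Positions (in document A) of word n-grams that also occur in document B."""
    grams_b = {tuple(words_b[i:i + n]) for i in range(len(words_b) - n + 1)}
    return [i for i in range(len(words_a) - n + 1)
            if tuple(words_a[i:i + n]) in grams_b]

def similarity_rate(text_a: str, text_b: str, n: int = 3) -> float:
    words_a, words_b = text_a.lower().split(), text_b.lower().split()
    positions = matching_phrases(words_a, words_b, n)
    if not positions:
        return 0.0
    # First proximity parameter: share of document A's words covered by matches.
    covered = set()
    for p in positions:
        covered.update(range(p, p + n))
    coverage = len(covered) / max(len(words_a), 1)
    # Second proximity parameter: average gap between subsequent matches
    # (a gap of 1 means back-to-back matching phrases).
    gaps = [b - a for a, b in zip(positions, positions[1:])] or [1]
    closeness = 1.0 / (sum(gaps) / len(gaps))
    return coverage * closeness

a = "the quick brown fox jumps over the lazy dog near the river bank"
b = "a quick brown fox jumps over a sleeping dog near the river bank today"
print(round(similarity_rate(a, b), 3))
```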
-
Patent number: 12080274
Abstract: A system and method for concurrent multi-path processing of audio signals for automatic speech recognition is presented. Audio information defining a set of audio signals may be obtained (502). The audio signals may convey mixed audio content produced by multiple audio sources. A set of source-specific audio signals may be determined by demixing the mixed audio content produced by the multiple audio sources. Determining the set of source-specific audio signals may comprise providing the set of audio signals to both a first signal processing path and a second signal processing path (504). The first signal processing path may determine a value of a demixing parameter for demixing the mixed audio content (506). The second signal processing path may apply the value of the demixing parameter to the individual audio signals of the set of audio signals (508) to generate the individual source-specific audio signals (510).
Type: Grant
Filed: February 28, 2019
Date of Patent: September 3, 2024
Assignee: Beijing DiDi Infinity Technology and Development Co., Ltd.
Inventors: Yi Zhang, Hui Song, Yongtao Sha, Chengyun Deng
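A structural sketch of the two paths, assuming NumPy. The "demixing parameter" here is a simple decorrelating (whitening) matrix rather than the patent's estimator, so this only illustrates the estimate-then-apply split, not real blind source separation.

```python
# Sketch: path 1 estimates a demixing parameter from a buffered block of the
# mixed multichannel audio; path 2 applies that parameter to incoming frames.
import numpy as np

def estimate_demixing_matrix(block: np.ndarray) -> np.ndarray:
    """Path 1: whitening matrix that decorrelates the mixture channels.
    block has shape (channels, samples)."""
    cov = np.cov(block)
    eigvals, eigvecs = np.linalg.eigh(cov)
    return np.diag(1.0 / np.sqrt(eigvals + 1e-8)) @ eigvecs.T

def apply_demixing(frames: np.ndarray, demix: np.ndarray) -> np.ndarray:
    """Path 2: apply the demixing parameter to each incoming frame."""
    return demix @ frames

rng = np.random.default_rng(1)
sources = rng.normal(size=(2, 4000))            # two independent sources
mixing = np.array([[1.0, 0.6], [0.4, 1.0]])     # unknown mixing of the sources
mixture = mixing @ sources                      # what the microphones observe

demix = estimate_demixing_matrix(mixture)       # path 1
separated = apply_demixing(mixture, demix)      # path 2
print(np.round(np.cov(separated), 2))           # near-identity: channels decorrelated
```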
-
Patent number: 12073181
Abstract: Systems, apparatuses, methods, and computer program products are disclosed for determining robustness information for an NLP model. Modification rules, such as replacement rules and/or insertion rules, are used to generate instances of modified test data based on instances of test data that comprise words and have a syntax and a semantic meaning. The instances of test data and modified test data are provided to the NLP model, and the output of the NLP model is analyzed to determine output-changing instances of modified test data, which are instances of modified test data that yielded output from the NLP model that is different from and/or not similar to the output yielded from the NLP model for the corresponding instance of test data. Robustness information for the NLP model is determined based at least in part on the output-changing instances of modified test data.
Type: Grant
Filed: April 21, 2023
Date of Patent: August 27, 2024
Assignee: Wells Fargo Bank, N.A.
Inventors: Tarun Joshi, Rahul Singh, Vijayan Nair, Agus Sudjianto
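A hedged sketch of the robustness probe: replacement rules generate modified instances, the (stubbed) model is re-run, and output-changing instances are collected. The rules, the toy keyword model, and the robustness rate at the end are illustrative assumptions.

```python
# Sketch: apply word-replacement rules to each test instance, re-run the model,
# and keep the modified instances whose output changed.

REPLACEMENT_RULES = {"good": "not bad", "unhappy": "sad"}

def nlp_model(text: str) -> str:
    """Stand-in classifier for the model under test (a brittle keyword model)."""
    return "negative" if any(w in text for w in ("bad", "awful", "sad")) else "positive"

def apply_rules(text: str) -> list[str]:
    """Generate one modified instance per applicable replacement rule."""
    variants = []
    for old, new in REPLACEMENT_RULES.items():
        if old in text.split():
            variants.append(text.replace(old, new))
    return variants

def output_changing_instances(test_data: list[str]) -> list[tuple[str, str]]:
    changing = []
    for text in test_data:
        original_output = nlp_model(text)
        for variant in apply_rules(text):
            if nlp_model(variant) != original_output:
                changing.append((text, variant))
    return changing

tests = ["the food was good", "the service was awful"]
flips = output_changing_instances(tests)
# A simple robustness rate: fraction of test instances with no output change.
print(flips, 1 - len({t for t, _ in flips}) / len(tests))
```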
-
Patent number: 12019999
Abstract: Implementations relate to determining a well-formed phrase to suggest to a user to submit in lieu of a not well-formed phrase. The suggestion is rendered via an interface that is provided to a client device of the user. Those implementations relate to determining that a phrase is not well-formed, identifying alternate phrases that are related to the not well-formed phrase, and scoring the alternate phrases to select one or more of the alternate phrases to render via the interface. Some of those implementations are related to identifying that the phrase is not well-formed based on occurrences of the phrase in documents that are generated by a source with the language of the phrase as the primary language of the creator.
Type: Grant
Filed: June 18, 2021
Date of Patent: June 25, 2024
Assignee: GOOGLE LLC
Inventors: Wangqing Yuan, David Kogan, Vincent Lacey, Guanglei Wang, Shaun Post, Bryan Christopher Horling, Michael Anthony Schuler
-
Patent number: 12019987
Abstract: Systems, apparatuses, methods, and computer program products are disclosed for distillation of a natural language processing (NLP) model. An example method includes receiving, by communications circuitry, a set of text data comprising a set of observations and predicting, by processing circuitry and using the NLP model, classifications for each observation in the text data. The example method further includes generating, by a model training engine, a balanced sampled data structure based on the predicted classifications for each observation in the text data and training, by the model training engine, a surrogate model using the balanced sampled data structure. The example method further includes identifying, by an interpreter and from the surrogate model, a set of most-influential tokens in the text data. (This abstract matches patent 12288029 above; the two patents share the same disclosure.)
Type: Grant
Filed: April 28, 2021
Date of Patent: June 25, 2024
Assignee: Wells Fargo Bank, N.A.
Inventors: Ye Yu, Harsh Singhal, Wayne B. Shoumaker
-
Patent number: 12014730
Abstract: A voice processing method includes: collecting a voice signal by a microphone of an electronic device, and signal-processing the collected voice signal to obtain a first voice frame segment; performing voice recognition on the first voice frame segment to obtain a first recognition result; in response to the first recognition result not matching a target content and a plurality of tokens in the first recognition result meeting a preset condition, performing frame compensation on the first voice frame segment to obtain a second voice frame segment; and performing voice recognition on the second voice frame segment to obtain a second recognition result. A matching degree between the second recognition result and the target content is greater than a matching degree between the first recognition result and the target content.
Type: Grant
Filed: May 17, 2021
Date of Patent: June 18, 2024
Assignee: BEIJING XIAOMI MOBILE SOFTWARE CO., LTD.
Inventor: Xiangyan Xu
-
Patent number: 12001797
Abstract: A method and system for automatic topic detection in text may include receiving a text document of a corpus of documents and extracting one or more phrases from the document, based on one or more syntactic patterns. For each phrase, embodiments of the invention may: apply a word embedding neural network on one or more words of the phrase, to obtain one or more respective word embedding vectors; calculate a weighted phrase embedding vector; and compute a phrase saliency score, based on the weighted phrase embedding vector. Embodiments of the invention may subsequently produce one or more topic labels, representing one or more respective topics in the document, based on the computed phrase saliency scores, and may select one or more topic labels according to their relevance to the business domain of the corpus.
Type: Grant
Filed: May 12, 2021
Date of Patent: June 4, 2024
Inventors: Eyal Orbach, Avraham Faizakof, Arnon Mazza, Lev Haikin
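A small sketch of phrase scoring along the lines of this abstract, assuming NumPy. The tiny word-vector table, the inverse-frequency style weights, and the use of cosine similarity to the document embedding as the saliency score are all illustrative assumptions; the patent uses a trained word embedding network.

```python
# Sketch: each candidate phrase gets a weighted average of its word vectors,
# and its saliency is the cosine similarity to the document's overall embedding.
import numpy as np

WORD_VECTORS = {
    "mortgage": np.array([0.9, 0.1, 0.0]),
    "rate":     np.array([0.8, 0.2, 0.1]),
    "lock":     np.array([0.7, 0.1, 0.3]),
    "the":      np.array([0.1, 0.9, 0.1]),
    "customer": np.array([0.3, 0.2, 0.8]),
}
WORD_WEIGHTS = {"the": 0.1}  # frequent words contribute less; default weight is 1.0

def phrase_embedding(phrase: str) -> np.ndarray:
    words = [w for w in phrase.lower().split() if w in WORD_VECTORS]
    weights = np.array([WORD_WEIGHTS.get(w, 1.0) for w in words])
    vectors = np.stack([WORD_VECTORS[w] for w in words])
    return (weights[:, None] * vectors).sum(axis=0) / weights.sum()

def cosine(a: np.ndarray, b: np.ndarray) -> float:
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

document = "the customer asked about the mortgage rate lock"
doc_vec = phrase_embedding(document)
candidate_phrases = ["mortgage rate lock", "the customer"]

# Saliency score per candidate phrase; the highest-scoring phrases become topic labels.
for phrase in candidate_phrases:
    print(phrase, round(cosine(phrase_embedding(phrase), doc_vec), 3))
```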
-
Patent number: 11947925
Abstract: A user input in a source language is received. A set of contextual data is received. The user input is encoded into a user input feature vector. The set of contextual data is encoded into a context feature vector. The user input feature vector and the context feature vector are used to generate a fusion vector. An adaptive neural network is trained to identify a second context feature vector, based on the fusion vector. A second user input in the source language is received for translation into a target language. The adaptive neural network is used to determine, based on the second context feature vector, a second user input feature vector. The second user input feature vector is decoded, based on the source language and the target language, into a target language output. A user is notified of the target language output.
Type: Grant
Filed: May 21, 2020
Date of Patent: April 2, 2024
Assignee: International Business Machines Corporation
Inventors: Lei Mei, Kun Yan Yin, Yan Hu, Qi Ruan, Yan Feng Han
-
Patent number: 11935517
Abstract: A speech decoding method is performed by a computer device, the speech including a current audio frame and a previous audio frame. The method includes: obtaining a target token corresponding to a smallest decoding score from a first token list including first tokens obtained by decoding the previous audio frame, each first token including a state pair and a decoding score, the state pair being used for characterizing a correspondence between a first state of the first token in a first decoding network corresponding to a low-order language model and a second state of the first token in a second decoding network corresponding to a differential language model; determining pruning parameters according to the target token and an acoustic vector of the current audio frame when the current audio frame is decoded; and decoding the current audio frame according to the first token list, the pruning parameters, and the acoustic vector.
Type: Grant
Filed: March 3, 2021
Date of Patent: March 19, 2024
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventors: Yiheng Huang, Xiaozheng Jian, Liqiang He
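A hedged sketch of score-based pruning between frames: the best (smallest-score) token from the previous frame sets a threshold, and only tokens whose expanded scores stay within a beam of it survive. The acoustic cost function and beam width are invented, and the patent's state-pair bookkeeping across the two decoding networks is reduced to a plain tuple.

```python
# Sketch: pruning parameters derived from the target (best) token and the
# current frame's acoustic vector, then applied to the token list.
from dataclasses import dataclass

@dataclass
class Token:
    state_pair: tuple      # (state in low-order network, state in differential network)
    score: float           # accumulated decoding cost (lower is better)

def acoustic_cost(state_pair: tuple, acoustic_vector: list[float]) -> float:
    """Stand-in for the acoustic model score of this state against the frame."""
    return abs(sum(acoustic_vector)) * 0.1 + 0.05 * state_pair[0]

def decode_frame(token_list: list[Token], acoustic_vector: list[float], beam: float = 2.0) -> list[Token]:
    # Target token: the one with the smallest decoding score in the previous frame.
    target = min(token_list, key=lambda t: t.score)
    # Pruning parameter derived from the target token and the current frame.
    threshold = target.score + acoustic_cost(target.state_pair, acoustic_vector) + beam
    survivors = []
    for token in token_list:
        new_score = token.score + acoustic_cost(token.state_pair, acoustic_vector)
        if new_score <= threshold:
            survivors.append(Token(token.state_pair, new_score))
    return survivors

tokens = [Token((0, 0), 1.0), Token((3, 1), 2.5), Token((9, 2), 6.0)]
frame = [0.2, -0.1, 0.4]
print(decode_frame(tokens, frame))  # the highest-cost token is pruned
```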
-
Patent number: 11928440
Abstract: Systems and methods for handling multilingual queries are provided. One example method includes receiving, at a computing device, an input, wherein the input comprises a multi-lingual query comprising at least a first source language and a second source language. The multi-lingual query is translated, word for word, into a destination language to produce a monolingual query, with the word order of the multilingual query and the word order of the monolingual query being the same. The monolingual query is processed using natural language processing to map the mono-lingual query to a natural language query in the destination language. (This abstract matches patent 12333268 above; the two patents share the same disclosure.)
Type: Grant
Filed: August 25, 2020
Date of Patent: March 12, 2024
Assignee: Rovi Guides, Inc.
Inventors: Ajay Kumar Mishra, Jeffry Copps Robert Jose
-
Patent number: 11893993
Abstract: Dynamic interfacing with applications is provided. For example, a system receives a first input audio signal. The system processes, via a natural language processing technique, the first input audio signal to identify an application. The system activates the application for execution on the client computing device. The application declares a function the application is configured to perform. The system modifies the natural language processing technique responsive to the function declared by the application. The system receives a second input audio signal. The system processes, via the modified natural language processing technique, the second input audio signal to detect one or more parameters. The system determines that the one or more parameters are compatible for input into an input field of the application. The system generates an action data structure for the application. The system inputs the action data structure into the application, which executes the action data structure.
Type: Grant
Filed: November 28, 2022
Date of Patent: February 6, 2024
Assignee: GOOGLE LLC
Inventors: Quazi Hussain, Adam Coimbra, Ilya Firman
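A small sketch of the last two steps in this abstract: checking that detected parameters fit the input fields the application declared, then building an action data structure to hand to the application. The declared-function format, field names, and values are illustrative assumptions.

```python
# Sketch: validate detected parameters against the application's declared
# function, then generate the action data structure.

DECLARED_FUNCTION = {
    "name": "book_ride",
    "input_fields": {"pickup": str, "dropoff": str, "passengers": int},
}

def compatible(parameters: dict, declared: dict) -> bool:
    """Every detected parameter must match a declared input field and its type."""
    fields = declared["input_fields"]
    return all(name in fields and isinstance(value, fields[name])
               for name, value in parameters.items())

def build_action(parameters: dict, declared: dict) -> dict:
    if not compatible(parameters, declared):
        raise ValueError("detected parameters do not fit the declared input fields")
    return {"function": declared["name"], "arguments": parameters}

# Parameters detected by the (modified) natural language processing of the
# second input audio signal.
detected = {"pickup": "main street", "dropoff": "airport", "passengers": 2}
print(build_action(detected, DECLARED_FUNCTION))
```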