Patents Examined by Pierre-Louis Desir
  • Patent number: 12039995
    Abstract: This application discloses an audio signal processing method performed by an electronic device. According to this application, embedding processing is performed on a mixed audio signal by mapping the mixed audio signal to an embedding space, to obtain an embedding feature of the mixed audio signal in the embedding space; and generalized feature extraction is performed on the embedding feature, so that a generalized feature of a target component in the mixed audio signal can be obtained through extraction. The generalized feature of the target component has good generalization capability and expression capability, and can be used for different scenarios. Audio signal processing is performed on the mixed audio signal based on the generalized feature of the target component to obtain information of the audio signal of the target object, thereby improving the robustness and generalization of an audio signal processing process, and improving the accuracy of audio signal processing.
    Type: Grant
    Filed: February 8, 2022
    Date of Patent: July 16, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Jun Wang, Wingyip Lam
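A minimal sketch of the two-stage flow the abstract describes: embed a mixed signal into an embedding space, then extract a generalized feature of the target component. The projection weights, dimensions, and tanh nonlinearity are illustrative assumptions, not the patent's model.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical learned projections; a real system would train these.
N_FRAMES, N_FREQ, EMBED_DIM, FEAT_DIM = 100, 257, 64, 32
W_embed = rng.standard_normal((N_FREQ, EMBED_DIM)) * 0.01
W_extract = rng.standard_normal((EMBED_DIM, FEAT_DIM)) * 0.01

def embed(mixed_spec: np.ndarray) -> np.ndarray:
    """Map a mixed-audio spectrogram (frames x freq) into the embedding space."""
    return np.tanh(mixed_spec @ W_embed)                  # frames x EMBED_DIM

def extract_generalized_feature(embedding: np.ndarray) -> np.ndarray:
    """Pool the embedding into one generalized feature for the target component."""
    return np.tanh(embedding @ W_extract).mean(axis=0)    # FEAT_DIM

mixed_spec = np.abs(rng.standard_normal((N_FRAMES, N_FREQ)))
feature = extract_generalized_feature(embed(mixed_spec))
print(feature.shape)  # (32,) -- downstream audio processing would condition on this feature
```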
  • Patent number: 12032905
    Abstract: An abstractive technique and an extractive technique are used to generate concise natural-language summaries of related text documents. The abstractive step generates a machine summary by constructing a graph with nodes representing unique pairs of tokens and corresponding parts-of-speech (POS), and with edge sequences representing token/POS pairs comprising sentences of a corresponding topic group from the text documents. Ranked candidate summary sentences are generated using subgraphs of the graph having initial and final nodes corresponding with valid sentence start and end pairs. The machine summary includes representative summary sentence(s) selected from each topic group's ranked candidates. The extractive step generates a natural-language summary from the machine summary by computing, for each topic group, numerical suitability measures providing comparisons between the representative summary sentence and sentences of the topic group.
    Type: Grant
    Filed: October 16, 2020
    Date of Patent: July 9, 2024
    Assignee: AMADEUS S.A.S.
    Inventors: Christophe Blaya, Srudeep Kumar Reddy Katamreddy, Bernard Jean Marie Rannou, Bastien Dechamps
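A toy sketch of the abstractive graph step, assuming hand-made POS tags: nodes are unique (token, POS) pairs, edges follow sentence order, and candidate summary sentences are paths from a valid start node to a valid end node. Ranking of the candidates is omitted.

```python
from collections import defaultdict

# Toy topic group: sentences already tokenized and POS-tagged (tags are illustrative).
sentences = [
    [("flights", "NNS"), ("were", "VBD"), ("delayed", "VBN")],
    [("many", "JJ"), ("flights", "NNS"), ("were", "VBD"), ("cancelled", "VBN")],
]

# Nodes are unique (token, POS) pairs; edges follow sentence order.
graph = defaultdict(set)
starts, ends = set(), set()
for sent in sentences:
    starts.add(sent[0])
    ends.add(sent[-1])
    for a, b in zip(sent, sent[1:]):
        graph[a].add(b)

def candidate_sentences(node, path):
    """Enumerate candidate summary paths from a valid start node to a valid end node."""
    path = path + [node]
    if node in ends and len(path) > 2:
        yield path
    for nxt in graph[node]:
        if nxt not in path:               # avoid cycles
            yield from candidate_sentences(nxt, path)

for start in starts:
    for cand in candidate_sentences(start, []):
        print(" ".join(tok for tok, _ in cand))
```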
  • Patent number: 12033612
    Abstract: A speech synthesis method includes: converting a text input sequence into a text feature representation sequence; inputting the text feature representation sequence into an encoder including N encoding layers; the N encoding layers including an encoding layer Ei and an encoding layer Ei+1; the encoding layer Ei+1 including a first multi-head self-attention network; acquiring a first attention matrix and a historical text encoded sequence outputted by the encoding layer Ei, and generating a second attention matrix of the encoding layer Ei+1 according to residual connection between the first attention matrix and the first multi-head self-attention network and the historical text encoded sequence; and generating a target text encoded sequence of the encoding layer Ei+1 according to the second attention matrix and the historical text encoded sequence, and generating synthesized speech data matched with the text input sequence based on the target text encoded sequence.
    Type: Grant
    Filed: November 10, 2022
    Date of Patent: July 9, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Yibin Zheng, Xinhui Li, Li Lu
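A simplified single-head sketch of passing the attention matrix of encoding layer Ei as a residual into layer Ei+1; the dimensions and weights are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)
SEQ, DIM = 6, 16
Wq, Wk, Wv = (rng.standard_normal((DIM, DIM)) * 0.1 for _ in range(3))

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def encoder_layer(x, prev_attn=None):
    """Self-attention whose score matrix is residually combined with the previous layer's."""
    scores = (x @ Wq) @ (x @ Wk).T / np.sqrt(DIM)
    if prev_attn is not None:
        scores = scores + prev_attn        # residual connection between attention matrices
    attn = softmax(scores)
    return attn @ (x @ Wv), attn

x = rng.standard_normal((SEQ, DIM))
h1, attn1 = encoder_layer(x)                      # layer E_i
h2, attn2 = encoder_layer(h1, prev_attn=attn1)    # layer E_{i+1} reuses E_i's attention
print(h2.shape, attn2.shape)
```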
  • Patent number: 12032628
Abstract: Methods, apparatus, systems, and articles of manufacture are disclosed to fingerprint an audio signal via exponential normalization. An example apparatus includes an audio segmenter to divide an audio signal into a plurality of audio segments including a first audio segment and a second audio segment, the first audio segment including a first time-frequency bin, the second audio segment including a second time-frequency bin, a mean calculator to determine a first exponential mean value associated with the first time-frequency bin based on a first magnitude of the audio signal associated with the first time-frequency bin and a second exponential mean value associated with the second time-frequency bin based on a second magnitude of the audio signal associated with the second time-frequency bin and the first exponential mean value.
    Type: Grant
    Filed: November 26, 2019
    Date of Patent: July 9, 2024
    Assignee: Gracenote, Inc.
    Inventors: Alexander Berrian, Matthew James Wilkinson, Robert Coover
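A small sketch of the chained exponential mean over time-frequency bins; the smoothing factor and the final normalization step are illustrative choices, not values from the patent.

```python
import numpy as np

def exponential_means(magnitudes: np.ndarray, alpha: float = 0.3) -> np.ndarray:
    """Running exponential mean per frequency bin across time-ordered audio segments.

    magnitudes: (n_segments, n_freq_bins) spectral magnitudes. Each segment's mean blends
    its own magnitude with the previous segment's mean, mirroring the chained
    exponential-mean computation in the abstract.
    """
    means = np.empty_like(magnitudes)
    means[0] = magnitudes[0]
    for t in range(1, len(magnitudes)):
        means[t] = alpha * magnitudes[t] + (1 - alpha) * means[t - 1]
    return means

rng = np.random.default_rng(2)
mags = np.abs(rng.standard_normal((5, 8)))
norm = mags / (exponential_means(mags) + 1e-12)   # normalized energies a fingerprinter could threshold
print(norm.shape)
```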
  • Patent number: 12032906
Abstract: A method, apparatus and device for quality control and storage media relate to the field of artificial intelligence technology, particularly to the field of natural language understanding and knowledge graphs, and may be applied in the medical field. The method includes: acquiring text information to be detected and a domain of the text information; acquiring preset questions and a machine reading comprehension model according to the domain; inputting the questions and the text information into the machine reading comprehension model to obtain extracted answers; and outputting quality control information in response to the answers not being empty.
    Type: Grant
    Filed: June 29, 2021
    Date of Patent: July 9, 2024
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Qifei Zeng, Yuhong Zheng, Weijian Xu, Tao Li
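A minimal sketch of the quality-control loop, with a keyword-based placeholder standing in for the trained machine reading comprehension model; the preset question and domain are assumptions.

```python
from typing import Optional

# Placeholder reader; a production system would call a trained machine reading
# comprehension model here (this function's behavior is an assumption).
def mrc_answer(question: str, text: str) -> Optional[str]:
    if "dosage" in question.lower():
        for token in text.split():
            if token.rstrip(".,").endswith("mg"):
                return token.rstrip(".,")
    return None

PRESET_QUESTIONS = {"medical": ["What dosage is prescribed?"]}

def quality_control(text: str, domain: str) -> list[str]:
    """Return quality-control flags for every preset question the MRC model can answer."""
    flags = []
    for question in PRESET_QUESTIONS.get(domain, []):
        answer = mrc_answer(question, text)
        if answer is not None:                 # answer not empty -> emit QC information
            flags.append(f"{question} -> {answer}")
    return flags

print(quality_control("Take ibuprofen 200mg twice daily.", "medical"))
```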
  • Patent number: 12033615
Abstract: The disclosure provides a method and an apparatus for recognizing speech, an electronic device and a storage medium. A speech to be recognized is obtained. An acoustic feature of the speech to be recognized and a language feature of the speech to be recognized are obtained. The speech to be recognized is input into a pronunciation difference statistics model to generate a differential pronunciation pair corresponding to the speech to be recognized. Text information of the speech to be recognized is generated based on the differential pronunciation pair, the acoustic feature and the language feature.
    Type: Grant
    Filed: October 12, 2021
    Date of Patent: July 9, 2024
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Yinlou Zhao, Liao Zhang, Zhengxiang Jiang
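A toy illustration of using differential pronunciation pairs to expand an observed phoneme sequence into candidate words; the pairs, lexicon, and final disambiguation by acoustic and language features are assumptions.

```python
# Toy "differential pronunciation pairs": phoneme substitutions that the statistics module
# has observed between canonical and accented speech.
DIFF_PAIRS = {("n", "l")}        # canonical "n" is sometimes pronounced as "l"

def expand_with_diff_pairs(observed: tuple[str, ...]) -> set[tuple[str, ...]]:
    """Expand an observed phoneme sequence with canonical variants implied by the pairs."""
    variants = {observed}
    for i, phone in enumerate(observed):
        for canonical, accented in DIFF_PAIRS:
            if phone == accented:
                alt = list(observed)
                alt[i] = canonical
                variants.add(tuple(alt))
    return variants

LEXICON = {("n", "ay", "t"): "night", ("l", "ay", "t"): "light"}
observed = ("l", "ay", "t")
candidates = [LEXICON[v] for v in expand_with_diff_pairs(observed) if v in LEXICON]
print(candidates)   # acoustic and language features would then pick between "light" and "night"
```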
  • Patent number: 12026468
Abstract: Techniques for out-of-domain data augmentation for training chatbot systems in natural language processing. In one particular aspect, a method is provided that includes receiving a training set of utterances for training a machine-learning model to identify one or more intents for one or more utterances, and augmenting the training set of utterances with out-of-domain (OOD) examples. The augmenting includes: generating a data set of OOD examples, filtering out OOD examples from the data set of OOD examples, determining a difficulty value for each OOD example remaining within the filtered data set of OOD examples, and generating augmented batches of utterances comprising utterances from the training set of utterances and utterances from the filtered data set of OOD examples based on the difficulty value for each OOD example. Thereafter, the machine-learning model is trained using the augmented batches of utterances in accordance with a curriculum training protocol.
    Type: Grant
    Filed: October 28, 2021
    Date of Patent: July 2, 2024
    Assignee: Oracle International Corporation
    Inventors: Elias Luqman Jalaluddin, Vishal Vishnoi, Thanh Long Duong, Mark Edward Johnson, Poorya Zaremoodi, Gautam Singaraju, Ying Xu, Vladislav Blinov, Yu-Heng Hong
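A compact sketch of the augmentation flow, with a length-based filter and a length-based difficulty proxy standing in for the patent's criteria; batches mix in-domain and OOD utterances in easy-to-hard order.

```python
import random

def build_augmented_batches(train_utts, ood_examples, batch_size=4, seed=0):
    """Filter OOD examples, assign a difficulty value, and build curriculum-ordered batches."""
    rng = random.Random(seed)
    filtered = [u for u in ood_examples if len(u.split()) >= 3]      # crude filter
    difficulty = {u: len(u.split()) for u in filtered}               # assumed difficulty proxy
    ordered_ood = sorted(filtered, key=difficulty.get)               # easy -> hard
    batches = []
    for start in range(0, len(ordered_ood), batch_size // 2):
        ood_slice = ordered_ood[start:start + batch_size // 2]
        in_domain = rng.sample(train_utts, min(batch_size - len(ood_slice), len(train_utts)))
        batches.append([(u, "in_domain") for u in in_domain] +
                       [(u, "out_of_domain") for u in ood_slice])
    return batches

train = ["book a flight to nyc", "cancel my reservation", "what is my balance"]
ood = ["tell me a joke about cats", "ok", "who won the game last night"]
for batch in build_augmented_batches(train, ood):
    print(batch)
```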
  • Patent number: 12020685
    Abstract: A method includes receiving a text input including a sequence of words represented as an input encoder embedding. The input encoder embedding includes a plurality of tokens, with the plurality of tokens including a first set of grapheme tokens representing the text input as respective graphemes and a second set of phoneme tokens representing the text input as respective phonemes. The method also includes, for each respective phoneme token of the second set of phoneme tokens: identifying a respective word of the sequence of words corresponding to the respective phoneme token and determining a respective grapheme token representing the respective word of the sequence of words corresponding to the respective phoneme token. The method also includes generating an output encoder embedding based on a relationship between each respective phoneme token and the corresponding grapheme token determined to represent a same respective word as the respective phoneme token.
    Type: Grant
    Filed: December 10, 2021
    Date of Patent: June 25, 2024
    Assignee: Google LLC
    Inventors: Ye Jia, Byungha Chun, Yu Zhang, Jonathan Shen, Yonghui Wu
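A small sketch of aligning each phoneme token to the grapheme token representing the same word, assuming a toy tokenization that carries a word index with every token.

```python
# Toy alignment of phoneme tokens to the grapheme token of the word they came from.
words = ["hello", "world"]
grapheme_tokens = [(w, i) for i, w in enumerate(words)]          # (token, word index)
phoneme_tokens = [("HH", 0), ("AH", 0), ("L", 0), ("OW", 0),
                  ("W", 1), ("ER", 1), ("L", 1), ("D", 1)]

def align(phoneme_tokens, grapheme_tokens):
    """For each phoneme token, find the grapheme token representing the same word."""
    by_word = {i: g for g, i in grapheme_tokens}
    return [(p, by_word[word_id]) for p, word_id in phoneme_tokens]

for phoneme, grapheme in align(phoneme_tokens, grapheme_tokens):
    print(f"{phoneme:>3} -> {grapheme}")
```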
  • Patent number: 12020699
Abstract: An electronic device is disclosed. The electronic device comprises a voice receiving unit and a processor. When a user's voice is received through the voice receiving unit, the processor determines an accumulation level of utterance history information corresponding to the characteristics of the user's voice. When the accumulation level of the utterance history information is below a predetermined threshold level, the processor provides response information corresponding to the user's voice on the basis of user information related to the characteristics of the user's voice; and when the accumulation level is equal to or higher than the predetermined threshold level, the processor provides response information corresponding to the user's voice on the basis of both the user information and the utterance history information.
    Type: Grant
    Filed: September 9, 2019
    Date of Patent: June 25, 2024
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Jisun Park, Minjin Rho
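A minimal sketch of the thresholded behavior: below an accumulation threshold the response relies on user information only; at or above it the utterance history is also used. The threshold value is an assumption.

```python
THRESHOLD = 10  # illustrative accumulation level, not from the patent

def respond(user_voice_profile: dict, utterance_history: list[str]) -> str:
    """Choose the information sources for a response based on accumulated utterance history."""
    accumulation_level = len(utterance_history)
    if accumulation_level < THRESHOLD:
        basis = "user information only"
    else:
        basis = "user information + utterance history"
    return f"response for {user_voice_profile['id']} based on {basis}"

print(respond({"id": "speaker-1"}, utterance_history=["hi"] * 3))
print(respond({"id": "speaker-1"}, utterance_history=["hi"] * 12))
```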
  • Patent number: 12008986
    Abstract: A speech recognition system includes, or has access to, conventional speech recognizer data, including a conventional acoustic model and pronunciation dictionary. The speech recognition system generates restructured speech recognizer data from the conventional speech recognizer data. When used at runtime by a speech recognizer module, the restructured speech recognizer data produces more accurate and efficient results than those produced using the conventional speech recognizer data. The restructuring involves segmenting entries of the conventional pronunciation dictionary and acoustic model according to their constituent phonemes and grouping those entries with the same initial N phonemes, for some integer N (e.g., N=3), and deriving a restructured dictionary with a corresponding semi-word acoustic model for the various grouped entries.
    Type: Grant
    Filed: April 27, 2020
    Date of Patent: June 11, 2024
    Assignee: Interactions LLC
    Inventors: Ilija Zeljkovic, Andrej Ljolje
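A short sketch of the restructuring step: entries of a toy pronunciation dictionary are grouped by their first N phonemes (N=3), each group notionally sharing a semi-word acoustic model.

```python
from collections import defaultdict

# Toy pronunciation dictionary: word -> phoneme sequence (entries are illustrative).
PRON_DICT = {
    "cat":     ["K", "AE", "T"],
    "catalog": ["K", "AE", "T", "AH", "L", "AO", "G"],
    "cattle":  ["K", "AE", "T", "AH", "L"],
    "dog":     ["D", "AO", "G"],
}

def restructure(pron_dict, n: int = 3):
    """Group entries sharing their first n phonemes, as the restructured dictionary does."""
    groups = defaultdict(list)
    for word, phones in pron_dict.items():
        groups[tuple(phones[:n])].append(word)
    return dict(groups)

for prefix, words in restructure(PRON_DICT).items():
    print("-".join(prefix), "->", words)   # each group would share a semi-word acoustic model
```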
  • Patent number: 11989508
Abstract: The performance of a text parser implemented with a state machine is improved by reducing a critical dependence path. In one aspect, all possible current states for a given text input are read from a state table circuit, and the correct next state and output are then selected therefrom by an output multiplexer based on the current state, removing dependence on the current state from the table read, and allowing the read(s) to be pipelined. Further, multiple input units are configured to operate on multiple text characters in parallel, with each input unit propagating outputs for its state table circuit to the next downstream input unit. Each downstream input unit is configured to use the propagated states to provide the proper outputs to appropriate multiplexer inputs. The number of possible output states may be dynamically reduced, thereby reducing the size of the output multiplexer needed to select the next state.
    Type: Grant
    Filed: February 17, 2021
    Date of Patent: May 21, 2024
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Daniel Lo, Blake D. Pelton
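A software analogue of the hardware idea, assuming a toy parser: the next state/output is looked up for every possible current state (a read that does not depend on the current state and can therefore be pipelined), and the real current state then selects among the candidates, like the output multiplexer.

```python
# Toy state table for a parser that skips '#'-to-newline comments.
STATES = ("TEXT", "COMMENT")
TABLE = {
    ("TEXT", "#"):     ("COMMENT", "start-comment"),
    ("TEXT", None):    ("TEXT", "emit"),
    ("COMMENT", "\n"): ("TEXT", "end-comment"),
    ("COMMENT", None): ("COMMENT", "skip"),
}

def lookup_all(char):
    """Speculatively read next state/output for every possible current state."""
    return {s: TABLE.get((s, char), TABLE[(s, None)]) for s in STATES}

def parse(text, state="TEXT"):
    outputs = []
    for char in text:
        candidates = lookup_all(char)      # independent of current state -> pipelineable
        state, output = candidates[state]  # late "multiplexer" select on the real state
        outputs.append(output)
    return outputs

print(parse("a#b\nc"))
```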
  • Patent number: 11990117
    Abstract: A method for training a speech recognition model includes obtaining a multilingual text-to-speech (TTS) model. The method also includes generating a native synthesized speech representation for an input text sequence in a first language that is conditioned on speaker characteristics of a native speaker of the first language. The method also includes generating a cross-lingual synthesized speech representation for the input text sequence in the first language that is conditioned on speaker characteristics of a native speaker of a different second language. The method also includes generating a first speech recognition result for the native synthesized speech representation and a second speech recognition result for the cross-lingual synthesized speech representation. The method also includes determining a consistent loss term based on the first speech recognition result and the second speech recognition result and updating parameters of the speech recognition model based on the consistent loss term.
    Type: Grant
    Filed: October 20, 2021
    Date of Patent: May 21, 2024
    Assignee: Google LLC
    Inventors: Zhehuai Chen, Bhuvana Ramabhadran, Andrew Rosenberg, Yu Zhang, Pedro J. Moreno Mengibar
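A numerical sketch of the consistency objective, with random arrays standing in for the native and cross-lingual synthesized speech and a toy recognizer; the symmetric KL term is one plausible form of the consistent loss, not necessarily the patent's.

```python
import numpy as np

rng = np.random.default_rng(3)
W_asr = rng.standard_normal((16, 5)) * 0.1          # stand-in recognizer weights

def recognize(synth_speech: np.ndarray) -> np.ndarray:
    """Toy recognizer: per-frame posteriors over a 5-symbol vocabulary."""
    logits = synth_speech @ W_asr
    e = np.exp(logits - logits.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Two TTS renderings of the same text: one conditioned on a native speaker of the input
# language, one on a speaker of a different language (both are random stand-ins here).
native_synth = rng.standard_normal((20, 16))
cross_lingual_synth = native_synth + 0.3 * rng.standard_normal((20, 16))

p_native = recognize(native_synth)
p_cross = recognize(cross_lingual_synth)

# Consistent loss: symmetric KL divergence between the two speech recognition results;
# its gradient would update the speech recognition model.
kl = lambda p, q: np.sum(p * np.log(p / q), axis=-1).mean()
consistency_loss = 0.5 * (kl(p_native, p_cross) + kl(p_cross, p_native))
print(round(float(consistency_loss), 4))
```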
  • Patent number: 11977844
Abstract: Systems and processes for providing differentiated advertising sponsorship of a fabricated reading product are provided. Natural language digital text, characterized by sentences comprised of words, is user selected. The text is linguistically analyzed in furtherance of displaying a fabricated reading product corresponding to the text. The words of the sentences of the text, and the sentences themselves, are evaluated with regard to word/sentence attributes in furtherance of supplying only the word/sentence attributes to an advertising sponsor. Based upon either or both of the word and sentence evaluations, it is determined whether to supply an ad from the advertising sponsor in relation to a display of the fabricated reading product, and, in connection with supplying an ad, the placement position of the ad in relation to the display of the fabricated reading product is further determined.
    Type: Grant
    Filed: February 5, 2021
    Date of Patent: May 7, 2024
    Assignee: Walker Reading Technologies, Inc.
    Inventor: Randall C. Walker
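A small sketch of the attribute-only handoff and placement decision; the attribute names and the sponsor-side rule are invented for illustration.

```python
# The sponsor sees word/sentence attributes, never the text itself.
def sentence_attributes(sentence: str) -> dict:
    words = sentence.split()
    return {"word_count": len(words),
            "avg_word_length": sum(map(len, words)) / len(words),
            "is_question": sentence.strip().endswith("?")}

def ad_decision(sentences: list[str]) -> dict:
    attrs = [sentence_attributes(s) for s in sentences]
    supply_ad = any(a["word_count"] >= 8 for a in attrs)           # stand-in sponsor rule
    placement = (max(range(len(attrs)), key=lambda i: attrs[i]["word_count"])
                 if supply_ad else None)                           # where to place the ad
    return {"supply_ad": supply_ad, "after_sentence": placement}

print(ad_decision(["Reading is rewarding.",
                   "A fabricated reading product reshapes each sentence for easier reading."]))
```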
  • Patent number: 11978456
Abstract: Systems, methods and programmed products for using visual information in a video stream of a recorded streaming teleconference among a plurality of participants to diarize speech, involving obtaining respective components of the teleconference including a respective audio component, a respective video component, respective teleconference metadata, and transcription data; parsing the components into speech segments; tagging the speech segments with source feeds; and diarizing the teleconference so as to label the speech segments based on neural network or heuristic analysis of the visual information.
    Type: Grant
    Filed: February 15, 2022
    Date of Patent: May 7, 2024
    Assignee: GONG.IO LTD
    Inventors: Shlomi Medalion, Omri Allouche, Maxim Bulanov
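A toy sketch of labeling speech segments from per-feed visual activity; the activity scores stand in for the neural-network or heuristic analysis of the video component.

```python
from dataclasses import dataclass

@dataclass
class SpeechSegment:
    start: float
    end: float
    source_feed: str          # feed the segment was parsed from

# Toy per-second "is this participant's face/mouth active?" scores per feed.
# A real system would derive these from the video component; these values are made up.
VISUAL_ACTIVITY = {"alice": [0.9, 0.8, 0.1], "bob": [0.1, 0.2, 0.95]}

def diarize(segments: list[SpeechSegment]) -> list[tuple[SpeechSegment, str]]:
    """Label each speech segment with the participant whose visual activity is highest."""
    labeled = []
    for seg in segments:
        second = int(seg.start)
        speaker = max(VISUAL_ACTIVITY, key=lambda p: VISUAL_ACTIVITY[p][second])
        labeled.append((seg, speaker))
    return labeled

segments = [SpeechSegment(0.0, 1.8, "mixed"), SpeechSegment(2.0, 2.9, "mixed")]
for seg, speaker in diarize(segments):
    print(f"{seg.start:.1f}-{seg.end:.1f}s -> {speaker}")
```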
  • Patent number: 11978462
    Abstract: Various aspects of the present disclosure are directed to a process for modifying an audio signal. For example, one process for modifying an audio signal is disclosed including the following steps: determining a compression parameter of the audio signal that should be modified; fractionizing the audio signal into different frequency bands; obtaining the values of the compression parameter for each frequency band; and compressing at least a part of the frequency bands as a function of the determined compression parameter. Various other embodiments of the present disclosure are directed to a device for modifying an audio signal.
    Type: Grant
    Filed: July 10, 2018
    Date of Patent: May 7, 2024
    Assignee: ISUNIYE LLC
    Inventor: Zlatan Ribic
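A brief sketch of band-wise compression via FFT masking; the band edges, compression ratios, and magnitude-exponent form of the compressor are illustrative assumptions, not the disclosed device.

```python
import numpy as np

def compress_bands(signal: np.ndarray, sample_rate: int, band_edges_hz, ratios):
    """Split a signal into frequency bands and compress each band's magnitude by its own ratio."""
    spectrum = np.fft.rfft(signal)
    freqs = np.fft.rfftfreq(len(signal), 1.0 / sample_rate)
    out = np.zeros_like(spectrum)
    for (lo, hi), ratio in zip(band_edges_hz, ratios):
        mask = (freqs >= lo) & (freqs < hi)
        band = spectrum[mask]
        # Raise magnitude to 1/ratio while keeping phase (simple static compression curve).
        out[mask] = band * (np.abs(band) + 1e-9) ** (1.0 / ratio - 1.0)
    return np.fft.irfft(out, n=len(signal))

sr = 8000
t = np.arange(sr) / sr
sig = np.sin(2 * np.pi * 200 * t) + 0.3 * np.sin(2 * np.pi * 2000 * t)
out = compress_bands(sig, sr, band_edges_hz=[(0, 1000), (1000, 4000)], ratios=[2.0, 4.0])
print(out.shape)
```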
  • Patent number: 11960840
    Abstract: A method executed by a computing device includes determining a set of identigens for each query word of a query to produce sets of identigens, where a set of identigens represents different meanings of a word of the query. The method further includes obtaining a first identigen selection for a first query word from the first set of identigens. The method further includes interpreting, using identigen pairing rules and based on the first identigen selection, the sets of identigens to produce a query entigen group. The method further includes accessing a knowledge database utilizing the query entigen group to produce a response entigen group. The method further includes generating a response to the query using the response entigen group, where the response includes at least one response word.
    Type: Grant
    Filed: June 21, 2021
    Date of Patent: April 16, 2024
    Assignee: entigenlogic LLC
    Inventors: Frank John Williams, Stephen Emerson Sundberg, Ameeta Vasant Reed, Dennis Arlen Roberson, Thomas James MacTavish, Karl Olaf Knutson, Jessy Thomas, Niklas Josiah MacTavish, David Michael Corns, II, Andrew Chu, Kyle Edward Alberth, Ali Fattahian, Zachary John McCord, Ahmad Abdelqader Abunaser, Gary W. Grube
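A toy illustration of identigen selection: each query word has a set of possible meanings, the user's selection fixes the first word's identigen, and a stand-in pairing rule (matching domains) resolves the rest. The identigen sets and the rule are invented for illustration.

```python
# Invented identigen sets: different meanings of each query word.
IDENTIGENS = {
    "bat":   [{"meaning": "flying mammal", "domain": "nature"},
              {"meaning": "wooden club",   "domain": "sports"}],
    "swing": [{"meaning": "sway in air",   "domain": "nature"},
              {"meaning": "strike motion", "domain": "sports"}],
}

def interpret(query_words, first_selection):
    """Fix the first word's identigen from the user's selection, then pair the remaining
    words' identigens so domains agree (a stand-in for identigen pairing rules)."""
    chosen = [first_selection]
    for word in query_words[1:]:
        options = IDENTIGENS[word]
        match = next((o for o in options if o["domain"] == chosen[-1]["domain"]), options[0])
        chosen.append(match)
    return [(w, c["meaning"]) for w, c in zip(query_words, chosen)]

print(interpret(["bat", "swing"], IDENTIGENS["bat"][0]))   # nature reading
print(interpret(["bat", "swing"], IDENTIGENS["bat"][1]))   # sports reading
```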
  • Patent number: 11960850
Abstract: Methods, apparatuses, and computer program products for multilingual simultaneous interpretation using a distributed ledger are disclosed. A multilingual interpretation server determines that a new word has been added to a first language node, the first language node corresponding to a first language, and broadcasts, to a plurality of other language nodes, a request to interpret the new word, wherein each of the plurality of other language nodes corresponds to a different language. Each of the plurality of language nodes interprets the new word into a particular language and adds the new word and one or more interpretations of the new word to an entry in a dictionary ledger. A multilingual interpretation service provides simultaneous multilingual translations from a source language to a plurality of target languages using the shared distributed dictionary ledger. A multilingual interpretation client is provided for accessing services provided by the multilingual interpretation service.
    Type: Grant
    Filed: February 21, 2020
    Date of Patent: April 16, 2024
    Assignee: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Howard N. Anglin, Su Liu, Fehmina Merchant, Debbie Anglin
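A minimal sketch of broadcasting a new word to stand-in language nodes and appending the combined interpretations to a shared dictionary ledger; the ledger here is a plain append-only list rather than a real distributed ledger, and the per-language interpreters are placeholders.

```python
from dataclasses import dataclass, field

@dataclass
class DictionaryLedger:
    """Append-only shared dictionary; a real deployment would replicate it as a distributed ledger."""
    entries: list[dict] = field(default_factory=list)

    def append(self, word: str, interpretations: dict[str, str]) -> None:
        self.entries.append({"word": word, "interpretations": interpretations})

# Stand-in language nodes; the interpret functions are assumptions.
LANGUAGE_NODES = {
    "fr": lambda w: f"{w}-fr",
    "es": lambda w: f"{w}-es",
}

def broadcast_new_word(ledger: DictionaryLedger, word: str) -> None:
    """Ask every other language node for an interpretation and record the combined entry."""
    interpretations = {lang: interpret(word) for lang, interpret in LANGUAGE_NODES.items()}
    ledger.append(word, interpretations)

ledger = DictionaryLedger()
broadcast_new_word(ledger, "blockchain")
print(ledger.entries)
```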
  • Patent number: 11955136
    Abstract: Various embodiments of a system and associated method for detecting and localizing gunshots are disclosed herein.
    Type: Grant
    Filed: March 29, 2021
    Date of Patent: April 9, 2024
    Assignee: Arizona Board of Regents on behalf of Arizona State University
    Inventor: Garth Paine
  • Patent number: 11942093
Abstract: A system and method perform dubbing automatically for multiple languages at the same time, using speech-to-text transcription, language translation, and artificial intelligence engines to perform the actual dubbing in the voice likeness of the original speaker.
    Type: Grant
    Filed: March 5, 2020
    Date of Patent: March 26, 2024
    Assignee: SYNCWORDS LLC
    Inventors: Aleksandr Dubinsky, Taras Sereda
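A skeletal sketch of the dubbing pipeline with every stage stubbed out; the stage functions are placeholders, not the patent's engines.

```python
def speech_to_text(audio: bytes) -> str:
    return "hello everyone"                                     # stub transcription

def translate(text: str, target_lang: str) -> str:
    return {"es": "hola a todos"}.get(target_lang, text)        # stub translation

def synthesize_in_voice(text: str, speaker_embedding) -> bytes:
    return f"[{text} in original speaker's voice]".encode()     # stub voice-cloned TTS

def dub(audio: bytes, speaker_embedding, target_langs: list[str]) -> dict[str, bytes]:
    """Transcribe once, then translate and re-synthesize for every target language."""
    transcript = speech_to_text(audio)
    return {lang: synthesize_in_voice(translate(transcript, lang), speaker_embedding)
            for lang in target_langs}

print(dub(b"...", speaker_embedding=None, target_langs=["es"]))
```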
  • Patent number: 11941360
    Abstract: Systems and methods for natural language processing are described. Embodiments of the inventive concept are configured to receive an input sequence and a plurality of candidate long forms for a short form contained in the input sequence, encode the input sequence to produce an input sequence representation, encode each of the plurality of candidate long forms to produce a plurality of candidate long form representations, wherein each of the candidate long form representations is based on a plurality of sample expressions and each of the sample expressions includes a candidate long form and contextual information, compute a plurality of similarity scores based on the candidate long form representations and the input sequence representation, and select a long form for the short form based on the plurality of similarity scores.
    Type: Grant
    Filed: November 5, 2020
    Date of Patent: March 26, 2024
    Assignee: ADOBE INC.
    Inventors: Franck Dernoncourt, Amir Pouran Ben Veyseh
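A small sketch of the selection step, with a bag-of-words encoder standing in for the learned encoders: each candidate long form is scored against the input sequence via its sample expressions, and the highest-scoring long form is chosen. The vocabulary and example data are assumptions.

```python
import numpy as np

VOCAB = ["machine", "learning", "modeling", "language", "masked", "markup", "hypertext"]

def encode(text: str) -> np.ndarray:
    """Bag-of-words encoder over a tiny vocabulary; a stand-in for the learned encoders."""
    words = text.lower().split()
    vec = np.array([words.count(w) for w in VOCAB], dtype=float)
    return vec / (np.linalg.norm(vec) or 1.0)

def expand_short_form(input_sequence: str, candidates: dict) -> str:
    """Score each candidate long form against the input and return the best one.
    Each candidate is represented by sample expressions that include contextual information."""
    query = encode(input_sequence)
    scores = {long_form: max(float(encode(expr) @ query) for expr in expressions)
              for long_form, expressions in candidates.items()}
    return max(scores, key=scores.get)

candidates = {
    "machine learning": ["ML is machine learning applied to data"],
    "markup language":  ["HTML is a hypertext markup language"],
}
print(expand_short_form("we trained an ML model with deep learning", candidates))
```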