Patents Examined by Shreyans A Patel
-
Patent number: 11790930
Abstract: A system and method for reverberation reduction is disclosed. A first Deep Neural Network (DNN) produces a first estimate of a target direct-path signal from a mixture of acoustic signals that include the target direct-path signal and a reverberation of the target direct-path signal. A filter modeling a room impulse response (RIR) for the first estimate is estimated. The filter, when applied to the first estimate of the target direct-path signal, generates the result closest, according to a distance function, to the residual between the mixture of acoustic signals and the first estimate. A mixture with reduced reverberation of the target direct-path signal is obtained by subtracting this filtered result from the received mixture. A second DNN then produces a second estimate of the target direct-path signal from the mixture with reduced reverberation.
Type: Grant
Filed: March 10, 2022
Date of Patent: October 17, 2023
Assignee: Mitsubishi Electric Research Laboratories, Inc.
Inventors: Zhong-Qiu Wang, Gordon Wichern, Jonathan Le Roux
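The filter-estimation step above amounts to a least-squares fit: find the FIR filter that, applied to the first estimate, comes closest to the residual between the mixture and that estimate. A minimal numpy sketch, with a toy signal standing in for the first DNN output and a hypothetical 4-tap reverberation filter (none of the names or values come from the patent):

```python
import numpy as np

def estimate_rir_filter(estimate, residual, taps=8):
    """Least-squares fit of a FIR filter g so that (g * estimate) ≈ residual."""
    n = len(residual)
    X = np.zeros((n, taps))
    for k in range(taps):          # column k is the estimate delayed by k samples
        X[k:, k] = estimate[:n - k]
    g, *_ = np.linalg.lstsq(X, residual, rcond=None)
    return g, X @ g                # filter taps and the modeled reverberation

rng = np.random.default_rng(0)
direct = rng.standard_normal(400)           # stand-in for the first DNN estimate
true_rir = np.array([0.0, 0.5, 0.25, 0.1])  # hypothetical late-reverberation taps
mixture = direct + np.convolve(direct, true_rir)[:400]

residual = mixture - direct                 # mixture minus the first estimate
g, modeled_reverb = estimate_rir_filter(direct, residual)
dereverbed = mixture - modeled_reverb       # reduced-reverberation mixture
```

In the patented pipeline, this reduced-reverberation mixture would then be fed to the second DNN for a refined estimate.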
-
Patent number: 11783844
Abstract: Disclosed are methods of encoding and decoding an audio signal using side information, and an encoder and a decoder for performing the methods. The method of encoding an audio signal using side information includes identifying an input signal, the input signal being an original audio signal; extracting side information from the input signal using a learning model trained to extract side information from a feature vector of the input signal; encoding the input signal; and generating a bitstream by combining the encoded input signal and the side information.
Type: Grant
Filed: November 16, 2021
Date of Patent: October 10, 2023
Assignees: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE, GWANGJU INSTITUTE OF SCIENCE AND TECHNOLOGY
Inventors: Woo-taek Lim, Seung Kwon Beack, Jongmo Sung, Tae Jin Lee, Inseon Jang, Jong Won Shin, Soojoong Hwang, Youngju Cheon, Sangwook Han
-
Patent number: 11776557
Abstract: Provided is a zero user interface (UI)-based automatic interpretation method including receiving a plurality of speech signals uttered by a plurality of users from a plurality of terminal devices, acquiring a plurality of speech energies from the plurality of received speech signals, determining a main speech signal uttered in a current utterance turn among the plurality of speech signals by comparing the plurality of acquired speech energies, and transmitting an automatic interpretation result, acquired by performing automatic interpretation on the determined main speech signal, to the plurality of terminal devices.
Type: Grant
Filed: April 2, 2021
Date of Patent: October 3, 2023
Assignee: ELECTRONICS AND TELECOMMUNICATIONS RESEARCH INSTITUTE
Inventors: Seung Yun, Sang Hun Kim, Min Kyu Lee
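The utterance-turn decision above reduces to comparing per-signal energies and picking the maximum. A minimal sketch, assuming the energy measure is the sum of squared samples (the patent does not specify the measure):

```python
import numpy as np

def pick_main_speaker(signals):
    """Return the index of the highest-energy signal plus all energies.
    Energy here is the sum of squared samples (an assumed measure)."""
    energies = [float(np.sum(np.square(x))) for x in signals]
    return int(np.argmax(energies)), energies

# Two terminals: a quiet bystander and the active speaker in this turn.
quiet = 0.1 * np.ones(100)
loud = 0.9 * np.ones(100)
main_idx, energies = pick_main_speaker([quiet, loud])  # main_idx == 1
```

Only the signal at `main_idx` would then be sent through automatic interpretation and distributed back to the terminals.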
-
Patent number: 11769481
Abstract: Generation of synthetic speech from an input text sequence may be difficult when durations of individual phonemes forming the input text sequence are unknown. A predominantly parallel process may model speech rhythm as a separate generative distribution such that phoneme duration may be sampled at inference. Additional information such as pitch or energy may also be sampled to provide improved diversity for synthetic speech generation.
Type: Grant
Filed: October 7, 2021
Date of Patent: September 26, 2023
Assignee: Nvidia Corporation
Inventors: Kevin Shih, Jose Rafael Valle Gomes da Costa, Rohan Badlani, Adrian Lancucki, Wei Ping, Bryan Catanzaro
-
Patent number: 11769483
Abstract: A multilingual text-to-speech synthesis method and system are disclosed. The method includes receiving an articulatory feature of a speaker regarding a first language, receiving an input text of a second language, and generating output speech data for the input text of the second language that simulates the speaker's speech by inputting the input text of the second language and the articulatory feature of the speaker regarding the first language to a single artificial neural network multilingual text-to-speech synthesis model. The single artificial neural network multilingual text-to-speech synthesis model is generated by learning similarity information between phonemes of the first language and phonemes of the second language based on first learning data of the first language and second learning data of the second language.
Type: Grant
Filed: November 23, 2021
Date of Patent: September 26, 2023
Assignee: NEOSAPIENCE, INC.
Inventors: Taesu Kim, Younggun Lee
-
Patent number: 11763799
Abstract: An electronic apparatus and a controlling method thereof are provided. The electronic apparatus includes a microphone; a memory configured to store a text-to-speech (TTS) model and a plurality of evaluation texts; and a processor configured to: obtain a first reference vector of a user speech spoken by a user based on the user speech being received through the microphone, generate a plurality of candidate reference vectors based on the first reference vector, obtain a plurality of synthesized sounds by inputting the plurality of candidate reference vectors and the plurality of evaluation texts to the TTS model, identify at least one synthesized sound of the plurality of synthesized sounds based on a similarity between characteristics of the plurality of synthesized sounds and the user speech, and store a second reference vector of the at least one synthesized sound in the memory as a reference vector corresponding to the user for the TTS model.
Type: Grant
Filed: December 17, 2021
Date of Patent: September 19, 2023
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Sangjun Park, Kyoungbo Min, Kihyun Choo, Seungdo Choi
-
Patent number: 11763832
Abstract: Systems and methods for generating an enhanced audio signal comprise a trained neural network configured to receive an input audio signal and generate an enhanced target signal. The trained neural network comprises a pre-processing neural network configured to receive a segment of the input audio signal and output an audio classification, the pre-processing neural network including at least one hidden layer comprising an embedding vector, and a noise reduction neural network configured to receive the segment of the input audio signal and the embedding vector and generate the enhanced target signal. The pre-processing neural network may comprise a target signal pre-processing neural network configured to output a target signal classification and comprising at least one hidden layer comprising a target embedding vector.
Type: Grant
Filed: May 1, 2020
Date of Patent: September 19, 2023
Assignees: Synaptics Incorporated, The Trustees of Indiana University
Inventors: Francesco Nesta, Minje Kim, Sanna Wager
-
Patent number: 11749295
Abstract: Provided is pitch enhancement processing that introduces little unnaturalness even in time segments for consonants, and little unnaturalness to listeners from discontinuities even when time segments for consonants and other time segments switch frequently. A pitch emphasis apparatus carries out the following as the pitch enhancement processing: for a time segment in which the spectral envelope of a signal has been determined to be flat, it obtains an output signal for each time in the time segment, the output signal including the sum of (1) the signal from a time T0 samples in the past, where T0 corresponds to the pitch period of the time segment, multiplied by the pitch gain of the time segment, a predetermined constant B0, and a value greater than 0 and less than 1, and (2) the signal of the current time.
Type: Grant
Filed: August 31, 2022
Date of Patent: September 5, 2023
Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
Inventors: Yutaka Kamamoto, Ryosuke Sugiura, Takehiro Moriya
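The per-time update described above can be sketched as a scaled pitch-period echo added to each sample. The constant B0, the (0, 1)-valued weight (called alpha here), and all numeric values are hypothetical, not taken from the patent:

```python
import numpy as np

def pitch_enhance(x, T0, pitch_gain, B0=0.5, alpha=0.8):
    """For t >= T0: y[t] = x[t] + alpha * B0 * pitch_gain * x[t - T0]."""
    y = x.astype(float).copy()
    y[T0:] += alpha * B0 * pitch_gain * x[:-T0]  # add the scaled pitch-period echo
    return y

x = np.ones(10)                             # toy flat-envelope segment
y = pitch_enhance(x, T0=2, pitch_gain=0.5)  # adds 0.8 * 0.5 * 0.5 = 0.2 for t >= 2
```

Because the added term is a fraction strictly between 0 and 1 of the delayed sample, the enhancement stays bounded and switches between segment types without large discontinuities.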
-
Patent number: 11741942
Abstract: A method, computer program product, and computer system for text-to-speech synthesis is disclosed. Synthetic speech data for an input text may be generated. The synthetic speech data may be compared to recorded reference speech data corresponding to the input text. Based at least in part on the comparison of the synthetic speech data to the recorded reference speech data, at least one feature indicative of at least one difference between the synthetic speech data and the recorded reference speech data may be extracted. A speech gap filling model may be generated based at least in part on the at least one feature extracted. A speech output may be generated based at least in part on the speech gap filling model.
Type: Grant
Filed: August 3, 2022
Date of Patent: August 29, 2023
Assignee: Telepathy Labs, Inc.
Inventors: Piero Perucci, Martin Reber, Vijeta Avijeet
-
Patent number: 11735158
Abstract: This specification describes systems and methods for aging voice audio, in particular voice audio in computer games. According to one aspect of this specification, there is described a method for aging speech audio data. The method comprises: inputting an initial audio signal and an age embedding into a machine-learned age convertor model, wherein the initial audio signal comprises speech audio and the age embedding is based on an age classification of a plurality of speech audio samples of subjects in a target age category; processing, by the machine-learned age convertor model, the initial audio signal and the age embedding to generate an age-altered audio signal, wherein the age-altered audio signal corresponds to a version of the initial audio signal in the target age category; and outputting, from the machine-learned age convertor model, the age-altered audio signal.
Type: Grant
Filed: August 11, 2021
Date of Patent: August 22, 2023
Assignee: ELECTRONIC ARTS INC.
Inventors: Kilol Gupta, Zahra Shakeri, Ping Zhong, Siddharth Gururani, Mohsen Sardari
-
Patent number: 11735164
Abstract: A system, article, and method of automatic speech recognition with highly efficient decoding is accomplished by frequent beam width adjustment.
Type: Grant
Filed: August 9, 2021
Date of Patent: August 22, 2023
Assignee: Intel Corporation
Inventors: Piotr Rozen, Joachim Hofer
-
Patent number: 11727915
Abstract: Disclosed are a method and a terminal for generating simulated voices of virtual teachers. Real voice samples of teachers are collected and converted into text sequences, and a text emotion polarity training set and a text tone training set are constructed according to the text sequences; a lexical item emotion model is constructed based on lexical items in the text sequences and is trained using the emotion polarity training set, yielding word vectors, an emotion polarity vector, and a weight parameter; the similarity between the word vectors and the emotion polarity vector is calculated, emotion features are extracted according to the similarity calculation result, and a conditional vocoder is constructed according to the voice styles and emotion features to generate new voices with emotion changes. The method and the terminal help satisfy the application requirements of high-quality virtual teachers.
Type: Grant
Filed: January 18, 2023
Date of Patent: August 15, 2023
Assignees: Fujian TQ Digital Inc., Central China Normal University
Inventors: Dejian Liu, Zhenhua Fang, Zheng Zhong, Jian Xu
-
Patent number: 11687724
Abstract: Word sense disambiguation using a glossary layer embedded in a deep neural network includes receiving, by one or more processors, input sentences including a plurality of words. At least two words in the plurality of words are homonyms. The one or more processors convert the plurality of words associated with each input sentence into a first vector including possible senses for the at least two words. The first vector is then combined with a second vector including a domain-specific contextual vector associated with the at least two words. The combination of the first vector with the second vector is fed into a recurrent deep logico-neural network model to generate a third vector that includes word senses for the at least two words. A threshold is set for the third vector to generate a fourth vector including a final word sense vector for the at least two words.
Type: Grant
Filed: September 30, 2020
Date of Patent: June 27, 2023
Assignee: International Business Machines Corporation
Inventors: Ismail Yunus Akhalwaya, Naweed Aghmad Khan, Francois Pierre Luus, Ndivhuwo Makondo, Ryan Nelson Riegel, Alexander Gray
-
Patent number: 11682379
Abstract: A method, computer program, and computer system are provided for synthesizing speech at one or more speeds. A context associated with one or more phonemes corresponding to a speaking voice is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes and target acoustic frames, and a voice sample corresponding to the speaking voice is synthesized using the generated mel-spectrogram features.
Type: Grant
Filed: February 24, 2022
Date of Patent: June 20, 2023
Assignee: TENCENT AMERICA LLC
Inventors: Chengzhu Yu, Dong Yu
-
Patent number: 11682388
Abstract: An AI apparatus includes a microphone to acquire speech data including multiple languages, and a processor to acquire text data corresponding to the speech data, determine a main language from the languages included in the text data, acquire translated text data obtained by translating, into the main language, any portion of the text data that is in a language other than the main language, acquire a morpheme analysis result for the translated text data, extract a keyword for intention analysis from the morpheme analysis result, acquire an intention pattern matched to the keyword, and perform an operation corresponding to the intention pattern.
Type: Grant
Filed: June 2, 2022
Date of Patent: June 20, 2023
Assignee: LG ELECTRONICS INC.
Inventors: Yejin Kim, Hyun Yu, Jonghoon Chae
-
Patent number: 11676571
Abstract: A device for speech generation includes one or more processors configured to receive one or more control parameters indicating target speech characteristics. The one or more processors are also configured to process, using a multi-encoder, an input representation of speech based on the one or more control parameters to generate encoded data corresponding to an audio signal that represents a version of the speech based on the target speech characteristics.
Type: Grant
Filed: January 21, 2021
Date of Patent: June 13, 2023
Assignee: QUALCOMM Incorporated
Inventors: Kyungguen Byun, Sunkuk Moon, Shuhua Zhang, Vahid Montazeri, Lae-Hoon Kim, Erik Visser
-
Patent number: 11676577
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for adapting a language model are disclosed. In one aspect, a method includes the actions of receiving transcriptions of utterances that were received by computing devices operating in a domain and that are in a source language. The actions further include generating translated transcriptions of the transcriptions of the utterances in a target language. The actions further include receiving a language model for the target language. The actions further include biasing the language model for the target language by increasing the likelihood of the language model selecting terms included in the translated transcriptions. The actions further include generating a transcription of an utterance in the target language using the biased language model and while operating in the domain.
Type: Grant
Filed: September 9, 2021
Date of Patent: June 13, 2023
Assignee: Google LLC
Inventors: Petar Aleksic, Benjamin Paul Hillson Haynor
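The biasing step can be illustrated with a toy unigram model: raise the probability of terms that appear in the translated transcriptions, then renormalize. The multiplicative boost is an assumption for illustration; the patent does not commit to this particular mechanism:

```python
def bias_language_model(probs, boost_terms, boost=3.0):
    """Raise the probability of terms seen in the translated transcriptions,
    then renormalize so the distribution still sums to 1."""
    biased = {w: p * (boost if w in boost_terms else 1.0) for w, p in probs.items()}
    total = sum(biased.values())
    return {w: p / total for w, p in biased.items()}

# Toy unigram model for the target language; "hola" appears in the translations.
lm = {"hola": 0.2, "adios": 0.2, "gracias": 0.6}
biased = bias_language_model(lm, {"hola"}, boost=3.0)
```

After biasing, a decoder using this model is more likely to select "hola" in a transcription hypothesis, which is the intended effect of transferring domain terms across languages.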
-
Patent number: 11675977
Abstract: Systems, methods, and apparatuses are presented for a novel natural language tokenizer and tagger. In some embodiments, a method for tokenizing text for natural language processing comprises: generating, from a pool of documents, a set of statistical models comprising one or more entries, each indicating the likelihood of appearance of a character/letter sequence in the pool of documents; receiving a set of rules that identify character/letter sequences as valid tokens; transforming one or more entries in the statistical models into new rules that are added to the set of rules when the entries indicate a high likelihood; receiving a document to be processed; dividing the document to be processed into tokens based on the set of statistical models and the set of rules, wherein the statistical models are applied where the rules fail to unambiguously tokenize the document; and outputting the divided tokens for natural language processing.
Type: Grant
Filed: March 27, 2020
Date of Patent: June 13, 2023
Assignee: Daash Intelligence, Inc.
Inventors: Robert J. Munro, Rob Voigt, Schuyler D. Erle, Brendan D. Callahan, Gary C. King, Jessica D. Long, Jason Brenier, Tripti Saxena, Stefan Krawczyk
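The rule stage of such a tokenizer can be sketched as a few regular-expression rules tried in order at each position; the statistical fallback for sequences the rules cannot tokenize unambiguously is stubbed out here, and the rules themselves are illustrative, not taken from the patent:

```python
import re

# Illustrative rules: numbers (with optional decimal), words, single punctuation.
RULES = [re.compile(p) for p in (r"\d+(?:\.\d+)?", r"[A-Za-z]+", r"[^\sA-Za-z\d]")]

def tokenize(text):
    """Rule-first tokenizer: try each rule at the current position; where no
    rule applies, a statistical model would score candidate splits (stubbed)."""
    tokens = []
    i = 0
    while i < len(text):
        if text[i].isspace():
            i += 1
            continue
        for rule in RULES:
            m = rule.match(text, i)
            if m:
                tokens.append(m.group())
                i = m.end()
                break
        else:
            # Statistical fallback would be consulted here; skip the character.
            i += 1
    return tokens

toks = tokenize("Version 2.5 ships!")
```

The patent's extra twist is that high-likelihood entries from the statistical models are promoted into new rules, so the fast rule path grows over time.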
-
Patent number: 11669688
Abstract: A system and a corresponding computer-implemented method identify and classify community-sourced documents as true documents. The community-sourced documents include one or more data objects such as data items, including text, strings, phrases, and words; image items, including still image items, video image items, and icons; and drawing items. The system and corresponding method then report the analysis results.
Type: Grant
Filed: June 7, 2021
Date of Patent: June 6, 2023
Assignee: Architecture Technology Corporation
Inventors: Eric R. Chartier, Andrew Murphy, William Colligan, Paul C. Davis
-
Patent number: 11670311
Abstract: A wireless audio system for encoding and decoding an audio signal using spectral bandwidth replication is provided. Bandwidth extension is performed in the time domain, enabling low-latency audio coding.
Type: Grant
Filed: April 12, 2021
Date of Patent: June 6, 2023
Assignee: Shure Acquisition Holdings, Inc.
Inventors: Wenshun Tian, Michael Ryan Lester