Patents Examined by Richemond Dorvil
  • Patent number: 11887617
    Abstract: An electronic device for speech recognition includes a multi-channel microphone array required for remote speech recognition. The electronic device improves the efficiency and performance of speech recognition in a space where noise other than the speech to be recognized exists. A control method includes receiving a plurality of audio signals output from a plurality of sources through a plurality of microphones, analyzing the audio signals, and obtaining information on the directions from which the audio signals are input and on their input times. A target source for speech recognition among the plurality of sources is determined on the basis of the obtained direction and input-time information, and an audio signal obtained from the determined target source is processed.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: January 30, 2024
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ki Hoon Shin, Jonguk Yoo, Sangmoon Lee
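    Illustrative sketch (Python): the target-source selection step described above, reduced to a toy rule. The AudioEvent record, the wake-word reference direction, and the nearest-direction-after-wake rule are assumptions introduced for illustration, not Samsung's implementation.
      from dataclasses import dataclass

      @dataclass
      class AudioEvent:
          source_id: str
          direction_deg: float  # estimated direction of arrival
          onset_s: float        # input time of the signal

      def angular_distance(a: float, b: float) -> float:
          d = abs(a - b) % 360.0
          return min(d, 360.0 - d)

      def pick_target_source(events, wake_onset_s, wake_direction_deg):
          """Toy rule: among sources heard after the wake word, pick the one
          whose direction is closest to the wake-word direction."""
          candidates = [e for e in events if e.onset_s >= wake_onset_s]
          if not candidates:
              return None
          return min(candidates,
                     key=lambda e: angular_distance(e.direction_deg, wake_direction_deg))

      events = [AudioEvent("tv", 250.0, 0.0), AudioEvent("speaker_a", 32.0, 1.2)]
      print(pick_target_source(events, wake_onset_s=1.0, wake_direction_deg=30.0).source_id)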
  • Patent number: 11886813
    Abstract: A system and method of operating a system for automatically punctuating text using non-recurrent neural networks is disclosed. The system and method at least include: applying a text string to a first component of a non-recurrent neural network trained to generate one or more contextualized vectors, wherein the first component determines the contextualized vectors by processing the words in the text string in parallel with one another; applying the contextualized vectors to a second component of the non-recurrent neural network trained to generate a set of probability values for each word in the text string, wherein the second component determines the sets of probability values by processing the contextualized vectors in parallel with one another; and transmitting the sets of probability values to a text generation engine to generate a formatted text string based on those probability values.
    Type: Grant
    Filed: September 24, 2020
    Date of Patent: January 30, 2024
    Assignee: Capital One Services, LLC
    Inventors: Maury Courtland, Adam Faulkner, Gayle McElvain
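    Illustrative sketch (Python): only the final text-generation step of turning per-word punctuation probabilities into a formatted string. The PUNCT_CLASSES label set and the toy probability rows are assumptions for illustration; the patent's non-recurrent network that would produce these probabilities is not modeled.
      PUNCT_CLASSES = ["", ",", ".", "?"]  # assumed label set, not from the patent

      def format_text(words, prob_rows):
          """prob_rows[i] is a list of probabilities over PUNCT_CLASSES for words[i]."""
          out = []
          for word, probs in zip(words, prob_rows):
              best = PUNCT_CLASSES[max(range(len(probs)), key=probs.__getitem__)]
              out.append(word + best)
          text = " ".join(out)
          return text[:1].upper() + text[1:]

      words = ["hello", "how", "are", "you"]
      prob_rows = [
          [0.1, 0.8, 0.05, 0.05],
          [0.9, 0.05, 0.03, 0.02],
          [0.85, 0.1, 0.03, 0.02],
          [0.05, 0.05, 0.1, 0.8],
      ]
      print(format_text(words, prob_rows))  # -> "Hello, how are you?"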
  • Patent number: 11875132
    Abstract: An example operation may include one or more of: transferring a copy of a plurality of revised translation data sets to be added to an IVR application into a grid structure, each revised translation data set comprising a prompt name in a first field, an IVR prompt in a second field, a translation of the IVR prompt into a different language in a third field, and a timestamp in a fourth field; executing, via a processor, an accuracy validation on the plurality of revised translation data sets, wherein, for each revised translation data set, the processor identifies whether the translation in the third field is an accurate translation of the IVR prompt in the second field based on attributes of the translation and the prompt; and displaying results of the accuracy validation via a user interface.
    Type: Grant
    Filed: May 13, 2021
    Date of Patent: January 16, 2024
    Assignee: Intrado Corporation
    Inventors: Terry Olson, Mark L. Sempek, Roger Wehrle
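    Illustrative sketch (Python) of the four-field grid row and a validation loop. The TranslationRow field names and the placeholder validate_row check are assumptions; the patent's attribute-based accuracy validation is stubbed out.
      from dataclasses import dataclass
      from datetime import datetime

      @dataclass
      class TranslationRow:
          prompt_name: str      # field 1
          ivr_prompt: str       # field 2 (source language)
          translation: str      # field 3 (target language)
          timestamp: datetime   # field 4

      def validate_row(row: TranslationRow) -> bool:
          """Placeholder accuracy check; the real validator compares attributes
          of the prompt and its translation, which is not modeled here."""
          return bool(row.translation.strip()) and row.translation != row.ivr_prompt

      grid = [
          TranslationRow("greeting", "Thank you for calling.",
                         "Gracias por llamar.", datetime.now()),
      ]
      for row in grid:
          print(row.prompt_name, "OK" if validate_row(row) else "FLAGGED")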
  • Patent number: 11875797
    Abstract: A scripted audio production system whose computerized process decreases production time by improving computerized processes and technological systems for pronunciation research, script preparation, narration, editing, proofing, and mastering. The system enables the user to upload a manuscript and recorded audio of its narration. The system then compares the recorded audio against the previously uploaded manuscript, and any mistakes or deviations from the manuscript are highlighted or otherwise indicated to the user. The system automatically pieces together the last-read audio into a clean file without the need for significant user interaction. The process may also be performed on already recorded audio, with the narrator first uploading the audio and the manuscript to the scripted audio production technology system.
    Type: Grant
    Filed: June 22, 2021
    Date of Patent: January 16, 2024
    Assignee: Pozotron Inc.
    Inventors: Jakub Poznanski, Kostiantyn Hlushak
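    Illustrative sketch (Python) of the compare-and-flag step only, using difflib over word sequences. The speech-to-text step that would produce the narration transcript, and the automatic assembly of last-read audio, are out of scope here.
      import difflib

      def flag_deviations(manuscript: str, transcript: str):
          """Return (op, manuscript_span, transcript_span) tuples for every place
          the narration deviates from the manuscript."""
          m_words, t_words = manuscript.split(), transcript.split()
          matcher = difflib.SequenceMatcher(a=m_words, b=t_words)
          return [(op, " ".join(m_words[i1:i2]), " ".join(t_words[j1:j2]))
                  for op, i1, i2, j1, j2 in matcher.get_opcodes() if op != "equal"]

      manuscript = "the quick brown fox jumps over the lazy dog"
      transcript = "the quick brown fox leaps over the dog"
      for op, expected, heard in flag_deviations(manuscript, transcript):
          print(f"{op}: expected '{expected}', heard '{heard}'")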
  • Patent number: 11875779
    Abstract: Disclosed is a voice activity detection (VAD) device and method capable of referring to an environment detection result and thereby selecting one of multiple VAD results as a basis for determining whether a voice activity occurs. The VAD device includes an environment detection circuit, a VAD circuit, and a voice activity decision circuit. The environment detection circuit is configured to process an audio input signal and thereby generate an environment detection result. The VAD circuit is configured to analyze the audio input signal with multiple VAD algorithms and thereby generate multiple VAD results. The voice activity decision circuit is configured to select one of the multiple VAD results according to the environment detection result.
    Type: Grant
    Filed: September 3, 2021
    Date of Patent: January 16, 2024
    Assignee: REALTEK SEMICONDUCTOR CORPORATION
    Inventor: Yi-Cheng Huang
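    Illustrative sketch (Python) of the decision logic only. The environment classes, thresholds, and the PREFERRED_VAD mapping are invented, and the multiple VAD algorithms are represented by precomputed boolean results rather than implemented.
      def detect_environment(frame_energy_db: float, spectral_flatness: float) -> str:
          """Toy environment classifier: stationary noise tends to be spectrally flat."""
          if frame_energy_db < -60.0:
              return "quiet"
          return "stationary_noise" if spectral_flatness > 0.5 else "babble"

      # Hypothetical mapping from detected environment to the VAD result trusted there.
      PREFERRED_VAD = {"quiet": "energy_vad", "stationary_noise": "spectral_vad",
                       "babble": "model_vad"}

      def decide_voice_activity(vad_results: dict, environment: str) -> bool:
          """Select one of several VAD results according to the environment detection result."""
          return vad_results[PREFERRED_VAD[environment]]

      vad_results = {"energy_vad": True, "spectral_vad": False, "model_vad": True}
      env = detect_environment(frame_energy_db=-35.0, spectral_flatness=0.7)
      print(env, decide_voice_activity(vad_results, env))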
  • Patent number: 11869516
    Abstract: A voice processing method is provided for a terminal. The method includes: performing voice speed detection on a voice obtained from a voice source, to obtain a voice speed value of the voice; obtaining a forward error correction (FEC) redundancy; adjusting the FEC redundancy according to the voice speed value to obtain a target redundancy; performing voice encoding on the voice to obtain a voice encoded packet; performing FEC encoding on the voice encoded packet according to the target redundancy to obtain a redundancy packet; and transmitting the redundancy packet and the voice encoded packet to a receiving end.
    Type: Grant
    Filed: November 8, 2021
    Date of Patent: January 9, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Junbin Liang
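    Illustrative sketch (Python), assuming a simple proportional rule in which faster measured speech raises the FEC redundancy, clamped to a range. The units (words per second) and constants are invented, and the patent does not prescribe this particular formula.
      def adjust_fec_redundancy(base_redundancy: float, speech_rate_wps: float,
                                nominal_rate_wps: float = 2.5,
                                lo: float = 0.1, hi: float = 0.5) -> float:
          """Scale the FEC redundancy with the measured speech rate, then clamp."""
          target = base_redundancy * (speech_rate_wps / nominal_rate_wps)
          return max(lo, min(hi, target))

      for rate in (1.5, 2.5, 4.0):
          print(rate, round(adjust_fec_redundancy(0.2, rate), 3))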
  • Patent number: 11868737
    Abstract: Methods and servers for preparing a sequence for a machine processing task. The method includes acquiring: (i) a vocabulary storing tokens, (ii) a merge table indicating possible mergers between pairs of tokens, and (iii) a text sequence. For a given word from the sequence, the method includes using the vocabulary to split the word into an initial sequence of tokens, and iteratively merging tokens of the initial sequence to generate a final sequence for the given word. The iterative merging includes, at a given merging iteration, using the merge table to identify merges between pairs of adjacent tokens in the current sequence of that iteration, excluding at least one merge based on a pre-determined probability, and using the reduced set of merges to generate a new sequence by performing at least one merge. The new sequence is used as the current sequence in the next merging iteration.
    Type: Grant
    Filed: April 24, 2021
    Date of Patent: January 9, 2024
    Assignee: DIRECT CURSUS TECHNOLOGY L.L.C
    Inventors: Dmitry Viktorovich Yemelyanenko, Ivan Sergeevich Provilkov, Elena Aleksandrovna Voyta
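    Illustrative sketch (Python) of the iterative, probabilistically thinned merging. The toy merge_ranks table, the drop probability, and the character-level starting split are assumptions for illustration; the real method operates on a learned vocabulary and merge table.
      import random

      def tokenize(word, merge_ranks, drop_prob=0.1, rng=random):
          """Start from characters; at each iteration, gather applicable merges
          between adjacent tokens, randomly exclude some, and apply the
          best-ranked survivor. Stop when no merge survives."""
          seq = list(word)
          while True:
              candidates = [(merge_ranks[pair], i)
                            for i, pair in enumerate(zip(seq, seq[1:]))
                            if pair in merge_ranks and rng.random() >= drop_prob]
              if not candidates:
                  return seq
              _, i = min(candidates)  # best-ranked surviving merge
              seq = seq[:i] + [seq[i] + seq[i + 1]] + seq[i + 2:]

      merge_ranks = {("h", "e"): 0, ("l", "l"): 1, ("he", "ll"): 2, ("hell", "o"): 3}
      random.seed(0)
      print(tokenize("hello", merge_ranks, drop_prob=0.3))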
  • Patent number: 11842750
    Abstract: A communication transmitting apparatus is connected between IP telephones and includes a tone storage unit configured to store unique tone data T, an adding unit configured to add the tone data T to voice data V transmitted from the IP telephone to generate addition data, an arithmetic processing unit configured to convert a format of the addition data according to a prescribed specification to generate converted data including converted voice data Vc and tone data Tc, a separating unit configured to separate the tone data Tc from the converted data, and a comparison determination unit configured to determine that there is quality degradation in the voice data Vc if the tone data T added to the voice data V before the conversion performed by the arithmetic processing unit differs from the tone data Tc separated from the converted data by the separating unit after the conversion.
    Type: Grant
    Filed: February 13, 2019
    Date of Patent: December 12, 2023
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventor: Takuo Kanamitsu
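    Illustrative sketch (Python), assuming the appended tone occupies a fixed number of trailing bytes that the conversion preserves in place. The toy lossless/lossy conversions merely stand in for the arithmetic processing unit's format conversion.
      def degrade_check(voice: bytes, tone: bytes, convert) -> bool:
          """Append the known tone, convert, separate the tone again, and compare;
          a mismatch is treated as quality degradation in the converted voice."""
          converted = convert(voice + tone)
          converted_tone = converted[-len(tone):]
          return converted_tone != tone  # True -> degradation detected

      lossless = lambda data: data
      lossy = lambda data: bytes(b & 0xFE for b in data)  # drops the low bit of each byte

      voice, tone = b"\x10\x22\x35", b"\x01\x03\x05"
      print(degrade_check(voice, tone, lossless))  # False: tone unchanged
      print(degrade_check(voice, tone, lossy))     # True: tone altered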
  • Patent number: 11837238
    Abstract: A method for evaluating a verification model includes receiving a first and a second set of verification results where each verification result indicates whether a primary model or an alternative model verifies an identity of a user as a registered user. The method further includes identifying each verification result in the first and second sets that includes a performance metric. The method also includes determining a first score of the primary model based on a number of the verification results identified in the first set that includes the performance metric and determining a second score of the alternative model based on a number of the verification results identified in the second set that includes the performance metric. The method further includes determining whether a verification capability of the alternative model is better than a verification capability of the primary model based on the first score and the second score.
    Type: Grant
    Filed: October 21, 2020
    Date of Patent: December 5, 2023
    Assignee: Google LLC
    Inventors: Jason Pelecanos, Pu-sen Chao, Yiling Huang, Quan Wang
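    Illustrative sketch (Python) of the scoring comparison only. The metric name and the representation of a verification result as a set of flags are assumptions for illustration.
      def score(results, metric="verified_registered_user"):
          """Count the verification results that include the performance metric."""
          return sum(1 for r in results if metric in r)

      def alternative_is_better(primary_results, alternative_results) -> bool:
          """Compare the alternative model's score against the primary model's score."""
          return score(alternative_results) > score(primary_results)

      primary = [{"verified_registered_user"}, set(), {"verified_registered_user"}]
      alternative = [{"verified_registered_user"}] * 3
      print(alternative_is_better(primary, alternative))  # True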
  • Patent number: 11830509
    Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
    Type: Grant
    Filed: March 3, 2023
    Date of Patent: November 28, 2023
    Assignee: Dolby International AB
    Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
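    Illustrative sketch (Python) of the flag-controlled dispatch only; none of the actual filterbank, spectral-translation, or harmonic-transposition processing is attempted, and the stub functions exist purely to show the control flow the abstract describes. (The same flow applies to the related Dolby patents listed below.)
      def spectral_translation(filtered_lowband, hfr_metadata):
          return filtered_lowband  # stub: real SBR copy-up not modeled

      def harmonic_transposition(filtered_lowband, hfr_metadata):
          return filtered_lowband  # stub: real transposition not modeled

      def regenerate_highband(filtered_lowband, hfr_metadata, harmonic_flag: bool):
          """Pick the regeneration tool according to the extracted flag."""
          tool = harmonic_transposition if harmonic_flag else spectral_translation
          return tool(filtered_lowband, hfr_metadata)

      print(regenerate_highband([0.0] * 8, hfr_metadata={}, harmonic_flag=True))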
  • Patent number: 11823695
    Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
    Type: Grant
    Filed: March 3, 2023
    Date of Patent: November 21, 2023
    Assignee: Dolby International AB
    Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
  • Patent number: 11823706
    Abstract: A method of detecting human voice activity includes determining a presence of human voice in a frame of audio signal using a plurality of features extracted from the frame of audio signal. The extracted features can include a number of zero-crossings, a periodicity metric, an energy ratio between a low frequency band and a high frequency band, and an envelope-to-floor ratio (EFR) in the frame of audio signal. Each of the features is associated with predefined criteria indicative of a presence of human voice, and based on comparisons of the features to the respective predefined criteria, the voice activity detector determines whether the frame of audio signal includes a human voice.
    Type: Grant
    Filed: October 14, 2019
    Date of Patent: November 21, 2023
    Assignee: Meta Platforms, Inc.
    Inventors: Jun Yang, Joshua Bingham
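    Illustrative sketch (Python) of two of the named features, the zero-crossing count and the low/high band energy ratio, with an invented 1 kHz band split and thresholds; the periodicity and envelope-to-floor features are omitted.
      import numpy as np

      def frame_features(frame: np.ndarray, sample_rate: int = 16000):
          """Compute a subset of the features named in the abstract for one audio frame."""
          zero_crossings = int(np.sum(np.abs(np.diff(np.sign(frame))) > 0))
          spectrum = np.abs(np.fft.rfft(frame)) ** 2
          freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
          low = spectrum[freqs < 1000.0].sum()
          high = spectrum[freqs >= 1000.0].sum() + 1e-12
          return {"zero_crossings": zero_crossings, "low_high_energy_ratio": low / high}

      def is_voice(features, max_zcr=80, min_ratio=1.0) -> bool:
          """Toy decision: voiced speech has few zero crossings and low-band energy dominance."""
          return (features["zero_crossings"] <= max_zcr
                  and features["low_high_energy_ratio"] >= min_ratio)

      t = np.arange(0, 0.02, 1 / 16000)           # 20 ms frame
      voiced = 0.5 * np.sin(2 * np.pi * 200 * t)  # 200 Hz tone as a stand-in for voiced speech
      print(is_voice(frame_features(voiced)))     # True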
  • Patent number: 11823694
    Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
    Type: Grant
    Filed: March 3, 2023
    Date of Patent: November 21, 2023
    Assignee: Dolby International AB
    Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
  • Patent number: 11823656
    Abstract: A method for training a non-autoregressive TTS model includes obtaining a sequence representation of an encoded text sequence concatenated with a variational embedding. The method also includes using a duration model network to predict a phoneme duration for each phoneme represented by the encoded text sequence. Based on the predicted phoneme durations, the method also includes learning an interval representation and an auxiliary attention context representation. The method also includes upsampling, using the interval representation and the auxiliary attention context representation, the sequence representation into an upsampled output specifying a number of frames. The method also includes generating, based on the upsampled output, one or more predicted mel-frequency spectrogram sequences for the encoded text sequence.
    Type: Grant
    Filed: May 21, 2021
    Date of Patent: November 21, 2023
    Assignee: Google LLC
    Inventors: Isaac Elias, Byungha Chun, Jonathan Shen, Ye Jia, Yu Zhang, Yonghui Wu
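    Illustrative sketch (Python) of duration-driven upsampling only, using a hard repeat of each phoneme's encoding; the patent's learned interval representation and auxiliary attention context are not modeled.
      import numpy as np

      def upsample_by_duration(encoded: np.ndarray, durations_frames: np.ndarray) -> np.ndarray:
          """Repeat each phoneme's encoded vector for its predicted number of frames,
          producing a frame-rate sequence a decoder could turn into mel spectrograms."""
          return np.repeat(encoded, durations_frames, axis=0)

      encoded = np.random.randn(4, 8)     # 4 phonemes, 8-dim representations
      durations = np.array([3, 5, 2, 6])  # predicted frames per phoneme
      upsampled = upsample_by_duration(encoded, durations)
      print(upsampled.shape)              # (16, 8) -> total frames, feature dim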
  • Patent number: 11823696
    Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
    Type: Grant
    Filed: March 3, 2023
    Date of Patent: November 21, 2023
    Assignee: Dolby International AB
    Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
  • Patent number: 11810575
    Abstract: An artificial intelligence robot for providing a voice recognition service includes a memory configured to store voice identification information, a microphone configured to receive a voice command, and a processor configured to extract voice identification information from a wake-up command that is included in the voice command and used to activate the voice recognition service, and to keep the voice recognition function deactivated when the extracted voice identification information does not match the voice identification information stored in the memory.
    Type: Grant
    Filed: June 12, 2019
    Date of Patent: November 7, 2023
    Assignee: LG ELECTRONICS INC.
    Inventors: Inho Lee, Junmin Lee
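    Illustrative sketch (Python) of the activation decision only, with speaker identity reduced to an opaque string; the real device extracts and compares voice characteristics rather than string identifiers.
      def handle_wake_command(extracted_voice_id: str, stored_voice_ids: set) -> bool:
          """Return True if the voice recognition service should activate."""
          return extracted_voice_id in stored_voice_ids

      stored = {"owner_voiceprint_hash"}
      print(handle_wake_command("owner_voiceprint_hash", stored))  # True  -> activate
      print(handle_wake_command("stranger_voiceprint", stored))    # False -> stay deactivated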
  • Patent number: 11798577
    Abstract: Methods, apparatus, systems, and articles of manufacture to fingerprint an audio signal. An example apparatus disclosed herein includes an audio segmenter to divide an audio signal into a plurality of audio segments, a bin normalizer to normalize a second audio segment of the plurality to thereby create a first normalized audio segment, a subfingerprint generator to generate a first subfingerprint from the first normalized audio segment, the first subfingerprint including a first portion corresponding to a location of an energy extremum in the normalized second audio segment, a portion strength evaluator to determine a likelihood of the first portion to change, and a portion replacer to, in response to determining that the likelihood does not satisfy a threshold, replace the first portion with a second portion to thereby generate a second subfingerprint.
    Type: Grant
    Filed: March 4, 2021
    Date of Patent: October 24, 2023
    Assignee: Gracenote, Inc.
    Inventors: Alexander Topchy, Christen V. Nielsen, Jeremey M. Davis
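    Illustrative sketch (Python) in the spirit of the portion-replacement idea: the "portion" is taken to be the index of the segment's energy extremum, its "strength" is the margin over the runner-up, and a weak portion is swapped for the runner-up's index. These concretizations are assumptions, not Gracenote's method.
      import numpy as np

      def subfingerprint_portion(segment: np.ndarray, strength_threshold: float = 2.0) -> int:
          """Toy subfingerprint portion: index of the energy extremum; replaced by
          the runner-up's index when the extremum is too weak (likely to change)."""
          energy = segment ** 2
          order = np.argsort(energy)[::-1]
          best, runner_up = order[0], order[1]
          strength = energy[best] / (energy[runner_up] + 1e-12)
          return int(best if strength >= strength_threshold else runner_up)

      rng = np.random.default_rng(0)
      segment = rng.normal(size=32)  # stands in for a normalized audio segment
      print(subfingerprint_portion(segment))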
  • Patent number: 11790173
    Abstract: In various implementations described herein, a partial free-form natural language input may be received from a user at an input component of a computing device. The partial free-form natural language input may identify an entity without identifying a responsive action and may be directed by the user to an automated assistant that operates at least in part on the computing device. The partial free-form natural language input may be analyzed to identify the entity. Based on the identified entity, a plurality or superset of candidate responsive actions may be identified, filtered, and/or ranked based on one or more signals. The automated assistant may then provide output that recommends one or more of the candidate responsive actions based on the ranking and/or filtering.
    Type: Grant
    Filed: October 28, 2020
    Date of Patent: October 17, 2023
    Assignee: GOOGLE LLC
    Inventors: Keun Soo Yim, Kyung Yul Lim, Umesh Patil
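    Illustrative sketch (Python) of the identify-then-rank step with a hard-coded entity/action table. The CANDIDATE_ACTIONS contents and scores are invented, and entity identification is reduced to substring matching.
      # Hypothetical entity -> candidate responsive actions table; not Google's data.
      CANDIDATE_ACTIONS = {
          "restaurant": [("make a reservation", 0.9), ("get directions", 0.7),
                         ("read reviews", 0.5)],
      }

      def recommend_actions(partial_input: str, top_k: int = 2):
          """Identify the entity in the partial input, then rank its candidate actions."""
          for entity, actions in CANDIDATE_ACTIONS.items():
              if entity in partial_input.lower():
                  ranked = sorted(actions, key=lambda a: a[1], reverse=True)
                  return [name for name, _ in ranked[:top_k]]
          return []

      print(recommend_actions("that new restaurant on 5th"))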
  • Patent number: 11790186
    Abstract: Proposed are a machine translation apparatus and a machine translation method for displaying a translation result through a user interface. The machine translation method may include: displaying an initial machine translation result for a first translation target sentence; correcting the initial machine translation result according to a manipulation result of the user on the user interface unit, and displaying the corrected machine translation result; and analyzing a difference between the corrected machine translation result and the initial machine translation result, and reflecting the analysis result to perform machine translation on a second translation target sentence. The apparatus and method can be used to efficiently acquire a high-quality translation in a short time while minimizing the time, cost, and effort that a conventional machine translation process would otherwise require of the user.
    Type: Grant
    Filed: March 24, 2021
    Date of Patent: October 17, 2023
    Assignee: XL8 Inc
    Inventors: Kang Kim, Jin Hyung Park, Young Hoon Jung
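    Illustrative sketch (Python) of the correction-feedback loop, using a phrase-replacement memory built from a word-level diff. The patent's model-level adaptation is not represented; only the flow of learning from the user's edit and applying it to the next sentence is shown.
      import difflib

      class CorrectionMemory:
          """Toy feedback loop: remember phrase-level edits the user made and
          reapply them to later machine translations."""
          def __init__(self):
              self.replacements = {}

          def learn(self, initial_mt: str, corrected: str):
              sm = difflib.SequenceMatcher(a=initial_mt.split(), b=corrected.split())
              for op, i1, i2, j1, j2 in sm.get_opcodes():
                  if op == "replace":
                      self.replacements[" ".join(sm.a[i1:i2])] = " ".join(sm.b[j1:j2])

          def apply(self, machine_translation: str) -> str:
              for old, new in self.replacements.items():
                  machine_translation = machine_translation.replace(old, new)
              return machine_translation

      memory = CorrectionMemory()
      memory.learn("The contract is void", "The agreement is void")
      print(memory.apply("The contract is binding"))  # -> "The agreement is binding"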
  • Patent number: 11775761
    Abstract: A method for mining an entity focus in a text may include: performing word and phrase feature extraction on an input text; inputting an extracted word and phrase feature into a text coding network for coding, to obtain a coding sequence of the input text; processing the coding sequence of the input text using a core entity labeling network to predict a position of a core entity in the input text; extracting a subsequence corresponding to the core entity in the input text from the coding sequence of the input text, based on the position of the core entity in the input text; and predicting a position of a focus corresponding to the core entity in the input text using a focus labeling network, based on the coding sequence of the input text and the subsequence corresponding to the core entity in the input text.
    Type: Grant
    Filed: September 17, 2020
    Date of Patent: October 3, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Shu Wang, Kexin Ren, Xiaohan Zhang, Zhifan Feng, Yang Zhang, Yong Zhu
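    Illustrative sketch (Python) of the last two steps only: slicing the core-entity subsequence out of the coding sequence and scoring candidate focus positions. The similarity-based scorer stands in for the focus labeling network and is an assumption, not Baidu's model.
      import numpy as np

      def extract_entity_subsequence(coding_seq: np.ndarray, start: int, end: int) -> np.ndarray:
          """Slice the encoder outputs covering the predicted core-entity span."""
          return coding_seq[start:end + 1]

      def predict_focus_position(coding_seq: np.ndarray, start: int, end: int) -> int:
          """Stand-in for the focus labeling network: score every position by
          similarity to the mean entity vector, excluding the entity span itself."""
          entity_vec = extract_entity_subsequence(coding_seq, start, end).mean(axis=0)
          scores = coding_seq @ entity_vec
          scores[start:end + 1] = -np.inf  # the focus lies elsewhere in the text
          return int(np.argmax(scores))

      rng = np.random.default_rng(1)
      coding_seq = rng.normal(size=(10, 16))  # 10 tokens, 16-dim encodings
      print(predict_focus_position(coding_seq, start=2, end=3))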