Patents Examined by Richemond Dorvil
-
Patent number: 11887617
Abstract: An electronic device for speech recognition includes a multi-channel microphone array required for remote speech recognition. The electronic device improves efficiency and performance of speech recognition of the electronic device in a space where noise other than speech to be recognized exists. A control method includes receiving a plurality of audio signals output from a plurality of sources through a plurality of microphones and analyzing the audio signals and obtaining information on directions in which the audio signals are input and information on input times of the audio signals. A target source for speech recognition among the plurality of sources is determined on the basis of the obtained information on the directions in which the plurality of audio signals are input, and the obtained information on the input times of the plurality of audio signals, and an audio signal obtained from the determined target source is processed.
Type: Grant
Filed: May 31, 2019
Date of Patent: January 30, 2024
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Ki Hoon Shin, Jonguk Yoo, Sangmoon Lee
-
Patent number: 11886813
Abstract: A system and method of operating a system for automatically punctuating text using non-recurrent neural networks is disclosed. The system and method include at least: applying a text string to a first component of a non-recurrent neural network trained to generate one or more contextualized vectors, wherein the first component determines the contextualized vectors by processing each word in the text string in parallel with one another; applying the contextualized vectors to a second component of the non-recurrent neural network trained to generate a set of probability values for each word in the text string, wherein the second component determines the set of probability values by processing the contextualized vectors in parallel with one another; and transmitting the set of probability values to a text generation engine to generate a formatted text string based on the set of probability values.
Type: Grant
Filed: September 24, 2020
Date of Patent: January 30, 2024
Assignee: Capital One Services, LLC
Inventors: Maury Courtland, Adam Faulkner, Gayle McElvain
-
Patent number: 11875132
Abstract: An example operation may include one or more of transferring a copy of a plurality of revised translation data sets to be added to an IVR application into a grid structure, each revised translation data set comprising a prompt name in a first field, an IVR prompt in a second field, a translation of the IVR prompt into a different language in a third field, and a timestamp in a fourth field, executing, via a processor, an accuracy validation on the plurality of revised translation data sets, wherein, for each revised translation data set, the processor identifies whether a respective translation in a different language in a third field is an accurate translation of a respective IVR prompt in a second field based on attributes of the respective translation and the respective IVR prompt, and displaying results of the accuracy validation via a user interface.
Type: Grant
Filed: May 13, 2021
Date of Patent: January 16, 2024
Assignee: Intrado Corporation
Inventors: Terry Olson, Mark L. Sempek, Roger Wehrle
-
Patent number: 11875797
Abstract: A scripted audio production system in which the scripted audio production computerized process decreases production time by improving computerized processes and technological systems for pronunciation research and script preparation, narration, editing, proofing and mastering. The system enables the user to upload their manuscript and recorded audio of the narration of the manuscript to the system. The system then compares the recorded audio against previously uploaded manuscript and any mistakes or deviations from the manuscript are highlighted or otherwise indicated to the user. The system automatically pieces together the last-read audio into a clean file without the need for significant user interaction. The process may also be performed on the recorded audio by the narrator first uploading the audio and manuscript to the scripted audio production technology system.
Type: Grant
Filed: June 22, 2021
Date of Patent: January 16, 2024
Assignee: Pozotron Inc.
Inventors: Jakub Poznanski, Kostiantyn Hlushak
-
Patent number: 11875779
Abstract: Disclosed is a voice activity detection (VAD) device and method capable of referring to an environment detection result and thereby selecting one of multiple VAD results as a basis for determining whether a voice activity occurs. The VAD device includes an environment detection circuit, a VAD circuit, and a voice activity decision circuit. The environment detection circuit is configured to process an audio input signal and thereby generate an environment detection result. The VAD circuit is configured to analyze the audio input signal with multiple VAD algorithms and thereby generate multiple VAD results. The voice activity decision circuit is configured to select one of the multiple VAD results according to the environment detection result.
Type: Grant
Filed: September 3, 2021
Date of Patent: January 16, 2024
Assignee: REALTEK SEMICONDUCTOR CORPORATION
Inventor: Yi-Cheng Huang
-
Patent number: 11869516
Abstract: A voice processing method is provided for a terminal. The method includes: performing voice speed detection on a voice obtained from a voice source, to obtain a voice speed value of the voice; obtaining a forward error correction (FEC) redundancy; adjusting the FEC redundancy according to the voice speed value to obtain a target redundancy; performing voice encoding on the voice to obtain a voice encoded packet; performing FEC encoding on the voice encoded packet according to the target redundancy to obtain a redundancy packet; and transmitting the redundancy packet and the voice encoded packet to a receiving end.
Type: Grant
Filed: November 8, 2021
Date of Patent: January 9, 2024
Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
Inventor: Junbin Liang
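The redundancy adjustment step in the abstract above can be pictured with a small sketch. The abstract does not say in which direction or by what formula the redundancy changes with voice speed, so everything here is an assumption for illustration: the linear scaling rule, the hypothetical `reference_speed` (in arbitrary units such as syllables per second), the clamp bounds, and both function names.

```python
import math


def target_redundancy(base_redundancy, speed, reference_speed=4.0,
                      lower=0.0, upper=1.0):
    """Scale a base FEC redundancy ratio by the measured voice speed.

    Assumption (not from the abstract): faster speech gets proportionally
    more redundancy, and the result is clamped to [lower, upper].
    """
    if reference_speed <= 0:
        raise ValueError("reference_speed must be positive")
    scaled = base_redundancy * (speed / reference_speed)
    return max(lower, min(upper, scaled))


def redundancy_packet_count(num_voice_packets, redundancy):
    """Number of redundancy packets to emit for a group of voice packets."""
    return math.ceil(num_voice_packets * redundancy)
```

For example, a base redundancy of 0.2 stays at 0.2 when the speed equals the reference and doubles when the speed doubles, until the clamp takes over.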
-
Patent number: 11868737
Abstract: Methods and servers for preparing a sequence for a machine processing task. The method includes acquiring: (i) a vocabulary storing tokens, (ii) a merge table indicating possible mergers between pairs of tokens, and (iii) a text sequence. For a given word from the sequence, the method includes using the vocabulary for splitting the word into an initial sequence, and iteratively merging tokens of the initial sequence to generate a final sequence for the given word. The iterative merging includes, at a given merging iteration, using the merge table for identifying merges between pairs of adjacent tokens in a current sequence of the given merging iteration, excluding at least one merge based on a pre-determined probability, and using the reduced set of merges for generating a new sequence by performing at least one merge. The new sequence is to be used as a current sequence during a next merging iteration.
Type: Grant
Filed: April 24, 2021
Date of Patent: January 9, 2024
Assignee: DIRECT CURSUS TECHNOLOGY L.L.C
Inventors: Dmitry Viktorovich Yemelyanenko, Ivan Sergeevich Provilkov, Elena Aleksandrovna Voyta
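The iterative merging with random exclusion described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented method: it assumes the word is initially split into single characters, that the merge table is a dict mapping a token pair to a priority rank (lower rank merges first), and that one merge is performed per iteration. The names `merge_with_dropout`, `merges`, and `drop_prob` are hypothetical.

```python
import random


def merge_with_dropout(word, merges, drop_prob=0.1, rng=None):
    """Tokenize one word by iterative pair merging with random exclusion.

    merges: dict mapping (left, right) token pairs to a rank (assumed format).
    drop_prob: probability of excluding each candidate merge at an iteration.
    """
    rng = rng or random.Random()
    tokens = list(word)  # assumed initial split: one token per character
    while True:
        # Identify merges available between adjacent tokens.
        candidates = [
            (merges[(left, right)], index)
            for index, (left, right) in enumerate(zip(tokens, tokens[1:]))
            if (left, right) in merges
        ]
        # Exclude each candidate independently with probability drop_prob.
        kept = [c for c in candidates if rng.random() >= drop_prob]
        if not kept:
            break  # nothing left to merge at this iteration
        _, index = min(kept)  # best-ranked surviving merge
        tokens[index:index + 2] = [tokens[index] + tokens[index + 1]]
    return tokens
```

With `drop_prob=0.0` the result is the ordinary deterministic segmentation; with a positive probability the same word can yield different, finer-grained segmentations across calls, while the concatenated tokens always reproduce the word.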
-
Patent number: 11842750
Abstract: A communication transmitting apparatus is connected between IP telephones, and includes a tone storage unit configured to store tone data T that is unique, an adding unit configured to add the tone data T to the voice data V transmitted from the IP telephone to generate addition data, an arithmetic processing unit configured to convert a format of the addition data according to a prescribed specification to generate converted data including converted voice data Vc and tone data Tc, a separating unit configured to separate the tone data Tc from the converted data, and a comparison determination unit configured to determine that if the tone data T added to the voice data V before conversion performed by the arithmetic processing unit is different from the tone data Tc separated from the voice data Vc by the separating unit after the conversion, there is quality degradation in the voice data Vc.
Type: Grant
Filed: February 13, 2019
Date of Patent: December 12, 2023
Assignee: Nippon Telegraph and Telephone Corporation
Inventor: Takuo Kanamitsu
-
Patent number: 11837238
Abstract: A method for evaluating a verification model includes receiving a first and a second set of verification results where each verification result indicates whether a primary model or an alternative model verifies an identity of a user as a registered user. The method further includes identifying each verification result in the first and second sets that includes a performance metric. The method also includes determining a first score of the primary model based on a number of the verification results identified in the first set that includes the performance metric and determining a second score of the alternative model based on a number of the verification results identified in the second set that includes the performance metric. The method further includes determining whether a verification capability of the alternative model is better than a verification capability of the primary model based on the first score and the second score.
Type: Grant
Filed: October 21, 2020
Date of Patent: December 5, 2023
Assignee: Google LLC
Inventors: Jason Pelecanos, Pu-sen Chao, Yiling Huang, Quan Wang
-
Patent number: 11830509
Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
Type: Grant
Filed: March 3, 2023
Date of Patent: November 28, 2023
Assignee: Dolby International AB
Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
-
Patent number: 11823695
Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
Type: Grant
Filed: March 3, 2023
Date of Patent: November 21, 2023
Assignee: Dolby International AB
Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
-
Patent number: 11823706
Abstract: A method of detecting human voice activity includes determining a presence of human voice in a frame of audio signal using a plurality of features extracted from the frame of audio signal. The extracted features can include a number of zero-crossings, a periodicity metric, an energy ratio between a low frequency band and a high frequency band, and an envelope-to-floor ratio (EFR) in the frame of audio signal. Each of the features is associated with predefined criteria indicative of a presence of human voice, and based on comparisons of the features to the respective predefined criteria, the voice activity detector determines whether the frame of audio signal includes a human voice.
Type: Grant
Filed: October 14, 2019
Date of Patent: November 21, 2023
Assignee: Meta Platforms, Inc.
Inventors: Jun Yang, Joshua Bingham
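The per-feature comparison described in the abstract above can be illustrated with a simplified sketch that covers only two of the four listed features: the zero-crossing count and a periodicity metric. The band energy ratio and the envelope-to-floor ratio are omitted. The thresholds, the lag range, the choice of normalized autocorrelation as the periodicity metric, and the rule that every criterion must pass are all assumptions made for illustration, not details taken from the patent.

```python
def zero_crossings(frame):
    """Count sign changes between consecutive samples."""
    return sum(1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0))


def periodicity(frame, min_lag, max_lag):
    """Peak of the energy-normalized autocorrelation over a lag range."""
    energy = sum(x * x for x in frame)
    if energy == 0:
        return 0.0
    best = 0.0
    for lag in range(min_lag, max_lag + 1):
        acf = sum(frame[i] * frame[i + lag] for i in range(len(frame) - lag))
        best = max(best, acf / energy)
    return best


def is_voice(frame, max_zero_crossings=60, min_periodicity=0.5,
             min_lag=40, max_lag=200):
    """Declare voice only if every feature meets its (hypothetical) criterion."""
    return (zero_crossings(frame) <= max_zero_crossings
            and periodicity(frame, min_lag, max_lag) >= min_periodicity)
```

A 512-sample frame holding a 200 Hz sine sampled at 16 kHz is classified as voice under these illustrative thresholds, while an all-zero frame or a sample-by-sample alternating frame is rejected.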
-
Patent number: 11823694
Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
Type: Grant
Filed: March 3, 2023
Date of Patent: November 21, 2023
Assignee: Dolby International AB
Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
-
Patent number: 11823656
Abstract: A method for training a non-autoregressive TTS model includes obtaining a sequence representation of an encoded text sequence concatenated with a variational embedding. The method also includes using a duration model network to predict a phoneme duration for each phoneme represented by the encoded text sequence. Based on the predicted phoneme durations, the method also includes learning an interval representation and an auxiliary attention context representation. The method also includes upsampling, using the interval representation and the auxiliary attention context representation, the sequence representation into an upsampled output specifying a number of frames. The method also includes generating, based on the upsampled output, one or more predicted mel-frequency spectrogram sequences for the encoded text sequence.
Type: Grant
Filed: May 21, 2021
Date of Patent: November 21, 2023
Assignee: Google LLC
Inventors: Isaac Elias, Byungha Chun, Jonathan Shen, Ye Jia, Yu Zhang, Yonghui Wu
-
Patent number: 11823696
Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
Type: Grant
Filed: March 3, 2023
Date of Patent: November 21, 2023
Assignee: Dolby International AB
Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
-
Patent number: 11810575
Abstract: An artificial intelligence robot for providing a voice recognition service includes a memory configured to store voice identification information, a microphone configured to receive a voice command, and a processor configured to extract voice identification information from a wake-up command included in the voice command and used to activate the voice recognition service and operate the voice recognition function in a deactivation state when the extracted voice identification information does not match the voice identification information stored in the memory.
Type: Grant
Filed: June 12, 2019
Date of Patent: November 7, 2023
Assignee: LG ELECTRONICS INC.
Inventors: Inho Lee, Junmin Lee
-
Patent number: 11798577
Abstract: Methods, apparatus, systems, and articles of manufacture to fingerprint an audio signal. An example apparatus disclosed herein includes an audio segmenter to divide an audio signal into a plurality of audio segments, a bin normalizer to normalize the second audio segment to thereby create a first normalized audio segment, a subfingerprint generator to generate a first subfingerprint from the first normalized audio segment, the first subfingerprint including a first portion corresponding to a location of an energy extremum in the normalized second audio segment, a portion strength evaluator to determine a likelihood of the first portion to change, and a portion replacer to, in response to determining the likelihood does not satisfy a threshold, replace the first portion with a second portion to thereby generate a second subfingerprint.
Type: Grant
Filed: March 4, 2021
Date of Patent: October 24, 2023
Assignee: Gracenote, Inc.
Inventors: Alexander Topchy, Christen V. Nielsen, Jeremey M. Davis
-
Patent number: 11790173
Abstract: In various implementations described herein, a partial free-form natural language input may be received from a user at an input component of a computing device. The partial free-form natural language input may identify an entity without identifying a responsive action and may be directed by the user to an automated assistant that operates at least in part on the computing device. The partial free-form natural language input may be analyzed to identify the entity. Based on the identified entity, a plurality or superset of candidate responsive actions may be identified, filtered, and/or ranked based on one or more signals. The automated assistant may then provide output that recommends one or more of the candidate responsive actions based on the ranking and/or filtering.
Type: Grant
Filed: October 28, 2020
Date of Patent: October 17, 2023
Assignee: GOOGLE LLC
Inventors: Keun Soo Yim, Kyung Yul Lim, Umesh Patil
-
Patent number: 11790186
Abstract: Proposed are a machine translation apparatus and a machine translation method for displaying a translation result through a user interface. The machine translation method may include: displaying an initial machine translation result for a first translation target sentence; correcting the initial machine translation result according to a manipulation result of a user on the user interface unit, and displaying the corrected machine translation result; and analyzing a difference between the corrected machine translation result and the initial machine translation result, and reflecting the analysis result to perform machine translation on a second translation target sentence. The machine translation apparatus and the method can be used to efficiently acquire a high-quality translation within a short time while minimizing time, cost and effort of a user, which used to be required for a conventional machine translation process.
Type: Grant
Filed: March 24, 2021
Date of Patent: October 17, 2023
Assignee: XL8 Inc
Inventors: Kang Kim, Jin Hyung Park, Young Hoon Jung
-
Patent number: 11775761
Abstract: A method for mining an entity focus in a text may include: performing word and phrase feature extraction on an input text; inputting an extracted word and phrase feature into a text coding network for coding, to obtain a coding sequence of the input text; processing the coding sequence of the input text using a core entity labeling network to predict a position of a core entity in the input text; extracting a subsequence corresponding to the core entity in the input text from the coding sequence of the input text, based on the position of the core entity in the input text; and predicting a position of a focus corresponding to the core entity in the input text using a focus labeling network, based on the coding sequence of the input text and the subsequence corresponding to the core entity in the input text.
Type: Grant
Filed: September 17, 2020
Date of Patent: October 3, 2023
Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
Inventors: Shu Wang, Kexin Ren, Xiaohan Zhang, Zhifan Feng, Yang Zhang, Yong Zhu