Patents Examined by Richemond Dorvil
  • Patent number: 11887617
    Abstract: An electronic device for speech recognition includes a multi-channel microphone array required for remote speech recognition. The electronic device improves the efficiency and performance of speech recognition in a space where noise other than the speech to be recognized exists. A control method includes receiving a plurality of audio signals output from a plurality of sources through a plurality of microphones, analyzing the audio signals, and obtaining information on the directions from which the audio signals are input and on their input times. A target source for speech recognition among the plurality of sources is determined on the basis of the obtained direction and input-time information, and an audio signal obtained from the determined target source is processed.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: January 30, 2024
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ki Hoon Shin, Jonguk Yoo, Sangmoon Lee
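    Illustrative sketch (Python): the target-source selection step described above, reduced to a toy rule. The AudioEvent record, the wake-word reference direction, and the nearest-direction-after-wake rule are assumptions introduced for illustration, not Samsung's implementation.
      from dataclasses import dataclass

      @dataclass
      class AudioEvent:
          source_id: str
          direction_deg: float  # estimated direction of arrival
          onset_s: float        # input time of the signal

      def angular_distance(a: float, b: float) -> float:
          d = abs(a - b) % 360.0
          return min(d, 360.0 - d)

      def pick_target_source(events, wake_onset_s, wake_direction_deg):
          """Toy rule: among sources heard after the wake word, pick the one
          whose direction is closest to the wake-word direction."""
          candidates = [e for e in events if e.onset_s >= wake_onset_s]
          if not candidates:
              return None
          return min(candidates,
                     key=lambda e: angular_distance(e.direction_deg, wake_direction_deg))

      events = [AudioEvent("tv", 250.0, 0.0), AudioEvent("speaker_a", 32.0, 1.2)]
      print(pick_target_source(events, wake_onset_s=1.0, wake_direction_deg=30.0).source_id)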
  • Patent number: 11886813
    Abstract: A system and method of operating a system for automatically punctuating text using non-recurrent neural networks is disclosed. The system and method at least include: applying a text string to a first component of a non-recurrent neural network trained to generate one or more contextualized vectors, wherein the first component determines the contextualized vectors by processing the words in the text string in parallel with one another; applying the contextualized vectors to a second component of the non-recurrent neural network trained to generate a set of probability values for each word in the text string, wherein the second component determines the sets of probability values by processing the contextualized vectors in parallel with one another; and transmitting the sets of probability values to a text generation engine to generate a formatted text string based on those probability values.
    Type: Grant
    Filed: September 24, 2020
    Date of Patent: January 30, 2024
    Assignee: Capital One Services, LLC
    Inventors: Maury Courtland, Adam Faulkner, Gayle McElvain
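    Illustrative sketch (Python): only the final text-generation step of turning per-word punctuation probabilities into a formatted string. The PUNCT_CLASSES label set and the toy probability rows are assumptions for illustration; the patent's non-recurrent network that would produce these probabilities is not modeled.
      PUNCT_CLASSES = ["", ",", ".", "?"]  # assumed label set, not from the patent

      def format_text(words, prob_rows):
          """prob_rows[i] is a list of probabilities over PUNCT_CLASSES for words[i]."""
          out = []
          for word, probs in zip(words, prob_rows):
              best = PUNCT_CLASSES[max(range(len(probs)), key=probs.__getitem__)]
              out.append(word + best)
          text = " ".join(out)
          return text[:1].upper() + text[1:]

      words = ["hello", "how", "are", "you"]
      prob_rows = [
          [0.1, 0.8, 0.05, 0.05],
          [0.9, 0.05, 0.03, 0.02],
          [0.85, 0.1, 0.03, 0.02],
          [0.05, 0.05, 0.1, 0.8],
      ]
      print(format_text(words, prob_rows))  # -> "Hello, how are you?"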
  • Patent number: 11875132
    Abstract: An example operation may include one or more of: transferring a copy of a plurality of revised translation data sets to be added to an IVR application into a grid structure, each revised translation data set comprising a prompt name in a first field, an IVR prompt in a second field, a translation of the IVR prompt into a different language in a third field, and a timestamp in a fourth field; executing, via a processor, an accuracy validation on the plurality of revised translation data sets, wherein, for each revised translation data set, the processor identifies whether the translation in the third field is an accurate translation of the IVR prompt in the second field based on attributes of the translation and the prompt; and displaying results of the accuracy validation via a user interface.
    Type: Grant
    Filed: May 13, 2021
    Date of Patent: January 16, 2024
    Assignee: Intrado Corporation
    Inventors: Terry Olson, Mark L. Sempek, Roger Wehrle
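    Illustrative sketch (Python) of the four-field grid row and a validation loop. The TranslationRow field names and the placeholder validate_row check are assumptions; the patent's attribute-based accuracy validation is stubbed out.
      from dataclasses import dataclass
      from datetime import datetime

      @dataclass
      class TranslationRow:
          prompt_name: str      # field 1
          ivr_prompt: str       # field 2 (source language)
          translation: str      # field 3 (target language)
          timestamp: datetime   # field 4

      def validate_row(row: TranslationRow) -> bool:
          """Placeholder accuracy check; the real validator compares attributes
          of the prompt and its translation, which is not modeled here."""
          return bool(row.translation.strip()) and row.translation != row.ivr_prompt

      grid = [
          TranslationRow("greeting", "Thank you for calling.",
                         "Gracias por llamar.", datetime.now()),
      ]
      for row in grid:
          print(row.prompt_name, "OK" if validate_row(row) else "FLAGGED")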
  • Patent number: 11875797
    Abstract: A scripted audio production system whose computerized process decreases production time by improving computerized processes and technological systems for pronunciation research, script preparation, narration, editing, proofing, and mastering. The system enables the user to upload a manuscript and recorded audio of its narration. The system then compares the recorded audio against the previously uploaded manuscript, and any mistakes or deviations from the manuscript are highlighted or otherwise indicated to the user. The system automatically pieces together the last-read audio into a clean file without the need for significant user interaction. The process may also be performed on already recorded audio, with the narrator first uploading the audio and the manuscript to the scripted audio production technology system.
    Type: Grant
    Filed: June 22, 2021
    Date of Patent: January 16, 2024
    Assignee: Pozotron Inc.
    Inventors: Jakub Poznanski, Kostiantyn Hlushak
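    Illustrative sketch (Python) of the compare-and-flag step only, using difflib over word sequences. The speech-to-text step that would produce the narration transcript, and the automatic assembly of last-read audio, are out of scope here.
      import difflib

      def flag_deviations(manuscript: str, transcript: str):
          """Return (op, manuscript_span, transcript_span) tuples for every place
          the narration deviates from the manuscript."""
          m_words, t_words = manuscript.split(), transcript.split()
          matcher = difflib.SequenceMatcher(a=m_words, b=t_words)
          return [(op, " ".join(m_words[i1:i2]), " ".join(t_words[j1:j2]))
                  for op, i1, i2, j1, j2 in matcher.get_opcodes() if op != "equal"]

      manuscript = "the quick brown fox jumps over the lazy dog"
      transcript = "the quick brown fox leaps over the dog"
      for op, expected, heard in flag_deviations(manuscript, transcript):
          print(f"{op}: expected '{expected}', heard '{heard}'")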
  • Patent number: 11875779
    Abstract: Disclosed is a voice activity detection (VAD) device and method capable of referring to an environment detection result and thereby selecting one of multiple VAD results as a basis for determining whether a voice activity occurs. The VAD device includes an environment detection circuit, a VAD circuit, and a voice activity decision circuit. The environment detection circuit is configured to process an audio input signal and thereby generate an environment detection result. The VAD circuit is configured to analyze the audio input signal with multiple VAD algorithms and thereby generate multiple VAD results. The voice activity decision circuit is configured to select one of the multiple VAD results according to the environment detection result.
    Type: Grant
    Filed: September 3, 2021
    Date of Patent: January 16, 2024
    Assignee: REALTEK SEMICONDUCTOR CORPORATION
    Inventor: Yi-Cheng Huang
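    Illustrative sketch (Python) of the decision logic only. The environment classes, thresholds, and the PREFERRED_VAD mapping are invented, and the multiple VAD algorithms are represented by precomputed boolean results rather than implemented.
      def detect_environment(frame_energy_db: float, spectral_flatness: float) -> str:
          """Toy environment classifier: stationary noise tends to be spectrally flat."""
          if frame_energy_db < -60.0:
              return "quiet"
          return "stationary_noise" if spectral_flatness > 0.5 else "babble"

      # Hypothetical mapping from detected environment to the VAD result trusted there.
      PREFERRED_VAD = {"quiet": "energy_vad", "stationary_noise": "spectral_vad",
                       "babble": "model_vad"}

      def decide_voice_activity(vad_results: dict, environment: str) -> bool:
          """Select one of several VAD results according to the environment detection result."""
          return vad_results[PREFERRED_VAD[environment]]

      vad_results = {"energy_vad": True, "spectral_vad": False, "model_vad": True}
      env = detect_environment(frame_energy_db=-35.0, spectral_flatness=0.7)
      print(env, decide_voice_activity(vad_results, env))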
  • Patent number: 11869516
    Abstract: A voice processing method is provided for a terminal. The method includes: performing voice speed detection on a voice obtained from a voice source, to obtain a voice speed value of the voice; obtaining a forward error correction (FEC) redundancy; adjusting the FEC redundancy according to the voice speed value to obtain a target redundancy; performing voice encoding on the voice to obtain a voice encoded packet; performing FEC encoding on the voice encoded packet according to the target redundancy to obtain a redundancy packet; and transmitting the redundancy packet and the voice encoded packet to a receiving end.
    Type: Grant
    Filed: November 8, 2021
    Date of Patent: January 9, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Junbin Liang
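    Illustrative sketch (Python), assuming a simple proportional rule in which faster measured speech raises the FEC redundancy, clamped to a range. The units (words per second) and constants are invented, and the patent does not prescribe this particular formula.
      def adjust_fec_redundancy(base_redundancy: float, speech_rate_wps: float,
                                nominal_rate_wps: float = 2.5,
                                lo: float = 0.1, hi: float = 0.5) -> float:
          """Scale the FEC redundancy with the measured speech rate, then clamp."""
          target = base_redundancy * (speech_rate_wps / nominal_rate_wps)
          return max(lo, min(hi, target))

      for rate in (1.5, 2.5, 4.0):
          print(rate, round(adjust_fec_redundancy(0.2, rate), 3))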
  • Patent number: 11868737
    Abstract: Methods and servers for preparing a sequence for a machine processing task. The method includes acquiring: (i) a vocabulary storing tokens, (ii) a merge table indicating possible mergers between pairs of tokens, and (iii) a text sequence. For a given word from the sequence, the method includes using the vocabulary to split the word into an initial sequence of tokens, and iteratively merging tokens of the initial sequence to generate a final sequence for the given word. The iterative merging includes, at a given merging iteration, using the merge table to identify merges between pairs of adjacent tokens in the current sequence of that iteration, excluding at least one merge based on a pre-determined probability, and using the reduced set of merges to generate a new sequence by performing at least one merge. The new sequence is used as the current sequence in the next merging iteration.
    Type: Grant
    Filed: April 24, 2021
    Date of Patent: January 9, 2024
    Assignee: DIRECT CURSUS TECHNOLOGY L.L.C
    Inventors: Dmitry Viktorovich Yemelyanenko, Ivan Sergeevich Provilkov, Elena Aleksandrovna Voyta
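    Illustrative sketch (Python) of the iterative, probabilistically thinned merging. The toy merge_ranks table, the drop probability, and the character-level starting split are assumptions for illustration; the real method operates on a learned vocabulary and merge table.
      import random

      def tokenize(word, merge_ranks, drop_prob=0.1, rng=random):
          """Start from characters; at each iteration, gather applicable merges
          between adjacent tokens, randomly exclude some, and apply the
          best-ranked survivor. Stop when no merge survives."""
          seq = list(word)
          while True:
              candidates = [(merge_ranks[pair], i)
                            for i, pair in enumerate(zip(seq, seq[1:]))
                            if pair in merge_ranks and rng.random() >= drop_prob]
              if not candidates:
                  return seq
              _, i = min(candidates)  # best-ranked surviving merge
              seq = seq[:i] + [seq[i] + seq[i + 1]] + seq[i + 2:]

      merge_ranks = {("h", "e"): 0, ("l", "l"): 1, ("he", "ll"): 2, ("hell", "o"): 3}
      random.seed(0)
      print(tokenize("hello", merge_ranks, drop_prob=0.3))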
  • Patent number: 11842750
    Abstract: A communication transmitting apparatus is connected between IP telephones and includes a tone storage unit configured to store unique tone data T, an adding unit configured to add the tone data T to voice data V transmitted from the IP telephone to generate addition data, an arithmetic processing unit configured to convert a format of the addition data according to a prescribed specification to generate converted data including converted voice data Vc and tone data Tc, a separating unit configured to separate the tone data Tc from the converted data, and a comparison determination unit configured to determine that there is quality degradation in the voice data Vc if the tone data T added to the voice data V before the conversion performed by the arithmetic processing unit differs from the tone data Tc separated from the converted data by the separating unit after the conversion.
    Type: Grant
    Filed: February 13, 2019
    Date of Patent: December 12, 2023
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventor: Takuo Kanamitsu
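    Illustrative sketch (Python), assuming the appended tone occupies a fixed number of trailing bytes that the conversion preserves in place. The toy lossless/lossy conversions merely stand in for the arithmetic processing unit's format conversion.
      def degrade_check(voice: bytes, tone: bytes, convert) -> bool:
          """Append the known tone, convert, separate the tone again, and compare;
          a mismatch is treated as quality degradation in the converted voice."""
          converted = convert(voice + tone)
          converted_tone = converted[-len(tone):]
          return converted_tone != tone  # True -> degradation detected

      lossless = lambda data: data
      lossy = lambda data: bytes(b & 0xFE for b in data)  # drops the low bit of each byte

      voice, tone = b"\x10\x22\x35", b"\x01\x03\x05"
      print(degrade_check(voice, tone, lossless))  # False: tone unchanged
      print(degrade_check(voice, tone, lossy))     # True: tone altered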
  • Patent number: 11837238
    Abstract: A method for evaluating a verification model includes receiving a first and a second set of verification results where each verification result indicates whether a primary model or an alternative model verifies an identity of a user as a registered user. The method further includes identifying each verification result in the first and second sets that includes a performance metric. The method also includes determining a first score of the primary model based on a number of the verification results identified in the first set that includes the performance metric and determining a second score of the alternative model based on a number of the verification results identified in the second set that includes the performance metric. The method further includes determining whether a verification capability of the alternative model is better than a verification capability of the primary model based on the first score and the second score.
    Type: Grant
    Filed: October 21, 2020
    Date of Patent: December 5, 2023
    Assignee: Google LLC
    Inventors: Jason Pelecanos, Pu-sen Chao, Yiling Huang, Quan Wang
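    Illustrative sketch (Python) of the scoring comparison only. The metric name and the representation of a verification result as a set of flags are assumptions for illustration.
      def score(results, metric="verified_registered_user"):
          """Count the verification results that include the performance metric."""
          return sum(1 for r in results if metric in r)

      def alternative_is_better(primary_results, alternative_results) -> bool:
          """Compare the alternative model's score against the primary model's score."""
          return score(alternative_results) > score(primary_results)

      primary = [{"verified_registered_user"}, set(), {"verified_registered_user"}]
      alternative = [{"verified_registered_user"}] * 3
      print(alternative_is_better(primary, alternative))  # True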
  • Patent number: 11830509
    Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
    Type: Grant
    Filed: March 3, 2023
    Date of Patent: November 28, 2023
    Assignee: Dolby International AB
    Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
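    Illustrative sketch (Python) of the flag-controlled dispatch only; none of the actual filterbank, spectral-translation, or harmonic-transposition processing is attempted, and the stub functions exist purely to show the control flow the abstract describes. (The same flow applies to the related Dolby patents listed below.)
      def spectral_translation(filtered_lowband, hfr_metadata):
          return filtered_lowband  # stub: real SBR copy-up not modeled

      def harmonic_transposition(filtered_lowband, hfr_metadata):
          return filtered_lowband  # stub: real transposition not modeled

      def regenerate_highband(filtered_lowband, hfr_metadata, harmonic_flag: bool):
          """Pick the regeneration tool according to the extracted flag."""
          tool = harmonic_transposition if harmonic_flag else spectral_translation
          return tool(filtered_lowband, hfr_metadata)

      print(regenerate_highband([0.0] * 8, hfr_metadata={}, harmonic_flag=True))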
  • Patent number: 11823695
    Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
    Type: Grant
    Filed: March 3, 2023
    Date of Patent: November 21, 2023
    Assignee: Dolby International AB
    Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
  • Patent number: 11823706
    Abstract: A method of detecting human voice activity includes determining a presence of human voice in a frame of audio signal using a plurality of features extracted from the frame of audio signal. The extracted features can include a number of zero-crossings, a periodicity metric, an energy ratio between a low frequency band and a high frequency band, and an envelope-to-floor ratio (EFR) in the frame of audio signal. Each of the features is associated with predefined criteria indicative of a presence of human voice, and based on comparisons of the features to the respective predefined criteria, the voice activity detector determines whether the frame of audio signal includes a human voice.
    Type: Grant
    Filed: October 14, 2019
    Date of Patent: November 21, 2023
    Assignee: Meta Platforms, Inc.
    Inventors: Jun Yang, Joshua Bingham
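    Illustrative sketch (Python) of two of the named features, the zero-crossing count and the low/high band energy ratio, with an invented 1 kHz band split and thresholds; the periodicity and envelope-to-floor features are omitted.
      import numpy as np

      def frame_features(frame: np.ndarray, sample_rate: int = 16000):
          """Compute a subset of the features named in the abstract for one audio frame."""
          zero_crossings = int(np.sum(np.abs(np.diff(np.sign(frame))) > 0))
          spectrum = np.abs(np.fft.rfft(frame)) ** 2
          freqs = np.fft.rfftfreq(len(frame), d=1.0 / sample_rate)
          low = spectrum[freqs < 1000.0].sum()
          high = spectrum[freqs >= 1000.0].sum() + 1e-12
          return {"zero_crossings": zero_crossings, "low_high_energy_ratio": low / high}

      def is_voice(features, max_zcr=80, min_ratio=1.0) -> bool:
          """Toy decision: voiced speech has few zero crossings and low-band energy dominance."""
          return (features["zero_crossings"] <= max_zcr
                  and features["low_high_energy_ratio"] >= min_ratio)

      t = np.arange(0, 0.02, 1 / 16000)           # 20 ms frame
      voiced = 0.5 * np.sin(2 * np.pi * 200 * t)  # 200 Hz tone as a stand-in for voiced speech
      print(is_voice(frame_features(voiced)))     # True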
  • Patent number: 11823694
    Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
    Type: Grant
    Filed: March 3, 2023
    Date of Patent: November 21, 2023
    Assignee: Dolby International AB
    Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
  • Patent number: 11823656
    Abstract: A method for training a non-autoregressive TTS model includes obtaining a sequence representation of an encoded text sequence concatenated with a variational embedding. The method also includes using a duration model network to predict a phoneme duration for each phoneme represented by the encoded text sequence. Based on the predicted phoneme durations, the method also includes learning an interval representation and an auxiliary attention context representation. The method also includes upsampling, using the interval representation and the auxiliary attention context representation, the sequence representation into an upsampled output specifying a number of frames. The method also includes generating, based on the upsampled output, one or more predicted mel-frequency spectrogram sequences for the encoded text sequence.
    Type: Grant
    Filed: May 21, 2021
    Date of Patent: November 21, 2023
    Assignee: Google LLC
    Inventors: Isaac Elias, Byungha Chun, Jonathan Shen, Ye Jia, Yu Zhang, Yonghui Wu
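    Illustrative sketch (Python) of duration-driven upsampling only, using a hard repeat of each phoneme's encoding; the patent's learned interval representation and auxiliary attention context are not modeled.
      import numpy as np

      def upsample_by_duration(encoded: np.ndarray, durations_frames: np.ndarray) -> np.ndarray:
          """Repeat each phoneme's encoded vector for its predicted number of frames,
          producing a frame-rate sequence a decoder could turn into mel spectrograms."""
          return np.repeat(encoded, durations_frames, axis=0)

      encoded = np.random.randn(4, 8)     # 4 phonemes, 8-dim representations
      durations = np.array([3, 5, 2, 6])  # predicted frames per phoneme
      upsampled = upsample_by_duration(encoded, durations)
      print(upsampled.shape)              # (16, 8) -> total frames, feature dim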
  • Patent number: 11823696
    Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
    Type: Grant
    Filed: March 3, 2023
    Date of Patent: November 21, 2023
    Assignee: Dolby International AB
    Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
  • Patent number: 11810575
    Abstract: An artificial intelligence robot for providing a voice recognition service includes a memory configured to store voice identification information, a microphone configured to receive a voice command, and a processor configured to extract voice identification information from a wake-up command that is included in the voice command and used to activate the voice recognition service, and to keep the voice recognition function deactivated when the extracted voice identification information does not match the voice identification information stored in the memory.
    Type: Grant
    Filed: June 12, 2019
    Date of Patent: November 7, 2023
    Assignee: LG ELECTRONICS INC.
    Inventors: Inho Lee, Junmin Lee
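    Illustrative sketch (Python) of the activation decision only, with speaker identity reduced to an opaque string; the real device extracts and compares voice characteristics rather than string identifiers.
      def handle_wake_command(extracted_voice_id: str, stored_voice_ids: set) -> bool:
          """Return True if the voice recognition service should activate."""
          return extracted_voice_id in stored_voice_ids

      stored = {"owner_voiceprint_hash"}
      print(handle_wake_command("owner_voiceprint_hash", stored))  # True  -> activate
      print(handle_wake_command("stranger_voiceprint", stored))    # False -> stay deactivated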
  • Patent number: 11798577
    Abstract: Methods, apparatus, systems, and articles of manufacture to fingerprint an audio signal. An example apparatus disclosed herein includes an audio segmenter to divide an audio signal into a plurality of audio segments, a bin normalizer to normalize a second audio segment of the plurality to thereby create a first normalized audio segment, a subfingerprint generator to generate a first subfingerprint from the first normalized audio segment, the first subfingerprint including a first portion corresponding to a location of an energy extremum in the normalized second audio segment, a portion strength evaluator to determine a likelihood of the first portion to change, and a portion replacer to, in response to determining that the likelihood does not satisfy a threshold, replace the first portion with a second portion to thereby generate a second subfingerprint.
    Type: Grant
    Filed: March 4, 2021
    Date of Patent: October 24, 2023
    Assignee: Gracenote, Inc.
    Inventors: Alexander Topchy, Christen V. Nielsen, Jeremey M. Davis
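    Illustrative sketch (Python) in the spirit of the portion-replacement idea: the "portion" is taken to be the index of the segment's energy extremum, its "strength" is the margin over the runner-up, and a weak portion is swapped for the runner-up's index. These concretizations are assumptions, not Gracenote's method.
      import numpy as np

      def subfingerprint_portion(segment: np.ndarray, strength_threshold: float = 2.0) -> int:
          """Toy subfingerprint portion: index of the energy extremum; replaced by
          the runner-up's index when the extremum is too weak (likely to change)."""
          energy = segment ** 2
          order = np.argsort(energy)[::-1]
          best, runner_up = order[0], order[1]
          strength = energy[best] / (energy[runner_up] + 1e-12)
          return int(best if strength >= strength_threshold else runner_up)

      rng = np.random.default_rng(0)
      segment = rng.normal(size=32)  # stands in for a normalized audio segment
      print(subfingerprint_portion(segment))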
  • Patent number: 11790173
    Abstract: In various implementations described herein, a partial free-form natural language input may be received from a user at an input component of a computing device. The partial free-form natural language input may identify an entity without identifying a responsive action and may be directed by the user to an automated assistant that operates at least in part on the computing device. The partial free-form natural language input may be analyzed to identify the entity. Based on the identified entity, a plurality or superset of candidate responsive actions may be identified, filtered, and/or ranked based on one or more signals. The automated assistant may then provide output that recommends one or more of the candidate responsive actions based on the ranking and/or filtering.
    Type: Grant
    Filed: October 28, 2020
    Date of Patent: October 17, 2023
    Assignee: GOOGLE LLC
    Inventors: Keun Soo Yim, Kyung Yul Lim, Umesh Patil
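    Illustrative sketch (Python) of the identify-then-rank step with a hard-coded entity/action table. The CANDIDATE_ACTIONS contents and scores are invented, and entity identification is reduced to substring matching.
      # Hypothetical entity -> candidate responsive actions table; not Google's data.
      CANDIDATE_ACTIONS = {
          "restaurant": [("make a reservation", 0.9), ("get directions", 0.7),
                         ("read reviews", 0.5)],
      }

      def recommend_actions(partial_input: str, top_k: int = 2):
          """Identify the entity in the partial input, then rank its candidate actions."""
          for entity, actions in CANDIDATE_ACTIONS.items():
              if entity in partial_input.lower():
                  ranked = sorted(actions, key=lambda a: a[1], reverse=True)
                  return [name for name, _ in ranked[:top_k]]
          return []

      print(recommend_actions("that new restaurant on 5th"))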
  • Patent number: 11790186
    Abstract: Proposed are a machine translation apparatus and a machine translation method for displaying a translation result through a user interface. The machine translation method may include: displaying an initial machine translation result for a first translation target sentence; correcting the initial machine translation result according to a manipulation result of the user on the user interface unit, and displaying the corrected machine translation result; and analyzing a difference between the corrected machine translation result and the initial machine translation result, and reflecting the analysis result to perform machine translation on a second translation target sentence. The apparatus and method can be used to efficiently acquire a high-quality translation in a short time while minimizing the time, cost, and effort that a conventional machine translation process would otherwise require of the user.
    Type: Grant
    Filed: March 24, 2021
    Date of Patent: October 17, 2023
    Assignee: XL8 Inc
    Inventors: Kang Kim, Jin Hyung Park, Young Hoon Jung
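    Illustrative sketch (Python) of the correction-feedback loop, using a phrase-replacement memory built from a word-level diff. The patent's model-level adaptation is not represented; only the flow of learning from the user's edit and applying it to the next sentence is shown.
      import difflib

      class CorrectionMemory:
          """Toy feedback loop: remember phrase-level edits the user made and
          reapply them to later machine translations."""
          def __init__(self):
              self.replacements = {}

          def learn(self, initial_mt: str, corrected: str):
              sm = difflib.SequenceMatcher(a=initial_mt.split(), b=corrected.split())
              for op, i1, i2, j1, j2 in sm.get_opcodes():
                  if op == "replace":
                      self.replacements[" ".join(sm.a[i1:i2])] = " ".join(sm.b[j1:j2])

          def apply(self, machine_translation: str) -> str:
              for old, new in self.replacements.items():
                  machine_translation = machine_translation.replace(old, new)
              return machine_translation

      memory = CorrectionMemory()
      memory.learn("The contract is void", "The agreement is void")
      print(memory.apply("The contract is binding"))  # -> "The agreement is binding"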
  • Patent number: 11775761
    Abstract: A method for mining an entity focus in a text may include: performing word and phrase feature extraction on an input text; inputting an extracted word and phrase feature into a text coding network for coding, to obtain a coding sequence of the input text; processing the coding sequence of the input text using a core entity labeling network to predict a position of a core entity in the input text; extracting a subsequence corresponding to the core entity in the input text from the coding sequence of the input text, based on the position of the core entity in the input text; and predicting a position of a focus corresponding to the core entity in the input text using a focus labeling network, based on the coding sequence of the input text and the subsequence corresponding to the core entity in the input text.
    Type: Grant
    Filed: September 17, 2020
    Date of Patent: October 3, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Shu Wang, Kexin Ren, Xiaohan Zhang, Zhifan Feng, Yang Zhang, Yong Zhu
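    Illustrative sketch (Python) of the last two steps only: slicing the core-entity subsequence out of the coding sequence and scoring candidate focus positions. The similarity-based scorer stands in for the focus labeling network and is an assumption, not Baidu's model.
      import numpy as np

      def extract_entity_subsequence(coding_seq: np.ndarray, start: int, end: int) -> np.ndarray:
          """Slice the encoder outputs covering the predicted core-entity span."""
          return coding_seq[start:end + 1]

      def predict_focus_position(coding_seq: np.ndarray, start: int, end: int) -> int:
          """Stand-in for the focus labeling network: score every position by
          similarity to the mean entity vector, excluding the entity span itself."""
          entity_vec = extract_entity_subsequence(coding_seq, start, end).mean(axis=0)
          scores = coding_seq @ entity_vec
          scores[start:end + 1] = -np.inf  # the focus lies elsewhere in the text
          return int(np.argmax(scores))

      rng = np.random.default_rng(1)
      coding_seq = rng.normal(size=(10, 16))  # 10 tokens, 16-dim encodings
      print(predict_focus_position(coding_seq, start=2, end=3))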