Patents Examined by Ethan Daniel Kim
  • Patent number: 11978461
    Abstract: An encoding and decoding method for digital audio watermarking and data hiding in transient acoustic content is disclosed. The audio signal is segmented into overlapping frames and each frame is decomposed into frequency bands. A special transient detector is used to detect frames characterized by transient audio signals (rapidly rising signal amplitude envelope and a relatively broadband spectrum with rapidly evolving spectral content, such as speech fricatives, drum beats, etc.). Frames falling on or containing transients are detected and encoded with binary watermark data by unconditionally hard-modulating the frequency band signals according to rules determined by the value of the respective associated binary data bits of the watermark data and without reference to the characteristics of the watermarked band signals. The resulting watermark is undetectable by human listeners and unusually resistant to the degrading effects of acoustic reverberation.
    Type: Grant
    Filed: November 8, 2021
    Date of Patent: May 7, 2024
    Inventor: Alex Radzishevsky
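    The abstract walks through a concrete chain: overlapping frames, frequency-band decomposition, transient detection on the amplitude envelope, then per-band hard modulation driven by the watermark bits. A minimal NumPy sketch of that chain follows; the frame length, band count, transient-detection ratio, and the gain-based modulation rule are illustrative assumptions, not values taken from the patent.
```python
import numpy as np

def encode_transient_watermark(signal, bits, frame_len=1024, hop=512,
                               n_bands=8, transient_ratio=2.0):
    """Sketch: embed one bit per transient frame by hard-modulating band magnitudes.

    frame_len, n_bands, transient_ratio and the 1.2/0.8 gain rule are illustrative
    assumptions, not values taken from the patent.
    """
    out = signal.astype(np.float64).copy()
    prev_energy = 1e-9
    bit_idx = 0
    for start in range(0, len(signal) - frame_len, hop):
        frame = out[start:start + frame_len]
        energy = np.sum(frame ** 2)
        is_transient = energy / prev_energy > transient_ratio  # rapidly rising envelope
        prev_energy = max(energy, 1e-9)
        if is_transient and bit_idx < len(bits):
            spectrum = np.fft.rfft(frame)
            band_edges = np.linspace(0, len(spectrum), n_bands + 1, dtype=int)
            bit = bits[bit_idx]
            for b in range(n_bands):
                lo, hi = band_edges[b], band_edges[b + 1]
                # hard modulation: boost one set of alternating bands for bit 1,
                # the other set for bit 0, regardless of the band's own content
                gain = 1.2 if (b % 2 == bit) else 0.8
                spectrum[lo:hi] *= gain
            out[start:start + frame_len] = np.fft.irfft(spectrum, n=frame_len)
            bit_idx += 1
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    audio = rng.standard_normal(16000) * np.linspace(0, 1, 16000)  # rising envelope
    marked = encode_transient_watermark(audio, bits=[1, 0, 1, 1])
    print("max sample change:", np.max(np.abs(marked - audio)))
```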
  • Patent number: 11948561
    Abstract: A signal processing method to determine whether or not a detected key-phrase is spoken by a wearer of a headphone. The method receives an accelerometer signal from an accelerometer in a headphone and receives a microphone signal from at least one microphone in the headphone. The method detects a key-phrase using the microphone signal and generates a voice activity detection (VAD) signal based on the accelerometer signal. The method determines whether the VAD signal indicates that the detected key-phrase is spoken by a wearer of the headphone. Responsive to determining that the VAD signal indicates that the detected key-phrase is spoken by the wearer of the headphone, the method triggers a virtual personal assistant (VPA).
    Type: Grant
    Filed: October 28, 2019
    Date of Patent: April 2, 2024
    Assignee: Apple Inc.
    Inventors: Sorin V. Dusan, Sungyub D. Yoo, Dubravko Biruski
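    The decision logic here is a simple gate: the key-phrase detector runs on the microphone signal, the VAD runs on the accelerometer signal, and the VPA fires only when both agree. A small sketch of that gate, assuming an energy-threshold accelerometer VAD and stub detector/trigger callables (none of which come from the patent):
```python
import numpy as np

def wearer_spoke(accel_frame, energy_threshold=0.02):
    """Assumed accelerometer-based VAD: bone-conducted speech from the wearer shows
    up as vibration energy in the accelerometer signal (threshold is illustrative)."""
    return np.mean(accel_frame ** 2) > energy_threshold

def maybe_trigger_vpa(mic_frame, accel_frame, keyphrase_detector, trigger_vpa):
    """Gate the key-phrase trigger on the accelerometer VAD, as in the abstract:
    only trigger the VPA when the detected key-phrase is spoken by the wearer."""
    if keyphrase_detector(mic_frame) and wearer_spoke(accel_frame):
        trigger_vpa()
        return True
    return False

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    mic = rng.standard_normal(16000)
    accel = 0.2 * rng.standard_normal(1600)   # strong vibration -> wearer speaking
    fired = maybe_trigger_vpa(mic, accel,
                              keyphrase_detector=lambda x: True,   # stub detector
                              trigger_vpa=lambda: print("VPA triggered"))
    print("triggered:", fired)
```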
  • Patent number: 11942070
    Abstract: A method, computer system, and a computer program product for speech synthesis is provided. The present invention may include generating one or more final voiceprints. The present invention may include generating one or more voice clones based on the one or more final voiceprints. The present invention may include classifying the one or more voice clones into a grouping using a language model, wherein the language model is trained using manually classified uncloned voice samples. The present invention may include identifying a cluster within the grouping, wherein the cluster is identified by determining a difference between corresponding vectors of the one or more voice clones below a similarity threshold. The present invention may include generating a new archetypal voice by blending the one or more voice clones of the cluster where the difference between the corresponding vectors is below the similarity threshold.
    Type: Grant
    Filed: January 29, 2021
    Date of Patent: March 26, 2024
    Assignee: International Business Machines Corporation
    Inventors: Aaron K. Baughman, Gray Franklin Cannon, Sara Perelman, Gary William Reiss, Corey B. Shelton
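    The clustering step is the most concrete part of the abstract: voice-clone vectors whose pairwise difference falls below a similarity threshold form a cluster, and the cluster is blended into an archetypal voice. A rough sketch under the assumptions that the voiceprints are fixed-length embeddings, the difference is Euclidean distance, and blending is a simple mean:
```python
import numpy as np

def cluster_and_blend(voiceprints, distance_threshold=0.3):
    """Sketch: group voice-clone embedding vectors whose pairwise distance falls
    below a threshold, then 'blend' each cluster into an archetype by averaging.
    The embedding dimension, metric, and threshold are illustrative assumptions."""
    n = len(voiceprints)
    assigned = [-1] * n
    clusters = []
    for i in range(n):
        if assigned[i] >= 0:
            continue
        members = [i]
        assigned[i] = len(clusters)
        for j in range(i + 1, n):
            if assigned[j] < 0 and np.linalg.norm(
                    voiceprints[i] - voiceprints[j]) < distance_threshold:
                members.append(j)
                assigned[j] = len(clusters)
        clusters.append(members)
    # blend: one archetypal voiceprint per cluster (simple mean as a stand-in)
    archetypes = [np.mean([voiceprints[m] for m in members], axis=0)
                  for members in clusters]
    return clusters, archetypes

if __name__ == "__main__":
    rng = np.random.default_rng(2)
    base = rng.standard_normal(16)
    prints = [base + 0.01 * rng.standard_normal(16) for _ in range(3)] + \
             [rng.standard_normal(16)]
    clusters, archetypes = cluster_and_blend(prints)
    print("clusters:", clusters)
```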
  • Patent number: 11935523
    Abstract: There is provided automatic detection of pronunciation errors in spoken words utilizing a neural network model that is trained for a target phoneme. The target phoneme may be a phoneme in the English language. The pronunciation errors may be detected in English words.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: March 19, 2024
    Assignee: Master English Oy
    Inventor: Aleksandr Diment
  • Patent number: 11922967
    Abstract: In one aspect, a method includes detecting a fingerprint match between query fingerprint data representing at least one audio segment within podcast content and reference fingerprint data representing known repetitive content within other podcast content, detecting a feature match between a set of audio features across multiple time-windows of the podcast content, and detecting a text match between at least one query text sentence from a transcript of the podcast content and reference text sentences, the reference text sentences comprising text sentences from the known repetitive content within the other podcast content. The method also includes responsive to the detections, generating sets of labels identifying potential repetitive content within the podcast content. The method also includes selecting, from the sets of labels, a consolidated set of labels identifying segments of repetitive content within the podcast content, and responsive to selecting the consolidated set of labels, performing an action.
    Type: Grant
    Filed: December 10, 2020
    Date of Patent: March 5, 2024
    Assignee: Gracenote, Inc.
    Inventors: Amanmeet Garg, Aneesh Vartakavi
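    The final consolidation step, selecting one set of labels from the three detectors' outputs, can be sketched as a simple voting rule. The tolerance window, the two-vote requirement, and the Label fields below are illustrative assumptions, not the method claimed:
```python
from dataclasses import dataclass
from typing import List

@dataclass
class Label:
    start_s: float   # segment start in seconds
    end_s: float     # segment end in seconds
    source: str      # which detector produced the label

def consolidate(label_sets: List[List[Label]], min_votes: int = 2,
                overlap_s: float = 1.0) -> List[Label]:
    """Sketch of label consolidation: keep a segment when at least min_votes
    detector outputs (fingerprint, feature, text) point at roughly the same
    time range. The voting rule and tolerance are illustrative assumptions."""
    flat = [label for labels in label_sets for label in labels]
    consolidated = []
    for label in flat:
        votes = sum(1 for other in flat
                    if abs(other.start_s - label.start_s) <= overlap_s
                    and abs(other.end_s - label.end_s) <= overlap_s)
        already_kept = any(abs(c.start_s - label.start_s) <= overlap_s
                           for c in consolidated)
        if votes >= min_votes and not already_kept:
            consolidated.append(Label(label.start_s, label.end_s, "consolidated"))
    return consolidated

if __name__ == "__main__":
    fingerprint = [Label(10.0, 40.0, "fingerprint")]
    features    = [Label(10.5, 39.5, "feature")]
    text        = [Label(300.0, 330.0, "text")]
    print(consolidate([fingerprint, features, text]))
```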
  • Patent number: 11908450
    Abstract: A conversation design is received for a conversation bot that enables the conversation bot to provide a service using a conversation flow specified at least in part by the conversation design. The conversation design specifies in a first human language at least a portion of a message content to be provided by the conversation bot. It is identified that an end-user of the conversation bot prefers to converse in a second human language different from the first human language. In response to a determination that the message content is to be provided by the conversation bot to the end-user, the message content of the conversation design is dynamically translated for the end-user from the first human language to the second human language. The translated message content is provided to the end-user in a message from the conversation bot.
    Type: Grant
    Filed: May 26, 2020
    Date of Patent: February 20, 2024
    Assignee: ServiceNow, Inc.
    Inventors: Jebakumar Mathuram Santhosm Swvigaradoss, Satya Sarika Sunkara, Ankit Goel, Rajesh Voleti, Rishabh Verma, Patrick Casey, Rao Surapaneni
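    The dispatch rule reduces to: translate the designed message only when the end-user's preferred language differs from the design language. A minimal sketch, with the translate callable standing in for whatever translation service the platform actually uses:
```python
def deliver_bot_message(message_text, design_language, user_language, translate):
    """Sketch of the dispatch step: translate the designed message only when the
    end-user's preferred language differs from the design language. `translate`
    is a stand-in for the platform's translation service, not a real API."""
    if user_language != design_language:
        return translate(message_text, source=design_language, target=user_language)
    return message_text

if __name__ == "__main__":
    # Hypothetical dictionary-backed translator used only for illustration.
    fake_translate = lambda text, source, target: {"Hello!": "¡Hola!"}.get(text, text)
    print(deliver_bot_message("Hello!", "en", "es", fake_translate))
```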
  • Patent number: 11908486
    Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
    Type: Grant
    Filed: January 20, 2023
    Date of Patent: February 20, 2024
    Assignee: DOLBY INTERNATIONAL AB
    Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
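    The flag-controlled regeneration step (this abstract recurs in the related Dolby grants listed further below) can be sketched as a dispatch between two highband construction paths followed by envelope shaping. The copy/stretch stand-ins and envelope handling below are heavy simplifications of SBR-style reconstruction, offered only to illustrate the control flow:
```python
import numpy as np

def regenerate_highband(lowband_bands, hfr_metadata, use_harmonic_transposition):
    """Sketch of the flag-controlled regeneration: build highband content from the
    lowband either by a translation-like copy or a transposition-like stretch, then
    shape it with the transmitted envelope metadata. Both branches and the envelope
    application are simplified assumptions, not the actual eSBR algorithm."""
    if use_harmonic_transposition:
        # stand-in for harmonic transposition: stretch the lowband band axis
        highband = np.repeat(lowband_bands, 2, axis=0)[: len(lowband_bands)]
    else:
        # stand-in for spectral translation: shift a copy of the lowband upward
        highband = lowband_bands.copy()
    envelope = hfr_metadata["envelope"]          # per-band target energies
    return highband * envelope[:, None]

if __name__ == "__main__":
    rng = np.random.default_rng(3)
    lowband = rng.standard_normal((32, 64))       # 32 bands x 64 time slots
    meta = {"envelope": np.linspace(1.0, 0.1, 32)}
    hb = regenerate_highband(lowband, meta, use_harmonic_transposition=False)
    print(hb.shape)
```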
  • Patent number: 11893135
    Abstract: A system for automated text anonymisation of clinical text, the system including an AI pipeline module to configure symbolic AI pipeline components for detecting protected health information (PHI) in the clinical text; a masking module for masking the detected PHI in the clinical text and generating a de-identified clinical text output file as well as a corresponding label file with de-identified information. The pipeline components may include at least one non-symbolic AI pipeline component or machine learning model.
    Type: Grant
    Filed: February 19, 2021
    Date of Patent: February 6, 2024
    Assignee: Harrison AI Pty Ltd
    Inventor: Benjamin Clayton Hachey
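    A toy version of the detect-then-mask flow, emitting both the de-identified text and a label record. The regex patterns stand in for the symbolic pipeline components; a real system would also run the machine-learning models the abstract mentions. Categories and patterns are assumptions:
```python
import re

# Illustrative regex patterns for a few PHI categories; a production pipeline would
# combine rule-based components with trained NER models, as the abstract notes.
PHI_PATTERNS = {
    "DATE":  re.compile(r"\b\d{1,2}/\d{1,2}/\d{2,4}\b"),
    "PHONE": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
    "MRN":   re.compile(r"\bMRN[:\s]*\d+\b"),
}

def anonymise(clinical_text):
    """Sketch: mask detected PHI and emit a label-file-style record of what was
    removed and where (spans refer to the original text)."""
    labels = []
    masked = clinical_text
    for category, pattern in PHI_PATTERNS.items():
        for match in pattern.finditer(clinical_text):
            labels.append({"category": category, "span": match.span(),
                           "text": match.group()})
        masked = pattern.sub(f"[{category}]", masked)
    return masked, labels

if __name__ == "__main__":
    text = "Patient seen on 03/14/2021, MRN: 112233, callback 555-867-5309."
    masked, labels = anonymise(text)
    print(masked)
    print(labels)
```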
  • Patent number: 11887617
    Abstract: An electronic device for speech recognition includes a multi-channel microphone array required for remote speech recognition. The electronic device improves efficiency and performance of speech recognition of the electronic device in a space where noise other than speech to be recognized exists. A control method includes receiving a plurality of audio signals output from a plurality of sources through a plurality of microphones and analyzing the audio signals and obtaining information on directions in which the audio signals are input and information on input times of the audio signals. A target source for speech recognition among the plurality of sources is determined on the basis of the obtained information on the directions in which the plurality of audio signals are input and the obtained information on the input times of the plurality of audio signals, and an audio signal obtained from the determined target source is processed.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: January 30, 2024
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ki Hoon Shin, Jonguk Yoo, Sangmoon Lee
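    The target-selection rule combines direction-of-arrival and input-time information. One plausible reading, sketched below, prefers the source closest to a reference direction and falls back to the earliest-arriving source; the scoring rule, field names, and tolerance are assumptions rather than the claimed method:
```python
def pick_target_source(sources, wake_direction_deg, angle_tolerance_deg=20.0):
    """Sketch of a target-selection rule over per-source direction and input-time
    estimates. The reference direction, tolerance, and fallback are assumptions."""
    # sources: list of dicts with estimated arrival direction and first-input time
    earliest = min(sources, key=lambda s: s["first_input_time"])
    candidates = [s for s in sources
                  if abs(s["direction_deg"] - wake_direction_deg) <= angle_tolerance_deg]
    if candidates:
        return min(candidates,
                   key=lambda s: abs(s["direction_deg"] - wake_direction_deg))
    return earliest

if __name__ == "__main__":
    sources = [
        {"id": "tv",      "direction_deg": 250.0, "first_input_time": 0.0},
        {"id": "speaker", "direction_deg": 92.0,  "first_input_time": 1.2},
    ]
    print(pick_target_source(sources, wake_direction_deg=90.0)["id"])
```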
  • Patent number: 11875132
    Abstract: An example operation may include one or more of transferring a copy of a plurality of revised translation data sets to be added to an IVR application into a grid structure, each revised translation data set comprising a prompt name in a first field, an IVR prompt in a second field, a translation of the IVR prompt into a different language in a third field, and a timestamp in a fourth field, executing, via a processor, an accuracy validation on the plurality of revised translation data sets, wherein, for each revised translation data set, the processor identifies whether a respective translation in a different language in a third field is an accurate translation of a respective IVR prompt in a second field based on attributes of the respective translation and the respective IVR prompt, and displaying results of the accuracy validation via a user interface.
    Type: Grant
    Filed: May 13, 2021
    Date of Patent: January 16, 2024
    Assignee: Intrado Corporation
    Inventors: Terry Olson, Mark L. Sempek, Roger Wehrle
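    The grid rows and the validation pass can be sketched directly from the four fields named in the abstract. The accuracy check itself is pluggable here; the length-ratio heuristic in the example is purely hypothetical:
```python
from dataclasses import dataclass

@dataclass
class TranslationRow:
    prompt_name: str      # field 1
    ivr_prompt: str       # field 2 (source language)
    translation: str      # field 3 (target language)
    timestamp: str        # field 4

def validate_rows(rows, is_accurate):
    """Sketch of the accuracy-validation pass over the grid: apply a pluggable
    checker to each (prompt, translation) pair and collect displayable results.
    `is_accurate` stands in for whatever attribute-based comparison is used."""
    return [{"prompt_name": r.prompt_name,
             "accurate": is_accurate(r.ivr_prompt, r.translation)} for r in rows]

if __name__ == "__main__":
    rows = [TranslationRow("greeting", "Thank you for calling.",
                           "Gracias por llamar.", "2021-05-13T10:00Z")]
    # Hypothetical length-ratio heuristic used only for illustration.
    checker = lambda src, tgt: 0.5 < len(tgt) / max(len(src), 1) < 2.0
    for result in validate_rows(rows, checker):
        print(result)
```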
  • Patent number: 11842750
    Abstract: A communication transmitting apparatus is connected between IP telephones, and includes a tone storage unit configured to store tone data T that is unique, an adding unit configured to add the tone data T to the voice data V transmitted from the IP telephone to generate addition data, an arithmetic processing unit configured to convert a format of the addition data according to a prescribed specification to generate converted data including converted voice data Vc and tone data Tc, a separating unit configured to separate the tone data Tc from the converted data, and a comparison determination unit configured to determine that there is quality degradation in the voice data Vc if the tone data T added to the voice data V before the conversion performed by the arithmetic processing unit differs from the tone data Tc separated from the voice data Vc by the separating unit after the conversion.
    Type: Grant
    Filed: February 13, 2019
    Date of Patent: December 12, 2023
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventor: Takuo Kanamitsu
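    The four units map onto a short pipeline: append a known tone, convert, separate the tone back out, and compare. A sketch under the assumptions that the tone is appended after the voice samples and that "different" means exceeding a small amplitude tolerance:
```python
import numpy as np

def detect_degradation(voice, tone, convert, tolerance=1e-3):
    """Sketch of the compare-and-determine step: add a known tone to the voice
    data, run the format conversion, separate the tone back out, and flag
    degradation when it no longer matches. Appending the tone after the voice
    samples and the tolerance value are illustrative assumptions."""
    addition = np.concatenate([voice, tone])          # adding unit
    converted = convert(addition)                     # arithmetic processing unit
    converted_tone = converted[len(voice):]           # separating unit
    return np.max(np.abs(converted_tone - tone)) > tolerance  # comparison unit

if __name__ == "__main__":
    rng = np.random.default_rng(4)
    voice = rng.standard_normal(8000)
    tone = np.sin(2 * np.pi * 1000 * np.arange(400) / 8000)
    lossless = lambda x: x.copy()
    lossy = lambda x: np.round(x * 8) / 8              # coarse quantisation
    print("lossless degraded?", detect_degradation(voice, tone, lossless))
    print("lossy degraded?   ", detect_degradation(voice, tone, lossy))
```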
  • Patent number: 11837238
    Abstract: A method for evaluating a verification model includes receiving a first and a second set of verification results where each verification result indicates whether a primary model or an alternative model verifies an identity of a user as a registered user. The method further includes identifying each verification result in the first and second sets that includes a performance metric. The method also includes determining a first score of the primary model based on a number of the verification results identified in the first set that includes the performance metric and determining a second score of the alternative model based on a number of the verification results identified in the second set that includes the performance metric. The method further includes determining whether a verification capability of the alternative model is better than a verification capability of the primary model based on the first score and the second score.
    Type: Grant
    Filed: October 21, 2020
    Date of Patent: December 5, 2023
    Assignee: Google LLC
    Inventors: Jason Pelecanos, Pu-sen Chao, Yiling Huang, Quan Wang
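    The scoring comparison is straightforward to sketch: count the results in each set that carry the performance metric and prefer the alternative model when its count is higher. The metric key and the plain-count scoring below are assumptions:
```python
def compare_verification_models(primary_results, alternative_results,
                                metric_key="accepted_correct"):
    """Sketch of the scoring step: count how many results in each set carry the
    performance metric and prefer the alternative model when its score is higher.
    The metric name and the plain-count scoring are illustrative assumptions."""
    first_score = sum(1 for r in primary_results if r.get(metric_key))
    second_score = sum(1 for r in alternative_results if r.get(metric_key))
    alternative_is_better = second_score > first_score
    return first_score, second_score, alternative_is_better

if __name__ == "__main__":
    primary = [{"accepted_correct": True}, {"accepted_correct": False}]
    alternative = [{"accepted_correct": True}, {"accepted_correct": True}]
    print(compare_verification_models(primary, alternative))
```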
  • Patent number: 11830509
    Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
    Type: Grant
    Filed: March 3, 2023
    Date of Patent: November 28, 2023
    Assignee: Dolby International AB
    Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
  • Patent number: 11823695
    Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
    Type: Grant
    Filed: March 3, 2023
    Date of Patent: November 21, 2023
    Assignee: Dolby International AB
    Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
  • Patent number: 11823694
    Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
    Type: Grant
    Filed: March 3, 2023
    Date of Patent: November 21, 2023
    Assignee: Dolby International AB
    Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
  • Patent number: 11823696
    Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
    Type: Grant
    Filed: March 3, 2023
    Date of Patent: November 21, 2023
    Assignee: Dolby International AB
    Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
  • Patent number: 11817080
    Abstract: Processor(s) of a client device can: receive audio data that captures a spoken utterance of a user of the client device; process, using an on-device speech recognition model, the audio data to generate a predicted textual segment that is a prediction of the spoken utterance; cause at least part of the predicted textual segment to be rendered (e.g., visually and/or audibly); receive further user interface input that is a correction of the predicted textual segment to an alternate textual segment; and generate a gradient based on comparing at least part of the predicted output to ground truth output that corresponds to the alternate textual segment. The gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model and/or is transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.
    Type: Grant
    Filed: October 11, 2019
    Date of Patent: November 14, 2023
    Assignee: GOOGLE LLC
    Inventors: Françoise Beaufays, Johan Schalkwyk, Giovanni Motta
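    The training signal comes from treating the user's correction as ground truth. A toy sketch of that idea, assuming the recogniser exposes a candidate list with logits so the correction can be turned into a cross-entropy gradient; a real on-device model would backpropagate through the full network instead:
```python
import numpy as np

def correction_gradient(logits, candidates, corrected_text):
    """Sketch of turning a user correction into a training signal: treat the
    corrected transcript as the ground-truth class among the decoder's candidate
    hypotheses and compute a cross-entropy gradient w.r.t. the logits. The
    candidate-list setup is an illustrative assumption."""
    target = candidates.index(corrected_text)
    probs = np.exp(logits - logits.max())
    probs /= probs.sum()
    grad = probs.copy()
    grad[target] -= 1.0          # d(cross-entropy)/d(logits)
    return grad

if __name__ == "__main__":
    candidates = ["call mom", "call tom"]
    logits = np.array([0.2, 1.5])          # model preferred the wrong hypothesis
    grad = correction_gradient(logits, candidates, corrected_text="call mom")
    # The gradient can update on-device weights or be sent for federated averaging.
    print(grad)
```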
  • Patent number: 11810586
    Abstract: A noise cancellation method including generating a first voice signal by canceling a first portion of noise included in an input voice signal using a first network, the first network being a trained u-net structure, and the first portion of the noise being in a time domain, applying a first window to the first voice signal, performing a fast Fourier transform on the first windowed voice signal to acquire a magnitude signal and a phase signal, acquiring a mask using a second network based on the magnitude signal, the second network being another trained u-net structure, applying the mask to the magnitude signal, generating a second voice signal by canceling a second portion of the noise by performing an inverse fast Fourier transform on the first windowed voice signal based on the masked magnitude signal and the phase signal, and applying a second window to the second voice signal.
    Type: Grant
    Filed: August 5, 2021
    Date of Patent: November 7, 2023
    Assignee: LINE PLUS CORPORATION
    Inventors: Ki Jun Kim, JongHewk Park
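    The second stage (window, FFT, magnitude mask, inverse FFT, second window) is the easiest part to sketch. Below, a magnitude-threshold mask stands in for the second u-net, and the frame/hop sizes are illustrative assumptions:
```python
import numpy as np

def spectral_mask_stage(voice, frame_len=512, hop=256):
    """Sketch of the second stage described above: window the (time-domain
    denoised) signal, take an FFT to get magnitude and phase, estimate a mask
    from the magnitude, apply it, and invert. The threshold mask is a simple
    stand-in for the second u-net, which is an assumption."""
    window = np.hanning(frame_len)
    out = np.zeros(len(voice))
    for start in range(0, len(voice) - frame_len, hop):
        frame = voice[start:start + frame_len] * window                  # first window
        spectrum = np.fft.rfft(frame)
        magnitude, phase = np.abs(spectrum), np.angle(spectrum)
        mask = (magnitude > 0.5 * magnitude.mean()).astype(float)        # u-net stand-in
        cleaned = mask * magnitude * np.exp(1j * phase)
        out[start:start + frame_len] += np.fft.irfft(cleaned, n=frame_len) * window  # second window
    return out

if __name__ == "__main__":
    rng = np.random.default_rng(5)
    t = np.arange(16000) / 16000
    noisy = np.sin(2 * np.pi * 440 * t) + 0.3 * rng.standard_normal(16000)
    print(spectral_mask_stage(noisy).shape)
```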
  • Patent number: 11810575
    Abstract: An artificial intelligence robot for providing a voice recognition service includes a memory configured to store voice identification information, a microphone configured to receive a voice command, and a processor configured to extract voice identification information from a wake-up command, which is included in the voice command and is used to activate the voice recognition service, and to operate the voice recognition function in a deactivated state when the extracted voice identification information does not match the voice identification information stored in the memory.
    Type: Grant
    Filed: June 12, 2019
    Date of Patent: November 7, 2023
    Assignee: LG ELECTRONICS INC.
    Inventors: Inho Lee, Junmin Lee
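    The gating logic reduces to comparing the voice identification extracted from the wake-up command against the stored identification and keeping recognition deactivated on a mismatch. A sketch assuming embedding vectors compared by cosine similarity with an arbitrary threshold:
```python
import numpy as np

def handle_wake_up(wake_up_embedding, stored_embedding, similarity_threshold=0.8):
    """Sketch of the gating logic: keep the recognition service deactivated when
    the voice identification extracted from the wake-up command does not match the
    stored identification. Cosine similarity and the threshold are assumptions
    about how a 'match' is decided."""
    cosine = np.dot(wake_up_embedding, stored_embedding) / (
        np.linalg.norm(wake_up_embedding) * np.linalg.norm(stored_embedding))
    return "activated" if cosine >= similarity_threshold else "deactivated"

if __name__ == "__main__":
    rng = np.random.default_rng(6)
    owner = rng.standard_normal(32)
    print(handle_wake_up(owner + 0.05 * rng.standard_normal(32), owner))  # owner speaking
    print(handle_wake_up(rng.standard_normal(32), owner))                 # stranger speaking
```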
  • Patent number: 11775761
    Abstract: A method for mining an entity focus in a text may include: performing word and phrase feature extraction on an input text; inputting an extracted word and phrase feature into a text coding network for coding, to obtain a coding sequence of the input text; processing the coding sequence of the input text using a core entity labeling network to predict a position of a core entity in the input text; extracting a subsequence corresponding to the core entity in the input text from the coding sequence of the input text, based on the position of the core entity in the input text; and predicting a position of a focus corresponding to the core entity in the input text using a focus labeling network, based on the coding sequence of the input text and the subsequence corresponding to the core entity in the input text.
    Type: Grant
    Filed: September 17, 2020
    Date of Patent: October 3, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Shu Wang, Kexin Ren, Xiaohan Zhang, Zhifan Feng, Yang Zhang, Yong Zhu
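    The staged flow (encode, label the core-entity span, slice its subsequence, label the focus) can be sketched with the three networks replaced by stub callables. Everything below, including the fixed spans the stubs return, is illustrative:
```python
import numpy as np

def mine_entity_focus(tokens, encode, label_core_entity, label_focus):
    """Sketch of the staged flow described above: encode the text, predict the
    core-entity span, slice its subsequence out of the coding sequence, and feed
    both to the focus-labeling step. The three callables stand in for the text
    coding, core-entity labeling, and focus labeling networks (assumptions)."""
    coding_sequence = encode(tokens)                       # (seq_len, dim)
    ent_start, ent_end = label_core_entity(coding_sequence)
    entity_subsequence = coding_sequence[ent_start:ent_end]
    focus_start, focus_end = label_focus(coding_sequence, entity_subsequence)
    return tokens[ent_start:ent_end], tokens[focus_start:focus_end]

if __name__ == "__main__":
    rng = np.random.default_rng(7)
    tokens = "the new phone has a remarkable camera".split()
    encode = lambda toks: rng.standard_normal((len(toks), 8))
    # Stub predictors returning fixed spans, for illustration only.
    entity, focus = mine_entity_focus(tokens, encode,
                                      label_core_entity=lambda seq: (2, 3),
                                      label_focus=lambda seq, sub: (6, 7))
    print("core entity:", entity, "focus:", focus)
```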