Patents Examined by Ethan Daniel Kim
-
Patent number: 11978461
Abstract: An encoding and decoding method for digital audio watermarking and data hiding in transient acoustic content is disclosed. The audio signal is segmented into overlapping frames and each frame is decomposed into frequency bands. A special transient detector detects frames characterized by transient audio signals (a rapidly rising signal amplitude envelope and a relatively broadband spectrum with rapidly evolving spectral content, such as speech fricatives, drum beats, etc.). Frames falling on or containing transients are encoded with binary watermark data by unconditionally hard-modulating the frequency band signals according to rules determined by the values of the associated binary data bits of the watermark data, without reference to the characteristics of the watermarked band signals. The method is undetectable by human listeners and unusually resistant to the degrading effects of acoustic reverberation.
Type: Grant
Filed: November 8, 2021
Date of Patent: May 7, 2024
Inventor: Alex Radzishevsky
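The "unconditional hard modulation" idea above can be sketched as forcing a fixed magnitude relationship onto pairs of frequency bands, one watermark bit per pair, regardless of the bands' original levels. The pairing rule, boost factor, and function names below are hypothetical illustrations, not the patented scheme:

```python
import numpy as np

def hard_modulate_bands(band_mags, bits, boost=2.0):
    """Hard-modulate pairs of band magnitudes to carry one bit each:
    bit 1 -> the first band of the pair is made the larger, bit 0 -> the
    second. Applied unconditionally, i.e. without reference to the bands'
    original relative levels (hypothetical rule for illustration)."""
    out = band_mags.astype(float).copy()
    for i, bit in enumerate(bits):
        a, b = 2 * i, 2 * i + 1
        hi, lo = (a, b) if bit else (b, a)
        base = max(out[a], out[b], 1e-12)
        out[hi] = base * boost
        out[lo] = base / boost
    return out

def decode_bits(band_mags, n_bits):
    """Recover the bits by comparing each pair's magnitudes."""
    return [1 if band_mags[2 * i] > band_mags[2 * i + 1] else 0
            for i in range(n_bits)]

mags = np.array([0.5, 0.9, 0.3, 0.2, 0.7, 0.1])
assert decode_bits(hard_modulate_bands(mags, [1, 0, 1]), 3) == [1, 0, 1]
```

Because the relationship is imposed by a hard rule rather than by subtle level nudges, a simple magnitude comparison recovers the bits even after the bands have been smeared by reverberation, which is the robustness the abstract claims.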
-
Patent number: 11948561
Abstract: A signal processing method determines whether a detected key-phrase is spoken by the wearer of a headphone. The method receives an accelerometer signal from an accelerometer in the headphone and a microphone signal from at least one microphone in the headphone. The method detects a key-phrase using the microphone signal and generates a voice activity detection (VAD) signal based on the accelerometer signal. The method then determines whether the VAD signal indicates that the detected key-phrase was spoken by the wearer of the headphone and, responsive to determining that it was, triggers a virtual personal assistant (VPA).
Type: Grant
Filed: October 28, 2019
Date of Patent: April 2, 2024
Assignee: Apple Inc.
Inventors: Sorin V. Dusan, Sungyub D. Yoo, Dubravko Biruski
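The gating logic above can be sketched in a few lines: an accelerometer picks up bone-conduction vibration only when the wearer speaks, so a simple energy threshold on it distinguishes wearer speech from a bystander saying the same key-phrase. The threshold and function names are illustrative assumptions, not from the patent:

```python
def accel_vad(accel_frame, threshold=0.01):
    """Hypothetical VAD: mean-square accelerometer energy above a
    threshold implies the wearer is speaking (bone conduction)."""
    energy = sum(x * x for x in accel_frame) / len(accel_frame)
    return energy > threshold

def should_trigger_vpa(key_phrase_detected, accel_frame):
    """Trigger the assistant only when the mic-detected key-phrase
    coincides with wearer speech per the accelerometer VAD."""
    return key_phrase_detected and accel_vad(accel_frame)

# Wearer speaking (strong vibration) vs. bystander (near-silent accel):
assert should_trigger_vpa(True, [0.3, -0.2, 0.25, -0.3]) is True
assert should_trigger_vpa(True, [0.001, -0.001, 0.0, 0.001]) is False
```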
-
Patent number: 11942070
Abstract: A method, computer system, and computer program product for speech synthesis is provided. The invention may include generating one or more final voiceprints; generating one or more voice clones based on the final voiceprints; classifying the voice clones into a grouping using a language model trained on manually classified uncloned voice samples; identifying a cluster within the grouping, the cluster being identified by determining that the difference between corresponding vectors of the voice clones is below a similarity threshold; and generating a new archetypal voice by blending the voice clones of the cluster whose corresponding vectors differ by less than the similarity threshold.
Type: Grant
Filed: January 29, 2021
Date of Patent: March 26, 2024
Assignee: International Business Machines Corporation
Inventors: Aaron K. Baughman, Gray Franklin Cannon, Sara Perelman, Gary William Reiss, Corey B. Shelton
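The clustering-then-blending step can be sketched as follows: gather clone embeddings whose pairwise distance stays under the threshold, then average them into one archetypal embedding. The greedy grouping, Euclidean distance, and averaging are assumptions for illustration; the patent does not specify these choices:

```python
import numpy as np

def find_cluster(embeddings, threshold=0.3):
    """Greedy sketch: collect clone embeddings whose pairwise Euclidean
    distance to every member already in the cluster is below the
    similarity threshold."""
    cluster = [0]
    for j in range(1, len(embeddings)):
        if all(np.linalg.norm(embeddings[j] - embeddings[i]) < threshold
               for i in cluster):
            cluster.append(j)
    return cluster

def blend_archetype(embeddings, cluster):
    """Blend the cluster's clones into one archetypal voice embedding by
    averaging (one plausible reading of 'blending')."""
    return np.mean([embeddings[i] for i in cluster], axis=0)

emb = np.array([[0.0, 0.0], [0.1, 0.1], [5.0, 5.0]])
c = find_cluster(emb)
assert c == [0, 1]          # the distant clone is excluded
assert np.allclose(blend_archetype(emb, c), [0.05, 0.05])
```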
-
Patent number: 11935523
Abstract: Automatic detection of pronunciation errors in spoken words is provided, utilizing a neural network model trained for a target phoneme. The target phoneme may be a phoneme of the English language, and the pronunciation errors may be detected in English words.
Type: Grant
Filed: November 15, 2019
Date of Patent: March 19, 2024
Assignee: Master English Oy
Inventor: Aleksandr Diment
-
Patent number: 11922967
Abstract: In one aspect, a method includes detecting a fingerprint match between query fingerprint data representing at least one audio segment within podcast content and reference fingerprint data representing known repetitive content within other podcast content, detecting a feature match between a set of audio features across multiple time-windows of the podcast content, and detecting a text match between at least one query text sentence from a transcript of the podcast content and reference text sentences, the reference text sentences comprising text sentences from the known repetitive content within the other podcast content. Responsive to the detections, the method generates sets of labels identifying potential repetitive content within the podcast content, selects from those sets a consolidated set of labels identifying segments of repetitive content within the podcast content, and, responsive to selecting the consolidated set of labels, performs an action.
Type: Grant
Filed: December 10, 2020
Date of Patent: March 5, 2024
Assignee: Gracenote, Inc.
Inventors: Amanmeet Garg, Aneesh Vartakavi
-
Patent number: 11908450
Abstract: A conversation design is received for a conversation bot that enables the conversation bot to provide a service using a conversation flow specified at least in part by the conversation design. The conversation design specifies in a first human language at least a portion of a message content to be provided by the conversation bot. It is identified that an end-user of the conversation bot prefers to converse in a second human language different from the first human language. In response to a determination that the message content is to be provided by the conversation bot to the end-user, the message content of the conversation design is dynamically translated for the end-user from the first human language to the second human language. The translated message content is provided to the end-user in a message from the conversation bot.
Type: Grant
Filed: May 26, 2020
Date of Patent: February 20, 2024
Assignee: ServiceNow, Inc.
Inventors: Jebakumar Mathuram Santhosm Swvigaradoss, Satya Sarika Sunkara, Ankit Goel, Rajesh Voleti, Rishabh Verma, Patrick Casey, Rao Surapaneni
-
Patent number: 11908486
Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
Type: Grant
Filed: January 20, 2023
Date of Patent: February 20, 2024
Assignee: Dolby International AB
Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
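The flag-controlled regeneration in this family of patents (this and the three continuations below) can be illustrated on a toy magnitude spectrum: spectral translation copies the low band upward unchanged, while harmonic transposition stretches it in frequency so bin k lands at bin 2k. This is a heavily simplified sketch; the real SBR/HFR tool operates on QMF subbands shaped by envelope metadata:

```python
import numpy as np

def regenerate_highband(low_mags, flag_harmonic):
    """Sketch of flag-controlled high-frequency regeneration on a
    low-band magnitude spectrum. flag_harmonic selects harmonic
    transposition (factor-2 frequency stretch) over spectral
    translation (copy-up)."""
    n = len(low_mags)
    if flag_harmonic:
        # 2x transposition: bin k maps to bin 2k; odd bins stay empty
        high = np.zeros(n)
        high[::2] = low_mags[: (n + 1) // 2]
    else:
        # translation: shift an unchanged copy of the low band upward
        high = low_mags.copy()
    return np.concatenate([low_mags, high])

low = np.array([4.0, 3.0, 2.0, 1.0])
assert np.allclose(regenerate_highband(low, False)[4:], [4.0, 3.0, 2.0, 1.0])
assert np.allclose(regenerate_highband(low, True)[4:], [4.0, 0.0, 3.0, 0.0])
```

Harmonic transposition preserves the harmonic spacing of pitched material at the cost of spectral gaps, while translation preserves spectral density; the bitstream flag lets the encoder pick per signal.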
-
Patent number: 11893135
Abstract: A system for automated text anonymisation of clinical text, the system including an AI pipeline module to configure symbolic AI pipeline components for detecting protected health information (PHI) in the clinical text, and a masking module for masking the detected PHI in the clinical text and generating a de-identified clinical text output file as well as a corresponding label file with de-identified information. The pipeline components may include at least one non-symbolic AI pipeline component or machine learning model.
Type: Grant
Filed: February 19, 2021
Date of Patent: February 6, 2024
Assignee: Harrison AI Pty Ltd
Inventor: Benjamin Clayton Hachey
-
Patent number: 11887617
Abstract: An electronic device for speech recognition includes a multi-channel microphone array required for remote speech recognition. The electronic device improves efficiency and performance of speech recognition in a space where noise other than the speech to be recognized exists. A control method includes receiving a plurality of audio signals output from a plurality of sources through a plurality of microphones, analyzing the audio signals, and obtaining information on the directions from which the audio signals are input and information on the input times of the audio signals. A target source for speech recognition among the plurality of sources is determined on the basis of the obtained direction and input-time information, and an audio signal obtained from the determined target source is processed.
Type: Grant
Filed: May 31, 2019
Date of Patent: January 30, 2024
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Ki Hoon Shin, Jonguk Yoo, Sangmoon Lee
-
Patent number: 11875132
Abstract: An example operation may include one or more of: transferring a copy of a plurality of revised translation data sets to be added to an IVR application into a grid structure, each revised translation data set comprising a prompt name in a first field, an IVR prompt in a second field, a translation of the IVR prompt into a different language in a third field, and a timestamp in a fourth field; executing, via a processor, an accuracy validation on the plurality of revised translation data sets, wherein, for each revised translation data set, the processor identifies whether the translation in the third field is an accurate translation of the IVR prompt in the second field based on attributes of the translation and the prompt; and displaying results of the accuracy validation via a user interface.
Type: Grant
Filed: May 13, 2021
Date of Patent: January 16, 2024
Assignee: Intrado Corporation
Inventors: Terry Olson, Mark L. Sempek, Roger Wehrle
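The grid-plus-validation flow above amounts to iterating over four-field rows and applying a pluggable accuracy check to the prompt/translation pair. The row layout follows the abstract; the toy check (numeric placeholders must survive translation) and all names are illustrative assumptions, since the patent does not specify the comparison:

```python
def validate_translations(rows, check):
    """Run an accuracy validation over revised translation rows of the
    form (prompt_name, ivr_prompt, translation, timestamp); 'check' is a
    pluggable comparison over attributes of the pair. Returns per-row
    pass/fail results for display in a user interface."""
    results = []
    for prompt_name, prompt, translation, timestamp in rows:
        results.append((prompt_name, check(prompt, translation)))
    return results

# Toy attribute check: digits in the prompt must appear in the translation.
def digits_preserved(prompt, translation):
    return all(tok in translation for tok in prompt.split() if tok.isdigit())

grid = [
    ("greeting", "Press 1 for sales", "Pulse 1 para ventas", "2021-05-13"),
    ("hours", "Open 9 to 5", "Abierto de nueve a cinco", "2021-05-13"),
]
assert validate_translations(grid, digits_preserved) == [
    ("greeting", True), ("hours", False),
]
```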
-
Patent number: 11842750
Abstract: A communication transmitting apparatus is connected between IP telephones and includes: a tone storage unit configured to store unique tone data T; an adding unit configured to add the tone data T to the voice data V transmitted from the IP telephone, generating addition data; an arithmetic processing unit configured to convert the format of the addition data according to a prescribed specification, generating converted data including converted voice data Vc and tone data Tc; a separating unit configured to separate the tone data Tc from the converted data; and a comparison determination unit configured to determine that, if the tone data T added to the voice data V before the conversion differs from the tone data Tc separated after the conversion, there is quality degradation in the voice data Vc.
Type: Grant
Filed: February 13, 2019
Date of Patent: December 12, 2023
Assignee: Nippon Telegraph and Telephone Corporation
Inventor: Takuo Kanamitsu
-
Patent number: 11837238
Abstract: A method for evaluating a verification model includes receiving a first and a second set of verification results, where each verification result indicates whether a primary model or an alternative model verifies an identity of a user as a registered user. The method further includes identifying each verification result in the first and second sets that includes a performance metric. The method also includes determining a first score of the primary model based on the number of verification results identified in the first set that include the performance metric, and determining a second score of the alternative model based on the number of verification results identified in the second set that include the performance metric. The method further includes determining whether the verification capability of the alternative model is better than that of the primary model based on the first score and the second score.
Type: Grant
Filed: October 21, 2020
Date of Patent: December 5, 2023
Assignee: Google LLC
Inventors: Jason Pelecanos, Pu-sen Chao, Yiling Huang, Quan Wang
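The scoring comparison above reduces to counting, per result set, how many results carry the performance metric, then comparing the counts. The dictionary representation and the metric key below are illustrative assumptions, not taken from the patent:

```python
def score_models(primary_results, alternative_results, metric="high_confidence"):
    """Count verification results that include the performance metric in
    each set and report whether the alternative model scores higher
    (metric/field names are illustrative)."""
    first = sum(1 for r in primary_results if metric in r)
    second = sum(1 for r in alternative_results if metric in r)
    return first, second, second > first

primary = [{"verified": True},
           {"verified": True, "high_confidence": True}]
alternative = [{"verified": True, "high_confidence": True},
               {"verified": False, "high_confidence": True}]
assert score_models(primary, alternative) == (1, 2, True)
```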
-
Patent number: 11830509
Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
Type: Grant
Filed: March 3, 2023
Date of Patent: November 28, 2023
Assignee: Dolby International AB
Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
-
Patent number: 11823695
Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
Type: Grant
Filed: March 3, 2023
Date of Patent: November 21, 2023
Assignee: Dolby International AB
Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
-
Patent number: 11823694
Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
Type: Grant
Filed: March 3, 2023
Date of Patent: November 21, 2023
Assignee: Dolby International AB
Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
-
Patent number: 11823696
Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
Type: Grant
Filed: March 3, 2023
Date of Patent: November 21, 2023
Assignee: Dolby International AB
Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
-
Patent number: 11817080
Abstract: Processor(s) of a client device can: receive audio data that captures a spoken utterance of a user of the client device; process, using an on-device speech recognition model, the audio data to generate a predicted textual segment that is a prediction of the spoken utterance; cause at least part of the predicted textual segment to be rendered (e.g., visually and/or audibly); receive further user interface input that is a correction of the predicted textual segment to an alternate textual segment; and generate a gradient based on comparing at least part of the predicted output to ground truth output that corresponds to the alternate textual segment. The gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model and/or is transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.
Type: Grant
Filed: October 11, 2019
Date of Patent: November 14, 2023
Assignee: GOOGLE LLC
Inventors: Françoise Beaufays, Johan Schalkwyk, Giovanni Motta
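The core idea above, treating the user's correction as the ground-truth label and backpropagating to get a gradient for local or federated updates, can be shown with a toy softmax classifier standing in for the speech recognition model. Everything below is a sketch; the real model and loss are of course far more complex:

```python
import numpy as np

def correction_gradient(weights, features, corrected_label):
    """Toy stand-in: cross-entropy gradient of a softmax classifier where
    the user's corrected transcription supplies the ground-truth label.
    The returned gradient can update local weights or be uploaded for a
    global update (sketch of the idea, not the patented recognizer)."""
    logits = weights @ features
    exp = np.exp(logits - logits.max())
    probs = exp / exp.sum()
    err = probs.copy()
    err[corrected_label] -= 1.0        # d(loss)/d(logits) for cross-entropy
    return np.outer(err, features)     # d(loss)/d(weights)

w = np.zeros((3, 2))
g = correction_gradient(w, np.array([1.0, 2.0]), corrected_label=1)
w -= 0.5 * g                           # local on-device update
assert g.shape == (3, 2)
assert g[1, 0] < 0 < g[0, 0]           # pushes weights toward the corrected label
```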
-
Methods and apparatuses for noise reduction based on time and frequency analysis using deep learning
Patent number: 11810586
Abstract: A noise cancellation method including: generating a first voice signal by canceling a first portion of noise included in an input voice signal using a first network, the first network being a trained u-net structure and the first portion of the noise being in a time domain; applying a first window to the first voice signal; performing a fast Fourier transform on the first windowed voice signal to acquire a magnitude signal and a phase signal; acquiring a mask using a second network based on the magnitude signal, the second network being another trained u-net structure; applying the mask to the magnitude signal; generating a second voice signal by canceling a second portion of the noise by performing an inverse fast Fourier transform based on the masked magnitude signal and the phase signal; and applying a second window to the second voice signal.
Type: Grant
Filed: August 5, 2021
Date of Patent: November 7, 2023
Assignee: LINE PLUS CORPORATION
Inventors: Ki Jun Kim, JongHewk Park
-
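The two-stage pipeline above can be sketched with the u-nets swapped out for caller-supplied callables, which makes the window/FFT/mask/inverse-FFT plumbing explicit. The Hann window and the stand-in denoisers are assumptions for illustration:

```python
import numpy as np

def two_stage_denoise(x, time_net, mask_net):
    """Pipeline shape of the claimed method with the two trained u-nets
    replaced by callables: a time-domain stage, a first window, an FFT
    split into magnitude and phase, a magnitude mask from the second
    stage, an inverse FFT, and a second window."""
    v1 = time_net(x)                      # stage 1: time-domain denoising
    win = np.hanning(len(v1))
    framed = v1 * win                     # first window
    spec = np.fft.rfft(framed)
    mag, phase = np.abs(spec), np.angle(spec)
    mask = mask_net(mag)                  # stage 2: magnitude mask in [0, 1]
    v2 = np.fft.irfft(mask * mag * np.exp(1j * phase), n=len(v1))
    return v2 * win                       # second window

x = np.random.default_rng(0).standard_normal(64)
out = two_stage_denoise(x, time_net=lambda s: s * 0.9,
                        mask_net=lambda m: np.ones_like(m))
assert out.shape == x.shape
```

With an identity mask and a simple gain as the time-domain stage, the output reduces to the input scaled by the gain and both windows, confirming the plumbing is lossless apart from what the two stages remove.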
Patent number: 11810575
Abstract: An artificial intelligence robot for providing a voice recognition service includes a memory configured to store voice identification information, a microphone configured to receive a voice command, and a processor configured to extract voice identification information from a wake-up command that is included in the voice command and used to activate the voice recognition service, and to keep the voice recognition function in a deactivated state when the extracted voice identification information does not match the voice identification information stored in the memory.
Type: Grant
Filed: June 12, 2019
Date of Patent: November 7, 2023
Assignee: LG ELECTRONICS INC.
Inventors: Inho Lee, Junmin Lee
-
Patent number: 11775761
Abstract: A method for mining an entity focus in a text may include: performing word and phrase feature extraction on an input text; inputting an extracted word and phrase feature into a text coding network for coding, to obtain a coding sequence of the input text; processing the coding sequence of the input text using a core entity labeling network to predict a position of a core entity in the input text; extracting a subsequence corresponding to the core entity from the coding sequence, based on the position of the core entity in the input text; and predicting a position of a focus corresponding to the core entity using a focus labeling network, based on the coding sequence of the input text and the subsequence corresponding to the core entity.
Type: Grant
Filed: September 17, 2020
Date of Patent: October 3, 2023
Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
Inventors: Shu Wang, Kexin Ren, Xiaohan Zhang, Zhifan Feng, Yang Zhang, Yong Zhu