Patents Examined by Ethan Daniel Kim
-
Patent number: 11978461
Abstract: An encoding and decoding method for digital audio watermarking and data hiding in transient acoustic content is disclosed. The audio signal is segmented into overlapping frames and each frame is decomposed into frequency bands. A special transient detector detects frames characterized by transient audio signals (a rapidly rising signal amplitude envelope and a relatively broadband spectrum with rapidly evolving spectral content, such as speech fricatives, drum beats, etc.). Frames falling on or containing transients are encoded with binary watermark data by unconditionally hard-modulating the frequency band signals according to rules determined by the values of the associated binary data bits of the watermark data, without reference to the characteristics of the watermarked band signals. The method is undetectable by human listeners and unusually resistant to the degrading effects of acoustic reverberation.
Type: Grant
Filed: November 8, 2021
Date of Patent: May 7, 2024
Inventor: Alex Radzishevsky
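The "unconditional hard modulation" idea above can be sketched as forcing a fixed magnitude relationship onto pairs of frequency bands, one watermark bit per pair, regardless of the bands' original levels. The pairing rule, boost factor, and function names below are hypothetical illustrations, not the patented scheme:

```python
import numpy as np

def hard_modulate_bands(band_mags, bits, boost=2.0):
    """Hard-modulate pairs of band magnitudes to carry one bit each:
    bit 1 -> the first band of the pair is made the larger, bit 0 -> the
    second. Applied unconditionally, i.e. without reference to the bands'
    original relative levels (hypothetical rule for illustration)."""
    out = band_mags.astype(float).copy()
    for i, bit in enumerate(bits):
        a, b = 2 * i, 2 * i + 1
        hi, lo = (a, b) if bit else (b, a)
        base = max(out[a], out[b], 1e-12)
        out[hi] = base * boost
        out[lo] = base / boost
    return out

def decode_bits(band_mags, n_bits):
    """Recover the bits by comparing each pair's magnitudes."""
    return [1 if band_mags[2 * i] > band_mags[2 * i + 1] else 0
            for i in range(n_bits)]

mags = np.array([0.5, 0.9, 0.3, 0.2, 0.7, 0.1])
assert decode_bits(hard_modulate_bands(mags, [1, 0, 1]), 3) == [1, 0, 1]
```

Because the relationship is imposed by a hard rule rather than by subtle level nudges, a simple magnitude comparison recovers the bits even after the bands have been smeared by reverberation, which is the robustness the abstract claims.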
-
Patent number: 11948561
Abstract: A signal processing method determines whether a detected key-phrase is spoken by the wearer of a headphone. The method receives an accelerometer signal from an accelerometer in the headphone and a microphone signal from at least one microphone in the headphone. The method detects a key-phrase using the microphone signal and generates a voice activity detection (VAD) signal based on the accelerometer signal. The method then determines whether the VAD signal indicates that the detected key-phrase was spoken by the wearer of the headphone and, responsive to determining that it was, triggers a virtual personal assistant (VPA).
Type: Grant
Filed: October 28, 2019
Date of Patent: April 2, 2024
Assignee: Apple Inc.
Inventors: Sorin V. Dusan, Sungyub D. Yoo, Dubravko Biruski
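The gating logic above can be sketched in a few lines: an accelerometer picks up bone-conduction vibration only when the wearer speaks, so a simple energy threshold on it distinguishes wearer speech from a bystander saying the same key-phrase. The threshold and function names are illustrative assumptions, not from the patent:

```python
def accel_vad(accel_frame, threshold=0.01):
    """Hypothetical VAD: mean-square accelerometer energy above a
    threshold implies the wearer is speaking (bone conduction)."""
    energy = sum(x * x for x in accel_frame) / len(accel_frame)
    return energy > threshold

def should_trigger_vpa(key_phrase_detected, accel_frame):
    """Trigger the assistant only when the mic-detected key-phrase
    coincides with wearer speech per the accelerometer VAD."""
    return key_phrase_detected and accel_vad(accel_frame)

# Wearer speaking (strong vibration) vs. bystander (near-silent accel):
assert should_trigger_vpa(True, [0.3, -0.2, 0.25, -0.3]) is True
assert should_trigger_vpa(True, [0.001, -0.001, 0.0, 0.001]) is False
```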
-
Patent number: 11942070
Abstract: A method, computer system, and computer program product for speech synthesis is provided. The invention may include generating one or more final voiceprints; generating one or more voice clones based on the final voiceprints; classifying the voice clones into a grouping using a language model trained on manually classified uncloned voice samples; identifying a cluster within the grouping, the cluster being identified by determining that the difference between corresponding vectors of the voice clones is below a similarity threshold; and generating a new archetypal voice by blending the voice clones of the cluster whose corresponding vectors differ by less than the similarity threshold.
Type: Grant
Filed: January 29, 2021
Date of Patent: March 26, 2024
Assignee: International Business Machines Corporation
Inventors: Aaron K. Baughman, Gray Franklin Cannon, Sara Perelman, Gary William Reiss, Corey B. Shelton
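The clustering-then-blending step can be sketched as follows: gather clone embeddings whose pairwise distance stays under the threshold, then average them into one archetypal embedding. The greedy grouping, Euclidean distance, and averaging are assumptions for illustration; the patent does not specify these choices:

```python
import numpy as np

def find_cluster(embeddings, threshold=0.3):
    """Greedy sketch: collect clone embeddings whose pairwise Euclidean
    distance to every member already in the cluster is below the
    similarity threshold."""
    cluster = [0]
    for j in range(1, len(embeddings)):
        if all(np.linalg.norm(embeddings[j] - embeddings[i]) < threshold
               for i in cluster):
            cluster.append(j)
    return cluster

def blend_archetype(embeddings, cluster):
    """Blend the cluster's clones into one archetypal voice embedding by
    averaging (one plausible reading of 'blending')."""
    return np.mean([embeddings[i] for i in cluster], axis=0)

emb = np.array([[0.0, 0.0], [0.1, 0.1], [5.0, 5.0]])
c = find_cluster(emb)
assert c == [0, 1]          # the distant clone is excluded
assert np.allclose(blend_archetype(emb, c), [0.05, 0.05])
```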
-
Patent number: 11935523
Abstract: Automatic detection of pronunciation errors in spoken words is provided, utilizing a neural network model trained for a target phoneme. The target phoneme may be a phoneme of the English language, and the pronunciation errors may be detected in English words.
Type: Grant
Filed: November 15, 2019
Date of Patent: March 19, 2024
Assignee: Master English Oy
Inventor: Aleksandr Diment
-
Patent number: 11922967
Abstract: In one aspect, a method includes detecting a fingerprint match between query fingerprint data representing at least one audio segment within podcast content and reference fingerprint data representing known repetitive content within other podcast content, detecting a feature match between a set of audio features across multiple time-windows of the podcast content, and detecting a text match between at least one query text sentence from a transcript of the podcast content and reference text sentences, the reference text sentences comprising text sentences from the known repetitive content within the other podcast content. Responsive to the detections, the method generates sets of labels identifying potential repetitive content within the podcast content, selects from those sets a consolidated set of labels identifying segments of repetitive content within the podcast content, and, responsive to selecting the consolidated set of labels, performs an action.
Type: Grant
Filed: December 10, 2020
Date of Patent: March 5, 2024
Assignee: Gracenote, Inc.
Inventors: Amanmeet Garg, Aneesh Vartakavi
-
Patent number: 11908450
Abstract: A conversation design is received for a conversation bot that enables the conversation bot to provide a service using a conversation flow specified at least in part by the conversation design. The conversation design specifies in a first human language at least a portion of a message content to be provided by the conversation bot. It is identified that an end-user of the conversation bot prefers to converse in a second human language different from the first human language. In response to a determination that the message content is to be provided by the conversation bot to the end-user, the message content of the conversation design is dynamically translated for the end-user from the first human language to the second human language. The translated message content is provided to the end-user in a message from the conversation bot.
Type: Grant
Filed: May 26, 2020
Date of Patent: February 20, 2024
Assignee: ServiceNow, Inc.
Inventors: Jebakumar Mathuram Santhosm Swvigaradoss, Satya Sarika Sunkara, Ankit Goel, Rajesh Voleti, Rishabh Verma, Patrick Casey, Rao Surapaneni
-
Patent number: 11908486
Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
Type: Grant
Filed: January 20, 2023
Date of Patent: February 20, 2024
Assignee: Dolby International AB
Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
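The flag-controlled regeneration in this family of patents (this and the three continuations below) can be illustrated on a toy magnitude spectrum: spectral translation copies the low band upward unchanged, while harmonic transposition stretches it in frequency so bin k lands at bin 2k. This is a heavily simplified sketch; the real SBR/HFR tool operates on QMF subbands shaped by envelope metadata:

```python
import numpy as np

def regenerate_highband(low_mags, flag_harmonic):
    """Sketch of flag-controlled high-frequency regeneration on a
    low-band magnitude spectrum. flag_harmonic selects harmonic
    transposition (factor-2 frequency stretch) over spectral
    translation (copy-up)."""
    n = len(low_mags)
    if flag_harmonic:
        # 2x transposition: bin k maps to bin 2k; odd bins stay empty
        high = np.zeros(n)
        high[::2] = low_mags[: (n + 1) // 2]
    else:
        # translation: shift an unchanged copy of the low band upward
        high = low_mags.copy()
    return np.concatenate([low_mags, high])

low = np.array([4.0, 3.0, 2.0, 1.0])
assert np.allclose(regenerate_highband(low, False)[4:], [4.0, 3.0, 2.0, 1.0])
assert np.allclose(regenerate_highband(low, True)[4:], [4.0, 0.0, 3.0, 0.0])
```

Harmonic transposition preserves the harmonic spacing of pitched material at the cost of spectral gaps, while translation preserves spectral density; the bitstream flag lets the encoder pick per signal.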
-
Patent number: 11893135
Abstract: A system for automated text anonymisation of clinical text, the system including an AI pipeline module to configure symbolic AI pipeline components for detecting protected health information (PHI) in the clinical text, and a masking module for masking the detected PHI in the clinical text and generating a de-identified clinical text output file as well as a corresponding label file with de-identified information. The pipeline components may include at least one non-symbolic AI pipeline component or machine learning model.
Type: Grant
Filed: February 19, 2021
Date of Patent: February 6, 2024
Assignee: Harrison AI Pty Ltd
Inventor: Benjamin Clayton Hachey
-
Patent number: 11887617
Abstract: An electronic device for speech recognition includes a multi-channel microphone array required for remote speech recognition. The electronic device improves efficiency and performance of speech recognition in a space where noise other than the speech to be recognized exists. A control method includes receiving a plurality of audio signals output from a plurality of sources through a plurality of microphones, analyzing the audio signals, and obtaining information on the directions from which the audio signals are input and information on the input times of the audio signals. A target source for speech recognition among the plurality of sources is determined on the basis of the obtained direction and input-time information, and an audio signal obtained from the determined target source is processed.
Type: Grant
Filed: May 31, 2019
Date of Patent: January 30, 2024
Assignee: SAMSUNG ELECTRONICS CO., LTD.
Inventors: Ki Hoon Shin, Jonguk Yoo, Sangmoon Lee
-
Patent number: 11875132
Abstract: An example operation may include one or more of: transferring a copy of a plurality of revised translation data sets to be added to an IVR application into a grid structure, each revised translation data set comprising a prompt name in a first field, an IVR prompt in a second field, a translation of the IVR prompt into a different language in a third field, and a timestamp in a fourth field; executing, via a processor, an accuracy validation on the plurality of revised translation data sets, wherein, for each revised translation data set, the processor identifies whether the translation in the third field is an accurate translation of the IVR prompt in the second field based on attributes of the translation and the prompt; and displaying results of the accuracy validation via a user interface.
Type: Grant
Filed: May 13, 2021
Date of Patent: January 16, 2024
Assignee: Intrado Corporation
Inventors: Terry Olson, Mark L. Sempek, Roger Wehrle
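The grid-plus-validation flow above amounts to iterating over four-field rows and applying a pluggable accuracy check to the prompt/translation pair. The row layout follows the abstract; the toy check (numeric placeholders must survive translation) and all names are illustrative assumptions, since the patent does not specify the comparison:

```python
def validate_translations(rows, check):
    """Run an accuracy validation over revised translation rows of the
    form (prompt_name, ivr_prompt, translation, timestamp); 'check' is a
    pluggable comparison over attributes of the pair. Returns per-row
    pass/fail results for display in a user interface."""
    results = []
    for prompt_name, prompt, translation, timestamp in rows:
        results.append((prompt_name, check(prompt, translation)))
    return results

# Toy attribute check: digits in the prompt must appear in the translation.
def digits_preserved(prompt, translation):
    return all(tok in translation for tok in prompt.split() if tok.isdigit())

grid = [
    ("greeting", "Press 1 for sales", "Pulse 1 para ventas", "2021-05-13"),
    ("hours", "Open 9 to 5", "Abierto de nueve a cinco", "2021-05-13"),
]
assert validate_translations(grid, digits_preserved) == [
    ("greeting", True), ("hours", False),
]
```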
-
Patent number: 11842750
Abstract: A communication transmitting apparatus is connected between IP telephones and includes: a tone storage unit configured to store unique tone data T; an adding unit configured to add the tone data T to the voice data V transmitted from the IP telephone, generating addition data; an arithmetic processing unit configured to convert the format of the addition data according to a prescribed specification, generating converted data including converted voice data Vc and tone data Tc; a separating unit configured to separate the tone data Tc from the converted data; and a comparison determination unit configured to determine that, if the tone data T added to the voice data V before the conversion differs from the tone data Tc separated after the conversion, there is quality degradation in the voice data Vc.
Type: Grant
Filed: February 13, 2019
Date of Patent: December 12, 2023
Assignee: Nippon Telegraph and Telephone Corporation
Inventor: Takuo Kanamitsu
-
Patent number: 11837238
Abstract: A method for evaluating a verification model includes receiving a first and a second set of verification results, where each verification result indicates whether a primary model or an alternative model verifies an identity of a user as a registered user. The method further includes identifying each verification result in the first and second sets that includes a performance metric. The method also includes determining a first score of the primary model based on the number of verification results identified in the first set that include the performance metric, and determining a second score of the alternative model based on the number of verification results identified in the second set that include the performance metric. The method further includes determining whether the verification capability of the alternative model is better than that of the primary model based on the first score and the second score.
Type: Grant
Filed: October 21, 2020
Date of Patent: December 5, 2023
Assignee: Google LLC
Inventors: Jason Pelecanos, Pu-sen Chao, Yiling Huang, Quan Wang
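The scoring comparison above reduces to counting, per result set, how many results carry the performance metric, then comparing the counts. The dictionary representation and the metric key below are illustrative assumptions, not taken from the patent:

```python
def score_models(primary_results, alternative_results, metric="high_confidence"):
    """Count verification results that include the performance metric in
    each set and report whether the alternative model scores higher
    (metric/field names are illustrative)."""
    first = sum(1 for r in primary_results if metric in r)
    second = sum(1 for r in alternative_results if metric in r)
    return first, second, second > first

primary = [{"verified": True},
           {"verified": True, "high_confidence": True}]
alternative = [{"verified": True, "high_confidence": True},
               {"verified": False, "high_confidence": True}]
assert score_models(primary, alternative) == (1, 2, True)
```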
-
Patent number: 11830509
Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
Type: Grant
Filed: March 3, 2023
Date of Patent: November 28, 2023
Assignee: Dolby International AB
Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
-
Patent number: 11823695
Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
Type: Grant
Filed: March 3, 2023
Date of Patent: November 21, 2023
Assignee: Dolby International AB
Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
-
Patent number: 11823694
Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
Type: Grant
Filed: March 3, 2023
Date of Patent: November 21, 2023
Assignee: Dolby International AB
Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
-
Patent number: 11823696
Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
Type: Grant
Filed: March 3, 2023
Date of Patent: November 21, 2023
Assignee: Dolby International AB
Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
-
Patent number: 11817080
Abstract: Processor(s) of a client device can: receive audio data that captures a spoken utterance of a user of the client device; process, using an on-device speech recognition model, the audio data to generate a predicted textual segment that is a prediction of the spoken utterance; cause at least part of the predicted textual segment to be rendered (e.g., visually and/or audibly); receive further user interface input that is a correction of the predicted textual segment to an alternate textual segment; and generate a gradient based on comparing at least part of the predicted output to ground truth output that corresponds to the alternate textual segment. The gradient is used, by processor(s) of the client device, to update weights of the on-device speech recognition model and/or is transmitted to a remote system for use in remote updating of global weights of a global speech recognition model.
Type: Grant
Filed: October 11, 2019
Date of Patent: November 14, 2023
Assignee: GOOGLE LLC
Inventors: Françoise Beaufays, Johan Schalkwyk, Giovanni Motta
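The core idea above, treating the user's correction as the ground-truth label and backpropagating to get a gradient for local or federated updates, can be shown with a toy softmax classifier standing in for the speech recognition model. Everything below is a sketch; the real model and loss are of course far more complex:

```python
import numpy as np

def correction_gradient(weights, features, corrected_label):
    """Toy stand-in: cross-entropy gradient of a softmax classifier where
    the user's corrected transcription supplies the ground-truth label.
    The returned gradient can update local weights or be uploaded for a
    global update (sketch of the idea, not the patented recognizer)."""
    logits = weights @ features
    exp = np.exp(logits - logits.max())
    probs = exp / exp.sum()
    err = probs.copy()
    err[corrected_label] -= 1.0        # d(loss)/d(logits) for cross-entropy
    return np.outer(err, features)     # d(loss)/d(weights)

w = np.zeros((3, 2))
g = correction_gradient(w, np.array([1.0, 2.0]), corrected_label=1)
w -= 0.5 * g                           # local on-device update
assert g.shape == (3, 2)
assert g[1, 0] < 0 < g[0, 0]           # pushes weights toward the corrected label
```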
-
Methods and apparatuses for noise reduction based on time and frequency analysis using deep learning
Patent number: 11810586
Abstract: A noise cancellation method including: generating a first voice signal by canceling a first portion of noise included in an input voice signal using a first network, the first network being a trained u-net structure and the first portion of the noise being in a time domain; applying a first window to the first voice signal; performing a fast Fourier transform on the first windowed voice signal to acquire a magnitude signal and a phase signal; acquiring a mask using a second network based on the magnitude signal, the second network being another trained u-net structure; applying the mask to the magnitude signal; generating a second voice signal by canceling a second portion of the noise by performing an inverse fast Fourier transform based on the masked magnitude signal and the phase signal; and applying a second window to the second voice signal.
Type: Grant
Filed: August 5, 2021
Date of Patent: November 7, 2023
Assignee: LINE PLUS CORPORATION
Inventors: Ki Jun Kim, JongHewk Park
-
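The two-stage pipeline above can be sketched with the u-nets swapped out for caller-supplied callables, which makes the window/FFT/mask/inverse-FFT plumbing explicit. The Hann window and the stand-in denoisers are assumptions for illustration:

```python
import numpy as np

def two_stage_denoise(x, time_net, mask_net):
    """Pipeline shape of the claimed method with the two trained u-nets
    replaced by callables: a time-domain stage, a first window, an FFT
    split into magnitude and phase, a magnitude mask from the second
    stage, an inverse FFT, and a second window."""
    v1 = time_net(x)                      # stage 1: time-domain denoising
    win = np.hanning(len(v1))
    framed = v1 * win                     # first window
    spec = np.fft.rfft(framed)
    mag, phase = np.abs(spec), np.angle(spec)
    mask = mask_net(mag)                  # stage 2: magnitude mask in [0, 1]
    v2 = np.fft.irfft(mask * mag * np.exp(1j * phase), n=len(v1))
    return v2 * win                       # second window

x = np.random.default_rng(0).standard_normal(64)
out = two_stage_denoise(x, time_net=lambda s: s * 0.9,
                        mask_net=lambda m: np.ones_like(m))
assert out.shape == x.shape
```

With an identity mask and a simple gain as the time-domain stage, the output reduces to the input scaled by the gain and both windows, confirming the plumbing is lossless apart from what the two stages remove.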
Patent number: 11810575
Abstract: An artificial intelligence robot for providing a voice recognition service includes a memory configured to store voice identification information, a microphone configured to receive a voice command, and a processor configured to extract voice identification information from a wake-up command that is included in the voice command and used to activate the voice recognition service, and to keep the voice recognition function in a deactivated state when the extracted voice identification information does not match the voice identification information stored in the memory.
Type: Grant
Filed: June 12, 2019
Date of Patent: November 7, 2023
Assignee: LG ELECTRONICS INC.
Inventors: Inho Lee, Junmin Lee
-
Patent number: 11775761
Abstract: A method for mining an entity focus in a text may include: performing word and phrase feature extraction on an input text; inputting an extracted word and phrase feature into a text coding network for coding, to obtain a coding sequence of the input text; processing the coding sequence of the input text using a core entity labeling network to predict a position of a core entity in the input text; extracting a subsequence corresponding to the core entity from the coding sequence, based on the position of the core entity in the input text; and predicting a position of a focus corresponding to the core entity using a focus labeling network, based on the coding sequence of the input text and the subsequence corresponding to the core entity.
Type: Grant
Filed: September 17, 2020
Date of Patent: October 3, 2023
Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
Inventors: Shu Wang, Kexin Ren, Xiaohan Zhang, Zhifan Feng, Yang Zhang, Yong Zhu