Patents Examined by Richemond Dorvil
  • Patent number: 11942070
    Abstract: A method, computer system, and a computer program product for speech synthesis is provided. The present invention may include generating one or more final voiceprints. The present invention may include generating one or more voice clones based on the one or more final voiceprints. The present invention may include classifying the one or more voice clones into a grouping using a language model, wherein the language model is trained using manually classified uncloned voice samples. The present invention may include identifying a cluster within the grouping, wherein the cluster is identified by determining a difference between corresponding vectors of the one or more voice clones below a similarity threshold. The present invention may include generating a new archetypal voice by blending the one or more voice clones of the cluster where the difference between the corresponding vectors is below the similarity threshold.
    Type: Grant
    Filed: January 29, 2021
    Date of Patent: March 26, 2024
    Assignee: International Business Machines Corporation
    Inventors: Aaron K. Baughman, Gray Franklin Cannon, Sara Perelman, Gary William Reiss, Corey B. Shelton
  • Patent number: 11935517
    Abstract: A speech decoding method is performed by a computer device, the speech including a current audio frame and a previous audio frame. The method includes: obtaining a target token corresponding to a smallest decoding score from a first token list including first tokens obtained by decoding the previous audio frame, each first token including a state pair and a decoding score, the state pair being used for characterizing a correspondence between a first state of the first token in a first decoding network corresponding to a low-order language model and a second state of the first token in a second decoding network corresponding to a differential language model; determining pruning parameters according to the target token and an acoustic vector of the current audio frame when the current audio frame is decoded; and decoding the current audio frame according to the first token list, the pruning parameters, and the acoustic vector.
    Type: Grant
    Filed: March 3, 2021
    Date of Patent: March 19, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventors: Yiheng Huang, Xiaozheng Jian, Liqiang He
  • Patent number: 11935523
    Abstract: There is provided automatic detection of pronunciation errors in spoken words utilizing a neural network model that is trained for a target phoneme. The target phoneme may be a phoneme in English language. The pronunciation errors may be detected in English words.
    Type: Grant
    Filed: November 15, 2019
    Date of Patent: March 19, 2024
    Assignee: Master English Oy
    Inventor: Aleksandr Diment
  • Patent number: 11928440
    Abstract: Systems and methods for handling multilingual queries are provided. One example method includes receiving, at a computing device, an input, wherein the input comprises a multi-lingual query comprising at least a first source language and a second source language. The multi-lingual query is translated, word for word, into a destination language to produce a monolingual query, with the word order of the multilingual query and the word order of the monolingual query being the same. The monolingual query is processed using natural language processing to map the mono-lingual query to a natural language query in the destination language.
    Type: Grant
    Filed: August 25, 2020
    Date of Patent: March 12, 2024
    Assignee: Rovi Guides, Inc.
    Inventors: Ajay Kumar Mishra, Jeffry Copps Robert Jose
  • Patent number: 11929089
    Abstract: An apparatus for processing a multichannel audio signal has a plurality of channel signals. The apparatus performs a time scale modulation of the multichannel audio signal and has a phase adaptor and a separator. The phase adaptor provides a processed signal by modifying a phase of a signal based on a combination of the channel signals. The separator provides separated signals based on the processed signal. A corresponding method is provided.
    Type: Grant
    Filed: October 31, 2018
    Date of Patent: March 12, 2024
    Assignee: FRAUNHOFER-GESELLSCHAFT ZUR FĂ–RDERUNG DER ANGEWANDTEN FORSCHUNG E.V.
    Inventors: Christian Uhle, Michael Kratz, Paul Klose, Timothy Leonard, André Luvizotto, Sebastian Scharrer
  • Patent number: 11922141
    Abstract: Systems and methods are disclosed for a voice/chatbot building system. The voice/chatbot builder may involve receiving an identified intent, receiving a task related to the identified intent, and receiving a response related to both the identified intent and the task. The identified intent, task, and response may form a first conversation. The first conversation may be linked to other conversations to establish contextual relationships among conversations and determine conversation priority. Voice/chatbot building may also train natural language processing machine learning algorithms.
    Type: Grant
    Filed: January 29, 2021
    Date of Patent: March 5, 2024
    Assignee: Walmart Apollo, LLC
    Inventors: John Brian Moss, Don Bambico, Jason Charles Benesch, Snehasish Mukherjee
  • Patent number: 11922967
    Abstract: In one aspect, a method includes detecting a fingerprint match between query fingerprint data representing at least one audio segment within podcast content and reference fingerprint data representing known repetitive content within other podcast content, detecting a feature match between a set of audio features across multiple time-windows of the podcast content, and detecting a text match between at least one query text sentences from a transcript of the podcast content and reference text sentences, the reference text sentences comprising text sentences from the known repetitive content within the other podcast content. The method also includes responsive to the detections, generating sets of labels identifying potential repetitive content within the podcast content. The method also includes selecting, from the sets of labels, a consolidated set of labels identifying segments of repetitive content within the podcast content, and responsive to selecting the consolidated set of labels, performing an action.
    Type: Grant
    Filed: December 10, 2020
    Date of Patent: March 5, 2024
    Assignee: Gracenote, Inc.
    Inventors: Amanmeet Garg, Aneesh Vartakavi
  • Patent number: 11915709
    Abstract: An inter-channel phase difference (IPD) parameter extraction method includes obtaining a parameter for obtaining an information extraction manner for a current frame of a multi-channel signal; obtaining an IPD parameter extraction manner for the current frame based on the parameter for obtaining the information extraction manner, where the obtained IPD parameter extraction manner is one of at least two preset IPD parameter extraction manners; and obtaining an IPD parameter of the current frame based on the obtained IPD parameter extraction manner for the current frame.
    Type: Grant
    Filed: June 16, 2022
    Date of Patent: February 27, 2024
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Xingtao Zhang, Haiting Li, Zexin Liu, Lei Miao
  • Patent number: 11908486
    Abstract: A method for decoding an encoded audio bitstream is disclosed. The method includes receiving the encoded audio bitstream and decoding the audio data to generate a decoded lowband audio signal. The method further includes extracting high frequency reconstruction metadata and filtering the decoded lowband audio signal with an analysis filterbank to generate a filtered lowband audio signal. The method also includes extracting a flag indicating whether either spectral translation or harmonic transposition is to be performed on the audio data and regenerating a highband portion of the audio signal using the filtered lowband audio signal and the high frequency reconstruction metadata in accordance with the flag. The high frequency regeneration is performed as a post-processing operation with a delay of 3010 samples per audio channel.
    Type: Grant
    Filed: January 20, 2023
    Date of Patent: February 20, 2024
    Assignee: DOLBY INTERNATIONAL AB
    Inventors: Kristofer Kjoerling, Lars Villemoes, Heiko Purnhagen, Per Ekstrand
  • Patent number: 11908450
    Abstract: A conversation design is received for a conversation bot that enables the conversation bot to provide a service using a conversation flow specified at least in part by the conversation design. The conversation design specifies in a first human language at least a portion of a message content to be provided by the conversation bot. It is identified that an end-user of the conversation bot prefers to converse in a second human language different from the first human language. In response to a determination that the message content is to be provided by the conversation bot to the end-user, the message content of the conversation design is dynamically translated for the end-user from the first human language to the second human language. The translated message content is provided to the end-user in a message from the conversation bot.
    Type: Grant
    Filed: May 26, 2020
    Date of Patent: February 20, 2024
    Assignee: ServiceNow, Inc.
    Inventors: Jebakumar Mathuram Santhosm Swvigaradoss, Satya Sarika Sunkara, Ankit Goel, Rajesh Voleti, Rishabh Verma, Patrick Casey, Rao Surapaneni
  • Patent number: 11908488
    Abstract: A computing device translates a spoken word into a corresponding ordered set of spoken phonemes and analyzes correctness of the spoken word relative to a target word. The analyzing includes attempting to locate each of the spoken phonemes in an ordered set of grapheme-phoneme correspondences (GPCs) describing the target word, and determining whether or not the ordered set of spoken phonemes comprises a same number of phonemes as in the ordered set of GPCs. The analyzing also includes comparing the order of the ordered set of spoken phonemes against the order of the ordered set of GPCs. The computing device generates a report, based on the analyzing, that identifies at least one of the GPCs in the ordered set of GPCs as having been incorrectly applied in decoding the target word.
    Type: Grant
    Filed: May 28, 2021
    Date of Patent: February 20, 2024
    Assignee: METAMETRICS, INC.
    Inventor: Neena Marie Saha
  • Patent number: 11893999
    Abstract: Techniques for enrolling a user in a system's user recognition functionality without requiring the user speak particular speech are described. The system may determine characteristics unique to a user input. The system may generate an implicit voice profile from user inputs having similar characteristics. After an implicit voice profile is generated, the system may receive a user input having speech characteristics similar to that of the implicit voice profile. The system may ask the user if the user wants the system to associate the implicit voice profile with a particular user identifier. If the user responds affirmatively, the system may request an identifier of a user profile (e.g., a user name). In response to receiving the user's name, the system may identify a user profile associated with the name and associate the implicit voice profile with the user profile, thereby converting the implicit voice profile into an explicit voice profile.
    Type: Grant
    Filed: August 6, 2018
    Date of Patent: February 6, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Sai Sailesh Kopuri, John Moore, Sundararajan Srinivasan, Aparna Khare, Arindam Mandal, Spyridon Matsoukas, Rohit Prasad
  • Patent number: 11893135
    Abstract: A system for automated text anonymisation of clinical text, the system including an AI pipeline module to configure symbolic AI pipeline components for detecting protected health information (PHI) in the clinical text; a masking module for masking the detected PHI in the clinical text and generating a de-identified clinical text output file as well as a corresponding label file with de-identified information. The pipeline components may include at least one non-symbolic AI pipeline component or machine learning model.
    Type: Grant
    Filed: February 19, 2021
    Date of Patent: February 6, 2024
    Assignee: Harrison AI Pty Ltd
    Inventor: Benjamin Clayton Hachey
  • Patent number: 11893993
    Abstract: Dynamic interfacing with applications is provided. For example, a system receives a first input audio signal. The system processes, via a natural language processing technique, the first input audio signal to identify an application. The system activates the application for execution on the client computing device. The application declares a function the application is configured to perform. The system modifies the natural language processing technique responsive to the function declared by the application. The system receives a second input audio signal. The system processes, via the modified natural language processing technique, the second input audio signal to detect one or more parameters. The system determines that the one or more parameters are compatible for input into an input field of the application. The system generates an action data structure for the application. The system inputs the action data structure into the application, which executes the action data structure.
    Type: Grant
    Filed: November 28, 2022
    Date of Patent: February 6, 2024
    Assignee: GOOGLE LLC
    Inventors: Quazi Hussain, Adam Coimbra, Ilya Firman
  • Patent number: 11887617
    Abstract: An electronic device for speech recognition includes a multi-channel microphone array required for remote speech recognition. The electronic device improves efficiency and performance of speech recognition of the electronic device in a space where noise other than speech to be recognized exists. A control method includes receiving a plurality of audio signals output from a plurality of sources through a plurality of microphones and analyzing the audio signals and obtaining information on directions in which the audio signals are input and information on input times of the audio signals. A target source for speech recognition among the plurality of sources is determined on the basis of the obtained information on the directions in which the plurality of audio signals are input, and the obtained information on the input times of the plurality of audio signals, and an audio signal obtained from the determined target source is processed.
    Type: Grant
    Filed: May 31, 2019
    Date of Patent: January 30, 2024
    Assignee: SAMSUNG ELECTRONICS CO., LTD.
    Inventors: Ki Hoon Shin, Jonguk Yoo, Sangmoon Lee
  • Patent number: 11886813
    Abstract: A system and method of operating a system for automatically punctuating text using non-recurrent neural networks is disclosed. The system and method at least: applying a text string to a first component of a non-recurrent neural network trained to generate one or more contextualized vectors, wherein the first component determines the contextualized vectors by processing each word in the text string in parallel with one another; applying the contextualized vectors to a second component of the non-recurrent neural network trained to generate a set of probability values for each word in the text string, wherein the second component determines the set of probability values by processing the contextualized vectors in parallel with one another; and transmitting the set of probability values to a text generation engine to generate a formatted text string based on the set of probability values.
    Type: Grant
    Filed: September 24, 2020
    Date of Patent: January 30, 2024
    Assignee: Capital One Services, LLC
    Inventors: Maury Courtland, Adam Faulkner, Gayle McElvain
  • Patent number: 11875132
    Abstract: An example operation may include one or more of transferring a copy of a plurality of revised translation data sets to be added to an IVR application into a grid structure, each revised translation data set comprising a prompt name in a first field, an IVR prompt in a second field, a translation of the IVR prompt into a different language in a third field, and a timestamp in a fourth field, executing, via a processor, an accuracy validation on the plurality of revised translation data sets, wherein, for each revised translation data set, the processor identifies whether a respective translation in a different language in a third field is an accurate translation of a respective IVR prompt in a second field based on attributes of the respective translation and the respective IVR prompt, and displaying results of the accuracy validation via a user interface.
    Type: Grant
    Filed: May 13, 2021
    Date of Patent: January 16, 2024
    Assignee: Intrado Corporation
    Inventors: Terry Olson, Mark L. Sempek, Roger Wehrle
  • Patent number: 11875797
    Abstract: A scripted audio production system in which the scripted audio production computerized process decreases production time by improving computerized processes and technological systems for pronunciation research and script preparation, narration, editing, proofing and mastering. The system enables the user to upload their manuscript and recorded audio of the narration of the manuscript to the system. The system then compares the recorded audio against previously uploaded manuscript and any mistakes or deviations from the manuscript are highlighted or otherwise indicated to the user. The system automatically pieces together the last-read audio into a clean file without the need for significant user interaction. The process may also be performed on the recorded audio by the narrator first uploading the audio and manuscript to the scripted audio production technology system.
    Type: Grant
    Filed: June 22, 2021
    Date of Patent: January 16, 2024
    Assignee: Pozotron Inc.
    Inventors: Jakub Poznanski, Kostiantyn Hlushak
  • Patent number: 11875779
    Abstract: Disclosed is a voice activity detection (VAD) device and method capable of referring to an environment detection result and thereby selecting one of multiple VAD results as a basis for determining whether a voice activity occurs. The VAD device includes an environment detection circuit, a VAD circuit, and a voice activity decision circuit. The environment detection circuit is configured to process an audio input signal and thereby generate an environment detection result. The VAD circuit is configured to analyze the audio input signal with multiple VAD algorithms and thereby generate multiple VAD results. The voice activity decision circuit is configured to select one of the multiple VAD results according to the environment detection result.
    Type: Grant
    Filed: September 3, 2021
    Date of Patent: January 16, 2024
    Assignee: REALTEK SEMICONDUCTOR CORPORATION
    Inventor: Yi-Cheng Huang
  • Patent number: 11869516
    Abstract: A voice processing method is provided for a terminal. The method includes: performing voice speed detection on a voice obtained from a voice source, to obtain a voice speed value of the voice; obtaining a forward error correction (FEC) redundancy; adjusting the FEC redundancy according to the voice speed value to obtain a target redundancy; performing voice encoding on the voice to obtain a voice encoded packet; performing FEC encoding on the voice encoded packet according to the target redundancy to obtain a redundancy packet; and transmitting the redundancy packet and the voice encoded packet to a receiving end.
    Type: Grant
    Filed: November 8, 2021
    Date of Patent: January 9, 2024
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Junbin Liang