Patents Examined by Paras D Shah
  • Patent number: 12154552
    Abstract: A natural language understanding (NLU) system generates in-place annotations for natural language utterances or other types of time-based media based on stand-off annotations. The in-place annotations are associated with particular sub-sequences of an utterance, which provides richer information than stand-off annotations, which are associated only with the utterance as a whole. To generate the in-place annotations for an utterance, the NLU system applies an encoder network and a decoder network to obtain attention weights for the various tokens within the utterance. The NLU system disqualifies tokens of the utterance based on their corresponding attention weights, and selects highest-scoring contiguous sequences of tokens between the disqualified tokens. In-place annotations are associated with the selected sequences.
    Type: Grant
    Filed: August 31, 2021
    Date of Patent: November 26, 2024
    Assignee: Interactions LLC
    Inventors: Brian David Lester, Srinivas Bangalore
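    The span-selection step described in the abstract can be sketched as follows; the attention threshold and the sum-of-weights scoring used here are illustrative assumptions, not the patented method:

```python
def select_span(tokens, attn_weights, disqualify_below=0.1):
    """Disqualify low-attention tokens, then return the highest-scoring
    contiguous run of qualified tokens between disqualified ones."""
    qualified = [w >= disqualify_below for w in attn_weights]
    best, best_score = (0, 0), float("-inf")
    i, n = 0, len(tokens)
    while i < n:
        if not qualified[i]:
            i += 1
            continue
        j = i
        while j < n and qualified[j]:
            j += 1
        score = sum(attn_weights[i:j])  # score a run by its total attention
        if score > best_score:
            best, best_score = (i, j), score
        i = j
    return tokens[best[0]:best[1]]

tokens = ["book", "a", "flight", "to", "boston", "please"]
weights = [0.05, 0.2, 0.3, 0.25, 0.35, 0.02]
print(select_span(tokens, weights))  # → ['a', 'flight', 'to', 'boston']
```

    An in-place annotation (e.g., a destination label) would then be attached to the returned sub-sequence rather than to the whole utterance.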
  • Patent number: 12148426
    Abstract: Embodiments of the disclosure generally relate to a dialog system allowing for automatically reactivating a speech acquiring mode after the dialog system delivers a response to a user request. The reactivation parameters, such as a delay, depend on a number of predetermined factors and conversation scenarios. The embodiments further provide for a method of operating of the dialog system. An exemplary method comprises the steps of: activating a speech acquiring mode, receiving a first input of a user, deactivating the speech acquiring mode, obtaining a first response associated with the first input, delivering the first response to the user, determining that a conversation mode is activated, and, based on the determination, automatically re-activating the speech acquiring mode within a first predetermined time period after delivery of the first response to the user.
    Type: Grant
    Filed: May 18, 2022
    Date of Patent: November 19, 2024
    Assignee: GOOGLE LLC
    Inventors: Ilya Gennadyevich Gelfenbeyn, Artem Goncharuk, Pavel Aleksandrovich Sirotin
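    The re-activation flow in the abstract can be sketched as a small state machine; the delay value, the `conversation_mode` flag, and the echo response are illustrative stand-ins:

```python
import time

class DialogSession:
    """Toy sketch of automatically re-activating the speech acquiring
    mode after a response is delivered, per the abstract's steps."""
    def __init__(self, reactivation_delay=0.5):
        self.listening = False
        self.conversation_mode = True
        self.reactivation_delay = reactivation_delay

    def handle_turn(self, user_input):
        self.listening = True             # activate speech acquiring mode
        self.listening = False            # deactivate while obtaining a response
        response = f"echo: {user_input}"  # stand-in for the real response
        # deliver response, then re-activate within the predetermined period
        if self.conversation_mode:
            time.sleep(self.reactivation_delay)
            self.listening = True
        return response

s = DialogSession(reactivation_delay=0.0)
print(s.handle_turn("what's the weather?"))  # → echo: what's the weather?
print(s.listening)                           # → True
```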
  • Patent number: 12148444
    Abstract: Methods, systems, and computer program products for generating, from an input character sequence, an output sequence of audio data representing the input character sequence. The output sequence of audio data includes a respective audio output sample for each of a number of time steps. One example method includes, for each of the time steps: generating a mel-frequency spectrogram for the time step by processing a representation of a respective portion of the input character sequence using a decoder neural network; generating a probability distribution over a plurality of possible audio output samples for the time step by processing the mel-frequency spectrogram for the time step using a vocoder neural network; and selecting the audio output sample for the time step from the possible audio output samples in accordance with the probability distribution.
    Type: Grant
    Filed: April 5, 2021
    Date of Patent: November 19, 2024
    Assignee: Google LLC
    Inventors: Yonghui Wu, Jonathan Shen, Ruoming Pang, Ron J. Weiss, Michael Schuster, Navdeep Jaitly, Zongheng Yang, Zhifeng Chen, Yu Zhang, Yuxuan Wang, Russell John Wyatt Skerry-Ryan, Ryan M. Rifkin, Ioannis Agiomyrgiannakis
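    The per-time-step loop in the abstract (decoder emits a mel-spectrogram frame, vocoder emits a distribution over possible samples, one sample is drawn) can be sketched with toy stand-ins for the neural networks:

```python
import random

def synthesize(char_seq, n_steps, decoder, vocoder, rng):
    """Sketch of the abstract's loop: the decoder and vocoder here are
    placeholder callables, not real neural networks."""
    audio = []
    for t in range(n_steps):
        mel_frame = decoder(char_seq, t)  # mel-spectrogram for this step
        probs = vocoder(mel_frame)        # distribution over output samples
        sample = rng.choices(range(len(probs)), weights=probs)[0]
        audio.append(sample)              # select per the distribution
    return audio

decoder = lambda seq, t: [float(t)]      # toy stand-in
vocoder = lambda mel: [0.1, 0.2, 0.7]    # 3 possible quantized samples
rng = random.Random(0)
print(synthesize("hello", 4, decoder, vocoder, rng))
```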
  • Patent number: 12141542
    Abstract: Methods and servers for training a translation model for translation between a rare language from a group and a target language. The method includes acquiring an actual example of translation and using a transliteration function for generating a synthetic actual example of translation. The method includes acquiring a sentence in the target language, generating an artificial translation of that sentence using back-translation, and thereby generating a given artificial example of translation. The method includes generating a synthetic artificial example based on the given artificial example. The method includes training the translation model based on the synthetic actual example of translation and the synthetic artificial example of translation.
    Type: Grant
    Filed: December 17, 2021
    Date of Patent: November 12, 2024
    Assignee: Y.E. Hub Armenia LLC
    Inventors: Anton Aleksandrovich Dvorkovich, Roman Olegovich Peshkurov
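    The two augmentation paths in the abstract can be sketched as below; the character-level `transliterate` mapping and the `reverse_model` stub are illustrative assumptions, not the patented functions:

```python
def transliterate(text, mapping):
    """Apply a character-level transliteration, yielding a 'synthetic' example."""
    return "".join(mapping.get(ch, ch) for ch in text)

def back_translate(target_sentence, reverse_model):
    """Generate an artificial source-side sentence from a target sentence."""
    return reverse_model(target_sentence)

mapping = {"a": "а", "e": "е", "o": "о"}  # toy Latin → Cyrillic lookalikes

# path 1: synthetic example from an actual translation pair
actual_pair = ("barev", "hello")
synthetic_actual = (transliterate(actual_pair[0], mapping), actual_pair[1])

# path 2: synthetic example from an artificial (back-translated) pair
reverse_model = lambda s: s[::-1]  # stand-in for a real back-translation model
artificial_pair = (back_translate("good morning", reverse_model), "good morning")
synthetic_artificial = (transliterate(artificial_pair[0], mapping), artificial_pair[1])

training_set = [synthetic_actual, synthetic_artificial]
print(training_set)
```

    The translation model is then trained on both synthetic example types, per the abstract.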
  • Patent number: 12131750
    Abstract: A method for enhancing detection of synthetic voice data is provided that includes converting, by an electronic device, monophonic voice data into stereophonic voice data. The stereophonic voice data includes a first channel signal and a second channel signal. Moreover, the method includes decomposing, by a trained machine learning model, the stereophonic voice data into a mid-signal and a side signal. The method also includes determining artifacts indicative of synthetic generation in the mid-signal and the side signal, calculating, based on the determined artifacts, a probability score reflecting the likelihood that the monophonic voice data was synthetically generated, and comparing the probability score against a threshold value. When the probability score satisfies the threshold value, there is a high likelihood that the monophonic voice data includes synthetic artifacts, and an alert is generated indicating that the monophonic voice data is potentially fraudulent.
    Type: Grant
    Filed: May 10, 2024
    Date of Patent: October 29, 2024
    Assignee: Daon Technology
    Inventors: Raphael A. Rodriguez, Olena Mizynchuk, Davyd Mizynchuk
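    The mid/side decomposition named in the abstract is, in its standard form, a sum/difference transform of the two channels; the sketch below shows that standard transform (the patent's trained model presumably learns a decomposition of this kind):

```python
def mid_side(left, right):
    """Standard mid/side transform: mid is the channel average,
    side is half the channel difference."""
    mid = [(l + r) / 2 for l, r in zip(left, right)]
    side = [(l - r) / 2 for l, r in zip(left, right)]
    return mid, side

left = [1.0, 0.5, -0.5]
right = [0.5, 0.5, 0.5]
mid, side = mid_side(left, right)
print(mid)   # → [0.75, 0.5, 0.0]
print(side)  # → [0.25, 0.0, -0.5]
```

    Artifacts of synthetic generation would then be searched for separately in the mid and side signals.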
  • Patent number: 12131730
    Abstract: A keyword is extracted robustly even when the voice recognition result includes errors. A model storage unit 10 stores a keyword extraction model that accepts word vector representations of a plurality of words as an input and extracts and outputs a word vector representation of a word to be extracted as a keyword. A speech detection unit 11 detects a speech part from a voice signal. A voice recognition unit 12 executes voice recognition on the speech part of the voice signal and outputs a confusion network as the voice recognition result. A word vector representation generating unit 13 generates a word vector representation that includes the reliability of voice recognition for each candidate word in each confusion set. A keyword extraction unit 14 inputs the word vector representations of the candidate words to the keyword extraction model in descending order of reliability and obtains the word vector representation of the keyword.
    Type: Grant
    Filed: November 19, 2019
    Date of Patent: October 29, 2024
    Assignee: NIPPON TELEGRAPH AND TELEPHONE CORPORATION
    Inventors: Takashi Nakamura, Tomohiro Tanaka
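    Feeding confusion-network candidates to an extractor in descending order of reliability can be sketched as below; the extractor here is a plain dictionary lookup standing in for the patented neural keyword extraction model:

```python
def extract_keyword(confusion_network, keyword_vocab):
    """confusion_network: list of confusion sets, each a list of
    (candidate_word, reliability) pairs from the recognizer."""
    for confusion_set in confusion_network:
        # consider candidates from most to least reliable
        for word, reliability in sorted(confusion_set, key=lambda p: -p[1]):
            if word in keyword_vocab:
                return word, reliability
    return None

network = [
    [("whether", 0.6), ("weather", 0.4)],  # recognizer confused these words
    [("tokyo", 0.9), ("kyoto", 0.1)],
]
print(extract_keyword(network, {"weather", "tokyo"}))  # → ('weather', 0.4)
```

    Because lower-reliability alternatives are still examined, a keyword can be recovered even when the top recognition hypothesis is wrong.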
  • Patent number: 12125473
    Abstract: Embodiments of this disclosure disclose a speech recognition method, apparatus, and device, and a storage medium. The method in the embodiments of this disclosure includes: adjusting a probability of a relationship between at least one pair of elements in a language recognition model according to a probability of the relationship between the at least one pair of elements in a textual segment; inputting a to-be-recognized speech into a speech recognition model including the language recognition model; and determining, according to the adjusted probability of the relationship between the at least one pair of elements in the language recognition model, a sequence of elements corresponding to the to-be-recognized speech as a speech recognition result.
    Type: Grant
    Filed: March 4, 2021
    Date of Patent: October 22, 2024
    Assignee: Tencent Technology (Shenzhen) Company Limited
    Inventor: Tao Li
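    One reading of the adjustment step is interpolating the language model's pairwise (bigram) probabilities with frequencies observed in a domain textual segment; the sketch below assumes that reading, and the interpolation weight is illustrative:

```python
from collections import Counter

def adjust_bigram_probs(lm_probs, segment, weight=0.5):
    """lm_probs: {(w1, w2): p}. Returns probabilities interpolated with
    bigram frequencies observed in the textual segment."""
    words = segment.split()
    counts = Counter(zip(words, words[1:]))
    total = sum(counts.values()) or 1
    adjusted = dict(lm_probs)
    for pair, c in counts.items():
        seg_p = c / total
        adjusted[pair] = (1 - weight) * lm_probs.get(pair, 0.0) + weight * seg_p
    return adjusted

lm = {("play", "music"): 0.2, ("play", "musical"): 0.3}
out = adjust_bigram_probs(lm, "play music play music")
print(out[("play", "music")])  # boosted above its original 0.2
```

    Decoding the speech would then prefer element sequences favored by the adjusted probabilities.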
  • Patent number: 12118309
    Abstract: A method converts from compressed language to natural language. The method includes receiving an element string. The element string is in a compressed language format and is extracted from a document in a structured document language. The method includes tokenizing the element string to form multiple element tokens, generating a token set from the element tokens, and generating a name string from multiple token sets. The name string is in a natural language format.
    Type: Grant
    Filed: September 28, 2021
    Date of Patent: October 15, 2024
    Assignee: Intuit Inc.
    Inventors: Esmé Manandise, Glenn Carter Scott
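    The tokenize-then-join pipeline in the abstract can be sketched for a camel-case element name; the splitting regex and the abbreviation table are illustrative assumptions, not the patented rules:

```python
import re

ABBREV = {"yr": "year", "amt": "amount"}  # toy expansion table (assumption)

def element_to_name(element):
    """Tokenize a compressed element string, then emit a natural-language
    name string from the token set."""
    # split on camel-case boundaries, acronym runs, and digit runs
    tokens = re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])|\d+", element)
    return " ".join(ABBREV.get(t.lower(), t.lower()) for t in tokens)

print(element_to_name("TaxYr2023TotalAmt"))  # → 'tax year 2023 total amount'
```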
  • Patent number: 12112741
    Abstract: A method, computer program product, and computing system for defining a model representative of a plurality of acoustic variations to a speech signal, thus defining a plurality of time-varying spectral modifications. The plurality of time-varying spectral modifications may be applied to a reference signal using a filtering operation, thus generating a time-varying spectrally-augmented signal.
    Type: Grant
    Filed: February 18, 2021
    Date of Patent: October 8, 2024
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Patrick A. Naylor, Dushyant Sharma, Uwe Helmut Jost, William F Ganong, III
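    Applying time-varying spectral modifications as a filtering operation can be sketched by convolving each frame of the reference signal with its own short FIR kernel; the kernel values are illustrative:

```python
def time_varying_filter(frames, filters):
    """frames: list of sample lists; filters: one FIR kernel per frame.
    Each frame is filtered with its own kernel, so the spectral
    modification varies over time."""
    out = []
    for frame, kernel in zip(frames, filters):
        filtered = []
        for n in range(len(frame)):
            acc = 0.0
            for k, h in enumerate(kernel):
                if n - k >= 0:
                    acc += h * frame[n - k]  # causal convolution sum
            filtered.append(acc)
        out.append(filtered)
    return out

frames = [[1.0, 0.0, 0.0], [1.0, 1.0, 1.0]]
filters = [[1.0, 0.5], [0.5]]  # a different filter per frame
print(time_varying_filter(frames, filters))
# → [[1.0, 0.5, 0.0], [0.5, 0.5, 0.5]]
```

    The result is a spectrally-augmented version of the reference signal, as the abstract describes.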
  • Patent number: 12062365
    Abstract: An apparatus for training a dialogue summary model according to an embodiment includes a parameter transferer configured to transfer one or more learning parameter values of a pre-trained natural language processing model to a sequence-to-sequence-based dialogue summary model, and a model trainer configured to train the dialogue summary model by using the transferred learning parameter values as initial values for learning parameters of each of an encoder and a decoder in the dialogue summary model.
    Type: Grant
    Filed: October 29, 2021
    Date of Patent: August 13, 2024
    Assignee: SAMSUNG SDS CO., LTD.
    Inventors: Hyun Jae Lee, Hyun Jin Choi, Jae Woong Yun, Ju Dong Kim, Bong Kyu Hwang, Seong Ho Joe, Young June Gwon
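    The parameter transfer in the abstract (pre-trained values used as initial values for both the encoder and the decoder) can be sketched with plain dicts standing in for real weight tensors:

```python
def transfer_parameters(pretrained, summary_model):
    """Copy matching pre-trained parameters into both the encoder and
    decoder of a seq2seq dialogue summary model as initial values."""
    for side in ("encoder", "decoder"):
        for name, value in pretrained.items():
            if name in summary_model[side]:
                summary_model[side][name] = value
    return summary_model

pretrained = {"embed": [0.1, 0.2], "layer1": [0.3]}
model = {"encoder": {"embed": None, "layer1": None},
         "decoder": {"embed": None, "layer1": None, "cross_attn": [0.0]}}
model = transfer_parameters(pretrained, model)
print(model["decoder"]["embed"])  # → [0.1, 0.2]
```

    Parameters with no pre-trained counterpart (here `cross_attn`, a hypothetical name) keep their own initialization, and training then proceeds from these values.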
  • Patent number: 12062360
    Abstract: An object of the present invention is to effectively reduce the input load associated with a voice trigger. There is provided an information processing device comprising a registration control unit that dynamically controls registration of startup phrases used as start triggers of a voice interaction session, in which the registration control unit temporarily additionally registers at least one of the startup phrases based on input voice. There is also provided an information processing method comprising dynamically controlling, by a processor, registration of startup phrases used as start triggers of a voice interaction session, in which the controlling further includes temporarily additionally registering at least one of the startup phrases based on input voice.
    Type: Grant
    Filed: March 12, 2019
    Date of Patent: August 13, 2024
    Assignee: SONY CORPORATION
    Inventors: Hiro Iwase, Yuhei Taki, Kunihito Sawai
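    Temporarily registering an extra startup phrase can be sketched with an expiry time, which makes the registration dynamic as the abstract requires; the TTL value is an illustrative assumption:

```python
import time

class TriggerRegistry:
    """Toy registry of startup phrases with temporary additions."""
    def __init__(self, base_phrases):
        self.base = set(base_phrases)
        self.temporary = {}  # phrase -> expiry timestamp

    def register_temporarily(self, phrase, ttl=30.0, now=None):
        now = time.time() if now is None else now
        self.temporary[phrase] = now + ttl

    def is_trigger(self, phrase, now=None):
        now = time.time() if now is None else now
        # drop expired temporary registrations
        self.temporary = {p: t for p, t in self.temporary.items() if t > now}
        return phrase in self.base or phrase in self.temporary

reg = TriggerRegistry({"ok assistant"})
reg.register_temporarily("and also", ttl=30.0, now=0.0)
print(reg.is_trigger("and also", now=10.0))  # → True
print(reg.is_trigger("and also", now=60.0))  # → False
```

    A follow-up phrase heard in the input voice can thus briefly act as a session trigger without the user repeating the base startup phrase.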
  • Patent number: 12020693
    Abstract: An electronic device includes an audio sensor, a memory, and at least one processor coupled to the audio sensor and the memory. The at least one processor is configured to receive, via the audio sensor an audio input. The at least one processor is further configured to perform, using an automatic speech recognition (ASR) model and an entity prediction model, out-of-vocabulary prediction of an entity. The at least one processor is further configured to receive an ASR hypothesis including the predicted entity. The at least one processor is further configured to output text including the predicted entity.
    Type: Grant
    Filed: January 27, 2021
    Date of Patent: June 25, 2024
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Suhel Jaber, Anil Yadav, Melvin Lobo, Sukrat Gupta
  • Patent number: 11996118
    Abstract: An important task in several wellness applications is the detection of emotional valence from speech. Two types of features of speech signals are used to detect valence: acoustic features and text features. Acoustic features are derived from short frames of speech, while text features are derived from the text transcription. The present disclosure provides systems and methods that determine the effect of text on acoustic features. Acoustic features of speech segments carrying emotion words are treated differently from those of segments that do not carry such words. Only specific speech segments of the input speech signal, selected using a dictionary specific to a language, are considered when assessing emotional valence. A model (or trained classifier) trained for a specific language, either including or omitting the acoustic features of the emotion-related words, is used by the system for determining emotional valence in an input speech signal.
    Type: Grant
    Filed: October 19, 2021
    Date of Patent: May 28, 2024
    Assignee: TATA CONSULTANCY SERVICES LIMITED
    Inventors: Ramesh Kumar Ramakrishnan, Venkata Subramanian Viraraghavan, Rahul Dasharath Gavas, Sachin Patel, Gauri Deshpande
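    The segment-selection step can be sketched as below; the emotion dictionary and the paired acoustic features are toy stand-ins for the language-specific dictionary and real features:

```python
EMOTION_WORDS = {"happy", "sad", "angry", "great"}  # toy English dictionary

def select_segments(segments):
    """Keep only segments whose transcript contains an emotion word.
    Each segment is a (text, acoustic_features) pair."""
    return [seg for seg in segments
            if EMOTION_WORDS & set(seg[0].lower().split())]

segments = [
    ("i am so happy today", [0.7, 0.1]),
    ("the meeting starts at nine", [0.2, 0.5]),
]
print(select_segments(segments))  # → [('i am so happy today', [0.7, 0.1])]
```

    Only the acoustic features of the selected segments would then be passed to the valence classifier.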
  • Patent number: 11977855
    Abstract: The Zero User Interface (UI)-based automatic speech translation system and method can solve problems such as the procedural inconvenience of inputting speech signals and the malfunction of speech recognition due to crosstalk when users who speak different languages have a face-to-face conversation. The system includes an automatic speech translation server, speaker terminals, and a counterpart terminal. The automatic speech translation server selects a speech signal of a speaker from among multiple speech signals received from speaker terminals connected to an automatic speech translation service and transmits a result of translating the speech signal of the speaker into a target language to a counterpart terminal.
    Type: Grant
    Filed: November 9, 2021
    Date of Patent: May 7, 2024
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Sang Hun Kim, Seung Yun, Min Kyu Lee, Joon Gyu Maeng, Dong Hyun Kim
  • Patent number: 11967326
    Abstract: Disclosed are a Digital Enhanced Cordless Telecommunications (DECT) base station, a handset, and a communication system. The DECT base station is installed with at least two different voice codecs and is configured to sequentially judge whether the voice codecs of the DECT base station match the voice codecs of a handset according to a first codec information data packet sent by the handset, and, when a match is found, send response information to the handset to perform voice communication with the handset through the matched voice codec. This allows the DECT base station to guarantee normal voice communication with the handset even when multiple different voice codecs are installed in the DECT base station.
    Type: Grant
    Filed: September 30, 2020
    Date of Patent: April 23, 2024
    Assignee: YEALINK (XIAMEN) NETWORK TECHNOLOGY CO., LTD.
    Inventors: Wanjian Feng, Zhipeng Lin
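    The sequential matching the abstract describes can be sketched as below; the codec names and the shape of the response are illustrative, not taken from the DECT specification or the patent:

```python
def negotiate_codec(base_codecs, handset_codec_packet):
    """Walk the base station's codec list in order and answer with the
    first codec the handset's codec information packet also contains."""
    for codec in base_codecs:          # sequentially judge each codec
        if codec in handset_codec_packet:
            return {"status": "ok", "codec": codec}  # response information
    return {"status": "no-match"}

base = ["G.722", "G.726", "G.711"]     # base station's preference order
print(negotiate_codec(base, {"G.726", "G.711"}))
# → {'status': 'ok', 'codec': 'G.726'}
```

    Voice communication then proceeds through the matched codec.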
  • Patent number: 11967332
    Abstract: A computer-implemented method for correcting muffled speech caused by facial coverings is disclosed. The computer-implemented method includes monitoring a user's speech for speech distortion. The computer-implemented method further includes determining that the user's speech is distorted. The computer-implemented method further includes determining that a cause of the user's speech distortion is based, at least in part, on a presence of a particular type of facial covering. The computer-implemented method further includes automatically correcting the speech distortion of the user based, at least in part, on the particular type of facial covering causing the speech distortion.
    Type: Grant
    Filed: September 17, 2021
    Date of Patent: April 23, 2024
    Assignee: International Business Machines Corporation
    Inventors: Girmaw Abebe Tadesse, Michael S. Gordon, Komminist Weldemariam
  • Patent number: 11961531
    Abstract: An apparatus for processing an audio signal includes a configurable first audio signal processor for processing the audio signal in accordance with different configuration settings to obtain a processed audio signal, wherein the apparatus is adapted so that different configuration settings result in different sampling rates of the processed audio signal. The apparatus furthermore includes an analysis filter bank having a first number of analysis filter bank channels, a synthesis filter bank having a second number of synthesis filter bank channels, a second audio processor adapted to receive and process an audio signal having a predetermined sampling rate, and a controller for controlling the first number of analysis filter bank channels or the second number of synthesis filter bank channels in accordance with a configuration setting.
    Type: Grant
    Filed: May 5, 2022
    Date of Patent: April 16, 2024
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Markus Lohwasser, Manuel Jander, Max Neuendorf, Ralf Geiger, Markus Schnell, Matthias Hildenbrand, Tobias Chalupka
  • Patent number: 11955026
    Abstract: A method, computer program product, and computer system for public speaking guidance is provided. A processor retrieves speaker data regarding a speech made by a user. A processor separates the speaker data into one or more speaker modalities. A processor extracts one or more speaker features from the speaker data for the one or more speaker modalities. A processor generates a performance classification based on the one or more speaker features. A processor sends to the user guidance regarding the speech based on the performance classification.
    Type: Grant
    Filed: September 26, 2019
    Date of Patent: April 9, 2024
    Assignee: International Business Machines Corporation
    Inventors: Cheng-Fang Lin, Ching-Chun Liu, Ting-Chieh Yu, Yu-Siang Chen, Ryan Young
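    The processing chain in the abstract (separate by modality, extract features, classify, return guidance) can be sketched with toy feature extractors and a toy threshold classifier; all thresholds and messages are illustrative assumptions:

```python
def guide(speaker_data):
    """Separate speaker data into modalities, extract features per
    modality, classify performance, and return guidance."""
    modalities = {
        "audio": speaker_data.get("audio", []),
        "text": speaker_data.get("text", ""),
    }
    features = {
        "avg_volume": sum(modalities["audio"]) / max(len(modalities["audio"]), 1),
        "word_count": len(modalities["text"].split()),
    }
    # toy rule standing in for a trained performance classifier
    performance = ("good" if features["avg_volume"] > 0.3
                   and features["word_count"] > 5 else "needs work")
    guidance = {"good": "Keep the same pacing and volume.",
                "needs work": "Project more and elaborate your points."}[performance]
    return performance, guidance

print(guide({"audio": [0.4, 0.5], "text": "thank you all for coming today"}))
# → ('good', 'Keep the same pacing and volume.')
```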
  • Patent number: 11914965
    Abstract: Disclosed systems relate to generating questions from text. In an example, a method includes forming a first semantic tree from a first reference text and second semantic tree from a second reference text. The method includes identifying a set of semantic nodes that are in the first semantic tree but not in the second semantic tree. The method includes forming a first syntactic tree for the first reference text and a second syntactic tree for the second reference text. The method includes identifying a set of syntactic nodes that are in the first syntactic tree but not in the second syntactic tree. The method includes mapping the set of semantic nodes to the set of syntactic nodes by identifying a correspondence between a semantic node and a syntactic node, forming a question fragment from a normalized word, and providing the question fragment to a user device.
    Type: Grant
    Filed: July 30, 2021
    Date of Patent: February 27, 2024
    Assignee: Oracle International Corporation
    Inventor: Boris Galitsky
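    The node-difference step (nodes present in the first tree but not the second) can be sketched over toy trees; real semantic and syntactic trees would carry richer labels than the bare strings used here:

```python
def nodes(tree):
    """Collect all node labels from a (label, children) tuple tree."""
    label, children = tree
    out = {label}
    for child in children:
        out |= nodes(child)
    return out

def novel_nodes(tree_a, tree_b):
    """Nodes in the first tree but not in the second — candidates for
    forming a question fragment."""
    return nodes(tree_a) - nodes(tree_b)

sem_a = ("visit", [("paris", []), ("spring", [])])
sem_b = ("visit", [("paris", [])])
print(sorted(novel_nodes(sem_a, sem_b)))  # → ['spring']
```

    The same difference would be computed for the syntactic trees, and the mapped semantic/syntactic node pairs would seed the question fragment (e.g., asking about "spring").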
  • Patent number: 11907820
    Abstract: Techniques are provided for improving computers as tools for assisting in negotiations. Specifically, techniques are provided for using a trained machine learning system to predict the likelihood that a party to a negotiation intends to comply with terms that are under consideration. In some negotiations, each party of a negotiation may use the techniques described herein to determine terms to offer the other party. In such situations, both parties may be both terms-receiving parties and terms-offering parties. By using a trained machine learning system to predict the intent of a party, the possibility of human bias is significantly reduced, allowing proposed terms to be based more on objective facts and predictive indicators than on the prejudices of the agents that have been delegated the responsibility of proposing terms.
    Type: Grant
    Filed: November 28, 2018
    Date of Patent: February 20, 2024
    Assignee: LendingClub Corporation
    Inventor: Ramaswamy Alagianambi