Patents Examined by Douglas Godbold
  • Patent number: 11797782
    Abstract: A cross-lingual voice conversion system and method comprises a voice feature extractor configured to receive a first voice audio segment in a first language and a second voice audio segment in a second language, and extract, respectively, audio features comprising first-voice, speaker-dependent acoustic features and second-voice, speaker-independent linguistic features. One or more generators are configured to receive extracted features, and produce therefrom a third voice candidate keeping the first-voice, speaker-dependent acoustic features and the second-voice, speaker-independent linguistic features, wherein the third voice candidate speaks the second language. One or more discriminators are configured to compare the third voice candidate with ground truth data, and provide results of the comparison back to the generator for refining the third voice candidate.
    Type: Grant
    Filed: December 30, 2020
    Date of Patent: October 24, 2023
    Assignee: TMRW Foundation IP S. À R.L.
    Inventor: Cevat Yerli
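The generator/discriminator loop in patent 11797782 above lends itself to a short illustration. The following is a minimal numpy sketch of the described data flow, not the patented implementation; the feature extractors, network shapes, and dimensions are all illustrative assumptions.

```python
# Toy sketch of the cross-lingual voice conversion flow (all shapes assumed):
# speaker-dependent acoustic features from voice 1 and speaker-independent
# linguistic features from voice 2 are combined by a generator into a third
# voice candidate, and a discriminator's score on that candidate is the signal
# fed back to refine the generator.
import numpy as np

rng = np.random.default_rng(0)
ACOUSTIC_DIM, LINGUISTIC_DIM, MEL_DIM = 64, 128, 80

def extract_acoustic_features(voice1_audio):
    """Stand-in for the speaker-dependent acoustic feature extractor (timbre)."""
    return rng.standard_normal(ACOUSTIC_DIM)

def extract_linguistic_features(voice2_audio):
    """Stand-in for the speaker-independent linguistic feature extractor (content)."""
    return rng.standard_normal(LINGUISTIC_DIM)

# Toy generator and discriminator: single linear maps with random weights.
W_gen = rng.standard_normal((MEL_DIM, ACOUSTIC_DIM + LINGUISTIC_DIM)) * 0.01
W_disc = rng.standard_normal(MEL_DIM) * 0.01

def generator(acoustic, linguistic):
    return np.tanh(W_gen @ np.concatenate([acoustic, linguistic]))

def discriminator(candidate):
    """Probability that the candidate matches ground-truth speech."""
    return 1.0 / (1.0 + np.exp(-W_disc @ candidate))

acoustic = extract_acoustic_features("voice1_language1.wav")
linguistic = extract_linguistic_features("voice2_language2.wav")

# Third voice candidate: speaker 1's timbre speaking language 2's content.
candidate = generator(acoustic, linguistic)
score = discriminator(candidate)
print(f"discriminator feedback to the generator: {score:.3f}")
```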
  • Patent number: 11798566
    Abstract: The present disclosure discloses a data transmission method performed by a computer device and a non-transitory computer-readable storage medium. According to the present disclosure, voice criticality analysis is performed on to-be-transmitted audio to obtain a criticality level of each to-be-transmitted audio frame in the audio, and a corrected redundancy multiple of each to-be-transmitted audio frame is obtained according to a current redundancy multiple and a redundant transmission factor corresponding to the criticality level of that frame. Each to-be-transmitted audio frame is then duplicated according to its corrected redundancy multiple to obtain at least one redundancy data packet, and the at least one redundancy data packet is transmitted to a target terminal, which improves resilience to network packet loss without causing network congestion.
    Type: Grant
    Filed: October 28, 2021
    Date of Patent: October 24, 2023
    Assignee: TENCENT TECHNOLOGY (SHENZHEN) COMPANY LIMITED
    Inventor: Junbin Liang
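The redundancy-correction step in patent 11798566 above reduces to a small calculation. Below is a hedged Python sketch of that idea; the criticality levels, transmission factors, and rounding rule are assumptions made for illustration.

```python
# Each audio frame's redundancy multiple is scaled by the factor associated with
# its criticality level, and the frame is duplicated that many times before
# transmission (values below are illustrative, not from the patent).
CRITICALITY_FACTOR = {"low": 0.5, "medium": 1.0, "high": 2.0}

def corrected_redundancy(current_multiple: float, criticality: str) -> int:
    """Corrected multiple = current redundancy multiple x criticality factor."""
    return max(1, round(current_multiple * CRITICALITY_FACTOR[criticality]))

def build_redundancy_packets(frames, current_multiple: float = 1.0):
    """Duplicate each frame by its corrected multiple to form redundancy packets."""
    packets = []
    for frame_id, payload, criticality in frames:
        copies = corrected_redundancy(current_multiple, criticality)
        packets.extend((frame_id, payload) for _ in range(copies))
    return packets

frames = [(0, b"frame0", "high"), (1, b"frame1", "low"), (2, b"frame2", "medium")]
print(build_redundancy_packets(frames, current_multiple=2.0))
# Frame 0 is sent 4x, frame 1 once, frame 2 twice.
```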
  • Patent number: 11798547
    Abstract: A voice activated device for interaction with a digital assistant is provided. The device comprises a housing, one or more processors, and memory, the memory coupled to the one or more processors and comprising instructions for automatically identifying and connecting to a digital assistant server. The device further comprises a power supply, a wireless network module, and a human-machine interface. The human-machine interface consists essentially of: at least one speaker, at least one microphone, an ADC coupled to the microphone, a DAC coupled to the at least one speaker, and zero or more additional components selected from the set consisting of: a touch-sensitive surface, one or more cameras, and one or more LEDs. The device is configured to act as an interface for speech communications between the user and a digital assistant of the user on the digital assistant server.
    Type: Grant
    Filed: August 6, 2020
    Date of Patent: October 24, 2023
    Assignee: Apple Inc.
    Inventor: Kevin Milden
  • Patent number: 11797773
    Abstract: Navigating text using an extended discourse tree. In an example, a method accesses an extended discourse tree that includes a first discourse tree for a first document and a second discourse tree for a second document. The method determines a first elementary discourse unit that is responsive to a query from a user device and a corresponding first position. The method further determines a set of navigation options including a first rhetorical relationship between the first elementary discourse unit and a second elementary discourse unit of the first discourse tree and a second rhetorical relationship between the first elementary discourse unit and a third elementary discourse unit of the second discourse tree. The method presents the rhetorical relationships to a user device. Responsive to receiving, from a user device, a selection of a rhetorical relationship, the method presents a corresponding elementary discourse unit to the user device.
    Type: Grant
    Filed: February 24, 2022
    Date of Patent: October 24, 2023
    Assignee: Oracle International Corporation
    Inventor: Boris Galitsky
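To make the navigation mechanism in patent 11797773 above concrete, here is a minimal Python sketch under an assumed edge-list representation of the extended discourse tree; the documents, relations, and query are invented for illustration.

```python
# Given the elementary discourse unit (EDU) that answers a query, offer the user
# the rhetorical relations linking it to EDUs in the same and in other documents,
# then present whichever EDU the selected relation points to.
from dataclasses import dataclass

@dataclass
class EDU:
    doc: str
    text: str

# Edges of the extended discourse tree: (source EDU, rhetorical relation, target EDU).
EDGES = [
    (EDU("doc1", "The battery drains quickly."), "Elaboration",
     EDU("doc1", "Background apps poll the network every few seconds.")),
    (EDU("doc1", "The battery drains quickly."), "Contrast",
     EDU("doc2", "In airplane mode the battery lasts two days.")),
]

def navigation_options(current: EDU):
    """Rhetorical relations available from the current EDU, across both discourse trees."""
    return [(rel, tgt) for src, rel, tgt in EDGES if src.text == current.text]

answer = EDU("doc1", "The battery drains quickly.")   # EDU responsive to the user's query
options = navigation_options(answer)
for relation, target in options:
    print(f"{relation} -> [{target.doc}] {target.text}")

selected = "Contrast"                                  # relation chosen on the user device
print("presented to user:", dict(options)[selected].text)
```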
  • Patent number: 11790933
    Abstract: Systems and methods are disclosed for displaying electronic multimedia content to a user. One computer-implemented method for manipulating electronic multimedia content includes generating, using a processor, a speech model and at least one speaker model of an individual speaker. The method further includes receiving electronic media content over a network; extracting an audio track from the electronic media content; and detecting speech segments within the electronic media content based on the speech model. The method further includes detecting a speaker segment within the electronic media content and calculating a probability of the detected speaker segment involving the individual speaker based on the at least one speaker model.
    Type: Grant
    Filed: March 31, 2020
    Date of Patent: October 17, 2023
    Assignee: Verizon Patent and Licensing Inc.
    Inventors: Peter F. Kocks, Guoning Hu, Ping-Hao Wu
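The two-model split in patent 11790933 above (a speech model for segmentation, a speaker model for attribution) can be illustrated with a toy numpy sketch; energy thresholding and a single Gaussian stand in for the actual models, and all constants are assumptions.

```python
# A speech model stand-in flags which windows of the audio track contain speech,
# and a per-speaker model stand-in scores how likely each speech segment is to
# involve that individual speaker.
import numpy as np

rng = np.random.default_rng(1)
FRAME = 400
silence = 0.05 * rng.standard_normal(8000)           # quiet half of the audio track
speech = rng.standard_normal(8000)                    # louder half
audio = np.concatenate([silence, speech])

def detect_speech_segments(signal, threshold=0.5):
    """Speech model stand-in: frames whose RMS energy exceeds a threshold."""
    frames = signal[: len(signal) // FRAME * FRAME].reshape(-1, FRAME)
    energy = np.sqrt((frames ** 2).mean(axis=1))
    return [i for i, e in enumerate(energy) if e > threshold]

# Speaker model stand-in: mean/variance estimated from the speaker's reference audio.
SPEAKER_MEAN, SPEAKER_VAR = 0.0, 1.0

def speaker_probability(segment):
    """Average Gaussian log-likelihood under the speaker model, squashed to 0..1."""
    ll = -0.5 * ((segment - SPEAKER_MEAN) ** 2 / SPEAKER_VAR
                 + np.log(2 * np.pi * SPEAKER_VAR))
    return float(1.0 / (1.0 + np.exp(-ll.mean())))

for idx in detect_speech_segments(audio):
    seg = audio[idx * FRAME:(idx + 1) * FRAME]
    print(f"segment {idx}: P(individual speaker) ~ {speaker_probability(seg):.2f}")
```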
  • Patent number: 11790171
    Abstract: A natural language understanding method begins with a radiological report text containing clinical findings. Errors in the text are corrected by analyzing character-level optical transformation costs weighted by a frequency analysis over a corpus corresponding to the report text. For each word within the report text, a word embedding is obtained, character-level embeddings are determined, and the word and character-level embeddings are concatenated and provided to a neural network which generates a plurality of NER tagged spans for the report text. A set of linked relationships are calculated for the NER tagged spans by generating masked text sequences based on the report text and determined pairs of potentially linked NER spans. A dense adjacency matrix is calculated based on attention weights obtained from providing the one or more masked text sequences to a Transformer deep learning network, and graph convolutions are then performed over the calculated dense adjacency matrix.
    Type: Grant
    Filed: April 15, 2020
    Date of Patent: October 17, 2023
    Assignee: Covera Health
    Inventors: Ron Vianu, W. Nathaniel Brown, Gregory Allen Dubbin, Daniel Robert Elgort, Benjamin L. Odry, Benjamin Sellman Suutari, Jefferson Chen
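The last step of patent 11790171 above, graph convolutions over an adjacency matrix built from Transformer attention weights, is sketched below in numpy; the span count, feature size, and normalization are assumptions rather than the patented design.

```python
# Attention weights between NER-tagged spans are treated as a dense adjacency
# matrix, and one graph-convolution layer propagates span features over it.
import numpy as np

rng = np.random.default_rng(2)
N_SPANS, FEAT_DIM, OUT_DIM = 5, 16, 8

attention = rng.random((N_SPANS, N_SPANS))            # attention weights between spans
span_features = rng.standard_normal((N_SPANS, FEAT_DIM))
W = rng.standard_normal((FEAT_DIM, OUT_DIM)) * 0.1

def graph_convolution(adj, X, W):
    """One GCN layer: add self-loops, row-normalize the dense adjacency, propagate."""
    adj = adj + np.eye(len(adj))
    adj = adj / adj.sum(axis=1, keepdims=True)
    return np.maximum(adj @ X @ W, 0.0)               # ReLU

span_embeddings = graph_convolution(attention, span_features, W)
print(span_embeddings.shape)   # (5, 8): per-span features used to score linked relationships
```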
  • Patent number: 11783841
    Abstract: A method and system for secure speaker authentication between a caller device and a first device using an authentication server are provided. The method comprises: extracting features into a feature matrix from an incoming audio call; generating a partial i-vector, wherein the partial i-vector includes a first low-order statistic; sending the partial i-vector to the authentication server; and receiving from the authentication server a match score generated based on a full i-vector and another i-vector stored on the authentication server, wherein the full i-vector is generated from the partial i-vector.
    Type: Grant
    Filed: March 15, 2021
    Date of Patent: October 10, 2023
    Assignee: ILLUMA LABS INC.
    Inventor: Milind Borkar
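Patent 11783841 above splits the i-vector computation between the caller device and the server. The numpy sketch below illustrates one way such a split could look; the background-model sizes, the statistics used as the "partial i-vector", and the cosine match score are all assumptions.

```python
# Caller side: extract a feature matrix and compute low-order statistics against a
# shared background model; only these partial statistics leave the device.
# Server side: finish the i-vector and score it against the enrolled i-vector.
import numpy as np

rng = np.random.default_rng(3)
N_FRAMES, FEAT_DIM, N_COMPONENTS = 200, 20, 4

features = rng.standard_normal((N_FRAMES, FEAT_DIM))        # caller-side feature matrix
ubm_means = rng.standard_normal((N_COMPONENTS, FEAT_DIM))   # shared background model

def partial_ivector(features, ubm_means):
    """Client side: zeroth- and first-order statistics (the 'partial i-vector')."""
    d2 = ((features[:, None, :] - ubm_means[None, :, :]) ** 2).sum(-1)
    resp = np.exp(-0.5 * d2)
    resp /= resp.sum(axis=1, keepdims=True)                  # soft component assignments
    zeroth = resp.sum(axis=0)
    first = resp.T @ features
    return zeroth, first

def server_match_score(partial, enrolled_ivector):
    """Server side: complete a full i-vector (here: length-normalized first-order
    statistics) and compare it with the stored i-vector by cosine similarity."""
    _, first = partial
    ivector = first.ravel() / np.linalg.norm(first)
    return float(ivector @ enrolled_ivector)

enrolled = rng.standard_normal(N_COMPONENTS * FEAT_DIM)
enrolled /= np.linalg.norm(enrolled)
print(f"match score: {server_match_score(partial_ivector(features, ubm_means), enrolled):.3f}")
```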
  • Patent number: 11775778
    Abstract: Embodiments of the disclosed technologies incorporate taxonomy information into a cross-lingual entity graph and input the taxonomy-informed cross-lingual entity graph into a graph neural network. The graph neural network computes semantic alignment scores for node pairs. The semantic alignment scores are used to determine whether a node pair represents a valid machine translation.
    Type: Grant
    Filed: November 5, 2020
    Date of Patent: October 3, 2023
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Zhuliu Li, Xiao Yan, Yiming Wang, Jaewon Yang
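For patent 11775778 above, the following small numpy sketch shows the shape of the idea: cross-lingual entity nodes carrying taxonomy-informed features pass messages to their neighbors, and a semantic alignment score on each candidate pair decides whether it is a valid translation. The graph, features, and threshold are invented for illustration.

```python
# Taxonomy-informed node features, one message-passing layer, and cosine alignment
# scores over cross-lingual node pairs.
import numpy as np

rng = np.random.default_rng(4)
DIM = 12

en_engineer = rng.standard_normal(DIM)
en_nurse = rng.standard_normal(DIM)
nodes = {
    "en:software_engineer": en_engineer,
    "en:nurse": en_nurse,
    # Spanish entities: correlated embeddings stand in for true translations.
    "es:ingeniero_de_software": en_engineer + 0.1 * rng.standard_normal(DIM),
    "es:enfermera": en_nurse + 0.1 * rng.standard_normal(DIM),
}
candidate_pairs = [("en:software_engineer", "es:ingeniero_de_software"),
                   ("en:nurse", "es:enfermera")]

def message_pass(nodes, edges):
    """One graph-neural-network layer: average each node with its neighbors."""
    gathered = {n: [v] for n, v in nodes.items()}
    for a, b in edges:
        gathered[a].append(nodes[b])
        gathered[b].append(nodes[a])
    return {n: np.mean(vs, axis=0) for n, vs in gathered.items()}

def alignment_score(h, a, b):
    va, vb = h[a], h[b]
    return float(va @ vb / (np.linalg.norm(va) * np.linalg.norm(vb)))

h = message_pass(nodes, candidate_pairs)
for a, b in candidate_pairs:
    score = alignment_score(h, a, b)
    print(f"{a} <-> {b}: score={score:.2f}, valid translation={score > 0.5}")
```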
  • Patent number: 11776549
    Abstract: Techniques are described herein for multi-factor audio watermarking. A method includes: receiving audio data; processing the audio data to generate predicted output that indicates a probability of one or more hotwords being present in the audio data; determining that the predicted output satisfies a threshold that is indicative of the one or more hotwords being present in the audio data; in response to determining that the predicted output satisfies the threshold, processing the audio data using automatic speech recognition to generate a speech transcription feature; detecting a watermark that is embedded in the audio data; and in response to detecting the watermark: determining that the speech transcription feature corresponds to one of a plurality of stored speech transcription features; and in response to determining that the speech transcription feature corresponds to one of the plurality of stored speech transcription features, suppressing processing of a query included in the audio data.
    Type: Grant
    Filed: December 7, 2020
    Date of Patent: October 3, 2023
    Assignee: GOOGLE LLC
    Inventors: Aleks Kracun, Matthew Sharifi
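The decision logic of patent 11776549 above is essentially a three-factor check, sketched below in Python. The hotword model, ASR, and watermark detector are stand-ins, and the threshold and stored transcriptions are assumptions.

```python
# Suppress the query only when a hotword is predicted above threshold, a watermark
# is detected in the audio, and the ASR transcription matches a stored
# transcription feature (e.g. a known broadcast advertisement).
HOTWORD_THRESHOLD = 0.8
STORED_TRANSCRIPTIONS = {"hey assistant play the jingle", "hey assistant order now"}

def hotword_probability(audio) -> float:   # stand-in hotword model
    return 0.93

def transcribe(audio) -> str:              # stand-in automatic speech recognition
    return "hey assistant play the jingle"

def watermark_detected(audio) -> bool:     # stand-in watermark detector
    return True

def should_suppress_query(audio) -> bool:
    if hotword_probability(audio) < HOTWORD_THRESHOLD:
        return False                       # no hotword: nothing to act on
    if not watermark_detected(audio):
        return False                       # live speech: process the query normally
    return transcribe(audio) in STORED_TRANSCRIPTIONS

print(should_suppress_query(b"..."))       # True: the embedded query is suppressed
```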
  • Patent number: 11775773
    Abstract: A virtual assistant server determines at least one user intent based on an analysis of a received conversational user input. One or more of a plurality of views is identified based on the at least one user intent. Further, the virtual assistant server retrieves content based on the at least one user intent or the identified one or more views. The virtual assistant server determines one of a plurality of graphical user interface layers to display for each of one or more parts of the content and the identified one or more views based at least on one or more factors related to the content. Subsequently, the virtual assistant server outputs instructions based on the determined one of the graphical user interface layers in response to the received conversational user input.
    Type: Grant
    Filed: December 15, 2020
    Date of Patent: October 3, 2023
    Assignee: KORE.AI, INC.
    Inventors: Rajkumar Koneru, Prasanna Kumar Arikala Gunalan
  • Patent number: 11769480
    Abstract: The present disclosure discloses a method and apparatus for training a model, a method and apparatus for synthesizing speech, a device, and a storage medium, and relates to the field of natural language processing and deep learning technology. The method for training a model may include: determining a phoneme feature and a prosodic word boundary feature of sample text data; inserting a pause character into the phoneme feature according to the prosodic word boundary feature to obtain a combined feature of the sample text data; and training an initial speech synthesis model according to the combined feature of the sample text data, to obtain a target speech synthesis model.
    Type: Grant
    Filed: December 3, 2020
    Date of Patent: September 26, 2023
    Assignee: BEIJING BAIDU NETCOM SCIENCE AND TECHNOLOGY CO., LTD.
    Inventors: Zhengkun Gao, Junteng Zhang, Wenfu Wang, Tao Sun
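The combined-feature construction in patent 11769480 above is easy to picture; the short sketch below inserts a pause character into a phoneme sequence at prosodic word boundaries. The phoneme set, boundary encoding, and pause symbol are illustrative assumptions.

```python
# Insert the pause character after every phoneme index flagged as a prosodic word
# boundary, yielding the combined feature used to train the speech synthesis model.
PAUSE = "<pause>"

def combine_features(phonemes, boundary_after):
    combined = []
    for i, phoneme in enumerate(phonemes):
        combined.append(phoneme)
        if i in boundary_after:
            combined.append(PAUSE)
    return combined

phonemes = ["n", "i", "h", "ao", "sh", "ij", "ie"]   # sample text: "ni hao shijie"
prosodic_boundaries = {3}                             # prosodic word ends after "hao"
print(combine_features(phonemes, prosodic_boundaries))
# ['n', 'i', 'h', 'ao', '<pause>', 'sh', 'ij', 'ie']
```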
  • Patent number: 11755849
    Abstract: The present disclosure provides an information switching method. The method includes: obtaining tilting information after a tilt direction of a device changes; searching for a pre-set tilt direction matching the tilting information and determining pre-set information corresponding to the matched pre-set tilt direction; and switching first input information of the device to second input information, where the second input information is determined based on the pre-set information matching the pre-set tilt direction.
    Type: Grant
    Filed: November 27, 2019
    Date of Patent: September 12, 2023
    Assignee: BEIJING SOGOU TECHNOLOGY DEVELOPMENT CO., LTD.
    Inventor: Hailei Ma
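Patent 11755849 above amounts to a lookup from tilt direction to a pre-set input method, as in this hedged sketch; the direction names and mapping are invented for illustration.

```python
# When the device's tilt direction changes, match it against pre-set tilt
# directions and switch to the input information associated with the match.
PRESET_INPUT_BY_TILT = {
    "tilt_left": "handwriting_input",
    "tilt_right": "voice_input",
    "tilt_forward": "emoji_panel",
}

def switch_input(first_input: str, tilt_direction: str) -> str:
    """Return the second input information for a matched pre-set tilt direction,
    or keep the first input information when no pre-set matches."""
    return PRESET_INPUT_BY_TILT.get(tilt_direction, first_input)

print(switch_input("qwerty_keyboard", "tilt_right"))   # -> voice_input
print(switch_input("qwerty_keyboard", "tilt_back"))    # no match -> qwerty_keyboard
```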
  • Patent number: 11749297
    Abstract: A voice quality estimation apparatus includes: a packet sequence creation unit configured to create a first sequence by applying a first characteristic indicating that quality degradation caused by packet loss is perceived by a user all at once, to a sequence consisting of elements each indicating whether or not a packet of a voice call has been lost; a smoothing unit configured to create a second sequence from the first sequence; a degradation amount emphasis unit configured to create a third sequence from the second sequence; a packet loss tolerance characteristics reflection unit configured to create a fourth sequence from the third sequence; a degradation amount calculation unit configured to calculate a degradation amount from the fourth sequence; and a listening quality estimation unit configured to estimate voice quality that is to be experienced by the user, from the degradation amount.
    Type: Grant
    Filed: February 13, 2020
    Date of Patent: September 5, 2023
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Hitoshi Aoki, Atsuko Kurashima, Ginga Kawaguchi
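The estimation chain in patent 11749297 above is a sequence of transformations on a packet-loss sequence. The toy numpy sketch below mirrors that chain end to end; every constant, window size, and the final quality mapping are assumptions, not the patented formulas.

```python
# A 0/1 loss sequence is spread so burst losses are perceived together, smoothed,
# non-linearly emphasized, weighted by a loss-tolerance margin, reduced to a
# degradation amount, and mapped to a MOS-like listening quality on a 1..5 scale.
import numpy as np

loss = np.array([0, 0, 1, 1, 1, 0, 0, 0, 1, 0], dtype=float)   # 1 = packet lost

def burst_perception(seq, window=3):              # first sequence
    return np.convolve(seq, np.ones(window), mode="same").clip(0, 1)

def smooth(seq, window=3):                        # second sequence
    return np.convolve(seq, np.ones(window) / window, mode="same")

def emphasize(seq, power=0.5):                    # third sequence
    return seq ** power

def apply_loss_tolerance(seq, tolerance=0.2):     # fourth sequence
    return np.maximum(seq - tolerance, 0.0)

def estimate_quality(loss_seq):
    seq = apply_loss_tolerance(emphasize(smooth(burst_perception(loss_seq))))
    degradation = seq.mean()                      # degradation amount
    return 5.0 - 4.0 * min(degradation, 1.0)      # estimated listening quality

print(f"estimated listening quality: {estimate_quality(loss):.2f}")
```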
  • Patent number: 11749275
    Abstract: Systems and processes for application integration with a digital assistant are provided. In accordance with one example, a method includes, at an electronic device having one or more processors and memory, receiving a natural-language user input; identifying, with the one or more processors, an intent object of a set of intent objects and a parameter associated with the intent, where the intent object and the parameter are derived from the natural-language user input. The method further includes identifying a software application associated with the intent object of the set of intent objects; and providing the intent object and the parameter to the software application.
    Type: Grant
    Filed: October 8, 2021
    Date of Patent: September 5, 2023
    Assignee: Apple Inc.
    Inventors: Robert A. Walker, II, Brandon J. Newendorp, Rohit Dasari, Richard D. Giuli, Thomas R. Gruber, Carey E. Radebaugh, Ashish Garg, Vineet Khosla, Jonathan H. Russell, Corey Peterson
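The routing described in patent 11749275 above, deriving an intent object and parameter and handing them to the registered application, is sketched below; the intent names, the toy parser, and the app registry are assumptions for illustration.

```python
# Derive an intent object and its parameter from natural-language input, look up
# the software application associated with that intent object, and provide both
# to the application.
APP_REGISTRY = {"SendMessageIntent": "com.example.messages",
                "BookRideIntent": "com.example.rides"}

def derive_intent(utterance: str):
    """Stand-in natural-language step: pick an intent object and a parameter."""
    target = utterance.rsplit(" to ", 1)[-1]
    if "message" in utterance:
        return "SendMessageIntent", {"recipient": target}
    return "BookRideIntent", {"destination": target}

def route(utterance: str):
    intent_object, parameter = derive_intent(utterance)
    app = APP_REGISTRY[intent_object]           # application associated with the intent object
    return app, intent_object, parameter        # provided to the software application

print(route("send a message to Alice"))
print(route("get me a ride to the airport"))
```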
  • Patent number: 11741955
    Abstract: A method to select a response in a multi-turn conversation between a user and a conversational bot. The conversation is composed of a set of events, wherein an event is a linear sequence of observations that are user speech or physical actions. Queries are processed against a set of conversations that are organized as a set of inter-related data tables, with events and observations stored in distinct tables. As the multi-turn conversation proceeds, a data model comprising an observation history, together with a hierarchy of events determined to represent the conversation up to at least one turn, is persisted. When a new input (speech or physical action) is received, it is classified using a statistical model to generate a result. The result is then mapped to an observation in the data model. Using the mapped observation, a look-up is performed into the data tables to retrieve a possible response.
    Type: Grant
    Filed: February 22, 2021
    Date of Patent: August 29, 2023
    Assignee: Drift.com, Inc.
    Inventors: Jeffrey D. Orkin, Christopher M. Ward
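The response-selection loop in patent 11741955 above can be pictured with the small sketch below, written against an assumed (much simplified) layout of the conversation tables; the classifier, observations, and responses are invented for illustration.

```python
# Classify the new input, map the classifier result to an observation in the
# persisted data model, and use that observation to look up a possible response
# in the conversation tables.
OBSERVATION_BY_CLASS = {"greeting": "obs_greet", "pricing_question": "obs_pricing"}
RESPONSE_TABLE = {
    "obs_greet": "Hi there! What brings you here today?",
    "obs_pricing": "Our plans start at $50/month. Want a quick demo?",
}

def classify(user_input: str) -> str:
    """Stand-in statistical model over user speech or physical actions."""
    return "pricing_question" if "price" in user_input.lower() else "greeting"

def select_response(user_input: str, observation_history: list) -> str:
    observation = OBSERVATION_BY_CLASS[classify(user_input)]
    observation_history.append(observation)      # persist the observation history
    return RESPONSE_TABLE[observation]

history = []
print(select_response("Hello!", history))
print(select_response("What does the price look like?", history))
print(history)                                   # ['obs_greet', 'obs_pricing']
```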
  • Patent number: 11735183
    Abstract: This disclosure relates generally to optically switchable devices, and more particularly, to methods for controlling optically switchable devices. In various embodiments, one or more optically switchable devices may be controlled via voice control and/or gesture control.
    Type: Grant
    Filed: February 22, 2021
    Date of Patent: August 22, 2023
    Assignee: View, Inc.
    Inventors: Dhairya Shrivastava, Mark D. Mendenhall
  • Patent number: 11735196
    Abstract: Described are an encoder for coding speech-like content and/or general audio content, wherein the encoder is configured to embed, at least in some frames, parameters in a bitstream, which parameters enhance a concealment in case an original frame is lost, corrupted or delayed, and a decoder for decoding speech-like content and/or general audio content, wherein the decoder is configured to use parameters which are sent later in time to enhance a concealment in case an original frame is lost, corrupted or delayed, as well as a method for encoding and a method for decoding.
    Type: Grant
    Filed: December 18, 2020
    Date of Patent: August 22, 2023
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Jérémie Lecomte, Benjamin Schubert, Michael Schnabel, Martin Dietz
  • Patent number: 11721318
    Abstract: A method, computer program, and computer system are provided for converting a first singing voice associated with a first speaker to a second singing voice associated with a second speaker. A context associated with one or more phonemes corresponding to the first singing voice is encoded, and the one or more phonemes are aligned to one or more target acoustic frames based on the encoded context. One or more mel-spectrogram features are recursively generated from the aligned phonemes and target acoustic frames, and a sample corresponding to the first singing voice is converted to a sample corresponding to the second singing voice using the generated mel-spectrogram features.
    Type: Grant
    Filed: October 14, 2021
    Date of Patent: August 8, 2023
    Assignee: TENCENT AMERICA LLC
    Inventors: Chengzhu Yu, Heng Lu, Chao Weng, Dong Yu
  • Patent number: 11710482
    Abstract: Systems and processes for operating a virtual assistant to provide natural assistant interaction are provided. In accordance with one or more examples, a method includes, at an electronic device with one or more processors and memory: receiving a first audio stream including one or more utterances; determining whether the first audio stream includes a lexical trigger; generating one or more candidate text representations of the one or more utterances; determining whether at least one candidate text representation of the one or more candidate text representations is to be disregarded by the virtual assistant. If at least one candidate text representation is to be disregarded, one or more candidate intents are generated based on the candidate text representations other than the at least one candidate text representation to be disregarded.
    Type: Grant
    Filed: October 8, 2020
    Date of Patent: July 25, 2023
    Assignee: Apple Inc.
    Inventors: Juan Carlos Garcia, Paul S. McCarthy, Kurt Piersol
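The filtering step in patent 11710482 above, generating candidate intents only from the candidate text representations that are not disregarded, is sketched below; the disregard test (treating echoes of the device's own speech output as disregarded) is an illustrative assumption.

```python
# Drop the candidate text representations to be disregarded, then generate one
# candidate intent per remaining candidate.
def candidate_intents(candidates, device_speech="playing your driving playlist"):
    kept = [c for c in candidates if c.lower() not in device_speech.lower()]
    return [f"intent({c})" for c in kept]

candidates = ["playing your driving playlist",    # echo of the device's own output
              "play my driving playlist"]         # the user's actual request
print(candidate_intents(candidates))              # ['intent(play my driving playlist)']
```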
  • Patent number: 11706568
    Abstract: Devices, systems and processes for providing an adaptive audio environment are disclosed. For an embodiment, a system may include a wearable device and a hub. The hub may include an interface module configured to communicatively couple the wearable device and the hub and a processor, configured to execute non-transient computer executable instructions for a machine learning engine configured to apply a first machine learning process to at least one data packet received from the wearable device and output an action-reaction data set and for a sounds engine configured to apply a sound adapting process to the action-reaction data set and provide audio output data to the wearable device via the interface module.
    Type: Grant
    Filed: November 1, 2021
    Date of Patent: July 18, 2023
    Assignee: DISH Network L.L.C.
    Inventors: Rima Shah, Nicholas Brandon Newell