Patents Examined by Paras D Shah
  • Patent number: 11355117
    Abstract: Embodiments of the disclosure generally relate to a dialog system allowing for automatically reactivating a speech acquiring mode after the dialog system delivers a response to a user request. The reactivation parameters, such as a delay, depend on a number of predetermined factors and conversation scenarios. The embodiments further provide a method of operating the dialog system. An exemplary method comprises the steps of: activating a speech acquiring mode, receiving a first input of a user, deactivating the speech acquiring mode, obtaining a first response associated with the first input, delivering the first response to the user, determining that a conversation mode is activated, and, based on the determination, automatically reactivating the speech acquiring mode within a first predetermined time period after delivery of the first response to the user.
    Type: Grant
    Filed: August 11, 2020
    Date of Patent: June 7, 2022
    Assignee: GOOGLE LLC
    Inventors: Ilya Gennadyevich Gelfenbeyn, Artem Goncharuk, Pavel Aleksandrovich Sirotin
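    Illustrative sketch: a minimal Python toy of the re-activation flow the abstract describes (listening is turned off while a request is handled, then automatically turned back on after a delay when conversation mode is active); the class, delay value, and echo response are invented placeholders, not the patented implementation.
      import time

      class DialogSystem:
          """Toy dialog loop; not the patented system."""

          def __init__(self, reactivation_delay_s=0.5):
              # "First predetermined time period" before listening resumes.
              self.reactivation_delay_s = reactivation_delay_s
              self.listening = False
              self.conversation_mode = True

          def handle_utterance(self, utterance):
              # Deactivate the speech-acquiring mode while the request is processed.
              self.listening = False
              response = "Echoing: " + utterance  # placeholder response generation
              print(response)
              # Re-activate listening automatically only in conversation mode.
              if self.conversation_mode:
                  time.sleep(self.reactivation_delay_s)
                  self.listening = True
              return response

      system = DialogSystem()
      system.listening = True
      system.handle_utterance("What's the weather?")
      print("listening again:", system.listening)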
  • Patent number: 11348577
    Abstract: Methods, systems, and media for presenting interactive audio content are provided.
    Type: Grant
    Filed: April 5, 2019
    Date of Patent: May 31, 2022
    Inventor: Peter Zetterberg
  • Patent number: 11343631
    Abstract: In processing a multi-channel audio signal having at least three original channels, first and second downmix channels derived from the original channels are provided. For a selected original channel of the original channels, channel side information is calculated such that a downmix channel or a combined downmix channel including the first and second downmix channels, when weighted using the channel side information, results in an approximation of the selected original channel. The channel side information and the first and second downmix channels form output data to be transmitted to a low-level decoder, which only decodes the first and second downmix channels, or to a high-level decoder, which provides a full multi-channel audio signal based on the downmix channels and the channel side information.
    Type: Grant
    Filed: August 23, 2019
    Date of Patent: May 24, 2022
    Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
    Inventors: Juergen Herre, Johannes Hilpert, Stefan Geyersberger, Andreas Hoelzer, Claus Spenger
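    Illustrative sketch: a small numerical Python example of the core idea, deriving a least-squares weight (standing in for the channel side information) so that the combined downmix, scaled by that weight, approximates a selected original channel; the signals and downmix coefficients are made up, and this is not the patented codec.
      import numpy as np

      rng = np.random.default_rng(0)
      left, right, center = (rng.standard_normal(1024) for _ in range(3))

      # Illustrative stereo downmix of three original channels.
      downmix_l = left + 0.7 * center
      downmix_r = right + 0.7 * center
      combined = downmix_l + downmix_r

      # Least-squares weight ("channel side information") so that
      # weight * combined approximates the selected original channel (center).
      weight = np.dot(combined, center) / np.dot(combined, combined)
      approximation = weight * combined

      error = np.linalg.norm(center - approximation) / np.linalg.norm(center)
      print("side-information weight: %.3f, relative error: %.3f" % (weight, error))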
  • Patent number: 11335352
    Abstract: A voice identity feature extractor training method includes extracting a voice feature vector of training voice. The method may include determining a corresponding I-vector according to the voice feature vector of the training voice. The method may include adjusting a weight of a neural network model by using the I-vector as a first target output of the neural network model, to obtain a first neural network model. The method may include obtaining a voice feature vector of target detecting voice and determining an output result of the first neural network model for the voice feature vector of the target detecting voice. The method may include determining an I-vector latent variable. The method may include estimating a posterior mean of the I-vector latent variable, and adjusting a weight of the first neural network model using the posterior mean as a second target output, to obtain a voice identity feature extractor.
    Type: Grant
    Filed: October 16, 2019
    Date of Patent: May 17, 2022
    Assignee: Tencent Technology (Shenzhen) Company Limited
    Inventors: Na Li, Jun Wang
  • Patent number: 11308282
    Abstract: A method for determining document compatibility between documents stored locally on a plurality of user devices, while maintaining the confidentiality of each of the respective documents. The method includes requesting and receiving a token from each of the plurality of user devices, the token indicative of the presence or absence of a specific element in each respective document. The method further includes comparing the value of each of the respective tokens. When each of the tokens has a true value, the method determines the specific element for each respective document to be compatible and sends a message to each of the plurality of user devices indicating the compatibility of the respective documents. When at least one of the tokens has a false value, the method determines the specific element for each respective document to be incompatible and sends a message to each of the plurality of user devices indicating the incompatibility of the respective documents.
    Type: Grant
    Filed: October 18, 2019
    Date of Patent: April 19, 2022
    Assignee: CAPITAL ONE SERVICES, LLC
    Inventors: Fardin Abdi Taghi Abad, Austin Walters, Jeremy Edward Goodsitt, Reza Farivar, Vincent Pham, Anh Truong, Kenneth Taylor, Mark Watson
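    Illustrative sketch: a toy Python version of the comparison step, in which only boolean tokens (one per device) are compared and a compatibility or incompatibility message is returned to every device; the device names and token values are invented.
      def check_compatibility(tokens):
          """Compare only the boolean tokens, never the documents themselves."""
          compatible = all(tokens.values())
          verdict = "compatible" if compatible else "incompatible"
          # One message per device, as in the abstract.
          return {device: "documents are " + verdict for device in tokens}

      # Each value means "the specific element is present in this device's document".
      tokens = {"device_a": True, "device_b": True, "device_c": False}
      print(check_compatibility(tokens))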
  • Patent number: 11289102
    Abstract: The present disclosure provides an audio signal encoding method and a communication terminal, which relate to the communications field. The method and communication terminal are used to obtain an analog audio signal and encode it to obtain a bitstream representing the analog audio signal, in which a proper bit allocation for spectral coefficients can be performed.
    Type: Grant
    Filed: July 9, 2019
    Date of Patent: March 29, 2022
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Zexin Liu, Bin Wang, Lei Miao
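    Illustrative sketch: one common way to do bit allocation for spectral coefficients, distributing a bit budget across sub-bands in proportion to log band energy; this is a generic transform-coder heuristic for illustration, not the encoding method claimed in the patent.
      import numpy as np

      def allocate_bits(band_energies, total_bits):
          """Split a bit budget across sub-bands in proportion to log energy."""
          log_energy = np.log2(np.maximum(band_energies, 1e-12))
          share = np.maximum(log_energy - log_energy.min(), 1e-6)
          bits = np.floor(total_bits * share / share.sum()).astype(int)
          # Hand any leftover bits (from flooring) to the highest-energy bands.
          leftover = total_bits - bits.sum()
          for idx in np.argsort(band_energies)[::-1][:leftover]:
              bits[idx] += 1
          return bits

      energies = np.array([4.0, 16.0, 1.0, 0.25, 8.0])
      print(allocate_bits(energies, total_bits=64))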
  • Patent number: 11276389
    Abstract: A personalized text-to-speech (TTS) system configured to perform speaker adaptation is disclosed. The TTS system includes an acoustic model comprising a base neural network and a differential neural network. The base neural network is configured to generate acoustic parameters corresponding to a base speaker or voice actor, while the differential neural network is configured to generate acoustic parameters corresponding to differences between acoustic parameters of the base speaker and a particular target speaker. The output of the acoustic model is then a weighted linear combination of the outputs from the base neural network and the differential neural network. The base neural network and the differential neural network share a first input layer and a first plurality of hidden layers. Thereafter, the base neural network further comprises a second plurality of hidden layers and an output layer. In parallel, the differential neural network further comprises a third plurality of hidden layers and a separate output layer.
    Type: Grant
    Filed: December 2, 2019
    Date of Patent: March 15, 2022
    Assignee: OBEN, INC.
    Inventor: Sandesh Aryal
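    Illustrative sketch: a bare-bones Python model of the described structure, with a shared front end feeding a base branch and a differential branch whose outputs are combined by a weighted linear combination; the layer sizes, random weights, and combination weights are all invented.
      import numpy as np

      rng = np.random.default_rng(1)

      def relu(x):
          return np.maximum(x, 0.0)

      # Shared input layer and first plurality of hidden layers.
      w_shared = [rng.standard_normal((16, 24)), rng.standard_normal((24, 24))]
      # Branch-specific hidden layers and output layers.
      w_base = [rng.standard_normal((24, 24)), rng.standard_normal((24, 8))]
      w_diff = [rng.standard_normal((24, 24)), rng.standard_normal((24, 8))]

      def acoustic_model(x, weight_base=1.0, weight_diff=1.0):
          h = x
          for w in w_shared:
              h = relu(h @ w)
          base_out = relu(h @ w_base[0]) @ w_base[1]  # base-speaker acoustic parameters
          diff_out = relu(h @ w_diff[0]) @ w_diff[1]  # base-to-target differences
          # Weighted linear combination of the two branch outputs.
          return weight_base * base_out + weight_diff * diff_out

      features = rng.standard_normal(16)
      print(acoustic_model(features).shape)  # (8,) acoustic parameters per frame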
  • Patent number: 11270707
    Abstract: A method of analysis of an audio signal comprises: receiving an audio signal representing speech; extracting first and second components of the audio signal representing first and second acoustic classes of the speech respectively; and analysing the first and second components of the audio signal with models of the first and second acoustic classes of the speech of an enrolled user. Based on the analysing, information is obtained about at least one of a channel and noise affecting the audio signal.
    Type: Grant
    Filed: October 10, 2018
    Date of Patent: March 8, 2022
    Assignee: Cirrus Logic, Inc.
    Inventor: John Paul Lesso
  • Patent number: 11264023
    Abstract: Input context for a statistical dialog manager may be provided. Upon receiving a spoken query from a user, the query may be categorized according to at least one context clue. The spoken query may then be converted to text according to a statistical dialog manager associated with the category of the query and a response to the spoken query may be provided to the user.
    Type: Grant
    Filed: May 22, 2019
    Date of Patent: March 1, 2022
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Bodell, John Bain, Robert Chambers, Karen M. Cross, Michael Kim, Nick Gedge, Daniel Frederick Penn, Kunal Patel, Edward Mark Tecot, Jeremy C. Waltmunson
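    Illustrative sketch: a schematic Python routing example in which a context clue (here, the foreground application) selects a category, and the recognizer registered for that category converts the query to text; the categories and stand-in recognizers are placeholders.
      # Stand-in recognizers, one per query category.
      recognizers = {
          "navigation": lambda audio: "directions to the airport",
          "media": lambda audio: "play some jazz",
          "default": lambda audio: "unrecognized request",
      }

      def categorize(context):
          """Pick a category from a context clue such as the foreground application."""
          app = context.get("foreground_app", "")
          if app == "maps":
              return "navigation"
          if app == "music_player":
              return "media"
          return "default"

      def handle_query(audio, context):
          category = categorize(context)
          text = recognizers[category](audio)  # category-specific conversion to text
          return "[" + category + "] " + text

      print(handle_query(b"\x00\x01", {"foreground_app": "maps"}))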
  • Patent number: 11250843
    Abstract: Disclosed is a speech recognition method capable of communicating with other electronic devices and an external server in a 5G communication environment by performing speech recognition through an artificial intelligence (AI) algorithm and/or a machine learning algorithm. The speech recognition method may comprise performing speech recognition by using an acoustic model and a language model stored in a speech database, determining whether the speech recognition of the spoken sentence is successful, storing speech recognition failure data when the speech recognition of the spoken sentence fails, analyzing the speech recognition failure data of the spoken sentence, and, when the cause of the speech recognition failure is due to the acoustic model or the language model, updating that model by adding the recognition failure data to its learning database and machine-learning the model.
    Type: Grant
    Filed: September 11, 2019
    Date of Patent: February 15, 2022
    Assignee: LG Electronics Inc.
    Inventor: Hwan Sik Yun
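    Illustrative sketch: a simplified Python version of the update loop, logging failed recognitions, attributing each failure to the acoustic model or the language model, and appending it to that model's learning database; the attribution rule here (lower confidence gets the blame) is an invented stand-in.
      failure_log = []
      learning_db = {"acoustic_model": [], "language_model": []}

      def record_failure(acoustic_confidence, language_confidence, utterance):
          failure_log.append((acoustic_confidence, language_confidence, utterance))
          # Invented attribution rule: blame the model with the lower confidence.
          cause = ("acoustic_model"
                   if acoustic_confidence < language_confidence
                   else "language_model")
          # Add the failure data to that model's learning database for retraining.
          learning_db[cause].append(utterance)
          return cause

      print(record_failure(0.32, 0.81, "turn on the dehumidifier"))
      print({name: len(entries) for name, entries in learning_db.items()})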
  • Patent number: 11250053
    Abstract: The technology relates to systems and methods for transcribing audio of a meeting. Upon transcribing the audio, the systems and methods can parse different portions of the transcribed audio so that they may attribute the different portions to a particular speaker. These transcribed portions that are attributed to a particular speaker are made available for viewing and interaction using a graphical user interface.
    Type: Grant
    Filed: July 9, 2019
    Date of Patent: February 15, 2022
    Assignee: NASDAQ, INC.
    Inventors: Christopher Avore, Joseph McNeil, Christian Eckels
  • Patent number: 11250847
    Abstract: A wireless communication system includes a smart device configured for speech-to-text (STT) transcription for display. The smart device interfaces with a radio communications device, for example, in an aircraft (AC). The system includes a filter for optimizing STT functions. Such functions are further optimized by restricting the databases of information, including geographic locations, aircraft identifications, and carrier information, thereby narrowing the database search. Methods for wireless communications using smart devices and STT functionality are disclosed.
    Type: Grant
    Filed: July 17, 2019
    Date of Patent: February 15, 2022
    Assignee: Appareo Systems, LLC
    Inventors: Joshua N. Gelinske, Bradley R. Thurow, Jesse S. Trana, Dakota M. Smith, Nicholas L. Butts, Jeffrey L. Johnson, Derek B. Aslakson, Jaden C. Young
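    Illustrative sketch: a compact Python example of restricting a database to the current context so the speech-to-text lookup has less to search; the waypoint identifiers, coordinates, and distance rule are invented placeholders.
      waypoints = {"KFAR": (46.92, -96.81), "KJFK": (40.64, -73.78), "KBIS": (46.77, -100.75)}

      def restrict_waypoints(position, max_deg=3.0):
          """Keep only waypoints within a crude lat/lon box around the aircraft."""
          lat, lon = position
          return {
              ident: coords
              for ident, coords in waypoints.items()
              if abs(coords[0] - lat) <= max_deg and abs(coords[1] - lon) <= max_deg
          }

      print(restrict_waypoints((46.9, -97.0)))  # only nearby identifiers remain searchable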
  • Patent number: 11238226
    Abstract: A method, computer program product, and computer system for identifying, by a computing device, a model for predicting conversational phrases for a communication between at least a first user and a second user. The model may be trained based upon, at least in part, an attribute associated with the second user. At least one conversational phrase may be predicted for the communication between the first user and the second user. The at least one conversational phrase may be provided to the second user as an optional phrase to be sent to the first user.
    Type: Grant
    Filed: November 15, 2018
    Date of Patent: February 1, 2022
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: Paul Joseph Vozila, Peter Stubley, Jean-Francois Beaumont, Ding Liu, William F. Ganong, III
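    Illustrative sketch: a minimal Python stand-in for the prediction step, where a phrase model keyed on an attribute of the second user (here, an agent's department) proposes optional phrases ranked by a naive relevance rule; all phrases and rules are invented.
      phrase_models = {
          "billing": [
              "I can take a look at that charge for you.",
              "Could you confirm the invoice number?",
          ],
          "technical": [
              "Which error message do you see?",
              "Let's try restarting the device.",
          ],
      }

      def predict_phrases(conversation, agent_attribute, top_n=2):
          candidates = phrase_models.get(agent_attribute, [])
          # Naive relevance ranking: prefer phrases sharing words with the last message.
          last_words = set(conversation[-1].lower().split())
          ranked = sorted(
              candidates,
              key=lambda phrase: len(last_words & set(phrase.lower().split())),
              reverse=True,
          )
          return ranked[:top_n]

      print(predict_phrases(["My invoice shows a duplicate charge"], "billing"))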
  • Patent number: 11238842
    Abstract: An example intent-recognition system comprises a processor and memory storing instructions. The instructions cause the processor to receive speech input comprising spoken words. The instructions cause the processor to generate text results based on the speech input and generate acoustic feature annotations based on the speech input. The instructions also cause the processor to apply an intent model to the text result and the acoustic feature annotations to recognize an intent based on the speech input. An example system for adapting an emotional text-to-speech model comprises a processor and memory. The memory stores instructions that cause the processor to receive training examples comprising speech input and receive labelling data comprising emotion information associated with the speech input. The instructions also cause the processor to extract audio signal vectors from the training examples and generate an emotion-adapted voice font model based on the audio signal vectors and the labelling data.
    Type: Grant
    Filed: June 7, 2017
    Date of Patent: February 1, 2022
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Pei Zhao, Kaisheng Yao, Max Leung, Bo Yan, Jian Luan, Yu Shi, Malone Ma, Mei-Yuh Hwang
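    Illustrative sketch: a toy Python rule showing how text results and acoustic feature annotations can jointly determine an intent, so the same words yield different intents depending on how they were spoken; the features, thresholds, and intents are invented.
      def recognize_intent(text_result, acoustic_annotations):
          wants_music = "play" in text_result.lower()
          sounds_agitated = (acoustic_annotations.get("pitch_variance", 0.0) > 0.8
                             and acoustic_annotations.get("energy", 0.0) > 0.7)
          # The same words map to different intents depending on how they were spoken.
          if wants_music and sounds_agitated:
              return "play_calming_music"
          if wants_music:
              return "play_music"
          return "unknown"

      print(recognize_intent("play something", {"pitch_variance": 0.9, "energy": 0.85}))
      print(recognize_intent("play something", {"pitch_variance": 0.2, "energy": 0.3}))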
  • Patent number: 11227117
    Abstract: A method, a device and a computer program product for processing a segment are proposed. In the method, a property of at least one of a first segment and a second segment in a segment set is obtained. The segment set includes a plurality of segments belonging to at least one conversation. The second segment occurs after the first segment. A boundary feature of at least one of the first segment and the second segment is determined based on the property. The boundary feature indicates whether there is a boundary of a conversation after the first segment.
    Type: Grant
    Filed: August 3, 2018
    Date of Patent: January 18, 2022
    Assignee: International Business Machines Corporation
    Inventors: Jonathan F. Brunn, Yuan Cheng, Jonathan Dunne, Bo Jiang, Ming Wan
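    Illustrative sketch: a toy Python example that derives a boundary feature from properties of two adjacent segments (time gap and speaker change) and applies an invented threshold to decide whether a conversation boundary follows the first segment.
      from dataclasses import dataclass

      @dataclass
      class Segment:
          speaker: str
          start: float  # seconds
          end: float
          text: str

      def boundary_feature(first, second):
          """Derive a boundary feature from properties of two adjacent segments."""
          return {
              "gap_seconds": second.start - first.end,
              "speaker_change": first.speaker != second.speaker,
          }

      def is_conversation_boundary(feature, max_gap=30.0):
          # Invented rule: a long silence marks the end of a conversation.
          return feature["gap_seconds"] > max_gap

      first = Segment("alice", 0.0, 4.2, "thanks, talk later")
      second = Segment("bob", 95.0, 98.0, "hi, new question about billing")
      feature = boundary_feature(first, second)
      print(feature, is_conversation_boundary(feature))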
  • Patent number: 11227609
    Abstract: A method of analysis of an audio signal comprises: receiving an audio signal representing speech; extracting first and second components of the audio signal representing first and second acoustic classes of the speech respectively; and analysing the first and second components of the audio signal with models of the first and second acoustic classes of the speech of an enrolled user. Based on the analysing, information is obtained about at least one of a channel and noise affecting the audio signal.
    Type: Grant
    Filed: October 10, 2018
    Date of Patent: January 18, 2022
    Assignee: Cirrus Logic, Inc.
    Inventor: John Paul Lesso
  • Patent number: 11222650
    Abstract: A device and a method for generating a synchronous corpus are disclosed. Firstly, script data and a dysarthria voice signal having a dysarthria consonant signal are received and the position of the dysarthria consonant signal is detected, wherein the script data have text corresponding to the dysarthria voice signal. Then, normal phoneme data corresponding to the text are searched and the text is converted into a normal voice signal based on the normal phoneme data corresponding to the text. The dysarthria consonant signal is replaced with the normal consonant signal based on the positions of the normal consonant signal and the dysarthria consonant signal, thereby synchronously converting the dysarthria voice signal into a synthesized voice signal. The synthesized voice signal and the dysarthria voice signal are provided to train a voice conversion model, which retains the timbre of the dysarthria voice and improves communication.
    Type: Grant
    Filed: March 18, 2020
    Date of Patent: January 11, 2022
    Assignee: NATIONAL CHUNG CHENG UNIVERSITY
    Inventors: Tay Jyi Lin, Ching Wei Yeh, Shun Pu Yang, Chen Zong Liao
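    Illustrative sketch: a highly simplified Python splice in which the samples of the dysarthric consonant are replaced by the corresponding samples of a synthesized normal voice, leaving the rest of the recording untouched; the signals and positions are synthetic placeholders.
      import numpy as np

      dysarthria_voice = np.sin(np.linspace(0, 20, 1600))     # stand-in recording
      normal_voice = np.sin(np.linspace(0, 20, 1600) + 0.3)   # stand-in text-to-speech output

      # Detected positions (sample indices) of the consonant in each signal.
      dys_start, dys_end = 400, 520
      norm_start = 400
      norm_end = norm_start + (dys_end - dys_start)

      # Replace the dysarthric consonant with the normal consonant, keeping the
      # rest of the dysarthria voice (and hence its timbre) untouched.
      synthesized_voice = dysarthria_voice.copy()
      synthesized_voice[dys_start:dys_end] = normal_voice[norm_start:norm_end]
      print("replaced samples:", dys_end - dys_start)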
  • Patent number: 11189264
    Abstract: Implementations set forth herein relate to speech recognition techniques for handling variations in speech among users (e.g., due to different accents) and processing features of user context in order to expand the number of speech recognition hypotheses considered when interpreting a spoken utterance from a user. In order to adapt to an accent of the user, terms common to multiple speech recognition hypotheses can be filtered out in order to identify inconsistent terms apparent in a group of hypotheses. Mappings between inconsistent terms can be stored for subsequent use as term correspondence data. In this way, supplemental speech recognition hypotheses can be generated and subjected to probability-based scoring for identifying the speech recognition hypothesis that most correlates to a spoken utterance provided by a user. In some implementations, prior to scoring, hypotheses can be supplemented based on contextual data, such as on-screen content and/or application capabilities.
    Type: Grant
    Filed: July 17, 2019
    Date of Patent: November 30, 2021
    Assignee: GOOGLE LLC
    Inventors: Ágoston Weisz, Alexandru Dovlecel, Gleb Skobeltsyn, Evgeny Cherepanov, Justas Klimavicius, Yihui Ma, Lukas Lopatovsky
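    Illustrative sketch: a small Python example of the expansion idea, filtering out terms shared by all hypotheses, recording the inconsistent terms as correspondence data, and generating a supplemental hypothesis not in the original list; the hypotheses are invented.
      from itertools import product

      hypotheses = [
          "send a text to dawn",
          "send a text to john",
          "send the text to dawn",
      ]
      tokenized = [hypothesis.split() for hypothesis in hypotheses]

      # Terms shared by every hypothesis are consistent and held fixed; the rest
      # are inconsistent terms whose alternatives form correspondence sets.
      position_options = []
      for position in range(len(tokenized[0])):
          position_options.append(sorted({tokens[position] for tokens in tokenized}))

      # Mappings between inconsistent terms, stored for subsequent use.
      term_correspondence = {
          term: set(options)
          for options in position_options if len(options) > 1
          for term in options
      }

      # Supplemental hypotheses: combinations not in the recognizer's original list.
      expanded = {" ".join(candidate) for candidate in product(*position_options)}
      supplemental = expanded - set(hypotheses)
      print(sorted(supplemental))        # e.g. ['send the text to john']
      print(term_correspondence)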
  • Patent number: 11176927
    Abstract: A computer-implemented method for providing an adaptive dialogue system is provided. Here, there is an automatic capture of at least one dialogue segment from a dialogue participant communicating with the dialogue system. There is an automatic comparison of the captured dialogue segment with dialogue segments of a stored dialogue segment model. After the comparison, there is an automatic assignment of at least one corresponding dialogue segment to the captured dialogue segment according to the dialogue segment model if the captured dialogue segment is contained in the dialogue segment model or there is an automatic addition of the captured dialogue segment to a dialogue segment group if the captured dialogue segment is not contained in the dialogue segment model. A dialogue segment is generated depending on the dialogue segments of the dialogue segment group and the generated dialogue segment is stored in the dialogue segment model.
    Type: Grant
    Filed: May 8, 2019
    Date of Patent: November 16, 2021
    Inventor: Manfred Langen
  • Patent number: 11170175
    Abstract: Certain aspects of the present disclosure provide techniques for generating a replacement sentence with the same or similar meaning but a different sentiment than an input sentence. The method generally includes receiving a request for a replacement sentence and iteratively determining the next word of the replacement sentence word-by-word based on an input sentence. Iteratively determining the next word generally includes evaluating a set of words of the input sentence using a language model configured to output candidate sentences and evaluating the candidate sentences using a sentiment model configured to output sentiment scores for the candidate sentences. Iteratively determining the next word further includes calculating convex combinations for the candidate sentences and selecting an ending word of one of the candidate sentences as the next word of the replacement sentence. The method further includes transmitting the replacement sentence in response to the request for the replacement sentence.
    Type: Grant
    Filed: July 1, 2019
    Date of Patent: November 9, 2021
    Assignee: INTUIT, INC.
    Inventors: Manav Kohli, Cindy Osmon, Nicholas Roberts
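    Illustrative sketch: a toy Python version of the word-by-word selection, scoring candidate ending words by a convex combination of a stand-in language-model score and a stand-in sentiment score; both scoring functions and the mixing weight are invented placeholders.
      def language_model_score(sentence):
          # Stand-in for a real language model's fluency score.
          return 1.0 / len(sentence.split())

      def sentiment_score(sentence):
          positive_words = {"great", "helpful", "resolved"}
          words = set(sentence.lower().split())
          return len(words & positive_words) / max(len(words), 1)

      def next_word(prefix, candidate_words, lam=0.6):
          def combined(word):
              candidate_sentence = (prefix + " " + word).strip()
              # Convex combination of fluency and target sentiment.
              return (lam * language_model_score(candidate_sentence)
                      + (1 - lam) * sentiment_score(candidate_sentence))
          return max(candidate_words, key=combined)

      prefix = "the support team was"
      print(next_word(prefix, ["slow", "helpful", "confusing"]))  # -> "helpful"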