Patents Examined by Paras D Shah
  • Patent number: 10540961
    Abstract: Described herein are systems and methods for creating and using Convolutional Recurrent Neural Networks (CRNNs) for small-footprint keyword spotting (KWS) systems. Inspired by the large-scale state-of-the-art speech recognition systems, in embodiments, the strengths of convolutional layers to utilize the structure in the data in time and frequency domains are combined with recurrent layers to utilize context for the entire processed frame. The effect of architecture parameters was examined to determine preferred model embodiments given the performance versus model size tradeoff. Various training strategies are provided to improve performance. In embodiments, using only ~230k parameters and yielding acceptably low latency, a CRNN model embodiment demonstrated high accuracy and robust performance in a wide range of environments.
    Type: Grant
    Filed: August 28, 2017
    Date of Patent: January 21, 2020
    Assignee: Baidu USA LLC
    Inventors: Sercan Arik, Markus Kliegl, Rewon Child, Joel Hestness, Andrew Gibiansky, Christopher Fougner, Ryan Prenger, Adam Coates
  • Patent number: 10535346
    Abstract: A collaborative speech processing computer receives packets of sampled audio streams. The sampled audio streams are forwarded to a speech-to-text conversion server via a data network. Packets are received via the data network that contain text strings converted from the sampled audio streams by the speech-to-text conversion server. Speakers are identified who are associated with the text strings contained in the data packets. The text strings and the identifiers of the associated speakers are added to a dialog data structure in a repository memory. Content of at least a portion of the dialog data structure is displayed on a display device.
    Type: Grant
    Filed: December 7, 2017
    Date of Patent: January 14, 2020
    Assignee: CA, Inc.
    Inventors: Preethi Raja, Jagadeeshwaran Karunanithy, Shamayel Mohammed Farooqui, Jagadishwara Chary Sriramoju, Sai Kumar Bochkar
  • Patent number: 10535345
    Abstract: An interactive method/system generates a fictional story. A user interface receives human speech and transmits machine-generated speech. A processor(s) is programmed to execute functions that include parsing the human speech into fragments thereof and identifying a primary fragment from the fragments wherein the primary fragment includes a verb. A generalized intent is associated with the verb and at least one object is associated with the generalized intent. The generalized intent and each object associated with the generalized intent are stored. An open-ended question is generated based on the generalized intent wherein the open-ended question is provided to the user interface for transmission as machine-generated speech. After the above steps are repeated for a number of cycles, a sequence of sentences is generated using each generalized intent and object(s) associated with the generalized intent. The sequence of sentences is transmitted as machine-generated speech from the user interface.
    Type: Grant
    Filed: October 20, 2017
    Date of Patent: January 14, 2020
    Inventor: Yingjia Liu
  • Patent number: 10515634
    Abstract: An apparatus for searching for geographic information using interactive voice recognition includes: a receiver configured to receive a voice signal; a voice recognition unit configured to recognize the voice signal; a result analysis processing unit configured to search for geographic information on the basis of the recognized voice signal, and analyze a search result of the geographic information; and a question generating unit configured to generate a question in response to the result of determination. A method for searching for geographic information using interactive voice recognition includes: receiving a voice signal, and recognizing the voice signal; searching for geographic information on the basis of the recognized voice signal; analyzing a search result of the geographic information; and generating a question in response to the result of determination.
    Type: Grant
    Filed: December 7, 2017
    Date of Patent: December 24, 2019
    Assignees: HYUNDAI MOTOR COMPANY, KIA MOTORS CORPORATION
    Inventor: Kyu Seop Bang
  • Patent number: 10504535
    Abstract: A Mobile Voice Self Service (MVSS) mobile device and method thereof. A VoiceXML browser that is implemented directly on the MVSS mobile device may request a VoiceXML application and process it. A call data manager may also be implemented on the MVSS mobile device and may provide call data that may authorize access to advanced Media Resource Control Protocol (MRCP) services, such as Automatic Speech Recognition (ASR) or Text-To-Speech (TTS). A media resource gateway may then provide the advanced MRCP services to the VoiceXML application processed by the VoiceXML application browser. Hotkey navigations and bookmarked application points to VoiceXML applications may be created and applied through application analysis and state tracking. Therein, VoiceXML document transitions and user input are stored to maintain application state changes until the user requests creation of an application bookmark.
    Type: Grant
    Filed: November 7, 2017
    Date of Patent: December 10, 2019
    Assignee: West Corporation
    Inventor: Chad Daniel Fox
  • Patent number: 10504515
    Abstract: A voice control device includes a microphone module, a voice encoding module, a display and a processing unit. The voice encoding module is electrically connected to the microphone module. The processing unit is electrically connected to the voice encoding module and the display. The microphone module receives a voice signal and transmits the received voice signal to the voice encoding module. One of the voice encoding module and the processing unit analyzes and processes the voice signal to determine a sound source direction of the voice signal and obtains response information according to the voice signal. The processing unit controls the display to rotate to the sound source direction and transmits the response information to the display for displaying the response information.
    Type: Grant
    Filed: December 15, 2017
    Date of Patent: December 10, 2019
    Assignee: Chiun Mai Communication Systems, Inc.
    Inventors: Yu-Yang Chih, Ming-Chun Ho, Ming-Fu Tsai, Cheng-Ping Liu, Fu-Bin Wang, Shih-Lun Lin
  • Patent number: 10504513
    Abstract: A dock device connects participating devices such as a tablet device and an audio activated device, allowing them to operate as a single device. These participating devices may be associated with different accounts, each account being associated with particular “speechlets” or data processing functions. A natural language understanding (NLU) system uses NLU models to process text obtained from an automatic speech recognition (ASR) system to determine a set of possible intents. A second set of possible intents may then be generated that is limited to those possible intents that correspond to the speechlets associated with the docked device. The intents within the second set of possible intents are ranked, and the highest ranked intent may be deemed to be the intent of the user. Command data corresponding to the highest ranked intent may be generated and used to perform the action associated with that intent.
    Type: Grant
    Filed: September 26, 2017
    Date of Patent: December 10, 2019
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Timothy Thomas Gray, Michal Grzegorz Kurpanik, Jenny Toi Wah Lam, Sarveshwar Nigam, Shirin Saleem, Jonhenry A. Righter, Jeremy Richard Hill, Kavya Ravikumar, Joe Virgil Fernandez, Kynan Dylan Antos, Kelly James Vanee
  • Patent number: 10504505
    Abstract: Disclosed herein are systems, computer-implemented methods, and tangible computer-readable storage media for speaker recognition personalization. The method recognizes speech received from a speaker interacting with a speech interface using a set of allocated resources, the set of allocated resources including bandwidth, processor time, memory, and storage. The method records metrics associated with the recognized speech, and after recording the metrics, modifies at least one of the allocated resources in the set of allocated resources commensurate with the recorded metrics. The method recognizes additional speech from the speaker using the modified set of allocated resources. Metrics can include a speech recognition confidence score, processing speed, dialog behavior, requests for repeats, negative responses to confirmations, and task completions.
    Type: Grant
    Filed: December 4, 2017
    Date of Patent: December 10, 2019
    Assignee: NUANCE COMMUNICATIONS, INC.
    Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
  • Patent number: 10498898
    Abstract: A method for configuring a topic-specific chatbot: clustering, by a processor, a plurality of transcripts of interactions between customers and human agents of a contact center of an enterprise to generate a plurality of clusters of interactions, each cluster of interactions corresponding to a topic, each of the interactions including agent phrases and customer phrases; for each cluster of the plurality of clusters of interactions: extracting, by the processor, a topic-specific dialogue tree for the cluster; pruning, by the processor, the topic-specific dialogue tree to generate a deterministic dialogue tree; and configuring, by the processor, a topic-specific chatbot in accordance with the deterministic dialogue tree; and outputting, by the processor, the one or more topic-specific chatbots, each of the topic-specific chatbots being configured to generate, automatically, responses to messages regarding the topic of the topic-specific chatbot from a customer in an interaction between the customer and the ente
    Type: Grant
    Filed: December 13, 2017
    Date of Patent: December 3, 2019
    Inventors: Arnon Mazza, Avraham Faizakof, Amir Lev-Tov, Tamir Tapuhi, Yochai Konig
  • Patent number: 10496364
    Abstract: According to one embodiment, in response to a text stream originated from a user at an electronic device (e.g., home device), a natural language processing (NLP) operation is performed on the text stream. An object described by the text stream is determined based on the NLP operation. One or more colors associated with the object are determined. A light control command is then transmitted from the electronic device to each of the smart lights to control a color of the smart light based on the one or more colors associated with the object, such that the smart lights are lit with the colors associated with the object. The text stream may be converted from a voice stream using a speech recognition process.
    Type: Grant
    Filed: October 31, 2017
    Date of Patent: December 3, 2019
    Assignee: BAIDU USA LLC
    Inventor: Xuchen Yao
  • Patent number: 10490195
    Abstract: Systems, methods, and devices related to establishing voice identity profiles for use with voice-controlled devices are provided. The embodiments disclosed enhance user experience by customizing the enrollment process to utilize voice recognition for each user based on historical information which can be used in the selection process of phrases a user speaks during enrollment of a voice recognition function or skill. The selection process can utilize phrases that have already been spoken to the electronic device; it can utilize phrases, contacts, or other personalized information it can obtain from the user account of the person enrolling; it can use any of the information just described to select specific words to enhance the probability of achieving higher phonetic matches based on words the individual user is more likely to speak to the device.
    Type: Grant
    Filed: September 26, 2017
    Date of Patent: November 26, 2019
    Assignee: Amazon Technologies, Inc.
    Inventors: Vishwanathan Krishnamoorthy, Sundararajan Srinivasan, Spyridon Matsoukas, Aparna Khare, Arindam Mandal, Krishna Subramanian, Gregory Michael Hart
  • Patent number: 10481860
    Abstract: A Solar Tablet verbal with nanoscale layers, lithium battery a solar MP3 player, e-books reader, e-newspaper reader, and e-magazine reader. All units are operable by verbal command and can work manually from an ultra-high definition touch screen. The solar technology utilizes the photoelectric effect with nanoscale layers to boost solar cell efficiency. The tablet has encryption software.
    Type: Grant
    Filed: May 29, 2015
    Date of Patent: November 19, 2019
    Inventor: Gregory Walker Johnson
  • Patent number: 10475461
    Abstract: In particular embodiments, one or more computer-readable non-transitory storage media embody software that is operable when executed to receive an audio waveform fingerprint and a client-determined location from a client device. The received audio waveform fingerprint may be compared to a database of stored audio waveform fingerprints, each stored audio waveform fingerprint associated with an object in an object database. One or more matching audio waveform fingerprints may be found from a comparison set of audio waveform fingerprints obtained from the audio waveform fingerprint database. Location information associated with a location of the client device may be determined, and the location information may be sent to the client device. The client device may be operable to update the client-determined location based at least in part on the location information.
    Type: Grant
    Filed: January 25, 2016
    Date of Patent: November 12, 2019
    Assignee: Facebook, Inc.
    Inventors: Matthew Nicholas Papakipos, David Harry Garcia
  • Patent number: 10468036
    Abstract: A method for mixing, processing and enhancing signals using signal decomposition is presented. A method for improving sorting of decomposed signal parts using cross-component similarity is also provided.
    Type: Grant
    Filed: April 30, 2014
    Date of Patent: November 5, 2019
    Assignee: ACCUSONUS, INC.
    Inventors: Alexandros Tsilfidis, Elias Kokkinis
  • Patent number: 10468024
    Abstract: A method includes: acquiring first voice information indicating a voice of a user input from a microphone; outputting, to a server via a network, first text string information generated from the first voice information, when the first text string information does not match any of pieces of text string information in the first database; acquiring, from the server, first semantic information and/or a control command corresponding to the first semantic information, when a second database includes a piece of text string information matched with the first text string information and the matched piece of text string information is associated with the first semantic information therein; instructing at least one device to execute an operation based on the first semantic information and/or the control command; and outputting, to a speaker, second voice information generated from second text string information, the second text string information being registered and associated with the first semantic information in the
    Type: Grant
    Filed: October 12, 2017
    Date of Patent: November 5, 2019
    Assignee: PANASONIC INTELLECTUAL PROPERTY CORPORATION OF AMERICA
    Inventors: Yuri Nishikawa, Katsuyoshi Yamagami
  • Patent number: 10468017
    Abstract: Methods and systems are provided for a speech system of a vehicle. In particular, a method is taught for associating a speech utterance with a voice command in response to a failed voice control attempt followed by a successful voice control attempt.
    Type: Grant
    Filed: December 14, 2017
    Date of Patent: November 5, 2019
    Assignee: GM GLOBAL TECHNOLOGY OPERATIONS LLC
    Inventors: Ron M Hecht, Yael Shmueli Friedland, Ariel Telpaz, Omer Tsimhoni, Peggy Wang
  • Patent number: 10467341
    Abstract: A method for determining document compatibility between documents stored locally on a plurality of user devices, while maintaining the confidentiality of each of the respective documents. The method includes requesting and receiving a token from each of the plurality of user devices, the token indicative of the presence or absence of a specific element in each respective document. The method further includes comparing the value of each of the respective tokens. When each of the tokens has a true value, the method determines the specific element for each respective document to be compatible and sends a message to each of the plurality of user devices indicating the compatibility of the respective documents. When at least one of the tokens has a false value, the method determines the specific element for each respective document to be incompatible and sends a message to each of the plurality of user devices indicating the incompatibility of the respective documents.
    Type: Grant
    Filed: May 1, 2019
    Date of Patent: November 5, 2019
    Assignee: CAPITAL ONE SERVICES, LLC
    Inventors: Fardin Abdi Taghi Abad, Austin Walters, Jeremy Edward Goodsitt, Reza Farivar, Vincent Pham, Anh Truong, Kenneth Taylor, Mark Watson
  • Patent number: 10460739
    Abstract: A gain adjustment apparatus for use in decoding of audio that has been encoded with separate gain and shape representations includes an accuracy meter configured to estimate an accuracy measure of the shape representation, and to determine a gain correction based on the estimated accuracy measure. An envelope adjuster further included in the apparatus is configured to adjust the gain representation based on the determined gain correction.
    Type: Grant
    Filed: August 4, 2017
    Date of Patent: October 29, 2019
    Assignee: TELEFONAKTIEBOLAGET LM ERICSSON (PUBL)
    Inventors: Erik Norvell, Volodya Grancharov
  • Patent number: 10455344
    Abstract: In processing a multi-channel audio signal having at least three original channels, a first downmix channel and a second downmix channel are provided, which are derived from the original channels. For a selected original channel, channel side information is calculated such that a downmix channel or a combined downmix channel including the first and the second downmix channels, when weighted using the channel side information, results in an approximation of the selected original channel. The channel side information and the first and second downmix channels form output data for transmission to a decoder. A low level decoder only decodes the first and second downmix channels. A high level decoder provides a full multi-channel audio signal based on the downmix channels and the channel side information.
    Type: Grant
    Filed: April 5, 2019
    Date of Patent: October 22, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Juergen Herre, Johannes Hilpert, Stefan Geyersberger, Andreas Hoelzer, Claus Spenger
  • Patent number: 10453117
    Abstract: A system capable of performing natural language understanding (NLU) using different application domains in parallel. A model takes incoming query text and determines a list of potential supplemental intent categories corresponding to the text. Supplemental applications within those categories are then identified as likely candidates for responding to the query. Application specific domains, including NLU components for the particular supplemental applications, are then activated and process the query text in parallel. Further, certain system default domains may also process incoming queries substantially in parallel with the supplemental applications. The different results are scored and ranked to determine highest scoring NLU results.
    Type: Grant
    Filed: June 29, 2016
    Date of Patent: October 22, 2019
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Simon Peter Reavely, Rohit Prasad, Imre Attila Kiss, Manoj Sindhwani
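
The abstract of patent 10453117 describes several NLU domains processing the same query text in parallel, with the results then scored and ranked. A minimal sketch of that general pattern follows. It is an illustration only, not the patented implementation: the three domain handlers, the `NluResult` type, the keyword-based scores, and the function `rank_nlu_results` are all hypothetical names and values chosen for this example.

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class NluResult:
    """One domain's interpretation of the query (hypothetical structure)."""
    domain: str
    intent: str
    score: float


# Hypothetical domain handlers. Real NLU components would use trained
# models; keyword checks stand in for them here.
def music_domain(text: str) -> NluResult:
    score = 0.9 if "play" in text.lower() else 0.1
    return NluResult("music", "PlayMusic", score)


def weather_domain(text: str) -> NluResult:
    score = 0.8 if "weather" in text.lower() else 0.05
    return NluResult("weather", "GetWeather", score)


def default_domain(text: str) -> NluResult:
    # Stand-in for a system default domain with a fixed fallback score.
    return NluResult("default", "Fallback", 0.3)


def rank_nlu_results(
    text: str, domains: List[Callable[[str], NluResult]]
) -> List[NluResult]:
    """Run every domain on the same text concurrently, then sort by score."""
    with ThreadPoolExecutor(max_workers=len(domains)) as pool:
        results = list(pool.map(lambda handler: handler(text), domains))
    return sorted(results, key=lambda r: r.score, reverse=True)


ranked = rank_nlu_results(
    "play some jazz", [music_domain, weather_domain, default_domain]
)
best = ranked[0]  # highest-scoring interpretation
```

Here `best` is the highest-scoring result, and the remaining entries in `ranked` stay available as fallbacks. A thread pool is only one way to run the handlers concurrently; the abstract does not say how the parallelism is achieved.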