Patents Examined by Paras D Shah
  • Patent number: 10446147
    Abstract: Techniques for providing a contextual voice user interface that enables a user to query a speech processing system with respect to the decisions made to answer the user's command are described. The speech processing system may store speech processing pipeline data used to process a command. At some point after the system outputs content deemed responsive to the command, a user may speak an utterance corresponding to an inquiry with respect to the processing performed to respond to the command. For example, the user may state “why did you tell me that?” In response thereto, the speech processing system may determine the stored speech processing pipeline data used to respond to the command, and may generate output audio data that describes the data and computing decisions involved in determining the content deemed responsive to the command.
    Type: Grant
    Filed: June 27, 2017
    Date of Patent: October 15, 2019
    Assignee: AMAZON TECHNOLOGIES, INC.
    Inventors: Michael James Moniz, Abishek Ravi, Ryan Scott Aldrich, Michael Bennett Adams
  • Patent number: 10433091
    Abstract: In processing a multi-channel audio signal having at least three original channels, a first downmix channel and a second downmix channel are provided, which are derived from the original channels. For a selected original channel, channel side information is calculated such that a downmix channel or a combined downmix channel including the first and the second downmix channels, when weighted using the channel side information, results in an approximation of the selected original channel. The channel side information and the first and second downmix channels form output data for transmission to a decoder. A low level decoder only decodes the first and second downmix channels. A high level decoder provides a full multi-channel audio signal based on the downmix channels and the channel side information.
    Type: Grant
    Filed: April 5, 2019
    Date of Patent: October 1, 2019
    Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.
    Inventors: Juergen Herre, Johannes Hilpert, Stefan Geyersberger, Andreas Hoelzer, Claus Spenger
  • Patent number: 10425364
    Abstract: According to an embodiment of the present invention, a system dynamically processes an incoming message. Initially, a system receives a message including a plurality of different contexts. In response to receiving the message, a processor in the system partitions the message into a plurality of sections each associated with a corresponding context, wherein each context includes an inquiry to receive a reply. The processor further generates one or more replies for a corresponding context and generates a response to the message including the sections and the one or more replies, wherein each reply is inserted into a section in the response associated with the corresponding context. Embodiments of the present invention further include a method and computer program product for dynamically processing messages in substantially the same manner described above.
    Type: Grant
    Filed: June 26, 2017
    Date of Patent: September 24, 2019
    Assignee: International Business Machines Corporation
    Inventors: Paul R. Bastide, Sathyanarayanan Srinivasan
  • Patent number: 10425757
    Abstract: In processing a multi-channel audio signal having at least three original channels, a first downmix channel and a second downmix channel are provided, which are derived from the original channels. For a selected original channel, channel side information is calculated such that a downmix channel or a combined downmix channel including the first and the second downmix channels, when weighted using the channel side information, results in an approximation of the selected original channel. The channel side information and the first and second downmix channels form output data for transmission to a decoder. A low level decoder only decodes the first and second downmix channels. A high level decoder provides a full multi-channel audio signal based on the downmix channels and the channel side information.
    Type: Grant
    Filed: April 5, 2019
    Date of Patent: September 24, 2019
    Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V.
    Inventors: Juergen Herre, Johannes Hilpert, Stefan Geyersberger, Andreas Hoelzer, Claus Spenger
  • Patent number: 10424302
    Abstract: Techniques are described related to turn-based reinforcement learning for dialog management. In various implementations, dialog states and corresponding responsive actions generated during a multi-turn human-to-computer dialog session may be obtained. A plurality of turn-level training instances may be generated, each including: a given dialog state of the plurality of dialog states at an outset of a given turn of the human-to-computer dialog session; and a given responsive action that was selected based on the given dialog state. One or more of the turn-level training instances may further include a turn-level feedback value that reflects on the given responsive action selected during the given turn. A reward value may be generated based on an outcome of the human-to-computer dialog session. The dialog management policy model may be trained based on turn-level feedback values of the turn-level training instance(s) and the reward value.
    Type: Grant
    Filed: October 12, 2017
    Date of Patent: September 24, 2019
    Assignee: GOOGLE LLC
    Inventors: Pararth Shah, Larry Paul Heck, Dilek Hakkani-Tur
  • Patent number: 10418027
    Abstract: An electronic device is provided, which includes a storage configured to store a voice recognition application including a wakeup word for entering into a voice command recognition mode, a sensor configured to sense a sound signal, and a processor configured to convert the sound signal into a digital signal and to transfer the converted digital signal to the application, wherein the application identifies whether a characteristic value of the digital signal is equal to or higher than a predetermined threshold level if the digital signal is received, performs voice recognition for the digital signal if the characteristic value of the digital signal is equal to or higher than the predetermined threshold level, and activates the voice command recognition mode if a keyword of a voice included in the digital signal coincides with the wakeup word.
    Type: Grant
    Filed: October 12, 2017
    Date of Patent: September 17, 2019
    Assignee: Samsung Electronics Co., Ltd.
    Inventors: Young-min Ko, Jin-geun Park
  • Patent number: 10410651
    Abstract: A de-reverberation control method and device of sound producing equipment are disclosed. The method includes: when a piece of equipment performs audio playing, a voice signal from a user is collected in real time; a relative position of the user with respect to the equipment and acoustic parameters of a room environment in which the equipment is located are acquired; according to one or more of the relative position and the acoustic parameters, a corresponding microphone in the equipment is selected, and a corresponding voice enhancement mode is called to perform de-reverberation; a voice command word from the user is acquired to control the equipment to perform a corresponding function, as a response to the user. The present solution can improve the recognition accuracy of a voice command, and improve user interaction experience.
    Type: Grant
    Filed: December 20, 2017
    Date of Patent: September 10, 2019
    Assignee: Beijing Xiaoniao Tingting Technology Co., Ltd.
    Inventors: Shasha Lou, Bo Li
  • Patent number: 10403286
    Abstract: A system and method for facilitating user interaction with a voice application. A VoiceXML browser runs locally on a mobile device. Supporting components, such as a Resource Manager, a Call Data Manager, and a MRCP Gateway Client support operation of the VoiceXML browser. The Resource Manager serves either those files stored locally on the mobile device, or files accessible via a network connection using the wireless or mobile broadband capabilities of the mobile device. The Call Data Manager communicates call-specific data back to the application's system of origin or another configured target system. The MRCP Gateway Client provides the VoiceXML browser with access to media resources via a MRCP Gateway Client.
    Type: Grant
    Filed: September 5, 2017
    Date of Patent: September 3, 2019
    Assignee: West Corporation
    Inventor: Chad Daniel Fox
  • Patent number: 10394519
    Abstract: A service providing apparatus including an occupant detector configured to detect presence of each of a plurality of occupants in a vehicle and a control unit including a CPU and a memory coupled to the CPU, wherein the CPU and the memory are configured to perform: estimating an individual feeling of the each of the plurality of occupants detected by the occupant detector; estimating a general mood representing an entire feeling of the plurality of occupants, based on the estimated individual feeling of the each of the plurality of occupants; deciding a service to be provided to a group of the plurality of occupants, based on the estimated general mood; and outputting a command to provide the decided service.
    Type: Grant
    Filed: September 26, 2017
    Date of Patent: August 27, 2019
    Assignee: Honda Motor Co., Ltd.
    Inventors: Tomoko Shintani, Hiromitsu Yuhara, Eisuke Soma
  • Patent number: 10395650
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for suppressing hotword triggers when detecting a hotword in recorded media are disclosed. In one aspect, a method includes the actions of receiving, by a computing device, audio corresponding to playback of an item of media content. The actions further include determining, by the computing device, that the audio includes an utterance of a predefined hotword and that the audio includes an audio watermark. The actions further include analyzing, by the computing device, the audio watermark. The actions further include based on analyzing the audio watermark, determining, by the computing device, whether to perform speech recognition on a portion of the audio following the predefined hotword.
    Type: Grant
    Filed: June 5, 2017
    Date of Patent: August 27, 2019
    Assignee: Google LLC
    Inventor: Ricardo Antonio Garcia
  • Patent number: 10387548
    Abstract: The technology relates to systems and methods for transcribing audio of a meeting. Upon transcribing the audio, the systems and methods can parse different portions of the transcribed audio so that they may attribute the different portions to a particular speaker. These transcribed portions that are attributed to a particular speaker are made available for viewing and interacting using a graphical user interface.
    Type: Grant
    Filed: April 12, 2016
    Date of Patent: August 20, 2019
    Assignee: NASDAQ, Inc.
    Inventors: Christopher Avore, Joseph McNeil, Christian Eckels
  • Patent number: 10388302
    Abstract: A method for processing an initial signal including a useful signal and added noise, which comprises a step of frequency selective analysis providing, starting from the initial signal, a plurality of wideband analysis signals each corresponding to one of the analyzed frequencies, and comprising the following actions: zero or more complex frequency translations, one or more undersampling operations, computation of the instantaneous Amplitude, of the instantaneous Phase, and of the instantaneous Frequency of the wideband analysis signals. This information then allows detection of modulations of signals included in high levels of noise and detection, with a good probability, of the presence of a signal in a high level of noise.
    Type: Grant
    Filed: December 23, 2015
    Date of Patent: August 20, 2019
    Inventor: Yves Reza
  • Patent number: 10388282
    Abstract: Embodiments cover a voice command device and a server computing device that communicates with the voice command device. In one embodiment, a voice command device comprises a speaker, a microphone, a wireless communication module, and a processing device. The processing device is to scan for wireless advertising packets from a plurality of medical devices at an interval and detect a wireless advertising packet from a medical device of the plurality of medical devices as a result of the scanning. The processing device is further to receive medical data for a living entity from the medical device and send the medical data to a server computing device, wherein the server computing device is to generate a message associated with the medical data. The processing device is to receive the message and output the message via the speaker.
    Type: Grant
    Filed: January 25, 2017
    Date of Patent: August 20, 2019
    Assignee: CliniCloud Inc.
    Inventors: Hon Weng Chong, An Lin
  • Patent number: 10360914
    Abstract: Using many speech recognition engines, one can select which one is best at any given iteration of sending a command to a device to be interpreted and carried out. Depending on the context, a different result of many results received from speech recognition engines is chosen. The context is determined based on window history, including rendered webpages represented by URLs previously displayed on the device or windows resulting from executed code on the computing device. In this manner, the operation of the computer is improved as a more accurate result of receiving audio and processing it to text many times is used.
    Type: Grant
    Filed: January 26, 2017
    Date of Patent: July 23, 2019
    Assignee: ESSENCE, INC
    Inventors: Holly R Corcoran, Barry Klein, Llewellyn Q Morake
  • Patent number: 10354656
    Abstract: Improvements in speaker identification and verification are provided via an attention model for speaker recognition and the end-to-end training thereof. A speaker discriminative convolutional neural network (CNN) is used to directly extract frame-level speaker features that are weighted and combined to form an utterance-level speaker recognition vector via the attention model. The CNN and attention model are jointly optimized via an end-to-end training algorithm that imitates the speaker recognition process and uses the most-similar utterances from imposters for each speaker.
    Type: Grant
    Filed: June 23, 2017
    Date of Patent: July 16, 2019
    Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC
    Inventors: Yong Zhao, Jinyu Li, Yifan Gong, Shixiong Zhang, Zhuo Chen
  • Patent number: 10348427
    Abstract: Determining effect of changes in parameters may include, during a time interval, rotating from setting a first parameter to a first value for a first time period, to setting the first parameter to a second value for a second time period such that the time interval includes multiple first time periods with the first parameter set to the first value sequenced with multiple second time periods with the first parameter set to the second value; obtaining, for the time interval, a first set of ratings corresponding to the first time periods and a second set of ratings corresponding to the second time periods; averaging, for the time interval, the first set of ratings to a first average rating and the second set of ratings to a second average rating; and correlating the first average rating to the first value and the second average rating to the second value.
    Type: Grant
    Filed: July 19, 2017
    Date of Patent: July 9, 2019
    Assignee: TLS Corp.
    Inventor: Barry Blesser
  • Patent number: 10347257
    Abstract: The present disclosure provides an audio signal encoding method and encoder, which relate to the communications field and can perform proper bit allocation for spectral coefficients of an audio signal. The method includes: splitting spectral coefficients of a current frame into subbands; acquiring quantized energy envelopes of the subbands; adjusting quantized energy envelope values of some subbands; performing bit allocation according to the adjusted quantized energy envelopes of the some subbands; and quantizing a spectral coefficient of a subband to which at least one bit is allocated after the bit allocation.
    Type: Grant
    Filed: July 14, 2017
    Date of Patent: July 9, 2019
    Assignee: HUAWEI TECHNOLOGIES CO., LTD.
    Inventors: Zexin Liu, Bin Wang, Lei Miao
  • Patent number: 10346548
    Abstract: An apparatus has a network interface circuit to receive a source sentence from a network connected client device. A processor is connected to the network interface circuit. A memory is connected to the processor. The memory stores translation data and instructions executed by the processor. The instructions executed by the processor operate a neural machine translation system. A translation hypothesis is formed from a prefix of a target sentence comprising an initial sequence of target words supplied by a user through an interface. The hypothesis is generated by the neural machine translation system that performs a constrained prefix decoding that repeatedly predicts a next word from previous target words. A suffix of the target sentence comprising a final sequence of words corresponding to a final sequence of words in the source sentence is formed using a beam search that constrains translation to match the prefix.
    Type: Grant
    Filed: September 26, 2017
    Date of Patent: July 9, 2019
    Assignee: Lilt, Inc.
    Inventors: Joern Wuebker, Spence Green, Minh-Thang Luong, John DeNero
  • Patent number: 10332514
    Abstract: Input context for a statistical dialog manager may be provided. Upon receiving a spoken query from a user, the query may be categorized according to at least one context clue. The spoken query may then be converted to text according to a statistical dialog manager associated with the category of the query and a response to the spoken query may be provided to the user.
    Type: Grant
    Filed: February 17, 2017
    Date of Patent: June 25, 2019
    Assignee: Microsoft Technology Licensing, LLC
    Inventors: Michael Bodell, John Bain, Robert Chambers, Karen M. Cross, Michael Kim, Nick Gedge, Daniel Frederick Penn, Kunal Patel, Edward Mark Tecot, Jeremy C. Waltmunson
  • Patent number: 10325600
    Abstract: Examples disclosed herein provide the ability to identify the location of an individual within a room by using a combination of microphone arrays and voice pattern matching. In one example, a computing device may extract a voice detected by microphones of a microphone array located in a room, perform voice pattern matching to identify an individual associated with the extracted voice, and determine a location of the individual in the room based on an intensity of the voice detected individually by the microphones of the microphone array.
    Type: Grant
    Filed: March 27, 2015
    Date of Patent: June 18, 2019
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventors: James M Mann, Harold Merkel, Silas Morris
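
The abstract of patent 10325600 says a location is determined "based on an intensity of the voice detected individually by the microphones" but does not say how. One simple way to illustrate the idea is an intensity-weighted centroid of the microphone positions. This is an assumed stand-in technique, not the patented method; the function name, the 2D coordinates, and the linear weighting are all hypothetical.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float]


def estimate_location(mic_positions: Sequence[Point],
                      intensities: Sequence[float]) -> Point:
    """Estimate a speaker position as the intensity-weighted centroid
    of the microphone positions.

    Assumption (not from the patent): louder microphones are closer to
    the speaker, and intensity is used directly as a linear weight.
    """
    if not mic_positions or len(mic_positions) != len(intensities):
        raise ValueError("need one intensity per microphone")
    if any(w < 0 for w in intensities):
        raise ValueError("intensities must be non-negative")

    total = sum(intensities)
    if total <= 0:
        raise ValueError("at least one microphone must detect the voice")

    # Weighted average of each coordinate, with intensity as the weight.
    x = sum(p[0] * w for p, w in zip(mic_positions, intensities)) / total
    y = sum(p[1] * w for p, w in zip(mic_positions, intensities)) / total
    return (x, y)
```

With two microphones at (0, 0) and (4, 0) and intensities 1 and 3, the estimate is pulled toward the louder microphone and lands at (3.0, 0.0). A real system would likely need a calibrated propagation model, since sound intensity does not fall off linearly with distance.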