Patents Examined by Paras D Shah

Contextual voice user interface

Patent number: 10446147

Abstract: Techniques for providing a contextual voice user interface that enables a user to query a speech processing system with respect to the decisions made to answer the user's command are described. The speech processing system may store speech processing pipeline data used to process a command. At some point after the system outputs content deemed responsive to the command, a user may speak an utterance corresponding to an inquiry with respect to the processing performed to respond to the command. For example, the user may state “why did you tell me that?” In response thereto, the speech processing system may determine the stored speech processing pipeline data used to respond to the command, and may generate output audio data that describes the data and computing decisions involved in determining the content deemed responsive to the command.

Type: Grant

Filed: June 27, 2017

Date of Patent: October 15, 2019

Assignee: AMAZON TECHNOLOGIES, INC.

Inventors: Michael James Moniz, Abishek Ravi, Ryan Scott Aldrich, Michael Bennett Adams

Compatible multi-channel coding-decoding

Patent number: 10433091

Abstract: In processing a multi-channel audio signal having at least three original channels, a first downmix channel and a second downmix channel are provided, which are derived from the original channels. For a selected original channel, channel side information are calculated such that a downmix channel or a combined downmix channel including the first and the second downmix channels, when weighted using the channel side information, results in an approximation of the selected original channel. The channel side information and the first and second downmix channels form output data for transmission to a decoder. A low level decoder only decodes the first and second downmix channels. A high level decoder provides a full multi-channel audio signal based on the downmix channels and the channel side information.
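The weighting idea in the abstract above can be illustrated with a small sketch. It assumes, purely for illustration, that the channel side information is a single least-squares gain per selected channel and that the downmix uses fixed mixing weights; the abstract specifies neither, and the names `side_gain` and `approximate` are hypothetical.

```python
def side_gain(original, downmix):
    """Least-squares gain g minimizing sum((original - g * downmix) ** 2).

    Assumption: the side information is one scalar gain per channel.
    """
    energy = sum(d * d for d in downmix)
    if energy == 0.0:
        return 0.0
    return sum(o * d for o, d in zip(original, downmix)) / energy


def approximate(downmix, gain):
    """Weight a downmix channel by the side information."""
    return [gain * d for d in downmix]


# Three original channels (left, right, center) of a toy signal.
left = [1.0, 0.5, -0.5, 0.0]
right = [0.0, 0.5, 0.5, 1.0]
center = [0.2, 0.2, 0.2, 0.2]

# Two downmix channels derived from the originals (assumed mixing weights).
down_l = [l + 0.5 * c for l, c in zip(left, center)]
down_r = [r + 0.5 * c for r, c in zip(right, center)]
combined = [a + b for a, b in zip(down_l, down_r)]

# Encoder side: one gain per selected original channel.
gain_left = side_gain(left, down_l)
gain_center = side_gain(center, combined)

# High-level decoder side: weight the downmix to approximate the original.
left_hat = approximate(down_l, gain_left)
center_hat = approximate(combined, gain_center)
```

In this sketch a low-level decoder would simply play `down_l` and `down_r` and ignore the gains, which is what makes the scheme backward compatible.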

Type: Grant

Filed: April 5, 2019

Date of Patent: October 1, 2019

Assignee: Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V.

Inventors: Juergen Herre, Johannes Hilpert, Stefan Geyersberger, Andreas Hoelzer, Claus Spenger

Dynamic conversation management based on message context

Patent number: 10425364

Abstract: According to an embodiment of the present invention, a system dynamically processes an incoming message. Initially, a system receives a message including a plurality of different contexts. In response to receiving the message, a processor in the system partitions the message into a plurality of sections each associated with a corresponding context, wherein each context includes an inquiry to receive a reply. The processor further generates one or more replies for a corresponding context and generates a response to the message including the sections and the one or more replies, wherein each reply is inserted into a section in the response associated with the corresponding context. Embodiments of the present invention further include a method and computer program product for dynamically processing messages in substantially the same manner described above.

Type: Grant

Filed: June 26, 2017

Date of Patent: September 24, 2019

Assignee: International Business Machines Corporation

Inventors: Paul R. Bastide, Sathyanarayanan Srinivasan

Compatible multi-channel coding/decoding

Patent number: 10425757

Abstract: In processing a multi-channel audio signal having at least three original channels, a first downmix channel and a second downmix channel are provided, which are derived from the original channels. For a selected original channel, channel side information are calculated such that a downmix channel or a combined downmix channel including the first and the second downmix channels, when weighted using the channel side information, results in an approximation of the selected original channel. The channel side information and the first and second downmix channels form output data for transmission to a decoder. A low level decoder only decodes the first and second downmix channels. A high level decoder provides a full multi-channel audio signal based on the downmix channels and the channel side information.

Type: Grant

Filed: April 5, 2019

Date of Patent: September 24, 2019

Assignee: Fraunhofer-Gesellschaft zur Foerderung der angewandten Forschung e.V

Inventors: Juergen Herre, Johannes Hilpert, Stefan Geyersberger, Andreas Hoelzer, Claus Spenger

Turn-based reinforcement learning for dialog management

Patent number: 10424302

Abstract: Techniques are described related to turn-based reinforcement learning for dialog management. In various implementations, dialog states and corresponding responsive actions generated during a multi-turn human-to-computer dialog session may be obtained. A plurality of turn-level training instances may be generated, each including: a given dialog state of the plurality of dialog states at an outset of a given turn of the human-to-computer dialog session; and a given responsive action that was selected based on the given dialog state. One or more of the turn-level training instances may further include a turn-level feedback value that reflects on the given responsive action selected during the given turn. A reward value may be generated based on an outcome of the human-to-computer dialog session. The dialog management policy model may be trained based on turn-level feedback values of the turn-level training instance(s) and the reward value.
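One possible shape for the turn-level training instances described above is sketched below. The field names, the `feedback_weight` parameter, and the way the session reward and turn feedback are combined are all assumptions made for illustration; the abstract does not state how the two signals are merged.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class TurnInstance:
    dialog_state: str               # state at the outset of the turn
    action: str                     # responsive action chosen for that state
    turn_feedback: Optional[float]  # None when the turn has no feedback
    session_reward: float           # reward derived from the session outcome


def build_instances(
    turns: List[Tuple[str, str, Optional[float]]], session_reward: float
) -> List[TurnInstance]:
    """Turn a logged dialog session into turn-level training instances."""
    return [TurnInstance(s, a, f, session_reward) for s, a, f in turns]


def training_signal(instance: TurnInstance, feedback_weight: float = 0.5) -> float:
    """Hypothetical combination: session reward plus weighted turn feedback."""
    if instance.turn_feedback is None:
        return instance.session_reward
    return instance.session_reward + feedback_weight * instance.turn_feedback


session = [
    ("greeting", "ask_cuisine", None),
    ("cuisine=thai", "ask_time", 1.0),
    ("time=7pm", "offer_wrong_place", -1.0),
]
instances = build_instances(session, session_reward=1.0)
signals = [training_signal(i) for i in instances]
```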

Type: Grant

Filed: October 12, 2017

Date of Patent: September 24, 2019

Assignee: GOOGLE LLC

Inventors: Pararth Shah, Larry Paul Heck, Dilek Hakkani-Tur

Electronic device and method for controlling the same

Patent number: 10418027

Abstract: An electronic device is provided, which includes a storage configured to store a voice recognition application including a wakeup word for entering into a voice command recognition mode, a sensor configured to sense a sound signal, and a processor configured to convert the sound signal into a digital signal and to transfer the converted digital signal to the application, wherein the application identifies whether a characteristic value of the digital signal is equal to or higher than a predetermined threshold level if the digital signal is received, performs voice recognition for the digital signal if the characteristic value of the digital signal is equal to or higher than the predetermined threshold level, and activates the voice command recognition mode if a keyword of a voice included in the digital signal coincides with the wakeup word.
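The two-stage gating in the abstract above (check a characteristic value first, run recognition only if it clears a threshold, then compare against the wakeup word) might look like the following sketch. It assumes the characteristic value is RMS amplitude, which the abstract does not specify, and `fake_recognizer` is a hypothetical stand-in for a real speech recognizer.

```python
import math


def rms(samples):
    """Root-mean-square amplitude, used here as the assumed characteristic value."""
    if not samples:
        return 0.0
    return math.sqrt(sum(s * s for s in samples) / len(samples))


def should_activate(samples, recognize, wakeup_word, threshold):
    """Return True when voice command recognition mode should be activated."""
    if rms(samples) < threshold:
        return False  # below the threshold: skip recognition entirely
    words = recognize(samples).lower().split()
    return wakeup_word.lower() in words


def fake_recognizer(samples):
    # Hypothetical recognizer that always hears the same phrase.
    return "hey assistant what time is it"


quiet = [0.001, -0.002, 0.001, 0.0]
loud = [0.4, -0.5, 0.45, -0.35]
activated_quiet = should_activate(quiet, fake_recognizer, "assistant", threshold=0.1)
activated_loud = should_activate(loud, fake_recognizer, "assistant", threshold=0.1)
```

Placing the cheap threshold test before recognition means the costlier recognizer runs only on signals loud enough to plausibly contain speech.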

Type: Grant

Filed: October 12, 2017

Date of Patent: September 17, 2019

Assignee: Samsung Electronics Co., Ltd.

Inventors: Young-min Ko, Jin-geun Park

De-reverberation control method and device of sound producing equipment

Patent number: 10410651

Abstract: A de-reverberation control method and device of sound producing equipment are disclosed. The method includes that: when a piece of equipment performs audio playing, a voice signal from a user is collected in real time; a relative position of the user with respect to the equipment and acoustic parameters of a room environment in which the equipment is located, are acquired; according to one or more of the relative position and the acoustic parameters, a corresponding microphone in the equipment is selected, and a corresponding voice enhancement mode is called to perform de-reverberation; a voice command word from the user is acquired to control the equipment to perform a corresponding function, as a response to the user. The present solution can improve the recognition accuracy of a voice command, and improve user interaction experience.

Type: Grant

Filed: December 20, 2017

Date of Patent: September 10, 2019

Assignee: Beijing Xiaoniao Tingting Technology Co., Ltd.

Inventors: Shasha Lou, Bo Li

VoiceXML browser and supporting components for mobile devices

Patent number: 10403286

Abstract: A system and method for facilitating user interaction with a voice application. A VoiceXML browser runs locally on a mobile device. Supporting components, such as a Resource Manager, a Call Data Manager, and an MRCP Gateway Client support operation of the VoiceXML browser. The Resource Manager serves either those files stored locally on the mobile device, or files accessible via a network connection using the wireless or mobile broadband capabilities of the mobile device. The Call Data Manager communicates call-specific data back to the application's system of origin or another configured target system. The MRCP Gateway Client provides the VoiceXML browser with access to media resources via an MRCP Gateway Client.

Type: Grant

Filed: September 5, 2017

Date of Patent: September 3, 2019

Assignee: West Corporation

Inventor: Chad Daniel Fox

Service providing apparatus and method

Patent number: 10394519

Abstract: A service providing apparatus including an occupant detector configured to detect presence of each of a plurality of occupants in a vehicle and a control unit including a CPU and a memory coupled to the CPU, wherein the CPU and the memory are configured to perform: estimating an individual feeling of the each of the plurality of occupants detected by the occupant detector; estimating a general mood representing an entire feeling of the plurality of occupants, based on the estimated individual feeling of the each of the plurality of occupants; deciding a service to be provided to a group of the plurality of occupants, based on the estimated general mood; and outputting a command to provide the decided service.
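The estimate-aggregate-decide flow above can be sketched in a few lines. This sketch assumes each occupant's feeling is already available as a score in [-1, 1] and that the general mood is a plain mean; both are illustrative assumptions, as are the service names and the cut-off values.

```python
def general_mood(individual_feelings):
    """Aggregate per-occupant feeling scores into one group mood (assumed: mean)."""
    if not individual_feelings:
        raise ValueError("no occupants detected")
    return sum(individual_feelings.values()) / len(individual_feelings)


def decide_service(mood):
    """Map the group mood to a service; thresholds and services are hypothetical."""
    if mood < -0.3:
        return "play calming music"
    if mood > 0.3:
        return "suggest a scenic detour"
    return "no change"


# Scores in [-1, 1]: negative is unhappy, positive is happy.
feelings = {"driver": 0.8, "front_passenger": 0.6, "rear_left": 0.1}
mood = general_mood(feelings)
service = decide_service(mood)
```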

Type: Grant

Filed: September 26, 2017

Date of Patent: August 27, 2019

Assignee: Honda Motor Co., Ltd.

Inventors: Tomoko Shintani, Hiromitsu Yuhara, Eisuke Soma

Recorded media hotword trigger suppression

Patent number: 10395650

Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for suppressing hotword triggers when detecting a hotword in recorded media are disclosed. In one aspect, a method includes the actions of receiving, by a computing device, audio corresponding to playback of an item of media content. The actions further include determining, by the computing device, that the audio includes an utterance of a predefined hotword and that the audio includes an audio watermark. The actions further include analyzing, by the computing device, the audio watermark. The actions further include based on analyzing the audio watermark, determining, by the computing device, whether to perform speech recognition on a portion of the audio following the predefined hotword.

Type: Grant

Filed: June 5, 2017

Date of Patent: August 27, 2019

Assignee: Google LLC

Inventor: Ricardo Antonio Garcia

Systems and methods for transcript processing

Patent number: 10387548

Abstract: The technology relates to systems and methods for transcribing audio of a meeting. Upon transcribing the audio, the systems and methods can parse different portions of the transcribed audio so that they may attribute the different portions to a particular speaker. These transcribed portions that are attributed to a particular speaker are made available for viewing and interacting using a graphical user interface.

Type: Grant

Filed: April 12, 2016

Date of Patent: August 20, 2019

Assignee: NASDAQ, Inc.

Inventors: Christopher Avore, Joseph McNeil, Christian Eckels

Methods for processing and analyzing a signal, and devices implementing such methods

Patent number: 10388302

Abstract: A method for processing an initial signal that includes a useful signal and added noise, which comprises a step of frequency selective analysis providing, starting from the initial signal, a plurality of wideband analysis signals each corresponding to one of the analyzed frequencies, and comprising the following actions: zero or more complex frequency translations, one or more undersampling operations, computation of the instantaneous Amplitude, of the instantaneous Phase, and of the instantaneous Frequency of the wideband analysis signals. This information then makes it possible to detect modulations of signals included in high levels of noise and to detect with a good probability the presence of a signal in a high level of noise.

Type: Grant

Filed: December 23, 2015

Date of Patent: August 20, 2019

Inventor: Yves Reza

Medical voice command device

Patent number: 10388282

Abstract: Embodiments cover a voice command device and a server computing device that communicates with the voice command device. In one embodiment, a voice command device comprises a speaker, a microphone, a wireless communication module, and a processing device. The processing device is to scan for wireless advertising packets from a plurality of medical devices at an interval and detect a wireless advertising packet from a medical device of the plurality of medical devices as a result of the scanning. The processing device is further to receive medical data for a living entity from the medical device and send the medical data to a server computing device, wherein the server computing device is to generate a message associated with the medical data. The processing device is to receive the message and output the message via the speaker.

Type: Grant

Filed: January 25, 2017

Date of Patent: August 20, 2019

Assignee: CliniCloud Inc.

Inventors: Hon Weng Chong, An Lin

Speech recognition based on context and multiple recognition engines

Patent number: 10360914

Abstract: Using many speech recognition engines, one can select which one is best at any given iteration of sending a command to a device to be interpreted and carried out. Depending on the context, a different result of many results received from speech recognition engines is chosen. The context is determined based on window history, including rendered webpages represented by URLs previously displayed on the device or windows resulting from executed code on the computing device. In this manner, the operation of the computer is improved as a more accurate result of receiving audio and processing it to text many times is used.

Type: Grant

Filed: January 26, 2017

Date of Patent: July 23, 2019

Assignee: ESSENCE, INC

Inventors: Holly R Corcoran, Barry Klein, Llewellyn Q Morake

Speaker recognition

Patent number: 10354656

Abstract: Improvements in speaker identification and verification are provided via an attention model for speaker recognition and the end-to-end training thereof. A speaker discriminative convolutional neural network (CNN) is used to directly extract frame-level speaker features that are weighted and combined to form an utterance-level speaker recognition vector via the attention model. The CNN and attention model are jointly optimized via an end-to-end training algorithm that imitates the speaker recognition process and uses the most-similar utterances from imposters for each speaker.

Type: Grant

Filed: June 23, 2017

Date of Patent: July 16, 2019

Assignee: MICROSOFT TECHNOLOGY LICENSING, LLC

Inventors: Yong Zhao, Jinyu Li, Yifan Gong, Shixiong Zhang, Zhuo Chen

Optimizing parameters in deployed systems operating in delayed feedback real world environments

Patent number: 10348427

Abstract: Determining effect of changes in parameters may include, during a time interval, rotating from setting a first parameter to a first value for a first time period, to setting the first parameter to a second value for a second time period such that the time interval includes multiple first time periods with the first parameter set to the first value sequenced with multiple second time periods with the first parameter set to the second value; obtaining, for the time interval, a first set of ratings corresponding to the first time periods and a second set of ratings corresponding to the second time periods; averaging, for the time interval, the first set of ratings to a first average rating and the second set of ratings to a second average rating; and correlating the first average rating to the first value and the second average rating to the second value.
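The rotate-then-average procedure above reduces to grouping ratings by the parameter value that was active in each time period. A minimal sketch follows, assuming one rating per period and using the hypothetical labels "A" and "B" for the first and second values.

```python
from collections import defaultdict


def average_rating_by_value(periods):
    """Average the ratings observed under each parameter value.

    periods: list of (parameter_value, rating) pairs in time order, where the
    parameter value alternates between settings across the interval.
    """
    totals = defaultdict(lambda: [0.0, 0])  # value -> [sum of ratings, count]
    for value, rating in periods:
        totals[value][0] += rating
        totals[value][1] += 1
    return {value: total / count for value, (total, count) in totals.items()}


# The parameter rotates A, B, A, B, ... so both values see similar conditions.
schedule = [("A", 3.0), ("B", 4.0), ("A", 3.5), ("B", 4.5), ("A", 2.5), ("B", 5.0)]
averages = average_rating_by_value(schedule)
best_value = max(averages, key=averages.get)
```

Interleaving the two values within one interval, rather than testing them in separate blocks, spreads slow drifts in the environment across both sets of ratings.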

Type: Grant

Filed: July 19, 2017

Date of Patent: July 9, 2019

Assignee: TLS Corp.

Inventor: Barry Blesser

Encoding method and apparatus

Patent number: 10347257

Abstract: The present disclosure provides an audio signal encoding method and encoder, which relate to the communications field and can perform proper bit allocation for spectral coefficients of an audio signal. The method includes: splitting spectral coefficients of a current frame into subbands; acquiring quantized energy envelopes of the subbands; adjusting quantized energy envelope values of some subbands; performing bit allocation according to the adjusted quantized energy envelopes of those subbands; and quantizing a spectral coefficient of a subband to which at least one bit is allocated after the bit allocation.

Type: Grant

Filed: July 14, 2017

Date of Patent: July 9, 2019

Assignee: HUAWEI TECHNOLOGIES CO., LTD.

Inventors: Zexin Liu, Bin Wang, Lei Miao

Apparatus and method for prefix-constrained decoding in a neural machine translation system

Patent number: 10346548

Abstract: An apparatus has a network interface circuit to receive a source sentence from a network connected client device. A processor is connected to the network interface circuit. A memory is connected to the processor. The memory stores translation data and instructions executed by the processor. The instructions executed by the processor operate a neural machine translation system. A translation hypothesis is formed from a prefix of a target sentence comprising an initial sequence of target words supplied by a user through an interface. The hypothesis is generated by the neural machine translation system that performs a constrained prefix decoding that repeatedly predicts a next word from previous target words. A suffix of the target sentence comprising a final sequence of words corresponding to a final sequence of words in the source sentence is formed using a beam search that constrains translation to match the prefix.
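Prefix-constrained decoding as described above can be illustrated with a toy beam search. The sketch replaces the neural translation model with a hypothetical bigram table (`BIGRAMS`) and ignores the source sentence, so it shows only the constraint mechanism: the user-supplied prefix is forced, and the search is run only over the suffix.

```python
import math

# Hypothetical toy model: P(next word | previous word).
BIGRAMS = {
    "<s>": {"the": 0.6, "a": 0.4},
    "the": {"cat": 0.5, "dog": 0.5},
    "a": {"cat": 0.3, "dog": 0.7},
    "cat": {"sleeps": 0.9, "</s>": 0.1},
    "dog": {"barks": 0.8, "</s>": 0.2},
    "sleeps": {"</s>": 1.0},
    "barks": {"</s>": 1.0},
}


def prefix_constrained_beam_search(prefix, beam_width=2, max_len=10):
    """Return the best-scoring word sequence that starts with `prefix`."""
    # Score the forced prefix under the model; it is not searched over.
    score, prev = 0.0, "<s>"
    for word in prefix:
        score += math.log(BIGRAMS[prev].get(word, 1e-9))
        prev = word
    beams = [(score, list(prefix))]
    finished = []
    for _ in range(max_len):
        candidates = []
        for s, words in beams:
            last = words[-1] if words else "<s>"
            for nxt, p in BIGRAMS[last].items():
                if nxt == "</s>":
                    finished.append((s + math.log(p), words))
                else:
                    candidates.append((s + math.log(p), words + [nxt]))
        if not candidates:
            break
        # Keep only the highest-scoring partial hypotheses.
        beams = sorted(candidates, key=lambda c: c[0], reverse=True)[:beam_width]
    if finished:
        return max(finished, key=lambda c: c[0])[1]
    return beams[0][1]


translation = prefix_constrained_beam_search(["a"])
```

Because every hypothesis is seeded with the prefix, every output begins with the words the user typed, whatever the model would otherwise have preferred.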

Type: Grant

Filed: September 26, 2017

Date of Patent: July 9, 2019

Assignee: Lilt, Inc.

Inventors: Joern Wuebker, Spence Green, Minh-Thang Luong, John DeNero

Using multiple modality input to feedback context for natural language understanding

Patent number: 10332514

Abstract: Input context for a statistical dialog manager may be provided. Upon receiving a spoken query from a user, the query may be categorized according to at least one context clue. The spoken query may then be converted to text according to a statistical dialog manager associated with the category of the query and a response to the spoken query may be provided to the user.

Type: Grant

Filed: February 17, 2017

Date of Patent: June 25, 2019

Assignee: Microsoft Technology Licensing, LLC

Inventors: Michael Bodell, John Bain, Robert Chambers, Karen M. Cross, Michael Kim, Nick Gedge, Daniel Frederick Penn, Kunal Patel, Edward Mark Tecot, Jeremy C. Waltmunson

Locating individuals using microphone arrays and voice pattern matching

Patent number: 10325600

Abstract: Examples disclosed herein provide the ability to identify the location of an individual within a room by using a combination of microphone arrays and voice pattern matching. In one example, a computing device may extract a voice detected by microphones of a microphone array located in a room, perform voice pattern matching to identify an individual associated with the extracted voice, and determine a location of the individual in the room based on an intensity of the voice detected individually by the microphones of the microphone array.
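The location step above, which uses the intensity detected individually by each microphone, could be approximated as follows. This sketch assumes an intensity-weighted centroid of known microphone positions, which is only one plausible reading of the abstract, and it omits the voice pattern matching step that identifies the speaker.

```python
def estimate_location(mic_positions, intensities):
    """Estimate an (x, y) position as the intensity-weighted centroid.

    mic_positions: list of (x, y) coordinates in metres (assumed known).
    intensities: voice intensity measured at each microphone, same order.
    """
    total = sum(intensities)
    if total <= 0:
        raise ValueError("no voice energy detected")
    x = sum(p[0] * w for p, w in zip(mic_positions, intensities)) / total
    y = sum(p[1] * w for p, w in zip(mic_positions, intensities)) / total
    return (x, y)


# Four microphones at the corners of a 4 m x 4 m room.
mics = [(0.0, 0.0), (4.0, 0.0), (0.0, 4.0), (4.0, 4.0)]
levels = [1.0, 3.0, 1.0, 3.0]  # louder on the x = 4 side of the room
location = estimate_location(mics, levels)
```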

Type: Grant

Filed: March 27, 2015

Date of Patent: June 18, 2019

Assignee: Hewlett-Packard Development Company, L.P.

Inventors: James M Mann, Harold Merkel, Silas Morris
