Patents by Inventor Spyridon Matsoukas

Spyridon Matsoukas has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Wakeword and acoustic event detection

Patent number: 11043218

Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.

Type: Grant

Filed: June 26, 2019

Date of Patent: June 22, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
Scoring of natural language processing hypotheses

Patent number: 11043205

Abstract: A natural language processing system that can determine an overall score for a natural language hypothesis using hypothesis-specific component scores from different aspects of NLU processing. The individual component scores may be weighted by weights trained to optimize the overall scores relative to each other. Each domain of the system may be configured with a separate component that determines the overall score with respect to the domain. Natural language hypotheses can be ranked using the overall score either within a specific domain or on a cross-domain basis.

Type: Grant

Filed: December 12, 2017

Date of Patent: June 22, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Chengwei Su, Sankaranarayanan Ananthakrishnan, Spyridon Matsoukas, Rahul Gupta, Kelly James Vanee
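
The scoring idea in the abstract above can be illustrated with a minimal sketch, assuming the overall score is a simple weighted sum of component scores. The patent text does not specify the combination function; the component names, weight values, and the helpers `overall_score` and `rank_hypotheses` are hypothetical, and the weights here are set by hand rather than trained.

```python
from typing import Dict, List, Tuple


def overall_score(component_scores: Dict[str, float],
                  weights: Dict[str, float]) -> float:
    """Combine per-component scores into one score (assumed: weighted sum)."""
    # Components without a configured weight contribute nothing.
    return sum(weights.get(name, 0.0) * score
               for name, score in component_scores.items())


def rank_hypotheses(hypotheses: Dict[str, Dict[str, float]],
                    weights: Dict[str, float]) -> List[Tuple[str, float]]:
    """Return (hypothesis, overall score) pairs, best first."""
    scored = [(name, overall_score(scores, weights))
              for name, scores in hypotheses.items()]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical component scores for two competing hypotheses.
example_weights = {"intent": 0.5, "slots": 0.5}
example_hypotheses = {
    "PlayMusic": {"intent": 0.8, "slots": 0.6},
    "GetWeather": {"intent": 0.4, "slots": 0.2},
}
ranked = rank_hypotheses(example_hypotheses, example_weights)
```

Ranking within one domain or across domains would then amount to choosing which hypotheses are passed to `rank_hypotheses`.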
Voice profile updating

Patent number: 11004454

Abstract: Techniques for updating voice profiles used to perform user recognition are described. A system may use clustering techniques to update voice profiles. When the system receives audio data representing a spoken user input, the system may store the audio data. Periodically, the system may recall, from storage, audio data (representing previous user inputs). The system may identify clusters of the audio data, with each cluster including similar or identical speech characteristics. The system may determine a cluster is substantially similar to an existing voice profile. If this occurs, the system may create an updated voice profile using the original voice profile and the cluster of audio data.

Type: Grant

Filed: November 6, 2018

Date of Patent: May 11, 2021

Assignee: Amazon Technologies, Inc.

Inventors: Sundararajan Srinivasan, Arindam Mandal, Krishna Subramanian, Spyridon Matsoukas, Aparna Khare, Rohit Prasad
KEYWORD DETECTION MODELING USING CONTEXTUAL INFORMATION

Publication number: 20210134276

Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.

Type: Application

Filed: November 5, 2020

Publication date: May 6, 2021

Inventors: Rohit Prasad, Kenneth John Basye, Spyridon Matsoukas, Rajiv Ramachandran, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister
USER PRESENCE DETECTION

Publication number: 20210027798

Abstract: A speech-capture device can capture audio data during wakeword monitoring and use the audio data to determine if a user is present nearby the device, even if no wakeword is spoken. Audio such as speech, human originating sounds (e.g., coughing, sneezing), or other human related noises (e.g., footsteps, doors closing) can be used to detect audio. Audio frames are individually scored as to whether a human presence is detected in the particular audio frames. The scores are then smoothed relative to nearby frames to create a decision for a particular frame. Presence information can then be sent according to a periodic schedule to a remote device to create a presence “heartbeat” that regularly identifies whether a user is detected proximate to a speech-capture device.

Type: Application

Filed: September 16, 2020

Publication date: January 28, 2021

Inventors: Shiva Kumar Sundaram, Chao Wang, Shiv Naga Prasad Vitaladevuni, Spyridon Matsoukas, Arindam Mandal
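
The per-frame scoring and smoothing step described in the abstract above can be sketched as follows. This is only an illustration under stated assumptions: the abstract does not say which smoothing method is used, so a centered moving average is assumed here, and the function names, the `radius` parameter, and the `threshold` value are hypothetical.

```python
from typing import List


def smooth_scores(scores: List[float], radius: int = 2) -> List[float]:
    """Average each frame's score with its neighbors (assumed smoothing)."""
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - radius)                 # clip window at the start
        hi = min(len(scores), i + radius + 1)   # clip window at the end
        window = scores[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def presence_decisions(scores: List[float],
                       radius: int = 2,
                       threshold: float = 0.5) -> List[bool]:
    """Turn smoothed per-frame scores into per-frame presence decisions."""
    return [s >= threshold for s in smooth_scores(scores, radius)]


# A single high-scoring frame surrounded by silence is smoothed away,
# while a single low-scoring frame inside sustained activity is kept.
isolated_spike = presence_decisions([0.0, 0.0, 1.0, 0.0, 0.0], radius=1)
brief_dropout = presence_decisions([1.0, 1.0, 0.0, 1.0, 1.0], radius=1)
```

Smoothing of this kind trades a short delay in the decision for fewer spurious changes between neighboring frames.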
Keyword detection modeling using contextual information

Patent number: 10832662

Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.

Type: Grant

Filed: July 3, 2017

Date of Patent: November 10, 2020

Assignee: Amazon Technologies, Inc.

Inventors: Rohit Prasad, Kenneth John Basye, Spyridon Matsoukas, Rajiv Ramachandran, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister
User presence detection

Patent number: 10796716

Abstract: A speech-capture device can capture audio data during wakeword monitoring and use the audio data to determine if a user is present nearby the device, even if no wakeword is spoken. Audio such as speech, human originating sounds (e.g., coughing, sneezing), or other human related noises (e.g., footsteps, doors closing) can be used to detect audio. Audio frames are individually scored as to whether a human presence is detected in the particular audio frames. The scores are then smoothed relative to nearby frames to create a decision for a particular frame. Presence information can then be sent according to a periodic schedule to a remote device to create a presence “heartbeat” that regularly identifies whether a user is detected proximate to a speech-capture device.

Type: Grant

Filed: October 11, 2018

Date of Patent: October 6, 2020

Assignee: Amazon Technologies, Inc.

Inventors: Shiva Kumar Sundaram, Chao Wang, Shiv Naga Prasad Vitaladevuni, Spyridon Matsoukas, Arindam Mandal
INTENT RE-RANKER

Publication number: 20200279555

Abstract: Methods and systems for determining an intent of an utterance using contextual information associated with a requesting device are described herein. Voice activated electronic devices may, in some embodiments, be capable of displaying content using a display screen. Entity data representing the content rendered by the display screen may describe entities having similar attributes as an identified intent from natural language understanding processing. Natural language understanding processing may attempt to resolve one or more declared slots for a particular intent and may generate an initial list of intent hypotheses ranked to indicate which are most likely to correspond to the utterance. The entity data may be compared with the declared slots for the intent hypotheses, and the list of intent hypotheses may be re-ranked to account for matching slots from the contextual metadata. The top ranked intent hypothesis after re-ranking may then be selected as the utterance's intent.

Type: Application

Filed: March 11, 2020

Publication date: September 3, 2020

Inventors: Alexandra R. Shapiro, Melanie Chie Bomke Gens, Spyridon Matsoukas, Kellen Gillespie, Rahul Goel
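
The re-ranking step in the abstract above can be sketched roughly as below. This is a simplified illustration, not the patented method: it assumes that each matching slot adds a fixed bonus to the hypothesis score, which the abstract does not state. The `IntentHypothesis` class, the `rerank` function, the `boost` value, and the slot names are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Set


@dataclass
class IntentHypothesis:
    intent: str
    score: float              # initial NLU ranking score
    declared_slots: Set[str]  # slot names the intent declares


def rerank(hypotheses: List[IntentHypothesis],
           screen_entity_attributes: Set[str],
           boost: float = 0.1) -> List[IntentHypothesis]:
    """Re-rank hypotheses, favoring those whose declared slots match
    attributes of the entities currently shown on the display."""
    def adjusted(h: IntentHypothesis) -> float:
        matches = len(h.declared_slots & screen_entity_attributes)
        return h.score + boost * matches  # assumed: fixed bonus per match
    return sorted(hypotheses, key=adjusted, reverse=True)


# Hypothetical case: the screen shows a list of videos.
candidates = [
    IntentHypothesis("PlayMusic", 0.55, {"song_title", "artist"}),
    IntentHypothesis("PlayVideo", 0.50, {"video_title"}),
]
with_context = rerank(candidates, {"video_title", "genre"})
without_context = rerank(candidates, set())
```

With the on-screen context, the video intent overtakes the initially higher-scored music intent; without it, the original order is preserved.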
SPEECH BASED USER RECOGNITION

Publication number: 20200193967

Abstract: Systems, methods, and devices for verifying a user are disclosed. A speech-controlled device captures a spoken command, and sends audio data corresponding thereto to a server. The server performs ASR on the audio data to determine ASR confidence data. The server, in parallel, performs user verification on the audio data to determine user verification confidence data. The server may modify the user verification confidence data using the ASR confidence data. In addition or alternatively, the server may modify the user verification confidence data using at least one of a location of the speech-controlled device within a building, a type of the speech-controlled device, or a geographic location of the speech-controlled device.

Type: Application

Filed: December 23, 2019

Publication date: June 18, 2020

Inventors: Spyridon Matsoukas, Aparna Khare, Vishwanathan Krishnamoorthy, Shamitha Somashekar, Arindam Mandal
Speech processing optimizations based on microphone array

Patent number: 10679621

Abstract: Systems and methods for utilizing microphone array information for acoustic modeling are disclosed. Audio data may be received from a device having a microphone array configuration. Microphone configuration data may also be received that indicates the configuration of the microphone array. The microphone configuration data may be utilized as an input vector to an acoustic model, along with the audio data, to generate phoneme data. Additionally, the microphone configuration data may be utilized to train and/or generate acoustic models, select an acoustic model to perform speech recognition with, and/or to improve trigger sound detection.

Type: Grant

Filed: March 21, 2018

Date of Patent: June 9, 2020

Assignee: Amazon Technologies, Inc.

Inventors: Shiva Kumar Sundaram, Minhua Wu, Anirudh Raju, Spyridon Matsoukas, Arindam Mandal, Kenichi Kumatani
NATURAL LANGUAGE SPEECH PROCESSING APPLICATION SELECTION

Publication number: 20200152195

Abstract: Techniques for limiting natural language processing performed on input data are described. A system receives input data from a device. The input data corresponds to a command to be executed by the system. The system determines applications likely configured to execute the command. The system performs named entity recognition and intent classification with respect to only the applications likely configured to execute the command.

Type: Application

Filed: November 25, 2019

Publication date: May 14, 2020

Inventors: Ruhi Sarikaya, Rohit Prasad, Kerry Hammil, Spyridon Matsoukas, Nikko Strom, Frédéric Johan Georges Deramat, Stephen Frederick Potter, Young-Bum Kim
System command processing

Patent number: 10600419

Abstract: Techniques for performing command processing are described. A system receives, from a device, input data corresponding to a command. The system determines NLU processing results associated with multiple applications. The system also determines NLU confidences for the NLU processing results for each application. The system sends NLU processing results to a portion of the multiple applications, and receives output data or instructions from the portion of the applications. The system ranks the portion of the applications based at least in part on the NLU processing results associated with the portion of the applications as well as the output data or instructions provided by the portion of the applications. The system may also rank the portion of the applications using other data. The system causes content corresponding to output data or instructions provided by the highest ranked application to be output to a user.

Type: Grant

Filed: September 22, 2017

Date of Patent: March 24, 2020

Assignee: Amazon Technologies, Inc.

Inventors: Ruhi Sarikaya, Rohit Prasad, Kerry Hammil, Spyridon Matsoukas, Nikko Strom, Frédéric Johan Georges Deramat, Stephen Frederick Potter, Young-Bum Kim
Intent re-ranker

Patent number: 10600406

Abstract: Methods and systems for determining an intent of an utterance using contextual information associated with a requesting device are described herein. Voice activated electronic devices may, in some embodiments, be capable of displaying content using a display screen. Entity data representing the content rendered by the display screen may describe entities having similar attributes as an identified intent from natural language understanding processing. Natural language understanding processing may attempt to resolve one or more declared slots for a particular intent and may generate an initial list of intent hypotheses ranked to indicate which are most likely to correspond to the utterance. The entity data may be compared with the declared slots for the intent hypotheses, and the list of intent hypotheses may be re-ranked to account for matching slots from the contextual metadata. The top ranked intent hypothesis after re-ranking may then be selected as the utterance's intent.

Type: Grant

Filed: March 20, 2017

Date of Patent: March 24, 2020

Assignee: Amazon Technologies, Inc.

Inventors: Alexandra R. Shapiro, Melanie Chie Bomke Gens, Spyridon Matsoukas, Kellen Gillespie, Rahul Goel
Speech based user recognition

Patent number: 10522134

Abstract: Systems, methods, and devices for verifying a user are disclosed. A speech-controlled device captures a spoken command, and sends audio data corresponding thereto to a server. The server performs ASR on the audio data to determine ASR confidence data. The server, in parallel, performs user verification on the audio data to determine user verification confidence data. The server may modify the user verification confidence data using the ASR confidence data. In addition or alternatively, the server may modify the user verification confidence data using at least one of a location of the speech-controlled device within a building, a type of the speech-controlled device, or a geographic location of the speech-controlled device.

Type: Grant

Filed: December 22, 2016

Date of Patent: December 31, 2019

Assignee: Amazon Technologies, Inc.

Inventors: Spyridon Matsoukas, Aparna Khare, Vishwanathan Krishnamoorthy, Shamitha Somashekar, Arindam Mandal
Natural language speech processing application selection

Patent number: 10504512

Abstract: Techniques for limiting natural language processing performed on input data are described. A system receives input data from a device. The input data corresponds to a command to be executed by the system. The system determines applications likely configured to execute the command. The system performs named entity recognition and intent classification with respect to only the applications likely configured to execute the command.

Type: Grant

Filed: September 22, 2017

Date of Patent: December 10, 2019

Assignee: Amazon Technologies, Inc.

Inventors: Ruhi Sarikaya, Rohit Prasad, Kerry Hammil, Spyridon Matsoukas, Nikko Strom, Frédéric Johan Georges Deramat, Stephen Frederick Potter, Young-Bum Kim
Using system command utterances to generate a speaker profile

Patent number: 10490195

Abstract: Systems, methods, and devices related to establishing voice identity profiles for use with voice-controlled devices are provided. The embodiments disclosed enhance user experience by customizing the enrollment process to utilize voice recognition for each user based on historical information which can be used in the selection process of phrases a user speaks during enrollment of a voice recognition function or skill. The selection process can utilize phrases that have already been spoken to the electronic device; it can utilize phrases, contacts, or other personalized information it can obtain from the user account of the person enrolling; it can use any of the information just described to select specific words to enhance the probability of achieving higher phonetic matches based on words the individual user is more likely to speak to the device.

Type: Grant

Filed: September 26, 2017

Date of Patent: November 26, 2019

Assignee: Amazon Technologies, Inc.

Inventors: Vishwanathan Krishnamoorthy, Sundararajan Srinivasan, Spyridon Matsoukas, Aparna Khare, Arindam Mandal, Krishna Subramanian, Gregory Michael Hart
Keyword spotting using multi-task configuration

Patent number: 10304440

Abstract: An approach to keyword spotting makes use of acoustic parameters that are trained on a keyword spotting task as well as on a second speech recognition task, for example, a large vocabulary continuous speech recognition task. The parameters may be optimized according to a weighted measure that weighs the keyword spotting task more highly than the other task, and that weighs utterances of a keyword more highly than utterances of other speech. In some applications, a keyword spotter configured with the acoustic parameters is used for trigger or wake word detection.

Type: Grant

Filed: June 30, 2016

Date of Patent: May 28, 2019

Assignee: Amazon Technologies, Inc.

Inventors: Sankaran Panchapagesan, Bjorn Hoffmeister, Arindam Mandal, Aparna Khare, Shiv Naga Prasad Vitaladevuni, Spyridon Matsoukas, Ming Sun
User presence detection

Patent number: 10121494

Abstract: A speech-capture device can capture audio data during wakeword monitoring and use the audio data to determine if a user is present nearby the device, even if no wakeword is spoken. Audio such as speech, human originating sounds (e.g., coughing, sneezing), or other human related noises (e.g., footsteps, doors closing) can be used to detect audio. Audio frames are individually scored as to whether a human presence is detected in the particular audio frames. The scores are then smoothed relative to nearby frames to create a decision for a particular frame. Presence information can then be sent according to a periodic schedule to a remote device to create a presence “heartbeat” that regularly identifies whether a user is detected proximate to a speech-capture device.

Type: Grant

Filed: March 30, 2017

Date of Patent: November 6, 2018

Assignee: Amazon Technologies, Inc.

Inventors: Shiva Kumar Sundaram, Chao Wang, Shiv Naga Prasad Vitaladevuni, Spyridon Matsoukas, Arindam Mandal
Speech processing with learned representation of user interaction history

Patent number: 10032463

Abstract: An automatic speech recognition (“ASR”) system produces, for particular users, customized speech recognition results by using data regarding prior interactions of the users with the system. A portion of the ASR system (e.g., a neural-network-based language model) can be trained to produce an encoded representation of a user's interactions with the system based on, e.g., transcriptions of prior utterances made by the user. This user-specific encoded representation of interaction history is then used by the language model to customize ASR processing for the user.

Type: Grant

Filed: December 29, 2015

Date of Patent: July 24, 2018

Assignee: Amazon Technologies, Inc.

Inventors: Ariya Rastrow, Nikko Ström, Spyridon Matsoukas, Markus Dreyer, Ankur Gandhe, Denis Sergeyevich Filimonov, Julian Chan, Rohit Prasad
Class-based discriminative training of speech models

Patent number: 9892726

Abstract: Features are disclosed for modifying a statistical model to more accurately discriminate between classes of input data. A subspace of the total model parameter space can be learned such that individual points in the subspace, corresponding to the various classes, are discriminative with respect to the classes. The subspace can be learned using an iterative process whereby an initial subspace is used to generate data and maximize an objective function. The objective function can correspond to maximizing the posterior probability of the correct class for a given input. The initial subspace, data, and objective function can be used to generate a new subspace that better discriminates between classes. The process may be repeated as desired. A model modified using such a subspace can be used to classify input data.

Type: Grant

Filed: December 17, 2014

Date of Patent: February 13, 2018

Assignee: Amazon Technologies, Inc.

Inventors: Sri Venkata Surya Siva Rama Krishna Garimella, Spyridon Matsoukas, Ariya Rastrow, Bjorn Hoffmeister
