Patents by Inventor Shiv Naga Prasad Vitaladevuni
Shiv Naga Prasad Vitaladevuni has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20210027798
Abstract: A speech-capture device can capture audio data during wakeword monitoring and use the audio data to determine if a user is present nearby the device, even if no wakeword is spoken. Audio such as speech, human originating sounds (e.g., coughing, sneezing), or other human related noises (e.g., footsteps, doors closing) can be used to detect presence. Audio frames are individually scored as to whether a human presence is detected in the particular audio frames. The scores are then smoothed relative to nearby frames to create a decision for a particular frame. Presence information can then be sent according to a periodic schedule to a remote device to create a presence “heartbeat” that regularly identifies whether a user is detected proximate to a speech-capture device.
Type: Application
Filed: September 16, 2020
Publication date: January 28, 2021
Inventors: Shiva Kumar Sundaram, Chao Wang, Shiv Naga Prasad Vitaladevuni, Spyridon Matsoukas, Arindam Mandal
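The per-frame scoring and smoothing step in the abstract above can be illustrated with a minimal sketch. The moving-average window, the `radius` and `threshold` parameters, and the function names are assumptions made for illustration; the abstract does not say which smoothing method or decision rule the patent uses.

```python
from typing import List


def smooth_scores(scores: List[float], radius: int = 2) -> List[float]:
    """Average each frame score with its neighbours within `radius` frames.

    A plain moving average is an assumed stand-in for the smoothing step.
    """
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - radius)
        hi = min(len(scores), i + radius + 1)
        window = scores[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def presence_decisions(scores: List[float], radius: int = 2,
                       threshold: float = 0.5) -> List[bool]:
    """Turn smoothed per-frame scores into per-frame presence decisions."""
    return [s >= threshold for s in smooth_scores(scores, radius)]


# Hypothetical per-frame presence scores from some upstream classifier.
frame_scores = [0.1, 0.9, 0.1, 0.1, 0.8, 0.9, 0.7, 0.9, 0.2, 0.1]
decisions = presence_decisions(frame_scores)
```

Smoothing means a single high-scoring frame surrounded by low scores does not flip the decision on its own, while a sustained run of high scores does.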
-
Patent number: 10872599
Abstract: A device monitors audio data for a predetermined and/or user-defined wakeword. The device detects an error in detecting the wakeword in the audio data, such as a false-positive detection of the wakeword or a false-negative detection of the wakeword. Upon detecting the error, the device updates a model trained to detect the wakeword to create an updated trained model; the updated trained model reduces or eliminates further errors in detecting the wakeword. Data corresponding to the updated trained model may be collected by a server from a plurality of devices and used to create an updated trained model aggregating the data; this updated trained model may be sent to some or all of the devices.
Type: Grant
Filed: June 28, 2018
Date of Patent: December 22, 2020
Assignee: Amazon Technologies, Inc.
Inventors: Shuang Wu, Thibaud Senechal, Gengshen Fu, Shiv Naga Prasad Vitaladevuni
-
Publication number: 20200388273
Abstract: Techniques for using a dynamic wakeword detection threshold are described. A device detects a wakeword in audio data using a first wakeword detection threshold value. Thereafter, the device receives audio including speech. If the device receives the audio within a predetermined duration of time after detecting the previous wakeword, the device attempts to detect a wakeword in second audio data, corresponding to the audio including the speech, using a second, lower wakeword detection threshold value.
Type: Application
Filed: July 23, 2020
Publication date: December 10, 2020
Inventors: Gengshen Fu, Shiv Naga Prasad Vitaladevuni, Paul McIntyre, Shuang Wu
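The two-threshold behaviour in the abstract above can be sketched as a small state machine. The class name, the threshold values, and the ten-second window are hypothetical; the abstract only states that a second, lower threshold applies within a predetermined duration after a previous detection.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class DynamicThresholdDetector:
    """Illustrative wakeword gate with a temporarily lowered threshold."""
    base_threshold: float = 0.8      # assumed first threshold value
    lowered_threshold: float = 0.6   # assumed second, lower threshold value
    window_seconds: float = 10.0     # assumed "predetermined duration"
    last_detection_time: Optional[float] = None

    def current_threshold(self, now: float) -> float:
        """Use the lower threshold only shortly after a previous detection."""
        if (self.last_detection_time is not None
                and now - self.last_detection_time <= self.window_seconds):
            return self.lowered_threshold
        return self.base_threshold

    def detect(self, score: float, now: float) -> bool:
        """Compare a wakeword confidence score against the active threshold."""
        detected = score >= self.current_threshold(now)
        if detected:
            self.last_detection_time = now
        return detected
```

With these assumed values, a score of 0.7 is rejected on a cold start but accepted a few seconds after an earlier detection, which mirrors the follow-up interaction the abstract describes.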
-
Patent number: 10832662
Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.
Type: Grant
Filed: July 3, 2017
Date of Patent: November 10, 2020
Assignee: Amazon Technologies, Inc.
Inventors: Rohit Prasad, Kenneth John Basye, Spyridon Matsoukas, Rajiv Ramachandran, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister
-
Patent number: 10796716
Abstract: A speech-capture device can capture audio data during wakeword monitoring and use the audio data to determine if a user is present nearby the device, even if no wakeword is spoken. Audio such as speech, human originating sounds (e.g., coughing, sneezing), or other human related noises (e.g., footsteps, doors closing) can be used to detect presence. Audio frames are individually scored as to whether a human presence is detected in the particular audio frames. The scores are then smoothed relative to nearby frames to create a decision for a particular frame. Presence information can then be sent according to a periodic schedule to a remote device to create a presence “heartbeat” that regularly identifies whether a user is detected proximate to a speech-capture device.
Type: Grant
Filed: October 11, 2018
Date of Patent: October 6, 2020
Assignee: Amazon Technologies, Inc.
Inventors: Shiva Kumar Sundaram, Chao Wang, Shiv Naga Prasad Vitaladevuni, Spyridon Matsoukas, Arindam Mandal
-
Patent number: 10777189
Abstract: Techniques for using a dynamic wakeword detection threshold are described. A device detects a wakeword in audio data using a first wakeword detection threshold value. Thereafter, the device receives audio including speech. If the device receives the audio within a predetermined duration of time after detecting the previous wakeword, the device attempts to detect a wakeword in second audio data, corresponding to the audio including the speech, using a second, lower wakeword detection threshold value.
Type: Grant
Filed: December 5, 2017
Date of Patent: September 15, 2020
Assignee: Amazon Technologies, Inc.
Inventors: Gengshen Fu, Shiv Naga Prasad Vitaladevuni, Paul McIntyre, Shuang Wu
-
Patent number: 10510340
Abstract: Techniques for using a dynamic wakeword detection threshold are described. A server(s) may receive audio data corresponding to an utterance from a device in response to the device detecting a wakeword using a wakeword detection threshold. The server(s) may then determine the device should use a lower wakeword detection threshold for a duration of time. In addition to sending the device output data responsive to the utterance, the server(s) may send the device an instruction to use the lower wakeword detection threshold for the duration of time. Alternatively, the server(s) may train a machine learning model to determine when the device should use a lower wakeword detection threshold. The server(s) may send the trained machine learned model to the device for use at runtime.
Type: Grant
Filed: December 5, 2017
Date of Patent: December 17, 2019
Assignee: Amazon Technologies, Inc.
Inventors: Gengshen Fu, Shiv Naga Prasad Vitaladevuni, Paul McIntyre, Shuang Wu
-
Patent number: 10460729
Abstract: A method for selective transmission of audio data to a speech processing server uses detection of an acoustic trigger in the audio data in determining the data to transmit. Detection of the acoustic trigger makes use of an efficient computation approach that reduces the amount of run-time computation required, or equivalently improves accuracy for a given amount of computation, by using a neural network to determine an indicator of presence of the acoustic trigger. In some examples, the neural network combines a “time delay” structure, in which intermediate results of computations are reused at various time delays, thereby avoiding computing new results, with a decomposition of certain transformations that requires fewer arithmetic operations without sacrificing significant performance.
Type: Grant
Filed: June 30, 2017
Date of Patent: October 29, 2019
Assignee: Amazon Technologies, Inc.
Inventors: Ming Sun, Aaron Lee Mathers Challenner, Yixin Gao, Shiv Naga Prasad Vitaladevuni
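To make the "fewer arithmetic operations" point in the abstract above concrete, here is a small sketch of one possible decomposition: replacing an m×n weight matrix W with a product of an m×r matrix U and an r×n matrix V. Low-rank factorization is an assumption chosen for illustration; the abstract does not name the decomposition, and the time-delay reuse of intermediate results is not modelled here.

```python
from typing import List

Matrix = List[List[int]]


def matvec(m: Matrix, v: List[int]) -> List[int]:
    """Multiply a matrix by a vector."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Multiply two matrices (used only to build the reference W = U V)."""
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]


def full_multiplies(m: int, n: int) -> int:
    """Multiplications needed to apply an m x n matrix to a vector."""
    return m * n


def factored_multiplies(m: int, n: int, r: int) -> int:
    """Multiplications needed to apply V (r x n), then U (m x r)."""
    return r * (m + n)


# Toy factors: U is 4 x 2, V is 2 x 4, so W = U V is 4 x 4.
U = [[1, 0], [0, 1], [1, 1], [2, 1]]
V = [[1, 2, 3, 4], [0, 1, 0, 1]]
W = matmul(U, V)
x = [1, 1, 1, 1]

direct = matvec(W, x)               # one large transformation
factored = matvec(U, matvec(V, x))  # two smaller transformations
```

Both paths give the same output when W equals U V exactly. The saving appears when r is much smaller than m and n: for a hypothetical 256×256 layer with r = 32, the multiplication count drops from 65,536 to 16,384.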
-
Patent number: 10460722
Abstract: A method for selective transmission of audio data to a speech processing server uses detection of an acoustic trigger in the audio data in determining the data to transmit. Detection of the acoustic trigger makes use of an efficient computation approach that reduces the amount of run-time computation required, or equivalently improves accuracy for a given amount of computation, by combining a “time delay” structure, in which intermediate results of computations are reused at various time delays, thereby avoiding computing new results, with a decomposition of certain transformations that requires fewer arithmetic operations without sacrificing significant performance. For a given amount of computation capacity the combination of these two techniques provides improved accuracy as compared to current approaches.
Type: Grant
Filed: June 30, 2017
Date of Patent: October 29, 2019
Assignee: Amazon Technologies, Inc.
Inventors: Ming Sun, David Snyder, Yixin Gao, Nikko Strom, Spyros Matsoukas, Shiv Naga Prasad Vitaladevuni
-
Patent number: 10354184
Abstract: A system and method are disclosed for predicting user behavior in response to various tasks and/or applications. This system can be a neural network-based joint model. The neural network can include a base neural network portion and one or more task-specific neural network portions. The artificial neural network can be initialized and trained using data from multiple users for multiple tasks and/or applications. This user data can be related to characteristics and behavior, including age, gender, geographic location, purchases, past search history, and customer reviews. Additional task-specific neural network portions can be added to the neural network and may be trained using a task-specific subset of the training data. The joint model can be used to predict user behavior in response to an identified task and/or application. The tasks and/or applications can relate to use of a website by users.
Type: Grant
Filed: June 24, 2014
Date of Patent: July 16, 2019
Assignee: Amazon Technologies, Inc.
Inventors: Shiv Naga Prasad Vitaladevuni, Nikko Ström, Rohit Prasad
-
Patent number: 10304440
Abstract: An approach to keyword spotting makes use of acoustic parameters that are trained on a keyword spotting task as well as on a second speech recognition task, for example, a large vocabulary continuous speech recognition task. The parameters may be optimized according to a weighted measure that weighs the keyword spotting task more highly than the other task, and that weighs utterances of a keyword more highly than utterances of other speech. In some applications, a keyword spotter configured with the acoustic parameters is used for trigger or wake word detection.
Type: Grant
Filed: June 30, 2016
Date of Patent: May 28, 2019
Assignee: Amazon Technologies, Inc.
Inventors: Sankaran Panchapagesan, Bjorn Hoffmeister, Arindam Mandal, Aparna Khare, Shiv Naga Prasad Vitaladevuni, Spyridon Matsoukas, Ming Sun
-
Patent number: 10283111
Abstract: Automatic speech recognition (ASR) processing including a feedback configuration to allow for improved disambiguation between ASR hypotheses. After ASR processing of an incoming utterance where the ASR outputs an N-best list including multiple hypotheses, the multiple hypotheses are passed downstream for further processing. The downstream further processing may include natural language understanding (NLU) or other processing to determine a command result for each hypothesis. The command results are compared to determine if any hypotheses of the N-best list would yield similar command results. If so, the hypothesis(es) with similar results are removed from the N-best list so that only one hypothesis of the similar results remains in the N-best list. The remaining non-similar hypotheses are sent for disambiguation, or, if only one hypothesis remains, it is sent for execution.
Type: Grant
Filed: December 19, 2016
Date of Patent: May 7, 2019
Assignee: Amazon Technologies, Inc.
Inventors: Francois Mairesse, Paul Frederick Raccuglia, Shiv Naga Prasad Vitaladevuni, Simon Peter Reavely
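The N-best pruning flow in the abstract above can be sketched as follows. The `interpret` callback, the toy lookup table standing in for NLU, and the use of exact equality as the similarity test are all assumptions; the abstract speaks of "similar" command results without defining the comparison.

```python
from typing import Callable, Dict, Hashable, List, Tuple


def dedupe_n_best(hypotheses: List[str],
                  interpret: Callable[[str], Hashable]) -> List[str]:
    """Keep the highest-ranked hypothesis for each distinct command result.

    `hypotheses` is assumed to be ordered best first.
    """
    seen = set()
    kept = []
    for hyp in hypotheses:
        result = interpret(hyp)
        if result not in seen:
            seen.add(result)
            kept.append(hyp)
    return kept


def next_step(remaining: List[str]) -> str:
    """Execute a lone survivor; otherwise ask the user to disambiguate."""
    return "execute" if len(remaining) == 1 else "disambiguate"


# Hypothetical NLU output: (intent, resolved entity) per ASR hypothesis.
TOY_RESULTS: Dict[str, Tuple[str, str]] = {
    "play the beatles": ("PlayMusic", "the beatles"),
    "play the beetles": ("PlayMusic", "the beatles"),
    "play the needles": ("PlayMusic", "the needles"),
}


def toy_nlu(text: str) -> Tuple[str, str]:
    return TOY_RESULTS[text]


remaining = dedupe_n_best(list(TOY_RESULTS), toy_nlu)
```

In this toy case the first two hypotheses resolve to the same command, so only one of them survives and the user is asked to choose between two options rather than three.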
-
Patent number: 10121494
Abstract: A speech-capture device can capture audio data during wakeword monitoring and use the audio data to determine if a user is present nearby the device, even if no wakeword is spoken. Audio such as speech, human originating sounds (e.g., coughing, sneezing), or other human related noises (e.g., footsteps, doors closing) can be used to detect presence. Audio frames are individually scored as to whether a human presence is detected in the particular audio frames. The scores are then smoothed relative to nearby frames to create a decision for a particular frame. Presence information can then be sent according to a periodic schedule to a remote device to create a presence “heartbeat” that regularly identifies whether a user is detected proximate to a speech-capture device.
Type: Grant
Filed: March 30, 2017
Date of Patent: November 6, 2018
Assignee: Amazon Technologies, Inc.
Inventors: Shiva Kumar Sundaram, Chao Wang, Shiv Naga Prasad Vitaladevuni, Spyridon Matsoukas, Arindam Mandal
-
Patent number: 9940949
Abstract: In a speech-based system, a wake word or other trigger expression is used to preface user speech that is intended as a command. The system receives multiple directional audio signals, each of which emphasizes sound from a different direction. The trigger expression is detected in an individual directional audio signal by comparing a confidence score with a confidence threshold. An individual confidence threshold is specified for each directional audio signal. The confidence thresholds are adjusted during operation of the system based on performance information that is generated during operation of the system. As an example, performance information may include the number of times that the trigger expression has been detected in each of the directional audio signals.
Type: Grant
Filed: December 19, 2014
Date of Patent: April 10, 2018
Assignee: Amazon Technologies, Inc.
Inventors: Shiv Naga Prasad Vitaladevuni, Philip Ryan Hilmes
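One way to picture the per-direction thresholds in the abstract above is the sketch below. The adjustment rule (directions with more past detections get a lower threshold, interpolated linearly between `high` and `low`) is an assumption made for illustration; the abstract says only that thresholds are adjusted using performance information such as detection counts.

```python
from typing import List


def adjust_thresholds(detection_counts: List[int],
                      low: float = 0.5, high: float = 0.75) -> List[float]:
    """Assign one confidence threshold per directional audio signal.

    Assumed rule: the direction with the most past detections gets `low`,
    directions with no detections get `high`, others fall in between.
    """
    top = max(detection_counts, default=0)
    if top == 0:
        return [high] * len(detection_counts)
    return [high - (high - low) * (count / top) for count in detection_counts]


def detect(scores: List[float], thresholds: List[float]) -> List[int]:
    """Return indices of directional signals whose score meets its threshold."""
    return [i for i, (s, t) in enumerate(zip(scores, thresholds)) if s >= t]


# Hypothetical detection history for three beams.
thresholds = adjust_thresholds([8, 4, 0])
triggered = detect([0.6, 0.6, 0.6], thresholds)
```

With this assumed rule, the same confidence score of 0.6 triggers only in the direction from which the trigger expression has been heard most often.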
-
Patent number: 9899021
Abstract: Features are disclosed for modeling user interaction with a detection system using a stochastic dynamical model in order to determine or adjust detection thresholds. The model may incorporate numerous features, such as the probability of false rejection and false acceptance of a user utterance and the cost associated with each potential action. The model may determine or adjust detection thresholds so as to minimize the occurrence of false acceptances and false rejections while preserving other desirable characteristics. The model may further incorporate background and speaker statistics. Adjustments to the model or other operation parameters can be implemented based on the model, user statistics, and/or additional data.
Type: Grant
Filed: December 20, 2013
Date of Patent: February 20, 2018
Assignee: Amazon Technologies, Inc.
Inventors: Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister, Rohit Prasad
-
Publication number: 20180012593
Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.
Type: Application
Filed: July 3, 2017
Publication date: January 11, 2018
Inventors: Rohit Prasad, Kenneth John Basye, Spyridon Matsoukas, Rajiv Ramachandran, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister
-
Patent number: 9704478
Abstract: Features are disclosed for filtering portions of an output audio signal in order to improve automatic speech recognition on an input signal which may include a representation of the output signal. A signal that includes audio content can be received, and a frequency or band of frequencies can be selected to be filtered from the signal. The frequency band may correspond to a desired frequency band for speech recognition. An input signal can be obtained comprising audio data corresponding to a user utterance and presentation of the output signal. Automatic speech recognition can be performed on the input signal. In some cases, an acoustic model trained for use with such frequency band filtering may be used to perform speech recognition.
Type: Grant
Filed: December 2, 2013
Date of Patent: July 11, 2017
Assignee: Amazon Technologies, Inc.
Inventors: Shiv Naga Prasad Vitaladevuni, Amit Singh Chhetri, Phillip Ryan Hilmes, Rohit Prasad
-
Patent number: 9697828
Abstract: Features are disclosed for detecting words in audio using environmental information and/or contextual information in addition to acoustic features associated with the words to be detected. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.
Type: Grant
Filed: June 20, 2014
Date of Patent: July 4, 2017
Assignee: Amazon Technologies, Inc.
Inventors: Rohit Prasad, Kenneth John Basye, Spyridon Matsoukas, Rajiv Ramachandran, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister
-
Patent number: 9600231
Abstract: A revised support vector machine (SVM) classifier is offered to distinguish between true keywords and false positives based on output from a keyword spotting component of a speech recognition system. The SVM operates on a reduced set of feature dimensions, where the feature dimensions are selected based on their ability to distinguish between true keywords and false positives. Further, support vector pairs are merged to create a reduced set of re-weighted support vectors. These techniques result in an SVM that may be operated using reduced computing resources, thus improving system performance.
Type: Grant
Filed: June 26, 2015
Date of Patent: March 21, 2017
Assignee: Amazon Technologies, Inc.
Inventors: Ming Sun, Björn Hoffmeister, Shiv Naga Prasad Vitaladevuni, Varun Kumar Nagaraja
-
Patent number: 9589560
Abstract: Features are disclosed for estimating a false rejection rate in a detection system. The false rejection rate can be estimated by fitting a model to a distribution of detection confidence scores. An estimated false rejection rate can then be computed for confidence scores that fall below a threshold. The false rejection rate and model can be verified once the detection system has been deployed by obtaining additional data with confidence scores falling below the threshold. Adjustments to the model or other operational parameters can be implemented based on the verified false rejection rate, model, or additional data.
Type: Grant
Filed: December 19, 2013
Date of Patent: March 7, 2017
Assignee: Amazon Technologies, Inc.
Inventors: Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister, Rohit Prasad
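The estimation idea in the abstract above can be sketched by fitting a simple model to confidence scores and reading off the probability mass below the threshold. The choice of a normal distribution, the function name `estimate_frr`, and the sample scores are assumptions for illustration; the abstract does not state which model is fitted.

```python
from statistics import NormalDist
from typing import List


def estimate_frr(scores: List[float], threshold: float) -> float:
    """Estimate the false rejection rate for a given detection threshold.

    `scores` are assumed to be confidence scores of genuine keyword
    utterances. A normal distribution is fitted to them (an assumed model),
    and the estimate is the fitted probability of a score below `threshold`.
    """
    model = NormalDist.from_samples(scores)  # needs at least two samples
    return model.cdf(threshold)


# Hypothetical confidence scores for utterances that did contain the keyword.
observed = [0.62, 0.71, 0.75, 0.80, 0.83, 0.88, 0.90, 0.94]
frr_at_070 = estimate_frr(observed, 0.70)
```

Because the fitted model extends below the threshold, it gives an estimate even where few or no rejected utterances were actually collected, which is the region the abstract proposes to verify after deployment.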