Patents by Inventor Shiv Naga Prasad Vitaladevuni
Shiv Naga Prasad Vitaladevuni has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20240029730
Abstract: Described are techniques for predicting when data associated with a user input is likely to be selected for deletion. The system may use a trained model to assist with such predictions. The trained model can be configured based on deletions associated with a user profile. An example process can include receiving user input data corresponding to the user profile, and processing the user input data to determine a user command. Based on characteristic data of the user command, the trained model can be used to determine that data corresponding to the user command is likely to be selected for deletion. The trained model can be iteratively updated based on additional user commands, including previously received user commands to delete user input data.
Type: Application
Filed: May 24, 2023
Publication date: January 25, 2024
Inventors: Rohit Prasad, Shiv Naga Prasad Vitaladevuni, Prem Natarajan
-
Publication number: 20230410833
Abstract: A speech-capture device can capture audio data during wakeword monitoring and use the audio data to determine if a user is present nearby the device, even if no wakeword is spoken. Audio such as speech, human originating sounds (e.g., coughing, sneezing), or other human related noises (e.g., footsteps, doors closing) can be used to detect audio. Audio frames are individually scored as to whether a human presence is detected in the particular audio frames. The scores are then smoothed relative to nearby frames to create a decision for a particular frame. Presence information can then be sent according to a periodic schedule to a remote device to create a presence “heartbeat” that regularly identifies whether a user is detected proximate to a speech-capture device.
Type: Application
Filed: April 6, 2023
Publication date: December 21, 2023
Inventors: Shiva Kumar Sundaram, Chao Wang, Shiv Naga Prasad Vitaladevuni, Spyridon Matsoukas, Arindam Mandal
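As an illustration only, the per-frame scoring, smoothing, and heartbeat steps described in this abstract might be sketched as follows. This is not the method claimed in the application: the moving-average smoother, the function names, the window radius, and the 0.5 threshold are all assumptions chosen for the example, and the per-frame scores are taken as given rather than computed from audio.

```python
from typing import List


def smooth_scores(scores: List[float], radius: int = 2) -> List[float]:
    """Average each frame score with its neighbors within `radius` frames.

    A plain moving average is assumed here; the abstract only says that
    scores are "smoothed relative to nearby frames".
    """
    smoothed = []
    for i in range(len(scores)):
        lo = max(0, i - radius)
        hi = min(len(scores), i + radius + 1)
        window = scores[lo:hi]
        smoothed.append(sum(window) / len(window))
    return smoothed


def frame_decisions(scores: List[float], radius: int = 2,
                    threshold: float = 0.5) -> List[bool]:
    """Turn smoothed scores into one presence decision per frame."""
    return [s >= threshold for s in smooth_scores(scores, radius)]


def heartbeat(decisions: List[bool]) -> bool:
    """Summarize one reporting period: was presence detected in any frame?"""
    return any(decisions)


# Hypothetical per-frame presence scores in [0, 1].
isolated_spike = [0.0, 0.0, 1.0, 0.0, 0.0]
sustained = [0.9, 0.8, 0.9, 0.1, 0.0]

print(heartbeat(frame_decisions(isolated_spike, radius=1)))
print(heartbeat(frame_decisions(sustained, radius=1)))
```

In this sketch, smoothing keeps a single high-scoring frame from producing a presence report on its own, while a run of high-scoring frames still does. A caller would invoke `heartbeat` once per reporting period and send the result to the remote device.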
-
Patent number: 11769496
Abstract: Described are techniques for predicting when data associated with a user input is likely to be selected for deletion. The system may use a trained model to assist with such predictions. The trained model can be configured based on deletions associated with a user profile. An example process can include receiving user input data corresponding to the user profile, and processing the user input data to determine a user command. Based on characteristic data of the user command, the trained model can be used to determine that data corresponding to the user command is likely to be selected for deletion. The trained model can be iteratively updated based on additional user commands, including previously received user commands to delete user input data.
Type: Grant
Filed: December 12, 2019
Date of Patent: September 26, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Rohit Prasad, Shiv Naga Prasad Vitaladevuni, Prem Natarajan
-
Patent number: 11699433
Abstract: Techniques for using a dynamic wakeword detection threshold are described. A device detects a wakeword in audio data using a first wakeword detection threshold value. Thereafter, the device receives audio including speech. If the device receives the audio within a predetermined duration of time after detecting the previous wakeword, the device attempts to detect a wakeword in second audio data, corresponding to the audio including the speech, using a second, lower wakeword detection threshold value.
Type: Grant
Filed: July 23, 2020
Date of Patent: July 11, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Gengshen Fu, Shiv Naga Prasad Vitaladevuni, Paul McIntyre, Shuang Wu
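The idea of a threshold that drops for a limited time after a detection can be illustrated with a minimal sketch. It is not the patented implementation: the class name, the specific threshold values, the 10-second window, and the use of a precomputed detection score are all assumptions made for the example.

```python
from typing import Optional


class DynamicThresholdDetector:
    """Hypothetical detector that lowers its threshold shortly after a detection."""

    def __init__(self, base_threshold: float = 0.8,
                 lowered_threshold: float = 0.6,
                 window_seconds: float = 10.0) -> None:
        self.base_threshold = base_threshold
        self.lowered_threshold = lowered_threshold
        self.window_seconds = window_seconds
        self._last_detection: Optional[float] = None

    def threshold_at(self, now: float) -> float:
        """Return the lowered threshold while inside the window, else the base one."""
        if (self._last_detection is not None
                and now - self._last_detection <= self.window_seconds):
            return self.lowered_threshold
        return self.base_threshold

    def detect(self, score: float, now: float) -> bool:
        """Compare a wakeword score (assumed to lie in [0, 1]) to the current threshold."""
        detected = score >= self.threshold_at(now)
        if detected:
            self._last_detection = now
        return detected


detector = DynamicThresholdDetector()
print(detector.detect(0.7, now=0.0))   # no earlier detection: base threshold applies
print(detector.detect(0.9, now=1.0))   # clears the base threshold
print(detector.detect(0.7, now=5.0))   # inside the window: lowered threshold applies
print(detector.detect(0.7, now=20.0))  # window has lapsed: base threshold applies
```

Timestamps are passed in explicitly rather than read from a clock, which keeps the sketch deterministic and easy to test.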
-
Patent number: 11670299
Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
Type: Grant
Filed: May 17, 2021
Date of Patent: June 6, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Ming Sun, Thibaud Senechai, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
-
Patent number: 11657804
Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.
Type: Grant
Filed: November 5, 2020
Date of Patent: May 23, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Rohit Prasad, Kenneth John Basye, Spyridon Matsoukas, Rajiv Ramachandran, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister
-
Patent number: 11657832
Abstract: A speech-capture device can capture audio data during wakeword monitoring and use the audio data to determine if a user is present nearby the device, even if no wakeword is spoken. Audio such as speech, human originating sounds (e.g., coughing, sneezing), or other human related noises (e.g., footsteps, doors closing) can be used to detect audio. Audio frames are individually scored as to whether a human presence is detected in the particular audio frames. The scores are then smoothed relative to nearby frames to create a decision for a particular frame. Presence information can then be sent according to a periodic schedule to a remote device to create a presence “heartbeat” that regularly identifies whether a user is detected proximate to a speech-capture device.
Type: Grant
Filed: September 16, 2020
Date of Patent: May 23, 2023
Assignee: Amazon Technologies, Inc.
Inventors: Shiva Kumar Sundaram, Chao Wang, Shiv Naga Prasad Vitaladevuni, Spyridon Matsoukas, Arindam Mandal
-
Patent number: 11521599
Abstract: A system and method performs wakeword detection using a feedforward neural network model. A first output of the model indicates when the wakeword appears on a right side of a first window of input audio data. A second output of the model indicates when the wakeword appears in the center of a second window of input audio data. A third output of the model indicates when the wakeword appears on a left side of a third window of input audio data. Using these outputs, the system and method determine a beginpoint and endpoint of the wakeword.
Type: Grant
Filed: September 20, 2019
Date of Patent: December 6, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Christin Jose, Yuriy Mishchenko, Anish N. Shah, Alex Escott, Parind Shah, Shiv Naga Prasad Vitaladevuni, Thibaud Senechal
-
Patent number: 11355102
Abstract: A neural network model of a user device is trained to map different words represented in audio data to different points in an N-dimensional embedding space. When the user device determines that a mapped point corresponds to a wakeword, it causes further audio processing, such as automatic speech recognition or natural-language understanding, to be performed on the audio data. The user device may first create the wakeword by first processing audio data representing the wakeword to determine the mapped point in the embedding space.
Type: Grant
Filed: December 12, 2019
Date of Patent: June 7, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Yuriy Mishchenko, Thibaud Senechal, Anish N. Shah, Shiv Naga Prasad Vitaladevuni
-
Patent number: 11308939
Abstract: A system and method performs wakeword detection and automatic speech recognition using the same acoustic model. A mapping engine maps phones/senones output by the acoustic model to phones/senones corresponding to the wakeword. A hidden Markov model (HMM) may determine that the wakeword is present in audio data; the HMM may have multiple paths for multiple wakewords or may have multiple models. Once the wakeword is detected, ASR is performed using the acoustic model.
Type: Grant
Filed: September 25, 2018
Date of Patent: April 19, 2022
Assignee: Amazon Technologies, Inc.
Inventors: Yixin Gao, Ming Sun, Varun Nagaraja, Gengshen Fu, Chao Wang, Shiv Naga Prasad Vitaladevuni
-
Publication number: 20220093094
Abstract: A natural language system may be configured to act as a participant in a conversation between two users. The system may determine when a user expression such as speech, a gesture, or the like is directed from one user to the other. The system may process input data related to the expression (such as audio data, input data, language processing result data, conversation context data, etc.) to determine if the system should interject a response to the user-to-user expression. If so, the system may process the input data to determine a response and output it. The system may track that response as part of the data related to the ongoing conversation.
Type: Application
Filed: December 4, 2020
Publication date: March 24, 2022
Inventors: Prakash Krishnan, Arindam Mandal, Siddhartha Reddy Jonnalagadda, Nikko Strom, Ariya Rastrow, Shiv Naga Prasad Vitaladevuni, Angeliki Metallinou, Vincent Auvray, Minmin Shen, Josey Diego Sandoval, Rohit Prasad, Thomas Taylor, Amotz Maimon
-
Publication number: 20220093093
Abstract: A system can operate a speech-controlled device in a mode where the speech-controlled device determines that an utterance is directed at the speech-controlled device using image data showing the user speaking the utterance. If the user is directing the user's gaze at the speech-controlled device while speaking, the system may determine the utterance is system directed and thus may perform further speech processing based on the utterance. If the user's gaze is directed elsewhere, the system may determine the utterance is not system directed (for example directed at another user) and thus the system may not perform further speech processing based on the utterance and may take other actions, for example discarding audio data of the utterance.
Type: Application
Filed: December 4, 2020
Publication date: March 24, 2022
Inventors: Prakash Krishnan, Arindam Mandal, Nikko Strom, Pradeep Natarajan, Ariya Rastrow, Shiv Naga Prasad Vitaladevuni, David Chi-Wai Tang, Aaron Challenner, Xu Zhang, Krishna Anisetty, Josey Diego Sandoval, Rohit Prasad, Premkumar Natarajan
-
Publication number: 20210398533
Abstract: A system and method performs multilingual wakeword detection by determining a language corresponding to the wakeword. A first wakeword-detection component, which may execute using a digital-signal processor, determines that audio data includes a representation of the wakeword and determines a language corresponding to the wakeword. A second, more accurate wakeword-detection component may then process the audio data using the language to confirm that it includes the representation of the wakeword. The audio data may then be sent to a remote system for further processing.
Type: Application
Filed: June 28, 2021
Publication date: December 23, 2021
Inventors: Yixin Gao, Ming Sun, Jason Krone, Shiv Naga Prasad Vitaladevuni, Yuzong Liu
-
Patent number: 11205420
Abstract: A system and method performs wakeword detection using a neural network model that includes a recurrent neural network (RNN) for processing variable-length wakewords. To prevent the model from being influenced by non-wakeword speech, multiple instances of the model are created to process audio data, and each instance is configured to use weights determined by training data. The model may instead or in addition be used to process the audio data only when a likelihood that the audio data corresponds to the wakeword is greater than a threshold. The model may process the audio data as represented by groups of acoustic feature vectors; computations for feature vectors common to different groups may be re-used.
Type: Grant
Filed: June 10, 2019
Date of Patent: December 21, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Gengshen Fu, Thibaud Senechal, Shiv Naga Prasad Vitaladevuni, Michael J. Rodehorst, Varun K. Nagaraja
-
Publication number: 20210358497
Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
Type: Application
Filed: May 17, 2021
Publication date: November 18, 2021
Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
-
Patent number: 11132990
Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
Type: Grant
Filed: June 26, 2019
Date of Patent: September 28, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
-
Patent number: 11069353
Abstract: A system and method performs multilingual wakeword detection by determining a language corresponding to the wakeword. A first wakeword-detection component, which may execute using a digital-signal processor, determines that audio data includes a representation of the wakeword and determines a language corresponding to the wakeword. A second, more accurate wakeword-detection component may then process the audio data using the language to confirm that it includes the representation of the wakeword. The audio data may then be sent to a remote system for further processing.
Type: Grant
Filed: May 6, 2019
Date of Patent: July 20, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Yixin Gao, Ming Sun, Jason Krone, Shiv Naga Prasad Vitaladevuni, Yuzong Liu
-
Patent number: 11043218
Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
Type: Grant
Filed: June 26, 2019
Date of Patent: June 22, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
-
Publication number: 20210134276
Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.
Type: Application
Filed: November 5, 2020
Publication date: May 6, 2021
Inventors: Rohit Prasad, Kenneth John Basye, Spyridon Matsoukas, Rajiv Ramachandran, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister
-
Patent number: 10964315
Abstract: An approach to wakeword detection uses an explicit representation of non-wakeword speech in the form of subword (e.g., phonetic monophone) units that do not necessarily occur in the wakeword and that broadly represent general speech. These subword units are arranged in a “background” model, which at runtime essentially competes with the wakeword model such that a wakeword is less likely to be declared as occurring when the input matches that background model well. An HMM may be used with the model to locate possible occurrences of the wakeword. Features are determined from portions of the input corresponding to subword units of the wakeword detected using the HMM. A secondary classifier is then used to process the features to yield a decision of whether the wakeword occurred.
Type: Grant
Filed: June 30, 2017
Date of Patent: March 30, 2021
Assignee: Amazon Technologies, Inc.
Inventors: Minhua Wu, Sankaran Panchapagesan, Ming Sun, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister, Ryan Paul Thomas, Arindam Mandal