Patents by Inventor Shiv Naga Prasad Vitaladevuni

Shiv Naga Prasad Vitaladevuni has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20240029730
    Abstract: Described are techniques for predicting when data associated with a user input is likely to be selected for deletion. The system may use a trained model to assist with such predictions. The trained model can be configured based on deletions associated with a user profile. An example process can include receiving user input data corresponding to the user profile, and processing the user input data to determine a user command. Based on characteristic data of the user command, the trained model can be used to determine that data corresponding to the user command is likely to be selected for deletion. The trained model can be iteratively updated based on additional user commands, including previously received user commands to delete user input data.
    Type: Application
    Filed: May 24, 2023
    Publication date: January 25, 2024
    Inventors: Rohit Prasad, Shiv Naga Prasad Vitaladevuni, Prem Natarajan
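    The filing above (publication 20240029730) describes scoring a command's characteristic data with a trained model and updating that model as more commands arrive. A minimal sketch of that idea, assuming a simple logistic-regression scorer with invented feature names and weights, follows; it is illustrative only, not the claimed implementation.
    ```python
    import numpy as np

    # Hypothetical characteristic features of a user command, e.g.
    # [is_sensitive_domain, is_dictation, late_night, prior_deletion_rate]
    weights = np.array([1.8, 0.9, 0.4, 2.1])   # stand-in for a trained model
    bias = -2.0

    def deletion_likelihood(features: np.ndarray) -> float:
        """Score how likely the data for this command is to be selected for deletion."""
        return float(1.0 / (1.0 + np.exp(-(weights @ features + bias))))

    def online_update(features: np.ndarray, deleted: bool, lr: float = 0.1) -> None:
        """Iteratively update the model when the user later deletes (or keeps) the data."""
        global weights, bias
        err = float(deleted) - deletion_likelihood(features)
        weights = weights + lr * err * features
        bias = bias + lr * err

    # Example: a command in a sensitive domain from a profile with frequent deletions.
    f = np.array([1.0, 0.0, 1.0, 0.8])
    print(deletion_likelihood(f) >= 0.5)
    online_update(f, deleted=True)
    ```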
  • Publication number: 20230410833
    Abstract: A speech-capture device can capture audio data during wakeword monitoring and use the audio data to determine if a user is present near the device, even if no wakeword is spoken. Audio such as speech, human-originating sounds (e.g., coughing, sneezing), or other human-related noises (e.g., footsteps, doors closing) can be used to detect user presence. Audio frames are individually scored as to whether a human presence is detected in the particular audio frames. The scores are then smoothed relative to nearby frames to create a decision for a particular frame. Presence information can then be sent according to a periodic schedule to a remote device to create a presence “heartbeat” that regularly identifies whether a user is detected proximate to a speech-capture device.
    Type: Application
    Filed: April 6, 2023
    Publication date: December 21, 2023
    Inventors: Shiva Kumar Sundaram, Chao Wang, Shiv Naga Prasad Vitaladevuni, Spyridon Matsoukas, Arindam Mandal
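    The per-frame scoring, smoothing, and periodic "heartbeat" flow described in publication 20230410833 can be sketched roughly as below; the scoring model, smoothing window, and reporting interval are assumptions.
    ```python
    import numpy as np

    def smooth_scores(frame_scores: np.ndarray, window: int = 5) -> np.ndarray:
        """Smooth per-frame presence scores relative to nearby frames (moving average)."""
        kernel = np.ones(window) / window
        return np.convolve(frame_scores, kernel, mode="same")

    def presence_decisions(frame_scores: np.ndarray, threshold: float = 0.6) -> np.ndarray:
        return smooth_scores(frame_scores) >= threshold

    def heartbeat(frame_scores: np.ndarray, frames_per_beat: int = 100):
        """Yield one presence flag per reporting period, e.g. to send to a remote device."""
        decisions = presence_decisions(frame_scores)
        for start in range(0, len(decisions), frames_per_beat):
            yield bool(decisions[start:start + frames_per_beat].any())

    scores = np.random.rand(1000)          # stand-in for a per-frame presence model
    print(list(heartbeat(scores)))
    ```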
  • Patent number: 11769496
    Abstract: Described are techniques for predicting when data associated with a user input is likely to be selected for deletion. The system may use a trained model to assist with such predictions. The trained model can be configured based on deletions associated with a user profile. An example process can include receiving user input data corresponding to the user profile, and processing the user input data to determine a user command. Based on characteristic data of the user command, the trained model can be used to determine that data corresponding to the user command is likely to be selected for deletion. The trained model can be iteratively updated based on additional user commands, including previously received user commands to delete user input data.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: September 26, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Rohit Prasad, Shiv Naga Prasad Vitaladevuni, Prem Natarajan
  • Patent number: 11699433
    Abstract: Techniques for using a dynamic wakeword detection threshold are described. A device detects a wakeword in audio data using a first wakeword detection threshold value. Thereafter, the device receives audio including speech. If the device receives the audio within a predetermined duration of time after detecting the previous wakeword, the device attempts to detect a wakeword in second audio data, corresponding to the audio including the speech, using a second, lower wakeword detection threshold value.
    Type: Grant
    Filed: July 23, 2020
    Date of Patent: July 11, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Gengshen Fu, Shiv Naga Prasad Vitaladevuni, Paul McIntyre, Shuang Wu
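    The dynamic-threshold logic of patent 11699433 lends itself to a short sketch. The threshold values and the length of the follow-up window below are assumptions, not values from the patent.
    ```python
    import time

    DEFAULT_THRESHOLD = 0.80     # assumed first detection threshold value
    FOLLOW_UP_THRESHOLD = 0.60   # assumed lower threshold after a recent wakeword
    FOLLOW_UP_WINDOW_S = 8.0     # assumed duration during which the lower threshold applies

    last_wakeword_time = None

    def current_threshold(now: float) -> float:
        """Use a lower detection threshold shortly after the previous wakeword."""
        if last_wakeword_time is not None and now - last_wakeword_time <= FOLLOW_UP_WINDOW_S:
            return FOLLOW_UP_THRESHOLD
        return DEFAULT_THRESHOLD

    def on_detection_score(score: float) -> bool:
        """Accept the wakeword if its score clears whichever threshold currently applies."""
        global last_wakeword_time
        now = time.monotonic()
        if score >= current_threshold(now):
            last_wakeword_time = now
            return True
        return False

    print(on_detection_score(0.85), on_detection_score(0.65))
    ```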
  • Patent number: 11670299
    Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
    Type: Grant
    Filed: May 17, 2021
    Date of Patent: June 6, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Ming Sun, Thibaud Senechai, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
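    A toy sketch of the shared-feature arrangement in patent 11670299: one set of LFBE-like features feeds a wakeword head whose softmax outputs are smoothed and checked for spikes, and an acoustic-event head whose sigmoid output passes a threshold classifier. The random linear "models", smoothing window, and thresholds are stand-ins.
    ```python
    import numpy as np

    rng = np.random.default_rng(0)
    N_FRAMES, N_FEATS = 200, 64                 # stand-in for LFBE features per frame
    features = rng.normal(size=(N_FRAMES, N_FEATS))

    # Toy "models": random linear heads stand in for trained networks.
    ww_head = rng.normal(size=(N_FEATS, 2))     # wakeword vs. background logits
    ev_head = rng.normal(size=(N_FEATS, 1))     # acoustic-event logit

    def softmax(x):
        e = np.exp(x - x.max(axis=-1, keepdims=True))
        return e / e.sum(axis=-1, keepdims=True)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    # Wakeword path: softmax posteriors, smoothed over frames, then spike detection.
    ww_post = softmax(features @ ww_head)[:, 1]
    smoothed = np.convolve(ww_post, np.ones(10) / 10, mode="same")
    wakeword_spike = smoothed.max() > 0.9       # assumed spike threshold

    # Event path: sigmoid score plus a simple threshold classifier.
    event_score = sigmoid(features @ ev_head).mean()
    event_detected = event_score > 0.7          # assumed classifier threshold

    print(wakeword_spike, event_detected)
    ```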
  • Patent number: 11657804
    Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.
    Type: Grant
    Filed: November 5, 2020
    Date of Patent: May 23, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Rohit Prasad, Kenneth John Basye, Spyridon Matsoukas, Rajiv Ramachandran, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister
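    One way to picture the contextual detection model of patent 11657804 is a classifier that combines an acoustic keyword score with contextual signals and a per-user adjustment. The features, weights, and threshold below are assumptions for illustration.
    ```python
    import numpy as np

    def keyword_decision(acoustic_score: float,
                         context: dict,
                         user_bias: float = 0.0,
                         threshold: float = 0.5) -> bool:
        """Combine an acoustic keyword score with contextual features.

        `context` holds hypothetical signals such as whether the device is playing
        TTS audio or whether the user recently interacted; `user_bias` stands in
        for per-user customization derived from usage patterns.
        """
        x = np.array([
            acoustic_score,
            1.0 if context.get("device_playing_tts") else 0.0,
            1.0 if context.get("recent_interaction") else 0.0,
        ])
        w = np.array([2.5, -1.2, 0.8])          # assumed learned weights
        logit = w @ x + user_bias - 1.0
        return 1.0 / (1.0 + np.exp(-logit)) >= threshold

    print(keyword_decision(0.9, {"recent_interaction": True}))
    ```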
  • Patent number: 11657832
    Abstract: A speech-capture device can capture audio data during wakeword monitoring and use the audio data to determine if a user is present near the device, even if no wakeword is spoken. Audio such as speech, human-originating sounds (e.g., coughing, sneezing), or other human-related noises (e.g., footsteps, doors closing) can be used to detect user presence. Audio frames are individually scored as to whether a human presence is detected in the particular audio frames. The scores are then smoothed relative to nearby frames to create a decision for a particular frame. Presence information can then be sent according to a periodic schedule to a remote device to create a presence “heartbeat” that regularly identifies whether a user is detected proximate to a speech-capture device.
    Type: Grant
    Filed: September 16, 2020
    Date of Patent: May 23, 2023
    Assignee: Amazon Technologies, Inc.
    Inventors: Shiva Kumar Sundaram, Chao Wang, Shiv Naga Prasad Vitaladevuni, Spyridon Matsoukas, Arindam Mandal
  • Patent number: 11521599
    Abstract: A system and method performs wakeword detection using a feedforward neural network model. A first output of the model indicates when the wakeword appears on a right side of a first window of input audio data. A second output of the model indicates when the wakeword appears in the center of a second window of input audio data. A third output of the model indicates when the wakeword appears on a left side of a third window of input audio data. Using these outputs, the system and method determine a beginpoint and endpoint of the wakeword.
    Type: Grant
    Filed: September 20, 2019
    Date of Patent: December 6, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Christin Jose, Yuriy Mishchenko, Anish N. Shah, Alex Escott, Parind Shah, Shiv Naga Prasad Vitaladevuni, Thibaud Senechal
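    A simplified sketch of turning the three output streams of patent 11521599 (wakeword at the right edge, center, and left edge of a sliding window) into begin and end points. The peak-picking and window geometry are assumptions, not the patented method.
    ```python
    import numpy as np

    def wakeword_endpoints(right_scores: np.ndarray,
                           center_scores: np.ndarray,
                           left_scores: np.ndarray,
                           window_frames: int,
                           threshold: float = 0.8):
        """Estimate (begin, end) frame indices from three model output streams.

        right_scores peaks when the wakeword ends at the right edge of the window,
        left_scores when it sits at the left edge; center_scores confirms presence.
        """
        if center_scores.max() < threshold:
            return None
        end = int(np.argmax(right_scores))                   # wakeword just ended here
        begin = int(np.argmax(left_scores)) - window_frames  # window left edge at start
        return max(begin, 0), end

    # Toy example with synthetic score streams.
    t = np.arange(300)
    right = np.exp(-((t - 180) ** 2) / 50.0)
    center = np.exp(-((t - 150) ** 2) / 50.0)
    left = np.exp(-((t - 120) ** 2) / 50.0)
    print(wakeword_endpoints(right, center, left, window_frames=76))
    ```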
  • Patent number: 11355102
    Abstract: A neural network model of a user device is trained to map different words represented in audio data to different points in an N-dimensional embedding space. When the user device determines that a mapped point corresponds to a wakeword, it causes further audio processing, such as automatic speech recognition or natural-language understanding, to be performed on the audio data. The user device may first create the wakeword by processing audio data representing the wakeword to determine its mapped point in the embedding space.
    Type: Grant
    Filed: December 12, 2019
    Date of Patent: June 7, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Yuriy Mishchenko, Thibaud Senechal, Anish N. Shah, Shiv Naga Prasad Vitaladevuni
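    Enrollment and detection in an embedding space, as described in patent 11355102, can be sketched as below; the encoder is a toy linear projection and the distance threshold is assumed.
    ```python
    import numpy as np

    rng = np.random.default_rng(1)
    EMBED_DIM = 32
    projection = rng.normal(size=(64, EMBED_DIM))   # stand-in for a trained encoder

    def embed(audio_features: np.ndarray) -> np.ndarray:
        """Map audio features to a point in the embedding space (toy linear model)."""
        v = audio_features @ projection
        return v / np.linalg.norm(v)

    # Enrollment: the user speaks the custom wakeword once to fix its point.
    enrolled_point = embed(rng.normal(size=64))

    def is_wakeword(audio_features: np.ndarray, max_distance: float = 0.35) -> bool:
        """Trigger further processing (ASR/NLU) only if the mapped point is close enough."""
        return np.linalg.norm(embed(audio_features) - enrolled_point) <= max_distance

    print(is_wakeword(rng.normal(size=64)))
    ```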
  • Patent number: 11308939
    Abstract: A system and method performs wakeword detection and automatic speech recognition using the same acoustic model. A mapping engine maps phones/senones output by the acoustic model to phones/senones corresponding to the wakeword. A hidden Markov model (HMM) may determine that the wakeword is present in audio data; the HMM may have multiple paths for multiple wakewords or may have multiple models. Once the wakeword is detected, ASR is performed using the acoustic model.
    Type: Grant
    Filed: September 25, 2018
    Date of Patent: April 19, 2022
    Assignee: Amazon Technologies, Inc.
    Inventors: Yixin Gao, Ming Sun, Varun Nagaraja, Gengshen Fu, Chao Wang, Shiv Naga Prasad Vitaladevuni
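    A heavily simplified sketch of patent 11308939's idea of reusing one acoustic model: senone posteriors are mapped to the wakeword's phones and scored with a crude left-to-right alignment standing in for the HMM. The phone-to-senone mapping and phone labels are invented for illustration.
    ```python
    import numpy as np

    rng = np.random.default_rng(2)
    N_FRAMES, N_SENONES = 120, 500
    senone_post = rng.dirichlet(np.ones(N_SENONES), size=N_FRAMES)  # acoustic model output

    # Hypothetical mapping from each wakeword phone to the senone ids that realize it.
    WAKEWORD_PHONE_SENONES = [np.array([3, 17, 42]),      # e.g. first phone
                              np.array([55, 90]),         # e.g. second phone
                              np.array([120, 121]),       # e.g. third phone
                              np.array([200, 201, 202])]  # e.g. fourth phone

    def wakeword_score(posteriors: np.ndarray) -> float:
        """Greedy left-to-right alignment of frames to wakeword phones (HMM stand-in)."""
        per_phone = np.stack([posteriors[:, ids].sum(axis=1)
                              for ids in WAKEWORD_PHONE_SENONES], axis=1)
        # Split frames evenly across phones as a crude alignment.
        chunks = np.array_split(np.arange(len(posteriors)), per_phone.shape[1])
        return float(np.mean([per_phone[c, i].mean() for i, c in enumerate(chunks)]))

    print(wakeword_score(senone_post))
    ```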
  • Publication number: 20220093094
    Abstract: A natural language system may be configured to act as a participant in a conversation between two users. The system may determine when a user expression such as speech, a gesture, or the like is directed from one user to the other. The system may process input data related to the expression (such as audio data, input data, language processing result data, conversation context data, etc.) to determine if the system should interject a response to the user-to-user expression. If so, the system may process the input data to determine a response and output it. The system may track that response as part of the data related to the ongoing conversation.
    Type: Application
    Filed: December 4, 2020
    Publication date: March 24, 2022
    Inventors: Prakash Krishnan, Arindam Mandal, Siddhartha Reddy Jonnalagadda, Nikko Strom, Ariya Rastrow, Shiv Naga Prasad Vitaladevuni, Angeliki Metallinou, Vincent Auvray, Minmin Shen, Josey Diego Sandoval, Rohit Prasad, Thomas Taylor, Amotz Maimon
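    The interjection decision described in publication 20220093094 can be caricatured as a scoring function over a few conversation signals; the signals and weights below are assumptions.
    ```python
    def should_interject(asr_confidence: float,
                         nlu_matches_skill: bool,
                         conversation_is_question: bool,
                         system_mentioned: bool) -> bool:
        """Decide whether to respond to a user-to-user expression (toy heuristic)."""
        score = 0.0
        score += 0.4 if nlu_matches_skill else 0.0
        score += 0.3 if conversation_is_question else 0.0
        score += 0.5 if system_mentioned else 0.0
        score *= asr_confidence          # discount uncertain transcriptions
        return score >= 0.5

    print(should_interject(0.9, True, True, False))
    ```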
  • Publication number: 20220093093
    Abstract: A system can operate a speech-controlled device in a mode where the speech-controlled device determines that an utterance is directed at the speech-controlled device using image data showing the user speaking the utterance. If the user is directing the user's gaze at the speech-controlled device while speaking, the system may determine the utterance is system directed and thus may perform further speech processing based on the utterance. If the user's gaze is directed elsewhere, the system may determine the utterance is not system directed (for example, directed at another user) and thus the system may not perform further speech processing based on the utterance and may take other actions, for example discarding audio data of the utterance.
    Type: Application
    Filed: December 4, 2020
    Publication date: March 24, 2022
    Inventors: Prakash Krishnan, Arindam Mandal, Nikko Strom, Pradeep Natarajan, Ariya Rastrow, Shiv Naga Prasad Vitaladevuni, David Chi-Wai Tang, Aaron Challenner, Xu Zhang, Krishna Anisetty, Josey Diego Sandoval, Rohit Prasad, Premkumar Natarajan
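    A minimal sketch of gating speech processing on gaze, per publication 20220093093; the gaze-estimate input and the angle threshold are hypothetical.
    ```python
    def handle_utterance(audio_data: bytes,
                         gaze_angle_deg: float,
                         is_speaking: bool,
                         max_gaze_angle_deg: float = 15.0) -> str:
        """Continue speech processing only if the speaker is looking at the device.

        gaze_angle_deg is a hypothetical output of an image-based gaze estimator:
        the angle between the user's gaze direction and the direction of the device.
        """
        if is_speaking and gaze_angle_deg <= max_gaze_angle_deg:
            return "system-directed: forward audio for ASR/NLU"
        return "not system-directed: discard audio"

    print(handle_utterance(b"", gaze_angle_deg=8.0, is_speaking=True))
    ```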
  • Publication number: 20210398533
    Abstract: A system and method performs multilingual wakeword detection by determining a language corresponding to the wakeword. A first wakeword-detection component, which may execute using a digital-signal processor, determines that audio data includes a representation of the wakeword and determines a language corresponding to the wakeword. A second, more accurate wakeword-detection component may then process the audio data using the language to confirm that it includes the representation of the wakeword. The audio data may then be sent to a remote system for further processing.
    Type: Application
    Filed: June 28, 2021
    Publication date: December 23, 2021
    Inventors: Yixin Gao, Ming Sun, Jason Krone, Shiv Naga Prasad Vitaladevuni, Yuzong Liu
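    The two-stage multilingual flow of publication 20210398533 can be sketched as a lightweight first stage that picks the most likely wakeword language and a per-language second stage that confirms it before audio is sent on; both model calls below are stand-ins.
    ```python
    import numpy as np

    LANGUAGES = ["en-US", "de-DE", "ja-JP"]
    rng = np.random.default_rng(3)

    def first_stage(audio: np.ndarray):
        """DSP-class detector: wakeword decision plus a score per candidate language."""
        lang_scores = rng.dirichlet(np.ones(len(LANGUAGES)))     # stand-in model
        return bool(lang_scores.max() > 0.4), LANGUAGES[int(lang_scores.argmax())]

    def second_stage(audio: np.ndarray, language: str) -> bool:
        """More accurate per-language verifier (stand-in)."""
        return bool(rng.random() > 0.3)

    def on_audio(audio: np.ndarray) -> None:
        detected, language = first_stage(audio)
        if detected and second_stage(audio, language):
            print(f"wakeword confirmed in {language}; sending audio to remote system")

    on_audio(rng.normal(size=16000))
    ```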
  • Patent number: 11205420
    Abstract: A system and method performs wakeword detection using a neural network model that includes a recurrent neural network (RNN) for processing variable-length wakewords. To prevent the model from being influenced by non-wakeword speech, multiple instances of the model are created to process audio data, and each instance is configured to use weights determined by training data. The model may instead or in addition be used to process the audio data only when a likelihood that the audio data corresponds to the wakeword is greater than a threshold. The model may process the audio data as represented by groups of acoustic feature vectors; computations for feature vectors common to different groups may be re-used.
    Type: Grant
    Filed: June 10, 2019
    Date of Patent: December 21, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Gengshen Fu, Thibaud Senechal, Shiv Naga Prasad Vitaladevuni, Michael J. Rodehorst, Varun K. Nagaraja
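    A sketch of the gating idea in patent 11205420: a cheap first-pass likelihood must clear a threshold before the more expensive recurrent model processes the grouped feature vectors. The toy RNN and the threshold are stand-ins for the trained components.
    ```python
    import numpy as np

    rng = np.random.default_rng(4)
    FEAT_DIM, GROUP = 20, 10

    def first_pass_likelihood(group: np.ndarray) -> float:
        """Cheap score deciding whether the RNN should run at all (stand-in)."""
        return float(1.0 / (1.0 + np.exp(-group.mean())))

    class ToyRNN:
        """Tiny recurrent scorer standing in for the trained RNN wakeword model."""
        def __init__(self):
            self.Wx = rng.normal(size=(FEAT_DIM, 16)) * 0.1
            self.Wh = rng.normal(size=(16, 16)) * 0.1
            self.w_out = rng.normal(size=16)

        def score(self, group: np.ndarray) -> float:
            h = np.zeros(16)
            for x in group:                      # one step per acoustic feature vector
                h = np.tanh(x @ self.Wx + h @ self.Wh)
            return float(1.0 / (1.0 + np.exp(-(h @ self.w_out))))

    rnn = ToyRNN()
    group = rng.normal(size=(GROUP, FEAT_DIM))
    if first_pass_likelihood(group) > 0.4:       # assumed gating threshold
        print("RNN wakeword score:", rnn.score(group))
    ```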
  • Publication number: 20210358497
    Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
    Type: Application
    Filed: May 17, 2021
    Publication date: November 18, 2021
    Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
  • Patent number: 11132990
    Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
    Type: Grant
    Filed: June 26, 2019
    Date of Patent: September 28, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
  • Patent number: 11069353
    Abstract: A system and method performs multilingual wakeword detection by determining a language corresponding to the wakeword. A first wakeword-detection component, which may execute using a digital-signal processor, determines that audio data includes a representation of the wakeword and determines a language corresponding to the wakeword. A second, more accurate wakeword-detection component may then process the audio data using the language to confirm that it includes the representation of the wakeword. The audio data may then be sent to a remote system for further processing.
    Type: Grant
    Filed: May 6, 2019
    Date of Patent: July 20, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Yixin Gao, Ming Sun, Jason Krone, Shiv Naga Prasad Vitaladevuni, Yuzong Liu
  • Patent number: 11043218
    Abstract: A system processes audio data to detect when it includes a representation of a wakeword or of an acoustic event. The system may receive or determine acoustic features for the audio data, such as log-filterbank energy (LFBE). The acoustic features may be used by a first, wakeword-detection model to detect the wakeword; the output of this model may be further processed using a softmax function, to smooth it, and to detect spikes. The same acoustic features may also be used by a second, acoustic-event-detection model to detect the acoustic event; the output of this model may be further processed using a sigmoid function and a classifier. Another model may be used to extract additional features from the LFBE data; these additional features may be used by the other models.
    Type: Grant
    Filed: June 26, 2019
    Date of Patent: June 22, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Ming Sun, Thibaud Senechal, Yixin Gao, Anish N. Shah, Spyridon Matsoukas, Chao Wang, Shiv Naga Prasad Vitaladevuni
  • Publication number: 20210134276
    Abstract: Features are disclosed for detecting words in audio using contextual information in addition to automatic speech recognition results. A detection model can be generated and used to determine whether a particular word, such as a keyword or “wake word,” has been uttered. The detection model can operate on features derived from an audio signal, contextual information associated with generation of the audio signal, and the like. In some embodiments, the detection model can be customized for particular users or groups of users based on usage patterns associated with the users.
    Type: Application
    Filed: November 5, 2020
    Publication date: May 6, 2021
    Inventors: Rohit Prasad, Kenneth John Basye, Spyridon Matsoukas, Rajiv Ramachandran, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister
  • Patent number: 10964315
    Abstract: An approach to wakeword detection uses an explicit representation of non-wakeword speech in the form of subword (e.g., phonetic monophone) units that do not necessarily occur in the wakeword and that broadly represent general speech. These subword units are arranged in a “background” model, which at runtime essentially competes with the wakeword model such that a wakeword is less likely to be declared as occurring when the input matches that background model well. An HMM may be used with the model to locate possible occurrences of the wakeword. Features are determined from portions of the input corresponding to subword units of the wakeword detected using the HMM. A secondary classifier is then used to process the features to yield a decision of whether the wakeword occurred.
    Type: Grant
    Filed: June 30, 2017
    Date of Patent: March 30, 2021
    Assignee: Amazon Technologies, Inc.
    Inventors: Minhua Wu, Sankaran Panchapagesan, Ming Sun, Shiv Naga Prasad Vitaladevuni, Bjorn Hoffmeister, Ryan Paul Thomas, Arindam Mandal
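    The competition between the wakeword and background models in patent 10964315 can be sketched as a log-likelihood-ratio test followed by a secondary classifier over features from the aligned wakeword subword spans; the scores, features, and acceptance margin below are stand-ins for the HMM alignment the patent describes.
    ```python
    import numpy as np

    rng = np.random.default_rng(5)

    def competing_score(frame_loglikes_ww: np.ndarray,
                        frame_loglikes_bg: np.ndarray) -> float:
        """Log-likelihood ratio of the wakeword path vs. the background (monophone) path."""
        return float(frame_loglikes_ww.sum() - frame_loglikes_bg.sum())

    def secondary_classifier(subword_features: np.ndarray, bias: float = -1.0) -> bool:
        """Final decision on features taken from the aligned wakeword subword spans."""
        w = rng.normal(size=subword_features.shape[0]) * 0.5   # stand-in for a trained model
        return bool((w @ subword_features + bias) > 0.0)

    llr = competing_score(rng.normal(0.2, 1.0, size=80), rng.normal(0.0, 1.0, size=80))
    if llr > 5.0:                                  # assumed acceptance margin
        print("wakeword decision:", secondary_classifier(rng.normal(size=12)))
    ```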