Patents by Inventor Rohit Prakash

Rohit Prakash has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11990133
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for an automated calling system are disclosed. In one aspect, a method includes the actions of receiving audio data of an utterance spoken by a user who is having a telephone conversation with a bot. The actions further include determining a context of the telephone conversation. The actions further include determining a user intent of a first previous portion of the telephone conversation spoken by the user and a bot intent of a second previous portion of the telephone conversation outputted by a speech synthesizer of the bot. The actions further include, based on the audio data of the utterance, the context of the telephone conversation, the user intent, and the bot intent, generating synthesized speech of a reply by the bot to the utterance. The actions further include, providing, for output, the synthesized speech.
    Type: Grant
    Filed: July 7, 2023
    Date of Patent: May 21, 2024
    Assignee: GOOGLE LLC
    Inventors: Asaf Aharoni, Arun Narayanan, Nir Shabat, Parisa Haghani, Galen Tsai Chuang, Yaniv Leviathan, Neeraj Gaur, Pedro J. Moreno Mengibar, Rohit Prakash Prabhavalkar, Zhongdi Qu, Austin Severn Waters, Tomer Amiaz, Michiel A. U. Bacchiani
  • Publication number: 20240153498
    Abstract: A method includes receiving context biasing data that includes a set of unspoken textual utterances corresponding to a particular context. The method also includes obtaining a list of carrier phrases associated with the particular context. For each respective unspoken textual utterance, the method includes generating a corresponding training data pair that includes the respective unspoken textual utterance and a carrier phrase. For each respective training data pair, the method includes tokenizing the respective training data pair into a sequence of sub-word units, generating a first higher order textual feature representation for a corresponding sub-word unit, receiving the first higher order textual feature representation, and generating a first probability distribution over possible text units. The method also includes training a speech recognition model based on the first probability distribution over possible text units.
    Type: Application
    Filed: October 20, 2023
    Publication date: May 9, 2024
    Applicant: Google LLC
    Inventors: Tara N. Sainath, Rohit Prakash Prabhavalkar, Diamantino Antonio Caseiro, Patrick Maxim Rondon, Cyril Allauzen
  • Publication number: 20240144917
    Abstract: A method includes obtaining a base encoder from a pre-trained model, and receiving training data comprising a sequence of acoustic frames characterizing an utterance paired with a ground-truth transcription of the utterance. At each of a plurality of output steps, the method includes: generating, by the base encoder, a first encoded representation for a corresponding acoustic frame; generating, by an exporter network configured to receive a continuous sequence of first encoded representations generated by the base encoder, a second encoded representation for a corresponding acoustic frame; generating, by an exporter decoder, a probability distribution over possible logits; and determining an exporter decoder loss based on the probability distribution over possible logits generated by the exporter decoder at the corresponding output step and the ground-truth transcription.
    Type: Application
    Filed: October 25, 2023
    Publication date: May 2, 2024
    Applicant: Google LLC
    Inventors: Rami Magdi Fahmi Botros, Rohit Prakash Prabhavalkar, Johan Schalkwyk, Tara N. Sainath, Ciprian Ioan Chelba, Francoise Beaufays
  • Patent number: 11971263
    Abstract: Techniques for generating geographical units that can be used to generate delivery routes are described herein. Geospatial vector data and barrier geospatial vector data for a geographical area may be obtained. Seed points for one or more portions of the geographical area may be determined based at least in part on historical delivery volume for the geographical area. A plurality of polygons that represent the geographical area may be determined based at least in part on an algorithm that uses the seed points, the geospatial vector data, and the barrier geospatial vector data. Coordinates for a geographical unit of a plurality of geographical units that divide the geographical area may be determined based at least in part on the plurality of polygons and a polygon-to-polygon barrier aware drive time matrix that identifies a calculated cost for traveling from one polygon to another polygon using barriers identified in the barrier geospatial vector data.
    Type: Grant
    Filed: August 11, 2020
    Date of Patent: April 30, 2024
    Assignee: Amazon Technologies, Inc.
    Inventors: Rohit Malshe, Dipal Patel Gupta, Liron David Yedidsion, Abhilasha Prakash Katariya, Jin Ye, Natarajan Gautam
  • Patent number: 11966611
    Abstract: The disclosed technology relates to determining a first subset of a plurality of drives having a first zone size and a second subset of the plurality of drives having a second zone size different from the first zone size, within a redundant array of independent disks (RAID) group. A prevailing zone size between the first zone size and the second zone size is determined. One or more logical zones within the determined first subset of the plurality of drives and the determined second subset of the plurality of drives for a received input-output operation are reserved based on the determined prevailing zone size. The received input-output operation is completed within the reserved one or more logical zones within the determined first subset of the plurality of drives and the determined second subset of the plurality of drives.
    Type: Grant
    Filed: June 9, 2023
    Date of Patent: April 23, 2024
    Assignee: NETAPP, INC.
    Inventors: Rohit Shankar Singh, Douglas P. Doucette, Abhijeet Prakash Gole, Sushilkumar Gangadharan
  • Patent number: 11954503
    Abstract: The present invention provides for building a knowledgebase of dependencies between Configuration Items (CIs) associated with an IT computing environment. In operation, the present invention provides for mapping a plurality of Configuration Items (CIs) with respective one or more actions. The present invention further provides for tracking and capturing of one or more actions performed on one or more CIs in relation to resolving an activity related to a reported CI. Further, the present invention provides for identifying dependencies between one or more CIs and the reported CI based on the captured one or more actions. Furthermore, the present invention provides for building a knowledgebase of dependencies between CIs of the computing environment based on the identified dependencies between one or more CIs and the reported CI. Yet further, the present invention provides for generating visual representations of dependencies between CIs.
    Type: Grant
    Filed: May 12, 2022
    Date of Patent: April 9, 2024
    Assignee: COGNIZANT TECHNOLOGY SOLUTIONS INDIA PVT. LTD.
    Inventors: Rohit Prakash, Rohan Prakash, Yogesh Sosale Gundurao, Ambarish Poojari, Ragini Suresh, Pooja Jagadish
  • Patent number: 11948570
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for detecting utterances of a key phrase in an audio signal. One of the methods includes receiving, by a key phrase spotting system, an audio signal encoding one or more utterances; while continuing to receive the audio signal, generating, by the key phrase spotting system, an attention output using an attention mechanism that is configured to compute the attention output based on a series of encodings generated by an encoder comprising one or more neural network layers; generating, by the key phrase spotting system and using the attention output, output that indicates whether the audio signal likely encodes the key phrase; and providing, by the key phrase spotting system, the output that indicates whether the audio signal likely encodes the key phrase.
    Type: Grant
    Filed: March 9, 2022
    Date of Patent: April 2, 2024
    Assignee: Google LLC
    Inventors: Wei Li, Rohit Prakash Prabhavalkar, Kanury Kanishka Rao, Yanzhang He, Ian C. McGraw, Anton Bakhtin
  • Patent number: 11948062
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer storage media, for implementing a compressed recurrent neural network (RNN). One of the systems includes a compressed RNN, the compressed RNN comprising a plurality of recurrent layers, wherein each of the recurrent layers has a respective recurrent weight matrix and a respective inter-layer weight matrix, and wherein at least one of the recurrent layers is compressed such that a respective recurrent weight matrix of the compressed layer is defined by a first compressed weight matrix and a projection matrix and a respective inter-layer weight matrix of the compressed layer is defined by a second compressed weight matrix and the projection matrix.
    Type: Grant
    Filed: December 4, 2020
    Date of Patent: April 2, 2024
    Assignee: Google LLC
    Inventors: Ouais Alsharif, Rohit Prakash Prabhavalkar, Ian C. McGraw, Antoine Jean Bruguier
  • Patent number: 11942076
    Abstract: A method includes receiving audio data encoding an utterance spoken by a native speaker of a first language, and receiving a biasing term list including one or more terms in a second language different than the first language. The method also includes processing, using a speech recognition model, acoustic features derived from the audio data to generate speech recognition scores for both wordpieces and corresponding phoneme sequences in the first language. The method also includes rescoring the speech recognition scores for the phoneme sequences based on the one or more terms in the biasing term list, and executing, using the speech recognition scores for the wordpieces and the rescored speech recognition scores for the phoneme sequences, a decoding graph to generate a transcription for the utterance.
    Type: Grant
    Filed: February 16, 2022
    Date of Patent: March 26, 2024
    Assignee: Google LLC
    Inventors: Ke Hu, Golan Pundak, Rohit Prakash Prabhavalkar, Antoine Jean Bruguier, Tara N. Sainath
  • Patent number: 11922932
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable storage media, for speech recognition using attention-based sequence-to-sequence models. In some implementations, audio data indicating acoustic characteristics of an utterance is received. A sequence of feature vectors indicative of the acoustic characteristics of the utterance is generated. The sequence of feature vectors is processed using a speech recognition model that has been trained using a loss function that uses a set of speech recognition hypothesis samples, the speech recognition model including an encoder, an attention module, and a decoder. The encoder and decoder each include one or more recurrent neural network layers. A sequence of output vectors representing distributions over a predetermined set of linguistic units is obtained. A transcription for the utterance is obtained based on the sequence of output vectors. Data indicating the transcription of the utterance is provided.
    Type: Grant
    Filed: March 31, 2023
    Date of Patent: March 5, 2024
    Assignee: Google LLC
    Inventors: Rohit Prakash Prabhavalkar, Tara N. Sainath, Yonghui Wu, Patrick An Phu Nguyen, Zhifeng Chen, Chung-Cheng Chiu, Anjuli Patricia Kannan
  • Patent number: 11908461
    Abstract: A method of performing speech recognition using a two-pass deliberation architecture includes receiving a first-pass hypothesis and an encoded acoustic frame and encoding the first-pass hypothesis at a hypothesis encoder. The first-pass hypothesis is generated by a recurrent neural network (RNN) decoder model for the encoded acoustic frame. The method also includes generating, using a first attention mechanism attending to the encoded acoustic frame, a first context vector, and generating, using a second attention mechanism attending to the encoded first-pass hypothesis, a second context vector. The method also includes decoding the first context vector and the second context vector at a context vector decoder to form a second-pass hypothesis.
    Type: Grant
    Filed: January 14, 2021
    Date of Patent: February 20, 2024
    Assignee: Google LLC
    Inventors: Ke Hu, Tara N. Sainath, Ruoming Pang, Rohit Prakash Prabhavalkar
  • Publication number: 20240028829
    Abstract: A method includes receiving training data that includes a set of unspoken textual utterances. For each respective unspoken textual utterance, the method includes, tokenizing the respective textual utterance into a sequence of sub-word units, generating a first higher order textual feature representation for a corresponding sub-word unit tokenized from the respective unspoken textual utterance, receiving the first higher order textual feature representation generated by a text encoder, and generating a first probability distribution over possible text units. The method also includes training an encoder based on the first probability distribution over possible text units generated by a first-pass decoder for each respective unspoken textual utterance in the set of unspoken textual utterances.
    Type: Application
    Filed: July 1, 2023
    Publication date: January 25, 2024
    Applicant: Google LLC
    Inventors: Tara N. Sainath, Zhouyuan Huo, Zhehuai Chen, Yu Zhang, Weiran Wang, Trevor Strohman, Rohit Prakash Prabhavalkar, Bo Li, Ankur Bapna
  • Publication number: 20230352027
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for an automated calling system are disclosed. In one aspect, a method includes the actions of receiving audio data of an utterance spoken by a user who is having a telephone conversation with a bot. The actions further include determining a context of the telephone conversation. The actions further include determining a user intent of a first previous portion of the telephone conversation spoken by the user and a bot intent of a second previous portion of the telephone conversation outputted by a speech synthesizer of the bot. The actions further include, based on the audio data of the utterance, the context of the telephone conversation, the user intent, and the bot intent, generating synthesized speech of a reply by the bot to the utterance. The actions further include, providing, for output, the synthesized speech.
    Type: Application
    Filed: July 7, 2023
    Publication date: November 2, 2023
    Inventors: Asaf Aharoni, Arun Narayanan, Nir Shabat, Parisa Haghani, Galen Tsai Chuang, Yaniv Leviathan, Neeraj Gaur, Pedro J. Moreno Mengibar, Rohit Prakash Prabhavalkar, Zhongdi Qu, Austin Severn Waters, Tomer Amiaz, Michiel A. U. Bacchiani
  • Publication number: 20230343332
    Abstract: A joint segmenting and ASR model includes an encoder and decoder. The encoder is configured to: receive a sequence of acoustic frames characterizing one or more utterances; and generate, at each output step, a higher order feature representation for a corresponding acoustic frame. The decoder is configured to: receive the higher order feature representation and generate, at each output step: a probability distribution over possible speech recognition hypotheses, and an indication of whether the corresponding output step corresponds to an end of speech segment. The joint segmenting and ASR model is trained on a set of training samples, each training sample including: audio data characterizing a spoken utterance; and a corresponding transcription of the spoken utterance, the corresponding transcription having an end of speech segment ground truth token inserted into the corresponding transcription automatically based on a set of heuristic-based rules and exceptions applied to the training sample.
    Type: Application
    Filed: April 20, 2023
    Publication date: October 26, 2023
    Applicant: Google LLC
    Inventors: Ronny Huang, Shuo-yiin Chang, David Rybach, Rohit Prakash Prabhavalkar, Tara N. Sainath, Cyril Allauzen, Charles Caleb Peyser, Zhiyun Lu
  • Publication number: 20230317021
    Abstract: A multi-mode display includes a mode selector to select one of a plurality of modes, each of the modes having a different light configuration, wherein one mode comprises a reduced color space mode, and one or more light sources controlled by the mode selector, the one or more light sources used to display content to a user with the multi-mode display.
    Type: Application
    Filed: April 4, 2023
    Publication date: October 5, 2023
    Applicant: Avegant Corp.
    Inventors: Aaron Matthew Eash, Edward Chia Ning Tang, Warren Cornelius Welch, III, Christopher David Westra, Rohit Prakash, William Tze-Tse Chien
  • Publication number: 20230298570
    Abstract: A method includes generating, using an audio encoder, a higher-order feature representation for each acoustic frame in a sequence of acoustic frames; generating, using a decoder, based on the higher-order feature representation, a plurality of speech recognition hypotheses, each hypothesis corresponding to a candidate transcription of an utterance and having an associated first likelihood score; generating, using an external language model, for each speech recognition hypothesis, a second likelihood score; determining, using a learnable fusion module, for each speech recognition hypothesis, a set of fusion weights based on the higher-order feature representation and the speech recognition hypothesis; and generating, using the learnable fusion module, for each speech recognition hypothesis, a third likelihood score based on the first likelihood score, the second likelihood score, and the set of fusion weights, the audio encoder and decoder trained using minimum additive error rate training in the presence of t
    Type: Application
    Filed: March 21, 2023
    Publication date: September 21, 2023
    Applicant: Google LLC
    Inventors: Weiran Wang, Tongzhou Chen, Tara N. Sainath, Ehsan Variani, Rohit Prakash Prabhavalkar, Ronny Huang, Bhuvana Ramabhadran, Neeraj Gaur, Sepand Mavandadi, Charles Caleb Peyser, Trevor Strohman, Yanzhang He, David Rybach
  • Publication number: 20230274736
    Abstract: A method of biasing speech recognition includes receiving audio data encoding an utterance and obtaining a set of one or more biasing phrases corresponding to a context of the utterance. Each biasing phrase in the set of one or more biasing phrases includes one or more words. The method also includes processing, using a speech recognition model, acoustic features derived from the audio data and grapheme and phoneme data derived from the set of one or more biasing phrases to generate an output of the speech recognition model. The method also includes determining a transcription for the utterance based on the output of the speech recognition model.
    Type: Application
    Filed: May 4, 2023
    Publication date: August 31, 2023
    Applicant: Google LLC
    Inventors: Rohit Prakash Prabhavalkar, Golan Pundak, Tara N. Sainath, Antoine Jean Bruguier
  • Patent number: 11741966
    Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for an automated calling system are disclosed. In one aspect, a method includes the actions of receiving audio data of an utterance spoken by a user who is having a telephone conversation with a bot. The actions further include determining a context of the telephone conversation. The actions further include determining a user intent of a first previous portion of the telephone conversation spoken by the user and a bot intent of a second previous portion of the telephone conversation outputted by a speech synthesizer of the bot. The actions further include, based on the audio data of the utterance, the context of the telephone conversation, the user intent, and the bot intent, generating synthesized speech of a reply by the bot to the utterance. The actions further include, providing, for output, the synthesized speech.
    Type: Grant
    Filed: October 12, 2022
    Date of Patent: August 29, 2023
    Assignee: GOOGLE LLC
    Inventors: Asaf Aharoni, Arun Narayanan, Nir Shabat, Parisa Haghani, Galen Tsai Chuang, Yaniv Leviathan, Neeraj Gaur, Pedro J. Moreno Mengibar, Rohit Prakash Prabhavalkar, Zhongdi Qu, Austin Severn Waters, Tomer Amiaz, Michiel A. U. Bacchiani
  • Publication number: 20230259368
    Abstract: The present invention provides for building a knowledgebase of dependencies between Configuration Items (CIs) associated with an IT computing environment. In operation, the present invention provides for mapping a plurality of Configuration Items (CIs) with respective one or more actions. The present invention further provides for tracking and capturing of one or more actions performed on one or more CIs in relation to resolving an activity related to a reported CI. Further, the present invention provides for identifying dependencies between one or more CIs and the reported CI based on the captured one or more actions. Furthermore, the present invention provides for building a knowledgebase of dependencies between CIs of the computing environment based on the identified dependencies between one or more CIs and the reported CI. Yet further, the present invention provides for generating visual representations of dependencies between CIs.
    Type: Application
    Filed: May 12, 2022
    Publication date: August 17, 2023
    Inventors: Rohit Prakash, Rohan Prakash, Yogesh Sosale Gundurao, Ambarish Poojari, Ragini Suresh, Pooja Jagadish
  • Publication number: 20230237995
    Abstract: Methods, systems, and apparatus, including computer programs encoded on computer-readable storage media, for speech recognition using attention-based sequence-to-sequence models. In some implementations, audio data indicating acoustic characteristics of an utterance is received. A sequence of feature vectors indicative of the acoustic characteristics of the utterance is generated. The sequence of feature vectors is processed using a speech recognition model that has been trained using a loss function that uses a set of speech recognition hypothesis samples, the speech recognition model including an encoder, an attention module, and a decoder. The encoder and decoder each include one or more recurrent neural network layers. A sequence of output vectors representing distributions over a predetermined set of linguistic units is obtained. A transcription for the utterance is obtained based on the sequence of output vectors. Data indicating the transcription of the utterance is provided.
    Type: Application
    Filed: March 31, 2023
    Publication date: July 27, 2023
    Applicant: Google LLC
    Inventors: Rohit Prakash Prabhavalkar, Tara N. Sainath, Yonghui Wu, Patrick An Phu Nguyen, Zhifeng Chen, Chung-Cheng Chiu, Anjuli Kannan