Patents by Inventor Prashant SRIDHAR

Prashant SRIDHAR has filed for patents to protect the following inventions. This listing includes both pending patent applications and patents already granted by the United States Patent and Trademark Office (USPTO).

  • Patent number: 11922951
    Abstract: Techniques are disclosed that enable processing of audio data to generate one or more refined versions of the audio data, where each refined version isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version that isolates utterance(s) of a single human speaker by processing a spectrogram representation of the audio data (generated by applying a frequency transformation to the audio data) with a mask; the mask is produced by a trained voice filter model from the spectrogram and a speaker embedding for that speaker. Output generated by the trained voice filter model is then processed with the inverse of the frequency transformation to generate the refined audio data.
    Type: Grant
    Filed: January 3, 2022
    Date of Patent: March 5, 2024
    Assignee: GOOGLE LLC
    Inventors: Quan Wang, Prashant Sridhar, Ignacio Lopez Moreno, Hannah Muckenhirn
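    The masking pipeline described in the abstract above can be sketched end to end. This is a minimal illustration, not the patented implementation: the real voice filter model is a trained neural network, so `mask_model` below is a hypothetical stand-in that takes the spectrogram magnitude and a speaker embedding and returns a soft mask, and an STFT plays the role of the frequency transformation.

    ```python
    import numpy as np

    FRAME, HOP = 256, 128
    # Periodic Hann window: with 50% overlap it sums to exactly 1,
    # so plain overlap-add inverts the transform in the interior.
    WINDOW = 0.5 - 0.5 * np.cos(2 * np.pi * np.arange(FRAME) / FRAME)

    def stft(x):
        # Frequency transformation: windowed frames -> complex spectrogram (T, F).
        starts = range(0, len(x) - FRAME + 1, HOP)
        return np.array([np.fft.rfft(WINDOW * x[s:s + FRAME]) for s in starts])

    def istft(spec):
        # Inverse of the frequency transformation via overlap-add.
        x = np.zeros(HOP * (len(spec) - 1) + FRAME)
        for k, frame in enumerate(spec):
            x[k * HOP:k * HOP + FRAME] += np.fft.irfft(frame, FRAME)
        return x

    def voice_filter(audio, speaker_embedding, mask_model):
        # 1. Spectrogram of the mixed audio.
        spec = stft(audio)
        # 2. Mask from the (stand-in) voice filter model, conditioned on the
        #    spectrogram magnitude and the target speaker's embedding.
        mask = mask_model(np.abs(spec), speaker_embedding)
        # 3. Apply the mask, then invert the transform to get refined audio.
        return istft(spec * mask)
    ```

    With an all-ones mask the pipeline reconstructs the interior of its input, a convenient sanity check that the inverse transformation is implemented correctly.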
  • Patent number: 11646011
    Abstract: Methods and systems are provided for training and/or using a language selection model to determine the particular language of a spoken utterance captured in audio data. Features of the audio data can be processed using the trained language selection model to generate a predicted probability for each of N different languages, and a particular language is selected based on the generated probabilities. Speech recognition results for the selected language can then be utilized. Many implementations are directed to training the language selection model utilizing tuple losses in lieu of traditional cross-entropy losses, which can make training more efficient and/or yield a more accurate and robust model, thereby mitigating erroneous language selections for spoken utterances.
    Type: Grant
    Filed: June 22, 2022
    Date of Patent: May 9, 2023
    Assignee: GOOGLE LLC
    Inventors: Li Wan, Yang Yu, Prashant Sridhar, Ignacio Lopez Moreno, Quan Wang
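    The tuple loss mentioned in the abstract can be contrasted with ordinary cross-entropy on a single logit vector. The sketch below assumes a pairwise ("tuplemax"-style) formulation that averages a two-way softmax loss over every (target language, other language) pair rather than normalizing over all N languages at once; the exact loss used in the patent may differ.

    ```python
    import numpy as np

    def cross_entropy(logits, target):
        # Standard softmax cross-entropy over all N languages at once.
        z = logits - logits.max()
        return float(np.log(np.exp(z).sum()) - z[target])

    def tuple_loss(logits, target):
        # Tuple-style loss: average the two-way softmax loss over every
        # (target, other-language) pair instead of one N-way softmax.
        pair_losses = []
        for j in range(len(logits)):
            if j == target:
                continue
            pair = np.array([logits[target], logits[j]])
            z = pair - pair.max()
            pair_losses.append(float(np.log(np.exp(z).sum()) - z[0]))
        return float(np.mean(pair_losses))
    ```

    Because each pair is normalized independently, a confusable non-target language cannot "steal" probability mass from the comparison between the target and the remaining languages, which is one intuition for why tuple losses can train more robust selection models.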
  • Publication number: 20220328035
    Abstract: Methods and systems are provided for training and/or using a language selection model to determine the particular language of a spoken utterance captured in audio data. Features of the audio data can be processed using the trained language selection model to generate a predicted probability for each of N different languages, and a particular language is selected based on the generated probabilities. Speech recognition results for the selected language can then be utilized. Many implementations are directed to training the language selection model utilizing tuple losses in lieu of traditional cross-entropy losses, which can make training more efficient and/or yield a more accurate and robust model, thereby mitigating erroneous language selections for spoken utterances.
    Type: Application
    Filed: June 22, 2022
    Publication date: October 13, 2022
    Inventors: Li Wan, Yang Yu, Prashant Sridhar, Ignacio Lopez Moreno, Quan Wang
  • Publication number: 20220319501
    Abstract: The amount of future context used in a speech processing application allows a tradeoff between performance and the delay in providing results to users. Existing speech processing applications may be trained with a specified future-context size and perform poorly when used in production with a different one. Training with a stochastic future context instead allows a trained neural network to be used in production with varying amounts of future context. During each update step in training, a future-context size may be sampled from a probability distribution, used to mask the neural network, and used to compute an output of the masked network. The output may then be used to compute a loss value and update the network's parameters. The trained network can then serve production speech processing applications with different amounts of future context, providing greater flexibility.
    Type: Application
    Filed: November 18, 2021
    Publication date: October 6, 2022
    Inventors: Kwangyoun Kim, Felix Wu, Prashant Sridhar, Kyu Jeong Han
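    The stochastic-future-context idea above can be illustrated with a toy self-attention layer: sample a right-context size r for each update step and mask out frames more than r steps in the future. The attention layer here is a hypothetical minimal stand-in for whatever network a real speech application would use, and the loss/parameter update is omitted.

    ```python
    import numpy as np

    def future_context_mask(T, r):
        # Frame t may attend to frame j only when j <= t + r,
        # i.e. at most r frames of future context.
        idx = np.arange(T)
        return idx[None, :] <= idx[:, None] + r

    def masked_self_attention(X, r):
        # Toy single-head self-attention restricted by the mask above.
        scores = X @ X.T / np.sqrt(X.shape[1])
        scores = np.where(future_context_mask(len(X), r), scores, -np.inf)
        w = np.exp(scores - scores.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)
        return w @ X

    def training_step(X, rng, max_future=8):
        # Stochastic future context: sample r from a (here uniform)
        # distribution for this update, then run the masked forward pass.
        r = int(rng.integers(0, max_future + 1))
        return r, masked_self_attention(X, r)
    ```

    At inference time the same network can be run with whatever fixed r the latency budget allows, since training has exposed it to the full range of context sizes.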
  • Patent number: 11410641
    Abstract: Methods and systems are provided for training and/or using a language selection model to determine the particular language of a spoken utterance captured in audio data. Features of the audio data can be processed using the trained language selection model to generate a predicted probability for each of N different languages, and a particular language is selected based on the generated probabilities. Speech recognition results for the selected language can then be utilized. Many implementations are directed to training the language selection model utilizing tuple losses in lieu of traditional cross-entropy losses, which can make training more efficient and/or yield a more accurate and robust model, thereby mitigating erroneous language selections for spoken utterances.
    Type: Grant
    Filed: November 27, 2019
    Date of Patent: August 9, 2022
    Assignee: GOOGLE LLC
    Inventors: Li Wan, Yang Yu, Prashant Sridhar, Ignacio Lopez Moreno, Quan Wang
  • Patent number: 11379792
    Abstract: An inventory management server is provided. The server includes at least one processor and at least one memory. The memory includes computer program code configured to cause the server at least to: receive, from a payment network, tracking data assigned to a product; interrogate a mapping table of product-to-tracking-data assignments for the presence of the received tracking data; update an inventory database for the product stocked in the merchant's inventory in response to detecting the received tracking data; and transmit acknowledgement data indicative of the inventory database update. The tracking data is transmitted by a merchant via a payment terminal in communication with the payment network.
    Type: Grant
    Filed: June 16, 2017
    Date of Patent: July 5, 2022
    Assignee: MASTERCARD ASIA/PACIFIC PTE. LTD.
    Inventors: Hao Tang, Senxian Zhuo, Xijing Wang, Bensam Joyson, Naman Aggarwal, Donghao Huang, Prashant Sridhar, Martin Collings, Perry Kick
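    The server behaviour recited in the abstract maps onto a small sketch: look the received tracking data up in the mapping table, update the inventory database on a hit, and return an acknowledgement. The decrement-on-sale update and the dictionary-backed "database" are illustrative assumptions, not details from the patent.

    ```python
    from dataclasses import dataclass, field

    @dataclass
    class InventoryServer:
        # Mapping table: tracking data -> product id, assigned at stocking time.
        mapping_table: dict = field(default_factory=dict)
        # Inventory "database": product id -> stock count.
        inventory: dict = field(default_factory=dict)

        def on_tracking_data(self, tracking_data):
            # Interrogate the mapping table for the received tracking data.
            product = self.mapping_table.get(tracking_data)
            if product is None:
                # Unknown tracking data: acknowledge without updating.
                return {"ack": False}
            # Assumed update rule: the payment terminal reported a sale,
            # so decrement the stock of the matched product.
            self.inventory[product] = self.inventory.get(product, 0) - 1
            return {"ack": True, "product": product,
                    "stock": self.inventory[product]}
    ```

    In the patented system the tracking data would arrive over a payment network from a merchant's payment terminal; here `on_tracking_data` is simply called directly.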
  • Publication number: 20220122611
    Abstract: Techniques are disclosed that enable processing of audio data to generate one or more refined versions of the audio data, where each refined version isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version that isolates utterance(s) of a single human speaker by processing a spectrogram representation of the audio data (generated by applying a frequency transformation to the audio data) with a mask; the mask is produced by a trained voice filter model from the spectrogram and a speaker embedding for that speaker. Output generated by the trained voice filter model is then processed with the inverse of the frequency transformation to generate the refined audio data.
    Type: Application
    Filed: January 3, 2022
    Publication date: April 21, 2022
    Inventors: Quan Wang, Prashant Sridhar, Ignacio Lopez Moreno, Hannah Muckenhirn
  • Patent number: 11217254
    Abstract: Techniques are disclosed that enable processing of audio data to generate one or more refined versions of the audio data, where each refined version isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version that isolates utterance(s) of a single human speaker by processing a spectrogram representation of the audio data (generated by applying a frequency transformation to the audio data) with a mask; the mask is produced by a trained voice filter model from the spectrogram and a speaker embedding for that speaker. Output generated by the trained voice filter model is then processed with the inverse of the frequency transformation to generate the refined audio data.
    Type: Grant
    Filed: October 10, 2019
    Date of Patent: January 4, 2022
    Assignee: GOOGLE LLC
    Inventors: Quan Wang, Prashant Sridhar, Ignacio Lopez Moreno, Hannah Muckenhirn
  • Publication number: 20200335083
    Abstract: Methods and systems are provided for training and/or using a language selection model to determine the particular language of a spoken utterance captured in audio data. Features of the audio data can be processed using the trained language selection model to generate a predicted probability for each of N different languages, and a particular language is selected based on the generated probabilities. Speech recognition results for the selected language can then be utilized. Many implementations are directed to training the language selection model utilizing tuple losses in lieu of traditional cross-entropy losses, which can make training more efficient and/or yield a more accurate and robust model, thereby mitigating erroneous language selections for spoken utterances.
    Type: Application
    Filed: November 27, 2019
    Publication date: October 22, 2020
    Inventors: Li Wan, Yang Yu, Prashant Sridhar, Ignacio Lopez Moreno, Quan Wang
  • Publication number: 20200202869
    Abstract: Techniques are disclosed that enable processing of audio data to generate one or more refined versions of the audio data, where each refined version isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version that isolates utterance(s) of a single human speaker by processing a spectrogram representation of the audio data (generated by applying a frequency transformation to the audio data) with a mask; the mask is produced by a trained voice filter model from the spectrogram and a speaker embedding for that speaker. Output generated by the trained voice filter model is then processed with the inverse of the frequency transformation to generate the refined audio data.
    Type: Application
    Filed: October 10, 2019
    Publication date: June 25, 2020
    Inventors: Quan Wang, Prashant Sridhar, Ignacio Lopez Moreno, Hannah Muckenhirn
  • Publication number: 20170372264
    Abstract: An inventory management server is provided. The server includes at least one processor and at least one memory. The memory includes computer program code configured to cause the server at least to: receive, from a payment network, tracking data assigned to a product; interrogate a mapping table of product-to-tracking-data assignments for the presence of the received tracking data; update an inventory database for the product stocked in the merchant's inventory in response to detecting the received tracking data; and transmit acknowledgement data indicative of the inventory database update. The tracking data is transmitted by a merchant via a payment terminal in communication with the payment network.
    Type: Application
    Filed: June 16, 2017
    Publication date: December 28, 2017
    Inventors: Hao Tang, Senxian Zhuo, Xijing Wang, Bensam Joyson, Naman Aggarwal, Donghao Huang, Prashant Sridhar, Martin Collings, Perry Kick
  • Publication number: 20170201377
    Abstract: There is provided a data-processor-implemented method for dynamic authentication of an object, along with non-transitory computer-readable storage media and systems for carrying out such dynamic authentication.
    Type: Application
    Filed: January 9, 2017
    Publication date: July 13, 2017
    Applicant: MASTERCARD ASIA/PACIFIC PTE LTD
    Inventors: Hao TANG, Xijing WANG, Senxian ZHUO, Yong-How CHIN, Jiaming LI, Bensam JOYSON, Donghao HUANG, Martin COLLINGS, Prashant SRIDHAR, Perry KICK