Patents by Inventor Prashant Sridhar
Prashant Sridhar has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11922951
Abstract: Techniques are disclosed that enable processing of audio data to generate one or more refined versions of audio data, where each of the refined versions isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version of audio data that isolates utterance(s) of a single human speaker by processing a spectrogram representation of the audio data (generated by processing the audio data with a frequency transformation) using a mask generated by processing the spectrogram of the audio data and a speaker embedding for the single human speaker using a trained voice filter model. Output generated over the trained voice filter model is processed using an inverse of the frequency transformation to generate the refined audio data.
Type: Grant
Filed: January 3, 2022
Date of Patent: March 5, 2024
Assignee: GOOGLE LLC
Inventors: Quan Wang, Prashant Sridhar, Ignacio Lopez Moreno, Hannah Muckenhirn
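The pipeline this abstract describes — frequency transform, mask from a voice filter model plus speaker embedding, inverse transform — can be sketched as follows. This is an illustrative NumPy sketch only, not the patented implementation; `voice_filter` is a hypothetical stand-in for the trained model (here a pass-through mask), and the frame size and embedding dimension are arbitrary.

```python
import numpy as np

def refine_audio(audio, speaker_embedding, voice_filter, frame=256):
    # Frequency transformation: frame the audio and take an FFT per frame
    n = len(audio) // frame * frame
    frames = audio[:n].reshape(-1, frame)
    spectrogram = np.fft.rfft(frames, axis=-1)
    # The voice filter model predicts a per-bin mask from the spectrogram
    # magnitude and the target speaker's embedding
    mask = voice_filter(np.abs(spectrogram), speaker_embedding)
    # Apply the mask, then invert the frequency transformation
    refined = np.fft.irfft(spectrogram * mask, n=frame, axis=-1)
    return refined.reshape(-1)

# Stand-in "model": an all-ones mask, so output equals input;
# a real voice filter model would be learned from data
dummy_filter = lambda mag, emb: np.ones_like(mag)
audio = np.sin(np.linspace(0.0, 100.0, 1024))
refined = refine_audio(audio, np.zeros(64), dummy_filter)
```

With a learned mask, bins dominated by other speakers would be attenuated rather than passed through unchanged.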
-
Patent number: 11646011
Abstract: Methods and systems for training and/or using a language selection model for use in determining a particular language of a spoken utterance captured in audio data. Features of the audio data can be processed using the trained language selection model to generate a predicted probability for each of N different languages, and a particular language selected based on the generated probabilities. Speech recognition results for the particular language can be utilized responsive to selecting the particular language of the spoken utterance. Many implementations are directed to training the language selection model utilizing tuple losses in lieu of traditional cross-entropy losses. Training the language selection model utilizing the tuple losses can result in more efficient training and/or a more accurate and/or robust model, thereby mitigating erroneous language selections for spoken utterances.
Type: Grant
Filed: June 22, 2022
Date of Patent: May 9, 2023
Assignee: GOOGLE LLC
Inventors: Li Wan, Yang Yu, Prashant Sridhar, Ignacio Lopez Moreno, Quan Wang
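The selection step described in the abstract above — per-language probabilities from model scores, then picking the most probable language — can be sketched as follows. This is an illustrative sketch, not the patented method: the candidate language set, the raw scores, and the softmax normalization are all assumptions (the patent's tuple-loss training is not reproduced here).

```python
import numpy as np

def select_language(scores, languages):
    # Normalize the model's per-language scores into probabilities
    # (numerically stable softmax)
    exp = np.exp(scores - scores.max())
    probs = exp / exp.sum()
    # Select the language with the highest predicted probability;
    # speech recognition results for this language would then be used
    i = int(np.argmax(probs))
    return languages[i], float(probs[i])

languages = ["en-US", "es-ES", "fr-FR"]   # hypothetical candidate set
lang, prob = select_language(np.array([2.0, 0.5, -1.0]), languages)
```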
-
Publication number: 20220328035
Abstract: Methods and systems for training and/or using a language selection model for use in determining a particular language of a spoken utterance captured in audio data. Features of the audio data can be processed using the trained language selection model to generate a predicted probability for each of N different languages, and a particular language selected based on the generated probabilities. Speech recognition results for the particular language can be utilized responsive to selecting the particular language of the spoken utterance. Many implementations are directed to training the language selection model utilizing tuple losses in lieu of traditional cross-entropy losses. Training the language selection model utilizing the tuple losses can result in more efficient training and/or a more accurate and/or robust model, thereby mitigating erroneous language selections for spoken utterances.
Type: Application
Filed: June 22, 2022
Publication date: October 13, 2022
Inventors: Li Wan, Yang Yu, Prashant Sridhar, Ignacio Lopez Moreno, Quan Wang
-
Publication number: 20220319501
Abstract: The amount of future context used in a speech processing application allows for tradeoffs between performance and the delay in providing results to users. Existing speech processing applications may be trained with a specified future context size and perform poorly when used in production with a different future context size. A speech processing application trained using a stochastic future context allows a trained neural network to be used in production with different amounts of future context. During an update step in training, a future-context size may be sampled from a probability distribution and used to mask a neural network, and an output of the masked neural network is computed. The output may then be used to compute a loss value and update parameters of the neural network. The trained neural network may then be used in production with different amounts of future context, providing greater flexibility for production speech processing applications.
Type: Application
Filed: November 18, 2021
Publication date: October 6, 2022
Inventors: Kwangyoun Kim, Felix Wu, Prashant Sridhar, Kyu Jeong Han
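The update step described above — sample a future-context size, mask the network accordingly — can be sketched as a lookahead mask over sequence positions. This is an illustrative sketch under stated assumptions: the uniform sampling distribution, sequence length, and maximum lookahead are all hypothetical, and the actual loss computation and parameter update are only indicated in comments.

```python
import numpy as np

rng = np.random.default_rng(0)

def lookahead_mask(seq_len, future_context):
    # Position i may attend to positions j <= i + future_context,
    # so the sampled size bounds how much lookahead each frame sees
    idx = np.arange(seq_len)
    return idx[None, :] <= idx[:, None] + future_context

def training_update(seq_len=6, max_future=8):
    # Sample a future-context size from a probability distribution
    # (uniform here; any distribution could be used)
    k = int(rng.integers(0, max_future + 1))
    mask = lookahead_mask(seq_len, k)
    # A real update would run the masked network, compute a loss from
    # its output, and update the network parameters
    return k, mask

k, mask = training_update()
```

Because the mask varies across updates, the trained network can later run with whatever amount of future context the production latency budget allows.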
-
Patent number: 11410641
Abstract: Methods and systems for training and/or using a language selection model for use in determining a particular language of a spoken utterance captured in audio data. Features of the audio data can be processed using the trained language selection model to generate a predicted probability for each of N different languages, and a particular language selected based on the generated probabilities. Speech recognition results for the particular language can be utilized responsive to selecting the particular language of the spoken utterance. Many implementations are directed to training the language selection model utilizing tuple losses in lieu of traditional cross-entropy losses. Training the language selection model utilizing the tuple losses can result in more efficient training and/or a more accurate and/or robust model, thereby mitigating erroneous language selections for spoken utterances.
Type: Grant
Filed: November 27, 2019
Date of Patent: August 9, 2022
Assignee: GOOGLE LLC
Inventors: Li Wan, Yang Yu, Prashant Sridhar, Ignacio Lopez Moreno, Quan Wang
-
Patent number: 11379792
Abstract: An inventory management server is provided. The inventory management server includes at least one processor and at least one memory. The memory includes computer program code configured to cause the inventory management server at least to: receive tracking data assigned to a product from a payment network; interrogate a mapping table, containing assigned product-to-tracking-data information, for the presence of the received tracking data; update an inventory database of the product stocked in the merchant's inventory in response to detecting the presence of the received tracking data; and transmit acknowledgement data indicative of the inventory database update. The tracking data is transmitted by a merchant via a payment terminal in communication with the payment network.
Type: Grant
Filed: June 16, 2017
Date of Patent: July 5, 2022
Assignee: MASTERCARD ASIA/PACIFIC PTE. LTD.
Inventors: Hao Tang, Senxian Zhuo, Xijing Wang, Bensam Joyson, Naman Aggarwal, Donghao Huang, Prashant Sridhar, Martin Collings, Perry Kick
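The server-side flow in the abstract above — receive tracking data, interrogate the mapping table, update the inventory database, acknowledge — can be sketched as follows. This is an illustrative sketch only: the tracking codes, SKU names, dict-based "database", and the choice to increment the stock count are all assumptions (the abstract says only that the inventory is updated, not in which direction).

```python
def handle_tracking_data(tracking_data, mapping_table, inventory):
    # Interrogate the mapping table for the received tracking data
    product = mapping_table.get(tracking_data)
    if product is None:
        # Tracking data not present: no inventory update to acknowledge
        return {"ack": False}
    # Update the inventory database entry for the mapped product
    inventory[product] = inventory.get(product, 0) + 1
    # Transmit acknowledgement data indicative of the update
    return {"ack": True, "product": product, "stock": inventory[product]}

mapping = {"TRK-001": "SKU-42"}   # hypothetical product-to-tracking-data map
stock = {"SKU-42": 9}             # hypothetical inventory database
ack = handle_tracking_data("TRK-001", mapping, stock)
```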
-
Publication number: 20220122611
Abstract: Techniques are disclosed that enable processing of audio data to generate one or more refined versions of audio data, where each of the refined versions isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version of audio data that isolates utterance(s) of a single human speaker by processing a spectrogram representation of the audio data (generated by processing the audio data with a frequency transformation) using a mask generated by processing the spectrogram of the audio data and a speaker embedding for the single human speaker using a trained voice filter model. Output generated over the trained voice filter model is processed using an inverse of the frequency transformation to generate the refined audio data.
Type: Application
Filed: January 3, 2022
Publication date: April 21, 2022
Inventors: Quan Wang, Prashant Sridhar, Ignacio Lopez Moreno, Hannah Muckenhirn
-
Patent number: 11217254
Abstract: Techniques are disclosed that enable processing of audio data to generate one or more refined versions of audio data, where each of the refined versions isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version of audio data that isolates utterance(s) of a single human speaker by processing a spectrogram representation of the audio data (generated by processing the audio data with a frequency transformation) using a mask generated by processing the spectrogram of the audio data and a speaker embedding for the single human speaker using a trained voice filter model. Output generated over the trained voice filter model is processed using an inverse of the frequency transformation to generate the refined audio data.
Type: Grant
Filed: October 10, 2019
Date of Patent: January 4, 2022
Assignee: GOOGLE LLC
Inventors: Quan Wang, Prashant Sridhar, Ignacio Lopez Moreno, Hannah Muckenhirn
-
Publication number: 20200335083
Abstract: Methods and systems for training and/or using a language selection model for use in determining a particular language of a spoken utterance captured in audio data. Features of the audio data can be processed using the trained language selection model to generate a predicted probability for each of N different languages, and a particular language selected based on the generated probabilities. Speech recognition results for the particular language can be utilized responsive to selecting the particular language of the spoken utterance. Many implementations are directed to training the language selection model utilizing tuple losses in lieu of traditional cross-entropy losses. Training the language selection model utilizing the tuple losses can result in more efficient training and/or a more accurate and/or robust model, thereby mitigating erroneous language selections for spoken utterances.
Type: Application
Filed: November 27, 2019
Publication date: October 22, 2020
Inventors: Li Wan, Yang Yu, Prashant Sridhar, Ignacio Lopez Moreno, Quan Wang
-
Publication number: 20200202869
Abstract: Techniques are disclosed that enable processing of audio data to generate one or more refined versions of audio data, where each of the refined versions isolates one or more utterances of a single respective human speaker. Various implementations generate a refined version of audio data that isolates utterance(s) of a single human speaker by processing a spectrogram representation of the audio data (generated by processing the audio data with a frequency transformation) using a mask generated by processing the spectrogram of the audio data and a speaker embedding for the single human speaker using a trained voice filter model. Output generated over the trained voice filter model is processed using an inverse of the frequency transformation to generate the refined audio data.
Type: Application
Filed: October 10, 2019
Publication date: June 25, 2020
Inventors: Quan Wang, Prashant Sridhar, Ignacio Lopez Moreno, Hannah Muckenhirn
-
Publication number: 20170372264
Abstract: An inventory management server is provided. The inventory management server includes at least one processor and at least one memory. The memory includes computer program code configured to cause the inventory management server at least to: receive tracking data assigned to a product from a payment network; interrogate a mapping table, containing assigned product-to-tracking-data information, for the presence of the received tracking data; update an inventory database of the product stocked in the merchant's inventory in response to detecting the presence of the received tracking data; and transmit acknowledgement data indicative of the inventory database update. The tracking data is transmitted by a merchant via a payment terminal in communication with the payment network.
Type: Application
Filed: June 16, 2017
Publication date: December 28, 2017
Inventors: Hao Tang, Senxian Zhuo, Xijing Wang, Bensam Joyson, Naman Aggarwal, Donghao Huang, Prashant Sridhar, Martin Collings, Perry Kick
-
Publication number: 20170201377
Abstract: There is provided a data-processor-implemented method for dynamic authentication of an object. There are also provided non-transitory computer-readable storage mediums and systems for carrying out dynamic authentication of an object.
Type: Application
Filed: January 9, 2017
Publication date: July 13, 2017
Applicant: MASTERCARD ASIA/PACIFIC PTE LTD
Inventors: Hao Tang, Xijing Wang, Senxian Zhuo, Yong-How Chin, Jiaming Li, Bensam Joyson, Donghao Huang, Martin Collings, Prashant Sridhar, Perry Kick