Patents by Inventor Emilian Stoimenov
Emilian Stoimenov has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11929076Abstract: Disclosed speech recognition techniques improve user-perceived latency while maintaining accuracy by: receiving an audio stream, in parallel, by a primary (e.g., accurate) speech recognition engine (SRE) and a secondary (e.g., fast) SRE; generating, with the primary SRE, a primary result; generating, with the secondary SRE, a secondary result; appending the secondary result to a word list; and merging the primary result into the secondary result in the word list. Combining output from the primary and secondary SREs into a single decoder as described herein improves user-perceived latency while maintaining or improving accuracy, among other advantages.Type: GrantFiled: December 1, 2022Date of Patent: March 12, 2024Assignee: Microsoft Technology Licensing, LLC.Inventors: Hosam Adel Khalil, Emilian Stoimenov, Christopher Hakan Basoglu, Kshitiz Kumar, Jian Wu
-
Patent number: 11798535Abstract: Generally discussed herein are devices, systems, and methods for on-device detection of a wake word. A device can include a memory including model parameters that define a custom wake word detection model, the wake word detection model including a recurrent neural network transducer (RNNT) and a lookup table (LUT), the LUT indicating a hidden vector to be provided in response to a phoneme of a user-specified wake word, a microphone to capture audio, and processing circuitry to receive the audio from the microphone, determine, using the wake word detection model, whether the audio includes an utterance of the user-specified wake word, and wake up a personal assistant after determining the audio includes the utterance of the user-specified wake word.Type: GrantFiled: September 14, 2021Date of Patent: October 24, 2023Assignee: Microsoft Technology Licensing, LLCInventors: Emilian Stoimenov, Rui Zhao, Kaustubh Prakash Kalgaonkar, Ivaylo Andreanov Enchev, Khuram Shahid, Anthony Phillip Stark, Guoli Ye, Mahadevan Srinivasan, Yifan Gong, Hosam Adel Khalil
-
Patent number: 11790891Abstract: Generally discussed herein are devices, systems, and methods for custom wake word selection assistance. A method can include receiving, at a device, data indicating a custom wake word provided by a user, determining one or more characteristics of the custom wake word, determining that use of the custom wake word will cause more than a threshold rate of false detections based on the characteristics, rejecting the custom wake word as the wake word for accessing a personal assistant in response to determining that use of the custom wake word will cause more than a threshold rate of false detections, and setting the custom wake word as the wake word in response to determining that use of the custom wake word will not cause more than the threshold rate of false detections.Type: GrantFiled: December 1, 2021Date of Patent: October 17, 2023Assignee: Microsoft Technology Licensing, LLCInventors: Emilian Stoimenov, Khuram Shahid, Guoli Ye, Hosam Adel Khalil, Yifan Gong
-
Publication number: 20230102295Abstract: Disclosed speech recognition techniques improve user-perceived latency while maintaining accuracy by: receiving an audio stream, in parallel, by a primary (e.g., accurate) speech recognition engine (SRE) and a secondary (e.g., fast) SRE; generating, with the primary SRE, a primary result; generating, with the secondary SRE, a secondary result; appending the secondary result to a word list; and merging the primary result into the secondary result in the word list. Combining output from the primary and secondary SREs into a single decoder as described herein improves user-perceived latency while maintaining or improving accuracy, among other advantages.Type: ApplicationFiled: December 1, 2022Publication date: March 30, 2023Inventors: Hosam Adel KHALIL, Emilian STOIMENOV, Christopher Hakan BASOGLU, Kshitiz KUMAR, Jian WU
-
Patent number: 11532312Abstract: Disclosed speech recognition techniques improve user-perceived latency while maintaining accuracy by: receiving an audio stream, in parallel, by a primary (e.g., accurate) speech recognition engine (SRE) and a secondary (e.g., fast) SRE; generating, with the primary SRE, a primary result; generating, with the secondary SRE, a secondary result; appending the secondary result to a word list; and merging the primary result into the secondary result in the word list. Combining output from the primary and secondary SREs into a single decoder as described herein improves user-perceived latency while maintaining or improving accuracy, among other advantages.Type: GrantFiled: December 15, 2020Date of Patent: December 20, 2022Assignee: Microsoft Technology Licensing, LLCInventors: Hosam Adel Khalil, Emilian Stoimenov, Christopher Hakan Basoglu, Kshitiz Kumar, Jian Wu
-
Publication number: 20220254334Abstract: Generally discussed herein are devices, systems, and methods for custom wake word selection assistance. A method can include receiving, at a device, data indicating a custom wake word provided by a user, determining one or more characteristics of the custom wake word, determining that use of the custom wake word will cause more than a threshold rate of false detections based on the characteristics, rejecting the custom wake word as the wake word for accessing a personal assistant in response to determining that use of the custom wake word will cause more than a threshold rate of false detections, and setting the custom wake word as the wake word in response to determining that use of the custom wake word will not cause more than the threshold rate of false detections.Type: ApplicationFiled: December 1, 2021Publication date: August 11, 2022Inventors: Emilian Stoimenov, Khuram Shahid, Guoli Ye, Hosam Adel Khalil, Yifan Gong
-
Publication number: 20220189467Abstract: Disclosed speech recognition techniques improve user-perceived latency while maintaining accuracy by: receiving an audio stream, in parallel, by a primary (e.g., accurate) speech recognition engine (SRE) and a secondary (e.g., fast) SRE; generating, with the primary SRE, a primary result; generating, with the secondary SRE, a secondary result; appending the secondary result to a word list; and merging the primary result into the secondary result in the word list. Combining output from the primary and secondary SREs into a single decoder as described herein improves user-perceived latency while maintaining or improving accuracy, among other advantages.Type: ApplicationFiled: December 15, 2020Publication date: June 16, 2022Inventors: Hosam Adel KHALIL, Emilian STOIMENOV, Christopher Hakan BASOGLU, Kshitiz KUMAR, Jian WU
-
Patent number: 11222622Abstract: Generally discussed herein are devices, systems, and methods for custom wake word selection assistance. A method can include receiving, at a device, data indicating a custom wake word provided by a user, determining one or more characteristics of the custom wake word, determining that use of the custom wake word will cause more than a threshold rate of false detections based on the characteristics, rejecting the custom wake word as the wake word for accessing a personal assistant in response to determining that use of the custom wake word will cause more than a threshold rate of false detections, and setting the custom wake word as the wake word in response to determining that use of the custom wake word will not cause more than the threshold rate of false detections.Type: GrantFiled: July 25, 2019Date of Patent: January 11, 2022Assignee: Microsoft Technology Licensing, LLCInventors: Emilian Stoimenov, Khuram Shahid, Guoli Ye, Hosam Adel Khalil, Yifan Gong
-
Publication number: 20210407498Abstract: Generally discussed herein are devices, systems, and methods for on-device detection of a wake word. A device can include a memory including model parameters that define a custom wake word detection model, the wake word detection model including a recurrent neural network transducer (RNNT) and a lookup table (LUT), the LUT indicating a hidden vector to be provided in response to a phoneme of a user-specified wake word, a microphone to capture audio, and processing circuitry to receive the audio from the microphone, determine, using the wake word detection model, whether the audio includes an utterance of the user-specified wake word, and wake up a personal assistant after determining the audio includes the utterance of the user-specified wake word.Type: ApplicationFiled: September 14, 2021Publication date: December 30, 2021Inventors: Emilian Stoimenov, Rui Zhao, Kaustubh Prakash Kalgaonkar, Ivaylo Andreanov Enchev, Khuram Shahid, Anthony Phillip Stark, Guoli Ye, Mahadevan Srinivasan, Yifan Gong, Hosam Adel Khalil
-
Patent number: 11132992Abstract: Generally discussed herein are devices, systems, and methods for on-device detection of a wake word. A device can include a memory including model parameters that define a custom wake word detection model, the wake word detection model including a recurrent neural network transducer (RNNT) and a lookup table (LUT), the LUT indicating a hidden vector to be provided in response to a phoneme of a user-specified wake word, a microphone to capture audio, and processing circuitry to receive the audio from the microphone, determine, using the wake word detection model, whether the audio includes an utterance of the user-specified wake word, and wake up a personal assistant after determining the audio includes the utterance of the user-specified wake word.Type: GrantFiled: July 25, 2019Date of Patent: September 28, 2021Assignee: Microsoft Technology Licensing, LLCInventors: Emilian Stoimenov, Rui Zhao, Kaustubh Prakash Kalgaonkar, Ivaylo Andreanov Enchev, Khuram Shahid, Anthony Phillip Stark, Guoli Ye, Mahadevan Srinivasan, Yifan Gong, Hosam Adel Khalil
-
Publication number: 20200349927Abstract: Generally discussed herein are devices, systems, and methods for on-device detection of a wake word. A device can include a memory including model parameters that define a custom wake word detection model, the wake word detection model including a recurrent neural network transducer (RNNT) and a lookup table (LUT), the LUT indicating a hidden vector to be provided in response to a phoneme of a user-specified wake word, a microphone to capture audio, and processing circuitry to receive the audio from the microphone, determine, using the wake word detection model, whether the audio includes an utterance of the user-specified wake word, and wake up a personal assistant after determining the audio includes the utterance of the user-specified wake word.Type: ApplicationFiled: July 25, 2019Publication date: November 5, 2020Inventors: Emilian Stoimenov, Rui Zhao, Kaustubh Prakash Kalgaonkar, Ivaylo Andreanov Enchev, Khuram Shahid, Anthony Phillip Stark, Guoli Ye, Mahadevan Srinivasan, Yifan Gong, Hosam Adel Khalil
-
Publication number: 20200349924Abstract: Generally discussed herein are devices, systems, and methods for custom wake word selection assistance. A method can include receiving, at a device, data indicating a custom wake word provided by a user, determining one or more characteristics of the custom wake word, determining that use of the custom wake word will cause more than a threshold rate of false detections based on the characteristics, rejecting the custom wake word as the wake word for accessing a personal assistant in response to determining that use of the custom wake word will cause more than a threshold rate of false detections, and setting the custom wake word as the wake word in response to determining that use of the custom wake word will not cause more than the threshold rate of false detections.Type: ApplicationFiled: July 25, 2019Publication date: November 5, 2020Inventors: Emilian Stoimenov, Khuram Shahid, Guoli Ye, Hosam Adel Khalil, Yifan Gong
-
Publication number: 20170337918Abstract: A Deep Neural Network (DNN) model used in an Automatic Speech Recognition (ASR) system is restructured. A restructured DNN model may include fewer parameters compared to the original DNN model. The restructured DNN model may include a monophone state output layer in addition to the senone output layer of the original DNN model. Singular value decomposition (SVD) can be applied to one or more weight matrices of the DNN model to reduce the size of the DNN Model. The output layer of the DNN model may be restructured to include monophone states in addition to the senones (tied triphone states) which are included in the original DNN model. When the monophone states are included in the restructured DNN model, the posteriors of monophone states are used to select a small part of senones to be evaluated.Type: ApplicationFiled: August 7, 2017Publication date: November 23, 2017Applicant: Microsoft Technology Licensing, LLCInventors: Jian XUE, Emilian STOIMENOV, Jinyu LI, Yifan GONG
-
Patent number: 9728184Abstract: A Deep Neural Network (DNN) model used in an Automatic Speech Recognition (ASR) system is restructured. A restructured DNN model may include fewer parameters compared to the original DNN model. The restructured DNN model may include a monophone state output layer in addition to the senone output layer of the original DNN model. Singular value decomposition (SVD) can be applied to one or more weight matrices of the DNN model to reduce the size of the DNN Model. The output layer of the DNN model may be restructured to include monophone states in addition to the senones (tied triphone states) which are included in the original DNN model. When the monophone states are included in the restructured DNN model, the posteriors of monophone states are used to select a small part of senones to be evaluated.Type: GrantFiled: June 18, 2013Date of Patent: August 8, 2017Assignee: Microsoft Technology Licensing, LLCInventors: Jian Xue, Emilian Stoimenov, Jinyu Li, Yifan Gong
-
Publication number: 20140372112Abstract: A Deep Neural Network (DNN) model used in an Automatic Speech Recognition (ASR) system is restructured. A restructured DNN model may include fewer parameters compared to the original DNN model. The restructured DNN model may include a monophone state output layer in addition to the senone output layer of the original DNN model. Singular value decomposition (SVD) can be applied to one or more weight matrices of the DNN model to reduce the size of the DNN Model. The output layer of the DNN model may be restructured to include monophone states in addition to the senones (tied triphone states) which are included in the original DNN model. When the monophone states are included in the restructured DNN model, the posteriors of monophone states are used to select a small part of senones to be evaluated.Type: ApplicationFiled: June 18, 2013Publication date: December 18, 2014Inventors: Jian Xue, Emilian Stoimenov, Jinyu Li, Yifan Gong