Patents by Inventor Pongtep Angkititrakul
Pongtep Angkititrakul has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Patent number: 11848024
Abstract: A smart mask includes a main body having a back frame and a front cover. The back frame and the front cover each include an opening that is aligned with the mask wearer's mouth when worn. The front cover and back frame may be detachable from one another or formed as a single piece. A microphone is provided in the main body, as well as a speaker. A processor located in the main body is connected to the microphone and the speaker, and is configured to enhance the speech of the mask wearer. In particular, the processor receives audio signals representing a transformation of a spoken utterance of the wearer, processes the audio signals to enhance the speech, and then outputs the enhanced speech to the speaker. This helps other people better understand what the mask wearer is saying.
Type: Grant
Filed: January 26, 2021
Date of Patent: December 19, 2023
Assignee: Robert Bosch GmbH
Inventors: Pongtep Angkititrakul, Xiaoyang Gao, Hyeongsik Kim, Xiaowei Zhou, Zhengyu Zhou
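The abstract describes a microphone-to-speaker enhancement pipeline but not the enhancement algorithm itself. The sketch below is a minimal stand-in: a fixed gain with clipping takes the place of whatever processing the in-mask processor actually performs, and the function names are illustrative, not from the patent.

```python
import numpy as np

def enhance_speech(samples: np.ndarray, gain_db: float = 6.0) -> np.ndarray:
    """Toy stand-in for the in-mask enhancement step: apply a fixed
    gain and clip to the valid [-1, 1] sample range."""
    gain = 10.0 ** (gain_db / 20.0)
    return np.clip(samples * gain, -1.0, 1.0)

def mask_pipeline(mic_samples: np.ndarray) -> np.ndarray:
    """Microphone -> processor (enhance) -> speaker, per the abstract."""
    return enhance_speech(mic_samples)
```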
-
Patent number: 11710476
Abstract: A voice recognition system includes a microphone configured to receive one or more spoken dialogue commands from a user in a voice recognition session. The system also includes a processor in communication with the microphone. The processor is configured to receive one or more audio files associated with one or more audio events associated with the voice recognition system, execute the one or more audio files as audio events in a voice recognition session, and output a log report indicating a result of the audio events within the voice recognition session.
Type: Grant
Filed: April 27, 2020
Date of Patent: July 25, 2023
Assignee: Robert Bosch GmbH
Inventors: Xiaowei Zhou, Pongtep Angkititrakul
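The replay-and-log flow in the abstract can be sketched as a small test harness. The `recognizer` callable is a hypothetical stand-in for the real recognition engine, and the log-entry fields are assumptions; the patent only specifies that executed audio events produce a log report.

```python
from dataclasses import dataclass

@dataclass
class LogEntry:
    audio_file: str
    result: str

def run_audio_events(audio_files, recognizer):
    """Replay pre-recorded audio-event files through a recognition
    session and collect a log report of each result, as the abstract
    describes. `recognizer` maps one audio file to a result string."""
    return [LogEntry(path, recognizer(path)) for path in audio_files]
```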
-
Publication number: 20220238129
Abstract: A smart mask includes a main body having a back frame and a front cover. The back frame and the front cover each include an opening that is aligned with the mask wearer's mouth when worn. The front cover and back frame may be detachable from one another or formed as a single piece. A microphone is provided in the main body, as well as a speaker. A processor located in the main body is connected to the microphone and the speaker, and is configured to enhance the speech of the mask wearer. In particular, the processor receives audio signals representing a transformation of a spoken utterance of the wearer, processes the audio signals to enhance the speech, and then outputs the enhanced speech to the speaker. This helps other people better understand what the mask wearer is saying.
Type: Application
Filed: January 26, 2021
Publication date: July 28, 2022
Inventors: Pongtep Angkititrakul, Xiaoyang Gao, Hyeongsik Kim, Xiaowei Zhou, Zhengyu Zhou
-
Patent number: 11295748
Abstract: A speaker recognition device includes a memory and a processor. The memory stores enrolled key phrase data corresponding to utterances of a key phrase by enrolled users, and text-dependent and text-independent acoustic speaker models of the enrolled users. The processor is operatively connected to the memory, and executes instructions to authenticate a speaker as an enrolled user, which includes detecting input key phrase data corresponding to a key phrase uttered by the speaker, computing text-dependent and text-independent scores for the speaker using speech models of the enrolled user, computing a confidence score, and authenticating or rejecting the speaker as the enrolled user based on whether the confidence score indicates that the input key phrase data corresponds to speech from the enrolled user.
Type: Grant
Filed: December 14, 2018
Date of Patent: April 5, 2022
Assignee: Robert Bosch GmbH
Inventors: Zhongnan Shen, Fuliang Weng, Gengyan Bei, Pongtep Angkititrakul
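The decision step above can be sketched as score fusion followed by thresholding. The patent only states that the text-dependent and text-independent scores feed a single confidence score; the linear weighting and the threshold value here are assumptions for illustration.

```python
def confidence_score(td_score: float, ti_score: float, weight: float = 0.5) -> float:
    """Fuse text-dependent and text-independent scores into one
    confidence value (linear weighting is an assumed fusion rule)."""
    return weight * td_score + (1.0 - weight) * ti_score

def authenticate(td_score: float, ti_score: float, threshold: float = 0.7) -> bool:
    """Accept the speaker as the enrolled user when the fused
    confidence clears an illustrative decision threshold."""
    return confidence_score(td_score, ti_score) >= threshold
```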
-
Patent number: 11170760
Abstract: Systems and methods for detecting speech activity. The system includes an audio source and an electronic processor. The electronic processor is configured to receive a first audio signal from the audio source, buffer the first audio signal, add random noise to the buffered first audio signal, and filter the first audio signal to create a filtered signal. The electronic processor then determines a signal entropy of each frame of the filtered signal, determines an average signal entropy of a first plurality of frames of the filtered signal occurring at a beginning of the filtered signal, and compares the signal entropy of each frame of the filtered signal to the average signal entropy. Based on the comparison, the electronic processor determines a first speech endpoint located in a first frame of the filtered signal.
Type: Grant
Filed: June 21, 2019
Date of Patent: November 9, 2021
Assignee: Robert Bosch GmbH
Inventors: Pongtep Angkititrakul, HyeongSik Kim
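The entropy-based endpointing described above can be sketched as follows: dither the buffered signal, compute a per-frame entropy, average the first few frames as a noise baseline, and flag the first frame that deviates from the baseline. The dither level, frame length, decision margin, and amplitude-histogram entropy are illustrative choices, and the filtering step is omitted for brevity.

```python
import math
import random

def frame_entropy(frame, bins=16):
    """Shannon entropy (bits) of a histogram of sample amplitudes."""
    lo, hi = min(frame), max(frame)
    width = (hi - lo) / bins or 1.0
    counts = [0] * bins
    for x in frame:
        counts[min(int((x - lo) / width), bins - 1)] += 1
    n = len(frame)
    return -sum(c / n * math.log2(c / n) for c in counts if c)

def detect_speech_start(signal, frame_len=160, noise_frames=5, margin=0.5):
    """Return the index of the first frame whose entropy deviates from
    the initial-noise baseline by more than `margin` bits, or None."""
    random.seed(0)                                   # deterministic dither
    dithered = [x + random.gauss(0.0, 1e-4) for x in signal]
    frames = [dithered[i:i + frame_len]
              for i in range(0, len(dithered) - frame_len + 1, frame_len)]
    entropies = [frame_entropy(f) for f in frames]
    baseline = sum(entropies[:noise_frames]) / noise_frames
    for i, h in enumerate(entropies[noise_frames:], start=noise_frames):
        if abs(h - baseline) > margin:
            return i
    return None
```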
-
Publication number: 20210335338
Abstract: A voice recognition system includes a microphone configured to receive one or more spoken dialogue commands from a user in a voice recognition session. The system also includes a processor in communication with the microphone. The processor is configured to receive one or more audio files associated with one or more audio events associated with the voice recognition system, execute the one or more audio files as audio events in a voice recognition session, and output a log report indicating a result of the audio events within the voice recognition session.
Type: Application
Filed: April 27, 2020
Publication date: October 28, 2021
Inventors: Xiaowei Zhou, Pongtep Angkititrakul
-
Publication number: 20210272573
Abstract: A voice recognition system includes a microphone configured to receive spoken dialogue commands from a user and environmental noise, and a processor in communication with the microphone. The processor is configured to receive one or more spoken dialogue commands and the environmental noise from the microphone and identify the user utilizing a first encoder that includes a first convolutional neural network to output a speaker signature derived from a time domain signal associated with the spoken dialogue commands, output a matrix representative of the environmental noise and the one or more spoken dialogue commands, extract speech data from a mixture of the one or more spoken dialogue commands and the environmental noise utilizing a residual convolutional neural network that includes one or more layers and utilizing the speaker signature, and in response to the speech data being associated with the speaker signature, output audio data indicating the spoken dialogue commands.
Type: Application
Filed: February 29, 2020
Publication date: September 2, 2021
Inventors: Midia Yousefi, Pongtep Angkititrakul
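The overall idea (embed the target speaker, then keep only the parts of the mixture that match the embedding) can be shown with a heavily simplified NumPy sketch. A fixed random projection stands in for the learned CNN encoder, and per-frame similarity gating stands in for the residual-CNN separator; none of this reproduces the patented architecture.

```python
import numpy as np

FRAME, EMB = 64, 8
rng = np.random.default_rng(0)
PROJ = rng.standard_normal((EMB, FRAME))  # stand-in for the learned CNN encoder

def signature(frame: np.ndarray) -> np.ndarray:
    """Project a time-domain frame to a unit-norm speaker signature."""
    v = PROJ @ frame
    return v / (np.linalg.norm(v) + 1e-9)

def extract(mixture: np.ndarray, target_sig: np.ndarray) -> np.ndarray:
    """Attenuate each mixture frame in proportion to how little it
    resembles the target speaker's signature (toy separator)."""
    out = []
    for f in mixture.reshape(-1, FRAME):
        w = max(0.0, float(target_sig @ signature(f)))
        out.append(f * w)
    return np.concatenate(out)
```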
-
Publication number: 20200402499
Abstract: Systems and methods for detecting speech activity. The system includes an audio source and an electronic processor. The electronic processor is configured to receive a first audio signal from the audio source, buffer the first audio signal, add random noise to the buffered first audio signal, and filter the first audio signal to create a filtered signal. The electronic processor then determines a signal entropy of each frame of the filtered signal, determines an average signal entropy of a first plurality of frames of the filtered signal occurring at a beginning of the filtered signal, and compares the signal entropy of each frame of the filtered signal to the average signal entropy. Based on the comparison, the electronic processor determines a first speech endpoint located in a first frame of the filtered signal.
Type: Application
Filed: June 21, 2019
Publication date: December 24, 2020
Inventors: Pongtep Angkititrakul, HyeongSik Kim
-
Publication number: 20200210911
Abstract: A workflow management system for generating workflows. The system includes a knowledgebase encoded with terms for steps, dependencies of the steps, and constraints for the steps. The system further includes a computing system programmed to receive the dependencies of the steps from the knowledgebase and to generate a workflow or a portion thereof based on the dependencies of the steps without reference to any other existing workflows.
Type: Application
Filed: December 28, 2018
Publication date: July 2, 2020
Inventors: Hyeongsik Kim, Pongtep Angkititrakul
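Generating a step ordering purely from declared dependencies, with no reference to existing workflows, is essentially a topological sort. The sketch below uses Python's standard-library `graphlib`; the step names are illustrative, not from the patent, and constraint handling is omitted.

```python
from graphlib import TopologicalSorter

def generate_workflow(dependencies):
    """Order workflow steps from their declared dependencies alone.
    `dependencies` maps each step to the set of steps it requires."""
    return list(TopologicalSorter(dependencies).static_order())
```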
-
Publication number: 20200152206
Abstract: A speaker recognition device includes a memory and a processor. The memory stores enrolled key phrase data corresponding to utterances of a key phrase by enrolled users, and text-dependent and text-independent acoustic speaker models of the enrolled users. The processor is operatively connected to the memory, and executes instructions to authenticate a speaker as an enrolled user, which includes detecting input key phrase data corresponding to a key phrase uttered by the speaker, computing text-dependent and text-independent scores for the speaker using speech models of the enrolled user, computing a confidence score, and authenticating or rejecting the speaker as the enrolled user based on whether the confidence score indicates that the input key phrase data corresponds to speech from the enrolled user.
Type: Application
Filed: December 14, 2018
Publication date: May 14, 2020
Inventors: Zhongnan Shen, Fuliang Weng, Gengyan Bei, Pongtep Angkititrakul
-
Patent number: 10431207
Abstract: A method for spoken language understanding (SLU) includes generating a first encoded representation of words from a user based on an output of a recurrent neural network (RNN) encoder, generating an intent label corresponding to the words based on an output of a first RNN decoder based on the first encoded representation, generating a corrected plurality of words based on an output of a second RNN decoder based on the first encoded representation and the intent label, generating a second encoded representation corresponding to the plurality of corrected words using the RNN encoder based on the plurality of corrected words, and generating a machine-readable dialog phrase that includes at least one word in the plurality of corrected words assigned to at least one slot based on an output of a third RNN decoder based on the second encoded representation of the plurality of corrected words and the intent label.
Type: Grant
Filed: April 25, 2018
Date of Patent: October 1, 2019
Assignee: Robert Bosch GmbH
Inventors: Pongtep Angkititrakul, Raphael Schumann
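The data flow (encode, decode intent, decode corrections, re-encode, decode slots) can be traced with trivial rule-based stand-ins in place of the RNN encoder and the three RNN decoders. Every rule and vocabulary item below is invented purely to make the pipeline executable; only the five-stage structure comes from the abstract.

```python
def encode(words):
    """Stand-in for the RNN encoder: just the word sequence itself."""
    return tuple(words)

def decode_intent(encoding):
    """Stand-in for the first decoder: toy keyword-based intent."""
    return "set_alarm" if "alarm" in encoding or "alarum" in encoding else "unknown"

def decode_corrections(encoding, intent):
    """Stand-in for the second decoder: repair recognition errors."""
    fixes = {"alarum": "alarm"}
    return [fixes.get(w, w) for w in encoding]

def decode_slots(encoding, intent):
    """Stand-in for the third decoder: assign words to slots."""
    return {"time": w for w in encoding if w.endswith(("am", "pm"))}

def slu(words):
    """Five-stage pipeline from the abstract, with toy components."""
    enc1 = encode(words)
    intent = decode_intent(enc1)
    corrected = decode_corrections(enc1, intent)
    enc2 = encode(corrected)
    return intent, corrected, decode_slots(enc2, intent)
```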
-
Patent number: 10410630
Abstract: A system provides multi-modal user interaction. The system is configured to detect acoustic events to perform context-sensitive personalized conversations with the speaker. Conversation or communication among the speakers or devices is categorized into different classes: confidential, partially anonymous, or public. When exchange with cloud infrastructure is needed, a clear indicator is presented to the speaker via one or more modalities. Furthermore, different dialog strategies are employed in situations where conversation failures, such as misunderstanding, wrong expectation, emotional stress, or memory deficiencies, occur.
Type: Grant
Filed: June 19, 2015
Date of Patent: September 10, 2019
Assignee: Robert Bosch GmbH
Inventors: Fuliang Weng, Katrin Schulze, Zhongnan Shen, Pongtep Angkititrakul, Gengyan Bei, Xiao Xiong
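The three privacy classes and the cloud-exchange indicator can be sketched as a small routing layer. The patent does not say how utterances are assigned to classes, so the keyword triggers below are purely illustrative assumptions.

```python
from enum import Enum

class Privacy(Enum):
    CONFIDENTIAL = "confidential"
    PARTIALLY_ANONYMOUS = "partially anonymous"
    PUBLIC = "public"

# Illustrative trigger words; not from the patent.
TRIGGERS = {
    "password": Privacy.CONFIDENTIAL,
    "diagnosis": Privacy.CONFIDENTIAL,
    "address": Privacy.PARTIALLY_ANONYMOUS,
}

def classify(utterance: str) -> Privacy:
    """Assign an utterance to one of the three privacy classes."""
    for word, cls in TRIGGERS.items():
        if word in utterance.lower():
            return cls
    return Privacy.PUBLIC

def route(utterance: str, needs_cloud: bool):
    """Return the privacy class and whether to show the clear
    cloud-exchange indicator the abstract calls for."""
    return classify(utterance), needs_cloud
```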
-
Publication number: 20190244603
Abstract: A method for spoken language understanding (SLU) includes generating a first encoded representation of words from a user based on an output of a recurrent neural network (RNN) encoder, generating an intent label corresponding to the words based on an output of a first RNN decoder based on the first encoded representation, generating a corrected plurality of words based on an output of a second RNN decoder based on the first encoded representation and the intent label, generating a second encoded representation corresponding to the plurality of corrected words using the RNN encoder based on the plurality of corrected words, and generating a machine-readable dialogue phrase that includes at least one word in the plurality of corrected words assigned to at least one slot based on an output of a third RNN decoder based on the second encoded representation of the plurality of corrected words and the intent label.
Type: Application
Filed: April 25, 2018
Publication date: August 8, 2019
Inventors: Pongtep Angkititrakul, Raphael Schumann
-
Publication number: 20170116986
Abstract: A system provides multi-modal user interaction. The system is configured to detect acoustic events to perform context-sensitive personalized conversations with the speaker. Conversation or communication among the speakers or devices is categorized into different classes: confidential, partially anonymous, or public. When exchange with cloud infrastructure is needed, a clear indicator is presented to the speaker via one or more modalities. Furthermore, different dialog strategies are employed in situations where conversation failures, such as misunderstanding, wrong expectation, emotional stress, or memory deficiencies, occur.
Type: Application
Filed: June 19, 2015
Publication date: April 27, 2017
Inventors: Fuliang Weng, Katrin Schulze, Zhongnan Shen, Pongtep Angkititrakul, Gengyan Bei, Nikita Xiong