Patents by Inventor Soonthorn Ativanichayaphong
Soonthorn Ativanichayaphong has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20120065982
Abstract: Dynamically generating a vocal help prompt in a multimodal application that includes detecting a help-triggering event for an input element of a VoiceXML dialog, where the detecting is implemented with a multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application is operatively coupled to a VoiceXML interpreter, and the multimodal application has no static help text. Dynamically generating a vocal help prompt in a multimodal application according to embodiments of the present invention typically also includes retrieving, by the VoiceXML interpreter from a source of help text, help text for an element of a speech recognition grammar, forming by the VoiceXML interpreter the help text into a vocal help prompt, and presenting by the multimodal application the vocal help prompt through a computer user interface to a user.
Type: Application
Filed: November 23, 2011
Publication date: March 15, 2012
Applicant: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., David Jaramillo, Yan Li
-
Patent number: 8086463
Abstract: Dynamically generating a vocal help prompt in a multimodal application that includes detecting a help-triggering event for an input element of a VoiceXML dialog, where the detecting is implemented with a multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application is operatively coupled to a VoiceXML interpreter, and the multimodal application has no static help text. Dynamically generating a vocal help prompt in a multimodal application according to embodiments of the present invention typically also includes retrieving, by the VoiceXML interpreter from a source of help text, help text for an element of a speech recognition grammar, forming by the VoiceXML interpreter the help text into a vocal help prompt, and presenting by the multimodal application the vocal help prompt through a computer user interface to a user.
Type: Grant
Filed: September 12, 2006
Date of Patent: December 27, 2011
Assignees: Nuance Communications, Inc., International Business Machines Corporation
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., David Jaramillo, Yan Li
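The core idea of the abstract above can be sketched as follows. This is an illustrative sketch, not the patented implementation: on a help-triggering event, an application with no static help text looks up help text for the active grammar element and forms it into a vocal prompt. The function names and the help-text source are assumptions.

```python
# Illustrative sketch (assumed names, not the patented implementation):
# retrieve help text for a grammar element and form it into a vocal prompt.

def form_help_prompt(grammar_element, help_source):
    """Retrieve help text for a grammar element and form a vocal prompt."""
    help_text = help_source.get(grammar_element)
    if help_text is None:
        return "No help is available for this field."
    return f"You can say: {help_text}"

# Hypothetical source of help text, keyed by grammar element name.
help_source = {"city": "a city name, such as Boston or Denver"}

prompt = form_help_prompt("city", help_source)
```

In a real system the prompt would then be rendered through text-to-speech rather than returned as a string.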
-
Patent number: 8073692
Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining by the multimodal application content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling by the multimodal application all the generated grammars for speech recognition.
Type: Grant
Filed: November 2, 2010
Date of Patent: December 6, 2011
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
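The per-frame grammar generation described above can be sketched as follows: for each navigable element in each frame's content document, build a rule recording which content to display on a match and in which frame. This is a minimal sketch; the data shapes and function name are assumptions, not the patented markup format.

```python
# Illustrative sketch (assumed data shapes): build one speech-recognition
# rule per navigable element, recording the content to display on a match
# and the frame in which to display it.

def build_frame_grammars(frames):
    """frames: {frame_name: [(link_text, href), ...]} -> list of rules."""
    rules = []
    for frame_name, links in frames.items():
        for link_text, href in links:
            rules.append({
                "phrase": link_text.lower(),  # words to match
                "content": href,              # content to display on match
                "frame": frame_name,          # frame to display it in
            })
    return rules

rules = build_frame_grammars({"nav": [("Home", "home.html"), ("News", "news.html")]})
```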
-
Patent number: 8073698
Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page and determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.
Type: Grant
Filed: August 31, 2010
Date of Patent: December 6, 2011
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
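The load/unload logic in the abstract above reduces to a small state update, sketched here under assumed names: pages of the same application add their global grammars and keep the previously loaded ones; a page from a different application unloads them all.

```python
# Illustrative sketch (assumed names): update the set of loaded global
# grammars as multimodal web pages load.

def update_global_grammars(loaded, page_app, current_app, page_grammars):
    """Return the new set of loaded global grammars after a page load."""
    if page_app == current_app:
        # Same application: load any new global grammars, keep previous ones.
        return loaded | set(page_grammars)
    # Different application: unload all currently loaded global grammars.
    return set()

state = update_global_grammars({"help.grxml"}, "shop", "shop", ["cart.grxml"])
```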
-
Patent number: 7966188
Abstract: A method for enhancing voice interactions within a portable multimodal computing device using visual messages. A multimodal interface can be provided that includes an audio interface and a visual interface. A speech input can then be received and a voice recognition task can be performed upon at least a portion of the speech input. At least one message within the multimodal interface can be visually presented, wherein the message is a prompt for the speech input and/or a confirmation of the speech input.
Type: Grant
Filed: May 20, 2003
Date of Patent: June 21, 2011
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, David Jaramillo, Gerald McCobb, Leslie R. Wilson
-
Patent number: 7962343
Abstract: A method (200) of building a grammar with baseforms generated dynamically from user utterances can include the steps of recording (205) a user utterance, generating (210) a baseform using the user utterance, creating or adding to (215) a grammar rule using the baseform, and binding (230) the grammar rule in a grammar document of a voice extensible markup language program. Generating a baseform can optionally include introducing a new element to VoiceXML with attributes that enable generating the baseform from a referenced recording such as the user utterance. In one embodiment, the method can be used to create (235) a phonebook and a grammar to access the phonebook by repeatedly visiting a form containing the grammar rule with attributes that enable generating the baseform from the referenced recording.
Type: Grant
Filed: November 21, 2008
Date of Patent: June 14, 2011
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Brien H. Muschett
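The record-generate-add loop in this abstract can be sketched as follows. This is a minimal sketch with assumed names; `generate_baseform` is a stand-in for a real acoustic baseform generator, which would derive a phonetic form from recorded audio.

```python
# Illustrative sketch (assumed names): grow a grammar rule from baseforms
# generated out of recorded user utterances.

def generate_baseform(utterance_id):
    # Stand-in: a real system derives a phonetic baseform from the audio
    # referenced by the recording, not from its identifier.
    return f"baseform::{utterance_id}"

def add_to_rule(rule, utterance_id):
    """Create or extend a grammar rule with a baseform for the utterance."""
    return list(rule) + [generate_baseform(utterance_id)]

rule = add_to_rule([], "recording-001")   # create the rule
rule = add_to_rule(rule, "recording-002") # add to it on a later visit
```

Repeatedly visiting the same form, as in the phonebook embodiment, corresponds to calling `add_to_rule` once per new recording.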
-
Patent number: 7953597
Abstract: A computer-implemented method and system are provided for filling a graphic-based form field in response to a speech utterance. The computer-implemented method includes generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The method further includes creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the auto-fill event causing the filling of the form field with data corresponding to the user profile. The system includes a grammar-generating module for generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The system also includes an event module for creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the event causing the filling of the form field with data corresponding to the user profile.
Type: Grant
Filed: August 9, 2005
Date of Patent: May 31, 2011
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
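The auto-fill flow above can be sketched as a grammar that pairs trigger phrases with semantic interpretation strings naming profile keys; a match produces the profile value that fills the field. All names and phrases here are assumptions for illustration.

```python
# Illustrative sketch (assumed names and phrases): a grammar maps spoken
# phrases to semantic interpretation strings that name user-profile keys;
# a match yields the profile value used to fill the form field.

grammar = {
    "my home address": "address",     # phrase -> semantic interpretation
    "my work phone": "work_phone",
}

def auto_fill(utterance, grammar, profile):
    """Return the value to fill into the field, or None if no match."""
    key = grammar.get(utterance)
    return profile.get(key) if key else None

profile = {"address": "1 Main St", "work_phone": "555-0100"}
value = auto_fill("my home address", grammar, profile)
```

In the patented design the grammar itself would be generated from the profile; here it is written out by hand to keep the sketch short.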
-
Publication number: 20110047452
Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining by the multimodal application content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling by the multimodal application all the generated grammars for speech recognition.
Type: Application
Filed: November 2, 2010
Publication date: February 24, 2011
Applicant: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
-
Patent number: 7870000
Abstract: The present disclosure relates to prompting for a spoken response that provides input for multiple elements. A single spoken utterance including content for multiple elements can be received, where each element is mapped to a data field. The spoken utterance can be speech-to-text converted to derive values for each of the multiple elements. An utterance level confidence score can be determined, which can fall below an associated certainty threshold. Element-level confidence scores for each of the derived elements can then be ascertained. A first set of the multiple elements can have element-level confidence scores above an associated certainty threshold and a second set can have scores below. Values can be stored in data fields mapped to the first set. A prompt for input for the second set can be played.
Type: Grant
Filed: March 28, 2007
Date of Patent: January 11, 2011
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Gerald M. McCobb, Paritosh D. Patel, Marc White
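The partition step at the heart of this abstract can be sketched directly: when the utterance-level confidence is low, split the derived elements by element-level confidence, store the confident values, and re-prompt only for the rest. Names, threshold, and scores below are illustrative assumptions.

```python
# Illustrative sketch (assumed names and scores): partition derived elements
# by element-level confidence; accept the confident ones, re-prompt for the
# rest.

def partition_elements(elements, threshold):
    """elements: {field: (value, confidence)} -> (accepted, reprompt)."""
    accepted = {f: v for f, (v, c) in elements.items() if c >= threshold}
    reprompt = [f for f, (v, c) in elements.items() if c < threshold]
    return accepted, reprompt

accepted, reprompt = partition_elements(
    {"city": ("Boston", 0.92), "date": ("March 3", 0.41)}, 0.6)
```

`accepted` would be stored into the mapped data fields; a follow-up prompt would be played only for the fields in `reprompt`.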
-
Publication number: 20100324889
Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page and determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.
Type: Application
Filed: August 31, 2010
Publication date: December 23, 2010
Applicant: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
-
Patent number: 7840409
Abstract: Ordering recognition results produced by an automatic speech recognition (‘ASR’) engine for a multimodal application implemented with a grammar of the multimodal application in the ASR engine, with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine through a VoiceXML interpreter, includes: receiving, in the VoiceXML interpreter from the multimodal application, a voice utterance; determining, by the VoiceXML interpreter using the ASR engine, a plurality of recognition results in dependence upon the voice utterance and the grammar; determining, by the VoiceXML interpreter according to semantic interpretation scripts of the grammar, a weight for each recognition result; and sorting, by the VoiceXML interpreter, the plurality of recognition results in dependence upon the weight for each recognition result.
Type: Grant
Filed: February 27, 2007
Date of Patent: November 23, 2010
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Igor R. Jablokov, Gerald McCobb
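The weight-then-sort step of this abstract is straightforward to sketch: each recognition result gets a weight computed by the grammar's semantic interpretation scripts, and the result list is sorted by that weight. The weight table below is a hypothetical stand-in for those scripts.

```python
# Illustrative sketch: sort recognition results by weights that, in the
# patented design, semantic interpretation scripts in the grammar assign.

def sort_results(results, weight_fn):
    """Order recognition results best-first by their assigned weight."""
    return sorted(results, key=weight_fn, reverse=True)

# Hypothetical weights standing in for semantic interpretation scripts.
weights = {"coffee": 0.9, "toffee": 0.3}
ordered = sort_results(["toffee", "coffee"], lambda r: weights.get(r, 0.0))
```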
-
Patent number: 7827033
Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining by the multimodal application content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling by the multimodal application all the generated grammars for speech recognition.
Type: Grant
Filed: December 6, 2006
Date of Patent: November 2, 2010
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
-
Patent number: 7809575
Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page and determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.
Type: Grant
Filed: February 27, 2007
Date of Patent: October 5, 2010
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
-
Patent number: 7739117
Abstract: A computer-implemented method and system are provided for filling a graphic-based form field in response to a speech utterance. The computer-implemented method includes generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The method further includes creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the auto-fill event causing the filling of the form field with data corresponding to the user profile. The system includes a grammar-generating module for generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The system also includes an event module for creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the event causing the filling of the form field with data corresponding to the user profile.
Type: Grant
Filed: September 20, 2004
Date of Patent: June 15, 2010
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
-
Publication number: 20090076818
Abstract: A method (200) of building a grammar with baseforms generated dynamically from user utterances can include the steps of recording (205) a user utterance, generating (210) a baseform using the user utterance, creating or adding to (215) a grammar rule using the baseform, and binding (230) the grammar rule in a grammar document of a voice extensible markup language program. Generating a baseform can optionally include introducing a new element to VoiceXML with attributes that enable generating the baseform from a referenced recording such as the user utterance. In one embodiment, the method can be used to create (235) a phonebook and a grammar to access the phonebook by repeatedly visiting a form containing the grammar rule with attributes that enable generating the baseform from the referenced recording.
Type: Application
Filed: November 21, 2008
Publication date: March 19, 2009
Applicant: International Business Machines Corporation
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Brien H. Muschett
-
Patent number: 7487085
Abstract: A method (200) of building a grammar with baseforms generated dynamically from user utterances can include the steps of recording (205) a user utterance, generating (210) a baseform using the user utterance, creating or adding to (215) a grammar rule using the baseform, and binding (230) the grammar rule in a grammar document of a voice extensible markup language program. Generating a baseform can optionally include introducing a new element to VoiceXML with attributes that enable generating the baseform from a referenced recording such as the user utterance. In one embodiment, the method can be used to create (235) a phonebook and a grammar to access the phonebook by repeatedly visiting a form containing the grammar rule with attributes that enable generating the baseform from the referenced recording.
Type: Grant
Filed: August 24, 2004
Date of Patent: February 3, 2009
Assignee: International Business Machines Corporation
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Brien H. Muschett
-
Publication number: 20080255851
Abstract: Speech-enabled content navigation and control of a distributed multimodal browser is disclosed, the browser providing an execution environment for a multimodal application, the browser including a graphical user agent (‘GUA’) and a voice user agent (‘VUA’), the GUA operating on a multimodal device, the VUA operating on a voice server, that includes: transmitting, by the GUA, a link message to the VUA, the link message specifying voice commands that control the browser and an event corresponding to each voice command; receiving, by the GUA, a voice utterance from a user, the voice utterance specifying a particular voice command; transmitting, by the GUA, the voice utterance to the VUA for speech recognition by the VUA; receiving, by the GUA, an event message from the VUA, the event message specifying a particular event corresponding to the particular voice command; and controlling, by the GUA, the browser in dependence upon the particular event.
Type: Application
Filed: April 12, 2007
Publication date: October 16, 2008
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Gerald M. McCobb
-
Publication number: 20080249782
Abstract: Web service support for a multimodal client processing a multimodal application, the multimodal client providing an execution environment for the application and operating on a multimodal device supporting multiple modes of user interaction including a voice mode and one or more non-voice modes, the application stored on an application server, includes: receiving, by the server, an application request from the client that specifies the application and device characteristics; determining, by a multimodal adapter of the server, modality requirements for the application; selecting, by the adapter, a modality web service in dependence upon the modality requirements and the characteristics for the device; determining, by the adapter, whether the device supports VoIP in dependence upon the characteristics; providing, by the server, the application to the client; and providing, by the adapter to the client in dependence upon whether the device supports VoIP, access to the modality web service for processing the appl
Type: Application
Filed: April 4, 2007
Publication date: October 9, 2008
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Gerald M. McCobb
-
Publication number: 20080243502
Abstract: The invention discloses prompting for a spoken response that provides input for multiple elements. A single spoken utterance including content for multiple elements can be received, where each element is mapped to a data field. The spoken utterance can be speech-to-text converted to derive values for each of the multiple elements. An utterance level confidence score can be determined, which can fall below an associated certainty threshold. Element-level confidence scores for each of the derived elements can then be ascertained. A first set of the multiple elements can have element-level confidence scores above an associated certainty threshold and a second set can have scores below. Values can be stored in data fields mapped to the first set. A prompt for input for the second set can be played. Accordingly, data fields are partially filled in based upon the original speech utterance, where a second prompt for unfilled fields is played.
Type: Application
Filed: March 28, 2007
Publication date: October 2, 2008
Applicant: International Business Machines Corporation
Inventors: Soonthorn Ativanichayaphong, Gerald M. McCobb, Paritosh D. Patel, Marc White
-
Publication number: 20080208586
Abstract: Enabling natural language understanding using an X+V page of a multimodal application implemented with a statistical language model (‘SLM’) grammar of the multimodal application in an automatic speech recognition (‘ASR’) engine, with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine through a VoiceXML interpreter, including: receiving, in the ASR engine from the multimodal application, a voice utterance; generating, by the ASR engine according to the SLM grammar, at least one recognition result for the voice utterance; determining, by an action classifier for the VoiceXML interpreter, an action identifier in dependence upon the recognition result, the action identifier specifying an action to be performed by the multimodal application; and interpreting, by the VoiceXML interpreter, the multimodal application in depen
Type: Application
Filed: February 27, 2007
Publication date: August 28, 2008
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Gerald M. McCobb
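The action-classifier step of this abstract can be sketched as a function mapping a free-form recognition result to an action identifier. The keyword rules below are assumptions standing in for a statistical classifier, and the action names are hypothetical.

```python
# Illustrative sketch (assumed action names): map a free-form SLM recognition
# result to an action identifier the dialog can act on. Keyword matching here
# stands in for a statistical action classifier.

def classify_action(recognition_result):
    """Return an action identifier for a recognized utterance."""
    text = recognition_result.lower()
    if "balance" in text:
        return "check_balance"
    if "transfer" in text:
        return "transfer_funds"
    return "unknown"

action = classify_action("I'd like to check my balance please")
```

The interpreter would then continue the dialog in dependence upon the returned action identifier rather than upon the raw utterance text.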