Patents by Inventor Soonthorn Ativanichayaphong

Soonthorn Ativanichayaphong has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20120065982
    Abstract: Dynamically generating a vocal help prompt in a multimodal application that includes detecting a help-triggering event for an input element of a VoiceXML dialog, where the detecting is implemented with a multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application is operatively coupled to a VoiceXML interpreter, and the multimodal application has no static help text. Dynamically generating a vocal help prompt in a multimodal application according to embodiments of the present invention typically also includes retrieving, by the VoiceXML interpreter from a source of help text, help text for an element of a speech recognition grammar, forming by the VoiceXML interpreter the help text into a vocal help prompt, and presenting by the multimodal application the vocal help prompt through a computer user interface to a user.
    Type: Application
    Filed: November 23, 2011
    Publication date: March 15, 2012
    Applicant: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., David Jaramillo, Yan Li
  • Patent number: 8086463
    Abstract: Dynamically generating a vocal help prompt in a multimodal application that includes detecting a help-triggering event for an input element of a VoiceXML dialog, where the detecting is implemented with a multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application is operatively coupled to a VoiceXML interpreter, and the multimodal application has no static help text. Dynamically generating a vocal help prompt in a multimodal application according to embodiments of the present invention typically also includes retrieving, by the VoiceXML interpreter from a source of help text, help text for an element of a speech recognition grammar, forming by the VoiceXML interpreter the help text into a vocal help prompt, and presenting by the multimodal application the vocal help prompt through a computer user interface to a user.
    Type: Grant
    Filed: September 12, 2006
    Date of Patent: December 27, 2011
    Assignees: Nuance Communications, Inc., International Business Machines Corporation
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., David Jaramillo, Yan Li
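
The two entries above describe a dynamic help-prompt mechanism: when a help-triggering event fires on a dialog input element that has no static help text, the interpreter fetches help text for a grammar element and turns it into a vocal prompt. A minimal Python sketch of that flow follows; the dictionary-backed help source, element names, and prompt wording are illustrative assumptions, not taken from the patent.

```python
# Hypothetical sketch of the dynamic help-prompt idea: on a help-triggering
# event, look up help text for a grammar element and form a vocal prompt.
# The dict-based help source and all names here are illustrative only.

HELP_TEXT_SOURCE = {
    "city": "Please say the name of a city, for example Boca Raton.",
    "date": "Please say a date, for example March fifteenth.",
}

def generate_vocal_help_prompt(element_name: str) -> str:
    """Retrieve help text for a grammar element and form it into a prompt."""
    help_text = HELP_TEXT_SOURCE.get(
        element_name, "No help is available for this field."
    )
    # Forming the help text into a vocal help prompt (here, a TTS-ready string).
    return f"Help: {help_text}"

if __name__ == "__main__":
    # A "help" utterance on the "city" input element would trigger this path.
    print(generate_vocal_help_prompt("city"))
```
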
  • Patent number: 8073692
    Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining by the multimodal application content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling by the multimodal application all the generated grammars for speech recognition.
    Type: Grant
    Filed: November 2, 2010
    Date of Patent: December 6, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
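
As a concrete illustration of the frame-grammar generation described in the entry above, the sketch below emits one speech-recognition grammar fragment per navigable element found in a frame's content document, recording which content to load and into which frame. The markup shape and attribute names are assumptions made for the example, not the patented format.

```python
# Illustrative sketch only: generate one grammar fragment per navigable element
# (e.g. a hyperlink), with markup identifying the content to display on a match
# and the frame where that content should appear.

from dataclasses import dataclass

@dataclass
class NavigableElement:
    frame: str      # frame where the matched content should be displayed
    words: str      # link text the user may speak
    href: str       # content to display when the words are matched

def generate_grammar(element: NavigableElement) -> str:
    return (
        f'<grammar root="link">'
        f'<rule id="link">{element.words}'
        f'<tag>href="{element.href}"; frame="{element.frame}"</tag>'
        f"</rule></grammar>"
    )

if __name__ == "__main__":
    elements = [
        NavigableElement("main", "sports news", "sports.html"),
        NavigableElement("sidebar", "weather", "weather.html"),
    ]
    # "Enabling" all generated grammars is modelled here as collecting them.
    enabled_grammars = [generate_grammar(e) for e in elements]
    for grammar in enabled_grammars:
        print(grammar)
```
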
  • Patent number: 8073698
    Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page; determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.
    Type: Grant
    Filed: August 31, 2010
    Date of Patent: December 6, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
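
A minimal sketch of the global-grammar lifecycle in the entry above: keep and extend the loaded set while navigating within the same multimodal application, and unload everything when a page outside the application is loaded. The set-based bookkeeping and function names are hypothetical stand-ins for the browser's state.

```python
# Sketch of the global-grammar lifecycle; names and data structures are assumed.

loaded_global_grammars: set[str] = set()

def on_page_load(page_app_id: str, page_global_grammars: set[str],
                 current_app_id: str) -> None:
    """Maintain global grammars when a new multimodal web page is loaded."""
    global loaded_global_grammars
    if page_app_id == current_app_id:
        # Same application: load any not-yet-loaded global grammars the page
        # identifies, and keep everything already loaded.
        loaded_global_grammars |= page_global_grammars
    else:
        # Page belongs to another application: unload all global grammars.
        loaded_global_grammars = set()

if __name__ == "__main__":
    on_page_load("shop", {"navigation.grxml"}, current_app_id="shop")
    on_page_load("shop", {"checkout.grxml"}, current_app_id="shop")
    print(loaded_global_grammars)   # both grammars remain loaded
    on_page_load("other", set(), current_app_id="shop")
    print(loaded_global_grammars)   # emptied when leaving the application
```
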
  • Patent number: 7966188
    Abstract: A method for enhancing voice interactions within a portable multimodal computing device using visual messages. A multimodal interface can be provided that includes an audio interface and a visual interface. A speech input can then be received and a voice recognition task can be performed upon at least a portion of the speech input. At least one message within the multimodal interface can be visually presented, wherein the message is a prompt for the speech input and/or a confirmation of the speech input.
    Type: Grant
    Filed: May 20, 2003
    Date of Patent: June 21, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, David Jaramillo, Gerald McCobb, Leslie R. Wilson
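
The entry above pairs a voice interaction with visual messages: a visual prompt before the speech input and/or a visual confirmation after recognition. The sketch below reduces the visual interface to printed messages and stubs out the recognizer; it is only meant to show that pairing, with all names invented for the example.

```python
# Sketch only: the recognizer is stubbed and the "visual interface" is print().

def recognize_speech(audio: bytes) -> str:
    # Placeholder for the device's voice recognition task.
    return "two tickets to Miami"

def voice_interaction_with_visual_messages(audio: bytes) -> str:
    print("[visual prompt] Please say your travel request.")    # prompt message
    result = recognize_speech(audio)
    print(f"[visual confirmation] You said: {result}")           # confirmation message
    return result

if __name__ == "__main__":
    voice_interaction_with_visual_messages(b"\x00\x01")
```
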
  • Patent number: 7962343
    Abstract: A method (200) of building a grammar with baseforms generated dynamically from user utterances can include the steps of recording (205) a user utterance, generating (210) a baseform using the user utterance, creating or adding to (215) a grammar rule using the baseform, and binding (230) the grammar rule in a grammar document of a voice extensible markup language program. Generating a baseform can optionally include introducing a new element to VoiceXML with attributes that enable generating the baseform from a referenced recording such as the user utterance. In one embodiment, the method can be used to create (235) a phonebook and a grammar to access the phonebook by repeatedly visiting a form containing the grammar rule with attributes that enable generating the baseform from the referenced recording.
    Type: Grant
    Filed: November 21, 2008
    Date of Patent: June 14, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Brien H. Muschett
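
The entry above builds a grammar from baseforms generated at run time from user utterances, for example to grow a spoken phonebook. In the hedged sketch below, baseform generation from audio is stubbed (a real system would derive a phonetic spelling from the recording), and the grammar "document" is simply a list of accumulated rules; all names are illustrative.

```python
# Sketch: dynamically build grammar rules from baseforms derived from recordings.

def generate_baseform(recording: bytes) -> str:
    # Placeholder for acoustic-to-phonetic conversion of the user utterance.
    return "JH AA N S M IH TH"

class DynamicGrammar:
    def __init__(self) -> None:
        self.rules: list[str] = []

    def add_rule_from_utterance(self, recording: bytes, entry_name: str) -> None:
        baseform = generate_baseform(recording)
        # Create or add to a grammar rule using the generated baseform.
        self.rules.append(f'{entry_name} = "{baseform}"')

    def bind(self) -> str:
        # Bind the accumulated rules into a single grammar document.
        return "\n".join(self.rules)

if __name__ == "__main__":
    phonebook_grammar = DynamicGrammar()
    phonebook_grammar.add_rule_from_utterance(b"...", "john_smith")
    print(phonebook_grammar.bind())
```
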
  • Patent number: 7953597
    Abstract: A computer-implemented method and system are provided for filling a graphic-based form field in response to a speech utterance. The computer-implemented method includes generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The method further includes creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the auto-fill event causing the filling of the form field with data corresponding to the user profile. The system includes a grammar-generating module for generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The system also includes an event module for creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the event causing the filling of the form field with data corresponding to the user profile.
    Type: Grant
    Filed: August 9, 2005
    Date of Patent: May 31, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
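
A sketch of the auto-fill idea in the entry above, under stated assumptions: the user profile is a plain dictionary, the grammar is reduced to the phrases it would accept plus a semantic-interpretation key naming the profile entry, and the "auto-fill event" is a returned (field, value) pair. All names are hypothetical.

```python
# Sketch: fill a form field from a user profile when a matching utterance arrives.

USER_PROFILE = {"home_address": "123 Main Street, Boca Raton, FL"}

def build_field_grammar(field: str) -> dict:
    # Grammar for the field: accepted phrases plus a semantic interpretation
    # string that names the profile entry to fill from.
    return {"phrases": {"my home address", "home address"}, "interpretation": field}

def auto_fill_event(grammar: dict, utterance: str):
    if utterance.lower() in grammar["phrases"]:
        field = grammar["interpretation"]
        return field, USER_PROFILE[field]     # fill the field from the profile
    return None

if __name__ == "__main__":
    grammar = build_field_grammar("home_address")
    print(auto_fill_event(grammar, "my home address"))
```
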
  • Publication number: 20110047452
    Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining by the multimodal application content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling by the multimodal application all the generated grammars for speech recognition.
    Type: Application
    Filed: November 2, 2010
    Publication date: February 24, 2011
    Applicant: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
  • Patent number: 7870000
    Abstract: The present disclosure relates to prompting for a spoken response that provides input for multiple elements. A single spoken utterance including content for multiple elements can be received, where each element is mapped to a data field. The spoken utterance can be speech-to-text converted to derive values for each of the multiple elements. An utterance level confidence score can be determined, which can fall below an associated certainty threshold. Element-level confidence scores for each of the derived elements can then be ascertained. A first set of the multiple elements can have element-level confidence scores above an associated certainty threshold and a second set can have scores below. Values can be stored in data fields mapped to the first set. A prompt for input for the second set can be played.
    Type: Grant
    Filed: March 28, 2007
    Date of Patent: January 11, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Gerald M. McCobb, Paritosh D. Patel, Marc White
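
The entry above fills data fields partially from a single multi-element utterance: values with element-level confidence above a threshold are stored, and the remaining elements are re-prompted. The sketch below shows that split; the recognizer output, field names, and thresholds are invented for the example.

```python
# Sketch: element-level confidence scores decide which values are stored
# and which fields go into the follow-up prompt.

ELEMENT_THRESHOLD = 0.6

def partial_fill(recognized: dict[str, tuple[str, float]]) -> tuple[dict, list]:
    """recognized maps each element to (derived value, element-level confidence)."""
    filled, to_reprompt = {}, []
    for element, (value, confidence) in recognized.items():
        if confidence >= ELEMENT_THRESHOLD:
            filled[element] = value          # store in the mapped data field
        else:
            to_reprompt.append(element)      # include in the follow-up prompt
    return filled, to_reprompt

if __name__ == "__main__":
    # "Fly from Boca Raton to Boston tomorrow" recognized with mixed confidence.
    recognized = {
        "origin": ("Boca Raton", 0.91),
        "destination": ("Austin", 0.42),     # low confidence: re-prompt
        "date": ("tomorrow", 0.88),
    }
    filled, to_reprompt = partial_fill(recognized)
    print(filled)
    print(f"Please repeat: {', '.join(to_reprompt)}")
```
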
  • Publication number: 20100324889
    Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page; determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.
    Type: Application
    Filed: August 31, 2010
    Publication date: December 23, 2010
    Applicant: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
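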
  • Patent number: 7840409
    Abstract: Ordering recognition results produced by an automatic speech recognition (‘ASR’) engine for a multimodal application implemented with a grammar of the multimodal application in the ASR engine, with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine through a VoiceXML interpreter, includes: receiving, in the VoiceXML interpreter from the multimodal application, a voice utterance; determining, by the VoiceXML interpreter using the ASR engine, a plurality of recognition results in dependence upon the voice utterance and the grammar; determining, by the VoiceXML interpreter according to semantic interpretation scripts of the grammar, a weight for each recognition result; and sorting, by the VoiceXML interpreter, the plurality of recognition results in dependence upon the weight for each recognition result.
    Type: Grant
    Filed: February 27, 2007
    Date of Patent: November 23, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Igor R. Jablokov, Gerald McCobb
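
The entry above weights each recognition result via the grammar's semantic-interpretation scripts and sorts the results by weight. In the compact sketch below the weights are simply supplied so the sorting step can be shown; in the patented setting they would come from the grammar itself.

```python
# Sketch: sort recognition results by a per-result weight, highest first.

def order_recognition_results(results: list[tuple[str, float]]) -> list[str]:
    """Return the recognized texts ordered by weight, highest first."""
    return [text for text, _ in sorted(results, key=lambda r: r[1], reverse=True)]

if __name__ == "__main__":
    results = [("brown", 0.2), ("Braun", 0.7), ("crown", 0.1)]
    print(order_recognition_results(results))   # ['Braun', 'brown', 'crown']
```
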
  • Patent number: 7827033
    Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining by the multimodal application content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling by the multimodal application all the generated grammars for speech recognition.
    Type: Grant
    Filed: December 6, 2006
    Date of Patent: November 2, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
  • Patent number: 7809575
    Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page; determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.
    Type: Grant
    Filed: February 27, 2007
    Date of Patent: October 5, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
  • Patent number: 7739117
    Abstract: A computer-implemented method and system are provided for filling a graphic-based form field in response to a speech utterance. The computer-implemented method includes generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The method further includes creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the auto-fill event causing the filling of the form field with data corresponding to the user profile. The system includes a grammar-generating module for generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The system also includes an event module for creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the event causing the filling of the form field with data corresponding to the user profile.
    Type: Grant
    Filed: September 20, 2004
    Date of Patent: June 15, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
  • Publication number: 20090076818
    Abstract: A method (200) of building a grammar with baseforms generated dynamically from user utterances can include the steps of recording (205) a user utterance, generating (210) a baseform using the user utterance, creating or adding to (215) a grammar rule using the baseform, and binding (230) the grammar rule in a grammar document of a voice extensible markup language program. Generating a baseform can optionally include introducing a new element to VoiceXML with attributes that enable generating the baseform from a referenced recording such as the user utterance. In one embodiment, the method can be used to create (235) a phonebook and a grammar to access the phonebook by repeatedly visiting a form containing the grammar rule with attributes that enable generating the baseform from the referenced recording.
    Type: Application
    Filed: November 21, 2008
    Publication date: March 19, 2009
    Applicant: International Business Machines Corporation
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Brien H. Muschett
  • Patent number: 7487085
    Abstract: A method (200) of building a grammar with baseforms generated dynamically from user utterances can include the steps of recording (205) a user utterance, generating (210) a baseform using the user utterance, creating or adding to (215) a grammar rule using the baseform, and binding (230) the grammar rule in a grammar document of a voice extensible markup language program. Generating a baseform can optionally include introducing a new element to VoiceXML with attributes that enable generating the baseform from a referenced recording such as the user utterance. In one embodiment, the method can be used to create (235) a phonebook and a grammar to access the phonebook by repeatedly visiting a form containing the grammar rule with attributes that enable generating the baseform from the referenced recording.
    Type: Grant
    Filed: August 24, 2004
    Date of Patent: February 3, 2009
    Assignee: International Business Machines Corporation
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Brien H. Muschett
  • Publication number: 20080255851
    Abstract: Speech-enabled content navigation and control of a distributed multimodal browser is disclosed, the browser providing an execution environment for a multimodal application, the browser including a graphical user agent (‘GUA’) and a voice user agent (‘VUA’), the GUA operating on a multimodal device, the VUA operating on a voice server, that includes: transmitting, by the GUA, a link message to the VUA, the link message specifying voice commands that control the browser and an event corresponding to each voice command; receiving, by the GUA, a voice utterance from a user, the voice utterance specifying a particular voice command; transmitting, by the GUA, the voice utterance to the VUA for speech recognition by the VUA; receiving, by the GUA, an event message from the VUA, the event message specifying a particular event corresponding to the particular voice command; and controlling, by the GUA, the browser in dependence upon the particular event.
    Type: Application
    Filed: April 12, 2007
    Publication date: October 16, 2008
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Gerald M. McCobb
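
The entry above splits the multimodal browser into a graphical user agent (GUA) on the device and a voice user agent (VUA) on a voice server, linked by messages that map voice commands to browser events. The rough, in-process sketch below models the link message, recognition, and event message as plain function calls, whereas the patent places the VUA on a separate server; all command and event names are invented.

```python
# In-process stand-in for the GUA/VUA exchange; real deployments are distributed.

LINK_MESSAGE = {          # voice command -> browser event
    "scroll down": "browser.scroll_down",
    "go back": "browser.history_back",
}

def vua_recognize(utterance: str) -> str:
    # The VUA matches the utterance against the linked commands and returns
    # an event message naming the corresponding event.
    return LINK_MESSAGE.get(utterance, "browser.no_match")

def gua_handle_utterance(utterance: str) -> None:
    event = vua_recognize(utterance)          # event message from the VUA
    print(f"controlling browser with event: {event}")

if __name__ == "__main__":
    gua_handle_utterance("scroll down")
```
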
  • Publication number: 20080249782
    Abstract: Web service support for a multimodal client processing a multimodal application, the multimodal client providing an execution environment for the application and operating on a multimodal device supporting multiple modes of user interaction including a voice mode and one or more non-voice modes, the application stored on an application server, includes: receiving, by the server, an application request from the client that specifies the application and device characteristics; determining, by a multimodal adapter of the server, modality requirements for the application; selecting, by the adapter, a modality web service in dependence upon the modality requirements and the characteristics for the device; determining, by the adapter, whether the device supports VoIP in dependence upon the characteristics; providing, by the server, the application to the client; and providing, by the adapter to the client in dependence upon whether the device supports VoIP, access to the modality web service for processing the application.
    Type: Application
    Filed: April 4, 2007
    Publication date: October 9, 2008
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Gerald M. McCobb
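
In the entry above, a multimodal adapter on the server picks a modality web service from the application's modality requirements and the device's characteristics, including whether the device supports VoIP. The sketch below reduces that selection to a dictionary lookup plus a VoIP check; the service endpoints and the shape of the device characteristics are assumptions made for the example.

```python
# Sketch: choose a modality web service from the required modality and VoIP support.

MODALITY_SERVICES = {
    ("voice", True): "voip-voice-service.example.com",
    ("voice", False): "telephone-voice-service.example.com",
}

def select_modality_service(modality_required: str, device: dict) -> str:
    supports_voip = bool(device.get("voip", False))
    return MODALITY_SERVICES[(modality_required, supports_voip)]

if __name__ == "__main__":
    device_characteristics = {"screen": "320x240", "voip": False}
    print(select_modality_service("voice", device_characteristics))
```
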
  • Publication number: 20080243502
    Abstract: The invention discloses prompting for a spoken response that provides input for multiple elements. A single spoken utterance including content for multiple elements can be received, where each element is mapped to a data field. The spoken utterance can be speech-to-text converted to derive values for each of the multiple elements. An utterance level confidence score can be determined, which can fall below an associated certainty threshold. Element-level confidence scores for each of the derived elements can then be ascertained. A first set of the multiple elements can have element-level confidence scores above an associated certainty threshold and a second set can have scores below. Values can be stored in data fields mapped to the first set. A prompt for input for the second set can be played. Accordingly, data fields are partially filled in based upon the original speech utterance, where a second prompt for unfilled fields is played.
    Type: Application
    Filed: March 28, 2007
    Publication date: October 2, 2008
    Applicant: International Business Machines Corporation
    Inventors: Soonthorn Ativanichayaphong, Gerald M. McCobb, Paritosh D. Patel, Marc White
  • Publication number: 20080208586
    Abstract: Enabling natural language understanding using an X+V page of a multimodal application implemented with a statistical language model (‘SLM’) grammar of the multimodal application in an automatic speech recognition (‘ASR’) engine, with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine through a VoiceXML interpreter, including: receiving, in the ASR engine from the multimodal application, a voice utterance; generating, by the ASR engine according to the SLM grammar, at least one recognition result for the voice utterance; determining, by an action classifier for the VoiceXML interpreter, an action identifier in dependence upon the recognition result, the action identifier specifying an action to be performed by the multimodal application; and interpreting, by the VoiceXML interpreter, the multimodal application in dependence upon the action identifier.
    Type: Application
    Filed: February 27, 2007
    Publication date: August 28, 2008
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Gerald M. McCobb
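
The last entry routes a free-form recognition result from an SLM-based recognizer through an action classifier, which produces an action identifier that drives the interpreter. The toy sketch below stands in keyword matching for a trained classifier; the action identifiers and example utterance are invented.

```python
# Toy sketch of the action-classifier step: map a recognition result to an
# action identifier. Keyword matching stands in for a trained classifier.

def classify_action(recognition_result: str) -> str:
    text = recognition_result.lower()
    if "flight" in text or "fly" in text:
        return "action.book_flight"
    if "hotel" in text:
        return "action.book_hotel"
    return "action.unknown"

if __name__ == "__main__":
    # The recognition result would come from the ASR engine's SLM grammar.
    result = "I'd like to fly to Boston next Tuesday"
    action_id = classify_action(result)
    print(f"interpreting application with {action_id}")  # drives the dialog
```
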