Patents by Inventor Gerald M. McCobb
Gerald M. McCobb has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20120166201
Abstract: Methods, apparatus, and computer program products are described for invoking tapered prompts in a multimodal application implemented with a multimodal browser and a multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes. Embodiments include identifying, by a multimodal browser, a prompt element in a multimodal application; identifying, by the multimodal browser, one or more attributes associated with the prompt element; and playing a speech prompt according to the one or more attributes associated with the prompt element.
Type: Application
Filed: March 1, 2012
Publication date: June 28, 2012
Applicant: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
-
Patent number: 8150698
Abstract: Methods, apparatus, and computer program products are described for invoking tapered prompts in a multimodal application implemented with a multimodal browser and a multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes. Embodiments include identifying, by a multimodal browser, a prompt element in a multimodal application; identifying, by the multimodal browser, one or more attributes associated with the prompt element; and playing a speech prompt according to the one or more attributes associated with the prompt element.
Type: Grant
Filed: February 26, 2007
Date of Patent: April 3, 2012
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
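The tapering behavior the abstract describes can be sketched as follows. This is a minimal illustration, not the patented implementation; the class and method names are hypothetical, and the "attributes" are reduced to an ordered list of prompt variants, from most verbose to tersest.

```python
class PromptElement:
    """Hypothetical sketch of a tapered prompt: each replay of the
    prompt element selects an increasingly terse variant."""

    def __init__(self, prompts):
        # prompts: variants ordered from most verbose to most terse
        self.prompts = prompts
        self.play_count = 0

    def next_prompt(self):
        # Clamp to the tersest variant once the list is exhausted.
        idx = min(self.play_count, len(self.prompts) - 1)
        self.play_count += 1
        return self.prompts[idx]
```

In a real multimodal browser the variants would come from attributes on the prompt element in the page markup rather than a Python list.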
-
Patent number: 8095939
Abstract: A method for managing multimodal interactions can include the step of registering a multitude of modality components with a modality component server, wherein each modality component handles an interface modality for an application. The modality component can be connected to a device. A user interaction can be conveyed from the device to the modality component for processing. Results from the user interaction can be placed on a shared memory area of the modality component server.
Type: Grant
Filed: June 9, 2008
Date of Patent: January 10, 2012
Assignee: Nuance Communications, Inc.
Inventors: Akram A. Bou-Ghannam, Gerald M. McCobb
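The registration-and-shared-memory flow in this abstract can be sketched as below. All class names (`ModalityComponentServer`, `EchoComponent`) are illustrative assumptions; the real server would manage networked components, not in-process objects.

```python
class ModalityComponentServer:
    """Hypothetical sketch: components register per modality, and
    interaction results land on a shared memory area."""

    def __init__(self):
        self.components = {}      # modality name -> handler component
        self.shared_memory = {}   # results visible to all components

    def register(self, modality, component):
        self.components[modality] = component

    def handle_interaction(self, modality, user_input):
        # Route the interaction to the registered component, then place
        # the result on the shared memory area.
        result = self.components[modality].process(user_input)
        self.shared_memory[modality] = result
        return result


class EchoComponent:
    """Trivial stand-in for a modality component."""

    def process(self, user_input):
        return {"input": user_input, "status": "processed"}
```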
-
Patent number: 8073692
Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining by the multimodal application content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling by the multimodal application all the generated grammars for speech recognition.
Type: Grant
Filed: November 2, 2010
Date of Patent: December 6, 2011
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
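The per-element grammar generation this abstract walks through can be sketched as a simple data transformation. This is an assumption-laden illustration: the function name and dictionary shape are hypothetical, and real grammars would be markup segments (e.g. SRGS), not Python dicts.

```python
def generate_grammars(frames):
    """Hypothetical sketch: for each navigable element in each frame's
    content document, emit one grammar entry recording the words to
    match, the content to display on a match, and the target frame.

    frames: dict of frame name -> list of (link_text, target_content).
    """
    grammars = []
    for frame_name, links in frames.items():
        for link_text, target in links:
            grammars.append({
                "words": link_text.lower(),  # phrase the grammar matches
                "content": target,           # content shown on a match
                "frame": frame_name,         # frame that shows it
                "enabled": True,             # all grammars enabled at once
            })
    return grammars
```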
-
Patent number: 8073698
Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page; determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.
Type: Grant
Filed: August 31, 2010
Date of Patent: December 6, 2011
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
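The load/maintain/unload decision described in the abstract reduces to a small state machine, sketched below under the assumption that pages and grammars can be identified by name. The class name and method are hypothetical.

```python
class GlobalGrammarManager:
    """Hypothetical sketch of the global-grammar rule: pages inside the
    application accumulate grammars; leaving the application unloads all."""

    def __init__(self, app_pages):
        self.app_pages = set(app_pages)  # pages belonging to the application
        self.loaded = set()              # currently loaded global grammars

    def on_page_load(self, page, page_grammars):
        if page in self.app_pages:
            # In-application page: load any currently unloaded global
            # grammars it identifies; keep previously loaded ones.
            self.loaded |= set(page_grammars)
        else:
            # Navigated outside the application: unload everything.
            self.loaded.clear()
        return self.loaded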
-
Patent number: 8024194
Abstract: A multimodal browser for rendering a multimodal document on an end system defining a host can include a visual browser component for rendering visual content, if any, of the multimodal document, and a voice browser component for rendering voice-based content, if any, of the multimodal document. The voice browser component can determine which of a plurality of speech processing configurations is used by the host in rendering the voice-based content. The determination can be based upon the resources of the host running the application. The determination also can be based upon a processing instruction contained in the application.
Type: Grant
Filed: December 8, 2004
Date of Patent: September 20, 2011
Assignee: Nuance Communications, Inc.
Inventors: Charles W. Cross, Jr., David Jaramillo, Gerald M. McCobb
-
Patent number: 7953597
Abstract: A computer-implemented method and system are provided for filling a graphic-based form field in response to a speech utterance. The computer-implemented method includes generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The method further includes creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the auto-fill event causing the filling of the form field with data corresponding to the user profile. The system includes a grammar-generating module for generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The system also includes an event module for creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the event causing the filling of the form field with data corresponding to the user profile.
Type: Grant
Filed: August 9, 2005
Date of Patent: May 31, 2011
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
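The grammar-plus-semantic-interpretation flow in this abstract can be sketched as below. Everything here is illustrative: the profile data, phrase patterns, and function names are assumptions, and a real system would use speech recognition with an SISR-style semantic interpretation string rather than plain string matching.

```python
# Illustrative user profile; in practice this would come from stored user data.
USER_PROFILE = {"name": "Ada Lovelace", "city": "London"}

def generate_grammar(field):
    """Hypothetical sketch: the grammar pairs spoken phrases with a
    semantic interpretation string naming the profile key to fill from."""
    return {"phrases": [f"my {field}", f"fill in my {field}"],
            "interpretation": field}

def auto_fill(utterance, grammar, profile, form):
    """Fire an auto-fill event when the utterance matches the grammar,
    filling the form field with the corresponding profile data."""
    if utterance.lower() in grammar["phrases"]:
        key = grammar["interpretation"]
        form[key] = profile[key]
    return form
```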
-
Publication number: 20110093868
Abstract: A method for managing application modalities using dialogue states can include the step of asserting a set of activation conditions associated with a dialogue state of an application. Each of the activation conditions can be linked to at least one programmatic action, wherein different programmatic actions can be executed by different modality components. The application conditions can be monitored. An application event can be detected resulting in an associated application condition being run. At least one programmatic action linked to the application condition can be responsively initiated.
Type: Application
Filed: December 23, 2010
Publication date: April 21, 2011
Applicant: Nuance Communications, Inc.
Inventors: Akram A. Bou-Ghannam, Gerald M. McCobb
-
Publication number: 20110047452
Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining by the multimodal application content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling by the multimodal application all the generated grammars for speech recognition.
Type: Application
Filed: November 2, 2010
Publication date: February 24, 2011
Applicant: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
-
Patent number: 7870000
Abstract: The present disclosure relates to prompting for a spoken response that provides input for multiple elements. A single spoken utterance including content for multiple elements can be received, where each element is mapped to a data field. The spoken utterance can be speech-to-text converted to derive values for each of the multiple elements. An utterance level confidence score can be determined, which can fall below an associated certainty threshold. Element-level confidence scores for each of the derived elements can then be ascertained. A first set of the multiple elements can have element-level confidence scores above an associated certainty threshold and a second set can have scores below. Values can be stored in data fields mapped to the first set. A prompt for input for the second set can be played.
Type: Grant
Filed: March 28, 2007
Date of Patent: January 11, 2011
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Gerald M. McCobb, Paritosh D. Patel, Marc White
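The two-tier confidence check described in the abstract can be sketched as a partitioning step. The function name, tuple layout, and threshold value are all hypothetical; real confidence scores would come from a speech recognizer.

```python
def partition_by_confidence(elements, utterance_score, threshold=0.5):
    """Hypothetical sketch: if the utterance-level score clears the
    threshold, accept every derived value; otherwise fall back to
    element-level scores, filling confident fields and returning the
    rest for a follow-up prompt.

    elements: list of (field, value, element_level_score) tuples.
    Returns (filled_fields, fields_to_reprompt).
    """
    filled, reprompt = {}, []
    if utterance_score >= threshold:
        filled = {field: value for field, value, _ in elements}
        return filled, reprompt
    for field, value, score in elements:
        if score >= threshold:
            filled[field] = value      # first set: store the value
        else:
            reprompt.append(field)     # second set: prompt again
    return filled, reprompt
```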
-
Publication number: 20100324889
Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page; determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.
Type: Application
Filed: August 31, 2010
Publication date: December 23, 2010
Applicant: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
-
Patent number: 7848314
Abstract: Providing VOIP barge-in support for a half-duplex DSR client on a full-duplex network by buffering, in a half-duplex DSR client, input audio from the full-duplex network; playing, through the half-duplex DSR client, the buffered input audio; pausing, during voice activity on the half-duplex DSR client, the playing of the buffered input audio; sending, during voice activity on the half-duplex DSR client, speech for recognition through the full-duplex network to a voice server; receiving in the half-duplex DSR client through the full-duplex network from the voice server notification of speech recognition, the notification bearing a time stamp; and, responsive to receiving the notification, resuming the playing of the buffered input audio, including playing only buffered VOIP audio data bearing time stamps later than the time stamp of the recognition notification.
Type: Grant
Filed: May 10, 2006
Date of Patent: December 7, 2010
Assignee: Nuance Communications, Inc.
Inventors: Charles W. Cross, Jr., Yan Li, Gerald M. McCobb
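The buffer/pause/resume sequence in this abstract can be sketched with timestamped frames. This is a simplified in-memory illustration with hypothetical names; real audio would arrive as VOIP packets and playback would be asynchronous.

```python
class HalfDuplexBargeIn:
    """Hypothetical sketch of the barge-in logic: buffered audio frames
    carry time stamps; after a recognition notification arrives, only
    frames newer than the notification's time stamp remain to be played."""

    def __init__(self):
        self.buffer = []     # (timestamp, frame) pairs from the network
        self.paused = False

    def receive(self, timestamp, frame):
        self.buffer.append((timestamp, frame))

    def on_voice_activity(self):
        # Pause playback while the user speaks (half-duplex client).
        self.paused = True

    def on_recognition(self, notified_ts):
        # Resume, keeping only frames stamped later than the notification.
        self.paused = False
        self.buffer = [(ts, f) for ts, f in self.buffer if ts > notified_ts]
        return [f for _, f in self.buffer]
```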
-
Patent number: 7827033
Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining by the multimodal application content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling by the multimodal application all the generated grammars for speech recognition.
Type: Grant
Filed: December 6, 2006
Date of Patent: November 2, 2010
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
-
Patent number: 7809575
Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page; determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.
Type: Grant
Filed: February 27, 2007
Date of Patent: October 5, 2010
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
-
Patent number: 7739117
Abstract: A computer-implemented method and system are provided for filling a graphic-based form field in response to a speech utterance. The computer-implemented method includes generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The method further includes creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the auto-fill event causing the filling of the form field with data corresponding to the user profile. The system includes a grammar-generating module for generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The system also includes an event module for creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the event causing the filling of the form field with data corresponding to the user profile.
Type: Grant
Filed: September 20, 2004
Date of Patent: June 15, 2010
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
-
Publication number: 20080255851
Abstract: Speech-enabled content navigation and control of a distributed multimodal browser is disclosed, the browser providing an execution environment for a multimodal application, the browser including a graphical user agent (‘GUA’) and a voice user agent (‘VUA’), the GUA operating on a multimodal device, the VUA operating on a voice server, that includes: transmitting, by the GUA, a link message to the VUA, the link message specifying voice commands that control the browser and an event corresponding to each voice command; receiving, by the GUA, a voice utterance from a user, the voice utterance specifying a particular voice command; transmitting, by the GUA, the voice utterance to the VUA for speech recognition by the VUA; receiving, by the GUA, an event message from the VUA, the event message specifying a particular event corresponding to the particular voice command; and controlling, by the GUA, the browser in dependence upon the particular event.
Type: Application
Filed: April 12, 2007
Publication date: October 16, 2008
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Gerald M. McCobb
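The GUA/VUA message exchange this abstract describes can be sketched as two cooperating objects. This collapses the distributed link-message/event-message protocol into direct method calls; the class names, command strings, and event identifiers are all hypothetical.

```python
class VoiceUserAgent:
    """Hypothetical sketch of the VUA: it receives a link message mapping
    voice commands to events, then recognizes utterances against it."""

    def __init__(self):
        self.command_events = {}

    def link(self, command_events):
        # Link message: voice commands paired with their browser events.
        self.command_events = command_events

    def recognize(self, utterance):
        # Event message: the event for a recognized command, else None.
        return self.command_events.get(utterance.lower())


class GraphicalUserAgent:
    """Hypothetical sketch of the GUA: it transmits the link message,
    forwards utterances, and acts on the returned event."""

    def __init__(self, vua):
        self.vua = vua
        self.vua.link({"go back": "history.back", "reload": "page.reload"})

    def on_utterance(self, utterance):
        event = self.vua.recognize(utterance)
        return f"handled:{event}" if event else "ignored"
```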
-
Publication number: 20080249782
Abstract: Web service support for a multimodal client processing a multimodal application, the multimodal client providing an execution environment for the application and operating on a multimodal device supporting multiple modes of user interaction including a voice mode and one or more non-voice modes, the application stored on an application server, includes: receiving, by the server, an application request from the client that specifies the application and device characteristics; determining, by a multimodal adapter of the server, modality requirements for the application; selecting, by the adapter, a modality web service in dependence upon the modality requirements and the characteristics for the device; determining, by the adapter, whether the device supports VoIP in dependence upon the characteristics; providing, by the server, the application to the client; and providing, by the adapter to the client in dependence upon whether the device supports VoIP, access to the modality web service for processing the application.
Type: Application
Filed: April 4, 2007
Publication date: October 9, 2008
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Gerald M. McCobb
-
Publication number: 20080244059
Abstract: A method for managing multimodal interactions can include the step of registering a multitude of modality components with a modality component server, wherein each modality component handles an interface modality for an application. The modality component can be connected to a device. A user interaction can be conveyed from the device to the modality component for processing. Results from the user interaction can be placed on a shared memory area of the modality component server.
Type: Application
Filed: June 9, 2008
Publication date: October 2, 2008
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Akram A. Bou-Ghannam, Gerald M. McCobb
-
Publication number: 20080243502
Abstract: The invention discloses prompting for a spoken response that provides input for multiple elements. A single spoken utterance including content for multiple elements can be received, where each element is mapped to a data field. The spoken utterance can be speech-to-text converted to derive values for each of the multiple elements. An utterance level confidence score can be determined, which can fall below an associated certainty threshold. Element-level confidence scores for each of the derived elements can then be ascertained. A first set of the multiple elements can have element-level confidence scores above an associated certainty threshold and a second set can have scores below. Values can be stored in data fields mapped to the first set. A prompt for input for the second set can be played. Accordingly, data fields are partially filled in based upon the original speech utterance, where a second prompt for unfilled fields is played.
Type: Application
Filed: March 28, 2007
Publication date: October 2, 2008
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Soonthorn Ativanichayaphong, Gerald M. McCobb, Paritosh D. Patel, Marc White
-
Publication number: 20080208586
Abstract: Enabling natural language understanding using an X+V page of a multimodal application implemented with a statistical language model (‘SLM’) grammar of the multimodal application in an automatic speech recognition (‘ASR’) engine, with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine through a VoiceXML interpreter, including: receiving, in the ASR engine from the multimodal application, a voice utterance; generating, by the ASR engine according to the SLM grammar, at least one recognition result for the voice utterance; determining, by an action classifier for the VoiceXML interpreter, an action identifier in dependence upon the recognition result, the action identifier specifying an action to be performed by the multimodal application; and interpreting, by the VoiceXML interpreter, the multimodal application in dependence upon …
Type: Application
Filed: February 27, 2007
Publication date: August 28, 2008
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Gerald M. McCobb