Patents by Inventor Gerald M. McCobb

Gerald M. McCobb has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

  • Publication number: 20120166201
    Abstract: Methods, apparatus, and computer program products are described for invoking tapered prompts in a multimodal application implemented with a multimodal browser and a multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes. Embodiments include identifying, by a multimodal browser, a prompt element in a multimodal application; identifying, by the multimodal browser, one or more attributes associated with the prompt element; and playing a speech prompt according to the one or more attributes associated with the prompt element.
    Type: Application
    Filed: March 1, 2012
    Publication date: June 28, 2012
    Applicant: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
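    The tapered-prompt behavior described in the abstract above can be pictured with a short sketch. The TypeScript below is a hypothetical illustration, not the patented implementation: a prompt element carries variant texts keyed by how often the same field has been prompted, and the browser plays the variant matching the current count.

      // Hypothetical sketch of tapered prompting; all identifiers are illustrative.
      interface PromptElement {
        fieldId: string;
        // Variant prompt text keyed by prompt count (1 = first ask, 2 = first re-ask, ...).
        variants: Map<number, string>;
      }

      class TaperedPrompter {
        private counts = new Map<string, number>();

        // Identify the prompt element's attributes and play the matching variant.
        play(prompt: PromptElement, speak: (text: string) => void): void {
          const count = (this.counts.get(prompt.fieldId) ?? 0) + 1;
          this.counts.set(prompt.fieldId, count);
          // Fall back to the last defined variant once the count exceeds them.
          const text =
            prompt.variants.get(count) ??
            prompt.variants.get(Math.max(...Array.from(prompt.variants.keys()))) ??
            "";
          speak(text);
        }
      }

      // Usage: successive prompts for the same field taper from verbose to terse.
      const prompter = new TaperedPrompter();
      const city: PromptElement = {
        fieldId: "city",
        variants: new Map([
          [1, "Welcome. Please say the name of the city you are travelling to."],
          [2, "Which city?"],
          [3, "City?"],
        ]),
      };
      prompter.play(city, console.log);
      prompter.play(city, console.log);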
  • Patent number: 8150698
    Abstract: Methods, apparatus, and computer program products are described for invoking tapered prompts in a multimodal application implemented with a multimodal browser and a multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes. Embodiments include identifying, by a multimodal browser, a prompt element in a multimodal application; identifying, by the multimodal browser, one or more attributes associated with the prompt element; and playing a speech prompt according to the one or more attributes associated with the prompt element.
    Type: Grant
    Filed: February 26, 2007
    Date of Patent: April 3, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
  • Patent number: 8095939
    Abstract: A method for managing multimodal interactions can include the step of registering a multitude of modality components with a modality component server, wherein each modality component handles an interface modality for an application. The modality component can be connected to a device. A user interaction can be conveyed from the device to the modality component for processing. Results from the user interaction can be placed on a shared memory area of the modality component server.
    Type: Grant
    Filed: June 9, 2008
    Date of Patent: January 10, 2012
    Assignee: Nuance Communications, Inc.
    Inventors: Akram A. Bou-Ghannam, Gerald M. McCobb
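    As a rough illustration of the modality component server described above, the TypeScript sketch below registers modality components, conveys a user interaction to the matching component, and places the result on a shared memory area. Every name is hypothetical and not taken from the patent.

      // Hypothetical sketch of a modality component server with a shared memory area.
      interface ModalityComponent {
        modality: string; // e.g. "voice", "gui"
        process(interaction: string): string; // returns a result for the shared area
      }

      class ModalityComponentServer {
        private components = new Map<string, ModalityComponent>();
        private sharedMemory = new Map<string, string>(); // the shared memory area

        register(component: ModalityComponent): void {
          this.components.set(component.modality, component);
        }

        // Convey a user interaction from the device to the matching component and
        // place the result on the shared memory area.
        handleInteraction(modality: string, interaction: string): void {
          const component = this.components.get(modality);
          if (!component) throw new Error(`no component registered for ${modality}`);
          this.sharedMemory.set(modality, component.process(interaction));
        }

        read(modality: string): string | undefined {
          return this.sharedMemory.get(modality);
        }
      }

      // Usage
      const server = new ModalityComponentServer();
      server.register({ modality: "voice", process: (u) => `recognized: ${u}` });
      server.handleInteraction("voice", "show my account balance");
      console.log(server.read("voice"));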
  • Patent number: 8073692
    Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining by the multimodal application content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling by the multimodal application all the generated grammars for speech recognition.
    Type: Grant
    Filed: November 2, 2010
    Date of Patent: December 6, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
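    The per-frame grammar generation described in the abstract above can be sketched as follows. The TypeScript is illustrative only: each navigable element in each frame's content document yields a grammar entry recording the words to match, the content to display on a match, and the frame to display it in.

      // Hypothetical sketch: build one grammar entry per navigable element per frame.
      interface NavigableElement {
        words: string; // e.g. link text a user might speak
        href: string;  // content to display when the words are matched
      }

      interface ContentDocument {
        frameName: string;
        elements: NavigableElement[];
      }

      interface GrammarEntry {
        words: string;
        targetContent: string;
        targetFrame: string;
      }

      function generateFrameGrammars(frameset: ContentDocument[]): GrammarEntry[] {
        const grammars: GrammarEntry[] = [];
        for (const doc of frameset) {
          for (const el of doc.elements) {
            grammars.push({ words: el.words, targetContent: el.href, targetFrame: doc.frameName });
          }
        }
        return grammars; // the browser would then enable all of these for recognition
      }

      // Usage: a matched utterance navigates the named frame to the linked content.
      const enabled = generateFrameGrammars([
        { frameName: "main", elements: [{ words: "sports news", href: "/sports.html" }] },
      ]);
      const match = enabled.find((g) => g.words === "sports news");
      if (match) console.log(`load ${match.targetContent} into frame ${match.targetFrame}`);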
  • Patent number: 8073698
    Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page; determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.
    Type: Grant
    Filed: August 31, 2010
    Date of Patent: December 6, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
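    The load/keep/unload decision for global grammars reduces to a small piece of bookkeeping, sketched below in TypeScript under assumed data shapes; the identifiers are not taken from the patent.

      // Hypothetical sketch of global-grammar bookkeeping across page loads.
      interface MultimodalPage {
        applicationId: string;
        globalGrammars: string[]; // global grammar URIs identified in the page
      }

      class GlobalGrammarManager {
        private loaded = new Set<string>();

        constructor(private readonly applicationId: string) {}

        onPageLoad(page: MultimodalPage): void {
          if (page.applicationId === this.applicationId) {
            // Page belongs to the application: load any currently unloaded global
            // grammars it identifies and keep every previously loaded one.
            for (const uri of page.globalGrammars) this.loaded.add(uri);
          } else {
            // Page belongs to some other application: unload all global grammars.
            this.loaded.clear();
          }
        }

        active(): string[] {
          return Array.from(this.loaded);
        }
      }

      // Usage
      const mgr = new GlobalGrammarManager("flight-booking");
      mgr.onPageLoad({ applicationId: "flight-booking", globalGrammars: ["help.grxml"] });
      mgr.onPageLoad({ applicationId: "flight-booking", globalGrammars: ["nav.grxml"] });
      console.log(mgr.active()); // both grammars remain loaded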
  • Patent number: 8024194
    Abstract: A multimodal browser for rendering a multimodal document on an end system defining a host can include a visual browser component for rendering visual content, if any, of the multimodal document, and a voice browser component for rendering voice-based content, if any, of the multimodal document. The voice browser component can determine which of a plurality of speech processing configurations is used by the host in rendering the voice-based content. The determination can be based upon the resources of the host running the application. The determination also can be based upon a processing instruction contained in the application.
    Type: Grant
    Filed: December 8, 2004
    Date of Patent: September 20, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Charles W. Cross, Jr., David Jaramillo, Gerald M. McCobb
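    The configuration choice made by the voice browser component can be pictured with the hypothetical sketch below: an explicit processing instruction wins, otherwise the host's resources decide. The configuration names and thresholds are invented for illustration.

      // Hypothetical sketch: pick a speech processing configuration for the host.
      type SpeechConfig = "local-embedded" | "remote-server" | "hybrid";

      interface HostResources {
        memoryMb: number;
        hasNetwork: boolean;
      }

      function chooseSpeechConfig(
        host: HostResources,
        processingInstruction?: SpeechConfig,
      ): SpeechConfig {
        // A processing instruction contained in the application takes precedence.
        if (processingInstruction) return processingInstruction;
        // Otherwise decide from the resources of the host running the application.
        if (host.memoryMb >= 512) return "local-embedded";
        return host.hasNetwork ? "remote-server" : "local-embedded";
      }

      // Usage
      console.log(chooseSpeechConfig({ memoryMb: 128, hasNetwork: true }));           // remote-server
      console.log(chooseSpeechConfig({ memoryMb: 128, hasNetwork: true }, "hybrid")); // hybrid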
  • Patent number: 7953597
    Abstract: A computer-implemented method and system are provided for filling a graphic-based form field in response to a speech utterance. The computer-implemented method includes generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The method further includes creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the auto-fill event causing the filling of the form field with data corresponding to the user profile. The system includes a grammar-generating module for generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The system also includes an event module for creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the event causing the filling of the form field with data corresponding to the user profile.
    Type: Grant
    Filed: August 9, 2005
    Date of Patent: May 31, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
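    A minimal sketch of the auto-fill flow described above, assuming a simple key/value user profile: a grammar is generated from the profile, and a recognition whose semantic interpretation names a profile key raises an auto-fill event that fills the field. None of these names come from the patent.

      // Hypothetical sketch of speech-driven auto-fill from a user profile.
      interface UserProfile {
        [key: string]: string; // e.g. { homeAddress: "123 Main St" }
      }

      interface GrammarRule {
        phrase: string;         // what the user may say, e.g. "my home address"
        interpretation: string; // semantic interpretation string -> profile key
      }

      function generateGrammar(profile: UserProfile): GrammarRule[] {
        return Object.keys(profile).map((key) => ({
          phrase: `my ${key.replace(/([A-Z])/g, " $1").toLowerCase()}`,
          interpretation: key,
        }));
      }

      function onUtterance(
        utterance: string,
        grammar: GrammarRule[],
        profile: UserProfile,
        fillField: (value: string) => void,
      ): void {
        const rule = grammar.find((r) => utterance.includes(r.phrase));
        if (rule) fillField(profile[rule.interpretation]); // the auto-fill event
      }

      // Usage: saying "use my home address" fills the field from the profile.
      const profile: UserProfile = { homeAddress: "123 Main St, Boca Raton FL" };
      onUtterance("use my home address", generateGrammar(profile), profile, (v) =>
        console.log(`field filled with: ${v}`),
      );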
  • Publication number: 20110093868
    Abstract: A method for managing application modalities using dialogue states can include the step of asserting a set of activation conditions associated with a dialogue state of an application. Each of the activation conditions can be linked to at least one programmatic action, wherein different programmatic actions can be executed by different modality components. The activation conditions can be monitored. An application event can be detected that results in an associated activation condition being satisfied. At least one programmatic action linked to that activation condition can be responsively initiated.
    Type: Application
    Filed: December 23, 2010
    Publication date: April 21, 2011
    Applicant: Nuance Communications, Inc.
    Inventors: Akram A. Bou-Ghannam, Gerald M. McCobb
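    The dialogue-state mechanism above can be approximated in a few lines: entering a state asserts activation conditions, each linked to actions that may run on different modality components, and a matching application event initiates those actions. The TypeScript below is a hypothetical illustration only.

      // Hypothetical sketch of activation conditions tied to a dialogue state.
      interface ActivationCondition {
        matches(event: string): boolean;
        actions: Array<() => void>; // actions may run on different modality components
      }

      class DialogueStateManager {
        private conditions: ActivationCondition[] = [];

        // Assert the activation conditions associated with the current dialogue state.
        enterState(conditions: ActivationCondition[]): void {
          this.conditions = conditions;
        }

        // Monitor application events; initiate actions linked to a satisfied condition.
        onEvent(event: string): void {
          for (const condition of this.conditions) {
            if (condition.matches(event)) condition.actions.forEach((run) => run());
          }
        }
      }

      // Usage
      const manager = new DialogueStateManager();
      manager.enterState([
        {
          matches: (e) => e === "form.submitted",
          actions: [
            () => console.log("voice component: play confirmation prompt"),
            () => console.log("GUI component: show confirmation page"),
          ],
        },
      ]);
      manager.onEvent("form.submitted");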
  • Publication number: 20110047452
    Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining by the multimodal application content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling by the multimodal application all the generated grammars for speech recognition.
    Type: Application
    Filed: November 2, 2010
    Publication date: February 24, 2011
    Applicant: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
  • Patent number: 7870000
    Abstract: The present disclosure relates to prompting for a spoken response that provides input for multiple elements. A single spoken utterance including content for multiple elements can be received, where each element is mapped to a data field. The spoken utterance can be speech-to-text converted to derive values for each of the multiple elements. An utterance level confidence score can be determined, which can fall below an associated certainty threshold. Element-level confidence scores for each of the derived elements can then be ascertained. A first set of the multiple elements can have element-level confidence scores above an associated certainty threshold and a second set can have scores below. Values can be stored in data fields mapped to the first set. A prompt for input for the second set can be played.
    Type: Grant
    Filed: March 28, 2007
    Date of Patent: January 11, 2011
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Gerald M. McCobb, Paritosh D. Patel, Marc White
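    The confidence-driven partial fill described above is easy to sketch: when the utterance-level score is low, keep only the elements whose element-level scores clear the threshold and re-prompt for the rest. The TypeScript below uses invented thresholds and field names.

      // Hypothetical sketch of partial form filling from one multi-element utterance.
      interface RecognizedElement {
        field: string;
        value: string;
        confidence: number; // element-level confidence score
      }

      function partialFill(
        utteranceConfidence: number,
        elements: RecognizedElement[],
        threshold = 0.5,
      ): { filled: Record<string, string>; reprompt: string[] } {
        const filled: Record<string, string> = {};
        const reprompt: string[] = [];

        if (utteranceConfidence >= threshold) {
          // High overall confidence: accept every derived element.
          for (const el of elements) filled[el.field] = el.value;
          return { filled, reprompt };
        }
        // Low overall confidence: split elements by their own confidence scores.
        for (const el of elements) {
          if (el.confidence >= threshold) filled[el.field] = el.value;
          else reprompt.push(el.field);
        }
        return { filled, reprompt };
      }

      // Usage: "from Miami to Boston tomorrow" with a noisy destination
      const result = partialFill(0.42, [
        { field: "origin", value: "Miami", confidence: 0.91 },
        { field: "destination", value: "Boston", confidence: 0.31 },
        { field: "date", value: "tomorrow", confidence: 0.88 },
      ]);
      console.log(result.filled);   // origin and date are stored
      console.log(result.reprompt); // ["destination"] gets a follow-up prompt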
  • Publication number: 20100324889
    Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page; determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.
    Type: Application
    Filed: August 31, 2010
    Publication date: December 23, 2010
    Applicant: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
  • Patent number: 7848314
    Abstract: Providing VOIP barge-in support for a half-duplex DSR client on a full-duplex network by buffering, in a half-duplex DSR client, input audio from the full-duplex network; playing, through the half-duplex DSR client, the buffered input audio; pausing, during voice activity on the half-duplex DSR client, the playing of the buffered input audio; sending, during voice activity on the half-duplex DSR client, speech for recognition through the full-duplex network to a voice server; receiving in the half-duplex DSR client through the full-duplex network from the voice server notification of speech recognition, the notification bearing a time stamp; and, responsive to receiving the notification, resuming the playing of the buffered input audio, including playing only buffered VOIP audio data bearing time stamps later than the time stamp of the recognition notification.
    Type: Grant
    Filed: May 10, 2006
    Date of Patent: December 7, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Charles W. Cross, Jr., Yan Li, Gerald M. McCobb
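    The barge-in bookkeeping described above hinges on time stamps: playback of buffered audio pauses during voice activity and, once the voice server's recognition notification arrives, resumes only with packets stamped later than the notification. A hypothetical TypeScript sketch follows; the types are illustrative.

      // Hypothetical sketch of half-duplex barge-in over buffered VOIP audio.
      interface AudioPacket {
        timestamp: number; // ms
        samples: Int16Array;
      }

      class HalfDuplexDsrClient {
        private buffer: AudioPacket[] = [];
        private paused = false;

        receive(packet: AudioPacket): void {
          this.buffer.push(packet); // buffer input audio from the full-duplex network
        }

        onVoiceActivityStart(): void {
          this.paused = true; // pause playback while the user is speaking
        }

        // Called when the voice server's recognition notification arrives.
        onRecognitionNotification(notificationTimestamp: number): AudioPacket[] {
          this.paused = false;
          // Resume with only the buffered audio stamped later than the notification.
          this.buffer = this.buffer.filter((p) => p.timestamp > notificationTimestamp);
          return this.buffer;
        }

        isPlaying(): boolean {
          return !this.paused;
        }
      }

      // Usage
      const client = new HalfDuplexDsrClient();
      client.receive({ timestamp: 1000, samples: new Int16Array(160) });
      client.onVoiceActivityStart();
      client.receive({ timestamp: 2000, samples: new Int16Array(160) });
      console.log(client.onRecognitionNotification(1500).map((p) => p.timestamp)); // [2000]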
  • Patent number: 7827033
    Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining by the multimodal application content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling by the multimodal application all the generated grammars for speech recognition.
    Type: Grant
    Filed: December 6, 2006
    Date of Patent: November 2, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
  • Patent number: 7809575
    Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page; determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.
    Type: Grant
    Filed: February 27, 2007
    Date of Patent: October 5, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
  • Patent number: 7739117
    Abstract: A computer-implemented method and system are provided for filling a graphic-based form field in response to a speech utterance. The computer-implemented method includes generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The method further includes creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the auto-fill event causing the filling of the form field with data corresponding to the user profile. The system includes a grammar-generating module for generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The system also includes an event module for creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the event causing the filling of the form field with data corresponding to the user profile.
    Type: Grant
    Filed: September 20, 2004
    Date of Patent: June 15, 2010
    Assignee: Nuance Communications, Inc.
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
  • Publication number: 20080255851
    Abstract: Speech-enabled content navigation and control of a distributed multimodal browser is disclosed, the browser providing an execution environment for a multimodal application, the browser including a graphical user agent (‘GUA’) and a voice user agent (‘VUA’), the GUA operating on a multimodal device, the VUA operating on a voice server, that includes: transmitting, by the GUA, a link message to the VUA, the link message specifying voice commands that control the browser and an event corresponding to each voice command; receiving, by the GUA, a voice utterance from a user, the voice utterance specifying a particular voice command; transmitting, by the GUA, the voice utterance to the VUA for speech recognition by the VUA; receiving, by the GUA, an event message from the VUA, the event message specifying a particular event corresponding to the particular voice command; and controlling, by the GUA, the browser in dependence upon the particular event.
    Type: Application
    Filed: April 12, 2007
    Publication date: October 16, 2008
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Gerald M. McCobb
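    The GUA/VUA exchange above can be mocked in-process to show the flow: a link message maps voice commands to browser events, utterances go to the VUA for recognition, and the GUA acts on the event message that comes back. The message shapes below are assumptions, not the patent's protocol.

      // Hypothetical sketch of the link-message / event-message round trip.
      interface LinkMessage {
        commands: { voiceCommand: string; event: string }[];
      }

      interface EventMessage {
        event: string;
      }

      // Stand-in for the voice user agent running on a voice server.
      class VoiceUserAgent {
        private link: LinkMessage = { commands: [] };

        receiveLink(link: LinkMessage): void {
          this.link = link;
        }

        recognize(utterance: string): EventMessage | undefined {
          const hit = this.link.commands.find((c) => utterance.includes(c.voiceCommand));
          return hit ? { event: hit.event } : undefined;
        }
      }

      // Stand-in for the graphical user agent running on the multimodal device.
      class GraphicalUserAgent {
        constructor(private vua: VoiceUserAgent) {
          this.vua.receiveLink({
            commands: [
              { voiceCommand: "go back", event: "browser.back" },
              { voiceCommand: "scroll down", event: "browser.scrollDown" },
            ],
          });
        }

        onVoiceUtterance(utterance: string): void {
          const message = this.vua.recognize(utterance); // recognition happens on the VUA
          if (message) console.log(`GUA controls the browser: ${message.event}`);
        }
      }

      new GraphicalUserAgent(new VoiceUserAgent()).onVoiceUtterance("please go back");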
  • Publication number: 20080249782
    Abstract: Web service support for a multimodal client processing a multimodal application, the multimodal client providing an execution environment for the application and operating on a multimodal device supporting multiple modes of user interaction including a voice mode and one or more non-voice modes, the application stored on an application server, includes: receiving, by the server, an application request from the client that specifies the application and device characteristics; determining, by a multimodal adapter of the server, modality requirements for the application; selecting, by the adapter, a modality web service in dependence upon the modality requirements and the characteristics for the device; determining, by the adapter, whether the device supports VoIP in dependence upon the characteristics; providing, by the server, the application to the client; and providing, by the adapter to the client in dependence upon whether the device supports VoIP, access to the modality web service for processing the application.
    Type: Application
    Filed: April 4, 2007
    Publication date: October 9, 2008
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Gerald M. McCobb
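    The adapter's selection step described above amounts to matching the application's modality requirements against the device's characteristics and choosing a transport based on VoIP support. The sketch below invents the service names and data shapes purely for illustration.

      // Hypothetical sketch of modality web service selection by a multimodal adapter.
      interface DeviceCharacteristics {
        supportsVoip: boolean;
        hasMicrophone: boolean;
      }

      interface ModalityRequirements {
        needsVoice: boolean;
      }

      interface ModalityWebService {
        name: string;
        transport: "voip" | "codec-over-http";
      }

      function selectModalityWebService(
        requirements: ModalityRequirements,
        device: DeviceCharacteristics,
      ): ModalityWebService | undefined {
        if (!requirements.needsVoice || !device.hasMicrophone) return undefined;
        // Whether the device supports VoIP decides how audio reaches the service.
        return device.supportsVoip
          ? { name: "voice-ws-rtp", transport: "voip" }
          : { name: "voice-ws-http", transport: "codec-over-http" };
      }

      // Usage
      console.log(
        selectModalityWebService({ needsVoice: true }, { supportsVoip: false, hasMicrophone: true }),
      );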
  • Publication number: 20080244059
    Abstract: A method for managing multimodal interactions can include the step of registering a multitude of modality components with a modality component server, wherein each modality component handles an interface modality for an application. The modality component can be connected to a device. A user interaction can be conveyed from the device to the modality component for processing. Results from the user interaction can be placed on a shared memory area of the modality component server.
    Type: Application
    Filed: June 9, 2008
    Publication date: October 2, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Akram A. Bou-Ghannam, Gerald M. McCobb
  • Publication number: 20080243502
    Abstract: The invention discloses prompting for a spoken response that provides input for multiple elements. A single spoken utterance including content for multiple elements can be received, where each element is mapped to a data field. The spoken utterance can be speech-to-text converted to derive values for each of the multiple elements. An utterance level confidence score can be determined, which can fall below an associated certainty threshold. Element-level confidence scores for each of the derived elements can then be ascertained. A first set of the multiple elements can have element-level confidence scores above an associated certainty threshold and a second set can have scores below. Values can be stored in data fields mapped to the first set. A prompt for input for the second set can be played. Accordingly, data fields are partially filled in based upon the original speech utterance, where a second prompt for unfilled fields is played.
    Type: Application
    Filed: March 28, 2007
    Publication date: October 2, 2008
    Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
    Inventors: Soonthorn Ativanichayaphong, Gerald M. McCobb, Paritosh D. Patel, Marc White
  • Publication number: 20080208586
    Abstract: Enabling natural language understanding using an X+V page of a multimodal application implemented with a statistical language model (‘SLM’) grammar of the multimodal application in an automatic speech recognition (‘ASR’) engine, with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine through a VoiceXML interpreter, including: receiving, in the ASR engine from the multimodal application, a voice utterance; generating, by the ASR engine according to the SLM grammar, at least one recognition result for the voice utterance; determining, by an action classifier for the VoiceXML interpreter, an action identifier in dependence upon the recognition result, the action identifier specifying an action to be performed by the multimodal application; and interpreting, by the VoiceXML interpreter, the multimodal application in dependence upon the action identifier.
    Type: Application
    Filed: February 27, 2007
    Publication date: August 28, 2008
    Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Gerald M. McCobb
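    The action-classifier step above maps a free-form recognition result to an action identifier that then drives interpretation of the page. The sketch below substitutes a crude keyword classifier for the statistical one the abstract implies; every identifier is hypothetical.

      // Hypothetical sketch: classify a recognition result and act on the identifier.
      type ActionId = "book-flight" | "check-status" | "unknown";

      function classifyAction(recognitionResult: string): ActionId {
        const text = recognitionResult.toLowerCase();
        if (/\b(book|reserve|fly to)\b/.test(text)) return "book-flight";
        if (/\b(status|arrival|delayed)\b/.test(text)) return "check-status";
        return "unknown";
      }

      function interpretPage(actionId: ActionId): string {
        // The interpreter would continue the dialog in dependence upon the action.
        switch (actionId) {
          case "book-flight":
            return "jump to the booking form";
          case "check-status":
            return "jump to the flight-status form";
          default:
            return "play a help prompt";
        }
      }

      // Usage: free-form speech recognized against an SLM grammar
      const utteranceText = "uh I'd like to fly to Boston on Friday";
      console.log(interpretPage(classifyAction(utteranceText)));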