Patents by Inventor Gerald M. McCobb
Gerald M. McCobb has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20120166201
Abstract: Methods, apparatus, and computer program products are described for invoking tapered prompts in a multimodal application implemented with a multimodal browser and a multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes. Embodiments include identifying, by a multimodal browser, a prompt element in a multimodal application; identifying, by the multimodal browser, one or more attributes associated with the prompt element; and playing a speech prompt according to the one or more attributes associated with the prompt element.
Type: Application
Filed: March 1, 2012
Publication date: June 28, 2012
Applicant: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
-
Patent number: 8150698
Abstract: Methods, apparatus, and computer program products are described for invoking tapered prompts in a multimodal application implemented with a multimodal browser and a multimodal application operating on a multimodal device supporting multiple modes of user interaction with the multimodal application, the modes of user interaction including a voice mode and one or more non-voice modes. Embodiments include identifying, by a multimodal browser, a prompt element in a multimodal application; identifying, by the multimodal browser, one or more attributes associated with the prompt element; and playing a speech prompt according to the one or more attributes associated with the prompt element.
Type: Grant
Filed: February 26, 2007
Date of Patent: April 3, 2012
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
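The tapering behavior the abstract describes can be sketched as follows. This is a minimal illustration, not the patented implementation; the class and method names are hypothetical, and the "attributes" are reduced to an ordered list of prompt variants, from most verbose to tersest.

```python
class PromptElement:
    """Hypothetical sketch of a tapered prompt: each replay of the
    prompt element selects an increasingly terse variant."""

    def __init__(self, prompts):
        # prompts: variants ordered from most verbose to most terse
        self.prompts = prompts
        self.play_count = 0

    def next_prompt(self):
        # Clamp to the tersest variant once the list is exhausted.
        idx = min(self.play_count, len(self.prompts) - 1)
        self.play_count += 1
        return self.prompts[idx]
```

In a real multimodal browser the variants would come from attributes on the prompt element in the page markup rather than a Python list.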
-
Patent number: 8095939
Abstract: A method for managing multimodal interactions can include the step of registering a multitude of modality components with a modality component server, wherein each modality component handles an interface modality for an application. The modality component can be connected to a device. A user interaction can be conveyed from the device to the modality component for processing. Results from the user interaction can be placed on a shared memory area of the modality component server.
Type: Grant
Filed: June 9, 2008
Date of Patent: January 10, 2012
Assignee: Nuance Communications, Inc.
Inventors: Akram A. Bou-Ghannam, Gerald M. McCobb
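The registration-and-shared-memory flow in this abstract can be sketched as below. All class names (`ModalityComponentServer`, `EchoComponent`) are illustrative assumptions; the real server would manage networked components, not in-process objects.

```python
class ModalityComponentServer:
    """Hypothetical sketch: components register per modality, and
    interaction results land on a shared memory area."""

    def __init__(self):
        self.components = {}      # modality name -> handler component
        self.shared_memory = {}   # results visible to all components

    def register(self, modality, component):
        self.components[modality] = component

    def handle_interaction(self, modality, user_input):
        # Route the interaction to the registered component, then place
        # the result on the shared memory area.
        result = self.components[modality].process(user_input)
        self.shared_memory[modality] = result
        return result


class EchoComponent:
    """Trivial stand-in for a modality component."""

    def process(self, user_input):
        return {"input": user_input, "status": "processed"}
```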
-
Patent number: 8073692
Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining by the multimodal application content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling by the multimodal application all the generated grammars for speech recognition.
Type: Grant
Filed: November 2, 2010
Date of Patent: December 6, 2011
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
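The per-element grammar generation this abstract walks through can be sketched as a simple data transformation. This is an assumption-laden illustration: the function name and dictionary shape are hypothetical, and real grammars would be markup segments (e.g. SRGS), not Python dicts.

```python
def generate_grammars(frames):
    """Hypothetical sketch: for each navigable element in each frame's
    content document, emit one grammar entry recording the words to
    match, the content to display on a match, and the target frame.

    frames: dict of frame name -> list of (link_text, target_content).
    """
    grammars = []
    for frame_name, links in frames.items():
        for link_text, target in links:
            grammars.append({
                "words": link_text.lower(),  # phrase the grammar matches
                "content": target,           # content shown on a match
                "frame": frame_name,         # frame that shows it
                "enabled": True,             # all grammars enabled at once
            })
    return grammars
```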
-
Patent number: 8073698
Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page; determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.
Type: Grant
Filed: August 31, 2010
Date of Patent: December 6, 2011
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
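The load/maintain/unload decision described in the abstract reduces to a small state machine, sketched below under the assumption that pages and grammars can be identified by name. The class name and method are hypothetical.

```python
class GlobalGrammarManager:
    """Hypothetical sketch of the global-grammar rule: pages inside the
    application accumulate grammars; leaving the application unloads all."""

    def __init__(self, app_pages):
        self.app_pages = set(app_pages)  # pages belonging to the application
        self.loaded = set()              # currently loaded global grammars

    def on_page_load(self, page, page_grammars):
        if page in self.app_pages:
            # In-application page: load any currently unloaded global
            # grammars it identifies; keep previously loaded ones.
            self.loaded |= set(page_grammars)
        else:
            # Navigated outside the application: unload everything.
            self.loaded.clear()
        return self.loaded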
-
Patent number: 8024194
Abstract: A multimodal browser for rendering a multimodal document on an end system defining a host can include a visual browser component for rendering visual content, if any, of the multimodal document, and a voice browser component for rendering voice-based content, if any, of the multimodal document. The voice browser component can determine which of a plurality of speech processing configurations is used by the host in rendering the voice-based content. The determination can be based upon the resources of the host running the application. The determination also can be based upon a processing instruction contained in the application.
Type: Grant
Filed: December 8, 2004
Date of Patent: September 20, 2011
Assignee: Nuance Communications, Inc.
Inventors: Charles W. Cross, Jr., David Jaramillo, Gerald M. McCobb
-
Patent number: 7953597
Abstract: A computer-implemented method and system are provided for filling a graphic-based form field in response to a speech utterance. The computer-implemented method includes generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The method further includes creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the auto-fill event causing the filling of the form field with data corresponding to the user profile. The system includes a grammar-generating module for generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The system also includes an event module for creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the event causing the filling of the form field with data corresponding to the user profile.
Type: Grant
Filed: August 9, 2005
Date of Patent: May 31, 2011
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
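The grammar-plus-semantic-interpretation flow in this abstract can be sketched as below. Everything here is illustrative: the profile data, phrase patterns, and function names are assumptions, and a real system would use speech recognition with an SISR-style semantic interpretation string rather than plain string matching.

```python
# Illustrative user profile; in practice this would come from stored user data.
USER_PROFILE = {"name": "Ada Lovelace", "city": "London"}

def generate_grammar(field):
    """Hypothetical sketch: the grammar pairs spoken phrases with a
    semantic interpretation string naming the profile key to fill from."""
    return {"phrases": [f"my {field}", f"fill in my {field}"],
            "interpretation": field}

def auto_fill(utterance, grammar, profile, form):
    """Fire an auto-fill event when the utterance matches the grammar,
    filling the form field with the corresponding profile data."""
    if utterance.lower() in grammar["phrases"]:
        key = grammar["interpretation"]
        form[key] = profile[key]
    return form
```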
-
Publication number: 20110093868
Abstract: A method for managing application modalities using dialogue states can include the step of asserting a set of activation conditions associated with a dialogue state of an application. Each of the activation conditions can be linked to at least one programmatic action, wherein different programmatic actions can be executed by different modality components. The application conditions can be monitored. An application event can be detected resulting in an associated application condition being run. At least one programmatic action linked to the application condition can be responsively initiated.
Type: Application
Filed: December 23, 2010
Publication date: April 21, 2011
Applicant: Nuance Communications, Inc.
Inventors: Akram A. Bou-Ghannam, Gerald M. McCobb
-
Publication number: 20110047452
Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining by the multimodal application content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling by the multimodal application all the generated grammars for speech recognition.
Type: Application
Filed: November 2, 2010
Publication date: February 24, 2011
Applicant: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
-
Patent number: 7870000
Abstract: The present disclosure relates to prompting for a spoken response that provides input for multiple elements. A single spoken utterance including content for multiple elements can be received, where each element is mapped to a data field. The spoken utterance can be speech-to-text converted to derive values for each of the multiple elements. An utterance level confidence score can be determined, which can fall below an associated certainty threshold. Element-level confidence scores for each of the derived elements can then be ascertained. A first set of the multiple elements can have element-level confidence scores above an associated certainty threshold and a second set can have scores below. Values can be stored in data fields mapped to the first set. A prompt for input for the second set can be played.
Type: Grant
Filed: March 28, 2007
Date of Patent: January 11, 2011
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Gerald M. McCobb, Paritosh D. Patel, Marc White
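The two-tier confidence check described in the abstract can be sketched as a partitioning step. The function name, tuple layout, and threshold value are all hypothetical; real confidence scores would come from a speech recognizer.

```python
def partition_by_confidence(elements, utterance_score, threshold=0.5):
    """Hypothetical sketch: if the utterance-level score clears the
    threshold, accept every derived value; otherwise fall back to
    element-level scores, filling confident fields and returning the
    rest for a follow-up prompt.

    elements: list of (field, value, element_level_score) tuples.
    Returns (filled_fields, fields_to_reprompt).
    """
    filled, reprompt = {}, []
    if utterance_score >= threshold:
        filled = {field: value for field, value, _ in elements}
        return filled, reprompt
    for field, value, score in elements:
        if score >= threshold:
            filled[field] = value      # first set: store the value
        else:
            reprompt.append(field)     # second set: prompt again
    return filled, reprompt
```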
-
Publication number: 20100324889
Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page; determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.
Type: Application
Filed: August 31, 2010
Publication date: December 23, 2010
Applicant: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
-
Patent number: 7848314
Abstract: Providing VOIP barge-in support for a half-duplex DSR client on a full-duplex network by buffering, in a half-duplex DSR client, input audio from the full-duplex network; playing, through the half-duplex DSR client, the buffered input audio; pausing, during voice activity on the half-duplex DSR client, the playing of the buffered input audio; sending, during voice activity on the half-duplex DSR client, speech for recognition through the full-duplex network to a voice server; receiving in the half-duplex DSR client through the full-duplex network from the voice server notification of speech recognition, the notification bearing a time stamp; and, responsive to receiving the notification, resuming the playing of the buffered input audio, including playing only buffered VOIP audio data bearing time stamps later than the time stamp of the recognition notification.
Type: Grant
Filed: May 10, 2006
Date of Patent: December 7, 2010
Assignee: Nuance Communications, Inc.
Inventors: Charles W. Cross, Jr., Yan Li, Gerald M. McCobb
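The buffer/pause/resume sequence in this abstract can be sketched with timestamped frames. This is a simplified in-memory illustration with hypothetical names; real audio would arrive as VOIP packets and playback would be asynchronous.

```python
class HalfDuplexBargeIn:
    """Hypothetical sketch of the barge-in logic: buffered audio frames
    carry time stamps; after a recognition notification arrives, only
    frames newer than the notification's time stamp remain to be played."""

    def __init__(self):
        self.buffer = []     # (timestamp, frame) pairs from the network
        self.paused = False

    def receive(self, timestamp, frame):
        self.buffer.append((timestamp, frame))

    def on_voice_activity(self):
        # Pause playback while the user speaks (half-duplex client).
        self.paused = True

    def on_recognition(self, notified_ts):
        # Resume, keeping only frames stamped later than the notification.
        self.paused = False
        self.buffer = [(ts, f) for ts, f in self.buffer if ts > notified_ts]
        return [f for _, f in self.buffer]
```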
-
Patent number: 7827033
Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining by the multimodal application content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling by the multimodal application all the generated grammars for speech recognition.
Type: Grant
Filed: December 6, 2006
Date of Patent: November 2, 2010
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
-
Patent number: 7809575
Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page; determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.
Type: Grant
Filed: February 27, 2007
Date of Patent: October 5, 2010
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
-
Patent number: 7739117
Abstract: A computer-implemented method and system are provided for filling a graphic-based form field in response to a speech utterance. The computer-implemented method includes generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The method further includes creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the auto-fill event causing the filling of the form field with data corresponding to the user profile. The system includes a grammar-generating module for generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The system also includes an event module for creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the event causing the filling of the form field with data corresponding to the user profile.
Type: Grant
Filed: September 20, 2004
Date of Patent: June 15, 2010
Assignee: Nuance Communications, Inc.
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
-
Publication number: 20080255851
Abstract: Speech-enabled content navigation and control of a distributed multimodal browser is disclosed, the browser providing an execution environment for a multimodal application, the browser including a graphical user agent (‘GUA’) and a voice user agent (‘VUA’), the GUA operating on a multimodal device, the VUA operating on a voice server, that includes: transmitting, by the GUA, a link message to the VUA, the link message specifying voice commands that control the browser and an event corresponding to each voice command; receiving, by the GUA, a voice utterance from a user, the voice utterance specifying a particular voice command; transmitting, by the GUA, the voice utterance to the VUA for speech recognition by the VUA; receiving, by the GUA, an event message from the VUA, the event message specifying a particular event corresponding to the particular voice command; and controlling, by the GUA, the browser in dependence upon the particular event.
Type: Application
Filed: April 12, 2007
Publication date: October 16, 2008
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Gerald M. McCobb
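The GUA/VUA message exchange this abstract describes can be sketched as two cooperating objects. This collapses the distributed link-message/event-message protocol into direct method calls; the class names, command strings, and event identifiers are all hypothetical.

```python
class VoiceUserAgent:
    """Hypothetical sketch of the VUA: it receives a link message mapping
    voice commands to events, then recognizes utterances against it."""

    def __init__(self):
        self.command_events = {}

    def link(self, command_events):
        # Link message: voice commands paired with their browser events.
        self.command_events = command_events

    def recognize(self, utterance):
        # Event message: the event for a recognized command, else None.
        return self.command_events.get(utterance.lower())


class GraphicalUserAgent:
    """Hypothetical sketch of the GUA: it transmits the link message,
    forwards utterances, and acts on the returned event."""

    def __init__(self, vua):
        self.vua = vua
        self.vua.link({"go back": "history.back", "reload": "page.reload"})

    def on_utterance(self, utterance):
        event = self.vua.recognize(utterance)
        return f"handled:{event}" if event else "ignored"
```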
-
Publication number: 20080249782
Abstract: Web service support for a multimodal client processing a multimodal application, the multimodal client providing an execution environment for the application and operating on a multimodal device supporting multiple modes of user interaction including a voice mode and one or more non-voice modes, the application stored on an application server, includes: receiving, by the server, an application request from the client that specifies the application and device characteristics; determining, by a multimodal adapter of the server, modality requirements for the application; selecting, by the adapter, a modality web service in dependence upon the modality requirements and the characteristics for the device; determining, by the adapter, whether the device supports VoIP in dependence upon the characteristics; providing, by the server, the application to the client; and providing, by the adapter to the client in dependence upon whether the device supports VoIP, access to the modality web service for processing the application.
Type: Application
Filed: April 4, 2007
Publication date: October 9, 2008
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Gerald M. McCobb
-
Publication number: 20080244059
Abstract: A method for managing multimodal interactions can include the step of registering a multitude of modality components with a modality component server, wherein each modality component handles an interface modality for an application. The modality component can be connected to a device. A user interaction can be conveyed from the device to the modality component for processing. Results from the user interaction can be placed on a shared memory area of the modality component server.
Type: Application
Filed: June 9, 2008
Publication date: October 2, 2008
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Akram A. Bou-Ghannam, Gerald M. McCobb
-
Publication number: 20080243502
Abstract: The invention discloses prompting for a spoken response that provides input for multiple elements. A single spoken utterance including content for multiple elements can be received, where each element is mapped to a data field. The spoken utterance can be speech-to-text converted to derive values for each of the multiple elements. An utterance level confidence score can be determined, which can fall below an associated certainty threshold. Element-level confidence scores for each of the derived elements can then be ascertained. A first set of the multiple elements can have element-level confidence scores above an associated certainty threshold and a second set can have scores below. Values can be stored in data fields mapped to the first set. A prompt for input for the second set can be played. Accordingly, data fields are partially filled in based upon the original speech utterance, where a second prompt for unfilled fields is played.
Type: Application
Filed: March 28, 2007
Publication date: October 2, 2008
Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION
Inventors: Soonthorn Ativanichayaphong, Gerald M. McCobb, Paritosh D. Patel, Marc White
-
Publication number: 20080208586
Abstract: Enabling natural language understanding using an X+V page of a multimodal application implemented with a statistical language model (‘SLM’) grammar of the multimodal application in an automatic speech recognition (‘ASR’) engine, with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine through a VoiceXML interpreter, including: receiving, in the ASR engine from the multimodal application, a voice utterance; generating, by the ASR engine according to the SLM grammar, at least one recognition result for the voice utterance; determining, by an action classifier for the VoiceXML interpreter, an action identifier in dependence upon the recognition result, the action identifier specifying an action to be performed by the multimodal application; and interpreting, by the VoiceXML interpreter, the multimodal application in dependence upon …
Type: Application
Filed: February 27, 2007
Publication date: August 28, 2008
Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Gerald M. McCobb