Patents by Inventor Charles W. Cross, Jr.

Charles W. Cross, Jr. has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).

Method and system for voice-enabled autofill

Patent number: 7953597

Abstract: A computer-implemented method and system are provided for filling a graphic-based form field in response to a speech utterance. The computer-implemented method includes generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The method further includes creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the auto-fill event causing the filling of the form field with data corresponding to the user profile. The system includes a grammar-generating module for generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The system also includes an event module for creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the event causing the filling of the form field with data corresponding to the user profile.

Type: Grant

Filed: August 9, 2005

Date of Patent: May 31, 2011

Assignee: Nuance Communications, Inc.

Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
Enabling dynamic VoiceXML in an X+V page of a multimodal application

Patent number: 7945851

Abstract: Enabling dynamic VoiceXML in an X+V page of a multimodal application implemented with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to a VoiceXML interpreter, including representing by the multimodal browser an XML element of a VoiceXML dialog of the X+V page as an ECMAScript object, the XML element comprising XML content; storing by the multimodal browser the XML content of the XML element in an attribute of the ECMAScript object; and accessing the XML content of the XML element in the attribute of the ECMAScript object from an ECMAScript script in the X+V page.

Type: Grant

Filed: March 14, 2007

Date of Patent: May 17, 2011

Assignee: Nuance Communications, Inc.

Inventors: Charles W. Cross, Jr., Hilary A. Pike, Lisa A. Seacat, Marc T. White
Method, system, and apparatus for a voice markup language interpreter and voice browser

Patent number: 7925512

Abstract: The present invention can include a method of allocating an interpreter module within an application program. The application program can create one or more interpreter module instances. The method also can include updating a property descriptor of the interpreter module instance and directing the interpreter module instance to allocate speech and audio resources. Content then can be loaded into the interpreter module instance and run.

Type: Grant

Filed: May 19, 2004

Date of Patent: April 12, 2011

Assignee: Nuance Communications, Inc.

Inventors: Charles W. Cross, Jr., Brien H. Muschett
Synchronizing visual and speech events in a multimodal application

Patent number: 7917365

Abstract: Exemplary methods, systems, and products are disclosed for synchronizing visual and speech events in a multimodal application, including receiving from a user speech; determining a semantic interpretation of the speech; calling a global application update handler; identifying, by the global application update handler, an additional processing function in dependence upon the semantic interpretation; and executing the additional function. Typical embodiments may include updating a visual element after executing the additional function. Typical embodiments may include updating a voice form after executing the additional function. Typical embodiments also may include updating a state table after updating the voice form. Typical embodiments also may include restarting the voice form after executing the additional function.

Type: Grant

Filed: June 16, 2005

Date of Patent: March 29, 2011

Assignee: Nuance Communications, Inc.

Inventors: Charles W. Cross, Jr., Michael C. Hollinger, Igor R. Jablokov, Benjamin D. Lewis, Hilary A. Pike, Daniel M. Smith, David W. Wintermute, Michael A. Zaitzeff
ENABLING GRAMMARS IN WEB PAGE FRAME

Publication number: 20110047452

Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining by the multimodal application content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling by the multimodal application all the generated grammars for speech recognition.

Type: Application

Filed: November 2, 2010

Publication date: February 24, 2011

Applicant: Nuance Communications, Inc.

Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
Multimodal Teleconferencing

Publication number: 20110032845

Abstract: Multimodal teleconferencing including receiving, by a multimodal teleconferencing module, a speech utterance from one of a plurality of participants in the multimodal teleconference; identifying the participant making the speech utterance as a current speaker; retrieving, by the multimodal teleconferencing module from accounts for the current speaker, content for display to the current speaker; retrieving, by the multimodal teleconferencing module from accounts for the current speaker, content for display to one or more other participants in the multimodal teleconference; providing, by the multimodal teleconferencing module to a multimodal teleconferencing client for display to the current speaker, an identification of the speaker and the content retrieved for the speaker; and providing, by the multimodal teleconferencing module to one or more of multimodal teleconferencing clients for display to the other participants, an identification of the current speaker with the content retrieved for the one or more ot

Type: Application

Filed: August 5, 2009

Publication date: February 10, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ciprian Agapi, William K. Bodin, Charles W. Cross, Jr.
Speech Enabled Media Sharing In A Multimodal Application

Publication number: 20110010180

Abstract: Speech enabled media sharing in a multimodal application including parsing, by a multimodal browser, one or more markup documents of a multimodal application; identifying, by the multimodal browser, in the one or more markup documents a web resource for display in the multimodal browser; loading, by the multimodal browser, a web resource sharing grammar that includes keywords for modes of resource sharing and keywords for targets for receipt of web resources; receiving, by the multimodal browser, an utterance matching a keyword for the web resource, a keyword for a mode of resource sharing and a keyword for a target for receipt of the web resource in the web resource sharing grammar thereby identifying the web resource, a mode of resource sharing, and a target for receipt of the web resource; and sending, by the multimodal browser, the web resource to the identified target for the web resource using the identified mode of resource sharing.

Type: Application

Filed: July 9, 2009

Publication date: January 13, 2011

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ciprian Agapi, William K. Bodin, Charles W. Cross, Jr.
Dynamically Extending The Speech Prompts Of A Multimodal Application

Publication number: 20100332234

Abstract: Dynamically extending the speech prompts of a multimodal application including receiving, by the prompt generation engine, a media file having a metadata container; retrieving, by the prompt generation engine from the metadata container, a speech prompt related to content stored in the media file for inclusion in the multimodal application; and modifying, by the prompt generation engine, the multimodal application to include the speech prompt.

Type: Application

Filed: June 24, 2009

Publication date: December 30, 2010

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ciprian Agapi, William K. Bodin, Charles W. Cross, Jr.
ENABLING GLOBAL GRAMMARS FOR A PARTICULAR MULTIMODAL APPLICATION

Publication number: 20100324889

Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page; determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.

Type: Application

Filed: August 31, 2010

Publication date: December 23, 2010

Applicant: Nuance Communications, Inc.

Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
VOIP barge-in support for half-duplex DSR client on a full-duplex network

Patent number: 7848314

Abstract: Providing VOIP barge-in support for a half-duplex DSR client on a full-duplex network by buffering, in a half-duplex DSR client, input audio from the full-duplex network; playing, through the half-duplex DSR client, the buffered input audio; pausing, during voice activity on the half-duplex DSR client, the playing of the buffered input audio; sending, during voice activity on the half-duplex DSR client, speech for recognition through the full-duplex network to a voice server; receiving in the half-duplex DSR client through the full-duplex network from the voice server notification of speech recognition, the notification bearing a time stamp; and, responsive to receiving the notification, resuming the playing of the buffered input audio, including playing only buffered VOIP audio data bearing time stamps later than the time stamp of the recognition notification.

Type: Grant

Filed: May 10, 2006

Date of Patent: December 7, 2010

Assignee: Nuance Communications, Inc.

Inventors: Charles W. Cross, Jr., Yan Li, Gerald M. McCobb
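
The playback rule in the abstract above (patent 7848314) can be sketched roughly as follows. This is only an illustration, not the patented implementation: the class and method names (`AudioPacket`, `HalfDuplexPlayer`, `on_voice_activity`, `on_recognition_notification`) are hypothetical, and network transport and actual audio output are left out.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class AudioPacket:
    timestamp: float  # time stamp carried by the buffered audio data
    payload: bytes


class HalfDuplexPlayer:
    """Hypothetical half-duplex client that buffers incoming audio."""

    def __init__(self):
        self.buffer = deque()
        self.paused = False

    def receive(self, packet):
        # Buffer input audio arriving from the full-duplex network.
        self.buffer.append(packet)

    def on_voice_activity(self):
        # Pause playback while the user is speaking.
        self.paused = True

    def on_recognition_notification(self, recognized_at):
        # Resume playback, keeping only audio stamped later than
        # the time stamp of the recognition notification.
        self.buffer = deque(
            p for p in self.buffer if p.timestamp > recognized_at
        )
        self.paused = False

    def next_packet(self):
        # Return the next packet to play, or None if paused or empty.
        if self.paused or not self.buffer:
            return None
        return self.buffer.popleft()


player = HalfDuplexPlayer()
for ts in (1.0, 2.0, 3.0):
    player.receive(AudioPacket(ts, b""))
first = player.next_packet()             # plays the packet stamped 1.0
player.on_voice_activity()               # user starts speaking
while_paused = player.next_packet()      # nothing plays while paused
player.on_recognition_notification(2.0)  # notification stamped 2.0
resumed = player.next_packet()           # only audio later than 2.0 remains
```
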
Speech Capabilities Of A Multimodal Application

Publication number: 20100299146

Abstract: Improving speech capabilities of a multimodal application including receiving, by the multimodal browser, a media file having a metadata container; retrieving, by the multimodal browser, from the metadata container a speech artifact related to content stored in the media file for inclusion in the speech engine available to the multimodal browser; determining whether the speech artifact includes a grammar rule or a pronunciation rule; if the speech artifact includes a grammar rule, modifying, by the multimodal browser, the grammar of the speech engine to include the grammar rule; and if the speech artifact includes a pronunciation rule, modifying, by the multimodal browser, the lexicon of the speech engine to include the pronunciation rule.

Type: Application

Filed: May 19, 2009

Publication date: November 25, 2010

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ciprian Agapi, William K. Bodin, Charles W. Cross, Jr.
Ordering recognition results produced by an automatic speech recognition engine for a multimodal application

Patent number: 7840409

Abstract: Ordering recognition results produced by an automatic speech recognition (‘ASR’) engine for a multimodal application implemented with a grammar of the multimodal application in the ASR engine, with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine through a VoiceXML interpreter, includes: receiving, in the VoiceXML interpreter from the multimodal application, a voice utterance; determining, by the VoiceXML interpreter using the ASR engine, a plurality of recognition results in dependence upon the voice utterance and the grammar; determining, by the VoiceXML interpreter according to semantic interpretation scripts of the grammar, a weight for each recognition result; and sorting, by the VoiceXML interpreter, the plurality of recognition results in dependence upon the weight for each recognition result.

Type: Grant

Filed: February 27, 2007

Date of Patent: November 23, 2010

Assignee: Nuance Communications, Inc.

Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Igor R. Jablokov, Gerald McCobb
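
The final sorting step described in the abstract above (patent 7840409) might look like the minimal sketch below. It assumes each recognition result already carries a numeric weight; how the semantic interpretation scripts compute that weight is not stated in the abstract. The `RecognitionResult` type and the choice of descending order are assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class RecognitionResult:
    text: str
    weight: float  # assumed to be assigned earlier by a semantic interpretation script


def order_results(results):
    """Return results sorted by weight, highest first (descending order is an assumption).

    Python's sort is stable, so results with equal weight keep their input order.
    """
    return sorted(results, key=lambda r: r.weight, reverse=True)


candidates = [
    RecognitionResult("call home", 0.2),
    RecognitionResult("call Hong", 0.9),
    RecognitionResult("call Holm", 0.5),
]
ordered = order_results(candidates)
```
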
Enabling grammars in web page frames

Patent number: 7827033

Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining by the multimodal application content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling by the multimodal application all the generated grammars for speech recognition.

Type: Grant

Filed: December 6, 2006

Date of Patent: November 2, 2010

Assignee: Nuance Communications, Inc.

Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
Disambiguating a speech recognition grammar in a multimodal application

Patent number: 7822608

Abstract: Disambiguating a speech recognition grammar in a multimodal application, the multimodal application including voice activated hyperlinks, the voice activated hyperlinks voice enabled by a speech recognition grammar characterized by ambiguous terminal grammar elements, including maintaining by the multimodal browser a record of visibility of each voice activated hyperlink, the record of visibility including current visibility and past visibility on a display of the multimodal device of each voice activated hyperlink, the record of visibility further including an ordinal indication, for each voice activated hyperlink scrolled off display, of the sequence in which each such voice activated hyperlink was scrolled off display; recognizing by the multimodal browser speech from a user matching an ambiguous terminal element of the speech recognition grammar; selecting by the multimodal browser a voice activated hyperlink for activation, the selecting carried out in dependence upon the recognized speech and the record

Type: Grant

Filed: February 27, 2007

Date of Patent: October 26, 2010

Assignee: Nuance Communications, Inc.

Inventors: Charles W. Cross, Jr., Marc T. White
Enabling global grammars for a particular multimodal application

Patent number: 7809575

Abstract: Methods, apparatus, and computer program products are described for enabling global grammars for a particular multimodal application according to the present invention by loading a multimodal web page; determining whether the loaded multimodal web page is one of a plurality of multimodal web pages of the particular multimodal application. If the loaded multimodal web page is one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes loading any currently unloaded global grammars of the particular multimodal application identified in the multimodal web page and maintaining any previously loaded global grammars. If the loaded multimodal web page is not one of the plurality of multimodal web pages of the particular multimodal application, enabling global grammars typically includes unloading any currently loaded global grammars.

Type: Grant

Filed: February 27, 2007

Date of Patent: October 5, 2010

Assignee: Nuance Communications, Inc.

Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
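
The load-or-unload decision described in the abstract above (patent 7809575) reduces to a small piece of set logic. The sketch below is illustrative only: the function name, its parameters, and the use of page URLs and grammar names as plain strings are all assumptions.

```python
def update_global_grammars(loaded, page_url, app_pages, page_grammars):
    """Return the set of global grammars that should be loaded after a page load.

    loaded        -- grammars currently loaded (set of names)
    page_url      -- the multimodal web page that was just loaded
    app_pages     -- pages belonging to the particular multimodal application
    page_grammars -- global grammars identified in the loaded page
    """
    if page_url in app_pages:
        # Keep previously loaded grammars and add any that are not yet loaded.
        return set(loaded) | set(page_grammars)
    # The page is outside the application: unload everything.
    return set()


app_pages = {"/orders", "/checkout"}
inside = update_global_grammars({"help"}, "/orders", app_pages, ["help", "nav"])
outside = update_global_grammars(inside, "/other-site", app_pages, ["misc"])
```
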
Document session replay for multimodal applications

Patent number: 7801728

Abstract: Methods, apparatus, and computer program products are described for document session replay for multimodal applications, including identifying, by a multimodal browser in dependence upon a log produced by a Form Interpretation Algorithm (‘FIA’) during a previous document session with a user, a speech prompt provided by a multimodal application in the previous document session; identifying, by a multimodal browser in replay mode in dependence upon the log, a response to the prompt provided by a user of the multimodal application in the previous document session; retrieving, by the multimodal browser in dependence upon the log, an X+V page of the multimodal application associated with the speech prompt and the response; rendering, by the multimodal browser, the visual elements of the retrieved X+V page; replaying, by the multimodal browser, the speech prompt; and replaying, by a multimodal browser, the response.

Type: Grant

Filed: February 26, 2007

Date of Patent: September 21, 2010

Assignee: Nuance Communications, Inc.

Inventors: Shay Ben-David, Charles W. Cross, Jr., Marc T. White
Method and system for voice-enabled autofill

Patent number: 7739117

Abstract: A computer-implemented method and system are provided for filling a graphic-based form field in response to a speech utterance. The computer-implemented method includes generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The method further includes creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the auto-fill event causing the filling of the form field with data corresponding to the user profile. The system includes a grammar-generating module for generating a grammar corresponding to the form field, the grammar being based on a user profile and comprising a semantic interpretation string. The system also includes an event module for creating an auto-fill event based upon the at least one grammar and responsive to the speech utterance, the event causing the filling of the form field with data corresponding to the user profile.

Type: Grant

Filed: September 20, 2004

Date of Patent: June 15, 2010

Assignee: Nuance Communications, Inc.

Inventors: Soonthorn Ativanichayaphong, Charles W. Cross, Jr., Gerald M. McCobb
Oral modification of an ASR lexicon of an ASR engine

Patent number: 7676371

Abstract: Methods, apparatus, and computer program products are described for providing oral modification of an ASR lexicon of an ASR engine that include receiving, in the ASR engine from a user through a multimodal application, speech for recognition, where the ASR engine includes an ASR lexicon of words capable of recognition by the ASR engine, and the ASR lexicon does not contain at least one word of the speech for recognition; indicating by the ASR engine through the multimodal application to the user that the ASR lexicon does not contain the word; receiving by the ASR engine from the user through the multimodal application an oral instruction to add the word to the ASR lexicon, where the oral instruction is accompanied by an oral spelling of the word from the user; and executing the instruction by the ASR engine.

Type: Grant

Filed: June 13, 2006

Date of Patent: March 9, 2010

Assignee: Nuance Communications, Inc.

Inventors: Charles W. Cross, Jr., Frank L. Jania, James R. Lewis
Dynamically Publishing Directory Information For A Plurality Of Interactive Voice Response Systems

Publication number: 20090268883

Abstract: Methods, apparatus, and products are disclosed for dynamically publishing directory information for a plurality of interactive voice response (‘IVR’) systems through an IVR directory service that include: providing a description of a web services publication interface for the IVR directory service; receiving, on behalf of one or more IVR systems, web services publication requests through the publication interface; determining, in response to the web services publication requests, directory information for each IVR system requesting publication; adding the directory information for each IVR system to an IVR system directory; generating a voice mode user interface to reflect the directory information for each IVR system added to the IVR system directory; and interacting, using the voice mode user interface, with a caller to identify a particular IVR system in dependence upon the IVR system directory and query information provided by the caller and to connect the caller with the identified IVR system.

Type: Application

Filed: April 24, 2008

Publication date: October 29, 2009

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ciprian Agapi, William K. Bodin, Charles W. Cross, Jr., Fang Wang
Testing A Grammar Used In Speech Recognition For Reliability In A Plurality Of Operating Environments Having Different Background Noise

Publication number: 20090271189

Abstract: Methods, systems, and products for testing a grammar used in speech recognition for reliability in a plurality of operating environments having different background noise that include: receiving recorded background noise for each of the plurality of operating environments; generating a test speech utterance for recognition by a speech recognition engine using a grammar; mixing the test speech utterance with each recorded background noise, resulting in a plurality of mixed test speech utterances, each mixed test speech utterance having different background noise; performing, for each of the mixed test speech utterances, speech recognition using the grammar and the mixed test speech utterance, resulting in speech recognition results for each of the mixed test speech utterances; and evaluating, for each recorded background noise, speech recognition reliability of the grammar in dependence upon the speech recognition results for the mixed test speech utterance having that recorded background noise.

Type: Application

Filed: April 24, 2008

Publication date: October 29, 2009

Applicant: INTERNATIONAL BUSINESS MACHINES CORPORATION

Inventors: Ciprian Agapi, William K. Bodin, Charles W. Cross, Jr., Michael H. Mirt
