Method and arrangement for managing grammar options in a graphical callflow builder
A method (10) in a speech recognition application callflow can include the steps of assigning (11) an individual option and a pre-built grammar to a same prompt, treating (15) the individual option as a valid output of the pre-built grammar if the individual option is a potential valid match to a recognition phrase (12) or an annotation (13) in the pre-built grammar, and treating (14) the individual option as an independent grammar from the pre-built grammar if the individual option fails to be a potential valid match to the recognition phrase or the annotation in the pre-built grammar.
1. Technical Field
This invention relates to the field of graphical user interfaces and more particularly to a graphical call flow builder.
2. Description of the Related Art
Systems exist that allow callflow designers to write simple grammar options or to separately select prebuilt grammar files in graphical callflow builders; some are described below. No existing system, however, allows designers without technical knowledge of speech grammars both to select a pre-built grammar file and to write in individual options in the same element of a callflow. Nor does any other system let a designer select a specific output of a prebuilt grammar for special treatment in a callflow. The system described below overcomes these problems.
One such system, described in U.S. Pat. No. 6,510,411, simplifies the process of developing call or dialogue flows for use in an Interactive Voice Response system. Its three principal aspects are a task-oriented dialogue model (or task model), a development tool and a dialogue manager. The task model is a framework for describing the application-specific information needed to perform the task. The development tool is an object that interprets a user-specified task model and outputs information for a spoken dialogue system to perform according to the specified task model. The dialogue manager is a runtime system that uses output from the development tool in carrying out interactive dialogues to perform the task specified according to the task model. The dialogue manager conducts the dialogue using the task model and its built-in knowledge of dialogue management. In addition, generic knowledge of how to conduct a dialogue is separated from the specific information to be collected in a particular application. The developer need only provide the specific information about the structure of a task, leaving the specifics of dialogue management to the dialogue manager. That invention describes a form-based method for developing very simple speech applications, and does not address the use of external grammar files at all.
Another system, U.S. Pat. No. 6,269,336, discusses a voice browser for interactive services. A markup language document, as described in the U.S. Pat. No. 6,269,336, includes a dialogue element including a plurality of markup language elements. Each of the plurality of markup language elements is identifiable by at least one markup tag. A step element is contained within the dialogue element to define a state within the dialogue element. The step element includes a prompt element and an input element. The prompt element includes an announcement to be read to the user. The input element includes at least one input that corresponds to a user input. A method in accordance with the present invention includes the steps of creating a markup language document having a plurality of elements, selecting a prompt element, and defining a voice communication in the prompt element to be read to the user. The method further includes the steps of selecting an input element and defining an input variable to store data inputted by the user. Although this invention describes a markup language similar, but not identical to, VoiceXML, and includes the capacity (like VoiceXML) to refer to either built-in or external grammars, it does not address the resolution of specific new options with the contents of existing grammars.
U.S. Pat. No. 6,173,266 discusses a dialogue module that includes computer readable instructions for accomplishing a predefined interactive dialogue task in an interactive speech application. In response to user input, a subset of the plurality of dialogue modules are selected to accomplish their respective interactive dialogue tasks in the interactive speech application and are interconnected in an order defining the callflow of the application, and the application is generated. A graphical user interface represents the stored plurality of dialogue modules as icons in a graphical display in which icons for the subset of dialogue modules are selected in the graphical display. In response to user input, the icons for the subset of dialogue modules are graphically interconnected into a graphical representation of the call flow of the interactive speech application, and the interactive speech application is generated based upon the graphical representation. Using the graphical display, the method further includes associating configuration parameters with specific dialogue modules. Once again, this existing invention describes a graphical callflow builder using dialogue modules as elements, but does not address the resolution of specific new options with the contents of existing grammars.
SUMMARY OF THE INVENTION
Embodiments in accordance with the invention can enable callflow designers to work more efficiently with lists of variables in a graphical callflow builder, particularly where users can create their own variable names. Furthermore, embodiments disclosed herein overcome the problems described above through the automatic evaluation of options added to prompts in a graphical callflow when the prompt is using one or more existing grammars. The purpose of this evaluation is to determine whether the added options are present in one or more of the existing grammars. If not present, the added options are used as external referents in the graphical callflow and become part of a newly generated grammar. If present, the added options are used only as external referents in the graphical callflow and do not become part of a newly generated grammar.
In a first aspect of the invention, a method for a speech recognition application callflow can include the steps of placing a prompt into a workspace for the speech recognition application workflow and attaching at least one among a pre-built grammar and a user-entered individual new option to the prompt. The pre-built grammars can be selected from a list. The method can further include the step of searching the list of pre-built grammars for matches to the user-entered individual new option. If a match exists between the pre-built grammar and the user-entered individual new option, then the user-entered individual new option can point to an equivalent pre-built grammar. If a match exists between the pre-built grammar and the user-entered individual new option, then the user-entered individual new option can form a part of the list of pre-built grammars.
In a second aspect of the invention, a method in a speech recognition application callflow can include the steps of assigning an individual option and a pre-built grammar to the same prompt, treating the individual option as a valid output of the pre-built grammar if the individual option is a potential valid match to a recognition phrase or an annotation in the pre-built grammar, and treating the individual option as an independent grammar from the pre-built grammar if the individual option fails to be a potential valid match to the recognition phrase or the annotation in the pre-built grammar.
In a third aspect of the invention, a system for managing grammar options in a graphical callflow builder can include a memory and a processor. The processor can be programmed to place a prompt into a workspace for the speech recognition application workflow and to attach at least one among a pre-built grammar and a user-entered individual new option to the prompt.
In a fourth aspect of the invention, a computer program has a plurality of code sections executable by a machine for causing the machine to perform certain steps as described in the method and systems above.
BRIEF DESCRIPTION OF THE DRAWINGS
There are shown in the drawings embodiments which are presently preferred, it being understood, however, that the invention is not limited to the precise arrangements and instrumentalities shown.
In our proposed system, designers can put a prompt into a workspace, then attach either prebuilt grammars from a list or attach individual new options, or both. To keep the system as parsimonious as possible, and to prevent potential conflicts between multiple grammars, if the user combines a prebuilt grammar and any new options, the system searches the prebuilt grammar for any matches to the new options, searching both valid utterances and associated annotations. If the new option exists in the grammar, the ‘new’ option simply points to the equivalent grammar entry. Otherwise, the new option becomes part of a grammar automatically built to hold it, with the entry in the new grammar having the text of the new option as both the recognition string and an associated annotation. Thus, without any deep understanding of the structure of a speech recognition grammar, callflow designers can create or work with grammars with a high degree of flexibility.
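The resolution step described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation from the application: the names `GrammarEntry`, `Grammar`, `find_match` and `resolve_options` are hypothetical, a grammar is modeled as a flat list of phrase/annotation pairs, and matching is assumed to be case-insensitive and whitespace-insensitive.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple


@dataclass(frozen=True)
class GrammarEntry:
    phrase: str      # text the recognizer listens for
    annotation: str  # value returned when the phrase is recognized


@dataclass
class Grammar:
    name: str
    entries: List[GrammarEntry] = field(default_factory=list)


def _norm(text: str) -> str:
    # Assumption: comparison ignores case and extra whitespace.
    return " ".join(text.lower().split())


def find_match(grammar: Grammar, option: str) -> Optional[GrammarEntry]:
    """Search both the valid utterances and the annotations for the option."""
    wanted = _norm(option)
    for entry in grammar.entries:
        if wanted in (_norm(entry.phrase), _norm(entry.annotation)):
            return entry
    return None


def resolve_options(
    prebuilt: Grammar, options: List[str]
) -> Tuple[Dict[str, GrammarEntry], Optional[Grammar]]:
    """Map each option to a grammar entry; build an auxiliary grammar if needed."""
    references: Dict[str, GrammarEntry] = {}
    auxiliary = Grammar(name=prebuilt.name + "_options")  # hypothetical naming
    for option in options:
        entry = find_match(prebuilt, option)
        if entry is None:
            # Not in the prebuilt grammar: the option text serves as both
            # the recognition string and the annotation in a new grammar.
            entry = GrammarEntry(phrase=option, annotation=option)
            auxiliary.entries.append(entry)
        references[option] = entry
    return references, (auxiliary if auxiliary.entries else None)
```

With a prebuilt grammar containing an entry for 'midnight', `resolve_options` would return a reference to that existing entry and no auxiliary grammar; an option absent from the grammar would instead appear in the auxiliary grammar with identical phrase and annotation.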
Referring to
Referring to
Assume that a system exists for the graphical building of speech recognition callflows. A key component of such a system would be a prompt—a request for user input. The prompt could have a symbolic representation similar to the call flow element 29 shown in
For example, suppose the designer has created the callflow shown in
While these techniques can be generalized to any code generated from the callflow, here is an example of a VoiceXML form capable of being automatically generated from the information provided in the graphical callflow for the Time prompt (assuming that ‘midnight’ was NOT a valid input or annotation for time.jsgf):
Finally, here is an example of a VoiceXML form capable of being generated from the information provided in the graphical callflow for the Time prompt, assuming that ‘midnight’ IS a valid input for time.jsgf, and that the annotation returned for ‘midnight’ is 12:00 AM.
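To make the two cases concrete, the following Python sketch renders a VoiceXML form as text for either case. It is an illustration only and does not reproduce the forms from the application: the function name `build_time_form`, the prompt wording, the `#special` branch target, and the use of an inline `application/x-jsgf` grammar for the unmatched option are all assumptions.

```python
from typing import Optional
from xml.sax.saxutils import escape, quoteattr


def build_time_form(
    option: str,
    prebuilt_src: str,
    matched_annotation: Optional[str] = None,
    target: str = "#special",  # hypothetical branch target
) -> str:
    """Render a VoiceXML form for a prompt with one prebuilt grammar and one option.

    matched_annotation is None when the option was NOT found in the prebuilt
    grammar; otherwise it is the annotation the grammar returns for the option.
    """
    lines = [
        '<form id="time">',
        '  <field name="time">',
        '    <prompt>What time?</prompt>',  # hypothetical prompt text
        f'    <grammar src={quoteattr(prebuilt_src)}/>',
    ]
    if matched_annotation is None:
        # Case 1: the option needs its own (here, inline) grammar.
        lines += [
            '    <grammar type="application/x-jsgf">',
            f'      {escape(option)}',
            '    </grammar>',
        ]
        returned = option
    else:
        # Case 2: the prebuilt grammar already covers the option, so only
        # the branch on its annotation is generated.
        returned = matched_annotation
    # Escape the value for use inside a single-quoted ECMAScript string.
    literal = returned.replace("\\", "\\\\").replace("'", "\\'")
    cond = f"time == '{literal}'"
    lines += [
        '    <filled>',
        f'      <if cond={quoteattr(cond)}>',
        f'        <goto next={quoteattr(target)}/>',
        '      </if>',
        '    </filled>',
        '  </field>',
        '</form>',
    ]
    return "\n".join(lines)
```

Calling `build_time_form("midnight", "time.jsgf")` yields a field with two grammars and a branch on the literal option text, while passing `matched_annotation="12:00 AM"` yields a field with only the prebuilt grammar and a branch on the annotation.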
Note that in searching the grammar (shown in the list below, using a jsgf grammar as an example, but note that this would be workable for any type of grammar that includes recognition text and annotations—including bnf, srcl, SRGS XML, SRGS ABNF, etc.), it could be determined that ‘midnight’ was in the grammar, and that the annotation for midnight was ‘1200 AM’, which enabled the automatic generation of the <if> statement in the form code above—all capable of being done without any detailed knowledge about the content of the prebuilt grammars on the part of the callflow designer.
#JSGF V1.0 iso-8859-1;
grammar time;
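As a hedged illustration of how recognition text and annotations might be read out of such a grammar, the Python sketch below extracts phrase/annotation pairs from a JSGF rule made of flat alternatives with optional `{tag}` annotations. The rule body in `SAMPLE` is hypothetical and is not the content of the grammar in the application, and `parse_flat_rule` is an assumed helper that ignores nested rules, weights, grouping and other JSGF features.

```python
import re
from typing import Dict

# One alternative: a phrase optionally followed by a {tag} annotation.
_ALT = re.compile(r"\s*(?P<phrase>[^{}|;]+?)\s*(?:\{(?P<tag>[^}]*)\})?\s*$")

# Hypothetical grammar text; only the two header lines mirror the listing.
SAMPLE = """#JSGF V1.0 iso-8859-1;
grammar time;
public <time> = midnight {12:00 AM} | noon {12:00 PM} | now;
"""


def parse_flat_rule(jsgf_text: str) -> Dict[str, str]:
    """Return {phrase: annotation} for rules like <r> = a {x} | b {y};"""
    entries: Dict[str, str] = {}
    for rule in re.finditer(r"<\w+>\s*=\s*([^;]*);", jsgf_text):
        for alternative in rule.group(1).split("|"):
            match = _ALT.match(alternative)
            if match is None:
                continue
            phrase = match.group("phrase")
            tag = match.group("tag")
            # Assumption: an untagged phrase returns its own text.
            entries[phrase] = tag.strip() if tag is not None else phrase
    return entries


def lookup(entries: Dict[str, str], option: str) -> bool:
    """True if the option matches a recognition phrase or an annotation."""
    return option in entries or option in entries.values()
```

Under these assumptions, `parse_flat_rule(SAMPLE)` maps 'midnight' to its annotation, which is the information a callflow builder would need in order to generate a branch on that annotation.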
It should be understood that the present invention can be realized in hardware, software, or a combination of hardware and software. The present invention can also be realized in a centralized fashion in one computer system, or in a distributed fashion where different elements are spread across several interconnected computer systems. Any kind of computer system or other apparatus adapted for carrying out the methods described herein is suited. A typical combination of hardware and software can be a general purpose computer system with a computer program that, when being loaded and executed, controls the computer system such that it carries out the methods described herein.
The present invention also can be embedded in a computer program product, which comprises all the features enabling the implementation of the methods described herein, and which when loaded in a computer system is able to carry out these methods. Computer program or application in the present context means any expression, in any language, code or notation, of a set of instructions intended to cause a system having an information processing capability to perform a particular function either directly or after either or both of the following: a) conversion to another language, code or notation; b) reproduction in a different material form.
This invention can be embodied in other forms without departing from the spirit or essential attributes thereof. Accordingly, reference should be made to the following claims, rather than to the foregoing specification, as indicating the scope of the invention.
Claims
1. A method in a speech recognition application callflow, comprising the steps of:
- placing a prompt into a workspace for the speech recognition application workflow; and
- attaching at least one among a pre-built grammar and a user-entered individual new option to the prompt.
2. The method of claim 1, wherein the step of attaching the pre-built grammar comprises the step of selecting the pre-built grammar from a list.
3. The method of claim 2, wherein the method further comprises the step of searching the list of pre-built grammars for matches to the user-entered individual new option.
4. The method of claim 3, wherein if a match exists between the pre-built grammar and the user-entered individual new option, then the user-entered individual new option points to an equivalent pre-built grammar.
5. The method of claim 3, wherein if a match exists between the pre-built grammar and the user-entered individual new option, then the user-entered individual new option forms a part of the list of pre-built grammars.
6. The method of claim 1, wherein the pre-built grammars are selected from the group comprising VoiceXML and custom-built grammars from a library.
7. The method of claim 1, wherein the method further comprises the step of enabling a customized user selective output of the pre-built grammar.
8. The method of claim 1, wherein the method supports prototyping without knowledge of a grammar structure by a user.
9. The method of claim 3, wherein the method further comprises the step of feeding the result of the step of searching to the pre-defined grammar instead of forming an auxiliary grammar.
10. A method in a speech recognition application callflow, comprising the steps of:
- assigning an individual option and a pre-built grammar to a same prompt;
- treating the individual option as a valid output of the pre-built grammar if the individual option is a potential valid match to a recognition phrase or an annotation in the pre-built grammar; and
- treating the individual option as an independent grammar from the pre-built grammar if the individual option fails to be a potential valid match to the recognition phrase or the annotation in the pre-built grammar.
11. A system for managing grammar options in a graphical callflow builder, comprising:
- a memory; and
- a processor programmed to place a prompt into a workspace for the speech recognition application workflow; and
- attach at least one among a pre-built grammar and a user-entered individual new option to the prompt.
12. The system of claim 11, wherein the processor attaches the pre-built grammar by selecting the pre-built grammar from a list.
13. The system of claim 12, wherein the processor is further programmed to search the list of pre-built grammars for matches to the user-entered individual new option.
14. The system of claim 13, wherein if a match exists between the pre-built grammar and the user-entered individual new option, then the user-entered individual new option points to an equivalent pre-built grammar.
15. The system of claim 13, wherein if a match exists between the pre-built grammar and the user-entered individual new option, then the user-entered individual new option forms a part of the list of pre-built grammars.
16. The system of claim 11, wherein the pre-built grammars are selected from the group comprising VoiceXML and custom-built grammars from a library.
17. The system of claim 11, wherein the processor is further programmed to enable a customized user selective output of the pre-built grammar.
18. The system of claim 13, wherein the processor is further programmed to feed the result of the search to the pre-defined grammar instead of forming an auxiliary grammar.
19. A machine-readable storage, having stored thereon a computer program having a plurality of code sections executable by a machine for causing the machine to perform the steps of placing a prompt into a workspace for the speech recognition application workflow and attaching at least one among a pre-built grammar and a user-entered individual new option to the prompt.
20. The machine-readable storage of claim 19, wherein the machine-readable storage is further programmed to select the pre-built grammar from a list.
Type: Application
Filed: Dec 2, 2003
Publication Date: Jun 2, 2005
Applicant: International Business Machines Corporation (Armonk, NY)
Inventors: Ciprian Agapi (Lake Worth, FL), Felipe Gomez (Weston, FL), James Lewis (Delray Beach, FL), Vanessa Michelini (Boca Raton, FL)
Application Number: 10/726,102