Abstract: There is disclosed a system and method for automatically performing semantic categorization. In one embodiment, at least one text description pertaining to a category set is accepted, along with words that a user is anticipated to utter pertaining to that category set; a lexical chaining confidence score is attached to each pair matched between the anticipated words and the accepted text description. These confidence scores are subsequently used by a categorization circuit that accepts a text phrase utterance from an input source along with a category set pertaining to the accepted utterance. The categorization circuit, in one embodiment, creates word pairs matched between the accepted text phrase utterance and the accepted category set. From these word pairs, the category pertaining to the utterance is determined based, at least in part, on the previously assigned lexical chaining confidence scores.
Type:
Application
Filed:
February 20, 2007
Publication date:
August 21, 2008
Applicants:
Intervoice Limited Partnership, Language Computer Corporation
Inventors:
Ellis K. Cave, Mithun Balakrishna, Vincent Mo
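The categorization step described in the abstract above can be illustrated with a minimal Python sketch. Everything in it is an assumption made for illustration: the score table, the category names, the sample words, and in particular the choice to sum pair scores per category and pick the maximum. The abstract states only that the category is determined based, at least in part, on the lexical chaining confidence scores; it does not specify how they are aggregated.

```python
# Hypothetical confidence scores, assumed to have been assigned beforehand to
# (anticipated word, category description word) pairs. Values are invented.
PAIR_SCORES = {
    ("pay", "payment"): 0.9,
    ("bill", "payment"): 0.8,
    ("bill", "account"): 0.3,
    ("balance", "account"): 0.7,
}

# Hypothetical category set: each category has a short text description,
# reduced here to a list of description words.
CATEGORY_DESCRIPTIONS = {
    "make_payment": ["payment"],
    "check_account": ["account"],
}


def categorize(utterance, descriptions, pair_scores):
    """Return (best_category, totals) for a text phrase utterance.

    Word pairs are formed between utterance words and description words.
    Summing the pair scores per category is an assumption of this sketch.
    """
    words = utterance.lower().split()
    totals = {}
    for category, description_words in descriptions.items():
        total = 0.0
        for word in words:
            for description_word in description_words:
                # Pairs without a stored score contribute nothing.
                total += pair_scores.get((word, description_word), 0.0)
        totals[category] = total
    if not totals or max(totals.values()) == 0.0:
        return None, totals  # no pair matched any category
    best = max(totals, key=totals.get)
    return best, totals


best, totals = categorize("pay my bill", CATEGORY_DESCRIPTIONS, PAIR_SCORES)
print(best, totals)
```

In this sketch an utterance with no scored pair returns no category rather than an arbitrary one; a real system would need its own policy for that case.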
Abstract: A Statistical Language Model (SLM) that can be used in an ASR for Interactive Voice Response (IVR) systems in general and Natural Language Speech Applications (NLSAs) in particular can be created by first manually producing a brief text description of each task that can be performed in an NLSA. These brief descriptions are then analyzed, in one embodiment, to generate spontaneous-speech-utterance-based pre-filler patterns and a skeletal set of content words. The pre-filler patterns are in turn used with Part-of-Speech (POS) tagged conversations from a spontaneous speech corpus to generate a set of pre-filler phrases. The skeletal set of content words is used with an electronic lexico-semantic database and with a thesaurus-based content word extraction process to generate a more extensive list of content words. The pre-filler phrases and the content word set thus generated are combined into utterances using a lexico-semantic-resource-based process.
Type:
Application
Filed:
September 14, 2006
Publication date:
March 20, 2008
Applicants:
Intervoice Limited Partnership, Language Computer Corporation
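The final step of the second abstract, combining pre-filler phrases with content words into utterances, can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the process the application describes: the sample phrases, the coarse tags, and the tag-matching filter are all hypothetical, and the filter merely stands in for the lexico-semantic-resource-based process named in the abstract.

```python
from itertools import product

# Hypothetical pre-filler phrases, each paired with the coarse tag of the
# content it is assumed to accept next ("VB" = verb phrase, "NN" = noun).
PRE_FILLERS = [
    ("i would like to", "VB"),
    ("i have a question about my", "NN"),
]

# Hypothetical expanded content word list with coarse tags.
CONTENT = [
    ("pay my bill", "VB"),
    ("bill", "NN"),
    ("check my balance", "VB"),
    ("balance", "NN"),
]


def combine(pre_fillers, content):
    """Join every pre-filler with every content entry whose tag matches.

    The tag match is an assumed stand-in for the lexico-semantic
    compatibility check; it is not taken from the source.
    """
    return [
        f"{phrase} {words}"
        for (phrase, wanted_tag), (words, tag) in product(pre_fillers, content)
        if wanted_tag == tag
    ]


for utterance in combine(PRE_FILLERS, CONTENT):
    print(utterance)
```

Utterances generated this way could then serve as training text for a statistical language model, which is the use the abstract describes.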