Abstract: A system for synthesizing a speech signal from strings of words, which are themselves strings of characters, includes a memory in which predetermined syntax tags are stored in association with entered words and phonetic transcriptions are stored in association with the syntax tags. A parser accesses the memory and groups the syntax tags of the entered words into phrases according to a first set of predetermined grammatical rules relating the syntax tags to one another. The parser also verifies the conformance of sequences of the phrases to a second set of predetermined grammatical rules relating the phrases to one another. The system retrieves the phonetic transcriptions associated with the syntax tags that were grouped into phrases conforming to the second set of rules, and also translates predetermined strings of characters into words.
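The retrieval step described above can be illustrated with a minimal sketch, which shows how a transcription keyed by syntax tag might be selected once a tag assignment passes both rule sets. Everything named here is an assumption for illustration and does not come from the abstract: the lexicon entries, the tag names, the phrase and sentence rules, the ARPAbet-style transcriptions, and the greedy grouping strategy.

```python
from itertools import product

# Hypothetical memory: word -> {syntax tag: phonetic transcription}.
LEXICON = {
    "we": {"PRON": "W IY1"},
    "played": {"VERB": "P L EY1 D"},
    "want": {"VERB": "W AA1 N T"},
    "the": {"DET": "DH AH0"},
    "to": {"TO": "T UW1"},
    "record": {"NOUN": "R EH1 K ER0 D", "VERB": "R IH0 K AO1 R D"},
}

# Hypothetical first rule set: tag sequences that form a phrase.
PHRASE_RULES = {
    ("DET", "NOUN"): "NP",
    ("PRON",): "NP",
    ("TO", "VERB"): "INF",
    ("VERB",): "VP",
}

# Hypothetical second rule set: phrase sequences that form a sentence.
SENTENCE_RULES = {("NP", "VP"), ("NP", "VP", "NP"), ("NP", "VP", "INF")}


def group_into_phrases(tags):
    """Greedily group tags into phrases, longest rule first; None on failure."""
    phrases, i = [], 0
    while i < len(tags):
        for length in (2, 1):
            key = tuple(tags[i:i + length])
            if len(key) == length and key in PHRASE_RULES:
                phrases.append(PHRASE_RULES[key])
                i += length
                break
        else:
            return None  # no phrase rule starts at position i
    return phrases


def transcribe(words):
    """Return transcriptions for the first tag assignment that passes both rule sets."""
    for tags in product(*(LEXICON[w] for w in words)):
        phrases = group_into_phrases(tags)
        if phrases is not None and tuple(phrases) in SENTENCE_RULES:
            return [LEXICON[w][t] for w, t in zip(words, tags)]
    return None


print(transcribe(["we", "played", "the", "record"])[-1])  # R EH1 K ER0 D
print(transcribe(["we", "want", "to", "record"])[-1])     # R IH0 K AO1 R D
```

In this toy example the homograph "record" receives a different transcription depending on which tag survives the grammatical checks, which is the point of storing transcriptions in association with syntax tags rather than with bare words.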
Abstract: In a digital computer, there is provided a method of recognizing speech, comprising the steps of: entering a cohesive speech segment; determining gross acoustic attributes of the entered segment; determining fine acoustic attributes of the entered segment; assigning at least one subsyllable to the entered segment based on the gross and fine acoustic attributes determined; repeating the foregoing steps on successive cohesive speech segments to generate at least one sequence of subsyllables; converting the sequence of subsyllables into a sequence of syllables by finding the sequence of subsyllables in a table in which predetermined subsyllable sequences correspond with respective syllables and syllable sequences; combining the converted sequence of syllables into words; and verifying the conformance of the words to a first predetermined set of grammatical rules. An apparatus implementing the method is also disclosed.
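The two table-driven conversion steps (subsyllables to syllables, then syllables to words) might be sketched as follows. The table contents, the subsyllable labels, and the greedy longest-match strategy are assumptions chosen for illustration; the abstract specifies only that predetermined subsyllable sequences are found in a table.

```python
# Hypothetical table: subsyllable sequences -> syllables.
SYLLABLE_TABLE = {
    ("s_fric", "i_vocoid"): "see",
    ("d_stop", "a_vocoid", "t_stop"): "dat",
    ("a_vocoid",): "a",
}

# Hypothetical table: syllable sequences -> words.
WORD_TABLE = {
    ("see",): "see",
    ("dat", "a"): "data",
}


def longest_match(sequence, table):
    """Convert a sequence by repeatedly taking the longest table key that matches."""
    out, i = [], 0
    max_len = max(len(key) for key in table)
    while i < len(sequence):
        for n in range(min(max_len, len(sequence) - i), 0, -1):
            key = tuple(sequence[i:i + n])
            if key in table:
                out.append(table[key])
                i += n
                break
        else:
            raise KeyError(f"no table entry starts at position {i}")
    return out


subsyllables = ["s_fric", "i_vocoid", "d_stop", "a_vocoid", "t_stop", "a_vocoid"]
syllables = longest_match(subsyllables, SYLLABLE_TABLE)
words = longest_match(syllables, WORD_TABLE)
print(syllables)  # ['see', 'dat', 'a']
print(words)      # ['see', 'data']
```

A greedy match is the simplest choice but can commit to a wrong segmentation; a system that generates more than one candidate subsyllable sequence, as the abstract allows, would presumably keep alternatives and let the grammatical check choose among them.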
Abstract: In a digital computer, a method for speech recognition includes steps of sampling a speaker's speech and providing speech data sample segments of predetermined length at predetermined sampling intervals based on changes in energy in the speech. Cohesive speech segments, which correspond to intervals of stable vocoids, changing vocoids, frication, and silence, are identified from the speech data sample segments, and are assigned frames of subsyllables. Each cohesive segment corresponds to at least one respective frame, and each frame includes at least one of a plurality of subsyllables that characterizes predetermined gross and fine phonetic attributes of the respective cohesive segment. The subsyllables are located in a first lookup table mapping sequences of subsyllables into syllables, and the syllables are combined into words by locating words in another lookup table. The conformance of sequences of the words to a set of predetermined checking rules is checked, and a recognition result is reported.
Abstract: A system and method for parsing natural language is provided. The system comprises a plurality of computer program code modules which address a plurality of predetermined lookup tables. Strings of characters, such as words, that are assigned one or more syntactical tags identifying the grammatical roles the strings can play are stored in a dictionary and retrieved as a system user inputs text to be processed. The tags are manipulated by a phrase parsing program module and translated into phrases according to grammatical rules stored in a lookup table. Sequences of the phrases corresponding to input sentences are manipulated by a sentence checking program module which consults another suitable rule table. The system and method optionally provide help in identifying grammatically incorrect passages in the input text.
Abstract: A system and method are provided for determining, from continuous speech, the instantaneous values of a set of articulatory parameters. The continuous speech data is a sequence of spectral profiles obtained by spectrally sampling continuous speech. The spectral samples are presented in sequence to a plurality of class transforms, each establishing a respective speech phoneme class which includes a plurality of speech phonemes having similar spectral and articulatory characteristics. Each class transform converts a speech segment included in its class and contained in a spectral sample into a predetermined set of articulatory parameter values. A class-discriminating transform operates in parallel with the class transforms to produce a set of probability values, each indicating the probability that the spectral sample being transformed represents a phoneme in a respective speech phoneme class.
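The parallel arrangement of class transforms and a class-discriminating transform might look like the sketch below. The abstract does not say what form the transforms take or how their outputs are combined, so the linear matrices, the softmax used to obtain probabilities, the two class names, and the probability-weighted blend are all assumptions made for illustration.

```python
import math


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(m * v for m, v in zip(row, vector)) for row in matrix]


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical class transforms: 3 spectral bins -> 2 articulatory parameters.
CLASS_TRANSFORMS = {
    "vowel": [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]],
    "fricative": [[0.0, 0.0, 1.0], [0.0, 1.0, 1.0]],
}

# Hypothetical class-discriminating transform: one score row per class,
# in the same order as CLASS_TRANSFORMS.
DISCRIMINATOR = [[2.0, 0.0, -2.0], [-2.0, 0.0, 2.0]]


def estimate(spectral_sample):
    """Run every class transform and the discriminator on one spectral sample."""
    names = list(CLASS_TRANSFORMS)
    per_class = {n: matvec(CLASS_TRANSFORMS[n], spectral_sample) for n in names}
    probs = dict(zip(names, softmax(matvec(DISCRIMINATOR, spectral_sample))))
    n_params = len(per_class[names[0]])
    # Assumed combination rule: weight each class's estimate by its probability.
    blended = [sum(probs[n] * per_class[n][k] for n in names) for k in range(n_params)]
    return per_class, probs, blended


per_class, probs, blended = estimate([1.0, 0.5, 0.0])
print(per_class["vowel"])           # [1.0, 0.5]
print(round(probs["vowel"], 3))     # 0.982
```

Weighting by probability is only one way to use the discriminator's output; selecting the single most probable class and reporting that class's parameter values would be an equally plausible reading of the abstract.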