Patents Examined by Thomas Shortledge

Method of nonvisual enrollment for speech recognition

Patent number: 7092884

Abstract: In a speech recognition system, a method of nonvisual enrollment comprising playing an audio representation of an enrollment script. As the enrollment is playing, shadowed speech from a user can be received, wherein the shadowed speech can lag the enrollment script. The received shadowed speech can be recorded for enrolling the user into the speech recognition system.

Type: Grant

Filed: March 1, 2002

Date of Patent: August 15, 2006

Assignee: International Business Machines Corporation

Inventors: James R. Lewis, Melanie D. Polkosky, Wallace J. Sadowski, Jr.

Method and system for verifying and enabling user access based on voice parameters

Patent number: 7054811

Abstract: A system for verifying and enabling user access, which includes a voice registration unit for providing a substantially unique and initial identification of each of a plurality of the speaker/users by finding the speaker/user's voice parameters in a voice registration sample and storing same in a database. The system also includes a voice authenticating unit for substantially absolute verification of an identity of one of said plurality of users. The voice authenticating unit includes a recognition unit for providing a voice authentication sample, and being operative with the database. The voice authenticating unit also includes a decision unit operative with the recognition unit and the database to decide whether the user is the same as the person of the same identity registered with the system, such that the identity of one of the plurality of users is substantially absolutely verified.

Type: Grant

Filed: October 6, 2004

Date of Patent: May 30, 2006

Assignee: Cellmax Systems Ltd.

Inventor: Ziv Barzilay

Compact easily parseable binary format for a context-free grammer

Patent number: 7024350

Abstract: A computer-loadable data structure is provided that represents a state-and-transition-based description of a speech grammar. The data structure includes first and second transition entries that both represent transitions from a first state. The second transition entry is contiguous with the first transition entry in the data structure and includes a last-transition value. The last-transition value indicates that the second transition is the last transition from the first state in the data structure. A method is also provided for retrieving information from a binary grammar. The method includes receiving an index into a set of transition entries and converting the index into a memory offset relative to the beginning of the binary grammar, where the offset is based on a memory offset to the beginning of the set of transition entries, the fixed size of each transition entry, and the index.
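
The idea in this abstract can be illustrated with a minimal sketch. It is not the patented format: the 8-byte entry layout, the 4-byte header, the flag bit, and the names `build_grammar`, `entry_offset`, and `transitions_from` are all assumptions made for illustration. It shows contiguous fixed-size transition entries, a last-transition flag that ends a state's run of entries, and the index-to-offset conversion (table offset plus index times entry size).

```python
import struct

# Hypothetical layout (not taken from the patent): little-endian, 8 bytes per entry.
#   uint32 next_state, uint16 word_id, uint16 flags
ENTRY = struct.Struct("<IHH")
# Hypothetical header: byte offset of the transition table from the start of the blob.
HEADER = struct.Struct("<I")
LAST_TRANSITION = 0x1  # flag bit: this entry is the last transition from its state


def build_grammar(states):
    """Pack a list of per-state transition lists into one binary blob.

    states[i] is a non-empty list of (next_state, word_id) pairs.
    Returns (blob, first_index), where first_index[i] is the index of
    state i's first transition entry.
    """
    entries = bytearray()
    first_index = []
    count = 0
    for transitions in states:
        if not transitions:
            raise ValueError("this sketch assumes every state has a transition")
        first_index.append(count)
        for i, (next_state, word_id) in enumerate(transitions):
            is_last = i == len(transitions) - 1
            flags = LAST_TRANSITION if is_last else 0
            entries += ENTRY.pack(next_state, word_id, flags)
            count += 1
    return HEADER.pack(HEADER.size) + bytes(entries), first_index


def entry_offset(blob, index):
    """Convert a transition index into a byte offset from the start of the blob."""
    (table_offset,) = HEADER.unpack_from(blob, 0)
    return table_offset + index * ENTRY.size


def transitions_from(blob, first_index):
    """Read contiguous entries until one carries the last-transition flag."""
    result = []
    index = first_index
    while True:
        next_state, word_id, flags = ENTRY.unpack_from(blob, entry_offset(blob, index))
        result.append((next_state, word_id))
        if flags & LAST_TRANSITION:
            return result
        index += 1


if __name__ == "__main__":
    blob, first = build_grammar([[(1, 10), (2, 11)], [(2, 12)], [(0, 13)]])
    print(transitions_from(blob, first[0]))
```

Because each entry has a fixed size and a state's entries are contiguous, a reader needs no per-state transition count or pointer chasing: it jumps to the first entry by arithmetic and stops at the flagged one.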

Type: Grant

Filed: February 7, 2001

Date of Patent: April 4, 2006

Assignee: Microsoft Corporation

Inventors: Philipp H. Schmid, Ralph Lipe

Method and apparatus for extracting infinite ambiguity when factoring finite state transducers

Patent number: 6952667

Abstract: A method extracts all infinite ambiguity from an input finite-state transducer (FST). The input FST is factorized into a first factor and a second factor such that the first factor is finitely ambiguous, and the second factor retains all infinite ambiguity of the original FST. The first factor is defined so that it replaces every loop where the input symbol of every arc is ε (i.e., epsilon, the empty string) by a single arc with ε on the input side and a diacritic on the output side. The second factor is defined so that it maps every diacritic to one or more ε-loops.

Type: Grant

Filed: December 18, 2000

Date of Patent: October 4, 2005

Assignee: Xerox Corporation

Inventor: Andre Kempe

Systems and methods for extracting meaning from multimodal inputs using finite-state devices

Patent number: 6868383

Abstract: Multimodal utterances contain a number of different modes. These modes can include speech, gestures, and pen, haptic, and gaze inputs, and the like. This invention uses recognition results from one or more of these modes to provide compensation to the recognition process of one or more other ones of these modes. In various exemplary embodiments, a multimodal recognition system inputs one or more recognition lattices from one or more of these modes, and generates one or more models to be used by one or more mode recognizers to recognize the one or more other modes. In one exemplary embodiment, a gesture recognizer inputs a gesture input and outputs a gesture recognition lattice to a multimodal parser. The multimodal parser generates a language model and outputs it to an automatic speech recognition system, which uses the received language model to recognize the speech input that corresponds to the recognized gesture input.

Type: Grant

Filed: July 12, 2001

Date of Patent: March 15, 2005

Assignee: AT&T Corp.

Inventors: Srinivas Bangalore, Michael J. Johnston
