Patents Examined by Thomas Shortledge
  • Patent number: 7092884
    Abstract: In a speech recognition system, a method of nonvisual enrollment comprising playing an audio representation of an enrollment script. As the enrollment script is playing, shadowed speech from a user can be received, wherein the shadowed speech can lag the enrollment script. The received shadowed speech can be recorded for enrolling the user into the speech recognition system.
    Type: Grant
    Filed: March 1, 2002
    Date of Patent: August 15, 2006
    Assignee: International Business Machines Corporation
    Inventors: James R. Lewis, Melanie D. Polkosky, Wallace J. Sadowski, Jr.
  • Patent number: 7054811
    Abstract: A system for verifying and enabling user access, which includes a voice registration unit for providing a substantially unique and initial identification of each of a plurality of the speaker/users by finding the speaker/user's voice parameters in a voice registration sample and storing same in a database. The system also includes a voice authenticating unit for substantially absolute verification of an identity of one of said plurality of users. The voice authenticating unit includes a recognition unit for providing a voice authentication sample, and being operative with the database. The voice authenticating unit also includes a decision unit operative with the recognition unit and the database to decide whether the user is the same as the person of the same identity registered with the system, such that the identity of one of the plurality of users is substantially absolutely verified.
    Type: Grant
    Filed: October 6, 2004
    Date of Patent: May 30, 2006
    Assignee: Cellmax Systems Ltd.
    Inventor: Ziv Barzilay
  • Patent number: 7024350
    Abstract: A computer-loadable data structure is provided that represents a state-and-transition-based description of a speech grammar. The data structure includes first and second transition entries that both represent transitions from a first state. The second transition entry is contiguous with the first transition entry in the data structure and includes a last-transition value. The last-transition value indicates that the second transition is the last transition from the first state in the data structure. A method is also provided for retrieving information from a binary grammar. The method includes receiving an index into a set of transition entries and converting the index into a memory offset relative to the beginning of the binary grammar, where the offset is based on a memory offset to the beginning of the set of transition entries, the fixed size of each transition entry and the index.
    Type: Grant
    Filed: February 7, 2001
    Date of Patent: April 4, 2006
    Assignee: Microsoft Corporation
    Inventors: Philipp H. Schmid, Ralph Lipe
  • Patent number: 6952667
    Abstract: A method extracts all infinite ambiguity from an input finite-state transducer (FST). The input FST is factorized into a first factor and a second factor such that the first factor is finitely ambiguous, and the second factor retains all infinite ambiguity of the original FST. The first factor is defined so that it replaces every loop where the input symbol of every arc is an ε (i.e., epsilon, empty string) by a single arc with ε on the input side and a diacritic on the output side. The second factor is defined so that it maps every diacritic to one or more ε-loops.
    Type: Grant
    Filed: December 18, 2000
    Date of Patent: October 4, 2005
    Assignee: Xerox Corporation
    Inventor: Andre Kempe
  • Patent number: 6868383
    Abstract: Multimodal utterances contain a number of different modes. These modes can include speech, gestures, and pen, haptic, and gaze inputs, and the like. This invention uses recognition results from one or more of these modes to provide compensation to the recognition process of one or more other ones of these modes. In various exemplary embodiments, a multimodal recognition system inputs one or more recognition lattices from one or more of these modes, and generates one or more models to be used by one or more mode recognizers to recognize the one or more other modes. In one exemplary embodiment, a gesture recognizer inputs a gesture input and outputs a gesture recognition lattice to a multimodal parser. The multimodal parser generates a language model and outputs it to an automatic speech recognition system, which uses the received language model to recognize the speech input that corresponds to the recognized gesture input.
    Type: Grant
    Filed: July 12, 2001
    Date of Patent: March 15, 2005
    Assignee: AT&T Corp.
    Inventors: Srinivas Bangalore, Michael J. Johnston
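
Illustrative sketch for patent 7024350 (binary grammar): the abstract describes converting a transition index into a memory offset using the offset of the transition set, the fixed entry size, and the index, and marking the final transition out of a state with a last-transition value. The Python below is only a minimal sketch of that idea. The 8-byte entry layout, the field names, the flag bit, and the 16-byte header are all assumptions made for illustration; none of them comes from the patent.

```python
import struct

# Hypothetical fixed-size entry (not from the patent), little-endian:
# uint32 next_state, uint16 word_id, uint16 flags (bit 0 = last transition).
ENTRY = struct.Struct("<IHH")
LAST_TRANSITION = 0x1


def transition_offset(table_offset, index, entry_size=ENTRY.size):
    """Offset of entry `index`, relative to the start of the binary grammar."""
    return table_offset + index * entry_size


def transitions_from(blob, table_offset, first_index):
    """Yield contiguous transitions of one state until the last-transition flag."""
    index = first_index
    while True:
        offset = transition_offset(table_offset, index)
        next_state, word_id, flags = ENTRY.unpack_from(blob, offset)
        yield next_state, word_id
        if flags & LAST_TRANSITION:
            break
        index += 1


def build_demo():
    """Build a toy grammar blob: an assumed 16-byte header plus three entries."""
    header = b"\x00" * 16
    entries = [(1, 10, 0), (2, 11, LAST_TRANSITION), (3, 12, LAST_TRANSITION)]
    body = b"".join(ENTRY.pack(*entry) for entry in entries)
    return header + body, len(header)
```

Because every entry has the same size, looking up a transition is a single multiply-and-add, and a state's outgoing transitions can be stored back to back without a separate count field.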