Trigrams Or Digrams Patents (Class 382/230)
  • Patent number: 5930783
    Abstract: A computer-implemented method for searching and retrieving images contained within a database of images in which both semantic and cognitive methodologies are utilized. The method accepts a semantic and cognitive description of an image to be searched from a user, successively refines the search utilizing semantic and cognitive methodologies, and then ranks the results for presentation to the user.
    Type: Grant
    Filed: August 29, 1997
    Date of Patent: July 27, 1999
    Assignee: NEC USA, Inc.
    Inventors: Wen-Syan Li, Kasim S. Candan
  • Patent number: 5915039
    Abstract: Fixed-pitch, fixed-font characters embedded in a noisy gray-scale image of picture elements (pels) within a complex background can be extracted prior to execution of any recognition operations by first deriving a normalized Boolean-coded image from the gray-scale image. Then, a subset of at least three uncontaminated character triples is formed by filtering the Boolean-coded image. Next, an affine transform is approximated from locations in the Boolean-coded image of at least three noncollinear ones of the uncontaminated character triples. Lastly, the locations in a logical matrix array of all possible character triples are estimated according to the affine transform.
    Type: Grant
    Filed: November 12, 1996
    Date of Patent: June 22, 1999
    Assignee: International Business Machines Corporation
    Inventors: Raymond Amand Lorie, Jianchang Mao, Kottappuram Mohamedali Mohiuddin
  • Patent number: 5886794
    Abstract: A picture encoding apparatus and method in which the quantization step width of the lower hierarchy data, which has a higher resolution than the upper hierarchy data, is determined for each predetermined block of the respective hierarchy data based on the quantization step width determined by the low-resolution upper hierarchy data. The additional code indicating the characteristics of the quantizer can therefore be omitted, so that compression efficiency is improved and the deterioration of picture quality is reduced when picture data is hierarchically encoded.
    Type: Grant
    Filed: October 22, 1997
    Date of Patent: March 23, 1999
    Assignee: Sony Corporation
    Inventors: Tetsujiro Kondo, Yasuhiro Fujimori, Kunio Kawaguchi
  • Patent number: 5883986
    Abstract: A method and system for automatically modifying an original transcription produced as the output of a recognition operation produces a second, modified transcription, such as, for example, automatically correcting an errorful transcription produced by an OCR operation. The invention uses information in an input text image of character images and in an original transcription associated with the input text image to modify aspects of a formal image source model that models as a grammar the spatial image structure of a set of text images. A recognition operation is then performed on the input text image using the modified formal image source model to produce a second, modified transcription. When the original transcription is errorful, the second transcription is a corrected transcription. Several aspects of the formal image source model may be modified; in particular, character templates to be used in the recognition operation are trained in the font of the glyphs occurring in the input text image.
    Type: Grant
    Filed: June 2, 1995
    Date of Patent: March 16, 1999
    Assignee: Xerox Corporation
    Inventors: Gary E. Kopec, Philip A. Chou, Leslie T. Niles
  • Patent number: 5850480
    Abstract: The present invention includes methods of correcting optical character recognition errors occurring during recognition of alphanumeric character strings contained within one or more predetermined types of alphanumeric character fields. The methods may be practiced with a document processing system having (1) an optical character recognition device for scanning documents and outputting bit-map image data; (2) a recognition engine for converting the bit-map image data into possibly correct alphanumeric characters with associated confidence values; and (3) at least one lexicon of character strings consisting of a list of at least a portion of all of the possible character string values for each of the fields being processed. The present invention corrects OCR errors by performing a contextual comparison analysis between the alphanumeric characters outputted from the recognition engine and the lexicon of character strings.
    Type: Grant
    Filed: May 30, 1996
    Date of Patent: December 15, 1998
    Assignee: Scan-Optics, Inc.
    Inventor: Edward Francis Scanlon
  • Patent number: 5774588
    Abstract: A system and method for more efficiently comparing an unverified string to a lexicon, which filters the lexicon through multiple steps to reduce the number of entries to be directly compared with the unverified string. The method begins by preparing the lexicon with an n-gram encoding, partitioning and hashing process, which can be accomplished in advance of any processing of unverified strings. The unknown is compared first by partitioning and hashing it in the same way to reduce the lexicon in a computationally inexpensive manner. This is followed by an encoded vector comparison step, and finally by a direct string comparison step, which is the most computationally expensive. The reduction of the lexicon is accomplished without arbitrarily eliminating any large portions of the lexicon that might contain relevant candidates. At the same time, the method avoids the need to compare the unverified string directly or indirectly with all the entries in the lexicon.
    Type: Grant
    Filed: June 7, 1995
    Date of Patent: June 30, 1998
    Assignee: United Parcel Service of America, Inc.
    Inventor: Liang Li
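
The staged filtering described in the abstract of patent 5774588 can be illustrated with a much simpler two-stage sketch. This is not the patented partitioning, hashing, and encoded-vector scheme: it assumes a plain inverted index of character bigrams as the cheap filter and Python's `difflib.SequenceMatcher` as the expensive direct comparison. The names `ngrams`, `build_index`, `best_matches`, and the sample lexicon are hypothetical.

```python
from collections import defaultdict
from difflib import SequenceMatcher


def ngrams(word, n=2):
    """Return the set of character n-grams of a word, padded with '#'."""
    padded = "#" + word.lower() + "#"
    return {padded[i:i + n] for i in range(len(padded) - n + 1)}


def build_index(lexicon, n=2):
    """Prepare the lexicon in advance: map each n-gram to the entries containing it."""
    index = defaultdict(set)
    for entry in lexicon:
        for gram in ngrams(entry, n):
            index[gram].add(entry)
    return index


def best_matches(unverified, index, n=2, min_shared=2, top=3):
    """Cheap filter by shared n-gram count, then a direct string comparison."""
    shared = defaultdict(int)
    for gram in ngrams(unverified, n):
        for entry in index.get(gram, ()):
            shared[entry] += 1
    # Stage 1 (inexpensive): keep only entries sharing enough n-grams.
    candidates = [e for e, count in shared.items() if count >= min_shared]
    # Stage 2 (expensive): compare the survivors directly with the input.
    scored = [
        (SequenceMatcher(None, unverified.lower(), e.lower()).ratio(), e)
        for e in candidates
    ]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [entry for _, entry in scored[:top]]


# Hypothetical sample lexicon, used only for illustration.
LEXICON = ["ATLANTA", "ATLANTIC", "AUSTIN", "BOSTON", "HOUSTON"]
INDEX = build_index(LEXICON)
```

Calling `best_matches("ATLNTA", INDEX)` runs the direct comparison only on the entries that survive the bigram filter, rather than on every entry in the lexicon.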
  • Patent number: 5724449
    Abstract: A pointing device-driven interface apparatus and method for enabling a user to enter data (in the form of options) into a computer system. User data files are maintained which indicate the relative frequency with which the user has entered sequences of options. The sequences are of a predetermined length. The user data is consulted to predict the most likely options to be entered next by the user. These options are presented to the user on a template. A stroke made by the user with the pointing device is then interpreted in light of the template and a syntax of strokes. An action associated with the interpretation is then carried out. An example of the action is communicating a character to a software application, updating the sequence, and repeating the consultation, presentation, interpretation and carrying out just described.
    Type: Grant
    Filed: June 22, 1995
    Date of Patent: March 3, 1998
    Assignee: International Business Machines Corporation
    Inventor: Liam David Comerford
  • Patent number: 5706365
    Abstract: A system and method provide for indexing and retrieval of stored documents using a decomposition of words in the documents into n-grams, or linear word subunits. The documents are indexed as pages in a number of banks. For each bank there is a bank index. The individual n-grams identified for each page are stored in the bank index. Each bank index further contains an entry map that indicates whether a given n-gram is present in any of the pages of the bank, and then provides an index to a page map that further indicates which page in the bank contains the n-gram. When a search query is input, the query words are decomposed into their n-grams. The query word n-grams are compared first with entry maps to determine if the query word n-grams appear on any page in the bank. If so, the associated page map is traversed to determine which page in the bank contains the query word n-grams. The n-grams on the page are compared with the query word n-grams to determine the presence of a match therebetween.
    Type: Grant
    Filed: April 10, 1995
    Date of Patent: January 6, 1998
    Assignee: Rebus Technology, Inc.
    Inventors: Vijayakumar Rangarajan, Natarajan Ravichandran
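
The bank index lookup described in the abstract of patent 5706365 can be sketched roughly as follows. This is a minimal illustration under several assumptions: a single bank, trigrams as the n-gram size, a dictionary whose keys stand in for the entry map and whose values stand in for the page maps, and a final check by exact word membership instead of a page-level n-gram comparison. The `Bank` class and `word_ngrams` helper are hypothetical names.

```python
def word_ngrams(word, n=3):
    """Decompose a word into its set of character n-grams."""
    word = word.lower()
    if len(word) <= n:
        return {word}
    return {word[i:i + n] for i in range(len(word) - n + 1)}


class Bank:
    """One bank of pages with a simplified bank index."""

    def __init__(self):
        self.pages = {}     # page id -> set of words on that page
        self.page_map = {}  # n-gram -> set of page ids; keys act as the entry map

    def add_page(self, page_id, text):
        """Index a page by the n-grams of each of its words."""
        words = set(text.lower().split())
        self.pages[page_id] = words
        for word in words:
            for gram in word_ngrams(word):
                self.page_map.setdefault(gram, set()).add(page_id)

    def search(self, query_word):
        """Return the sorted ids of pages that contain the query word."""
        grams = word_ngrams(query_word)
        # Entry-map check: every query n-gram must occur somewhere in the bank.
        if any(gram not in self.page_map for gram in grams):
            return []
        # Page-map traversal: pages that contain all of the query n-grams.
        candidates = set.intersection(*(self.page_map[g] for g in grams))
        # Final verification on each candidate page (simplified to word lookup).
        return sorted(p for p in candidates if query_word.lower() in self.pages[p])


bank = Bank()
bank.add_page(1, "the quick brown fox")
bank.add_page(2, "quick silver lining")
bank.add_page(3, "slow brown bear")
```

The entry-map check lets a query such as `bank.search("purple")` be rejected without touching any page, which is the point of consulting the bank index first.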
  • Patent number: 5687254
    Abstract: A method and system provide for searching and matching gesture-based data such as handwriting without performing a recognition process on the handwritten gesture data to convert it to a standard computer-coded form. Target data collected as sample data points of spatial coordinates over time are concatenated into a single target gesture sequence of sample data points. The sample data points comprising the gesture-based data structure to be searched (the corpus) are grouped into corpus gesture sequences for matching against the target gesture sequence. Matching may be done by any suitable method, and a novel signal comparison technique based on dynamic time warping concepts is illustrated. The result of the matching is a list of the locations of the matching corpus gesture sequences in the corpus, which in turn may be used for further processing, such as the display of an image of the matching corpus gestures for a system user.
    Type: Grant
    Filed: June 6, 1994
    Date of Patent: November 11, 1997
    Assignee: Xerox Corporation
    Inventors: Alex D. Poon, Karon Anne Weber, Todd A. Cass
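
The abstract of patent 5687254 mentions a comparison technique based on dynamic time warping concepts. The sketch below shows only the textbook dynamic time warping distance between two sequences of (x, y) sample points, not the novel technique the patent illustrates. The function names and the threshold parameter are hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping cost between two sequences of (x, y) points."""
    inf = float("inf")
    rows, cols = len(a), len(b)
    # cost[i][j] holds the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (cols + 1) for _ in range(rows + 1)]
    cost[0][0] = 0.0
    for i in range(1, rows + 1):
        for j in range(1, cols + 1):
            step = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = step + min(
                cost[i - 1][j],      # a advances, b repeats its point
                cost[i][j - 1],      # b advances, a repeats its point
                cost[i - 1][j - 1],  # both advance
            )
    return cost[rows][cols]


def find_matches(target, corpus_sequences, threshold):
    """Return the positions of corpus gesture sequences within the threshold."""
    return [
        position
        for position, sequence in enumerate(corpus_sequences)
        if dtw_distance(target, sequence) <= threshold
    ]


target = [(0, 0), (1, 1), (2, 2)]
stretched = [(0, 0), (0, 0), (1, 1), (2, 2), (2, 2)]  # same stroke, drawn slower
shifted = [(0, 5), (1, 6), (2, 7)]                    # same shape, different place
```

Because the warping path may repeat points, the slower `stretched` stroke aligns with `target` at zero cost, while `shifted` does not. As in the abstract, the result of `find_matches` is a list of locations in the corpus, with no recognition step involved.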
  • Patent number: 5628003
    Abstract: A document storage and retrieval system is provided with means for storing a document body in the form of an image, means for storing text information in the form of a character code string for retrieval, means for executing a retrieval with reference to the text information, and means for displaying the related document image on a retrieval terminal according to the retrieval result. Such a system can retrieve the full contents of a document and can also display the document body, printed in an easy-to-read format, directly as an image. Accordingly, users can retrieve documents with arbitrary words and can also read, as an image on a terminal, even a document complicated by mathematical expressions and charts, the same as on paper. Further, the invention provides a system wherein the text information for retrieval is extracted automatically from the document image through character recognition.
    Type: Grant
    Filed: August 24, 1993
    Date of Patent: May 6, 1997
    Assignee: Hitachi, Ltd.
    Inventors: Hiromichi Fujisawa, Atsushi Hatakeyama, Yasuaki Nakano, Junichi Higashino, Toshihiro Hananoi
  • Patent number: 5617488
    Abstract: A word recognizer system 10 has a probabilistic relaxation process that improves the performance of an image text recognition technique by propagating the influence of word collocation statistics. Word collocation refers to the likelihood that two words co-occur within a fixed distance of one another. The word recognizer 10 receives groups of visually similar decisions (called neighborhoods) for words in a running text. The positions of decisions within the neighborhoods are modified based on how often they co-occur with decisions in the neighborhoods of other nearby words. This process is iterated a number of times, effectively propagating the influence of the collocation statistics across an input text.
    Type: Grant
    Filed: February 1, 1995
    Date of Patent: April 1, 1997
    Assignee: The Research Foundation of State University of New York
    Inventors: Tao Hong, Jonathan J. Hull
  • Patent number: 5479536
    Abstract: A pointing device-driven interface apparatus and method for enabling a user to enter data (in the form of options) into a computer system. User data files are maintained which indicate the relative frequency with which the user has entered sequences of options. The sequences are of a predetermined length. The user data is consulted to predict the most likely options to be entered next by the user. These options are presented to the user on a template. A stroke made by the user with the pointing device is then interpreted in light of the template and a syntax of strokes. An action associated with the interpretation is then carried out. An example of the action is communicating a character to a software application, updating the sequence, and repeating the consultation, presentation, interpretation and carrying out just described.
    Type: Grant
    Filed: April 25, 1994
    Date of Patent: December 26, 1995
    Assignee: International Business Machines Corporation
    Inventor: Liam D. Comerford
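
The prediction step shared by the abstracts of patents 5724449 and 5479536 (consulting the relative frequency of fixed-length option sequences to guess the next option) can be sketched as a simple frequency table. This is only an illustration: the `OptionPredictor` class, its in-memory storage, and the default context length of 2 are assumptions, and the template presentation and stroke interpretation are not modeled.

```python
from collections import Counter, defaultdict


class OptionPredictor:
    """Predict the next option from counts of previously entered sequences."""

    def __init__(self, context_length=2):
        # Assumes context_length >= 1 (the predetermined sequence length minus one).
        self.context_length = context_length
        self.counts = defaultdict(Counter)  # context tuple -> next-option counts
        self.history = []

    def record(self, option):
        """Update the user data with the option that was just entered."""
        context = tuple(self.history[-self.context_length:])
        if len(context) == self.context_length:
            self.counts[context][option] += 1
        self.history.append(option)

    def predict(self, k=3):
        """Return up to k options most often seen after the current context."""
        context = tuple(self.history[-self.context_length:])
        followers = self.counts.get(context, Counter())
        ranked = sorted(followers.items(), key=lambda item: (-item[1], item[0]))
        return [option for option, _ in ranked[:k]]


predictor = OptionPredictor(context_length=2)
for character in "thethathe":
    predictor.record(character)
first_guess = predictor.predict()

predictor.record("t")
predictor.record("h")
second_guess = predictor.predict()
```

In an interface like the one described, the options returned by `predict` would be the ones placed on the template, and each entered option would be passed back through `record` before the next prediction.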