Trigrams Or Digrams Patents (Class 382/230)
-
Patent number: 5930783
Abstract: A computer implemented method for searching and retrieving images contained within a database of images in which both semantic and cognitive methodologies are utilized. The method accepts a semantic and cognitive description of an image to be searched from a user, successively refines the search utilizing semantic and cognitive methodologies, and then ranks the results for presentation to the user.
Type: Grant
Filed: August 29, 1997
Date of Patent: July 27, 1999
Assignee: NEC USA, Inc.
Inventors: Wen-Syan Li, Kasim S. Candan
-
Patent number: 5915039
Abstract: Fixed-pitch, fixed-font characters embedded in a noisy gray-scale image of picture elements (pels) within a complex background can be extracted prior to execution of any recognition operations by first deriving a normalized Boolean-coded image from the gray-scale image. Then, a subset of at least three uncontaminated character triples is formed by filtering the Boolean-coded image. Next, an affine transform is approximated from locations in the Boolean-coded image of at least three noncollinear ones of the uncontaminated character triples. Lastly, the locations in a logical matrix array of all possible character triples are estimated according to the affine transform.
Type: Grant
Filed: November 12, 1996
Date of Patent: June 22, 1999
Assignee: International Business Machines Corporation
Inventors: Raymond Amand Lorie, Jianchang Mao, Kottappuram Mohamedali Mohiuddin
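
The affine step in this abstract can be illustrated with a minimal sketch. It is not the patented procedure: it only shows how six affine parameters follow exactly from three noncollinear point correspondences, here between grid positions (column, row) and image coordinates. The function names `affine_from_three_points` and `apply_affine` are hypothetical.

```python
def affine_from_three_points(src, dst):
    """Solve x' = a*x + b*y + c and y' = d*x + e*y + f from three point pairs.

    src and dst are lists of three (x, y) tuples. Illustrative only; the
    patent approximates the transform from at least three character triples.
    """
    (x1, y1), (x2, y2), (x3, y3) = src
    # Determinant of [[x1, y1, 1], [x2, y2, 1], [x3, y3, 1]]; zero if collinear.
    det = x1 * (y2 - y3) - y1 * (x2 - x3) + (x2 * y3 - x3 * y2)
    if det == 0:
        raise ValueError("source points are collinear")

    def solve(u1, u2, u3):
        # Cramer's rule via the cofactors of the 3x3 matrix above.
        p = (u1 * (y2 - y3) + u2 * (y3 - y1) + u3 * (y1 - y2)) / det
        q = (u1 * (x3 - x2) + u2 * (x1 - x3) + u3 * (x2 - x1)) / det
        r = (u1 * (x2 * y3 - x3 * y2) + u2 * (x3 * y1 - x1 * y3)
             + u3 * (x1 * y2 - x2 * y1)) / det
        return p, q, r

    a, b, c = solve(dst[0][0], dst[1][0], dst[2][0])
    d, e, f = solve(dst[0][1], dst[1][1], dst[2][1])
    return a, b, c, d, e, f


def apply_affine(params, point):
    """Map one (x, y) grid position to image coordinates."""
    a, b, c, d, e, f = params
    x, y = point
    return a * x + b * y + c, d * x + e * y + f


# Three known grid positions and where they were found in the image.
params = affine_from_three_points([(0, 0), (1, 0), (0, 1)],
                                  [(3, -1), (5, -1), (3, 1)])
print(apply_affine(params, (2, 3)))
```

Once the parameters are known, every other cell of the logical grid can be mapped to an estimated image location with `apply_affine`.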
-
Patent number: 5886794
Abstract: A picture encoding apparatus and method in which the quantization step width of the lower hierarchy data, which has a higher resolution than the upper hierarchy data, is determined for each predetermined block of the respective hierarchy data based on the quantization step width determined by the low-resolution upper hierarchy data. The additional code indicating the characteristics of the quantizer can therefore be omitted, so that compression efficiency is improved and the deterioration of picture quality is reduced when picture data is hierarchically encoded.
Type: Grant
Filed: October 22, 1997
Date of Patent: March 23, 1999
Assignee: Sony Corporation
Inventors: Tetsujiro Kondo, Yasuhiro Fujimori, Kunio Kawaguchi
-
Patent number: 5883986
Abstract: A method and system for automatically modifying an original transcription produced as the output of a recognition operation produces a second, modified transcription, such as, for example, automatically correcting an errorful transcription produced by an OCR operation. The invention uses information in an input text image of character images and in an original transcription associated with the input text image to modify aspects of a formal image source model that models as a grammar the spatial image structure of a set of text images. A recognition operation is then performed on the input text image using the modified formal image source model to produce a second, modified transcription. When the original transcription is errorful, the second transcription is a corrected transcription. Several aspects of the formal image source model may be modified; in particular, character templates to be used in the recognition operation are trained in the font of the glyphs occurring in the input text image.
Type: Grant
Filed: June 2, 1995
Date of Patent: March 16, 1999
Assignee: Xerox Corporation
Inventors: Gary E. Kopec, Philip A. Chou, Leslie T. Niles
-
Patent number: 5850480
Abstract: The present invention includes methods of correcting optical character recognition errors occurring during recognition of alphanumeric character strings contained within one or more predetermined types of alphanumeric character fields. The methods may be practiced with a document processing system having (1) an optical character recognition device for scanning documents and outputting bit-map image data; (2) a recognition engine for converting the bit-map image data into possibly correct alphanumeric characters with associated confidence values; and (3) at least one lexicon of character strings consisting of a list of at least a portion of all of the possible character string values for each of the fields being processed. The present invention corrects OCR errors by performing a contextual comparison analysis between the alphanumeric characters outputted from the recognition engine and the lexicon of character strings.
Type: Grant
Filed: May 30, 1996
Date of Patent: December 15, 1998
Assignee: Scan-Optics, Inc.
Inventor: Edward Francis Scanlon
-
Patent number: 5774588
Abstract: A system and method for more efficiently comparing an unverified string to a lexicon, which filters the lexicon through multiple steps to reduce the number of entries to be directly compared with the unverified string. The method begins by preparing the lexicon with an n-gram encoding, partitioning and hashing process, which can be accomplished in advance of any processing of unverified strings. The unknown is compared first by partitioning and hashing it in the same way to reduce the lexicon in a computationally inexpensive manner. This is followed by an encoded vector comparison step, and finally by a direct string comparison step, which is the most computationally expensive. The reduction of the lexicon is accomplished without arbitrarily eliminating any large portions of the lexicon that might contain relevant candidates. At the same time, the method avoids the need to compare the unverified string directly or indirectly with all the entries in the lexicon.
Type: Grant
Filed: June 7, 1995
Date of Patent: June 30, 1998
Assignee: United Parcel Service of America, Inc.
Inventor: Liang Li
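
The cheap-to-expensive filtering idea in this abstract can be sketched roughly as follows. This is a simplified stand-in, not the patented encoding: it assumes length buckets as the partition, bigram counts as the encoded vector, a Dice overlap threshold as the vector comparison, and `difflib.SequenceMatcher` as the direct string comparison. All function names and thresholds are hypothetical.

```python
from collections import Counter
from difflib import SequenceMatcher


def bigrams(s):
    """Bigram counts of a lowercased string, padded so edges count too."""
    padded = f" {s.lower()} "
    return Counter(padded[i:i + 2] for i in range(len(padded) - 1))


def build_index(lexicon):
    """Precompute, ahead of any query: bucket entries by length (assumed partition)."""
    buckets = {}
    for word in lexicon:
        buckets.setdefault(len(word), []).append((word, bigrams(word)))
    return buckets


def lookup(query, buckets, max_len_diff=1, min_overlap=0.5, top=3):
    q = bigrams(query)
    candidates = []
    # Stage 1 (cheap): visit only buckets whose length is close to the query's.
    for length in range(len(query) - max_len_diff, len(query) + max_len_diff + 1):
        for word, vec in buckets.get(length, []):
            # Stage 2 (moderate): Dice coefficient on the bigram vectors.
            shared = sum((q & vec).values())
            dice = 2 * shared / (sum(q.values()) + sum(vec.values()))
            if dice >= min_overlap:
                candidates.append(word)
    # Stage 3 (expensive): direct string comparison on the survivors only.
    def score(word):
        return SequenceMatcher(None, query.lower(), word.lower()).ratio()
    return sorted(candidates, key=score, reverse=True)[:top]


index = build_index(["atlanta", "atlantic", "albany", "austin", "boston"])
print(lookup("atlanra", index))
```

Only the entries that survive the two inexpensive stages reach the direct comparison, which is the point of the staged design.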
-
Patent number: 5724449
Abstract: A pointing device-driven interface apparatus and method for enabling a user to enter data (in the form of options) into a computer system. User data files are maintained which indicate the relative frequency with which the user has entered sequences of options. The sequences are of a predetermined length. The user data is consulted to predict the most likely options to be entered next by the user. These options are presented to the user on a template. A stroke made by the user with the pointing device is then interpreted in light of the template and a syntax of strokes. An action associated with the interpretation is then carried out. An example of the action is communicating a character to a software application, updating the sequence, and repeating the consultation, presentation, interpretation and carrying out just described.
Type: Grant
Filed: June 22, 1995
Date of Patent: March 3, 1998
Assignee: International Business Machines Corporation
Inventor: Liam David Comerford
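
The prediction step described here, consulting frequencies of fixed-length option sequences to suggest what comes next, resembles a simple n-gram predictor. The sketch below is only an illustration under that assumption; the class `NextOptionPredictor` and its parameters are hypothetical and do not come from the patent.

```python
from collections import Counter, defaultdict


class NextOptionPredictor:
    """Counts which option followed each fixed-length context of options."""

    def __init__(self, context_len=2):
        self.context_len = context_len
        self.counts = defaultdict(Counter)  # context tuple -> Counter of next options

    def record(self, history):
        """Update the frequency data from a sequence of entered options."""
        n = self.context_len
        for i in range(len(history) - n):
            context = tuple(history[i:i + n])
            self.counts[context][history[i + n]] += 1

    def predict(self, recent, k=3):
        """Return up to k most frequent options seen after the recent context."""
        context = tuple(recent[-self.context_len:])
        ranked = self.counts.get(context, Counter()).most_common(k)
        return [option for option, _ in ranked]


predictor = NextOptionPredictor(context_len=2)
predictor.record("the then the them the")  # here each character is one option
print(predictor.predict("th"))
```

With `context_len=2` the stored sequences are effectively trigrams: two options of context plus the option that followed.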
-
Patent number: 5706365
Abstract: A system and method provides for indexing and retrieval of stored documents using a decomposition of words in the documents into n-grams, or linear word subunits. The documents are indexed as pages in a number of banks. For each bank there is a bank index. The individual n-grams identified for each page are stored in the bank index. Each bank index further contains an entry map that indicates whether a given n-gram is present in any of the pages of the bank, and then provides an index to a page map that further indicates which page in the bank contains the n-gram. When a search query is input, the query words are decomposed into their n-grams. The query word n-grams are compared first with entry maps to determine if the query word n-grams appear on any page in the bank. If so, the associated page map is traversed to determine which page in the bank contains the query word n-grams. The n-grams on the page are compared with the query word n-grams to determine the presence of a match therebetween.
Type: Grant
Filed: April 10, 1995
Date of Patent: January 6, 1998
Assignee: Rebus Technology, Inc.
Inventors: Vijayakumar Rangarajan, Natarajan Ravichandran
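
A rough sketch of the two-level lookup described above, for a single bank, might look like this. It assumes trigrams and a plain dictionary whose keys play the role of the entry map and whose values play the role of the page map; the actual data layout in the patent is not given in the abstract, and the names `ngrams` and `Bank` are hypothetical.

```python
def ngrams(word, n=3):
    """Set of character n-grams of a word; a short word is kept whole."""
    word = word.lower()
    return {word[i:i + n] for i in range(len(word) - n + 1)} or {word}


class Bank:
    """One bank of pages with a combined entry map and page map (assumed layout)."""

    def __init__(self):
        # Key present -> the n-gram occurs somewhere in the bank (entry map).
        # Value -> ids of the pages that contain it (page map).
        self.page_map = {}

    def add_page(self, page_id, text):
        for word in text.split():
            for gram in ngrams(word):
                self.page_map.setdefault(gram, set()).add(page_id)

    def search(self, query_word):
        """Return ids of candidate pages containing every n-gram of the query word."""
        grams = ngrams(query_word)
        # Entry-map check: if any n-gram is absent from the bank, stop early.
        if not all(gram in self.page_map for gram in grams):
            return set()
        # Page-map traversal: keep pages that hold all of the query n-grams.
        return set.intersection(*(self.page_map[gram] for gram in grams))


bank = Bank()
bank.add_page(1, "optical character recognition")
bank.add_page(2, "image retrieval system")
print(bank.search("character"))
```

The returned pages are candidates only; a final comparison of the page's n-grams against the query, as the abstract describes, would confirm the match.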
-
Patent number: 5687254
Abstract: A method and system provide for searching and matching gesture-based data such as handwriting without performing a recognition process on the handwritten gesture data to convert it to a standard computer-coded form. Target data collected as sample data points of spatial coordinates over time are concatenated into a single target gesture sequence of sample data points. The sample data points comprising the gesture-based data structure to be searched (the corpus) are grouped into corpus gesture sequences for matching against the target gesture sequence. Matching may be done by any suitable method, and a novel signal comparison technique based on dynamic time warping concepts is illustrated. The result of the matching is a list of the locations of the matching corpus gesture sequences in the corpus, which in turn may be used for further processing, such as the display of an image of the matching corpus gestures for a system user.
Type: Grant
Filed: June 6, 1994
Date of Patent: November 11, 1997
Assignee: Xerox Corporation
Inventors: Alex D. Poon, Karon Anne Weber, Todd A. Cass
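
For orientation, the textbook form of dynamic time warping on two sequences of (x, y) sample points is sketched below. This is the classic algorithm, not the novel comparison technique the patent illustrates, and the function name `dtw_distance` is hypothetical.

```python
import math


def dtw_distance(a, b):
    """Classic dynamic time warping cost between two sequences of (x, y) points."""
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best alignment cost of the first i points of a with the first j of b.
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])  # Euclidean distance between samples
            # Extend the cheapest of: advance in a only, in b only, or in both.
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


stroke = [(0, 0), (1, 1), (2, 2)]
slower_stroke = [(0, 0), (0, 0), (1, 1), (2, 2), (2, 2)]  # same path, uneven timing
shifted_stroke = [(0, 5), (1, 6), (2, 7)]
print(dtw_distance(stroke, slower_stroke), dtw_distance(stroke, shifted_stroke))
```

Because the alignment may repeat samples, the same stroke drawn at a different speed scores as identical, which is why this family of techniques suits handwriting captured over time.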
-
Patent number: 5628003
Abstract: A document storage and retrieval system is provided with means for storing a document body in the form of an image, means for storing text information in the form of a character code string for retrieval, means for executing a retrieval with reference to the text information, and means for displaying a document image relating thereto on a retrieval terminal according to the retrieval result. Such a system can retrieve the full contents of a document and can also display the document body, printed in an easy-to-read format, directly in the form of an image. Accordingly, users can retrieve documents with arbitrary words and can also read, through a terminal and in the form of an image, even a document complicated by mathematical expressions and charts, the same as on paper. Further, the invention provides a system wherein the text information for retrieval is extracted automatically from the document image through character recognition.
Type: Grant
Filed: August 24, 1993
Date of Patent: May 6, 1997
Assignee: Hitachi, Ltd.
Inventors: Hiromichi Fujisawa, Atsushi Hatakeyama, Yasuaki Nakano, Junichi Higashino, Toshihiro Hananoi
-
Patent number: 5617488
Abstract: A word recognizer system 10 has a probabilistic relaxation process that improves the performance of an image text recognition technique by propagating the influence of word collocation statistics. Word collocation refers to the likelihood that two words co-occur within a fixed distance of one another. The word recognizer 10 receives groups of visually similar decisions (called neighborhoods) for words in a running text. The positions of decisions within the neighborhoods are modified based on how often they co-occur with decisions in the neighborhoods of other nearby words. This process is iterated a number of times, effectively propagating the influence of the collocation statistics across an input text.
Type: Grant
Filed: February 1, 1995
Date of Patent: April 1, 1997
Assignee: The Research Foundation of State University of New York
Inventors: Tao Hong, Jonathan J. Hull
-
Patent number: 5479536
Abstract: A pointing device-driven interface apparatus and method for enabling a user to enter data (in the form of options) into a computer system. User data files are maintained which indicate the relative frequency with which the user has entered sequences of options. The sequences are of a predetermined length. The user data is consulted to predict the most likely options to be entered next by the user. These options are presented to the user on a template. A stroke made by the user with the pointing device is then interpreted in light of the template and a syntax of strokes. An action associated with the interpretation is then carried out. An example of the action is communicating a character to a software application, updating the sequence, and repeating the consultation, presentation, interpretation and carrying out just described.
Type: Grant
Filed: April 25, 1994
Date of Patent: December 26, 1995
Assignee: International Business Machines Corporation
Inventor: Liam D. Comerford