Context Analysis Or Word Recognition (e.g., Character String) Patents (Class 382/229)

Trigrams or digrams (Class 382/230)

Checking spelling for recognition (Class 382/231)

Extracting information from symbolically compressed document images

Patent number: 6658151

Abstract: A method and apparatus for extracting information from symbolically compressed document images. A deciphering module generates first and second text strings by deciphering respective sequences of template identifiers in first and second symbolically compressed document images. A conditional n-gram module receives the first and second text strings from the deciphering module and extracts n-gram terms therefrom based on a predicate condition. A comparison module generates a measure of similarity between the first and second symbolically compressed document images based on the n-gram terms extracted by the conditional n-gram module.

Type: Grant

Filed: April 8, 1999

Date of Patent: December 2, 2003

Assignee: Ricoh Co., Ltd.

Inventors: Dar-Shyang Lee, Jonathan J. Hull
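
The comparison described in the abstract above can be illustrated with a minimal sketch. It assumes character trigrams, a hypothetical predicate that rejects n-grams containing a space, and cosine similarity as the similarity measure; the abstract names none of these choices, and the deciphering of template identifiers is not modelled here.

```python
from collections import Counter
from math import sqrt


def conditional_ngrams(text, n, predicate):
    """Count character n-grams that satisfy a caller-supplied predicate."""
    grams = (text[i:i + n] for i in range(len(text) - n + 1))
    return Counter(g for g in grams if predicate(g))


def cosine_similarity(a, b):
    """Cosine similarity between two n-gram count vectors."""
    dot = sum(a[g] * b[g] for g in a.keys() & b.keys())
    norm = sqrt(sum(v * v for v in a.values())) * sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


# Hypothetical predicate condition: keep only n-grams without whitespace.
def no_space(gram):
    return " " not in gram


doc_a = conditional_ngrams("the quick brown fox", 3, no_space)
doc_b = conditional_ngrams("the quick brown dog", 3, no_space)
score = cosine_similarity(doc_a, doc_b)
```

Here the two strings share seven of their eight retained trigrams, so the score is high but below 1.
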
Movement detecting apparatus with feature point extractor based on luminance gradient in current frame

Patent number: 6650362

Abstract: In connection with the detection of an amount of movement of an image such as a document image having a small gradient of brightness, it has been difficult to detect the amount by a representative point method, and a block matching method requires much time. An image (F1) at a time (T1), which has been imaged by moving an imaging section (1), is taken into a memory (2) and a feature point extracting section (3), where feature points of the image (F1) are extracted, and an image (F2) at a start time (T2) of the subsequent frame is taken in the memory (2) and the feature point extracting section (3). A correlation operating section (5) operates feature points of the image (F2) and the image (F1) in an area designated by a search area deciding section (4) to output an amount of movement.

Type: Grant

Filed: March 18, 1999

Date of Patent: November 18, 2003

Assignee: Sharp Kabushiki Kaisha

Inventors: Yasuhisa Nakamura, Yoshihiro Kitamura, Hiroshi Akagi
Apparatus and method for recognizing character

Patent number: 6643401

Abstract: A character pattern is extracted from image data read from a document, listing, etc., and discriminated between a hand-written character and a typed character by a hand-written/typed character discrimination unit. The hand-written/typed character discrimination unit obtains, from the character pattern, N feature vectors containing a feature indicating at least the complexity and the linearity of the character pattern, and discriminates the character pattern between a hand-written character and a typed character using the feature vectors. A character recognition unit performs a character recognizing process based on the result of discriminating whether the character data is a hand-written character or a typed character. As a feature of the above described character pattern, the variance of line widths, the variance of character positions, etc. can also be used.

Type: Grant

Filed: June 24, 1999

Date of Patent: November 4, 2003

Assignee: Fujitsu Limited

Inventors: Junji Kashioka, Satoshi Naoi
Word-to-word selection on images

Patent number: 6640010

Abstract: An image processing technique for selecting a text region from an image is described. Character and formatting information for each word in the image is used to determine an active region for each word in the image. For a preferred embodiment of the present invention, the character and formatting information is derived during optical character recognition (OCR). A first and last word within a selected text region is identified based on at least one active region associated with at least one word within the selected text region. Using the first and last words within the selected text region, all words within the selected text region are identified. An image of the selected text region may be displayed. Text contained within the selected text region may be copied to an application program.

Type: Grant

Filed: November 12, 1999

Date of Patent: October 28, 2003

Assignee: Xerox Corporation

Inventors: Mauritius Seeger, Christopher R. Dance, Stuart A. Taylor, William M. Newman
Rollup functions and methods

Publication number: 20030190077

Abstract: Methods of organizing a series of sibling data entities in a digital computer are provided for preserving sibling ranking information associated with the sibling data entities and for attaching the sibling ranking information to a joint parent of the sibling data entities to facilitate on-demand generation of ranked parent candidates. A rollup function of the present invention builds a rollup matrix (126) that embodies information about the sibling entities and the sibling ranking information and provides a method for reading out the ranked parent candidates from the rollup matrix in order of their parent confidences (141). Parent confidences are based on the sibling ranking information, either alone or in combination with n-gram dictionary ranking or other ranking information.

Type: Application

Filed: April 8, 2003

Publication date: October 9, 2003

Applicant: RAF Technology, Inc.

Inventors: David Justin Ross, Stephen E.M. Billester, Brent R. Smith
Word-to-word selection on images

Publication number: 20030185448

Abstract: An image processing technique for selecting a text region from an image is described. Character and formatting information for each word in the image is used to determine an active region for each word in the image. For a preferred embodiment of the present invention, the character and formatting information is derived during optical character recognition (OCR). A first and last word within a selected text region is identified based on at least one active region associated with at least one word within the selected text region. Using the first and last words within the selected text region, all words within the selected text region are identified. An image of the selected text region may be displayed. Text contained within the selected text region may be copied to an application program.

Type: Application

Filed: November 12, 1999

Publication date: October 2, 2003

Inventors: Mauritius Seeger, Christopher R. Dance, Stuart A. Taylor, William M. Newman
Method, system, and program for generating a table to determine boundaries between characters

Patent number: 6626960

Abstract: Disclosed is a system, method, and program for generating a table for use by a computer in determining a location of a boundary, such as a word boundary, between two characters in text. A first table indicates a boundary between characters when processing text in a first direction, such as the forward direction. A second table is generated based on the content of the first table. The second table can be used to determine whether one boundary is located between any two consecutive characters processed in a second direction, such as the backward direction.

Type: Grant

Filed: September 1, 1999

Date of Patent: September 30, 2003

Assignee: International Business Machines Corporation

Inventor: Richard Theodore Gillam
Automatic categorization of documents based on textual content

Patent number: 6621930

Abstract: An electronic device automatically classifies documents based upon textual content. Documents may be classified into document categories. Statistical characteristics are gathered for each document category and these statistical characteristics are used as a frame of reference in determining how to classify the document. The document categories may be intersecting or non-intersecting. A neutral category is used to represent documents that do not fit into many of the other specified categories. The statistical characteristics for an input document are compared with those for the document category and for the neutral category in making a determination on how to categorize the document. This approach is extensible, generalizable and efficient.

Type: Grant

Filed: August 9, 2000

Date of Patent: September 16, 2003

Assignee: Elron Software, Inc.

Inventor: Frank Smadja
Method and system for interactive ground-truthing of document images

Publication number: 20030152277

Abstract: A method and a system by which a document image is analyzed for the purposes of establishing a searchable data structure characterizing ground-truthed contents of the document represented by the document image operates by segmenting a document image into a set of image objects, and linking the image objects with fields that store metadata. Image objects identified by segmenting the document image are grouped into subsets. The image objects are grouped according to characteristics suggesting that the image objects may have common ground-truthed metadata. By grouping the image objects into subsets, the image objects may be indexed to facilitate the ground-truthing process. In some embodiments, the index of representative image objects is presented to the user in a table form. A database of image objects with ground-truthed metadata is formed. Interactive tools and processes facilitate ground-truthing based on paired image objects and metadata.

Type: Application

Filed: June 13, 2002

Publication date: August 14, 2003

Applicant: Convey Corporation

Inventors: Floyd Steven Hall, Cameron Telfer Howie
Rollup functions for efficient storage presentation and analysis of data

Patent number: 6597809

Abstract: Methods of organizing a series of sibling data entities in a digital computer are provided for preserving sibling ranking information associated with the sibling data entities and for attaching the sibling ranking information to a joint parent of the sibling data entities to facilitate on-demand generation of ranked parent candidates. A rollup function of the present invention builds a rollup matrix (126) that embodies information about the sibling entities and the sibling ranking information and provides a method for reading out the ranked parent candidates from the rollup matrix in order of their parent confidences (141). Parent confidences are based on the sibling ranking information, either alone or in combination with n-gram dictionary ranking or other ranking information.

Type: Grant

Filed: March 20, 2000

Date of Patent: July 22, 2003

Assignee: RAF Technology, Inc.

Inventors: David Justin Ross, Stephen E. M. Billester, Brent R. Smith
Dynamic programming operation with skip mode for text line image decoding

Patent number: 6594393

Abstract: In a text recognition system, the computational efficiency of a text line image decoding operation is improved by utilizing the characteristic of a graph known as the cut set. The branches of the data structure that represents the image are initially labeled with estimated scores. When estimated scores are used, the decoding operation must perform iteratively on a text line before producing the best path through the data structure. After each iteration, nodes in the best path are re-scored with actual scores. The decoding operation incorporates an operating mode called skip mode.

Type: Grant

Filed: May 12, 2000

Date of Patent: July 15, 2003

Inventors: Thomas P. Minka, Dan S. Bloomberg, Ashok C. Popat
Multimedia information retrieval method, program, record medium and system

Publication number: 20030103675

Abstract: Paired image information and text information correlated to each other are retrieved as information sets. Frequency information on words used in text is extracted from text information in a group of information sets, and text information features are extracted based on frequency information. Text features are used to lay out information sets in a virtual space such that similar pieces of text are located close to each other, and images are displayed at those positions. Further, important words are extracted from those words extracted from text information in a group of information sets, and those words are laid out in the virtual space in the same manner as with information sets and displayed as labels.

Type: Application

Filed: November 27, 2002

Publication date: June 5, 2003

Applicant: Fujitsu Limited

Inventors: Susumu Endo, Yuusuke Uehara, Daiki Masumoto, Syuuichi Shiitani
Predictive keyboard

Patent number: 6573844

Abstract: Predictive keyboards, such as predictive soft keyboards, are disclosed. In one embodiment, a computer-implemented method predicts at least one key to be entered next within a sequence of keys. The method displays a soft keyboard where the predicted keys are displayed on the soft keyboard differently than the other keys on the keyboard. For example, the predicted keys may be larger in size on the soft keyboard as compared to the other keys. This makes the predicted keys more easily typed by a user as compared to the other keys.

Type: Grant

Filed: January 18, 2000

Date of Patent: June 3, 2003

Assignee: Microsoft Corporation

Inventors: Daniel Venolia, Joshua Goodman, Xuedong Huang, Hsiao-Wuen Hon
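
A minimal sketch of the idea in the abstract above: predict the next key and give predicted keys a larger size. The character-bigram model, the `top_k` cutoff, and the size factors are assumptions made for illustration; the abstract does not say how the prediction is made.

```python
from collections import Counter, defaultdict


def train_bigrams(corpus):
    """Count which character follows each character in the training text."""
    following = defaultdict(Counter)
    for prev, nxt in zip(corpus, corpus[1:]):
        following[prev][nxt] += 1
    return following


def key_sizes(following, typed, keys, top_k=3, base=1.0, enlarged=1.5):
    """Return a relative size per key, enlarging the top_k predicted next keys."""
    predicted = set()
    if typed:
        predicted = {k for k, _ in following[typed[-1]].most_common(top_k)}
    return {k: (enlarged if k in predicted else base) for k in keys}


model = train_bigrams("the then there that this")
sizes = key_sizes(model, "t", "abcdefghijklmnopqrstuvwxyz", top_k=1)
```

With this toy corpus, "h" is the most frequent successor of "t", so only the "h" key is enlarged.
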
Method for conducting and categorizing data

Publication number: 20030099402

Abstract: A method of analyzing a verbatim text comprising the steps of storing the verbatim text in an electronic memory device and identifying at least one concept in said verbatim text and linking said concept to a code.

Type: Application

Filed: March 11, 2002

Publication date: May 29, 2003

Inventor: Charles M. Baylis
Method and apparatus for compressing data string

Patent number: 6563956

Abstract: The present invention provides a data compression method in which a plurality of consecutive characters of a data string to be compressed are set as a character string to be searched for. Bits of a bit string representing the set character string are allocated to at least two codewords. Thus, first and second searching codewords are generated. These first and second codewords are used as array addresses. First and second array tables are prepared, in which information on the past occurrence positions of the set character string is previously entered as the contents thereof. When the first and second codewords are generated from the character string to be compressed, the first and second array tables are looked up by using these codewords as the addresses of the arrays. When results of looking up these tables match with each other, it is found that the set character string occurred in the past.

Type: Grant

Filed: July 7, 1999

Date of Patent: May 13, 2003

Assignee: Fujitsu Limited

Inventors: Noriko Satoh, Shigeru Yoshida
Image-evaluation method, image-evaluation system, and image-evaluation-processing program

Publication number: 20030086618

Abstract: Sensibility-based evaluation of an image, which previously relied on a person's sensibility and manual work, is performed automatically.

Type: Application

Filed: July 11, 2002

Publication date: May 8, 2003

Applicant: Seiko Epson Corporation

Inventor: Michihiro Nagaishi
Image-layout evaluation method, image-layout evaluation system, and image-layout-evaluation-processing program

Publication number: 20030086619

Abstract: The layout of an image, which previously relied on a person's sensibility and manual work, is optimized automatically.

Type: Application

Filed: July 11, 2002

Publication date: May 8, 2003

Applicant: Seiko Epson Corporation

Inventor: Michihiro Nagaishi
Feed forward feed back multiple neural network with context driven recognition

Patent number: 6560360

Abstract: A recognition system is disclosed, including a representation of an object in terms of its constituent parts that is translationally invariant, and which provides scale invariant recognition. The system further provides effective recognition of patterns that are partially present in the input signal, or that are partially occluded, and also provides an effective representation for sequences within the input signal. The system utilizes dynamically determined, context based expectations, for identifying individual features/parts of an object to be recognized. The system is computationally efficient, and capable of highly parallel implementation, and further includes a mechanism for improving the preprocessing of individual sections of an input pattern, either by applying one or more preprocessors selected from a set of several preprocessors, or by changing the parameters within a single preprocessor.

Type: Grant

Filed: January 27, 2000

Date of Patent: May 6, 2003

Assignees: Nestor, Inc., Brown University Research Foundation

Inventors: Predrag Neskovic, Douglas L. Reilly, Leon N Cooper
Image processing apparatus and method and storage medium

Patent number: 6556713

Abstract: A search result of a search target object is displayed at a high speed. By dividing an image into a plurality of areas and allocating attribute information to each area, only the area including the attribute information showing the search target object is searched and is displayed or transmitted, so that a part of a desired image can be extracted at a high speed.

Type: Grant

Filed: July 30, 1998

Date of Patent: April 29, 2003

Assignee: Canon Kabushiki Kaisha

Inventors: Yuji Kobayashi, Kentaro Matsumoto
Method of recognizing characters

Patent number: 6549662

Abstract: Characters of data on a document are recognized by automatically determining the definitions of characters of the data from the arrangement of character strings of the data. Character strings on the document are extracted by reading the document, and headers and data on the document are distinguished from each other by determining the positional relationship between the character strings. Character attributes of the data are determined by recognizing characters of the character strings of the headers using a header recognition dictionary. Characters of the character strings of the data are recognized according to the determined character attributes of the data. Since character attributes of the data are determined from recognized characters of the headers after the headers and the data are distinguished from each other from the layout on the document, it is possible to enter automatically the character attributes of the data.

Type: Grant

Filed: May 27, 1998

Date of Patent: April 15, 2003

Assignee: Fujitsu Limited

Inventors: Katsutoshi Kobara, Shinichi Eguchi, Yoshihiro Nagano, Hideki Matsuno, Koichi Chiba, Yutaka Katsumata
Magnification of information with user controlled look ahead and look behind contextual information

Publication number: 20030068088

Abstract: A mechanism is provided for magnifying information with contextual information. The user may configure the magnification mechanism to present some contextual information along with the focus being magnified. Particularly, a user may set “look ahead” and “look behind” parameters to specify a number of words or characters to include before and after the magnified word or words. The actual magnified word or words may be distinguished from the contextual information. For example, the word or words being magnified may be magnified to a size that is larger than that of the contextual information. The magnification mechanism may also present a magnified display of image information.

Type: Application

Filed: October 4, 2001

Publication date: April 10, 2003

Applicant: International Business Machines Corporation

Inventors: Janani Janakiraman, Rabindranath Dutta
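
The "look ahead" and "look behind" parameters in the abstract above can be sketched as a simple window around the focus word. The function names are hypothetical, and upper case stands in for the larger font size of the magnified word, since a plain-text example cannot change sizes.

```python
def magnified_view(words, focus, look_behind=1, look_ahead=1):
    """Return (context_before, focus_word, context_after) around words[focus]."""
    start = max(0, focus - look_behind)
    end = min(len(words), focus + 1 + look_ahead)
    return words[start:focus], words[focus], words[focus + 1:end]


def render(words, focus, look_behind=1, look_ahead=1):
    """Mark the focus word in upper case as a stand-in for a larger font size."""
    before, word, after = magnified_view(words, focus, look_behind, look_ahead)
    return " ".join(before + [word.upper()] + after)


text = "a user may set look ahead and look behind parameters".split()
line = render(text, focus=4, look_behind=2, look_ahead=1)
```

The window is clipped at the start and end of the text, so a focus on the first word simply has no look-behind context.
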
Data compressing apparatus, reconstructing apparatus, and its method

Patent number: 6542640

Abstract: A dictionary in which character trains serving as processing units for compression have been registered is stored in a character train dictionary storing unit. A character train comparing unit compares the registered character trains in the character train dictionary storing unit with partial character trains in the non-compressed data, thereby detecting coincident partial character trains. A code output unit allocates a predetermined code to each partial character train detected by the character train comparing unit and outputs it. The character train dictionary storing unit allocates fixed-length character train codes of 17 bits to about 130,000 words and compresses the data amount to substantially half or less, irrespective of the amount of document data.

Type: Grant

Filed: June 18, 1998

Date of Patent: April 1, 2003

Assignee: Fujitsu Limited

Inventors: Takashi Morihara, Yahagi Hironori, Satoh Noriko
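
The fixed-length coding in the abstract above rests on simple arithmetic: 17 bits give 2^17 = 131,072 distinct codes, enough for about 130,000 words. The following is a minimal sketch under stated assumptions: whitespace-separated words, a dictionary built from the input itself, and bit packing via Python integers. None of these details come from the patent.

```python
CODE_BITS = 17  # 2**17 = 131,072 distinct codes, enough for about 130,000 words


def build_dictionary(words):
    """Assign each distinct word a fixed-length integer code."""
    vocab = sorted(set(words))
    if len(vocab) > 2 ** CODE_BITS:
        raise ValueError("vocabulary does not fit in 17-bit codes")
    return {w: i for i, w in enumerate(vocab)}, vocab


def compress(words, codes):
    """Pack the 17-bit codes of a non-empty word list into bytes."""
    bits = "".join(format(codes[w], f"0{CODE_BITS}b") for w in words)
    bits += "0" * (-len(bits) % 8)  # pad to a whole number of bytes
    return int(bits, 2).to_bytes(len(bits) // 8, "big")


def decompress(data, count, vocab):
    """Read back `count` 17-bit codes and map them to words."""
    bits = format(int.from_bytes(data, "big"), f"0{len(data) * 8}b")
    return [vocab[int(bits[i * CODE_BITS:(i + 1) * CODE_BITS], 2)] for i in range(count)]


words = "the character train dictionary stores the character train".split()
codes, vocab = build_dictionary(words)
packed = compress(words, codes)
restored = decompress(packed, len(words), vocab)
```

Eight words at 17 bits each occupy 136 bits, or 17 bytes, in this sketch, and decoding recovers the original word sequence.
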
Information processing apparatus and method, and computer readable memory therefor

Patent number: 6539116

Abstract: The structure of entered document image data is analyzed and a character string in a text block that has been analyzed is subjected to pattern recognition. Synonyms and equivalents of words obtained as results of language analysis are extracted and words obtained as results of language analysis are converted to words of another language. A character string in a text block that has been analyzed is translated to another language. At least results of analyzing the structure of document image data, results of character recognition and results of language analysis are stored, and at least one of the results of extraction, results of conversion and results of translation are stored in a RAM in association with the results of character recognition.

Type: Grant

Filed: October 2, 1998

Date of Patent: March 25, 2003

Assignee: Canon Kabushiki Kaisha

Inventor: Makoto Takaoka
System and method for rendering image based data

Patent number: 6539117

Abstract: A communications system for rendering image based data includes a data interface, a display device, and a data manager. The data interface receives image based data that is used by the display device to display an image. The data manager identifies word blocks defined by the received data. The data manager uses the word blocks to define a first row of the image. In this regard, the data manager determines whether images respectively defined by each of the word blocks would be visible if the word blocks are rendered to the first row of the display screen. In response to a determination that an image associated with one of the word blocks would not be visible if the one word block is rendered to the first row of the display screen, the data manager defines a second row and renders the one word block to the second row.

Type: Grant

Filed: April 12, 1999

Date of Patent: March 25, 2003

Assignee: Hewlett-Packard Company

Inventor: Frank P Carau, Sr.
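
The row logic in the abstract above resembles word wrapping: a word block that would not be fully visible on the current row is moved to a new row. A minimal sketch, assuming each word block is described only by a pixel width and that visibility means fitting within a hypothetical `screen_width`:

```python
def layout_rows(block_widths, screen_width):
    """Assign word blocks to rows so that each block is fully visible."""
    rows, current, used = [], [], 0
    for index, width in enumerate(block_widths):
        # Start a new row when the block would extend past the visible width.
        if current and used + width > screen_width:
            rows.append(current)
            current, used = [], 0
        current.append(index)
        used += width
    if current:
        rows.append(current)
    return rows


rows = layout_rows([40, 30, 50, 20, 60], screen_width=100)
```

Each inner list holds the indices of the word blocks rendered on one row. A block wider than the screen still gets a row of its own in this sketch.
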
Radical definition and dictionary creation for a handwriting recognition system

Patent number: 6539113

Abstract: The system described herein automatically defines a set of radicals to be used in a Kanji character handwriting recognition system and automatically creates a dictionary of the Kanji characters that are recognized by the system. In performing its functionality, the system described herein first obtains representative handwriting samples for each Kanji character that is to be recognized by the system. The system described herein then evaluates the samples to identify a set of subparts (“radicals”) that are common to at least two of the Kanji characters. These radicals represent component roots from which the characters are formed. Each Kanji character is formed by one or more of these radicals. The radicals that are identified by the system described herein are not constrained to any preset definition (e.g., the traditional set of radicals used to organize Japanese dictionaries).

Type: Grant

Filed: December 29, 1999

Date of Patent: March 25, 2003

Assignee: Microsoft Corporation

Inventor: Michael Van Kleeck
System and method for evaluating character sets of a message containing a plurality of character sets

Patent number: 6539118

Abstract: An evaluator system accepts input textual messages in unknown languages and assesses which character sets, corresponding to languages, match each message. Textual messages whose individual characters are encoded in 16-bit Unicode or another universal format are parsed, the character sets that can express each character are identified, and the accumulated correspondence is logged. When the character sets against which the message is being tested provide only partial matches, the invention can determine which offers the best fit, including by way of a weighting function. The evaluation technology of the invention can be applied to multipart documents, and to search engines and indices.

Type: Grant

Filed: August 27, 1999

Date of Patent: March 25, 2003

Assignee: International Business Machines Corporation

Inventors: Brendan P. Murray, Kuniaki Takizawa
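
The per-character test described in the abstract above can be approximated with Python's built-in codecs. This is a sketch, not the patented method: it scores each candidate character set by the fraction of characters it can encode and picks the highest score, and it leaves out the weighting function the abstract mentions. The candidate list is an arbitrary choice for illustration.

```python
def charset_scores(message, charsets):
    """Fraction of characters in `message` that each charset can encode."""
    scores = {}
    for name in charsets:
        hits = 0
        for ch in message:
            try:
                ch.encode(name)
                hits += 1
            except UnicodeEncodeError:
                pass  # this charset cannot express the character
        scores[name] = hits / len(message) if message else 0.0
    return scores


scores = charset_scores("Grüße aus 東京都", ["ascii", "latin-1", "shift_jis"])
best = max(scores, key=scores.get)
```

No candidate covers the whole message here, so the result is a best partial fit rather than an exact match.
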
Mobile device and transmission system

Publication number: 20030044068

Abstract: The invention relates to a mobile device with a built-in image capture device, and a character recognition function to present the information gathered with the character recognition result. With the mobile device, the character line extraction process is displayed whenever necessary, and the resolution of an image to be inputted for recognition processing is enhanced. Accordingly, it is possible for the operator to select the target character line with ease. In addition, the mobile device has a character recognition ratio improved by the enhancement in resolution.

Type: Application

Filed: July 23, 2002

Publication date: March 6, 2003

Applicant: Hitachi, Ltd.

Inventors: Tatsuhiko Kagehiro, Minenobu Seki, Hiroshi Sako
Character recognition system

Patent number: 6526170

Abstract: A character recognition system is disclosed. A feature extraction parameter storage section 22 stores a transformation matrix for reducing the number of dimensions of feature parameters and a codebook for quantization. An HMM storage section 23 stores the constitution and parameters of a Hidden Markov Model (HMM) for character string expression. A feature extraction section 32 scans a word image supplied from an image storage means from left to right in a predetermined cycle, using a slit sufficiently narrower than the character width, and outputs a feature symbol at each predetermined timing. A matching section 33 matches the feature symbol row against the HMM states that maximize probability, thereby recognizing the character string.

Type: Grant

Filed: December 13, 1994

Date of Patent: February 25, 2003

Assignee: NEC Corporation

Inventor: Shinji Matsumoto
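
Matching a feature symbol sequence to the most probable HMM state sequence is commonly done with the Viterbi algorithm. The abstract above does not name the algorithm, so the following is a generic sketch; the two-state left-to-right model, its probabilities, and the binary feature symbols are invented for illustration.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state sequence for the observed symbols."""
    # best[s] = (probability, path) of the best path ending in state s
    best = {s: (start_p[s] * emit_p[s][observations[0]], [s]) for s in states}
    for symbol in observations[1:]:
        step = {}
        for s in states:
            prob, path = max(
                ((best[p][0] * trans_p[p][s], best[p][1]) for p in states),
                key=lambda item: item[0],
            )
            step[s] = (prob * emit_p[s][symbol], path + [s])
        best = step
    return max(best.values(), key=lambda item: item[0])[1]


# Hypothetical left-to-right model: character "a" followed by character "b".
states = ["a", "b"]
start_p = {"a": 1.0, "b": 0.0}
trans_p = {"a": {"a": 0.5, "b": 0.5}, "b": {"a": 0.0, "b": 1.0}}
emit_p = {"a": {0: 0.9, 1: 0.1}, "b": {0: 0.2, 1: 0.8}}
path = viterbi([0, 0, 1, 1], states, start_p, trans_p, emit_p)
```

Each observation stands for one feature symbol emitted as the slit moves across the word image, and the decoded path labels each slit position with a character state.
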
System for drawing patent map using technical field word and method therefor

Publication number: 20030026459

Abstract: A system and a method for drawing a patent map using a technical field word are disclosed. In the system and the method, a word to be used for drawing a patent map is extracted by calculating weight values of significant words which are gotten by removing unnecessary words from patent data, and this extracted word is matched with a patent to draw the patent map.

Type: Application

Filed: November 29, 2001

Publication date: February 6, 2003

Inventors: Jeong Wook Won, Hyoung Bok Lee, Jai Sang Koh
Word recognition device and method

Patent number: 6512851

Abstract: A word recognition device uses an associative memory to store a plurality of coded words in such a way that a weight is associated with each character of the alphabet of the stored words, wherein equal weights correspond to equal characters. To perform the recognition, a dictionary of words is first chosen; this is stored in the associative memory according to a pre-determined code; a string of characters which correspond to a word to be recognized is received; a sequence of weights corresponding to the string of characters received is supplied to the associative memory; the distance between the word to be recognized and at least some of the stored words is calculated in parallel as the sum of the difference between the weights of each character of the word to be recognized and the weights of each character of the stored words; the minimum distance is identified; and the word stored in the associative memory having the minimum distance is stored.

Type: Grant

Filed: October 9, 2001

Date of Patent: January 28, 2003

Assignee: STMicroelectronics S.r.l.

Inventors: Loris Navoni, Roberto Canegallo, Mauro Chinosi, Giovanni Gozzini, Alan Kramer, Pierluigi Rolandi
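
The distance computation in the abstract above can be sketched in software, although the patent performs it in parallel inside an associative memory. Assumptions in this sketch: weights are positions in the alphabet, the per-character difference is taken as an absolute value, and only dictionary words of the same length as the input are compared.

```python
def encode(word, weights):
    """Map each character to its numeric weight (equal characters, equal weights)."""
    return [weights[ch] for ch in word]


def distance(a, b):
    """Sum of absolute weight differences, character by character."""
    return sum(abs(x - y) for x, y in zip(a, b))


def recognize(candidate, dictionary, weights):
    """Return the same-length dictionary word with minimum distance."""
    target = encode(candidate, weights)
    same_length = [w for w in dictionary if len(w) == len(candidate)]
    return min(same_length, key=lambda w: distance(target, encode(w, weights)))


# Hypothetical weighting: position in the alphabet.
weights = {ch: i for i, ch in enumerate("abcdefghijklmnopqrstuvwxyz")}
word = recognize("wird", ["word", "ward", "wind", "cord"], weights)
```

With alphabet-position weights the misrecognized string "wird" is closest to "wind" (distance 4) rather than "word" (distance 6), which shows how strongly the result depends on the chosen weight assignment.
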
Image analyzing method for detecting significant changes in a time sequence of images

Publication number: 20030016874

Abstract: An image analysis method designed to identify images in a sequence of images that are statistically different in a pre-selected region of interest. The method is suitable when there is no a priori knowledge of the nature of the interesting images. A reference image is used to identify specific regions of the image that may contain interesting changes (Detect Zone), that will not have interesting changes, but can be used to assess image quality (Veto zone), and an unanalyzed region (Ignore zone). To improve the spatial sensitivity, the Detect and Veto zones can be divided into specific cells. The analysis may also be performed on compressed data and another method automatically classifies a cell as either in the Detect zone or Ignore zone. The sensitivity can be further improved by removing periodic feature variation prior to the statistics calculation.

Type: Application

Filed: May 31, 2001

Publication date: January 23, 2003

Inventors: Kenneth A. Lefler, Wayne L. Kilmer, Yi Zhang
Apparatus and method for retrieving character string based on classification of character

Patent number: 6507678

Abstract: A character string retrieval apparatus classifies a plurality of characters following a prefix of a registration character string into a plurality of groups, and registers those following characters in an array structure using a different displacement amount for each group. The character string retrieval apparatus retrieves a given character string based on the displacement amount of a group corresponding to an input character.

Type: Grant

Filed: February 8, 1999

Date of Patent: January 14, 2003

Assignee: Fujitsu Limited

Inventor: Hironori Yahagi
Data compression method and data compression apparatus

Publication number: 20020196166

Abstract: The present invention provides a data compression method in which a plurality of consecutive characters of a data string to be compressed are set as a character string to be searched for. Bits of a bit string representing the set character string are allocated to at least two codewords. Thus, first and second searching codewords are generated. These first and second codewords are used as array addresses. First and second array tables are prepared, in which information on the past occurrence positions of the set character string is previously entered as the contents thereof. When the first and second codewords are generated from the character string to be compressed, the first and second array tables are looked up by using these codewords as the addresses of the arrays. When results of looking up these tables match with each other, it is found that the set character string occurred in the past.

Type: Application

Filed: August 29, 2002

Publication date: December 26, 2002

Applicant: Fujitsu Limited

Inventors: Noriko Satoh, Shigeru Yoshida
Document imaging and indexing system

Publication number: 20020176628

Abstract: A document digitizing method digitizes and automatically indexes documents in printed form. The method includes optically scanning the document, forming and storing a digitized image file from the optically scanned document, optically recognizing characters in the optically scanned document, and forming and storing a text file of the optically recognized characters in the document. A retrieval method for retrieving the digitized image file for a document includes searching the text files to identify any having a selected text string and providing access to the digitized image files that correspond to those text files. The digital image file and the text file together represent a digitized document data structure that combines a digital image of a document with a text file of optically recognized characters in the digital image.

Type: Application

Filed: May 22, 2001

Publication date: November 28, 2002

Inventor: Gary K. Starkweather
Recognizer of text-based work

Publication number: 20020172425

Abstract: Described herein is a technology for recognizing the content of text documents. The technology determines one or more hash values for the content of a text document. Alternatively, the technology may generate a “sifted text” version of a document. In one implementation described herein, document recognition is used to determine whether the content of one document is copied (i.e., plagiarized) from another document. This is done by comparing hash values of documents (or alternatively their sifted text). In another implementation described herein, document recognition is used to categorize the content of a document so that it may be grouped with other documents in the same category. This abstract itself is not intended to limit the scope of this patent. The scope of the present invention is pointed out in the appending claims.

Type: Application

Filed: April 24, 2001

Publication date: November 21, 2002

Inventors: Ramarathnam Venkatesan, Michael Malkin
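
The abstract of publication 20020172425 compares hash values of documents to detect copied content but does not say how the hashes are computed. The sketch below substitutes a common stand-in technique, hashing overlapping word windows ("shingles") and measuring set overlap; the normalization, the window size `k`, and the function names are assumptions, not the applicants' method.

```python
import hashlib
import re

def shingle_hashes(text, k=3):
    """Hash every window of k consecutive words (assumed normalization: lowercase, alphanumerics only)."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return {
        hashlib.sha1(" ".join(words[i:i + k]).encode()).hexdigest()
        for i in range(max(len(words) - k + 1, 1))
    }

def overlap(a, b, k=3):
    """Jaccard overlap of the two hash sets: 1.0 for identical content, 0.0 for disjoint."""
    ha, hb = shingle_hashes(a, k), shingle_hashes(b, k)
    return len(ha & hb) / len(ha | hb)
```

A high overlap between two documents would flag one as a likely copy of the other; the threshold for "high" is left to the caller.
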
Systems and methods for rendering image-based data

Publication number: 20020164079

Abstract: Systems and methods for rendering image-based data are disclosed. A representative system includes a data interface that receives a remotely-generated data stream; a data manager coupled to the data interface, the data manager configured to translate the remotely-generated data stream into a plurality of word blocks, wherein the data manager determines for each word block of interest whether an active line can accommodate an entire word block of interest prior to registering the word block with the active line and wherein the data manager increments the active line in response to a determination that the word block of interest would not be accommodated on the active line; and a display device coupled to the data manager, the display device configured to render the plurality of word blocks.

Type: Application

Filed: June 25, 2002

Publication date: November 7, 2002

Inventor: Frank P. Carau
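
The line-filling rule in the abstract of publication 20020164079 (check whether the active line can hold the whole word block, otherwise increment the active line) resembles greedy line wrapping. A minimal sketch follows, assuming each word block is reduced to a width and the display has a fixed line width; `layout_word_blocks` is a hypothetical name.

```python
def layout_word_blocks(widths, line_width):
    """Assign word-block indices to lines, never splitting a block across lines."""
    lines = [[]]
    used = 0  # width already consumed on the active line
    for index, width in enumerate(widths):
        # Move to a new active line if the whole block would not fit.
        # An empty line always accepts the block, even an oversized one.
        if lines[-1] and used + width > line_width:
            lines.append([])
            used = 0
        lines[-1].append(index)
        used += width
    return lines
```
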
Apparatus for searching document images using a result of character recognition

Publication number: 20020154817

Abstract: A document image search apparatus generates a text by performing the character recognition of a document image and determines a re-process scope. Then, the apparatus generates a candidate character lattice from the re-recognition result of the re-process scope, generates character strings from the candidate character lattice and adds the character strings to the text. Then, the apparatus performs index search using the text with the character strings added.

Type: Application

Filed: September 12, 2001

Publication date: October 24, 2002

Applicant: Fujitsu Limited

Inventors: Yutaka Katsuyama, Satoshi Naoi, Fumihito Nishino
Document image search device and recording medium having document search program stored thereon

Patent number: 6470336

Abstract: A document search device searches for a keyword in a recognition result obtained by character recognition performed on a document image. The keyword includes at least one first character, and a character code is assigned to each of the at least one first character. The recognition result includes at least one second character, and a character code and a partial area of the document image are assigned to each of the at least one second character.

Type: Grant

Filed: August 23, 2000

Date of Patent: October 22, 2002

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Yoshihiko Matsukawa, Taro Imagawa, Kenji Kondo, Tsuyoshi Mekata
EXTRACTING INFORMATION FROM SYMBOLICALLY COMPRESSED DOCUMENT IMAGES

Publication number: 20020150300

Abstract: A method and apparatus for extracting information from symbolically compressed document images. A deciphering module generates first and second text strings by deciphering respective sequences of template identifiers in first and second symbolically compressed document images. A conditional n-gram module receives the first and second text strings from the deciphering module and extracts n-gram terms therefrom based on a predicate condition. A comparison module generates a measure of similarity between the first and second symbolically compressed document images based on the n-gram terms extracted by the conditional n-gram module.

Type: Application

Filed: April 8, 1999

Publication date: October 17, 2002

Inventors: DAR-SHYANG LEE, JONATHAN J. HULL
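
The conditional n-gram idea in the abstract of publication 20020150300 can be sketched as follows: keep only the characters that satisfy a predicate, form n-grams from the kept characters, and compare the resulting term counts. The specific predicate (`follows_space`), the bigram default, and the overlap-based similarity measure are assumptions chosen for illustration; the abstract names none of them.

```python
from collections import Counter

def conditional_ngrams(text, n, predicate):
    """Count n-grams built only from characters that satisfy the predicate."""
    kept = [ch for i, ch in enumerate(text) if predicate(text, i)]
    return Counter("".join(kept[i:i + n]) for i in range(len(kept) - n + 1))

def follows_space(text, i):
    """Assumed predicate: the character comes directly after a space."""
    return i > 0 and text[i - 1] == " " and text[i] != " "

def similarity(a, b, n=2):
    """Shared n-gram count divided by combined n-gram count (an assumed measure)."""
    ca = conditional_ngrams(a, n, follows_space)
    cb = conditional_ngrams(b, n, follows_space)
    shared = sum((ca & cb).values())
    total = sum((ca | cb).values())
    return shared / total if total else 0.0
```
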
System for conducting fortune telling and character analysis over internet based on names of any language

Publication number: 20020141644

Abstract: A system for conducting fortune telling and character analysis over Internet based on an entered name of any language is provided. The system includes a host, a database, one or more terminals, and a communicating system connecting the host, the database and the terminal to one another. The database is capable of analyzing implied meanings of names of any language entered at the terminal in accordance with traditional Chinese fortune telling theories and thereby providing judgments on good or bad signs possibly represented by the entered names. The implied meanings of names are obtained either from numbers of strokes of the entered names or from meanings or origins of words constituting the entered names.

Type: Application

Filed: March 29, 2001

Publication date: October 3, 2002

Inventor: Tawei Lin
Method and apparatus for forming variant search strings

Patent number: 6459810

Abstract: An exemplary embodiment of the invention is a method for forming variant search strings. The method includes receiving a search string and parsing the search string to locate a mistaken search string character. A mistaken search string character is a character which is confused with other characters. A variant search string is formed in response to a presence of a mistaken search string character in the search string. The search string and variant search string may then be used to search a database. Another exemplary embodiment of the invention is a system for forming variant search strings. The system includes a user interface for receiving a search string. A variant search string generator parses the search string to locate a mistaken search string character. The mistaken search string character is a character which is confused with other characters. The variant search string generator forms a variant search string in response to a presence of a mistaken search string character in the search string.

Type: Grant

Filed: September 3, 1999

Date of Patent: October 1, 2002

Assignee: International Business Machines Corporation

Inventor: Christopher T. Cring
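
Variant search string generation as described in the abstract of patent 6459810 can be sketched by expanding each commonly confused character into its alternatives. The `CONFUSABLE` table below is a hypothetical example; the abstract does not list which characters are treated as mistaken.

```python
from itertools import product

# Hypothetical confusion table: each key maps to itself plus its look-alikes.
CONFUSABLE = {"0": "0O", "O": "O0", "1": "1lI", "l": "l1I", "I": "I1l"}

def variant_search_strings(query):
    """Return the query followed by every variant formed by swapping confusable characters."""
    choices = [CONFUSABLE.get(ch, ch) for ch in query]
    return ["".join(combo) for combo in product(*choices)]
```

The original string is always first in the result, so the whole list can be passed to a database search in order.
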
Diacritical processing for unconstrained, on-line handwriting recognition using a forward search

Patent number: 6453070

Abstract: Handwritten ink is scanned to identify potential diacriticals. A list of diacriticals (19) is generated by traversing the ink. Potential diacritical-containing characters are processed by scoring them with and without a diacritical to generate a first and second score. The first score is compared to the second score in order to make a decision as to which variant of the potential diacritical-containing character produced a highest score. The highest score is used as a score for a theory and the decision is recorded. A data structure (50) is added to the theory. Each data unit in the data structure (50) corresponds to an entry in the list of diacriticals (19). As a new theory is created by propagation, contents of the data structure (50) are copied into the new theory. Thus, the data structure (50) is used to ensure that all handwritten ink is used and is used only once.

Type: Grant

Filed: March 17, 1998

Date of Patent: September 17, 2002

Assignee: Motorola, Inc.

Inventors: Giovanni Seni, John Seybold
Character-recognition pre-processing apparatus and method, and program recording medium

Publication number: 20020126904

Abstract: A character-recognition pre-processing apparatus includes extraction means for extracting an image of a character string to be subjected to character recognition; setting means for setting the smallest rectangle that surrounds the character string image extracted; specifying means for specifying the position of each character within the smallest rectangle set by the setting means; detection means for detecting, at each character position specified, the shortest distance between a character region and the lower edge of the smallest rectangle, and the shortest distance between the character region and the upper edge of the smallest rectangle; and judgment means for judging whether the character string extracted is in an upright state or an inverted state, on the basis of variations in the two shortest distances detected.

Type: Application

Filed: October 3, 2001

Publication date: September 12, 2002

Inventors: Hiroshi Kakutani, Yasuharu Inami
WORD RECOGNIZING APPARATUS FOR DYNAMICALLY GENERATING FEATURE AMOUNT OF WORD AND METHOD THEREOF

Publication number: 20020126903

Abstract: A word recognizing apparatus extracts the feature amount from a given image, and dynamically composes the feature amount of a candidate word to be recognized which is registered in a word list, using feature amounts of characters registered in an individual character dictionary. Then, the apparatus collates the composed feature amount of the word with the feature amount extracted from the image, calculates the degree of similarity between the two feature amounts, and outputs a recognition result.

Type: Application

Filed: May 11, 1999

Publication date: September 12, 2002

Inventors: HIROAKI TAKEBE, YOSHINOBU HOTTA, SATOSHI NAOI
Mathematical expression recognizing device, mathematical expression recognizing method, character recognizing device and character recognizing method

Publication number: 20020126905

Abstract: A mathematical expression recognizing device comprises a character recognition unit which recognizes characters in a document image, a dictionary storing a pair of evaluation scores for each type of word, the score showing the possibility of belonging to the text and that of belonging to the mathematical expression, an evaluation unit which obtains the evaluation scores showing the possibility of belonging to the text and that of belonging to the mathematical expression for each of the words included in the recognized characters with reference to the dictionary, and a mathematical expression detecting unit which searches for an optimal path connecting words by selecting one of the text and the mathematical expression based on a formative grammar and the evaluation scores showing the possibility of belonging to the text and that of belonging to the mathematical expression for each of the words, thereby detecting characters belonging to the mathematical expression.

Type: Application

Filed: March 5, 2002

Publication date: September 12, 2002

Applicant: KABUSHIKI KAISHA TOSHIBA

Inventors: Masakazu Suzuki, Kazuaki Yokota, Yuko Eto
Method of and system for the automatic registration of anatomically corresponding positions for perfusion measurements

Publication number: 20020118866

Abstract: An automatic quantitative analysis method is developed so as to analyze perfusion cardiovascular images. First the image registration per data set is performed so as to compensate for translation and rotation of the target region of interest over the acquisition time. Next a parameter, for example, a maximum intensity projection, is calculated in order to average out misalignments of the target region of interest within each data set. Finally, parameter registration is performed to calculate the co-ordinate translation matrix between the anatomically corresponding pixels within the target region of interest. The co-ordinate translation matrix can also be used to calculate local perfusion values.

Type: Application

Filed: January 29, 2002

Publication date: August 29, 2002

Inventors: Marcel Breeuwer, Marcel Johannes Quist
Word recognition device and method

Patent number: 6442295

Abstract: A word recognition device uses an associative memory to store a plurality of coded words in such a way that a weight is associated with each character of the alphabet of the stored words, wherein equal weights correspond to equal characters. To perform the recognition, a dictionary of words is first chosen; this is stored in the associative memory according to a pre-determined code; a string of characters which correspond to a word to be recognized is received; a sequence of weights corresponding to the string of characters received is supplied to the associative memory; the distance between the word to be recognized and at least some of the stored words is calculated in parallel as the sum of the difference between the weights of each character of the word to be recognized and the weights of each character of the stored words; the minimum distance is identified; and the word stored in the associative memory having the minimum distance is stored.

Type: Grant

Filed: February 12, 1998

Date of Patent: August 27, 2002

Assignee: STMicroelectronics S.r.l.

Inventors: Loris Navoni, Roberto Canegallo, Mauro Chinosi, Giovanni Gozzini, Alan Kramer, Pierluigi Rolandi
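
The minimum-distance matching in the abstract of patent 6442295 can be imitated in software. In the sketch below the weight assignment (alphabet position), the use of absolute differences, and zero-padding for words of unequal length are all assumptions; the patent performs the distance computation in parallel in an associative memory, which this sequential code does not model.

```python
from itertools import zip_longest

def weights(word):
    """Assumed coding: a=1, b=2, ... so equal characters get equal weights."""
    return [ord(ch) - ord("a") + 1 for ch in word.lower()]

def distance(a, b):
    """Sum of absolute weight differences, padding the shorter word with weight 0."""
    return sum(abs(x - y) for x, y in zip_longest(weights(a), weights(b), fillvalue=0))

def recognize(word, dictionary):
    """Return the stored word with the minimum distance to the input string."""
    return min(dictionary, key=lambda stored: distance(word, stored))
```
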
Holistic-analytical recognition of handwritten text

Publication number: 20020114523

Abstract: In a combined holistic and analytic recognition system, the holistic recognition module will recognize an input word or phrase image by matching an input string of character features for the whole word or phrase against a string of prototype features for a plurality of reference words in a lexicon. This will yield a holistic answer list of recognized word or phrase candidates for the input word or phrase along with a confidence value for each answer on the list. At the same time based on each answer in the answer list, the holistic recognition modules will generate a list of character features and segment the character features into sets for each character in an answer. The analytical recognition module uses segmentation hypotheses from the segmented character feature sets to cut the image of the input string of characters into individual character images.

Type: Application

Filed: February 16, 2001

Publication date: August 22, 2002

Inventors: Alexander Filatov, Igor Kil, Arseni Seregin
Method and apparatus for statistical text filtering

Publication number: 20020114524

Abstract: Disclosed herein is a method for automatically filtering a corpus of documents containing textual and non-textual information of a natural language. According to the method, through a first dividing step (101), the document corpus is divided into appropriate portions. At a following determining step (105), for each portion of the document corpus, there is determined a regularity value (VR) measuring the conformity of the portion with respect to character sequences probabilities predetermined for the language considered. At a comparing step (107), each regularity value (VR) is then compared with a threshold value (VT) to decide whether the conformity is sufficient. Finally, at a rejecting step (111), any portion of the document corpus whose conformity is not sufficient is rejected and removed from the corpus. An apparatus for carrying out such a method is also disclosed.

Type: Application

Filed: June 29, 2001

Publication date: August 22, 2002

Applicant: International Business Machines Corporation

Inventor: Hubert Crepy
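
The filtering steps in the abstract of publication 20020114524 (compute a regularity value VR per portion, compare with a threshold VT, reject portions that fall short) can be sketched with a character bigram model. The choice of bigrams, add-one smoothing, the average log-probability as the regularity value, and all function names are assumptions; the abstract only refers to predetermined character sequence probabilities.

```python
import math
from collections import Counter

def train_bigram_model(sample):
    """Collect character bigram counts from a sample of the language (assumed model)."""
    pairs = Counter(zip(sample, sample[1:]))
    firsts = Counter(sample[:-1])
    return pairs, firsts, len(set(sample))

def regularity(portion, model):
    """Average log-probability of the portion's bigrams, with add-one smoothing."""
    pairs, firsts, vocab = model
    if len(portion) < 2:
        return float("-inf")
    total = 0.0
    for a, b in zip(portion, portion[1:]):
        total += math.log((pairs[(a, b)] + 1) / (firsts[a] + vocab))
    return total / (len(portion) - 1)

def filter_corpus(portions, model, threshold):
    """Keep only portions whose regularity value reaches the threshold."""
    return [p for p in portions if regularity(p, model) >= threshold]
```

In practice the threshold would be tuned on held-out text; portions such as tables, markup, or binary residue tend to score well below ordinary prose.
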
Method and apparatus for entering data strings including hangul (Korean) and ASCII characters

Patent number: 6430314

Abstract: Described are methods for entering and editing data strings that are inputted into cellular telephones having a screen. In one method, all basic Hangul consonants and some of the compound Hangul consonants are included in a candidate consonant list and all basic Hangul vowels and some of the compound vowels are included in a candidate vowel list. The candidate consonant and vowel lists are alternatively displayed on a component display region (906) located on the screen. To form a Korean character, a user can select consonant(s) and vowel from the candidate consonant and vowel lists. To form a compound Hangul component that is not included in either the candidate consonant list or the candidate vowel list, the user selects a basic Hangul component as a first part of the compound Hangul component from either the candidate consonant list or the candidate vowel list.

Type: Grant

Filed: January 20, 1999

Date of Patent: August 6, 2002

Assignees: Sony Corporation, Sony Electronics Inc.

Inventor: Soon Ko
