Segmenting Individual Characters Or Words Patents (Class 382/177)

Separating touching or overlapping characters (Class 382/178)

Segmenting hand-printed characters (Class 382/179)

Character recognition system

Patent number: 6526170

Abstract: A character recognition system is disclosed. In a feature extraction parameter storage section 22, a transformation matrix for reducing the number of dimensions of feature parameters and a codebook for quantization are stored. In an HMM storage section 23, the constitution and parameters of a Hidden Markov Model (HMM) for character string expression are stored. A feature extraction section 32 scans a word image given from an image storage means from left to right in a predetermined cycle with a slit whose width is sufficiently smaller than the character width, and thus outputs a feature symbol at each predetermined timing. A matching section 33 matches a feature symbol row and a probability maximization HMM state, thereby recognizing the character string.
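
As an illustration only, the matching step can be pictured as Viterbi decoding: given the row of feature symbols produced by the slit scan, find the HMM state sequence of maximum probability. The sketch below is a generic textbook Viterbi decoder in Python, not the patented implementation. The two states, the binary feature symbols, and every probability in the toy model are assumptions made up for the example, and the symbol sequence is assumed to be non-empty.

```python
import math


def viterbi(symbols, states, log_start, log_trans, log_emit):
    """Return the most probable state path for a non-empty symbol sequence."""
    # best[s] = (log-probability of the best path ending in state s, that path)
    best = {s: (log_start[s] + log_emit[s][symbols[0]], [s]) for s in states}
    for sym in symbols[1:]:
        nxt = {}
        for s in states:
            # Pick the predecessor state that maximizes the path score.
            prev = max(states, key=lambda p: best[p][0] + log_trans[p][s])
            score = best[prev][0] + log_trans[prev][s] + log_emit[s][sym]
            nxt[s] = (score, best[prev][1] + [s])
        best = nxt
    return max(best.values(), key=lambda t: t[0])[1]


def to_log(table):
    """Convert a (possibly nested) dict of probabilities to log-probabilities."""
    return {k: to_log(v) if isinstance(v, dict) else math.log(v)
            for k, v in table.items()}


# Toy model: every value here is hypothetical.
STATES = ["A", "B"]
START = to_log({"A": 0.9, "B": 0.1})
TRANS = to_log({"A": {"A": 0.5, "B": 0.5}, "B": {"A": 0.1, "B": 0.9}})
EMIT = to_log({"A": {0: 0.9, 1: 0.1}, "B": {0: 0.1, 1: 0.9}})

path = viterbi([0, 0, 1, 1], STATES, START, TRANS, EMIT)
print(path)  # ['A', 'A', 'B', 'B']
```

Working in log-probabilities avoids numeric underflow on long symbol rows; a real system would map state paths back to characters, which this sketch leaves out.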

Type: Grant

Filed: December 13, 1994

Date of Patent: February 25, 2003

Assignee: NEC Corporation

Inventor: Shinji Matsumoto
Method and system for automatically segmenting and recognizing handwritten Chinese characters

Patent number: 6519363

Abstract: This invention discloses a method for automatically segmenting and recognizing Chinese character strings continuously written by a user in a handwritten Chinese character processing system, comprising the steps of: creating a geometry model and a language model; finding all potential segmentation schemes in the Chinese character strings continuously written by a user based on the associated timing information and said geometry model; recognizing the groups of strokes as defined by each of the potential segmentation schemes and computing the probability characterizing the exactness of recognition results; correcting the probability characterizing the exactness of recognition results by said language model; and, selecting the recognition result and the corresponding segmentation scheme having the maximum probability value.

Type: Grant

Filed: January 12, 2000

Date of Patent: February 11, 2003

Assignee: International Business Machines Corporation

Inventors: Hui Su, Donald T. Tang, Qian Ying Wang
Scheme for extraction and recognition of telop characters from video data

Patent number: 6501856

Abstract: A scheme for detecting telop character displaying frames in video image which is capable of suppressing erroneous detection of frames without telop characters due to instability of image features is disclosed. In this scheme, each input frame constituting the video data is entered, and whether each input frame is a telop character displaying frame in which telop characters are displayed or not is judged, according to edge pairs detected from each input frame by detecting each two adjacent edge pixels for which intensity gradient directions are opposite on some scanning line used in judging an intensity gradient direction at each edge pixel and for which an intensity difference between said two adjacent edge pixels is within a prescribed range as one edge pair, edge pixels being pixels at which an intensity value locally changes by at least a prescribed amount with respect to a neighboring pixel among a plurality of pixels constituting each input frame.

Type: Grant

Filed: September 28, 2001

Date of Patent: December 31, 2002

Assignee: Nippon Telegraph and Telephone Corporation

Inventors: Hidetaka Kuwano, Hiroyuki Arai, Shoji Kurakake, Kenji Ogura, Toshiaki Sugimura, Minoru Mori, Minoru Takahata
Character segmentation device, character segmentation method used thereby, and program therefor

Publication number: 20020172422

Abstract: Image size converter 4 converts the size of the image data stored in image input part 1 to an arbitrary size and stores the converted data. Image enhancer 5 uses the character frame design data stored in character frame information memory 3 to extract, from the image stored in image size converter 4, an image of a region containing character frames, and enhances and stores this extracted image. Image outline detector 6 forms an outline image from the image obtained by image enhancer 5. Character frame center detector 7 uses the outline image to detect the coordinates of the centers of the character frames of the input image data. Character frame remover 8 uses the character frame center coordinates and the character frame design data to remove the character frames, and outputs the result from character image output part 9.

Type: Application

Filed: May 14, 2002

Publication date: November 21, 2002

Applicant: NEC Corporation

Inventor: Daisuke Nishiwaki
Document image processing device and method thereof

Patent number: 6466694

Abstract: A processing device performs region identification of an input image, and then performs an intra-region recognition process. The type code of each region and the individual code of a recognition result are then displayed, so that a user can modify both of the results of the region identification and the recognition process at one time. Furthermore, the processing device displays an original image close to the recognition result. If no correct answer exists among recognition candidates, code is added to the original image, and the original image with the code added is handled as a recognition result.

Type: Grant

Filed: April 16, 1998

Date of Patent: October 15, 2002

Assignee: Fujitsu Limited

Inventors: Hiroshi Kamada, Katsuhito Fujimoto, Koji Kurokawa
Data visualization apparatuses, computer-readable mediums, computer data signals embodied in a transmission medium, data visualization methods, and digital computer data visualization methods

Patent number: 6466211

Abstract: Data visualization apparatuses, computer-readable mediums, computer data signals embodied in a transmission medium, data visualization methods, and digital computer data visualization methods are provided. According to one aspect of the present invention, a data visualization apparatus includes an image device configured to provide a visual image; and digital processing circuitry coupled with the image device and configured to access data including a plurality of themes, to generate a thematic illustration corresponding to the themes and having a plurality of outer contour lines which are spaced at varying distances relative to a reference line, and to control the image device to depict the thematic illustration.

Type: Grant

Filed: October 22, 1999

Date of Patent: October 15, 2002

Assignee: Battelle Memorial Institute

Inventors: Susan L. Havre, Elizabeth G. Hetzler, Lucy T. Nowell, Paul D. Whitney, Feng Gao, James J. Thomas, Louis M. Martucci, W. Michelle Harris
Method and apparatus for forming variant search strings

Patent number: 6459810

Abstract: An exemplary embodiment of the invention is a method for forming variant search strings. The method includes receiving a search string and parsing the search string to locate a mistaken search string character. A mistaken search string character is a character which is confused with other characters. A variant search string is formed in response to a presence of a mistaken search string character in the search string. The search string and variant search string may then be used to search a database. Another exemplary embodiment of the invention is a system for forming variant search strings. The system includes a user interface for receiving a search string. A variant search string generator parses the search string to locate a mistaken search string character. The mistaken search string character is a character which is confused with other characters. The variant search string generator forms a variant search string in response to a presence of a mistaken search string character in the search string.
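
A minimal sketch of the idea, assuming a small hand-written table of characters that are commonly confused with one another: each confusable character in the search string may be swapped for its look-alike, and every resulting combination other than the original is returned as a variant. The `CONFUSABLE` table and the function name are hypothetical and are not taken from the patent.

```python
from itertools import product

# Hypothetical confusion table; a real system would use a fuller mapping.
CONFUSABLE = {"0": "O", "O": "0", "1": "l", "l": "1"}


def variant_search_strings(query, confusable=CONFUSABLE):
    """Return all variants of query formed by swapping confusable characters."""
    # For each position, allow the original character and, if present,
    # its look-alike.
    choices = [(ch, confusable[ch]) if ch in confusable else (ch,)
               for ch in query]
    variants = {"".join(combo) for combo in product(*choices)}
    variants.discard(query)  # the original string is not a variant of itself
    return sorted(variants)


print(variant_search_strings("A10"))  # ['A1O', 'Al0', 'AlO']
```

The number of variants doubles with each confusable character, so a practical version would likely cap the count or limit substitutions per string.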

Type: Grant

Filed: September 3, 1999

Date of Patent: October 1, 2002

Assignee: International Business Machines Corporation

Inventor: Christopher T. Cring
Apparatus for recognizing characters and a method therefor

Patent number: 6456739

Abstract: A character image is inputted by use of a scanner, and recognized. The resultant character string of such recognition is represented on a display. The image serving as recognition source of the character designated on the display screen thereof, and the image in the vicinity of such image are represented. A character frame, which can discriminate the character image serving as recognition source, is edited in order to designate a new character image. This image and the inputted character information are registered on a character recognition dictionary correspondingly. Thereafter, the character recognition is carried out even with the utilization of such newly registered character. As a result, the recognition rate of the character recognition increases one after another.

Type: Grant

Filed: June 18, 1996

Date of Patent: September 24, 2002

Assignee: Canon Kabushiki Kaisha

Inventor: Hiroaki Ikeda
Method for creating a database such as a dictionary used for a word conversion system

Publication number: 20020118876

Abstract: Character (or letter) information is extracted from source information, word information is extracted from the character information, and a database is created of the word information. Thereby, the created database is adapted for the technical field of the user or a field of interest to the user.

Type: Application

Filed: October 22, 2001

Publication date: August 29, 2002

Inventors: Hidetaka Magoshi, Nobuo Sasaki
Character string recognition apparatus, character string recognizing method, and storage medium therefor

Publication number: 20020114515

Abstract: A key word is first automatically extracted from a character string group to be recognized, and entered. Then, a character is recognized by segmenting an individual character from a character string image to be recognized, and a character string corresponding to the extracted/entered key word is extracted. Then, a word area delimited by a key word is extracted from the character string image, and a word is recognized. Furthermore, a word recognition result is verified, and a final character string recognition result is output.

Type: Application

Filed: December 18, 2001

Publication date: August 22, 2002

Applicant: Fujitsu Limited

Inventors: Yoshinobu Hotta, Katsuhito Fujimoto, Satoshi Naoi, Misako Suwa
Document image assessment system and method

Patent number: 6408094

Abstract: A system and method in accordance with the present invention includes a scanning assembly and a storage device coupled to a programmed computer with a set of instructions for carrying out an assessment of a document image. The system and method operate by: processing the document image to obtain one or more attributes related to the geometrical integrity of the document image; selecting a threshold value from a database for each of the obtained attributes; and then comparing each of the obtained attributes against the threshold value selected for the obtained attribute to determine a difference for each and then evaluating one or more of the differences using predetermined criteria to provide evaluation results of the geometrical integrity of the document image.

Type: Grant

Filed: November 4, 1997

Date of Patent: June 18, 2002

Assignee: Eastman Kodak Company

Inventors: Alexander David Mirzaoff, Thaddeus Francis Pawlicki
Information communications apparatus

Publication number: 20020071606

Abstract: A character recognition section generates character recognition result information resulting from character recognition of image information. An image information cutout section cuts out character recognition image information, corresponding to an area as to which the character recognition is performed, from the image information. A recognition result generation section generates recognition result information which is composed of the character recognition result information and the character recognition image information. A recognition result transmission section transmits the recognition result information to other terminals using electronic mail. As a result, an information communications apparatus of the invention can make transmissions of information to the wide area, without increasing the network load, which is to be used for the determination of whether or not a character recognition has been accurately performed.

Type: Application

Filed: November 29, 2001

Publication date: June 13, 2002

Applicant: MATSUSHITA GRAPHIC COMMUNICATION SYSTEMS, INC.

Inventors: Shinichi Watanabe, Hideki Honma
Keyfact-based text retrieval system, keyfact-based text index method, and retrieval method

Patent number: 6366908

Abstract: A keyfact-based text retrieval method and a keyfact-based text index method that describes the formalized concept of a document by a pair comprising an object that is the head and a property that is the modifier and uses the information described by the pairs as index information for efficient document retrieval. A keyfact-based text retrieval system includes keyfact extracting, keyfact indexing, and keyfact retrieving. The keyfact extracting analyzes a document collection and a query and extracts keywords and keyfacts. The keywords do not have part-of-speech ambiguity and the keyfacts are extracted from the keywords. The keyfact indexing calculates the frequency of the keyfacts and generates a keyfact list of the document collection for a keyfact index structure. The keyfact retrieving receives a keyfact of the query and keyfacts of the document collection and defines a keyfact-based retrieval model in consideration of a weight factor of the keyfact pattern and generates a retrieval result.

Type: Grant

Filed: December 30, 1999

Date of Patent: April 2, 2002

Assignee: Electronics and Telecommunications Research Institute

Inventors: Kyung Taek Chong, Myung-Gil Jang, MiSeon Jun, Se Young Park
Optical character recognition device and method and recording medium

Publication number: 20020012462

Abstract: An image processing method or device invented to reduce the ratio of erroneously recognized non-character elements in optical character recognition (OCR) regarding a color document that includes character images and other types of images, wherein the extracted character image data is checked to determine whether a color change exists in each character image, and wherein if no color change exists, the character image data is converted into character code data, but where a color change does exist, the character image data is not converted into character code data.

Type: Application

Filed: June 4, 2001

Publication date: January 31, 2002

Inventor: Yoko Fujiwara
Handwritten character recognition apparatus

Publication number: 20020009226

Abstract: A handwritten character recognition apparatus has a character string input area of a size that allows a user to hand write a plurality of characters thereon using a stylus. A coordinate detection unit extracts a coordinate string for each stroke that forms the handwritten character string. An input completion judgement unit judges an immediately preceding handwritten character string to be complete if a time difference between a last coordinate of an immediately preceding stroke and a first coordinate of a stroke being input is at least a predetermined time, when the first coordinate of the stroke is detected in a first area of the character string input area. A character segmentation unit segments a stroke string for each character from all the strokes of the previously input hand written character string from which a character recognition unit recognizes each character and outputs a character string which is the recognition result.

Type: Application

Filed: April 19, 2001

Publication date: January 24, 2002

Inventors: Ichiro Nakao, Yoshikatsu Ito
Robust identification code recognition system

Patent number: 6339651

Abstract: A method and system for recognizing the characters on surfaces where alphanumeric identification code (“ID” for short) may be present such as a license plate. The present system is particularly adapted for situations where visual distortions can occur, and utilizes a highly robust method for recognizing the characters of an ID. Multiple character recovery schemes are applied to account for a variety of conditions to ensure high accuracy in identifying the ID. Accuracy is greatly enhanced by taking a comprehensive approach where multiple criteria are taken into consideration before any conclusions are drawn. Special considerations are given to recognizing the ID as a whole and not just the individual characters.

Type: Grant

Filed: February 25, 1998

Date of Patent: January 15, 2002

Assignee: Kent Ridge Digital Labs

Inventors: Qi Tian, Kong Wah Wan, Karianto Leman, Chade Meng Tan, Chun Biao Guo
Character segmentation device and character segmentation system

Patent number: 6327385

Abstract: A character segmentation system for segmenting out a character from a string of characters which are in touch with each other, which is capable of being executed on a small size hardware resource without influence of variation of touching condition due to difference of character font, comprises an image storing unit 110 for storing an electronic image of character string obtained by such means as optical scanning, a partial pattern dictionary 122 for storing partial pattern shapes used as features for specifying fonts of character, a partial pattern detecting unit 121 for extracting areas of the image of character string, which coincide with a partial pattern, a character font determining unit 123 for determining the font of character on the basis of positions of the areas of the image of character string, which coincide with the partial pattern, and the number of the areas, a feature extraction inhibited area dictionary 132 for storing areas in which feature extraction processing for respective fonts of ch

Type: Grant

Filed: November 10, 1998

Date of Patent: December 4, 2001

Assignee: NEC Corporation

Inventor: Masaaki Kamitani
Image processing method and apparatus and storage medium therefor

Patent number: 6327382

Abstract: It is an object of the present invention to appropriately extract areas for character recognition from a color image. It is another object of the present invention to separate and extract characters from a background color in a color image if the background of the manuscript is not white and if the characters are printed in a portion having a color that is not commonly used all over the image. To achieve these objects, this invention binarizes an input color image in a plurality of stages and extracts areas from binary images obtained in each stage to enable areas and text sections to be appropriately extracted despite the unknown colors of the characters and background contained in the input color image.

Type: Grant

Filed: January 29, 1999

Date of Patent: December 4, 2001

Assignee: Canon Kabushiki Kaisha

Inventors: Kitahiro Kaneda, Toshiaki Yagasaki
Character recognition apparatus and method for recognizing characters

Patent number: 6327384

Abstract: In a projection means black pixel histograms of a binary stationary image are generated in both the vertical and the horizontal direction. In a text type judgment means, in accordance with these histograms, a determination is made of whether the image is vertical text or horizontal text. Based on the result of this determination, a pattern block extraction means extracts either a column or a row from the image. The block is further projected and divided into smaller blocks. Then projection is again applied to these divided blocks and patterns are extracted by a pattern extraction means. A judgment is made as to whether or not joining of the extracted patterns is to be performed and, if joining is required, they are joined by a pattern joining means and finally the offsets of all the extracted patterns are calculated, whereupon data (of extracted patterns) are sent to a pattern matching process.
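
The projection step can be sketched roughly as follows, assuming the binary image is a list of rows of 0/1 values with 1 meaning a black pixel. The abstract does not state how the histograms decide between vertical and horizontal text, so the rule used here (the direction whose histogram splits into more inked runs is taken as the line direction) is an assumption for illustration, as are all function names.

```python
def projections(image):
    """Return (row_histogram, column_histogram) of black-pixel counts."""
    rows = [sum(row) for row in image]
    cols = [sum(col) for col in zip(*image)]
    return rows, cols


def inked_runs(profile):
    """Return half-open (start, end) spans of consecutive non-zero bins."""
    runs, start = [], None
    for i, value in enumerate(profile):
        if value and start is None:
            start = i                      # a run of inked bins begins
        elif not value and start is not None:
            runs.append((start, i))        # a blank bin closes the run
            start = None
    if start is not None:
        runs.append((start, len(profile)))
    return runs


def guess_text_direction(image):
    """Assumed heuristic: more blank-separated rows than columns means
    the text lines run horizontally."""
    rows, cols = projections(image)
    if len(inked_runs(rows)) >= len(inked_runs(cols)):
        return "horizontal"
    return "vertical"


# Two short text lines separated by one blank row.
page = [
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 0, 1, 1, 1],
]
print(inked_runs(projections(page)[0]))  # [(0, 1), (2, 3)]
print(guess_text_direction(page))        # horizontal
```

Applying `inked_runs` again to the column histogram of each extracted row gives the smaller blocks mentioned in the abstract; the joining and offset steps are not covered here.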

Type: Grant

Filed: November 13, 1997

Date of Patent: December 4, 2001

Assignee: NEC Corporation

Inventors: Kouichirou Hirao, Keiji Yamada, Takahiro Hongu, Takashi Mochizuki, Mitsutoshi Arai
Method and apparatus for processing mailpieces including means for identifying the location and content of data blocks thereon

Patent number: 6289109

Abstract: An apparatus for determining the location and content of data blocks on a mailpiece includes a computer connected to a structure for obtaining a digital bit map image of an outer surface of a mailpiece. The computer includes structure programmed for: finding each run of a plurality of black bits of each scan line of the bit map image and determining if any bit thereof neighbors at least one black bit of another scan line; combining the found run with each neighboring bit to form a piece; assigning a descriptive value to a block having at least one piece and comparing the descriptive value to a list of values to determine which type of data block the block having the descriptive value is.
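
The run-finding and combining steps resemble run-length connected-component labeling, which can be sketched as below. This is a generic illustration rather than the patented procedure: it assumes the bit map is a list of scan lines of 0/1 values, treats two runs as neighbors only when they lie on adjacent scan lines and share at least one column, and leaves out the later classification of blocks by descriptive value.

```python
from itertools import groupby


def find_runs(row):
    """Return half-open (start, end) column spans of black bits in a scan line."""
    runs, x = [], 0
    for bit, group in groupby(row):
        length = len(list(group))
        if bit:
            runs.append((x, x + length))
        x += length
    return runs


def label_pieces(bitmap):
    """Group runs on adjacent scan lines that share a column into pieces."""
    parent = {}

    def find(run):
        # Union-find lookup with path halving.
        while parent[run] != run:
            parent[run] = parent[parent[run]]
            run = parent[run]
        return run

    prev = []
    for y, row in enumerate(bitmap):
        cur = [(y, start, end) for start, end in find_runs(row)]
        for run in cur:
            parent[run] = run
            for other in prev:
                # Column spans overlap, so the runs touch vertically.
                if run[1] < other[2] and other[1] < run[2]:
                    parent[find(run)] = find(other)
        prev = cur

    pieces = {}
    for run in parent:
        pieces.setdefault(find(run), []).append(run)
    return sorted(sorted(piece) for piece in pieces.values())


bitmap = [
    [1, 1, 0, 0, 1],
    [0, 1, 0, 0, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 0, 0],
]
for piece in label_pieces(bitmap):
    print(piece)
```

Each piece is reported as a list of `(scan_line, start, end)` runs; the example bit map yields three pieces.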

Type: Grant

Filed: December 29, 1993

Date of Patent: September 11, 2001

Assignee: Pitney Bowes Inc.

Inventors: Ronald E. Gocht, Leon A. Pintsov
System for entering handwritten data into computer generated forms

Patent number: 6282315

Abstract: A method for entering data into a computer generated form including field areas of preselected height and width includes the steps of converting handwritten characters of arbitrary height which may be greater than the preselected height formed on the screen to computer generated characters and displaying the computer generated characters within a field area. Additionally, handwritten characters to be entered into several field areas are grouped, converted, and displayed in selected field areas.

Type: Grant

Filed: July 2, 1997

Date of Patent: August 28, 2001

Assignee: Samsung Electronics, Ltd.

Inventor: Monty L. Boyer
Word grouping accuracy value generation

Patent number: 6269188

Abstract: The present invention is a computer-implemented method for calculating word accuracy. Word grouping accuracy values (260) are calculated (212) by using the character accuracy values (250) calculated by an OCR program present in a computer system. The present invention preferably uses these character accuracy values (250) to create a word grouping accuracy value (260). Various methods are employed to calculate the word accuracy (260), including binarizing the character accuracy values (250), modified averaging of the character accuracy values (250), and creating fuzzy visual displays of word grouping accuracy values (260). The calculated word grouping accuracy values (260) are then adjusted based upon known OCR strengths and weaknesses, and based upon comparisons to stored word lists and the application of language rules. In a system with multiple character recognition techniques, the system can compare the accuracy values (260) of different versions of the word groupings to find the most accurate version.
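
Two of the combination strategies named above can be sketched as follows. The abstract does not define "binarizing" or "modified averaging" precisely, so both formulas here are assumptions chosen for illustration: the binarized score is 1.0 only when every character score reaches a threshold, and the blended score pulls the plain mean toward the weakest character. Character scores are assumed to lie between 0 and 1, and the list is assumed to be non-empty.

```python
def binarized_word_accuracy(char_scores, threshold=0.8):
    """Return 1.0 if every character score reaches threshold, else 0.0."""
    return 1.0 if all(score >= threshold for score in char_scores) else 0.0


def blended_word_accuracy(char_scores, weight_on_min=0.5):
    """Blend the mean character score with the lowest character score."""
    mean = sum(char_scores) / len(char_scores)
    return (1 - weight_on_min) * mean + weight_on_min * min(char_scores)


scores = [0.9, 0.9, 0.6]  # hypothetical per-character OCR confidences
print(binarized_word_accuracy(scores))
print(round(blended_word_accuracy(scores), 3))
```

Weighting the minimum reflects the observation that one badly recognized character can make a whole word wrong, which a plain average tends to hide.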

Type: Grant

Filed: March 12, 1998

Date of Patent: July 31, 2001

Assignee: Canon Kabushiki Kaisha

Inventor: Hamadi Jamali
Method for determining boundaries of words in text

Patent number: 6249604

Abstract: A method for determining the boundaries of a symbol or word string within an image, including the steps of determining page orientation, isolating symbol strings from adjacent symbol strings, establishing a set of boundaries or references with respect to which measurements about, or further processing of, the symbol string may be made.

Type: Grant

Filed: February 14, 1994

Date of Patent: June 19, 2001

Assignee: Xerox Corporation

Inventors: Daniel P. Huttenlocher, Peter C. Wayner, Michael J. Hopcroft
Image editing apparatus

Patent number: 6249353

Abstract: The image editing apparatus of the present invention reads a text image having a plurality of character line images along a predetermined direction, makes a histogram expressing the distribution characteristics of said text image, detects a character line image having a predetermined size based on said histogram, and performs an editing process on said character line image having said predetermined size.

Type: Grant

Filed: August 7, 1996

Date of Patent: June 19, 2001

Assignee: Minolta Co., Ltd.

Inventors: Akinori Yoshida, Shigeru Sawada, Takao Fujiwara
Method of reading characters and method of reading postal addresses

Patent number: 6246794

Abstract: A character reading method has enhanced character segmentation accuracy and character string recognition accuracy for reading correctly hand-written addresses on postal matters. The method extracts provisional character patterns from image information of the address character string (step 206), creates a table 219 of tentative character patterns and implements the character classification for the tentative character patterns (step 207), extracts, specifically for characters of the street number portion of the address character string, periphery information (vertical and horizontal lengths, vertical/horizontal length ratio, pattern spacings, etc.) of tentative character patterns (step 212), and segments the character string into characters accurately based on the information (step 215).

Type: Grant

Filed: December 11, 1996

Date of Patent: June 12, 2001

Assignee: Hitachi, Ltd.

Inventors: Tatsuhiko Kagehiro, Masashi Koga, Hiroshi Sako, Hiromichi Fujisawa, Hisao Ogata, Yoshihiro Shima, Shigeru Watanabe, Masato Teramoto
Image processing method and apparatus therefor

Patent number: 6212294

Abstract: An image processor which receives an image and assigns position of an arbitrary pixel in the image. An image block is extracted from the received image, and start and end positions of a designated area of the received image are acquired. A designated image block in the extracted image block is designated for processing in accordance with the start and end positions of the designated area of the received image.

Type: Grant

Filed: February 28, 1997

Date of Patent: April 3, 2001

Assignee: Canon Kabushiki Kaisha

Inventor: Hiroaki Ikeda
Full-text index producing device for producing a full-text index and full-text data base retrieving device having the full-text index

Patent number: 6189006

Abstract: A full-text data base retrieving device retrieves a data base in accordance with a query. A full-text index has character location information representative of location of each of key character sequences of N characters that appear in the data base, where N is a positive integer. A query memory memorizes the query as a retrieval key character sequence. A separating section separates the retrieval key character sequence into a plurality of retrieval key character sequences of N characters to extract contexts as extracted contexts from the retrieval key character sequence in accordance with the retrieval key character sequences. A context classifying section classifies the extracted contexts into classified contexts having the classification numbers, respectively. An index retrieving section retrieves the full-text index in accordance with the sorts of the retrieval key character sequences and the classified contexts to read the character location information as a retrieval result out of the full-text index.
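
The core of an N-character index can be sketched as below: every sequence of N characters in the text is mapped to the positions where it starts, and a query is split into its own N-character sequences whose position lists are intersected after shifting. This is a generic n-gram index for illustration; the context extraction and classification steps described in the abstract are not modeled, and the function names are hypothetical.

```python
from collections import defaultdict


def build_ngram_index(text, n=2):
    """Map each n-character sequence to the list of positions where it starts."""
    index = defaultdict(list)
    for i in range(len(text) - n + 1):
        index[text[i:i + n]].append(i)
    return index


def search(index, query, n=2):
    """Return sorted start positions of query, using only the index."""
    if len(query) < n:
        raise ValueError("query must contain at least n characters")
    grams = [query[i:i + n] for i in range(len(query) - n + 1)]
    candidates = set(index.get(grams[0], []))
    for offset, gram in enumerate(grams[1:], start=1):
        # A match at position p requires gram number `offset` at p + offset.
        candidates &= {pos - offset for pos in index.get(gram, [])}
    return sorted(candidates)


index = build_ngram_index("abcabcab", n=2)
print(search(index, "bca"))  # [1, 4]
```

The value of `n` passed to `search` must match the one used to build the index.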

Type: Grant

Filed: March 2, 1999

Date of Patent: February 13, 2001

Assignee: NEC Corporation

Inventor: Toshikazu Fukushima
Method and apparatus for pre-recognition character processing

Patent number: 6188790

Abstract: An apparatus for recognizing characters read by a reading unit. A circumscribing rectangle of a read character is formed, and the degree of narrowness of that circumscribing rectangle is acquired. Characters having a degree of narrowness that is equal to or greater than a predetermined value are selected and blank areas are added to the circumscribing rectangle to yield a character area with a corrected degree of narrowness. The character is normalized by converting the character area to a specified size, and is recognized based on the normalized character. It is therefore possible to normalize even characters significantly elongated vertically or horizontally for easier recognition and to group their character patterns.
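
A rough sketch of the padding step, under the assumption that the degree of narrowness is the ratio of the longer side of the circumscribing rectangle to the shorter side (the abstract does not give the formula). When the ratio reaches a chosen limit, blank space is added along the short side until the ratio no longer exceeds that limit. The limit value and the function name are hypothetical.

```python
import math


def pad_narrow_box(width, height, max_ratio=2.0):
    """Return (width, height) of the character area after adding blank margins."""
    long_side, short_side = max(width, height), min(width, height)
    if short_side <= 0:
        raise ValueError("width and height must be positive")
    if long_side / short_side < max_ratio:
        return width, height  # not narrow enough to need correction
    # Smallest integer short side that brings the ratio down to max_ratio.
    padded_short = math.ceil(long_side / max_ratio)
    if width < height:
        return padded_short, height
    return width, padded_short


print(pad_narrow_box(2, 20))   # (10, 20): a tall, thin stroke such as "1"
print(pad_narrow_box(10, 12))  # (10, 12): unchanged
```

Normalizing the padded area instead of the tight rectangle keeps a thin character from being stretched to fill the whole normalized square.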

Type: Grant

Filed: February 26, 1997

Date of Patent: February 13, 2001

Assignee: Tottori Sanyo Electric Ltd.

Inventors: Takatoshi Yoshikawa, Hiromitsu Kawajiri, Hiroshi Horii, Junji Tanaka
Device and record medium for pattern matching encoding/decoding of binary still images

Patent number: 6185330

Abstract: A pattern matching encoding device for executing pattern matching encoding of binary still images and a pattern matching decoding device corresponding to the pattern matching encoding device are proposed. In the pattern matching encoding, each input pattern extracted from the input image is matched against library patterns in the library, and the input pattern is encoded using a matched library pattern as a reference pattern if the matched library pattern is found. The pattern matching encoding device comprises a pattern segmentation section for segmenting each of selected library patterns and the input pattern into two or more parts and thereby generating segmented library patterns and segmented input patterns, a matching section for matching each of the segmented input patterns against corresponding segmented library patterns, and a pattern combination section for generating a new library pattern by combining the segmented library patterns each of which has matched one of the segmented input patterns.

Type: Grant

Filed: March 18, 1998

Date of Patent: February 6, 2001

Assignee: NEC Corporation

Inventor: Kouichirou Hirao
Character recognition using candidate frames to determine character location

Patent number: 6185338

Abstract: A character recognition method for recognizing characters on an article having multiple character-bearing areas, such as a license plate, first involves obtaining image data from an image of the article. The method then assigns at least one parameter to a selected character-bearing area on the article. The method then attempts to obtain a correct frame which expresses the correct positional relationship between the selected character-bearing area on the article with other character-bearing areas of the article, and then uses that correct frame to perform character recognition with respect to each of the character-bearing areas of the article. To obtain the correct frame, the invention compares the image data of the article with plural candidate frames. The plural candidate frames are calculated using the predetermined positional correlation between (1) the selected character-bearing area [as represented by the at least one parameter] and (2) other character-bearing areas of the article.

Type: Grant

Filed: March 21, 1997

Date of Patent: February 6, 2001

Assignee: Sharp Kabushiki Kaisha

Inventor: Mitsuaki Nakamura
Closest word algorithm

Patent number: 6137906

Abstract: A reading system includes a computer and a mass storage device including software comprising instructions for causing a computer to accept an image file generated from optically scanning an image of a document. The software converts the image file into a converted text file that includes text information, and positional information associating the text with the position of its representation in the image file. The reading system therefore has the ability to display the image representation of the scanned image on a computer monitor and permit a user to control operation of the reader with respect to the displayed image representation of the document by using the locational information associated with the converted text file. Also described are techniques for dual highlighting spoken text and a technique for determining the nearest word to a position selected by use of a mouse or other pointing device operating on the image representation as displayed on the monitor.
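
One simple way to find the nearest word to a selected position, given positional information for each recognized word, is to measure the distance from the point to each word's bounding box and take the minimum. The sketch below assumes words are stored as `(text, left, top, right, bottom)` tuples in image coordinates; that layout, and the distance measure, are assumptions for illustration and not the method claimed in the patent.

```python
def closest_word(words, x, y):
    """Return the text of the word whose bounding box is nearest to (x, y)."""
    def distance_squared(word):
        _, left, top, right, bottom = word
        # Distance is zero along an axis when the point lies within the box span.
        dx = max(left - x, 0, x - right)
        dy = max(top - y, 0, y - bottom)
        return dx * dx + dy * dy

    return min(words, key=distance_squared)[0]


# Hypothetical word boxes from a converted text file.
words = [("Hello", 0, 0, 50, 10), ("world", 60, 0, 110, 10)]
print(closest_word(words, 57, 5))   # world
print(closest_word(words, 20, 40))  # Hello
```

A click inside a box gives distance zero, so the word under the pointer always wins over its neighbors.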

Type: Grant

Filed: June 27, 1997

Date of Patent: October 24, 2000

Assignee: Kurzweil Educational Systems, Inc.

Inventor: Mark S. Dionne
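
The nearest-word lookup mentioned in the abstract of patent 6137906 could be sketched roughly as follows. This is only an illustration, not the patented method: the `WordBox` type, the function names, and the use of Euclidean distance to each word's bounding box are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class WordBox:
    """Hypothetical record: recognized text plus its bounding box in image coordinates."""
    text: str
    left: float
    top: float
    right: float
    bottom: float


def distance_to_box(x: float, y: float, box: WordBox) -> float:
    """Euclidean distance from a point to a box (0 if the point lies inside it)."""
    dx = max(box.left - x, 0.0, x - box.right)
    dy = max(box.top - y, 0.0, y - box.bottom)
    return (dx * dx + dy * dy) ** 0.5


def nearest_word(x: float, y: float, words: List[WordBox]) -> Optional[WordBox]:
    """Return the word whose box is closest to the selected position, or None if there are no words."""
    if not words:
        return None
    return min(words, key=lambda w: distance_to_box(x, y, w))
```

A pointer position that falls in the gap between two words resolves to whichever box edge is closer, so a click does not have to land exactly on the text.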
Character recognition method, character recognition apparatus and recording medium on which a character recognition program is recorded

Patent number: 6115506

Abstract: The invention provides a character recognition apparatus wherein wrong correction in slant correction processing of a character string is minimized to minimize erroneous recognition. A character slant estimation section receives an image, calculates slant angle candidates and evaluation values of them, and calculates a slant angle estimated value based on the evaluation values. An estimated value evaluation section receives the evaluation values, calculates an information amount of the evaluation values or the like, and outputs it as a validity of the slant angle estimated value. A slant correction section receives and normalizes the validity to a value from 0 to 1 and determines the normalized value as an execution coefficient for slant correction.

Type: Grant

Filed: May 4, 1998

Date of Patent: September 5, 2000

Assignee: NEC Corporation

Inventor: Takafumi Koshinaka
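
One way to picture the execution coefficient described in the abstract of patent 6115506 is the sketch below. The abstract does not give formulas, so everything concrete here is an assumption: the estimate is taken as a score-weighted mean of the candidate angles, and the validity is taken as one minus the normalized entropy of the scores.

```python
import math
from typing import List, Tuple


def estimate_slant(candidates: List[float], scores: List[float]) -> Tuple[float, float]:
    """Return (estimated angle, validity) from candidate angles and their non-negative scores."""
    total = sum(scores)
    if total <= 0:
        raise ValueError("scores must contain at least one positive value")
    weights = [s / total for s in scores]
    # Assumption: the estimate is the score-weighted mean of the candidates.
    estimate = sum(a * w for a, w in zip(candidates, weights))
    # Assumption: validity is 1 minus the normalized entropy of the weights,
    # so a single sharp peak gives about 1 and a flat distribution gives about 0.
    n = len(weights)
    if n < 2:
        return estimate, 1.0
    entropy = -sum(w * math.log(w) for w in weights if w > 0)
    return estimate, 1.0 - entropy / math.log(n)


def applied_slant(candidates: List[float], scores: List[float]) -> float:
    """Scale the estimated angle by the validity, clamped to the range 0 to 1."""
    estimate, validity = estimate_slant(candidates, scores)
    coefficient = min(1.0, max(0.0, validity))
    return coefficient * estimate
```

With this scaling, an ambiguous estimate leads to little or no correction, which matches the stated goal of limiting wrong corrections.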
Method of grouping handwritten word segments in handwritten document images

Patent number: 6108444

Abstract: A method and system of recognizing handwritten words in scanned documents, wherein by processing a document containing handwriting, features for word localization are extracted from handwritten words contained in said document through basis points taken from a single curve of text lines. The method is independent of page orientation, and does not assume that the individual lines of handwritten text are parallel, and the method does not require that word regions be aligned with text line orientation wherein intra-word statistics are derived from sample pages rather than using a fixed threshold. The method has applications in digital libraries, handwriting tokenization, document management and OCR systems.

Type: Grant

Filed: September 29, 1997

Date of Patent: August 22, 2000

Assignee: Xerox Corporation

Inventor: Tanveer F. Syeda-Mahmood
Pattern recognizing apparatus and method

Patent number: 6104833

Abstract: An environment recognizing unit extracts the first through N-th states from an input image and calls data corresponding to the first through N-th states from the first through N-th pattern recognizing units to perform a recognizing process.

Type: Grant

Filed: January 3, 1997

Date of Patent: August 15, 2000

Assignee: Fujitsu Limited

Inventors: Satoshi Naoi, Misako Suwa, Yoshinobu Hotta
Method and apparatus for character recognition

Patent number: 6081616

Abstract: A method for cutting character images from a line segment of pixel image data includes a first cutting layer step in which nontouching and nonoverlapping characters are cut from a line segment, and a second cutting layer step in which touching characters are cut from the line segment.

Type: Grant

Filed: July 18, 1997

Date of Patent: June 27, 2000

Assignee: Canon Kabushiki Kaisha

Inventors: Mehrzad R. Vaezi, Christopher Allen Sherrick
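
The two cutting layers described in the abstract of patent 6081616 can be illustrated with a simple projection-profile sketch. It is not the patented procedure: the first layer here cuts at blank columns, and the second layer splits any segment wider than a hypothetical `max_width` at its thinnest column, both of which are assumptions chosen for brevity. The image is a list of rows of 0/1 pixels.

```python
from typing import List, Tuple

Segment = Tuple[int, int]  # half-open column range [start, end)


def column_profile(image: List[List[int]]) -> List[int]:
    """Count ink pixels in each column of a binary line image."""
    return [sum(col) for col in zip(*image)]


def cut_at_gaps(profile: List[int]) -> List[Segment]:
    """First layer: cut wherever a blank column separates runs of ink."""
    segments, start = [], None
    for x, value in enumerate(profile):
        if value > 0 and start is None:
            start = x
        elif value == 0 and start is not None:
            segments.append((start, x))
            start = None
    if start is not None:
        segments.append((start, len(profile)))
    return segments


def split_touching(profile: List[int], segment: Segment, max_width: int) -> List[Segment]:
    """Second layer: recursively split an over-wide segment at its thinnest column."""
    start, end = segment
    if end - start <= max_width:
        return [segment]
    cut = min(range(start + 1, end), key=lambda x: profile[x])
    return (split_touching(profile, (start, cut), max_width)
            + split_touching(profile, (cut, end), max_width))


def cut_characters(image: List[List[int]], max_width: int) -> List[Segment]:
    """Apply both layers and return one column range per character candidate."""
    if max_width < 1:
        raise ValueError("max_width must be at least 1")
    profile = column_profile(image)
    result: List[Segment] = []
    for segment in cut_at_gaps(profile):
        result.extend(split_touching(profile, segment, max_width))
    return result
```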
Character extraction apparatus, dictionary production apparatus and character recognition apparatus, using both apparatuses

Patent number: 6064769

Abstract: A character extraction apparatus is provided for extracting character data for each character from a text image which is represented by first pixels corresponding to character images and second pixels corresponding to background images. The character extraction apparatus comprises a character row detecting means for detecting character rows from the text image and obtaining position data of each character row; a pixel array extracting means for extracting arrays of continuous first pixels in an area specified by the character row position data and computing position data of each of the arrays of continuous first pixels; a character array linking means for linking the arrays of continuous first pixels in the area based on the position data of the arrays of continuous first pixels; and a character extracting means for recognizing each set of arrays of continuous first pixels linked by the character array linking means as a character and outputting character data.

Type: Grant

Filed: November 5, 1998

Date of Patent: May 16, 2000

Inventors: Ichiro Nakao, Mariko Takenouchi, Saki Takakura, Satoshi Emura
Automatic language identification by stroke geometry analysis

Patent number: 6064767

Abstract: A computer-implemented process identifies an unknown language used to create a document. A set of training documents is defined in a variety of known languages and formed from a variety of text styles. Black and white electronic pixel images are formed of text material forming the training documents and the document in the unknown language. A plurality of line strokes are defined from the black pixels and point features are extracted from the strokes that are effective to characterize each of the languages. Point features from the unknown language are compared with point features from the known languages to identify one of the known languages that best represents the unknown language.

Type: Grant

Filed: January 16, 1998

Date of Patent: May 16, 2000

Assignee: Regents of the University of California

Inventors: Douglas W. Muir, Timothy R. Thomas
Title extracting apparatus for extracting title from document image and method thereof

Patent number: 6035061

Abstract: A title extracting apparatus scans black pixels in a document image and extracts rectangular regions that circumscribe connected regions of the black pixels as character rectangles. In addition, the title extracting apparatus unifies a plurality of character rectangles that adjoin and extracts rectangular regions that circumscribe the character rectangles as character string rectangles. Thereafter, the title extracting apparatus calculates points with the likelihood of being a title corresponding to attributes such as an underline attribute, a frame attribute, and a ruled line attribute of each character string rectangle, the positions of the character string rectangles in the document image, and the mutual position relation and extracts a character string rectangle with the highest points as a title rectangle. In the case of a tabulated document, the title extracting apparatus can extract a title rectangle from the inside of the table.

Type: Grant

Filed: August 7, 1996

Date of Patent: March 7, 2000

Assignee: Fujitsu Limited

Inventors: Yutaka Katsuyama, Satoshi Naoi
Character strings reading device

Patent number: 6014460

Abstract: A character strings reading device for reading character strings from input image data comprises cut-out recognition means for cutting out a segment corresponding to one character from the image data to perform individual character recognition every segment, a recognition result buffer for storing a recognition result of the cut-out recognition means, word searching means for searching a word string candidate corresponding to a combination of character candidates in the recognition result buffer, a word string candidate buffer for storing a search result of the word searching means, check portion determining means for determining a check target portion and a presumed character string of the check target portion on the basis of the result in the word string candidate buffer, and check means for judging the possibility of existence of the presumed character string on the check portion.

Type: Grant

Filed: December 19, 1995

Date of Patent: January 11, 2000

Assignee: NEC Corporation

Inventors: Toshikazu Fukushima, Eiki Ishidera, Masahiko Hamanaka, Daisuke Nishiwaki
Character extraction apparatus for extracting character data from a text image

Patent number: 5999647

Abstract: A character extraction apparatus is provided for extracting character data for each character from a text image which is represented by first pixels corresponding to character images and second pixels corresponding to background images. The character extraction apparatus comprises a character row detecting means for detecting character rows from the text image and obtaining position data of each character row; a pixel array extracting means for extracting arrays of continuous first pixels in an area specified by the character row position data and computing position data of each of the arrays of continuous first pixels; a character array linking means for linking the arrays of continuous first pixels in the area based on the position data of the arrays of continuous first pixels; and a character extracting means for recognizing each set of arrays of continuous first pixels linked by the character array linking means as a character and outputting character data.

Type: Grant

Filed: February 28, 1996

Date of Patent: December 7, 1999

Assignee: Matsushita Electric Industrial Co., Ltd.

Inventors: Ichiro Nakao, Mariko Takenouchi, Saki Takakura, Satoshi Emura
Method and apparatus for recognizing a quadrilateral object contained in an input bitmap image

Patent number: 5966473

Abstract: Described is an image processing system which is operative to automatically determine a quadrilateral object such as a character frame, page mark, or position correction mark only with a mouse click. A field is automatically specified by displaying a scanned image of a form including a black frame on a display, clicking within a character frame at the left end for each recognition field, and clicking within a character frame at the right end of the same field. In this case, a field position/size determination program scans the image in the vertical and horizontal directions from the two clicked points to detect the inner wall of the black frame, and produces a histogram by establishing rectangles between two character frames to automatically detect the number of character frames in the field and the thickness of the black line between the character frames.

Type: Grant

Filed: October 14, 1997

Date of Patent: October 12, 1999

Assignee: International Business Machines Corporation

Inventors: Hiroyasu Takahashi, Toshimichi Arima
Method and device for removing spots from a character image in an optical character reader

Patent number: 5956433

Abstract: A method and apparatus for removing spots from character images of a multi-character image read by an image scanner. A character image is cut out from the multi-character image. Separated segments in the cut-out character image are then detected. A respective segment of the detected, separated segments is deleted as a free spot if the number of detected segments exceeds a maximum segment number. After deleting a free spot, an attempt is then made to recognize a character in the character image. When a character cannot be recognized, a black pixel width is identified by analyzing the distribution of black pixel widths in the character image. Then, a circumscribed rectangle is defined in accordance with the identified black pixel width. Pixels of images lying outside the circumscribed rectangle are deleted from the character image as an externally contacted spot.

Type: Grant

Filed: March 18, 1997

Date of Patent: September 21, 1999

Assignee: Fujitsu Limited

Inventor: Hisashi Sasaki
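
The free-spot step in the abstract of patent 5956433 can be sketched as below. The abstract does not say which segment is deleted when the count exceeds the maximum, so removing the smallest segments first is an assumption, as are the function names and the use of 8-connectivity. The image is a list of rows of 0/1 pixels.

```python
from collections import deque
from typing import List, Tuple

Pixel = Tuple[int, int]  # (row, column)


def find_segments(image: List[List[int]]) -> List[List[Pixel]]:
    """Return the 8-connected groups of ink pixels in a binary character image."""
    height = len(image)
    width = len(image[0]) if height else 0
    seen = [[False] * width for _ in range(height)]
    segments = []
    for y in range(height):
        for x in range(width):
            if not image[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue, pixels = deque([(y, x)]), []
            while queue:
                cy, cx = queue.popleft()
                pixels.append((cy, cx))
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if (0 <= ny < height and 0 <= nx < width
                                and image[ny][nx] and not seen[ny][nx]):
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            segments.append(pixels)
    return segments


def remove_free_spots(image: List[List[int]], max_segments: int) -> List[List[int]]:
    """Return a copy of the image that keeps only the max_segments largest segments."""
    cleaned = [list(row) for row in image]
    # Assumption: the smallest segments are the ones treated as free spots.
    ranked = sorted(find_segments(cleaned), key=len, reverse=True)
    for spot in ranked[max_segments:]:
        for y, x in spot:
            cleaned[y][x] = 0
    return cleaned
```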
Image processing apparatus and method

Patent number: 5949555

Abstract: An apparatus and method for compressing an image based upon an attribute of a partial area efficiently includes a classification device for providing attribute classification information in which partial areas of an input image are classified into a variety of attributes and area information, and an image compression device for selecting an image compression method for each partial area in the above-described input image out of plural image compression methods and executing such compression.

Type: Grant

Filed: February 3, 1995

Date of Patent: September 7, 1999

Assignee: Canon Kabushiki Kaisha

Inventors: Akihiko Sakai, Eiji Ohara, Michiko Hirayu, Yuka Nagai
Apparatus and method for extracting character string

Patent number: 5949906

Abstract: A character string region extracting apparatus comprises an extracting section for extracting a plurality of primitives from image information in which a character and a graphic pattern other than the character are mixedly present, a character string candidate region forming section for generating character candidate regions from the primitives and connecting the character candidate regions, thereby forming at least one character string candidate region, a character recognizing section for subjecting the character candidate regions included in the character string candidate region to character recognition, and a character string region extracting section for extracting a character string region from the character string candidate region by the character recognition.

Type: Grant

Filed: December 7, 1995

Date of Patent: September 7, 1999

Assignee: Kabushiki Kaisha Toshiba

Inventors: Hidekata Hontani, Shigeyoshi Shimotsuji
Method and apparatus for enhancing degraded document images

Patent number: 5930393

Abstract: A technique for the enhancement of degraded document images to improve their display quality characteristics and image recognition accuracy. Images believed to be representative of the same symbol which occur in different positions over an image source (e.g., a facsimile page) are clustered together. Using the symbols within a particular cluster, an average character image outline for that cluster of symbols is derived and thereafter used to refine the matching of symbols within the cluster and to determine a final representative symbol for that cluster. The final representative symbols from the various resulting clusters are then used to replace all matching images throughout the image source. Advantageously, the display quality and recognition accuracy of the image source is enhanced after application of the present invention due to the resulting improvement of the images in the image source.

Type: Grant

Filed: August 11, 1997

Date of Patent: July 27, 1999

Assignee: Lucent Technologies Inc.

Inventors: Tin Kam Ho, John D. Hobby
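
The idea of deriving one representative symbol per cluster, as described in the abstract of patent 5930393, can be illustrated with a much simpler stand-in. The abstract speaks of an average character image outline; the sketch below instead averages aligned bitmaps pixel by pixel and thresholds the result. That substitution, the 0.5 threshold, and the assumption that all images in a cluster share one size and alignment are choices made only for this example.

```python
from typing import List

Bitmap = List[List[int]]  # rows of 0/1 pixels


def average_symbol(cluster: List[Bitmap]) -> List[List[float]]:
    """Pixel-wise mean of equally sized, already aligned symbol bitmaps."""
    if not cluster:
        raise ValueError("cluster must contain at least one image")
    count = len(cluster)
    height, width = len(cluster[0]), len(cluster[0][0])
    return [[sum(image[y][x] for image in cluster) / count for x in range(width)]
            for y in range(height)]


def representative_symbol(cluster: List[Bitmap], threshold: float = 0.5) -> Bitmap:
    """Binarize the average: a pixel is ink if at least `threshold` of the samples agree."""
    return [[1 if value >= threshold else 0 for value in row]
            for row in average_symbol(cluster)]
```

Pixels that appear in only a minority of the degraded samples are treated as noise and dropped, while pixels missing from a minority of samples are restored.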
Method for reconstructing linear structures present in raster form

Patent number: 5920641

Abstract: Linear structures are used to identify persons. In order to be able to combine a multiplicity of such linear structures in a database, their original images are analyzed and reconstructed using orthonormal basic functions. A preferred direction of the linear structure is determined for each pixel. A quality measure is used to evaluate the reliability of the analyzed data. Singularities (SI) and minutiae (MI) are extracted and stored.

Type: Grant

Filed: March 6, 1997

Date of Patent: July 6, 1999

Assignee: Siemens Nixdorf Informationssysteme Aktiengesellschaft

Inventors: Birgit Ueberreiter, Joachim Dengler
Character segmentation technique with integrated word search for handwriting recognition

Patent number: 5917941

Abstract: After each complete stroke in a handwriting recognition process, a hypothesis is generated whether a word break is present between the previous stroke and the new stroke. This hypothesis is weighted with a probability of a word-break occurring between the strokes. This probability is determined from the geometrical relationships between characters. Subsequently, a word search is carried out on the basis of these weighted hypotheses, to identify the most likely candidates for the words represented by the written strokes. A user interface is provided that offers the user a limited list of alternative word recognitions for a group of characters. These recognitions undergo segmentation filtering, in accordance with the word breaks of the selected hypotheses, to present the user with only those alternatives having the same groupings of strokes.

Type: Grant

Filed: August 8, 1995

Date of Patent: June 29, 1999

Assignee: Apple Computer, Inc.

Inventors: Brandyn Webb, Larry S. Yaeger
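
The weighted word-break hypotheses described in the abstract of patent 5917941 can be sketched as follows. The abstract says only that the probability comes from geometrical relationships, so the logistic function of the gap-to-character-width ratio, the `steepness` parameter, and the exhaustive enumeration of hypotheses are all assumptions for illustration. The word search that would consume these hypotheses is not shown.

```python
import math
from itertools import product
from typing import List, Tuple


def break_probability(gap: float, char_width: float, steepness: float = 4.0) -> float:
    """Assumed model: wider gaps relative to the character width are more likely word breaks."""
    ratio = gap / char_width
    return 1.0 / (1.0 + math.exp(-steepness * (ratio - 0.5)))


def rank_hypotheses(gaps: List[float], char_width: float,
                    top_k: int = 3) -> List[Tuple[float, Tuple[bool, ...]]]:
    """Weight every break/no-break assignment over the gaps and return the top_k.

    Enumeration is exponential in the number of gaps, so this is suitable
    only for short illustrative inputs.
    """
    probabilities = [break_probability(gap, char_width) for gap in gaps]
    scored = []
    for decisions in product((False, True), repeat=len(probabilities)):
        weight = 1.0
        for is_break, p in zip(decisions, probabilities):
            weight *= p if is_break else 1.0 - p
        scored.append((weight, decisions))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top_k]
```

Each returned tuple marks, gap by gap, whether a word break is hypothesized, so alternatives with different stroke groupings can be told apart and filtered.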
Method of enhancing the signal-to-noise within the pixel image of a hand entered symbol

Patent number: 5912996

Abstract: An input carrier sheet 12C for document distribution system 10 carries input symbols hand entered by the user into pre-existing constraint grids 12. The constraint grids may be printed in continuous tone or halftone. The print only partially covers the underlying carrier, permitting the exposed carrier to reflect light. The grids have sufficient pigment to be visible to the user, but insufficient pigment to form foreground pixels along with the hand-entered stroke when detected during the scanning. The signal (symbol)-to-noise (carrier) ratio is enhanced by reducing the pigment content of the constraint grids which increases the reflectivity of the grids. The S/N may be further enhanced by placing the strokes of the hand-entered symbols on top of the grid which occults some of the grid pigment. The S/N is further enhanced by highly reflective brightening agents in the grid print, and by aperture effect during scanning.

Type: Grant

Filed: March 6, 1997

Date of Patent: June 15, 1999

Assignee: Canon Kabushiki Kaisha

Inventor: Roger D. Melen
Character recognition method and system

Patent number: 5911005

Abstract: The current invention is directed to further improve the character recognition process based upon the comparison of identification value in a sample image and a reference image by adjusting the identification value of the sample image. The adjustment is made based upon a predetermined feature of the sub-area or a mesh region of the images. The desired improvement in accuracy is obtained especially for recognizing handwritten characters.

Type: Grant

Filed: November 20, 1995

Date of Patent: June 8, 1999

Assignee: Ricoh Company, Ltd.

Inventor: Yukinaka Uchiyama
Image extraction system

Patent number: 5907630

Abstract: An image extraction system includes a connected pattern extracting part for extracting partial patterns respectively having connected pixels from an image which is formed by a block frame having a table format and including one-character frames or a free format frame, characters, graphics or symbols, a one-character frame extracting part for extracting one-character frames from the image based on the partial patterns extracted by the connected pattern extracting part, a straight line extracting part for extracting straight lines from the partial patterns which are extracted by the connected pattern extracting part and is eliminated of the one-character frames by the one-character frame extracting part, a frame detecting part for detecting straight lines forming the frame from the straight lines extracted by the straight line extracting part, and a frame separating part for separating the straight lines detected by the frame detecting part from the partial patterns so as to extract the characters, graphics or

Type: Grant

Filed: August 26, 1996

Date of Patent: May 25, 1999

Assignee: Fujitsu Limited

Inventors: Satoshi Naoi, Atsuko Asakawa, Maki Yabuki, Yoshinobu Hotta
