Segmenting Individual Characters Or Words Patents (Class 382/177)
  • Patent number: 6526170
    Abstract: A character recognition system is disclosed. In a feature extraction parameter storage section 22, a transformation matrix for reducing the number of dimensions of feature parameters and a codebook for quantization are stored. In an HMM storage section 23, the constitution and parameters of a Hidden Markov Model (HMM) for character string expression are stored. A feature extraction section 32 scans a word image given from an image storage means from left to right in a predetermined cycle with a slit having a width sufficiently smaller than the character width and thus outputs a feature symbol at each predetermined timing. A matching section 33 matches a feature symbol row and a probability maximization HMM state, thereby recognizing the character string.
    Type: Grant
    Filed: December 13, 1994
    Date of Patent: February 25, 2003
    Assignee: NEC Corporation
    Inventor: Shinji Matsumoto
  • Patent number: 6519363
    Abstract: This invention discloses a method for automatically segmenting and recognizing Chinese character strings continuously written by a user in a handwritten Chinese character processing system, comprising the steps of: creating a geometry model and a language model; finding all potential segmentation schemes in the Chinese character strings continuously written by a user based on the associated timing information and said geometry model; recognizing the groups of strokes as defined by each of the potential segmentation schemes and computing the probability characterizing the exactness of recognition results; correcting the probability characterizing the exactness of recognition results by said language model; and selecting the recognition result and the corresponding segmentation scheme having the maximum probability value.
    Type: Grant
    Filed: January 12, 2000
    Date of Patent: February 11, 2003
    Assignee: International Business Machines Corporation
    Inventors: Hui Su, Donald T. Tang, Qian Ying Wang
  • Patent number: 6501856
    Abstract: A scheme for detecting telop character displaying frames in video image which is capable of suppressing erroneous detection of frames without telop characters due to instability of image features is disclosed. In this scheme, each input frame constituting the video data is entered, and whether each input frame is a telop character displaying frame in which telop characters are displayed or not is judged, according to edge pairs detected from each input frame by detecting each two adjacent edge pixels for which intensity gradient directions are opposite on some scanning line used in judging an intensity gradient direction at each edge pixel and for which an intensity difference between said two adjacent edge pixels is within a prescribed range as one edge pair, edge pixels being pixels at which an intensity value locally changes by at least a prescribed amount with respect to a neighboring pixel among a plurality of pixels constituting each input frame.
    Type: Grant
    Filed: September 28, 2001
    Date of Patent: December 31, 2002
    Assignee: Nippon Telegraph and Telephone Corporation
    Inventors: Hidetaka Kuwano, Hiroyuki Arai, Shoji Kurakake, Kenji Ogura, Toshiaki Sugimura, Minoru Mori, Minoru Takahata
  • Publication number: 20020172422
    Abstract: Image size converter 4 converts the size of the image data stored in image input part 1 to an arbitrary size and stores the converted data. Image enhancer 5 uses the character frame design data stored in character frame information memory 3 to extract, from the image stored in image size converter 4, an image of a region containing character frames, and enhances and stores this extracted image. Image outline detector 6 forms an outline image from the image obtained by image enhancer 5. Character frame center detector 7 uses the outline image to detect the coordinates of the centers of the character frames of the input image data. Character frame remover 8 uses the character frame center coordinates and the character frame design data to remove the character frames, and outputs the result from character image output part 9.
    Type: Application
    Filed: May 14, 2002
    Publication date: November 21, 2002
    Applicant: NEC Corporation
    Inventor: Daisuke Nishiwaki
  • Patent number: 6466694
    Abstract: A processing device performs region identification of an input image, and then performs an intra-region recognition process. The type code of each region and the individual code of a recognition result are then displayed, so that a user can modify both of the results of the region identification and the recognition process at one time. Furthermore, the processing device displays an original image close to the recognition result. If no correct answer exists among recognition candidates, code is added to the original image, and the original image with the code added is handled as a recognition result.
    Type: Grant
    Filed: April 16, 1998
    Date of Patent: October 15, 2002
    Assignee: Fujitsu Limited
    Inventors: Hiroshi Kamada, Katsuhito Fujimoto, Koji Kurokawa
  • Patent number: 6466211
    Abstract: Data visualization apparatuses, computer-readable mediums, computer data signals embodied in a transmission medium, data visualization methods, and digital computer data visualization methods are provided. According to one aspect of the present invention, a data visualization apparatus includes an image device configured to provide a visual image; and digital processing circuitry coupled with the image device and configured to access data including a plurality of themes, to generate a thematic illustration corresponding to the themes and having a plurality of outer contour lines which are spaced at varying distances relative to a reference line, and to control the image device to depict the thematic illustration.
    Type: Grant
    Filed: October 22, 1999
    Date of Patent: October 15, 2002
    Assignee: Battelle Memorial Institute
    Inventors: Susan L. Havre, Elizabeth G. Hetzler, Lucy T. Nowell, Paul D. Whitney, Feng Gao, James J. Thomas, Louis M. Martucci, W. Michelle Harris
  • Patent number: 6459810
    Abstract: An exemplary embodiment of the invention is a method for forming variant search strings. The method includes receiving a search string and parsing the search string to locate a mistaken search string character. A mistaken search string character is a character which is confused with other characters. A variant search string is formed in response to a presence of a mistaken search string character in the search string. The search string and variant search string may then be used to search a database. Another exemplary embodiment of the invention is a system for forming variant search strings. The system includes a user interface for receiving a search string. A variant search string generator parses the search string to locate a mistaken search string character. The mistaken search string character is a character which is confused with other characters. The variant search string generator forms a variant search string in response to a presence of a mistaken search string character in the search string.
    Type: Grant
    Filed: September 3, 1999
    Date of Patent: October 1, 2002
    Assignee: International Business Machines Corporation
    Inventor: Christopher T. Cring
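
The variant-string idea in the abstract above can be illustrated with a minimal sketch. The confusion table and the function name `variant_search_strings` are hypothetical and not taken from the patent, and substituting every combination of confusable characters is an assumption about how variants might be formed.

```python
from itertools import product

# Hypothetical confusion table: each key may be mistaken for any of its values.
CONFUSABLE = {
    "0": ["O"],
    "O": ["0"],
    "1": ["l", "I"],
    "l": ["1", "I"],
    "I": ["1", "l"],
}


def variant_search_strings(search, table=CONFUSABLE):
    """Return the original string followed by every variant obtained by
    substituting confusable characters (assumed strategy: all combinations)."""
    # For each position, list the original character plus its substitutes.
    choices = [[ch] + table.get(ch, []) for ch in search]
    results = []
    for combo in product(*choices):
        candidate = "".join(combo)
        if candidate not in results:
            results.append(candidate)
    return results


if __name__ == "__main__":
    # The search string and its variants could then all be sent to the database.
    print(variant_search_strings("A0"))
```

The number of variants grows multiplicatively with the number of confusable characters, so a practical system would probably cap or rank the variants; that choice is outside what the abstract states.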
  • Patent number: 6456739
    Abstract: A character image is inputted by use of a scanner, and recognized. The resultant character string of such recognition is represented on a display. The image serving as recognition source of the character designated on the display screen thereof, and the image in the vicinity of such image are represented. A character frame, which can discriminate the character image serving as recognition source, is edited in order to designate a new character image. This image and the inputted character information are registered on a character recognition dictionary correspondingly. Thereafter, the character recognition is carried out even with the utilization of such newly registered character. As a result, the recognition rate of the character recognition increases one after another.
    Type: Grant
    Filed: June 18, 1996
    Date of Patent: September 24, 2002
    Assignee: Canon Kabushiki Kaisha
    Inventor: Hiroaki Ikeda
  • Publication number: 20020118876
    Abstract: Character (or letter) information is extracted from source information, word information is extracted from the character information, and a database is created of the word information. Thereby, the created database is adapted for the technical field of the user or a field of interest to the user.
    Type: Application
    Filed: October 22, 2001
    Publication date: August 29, 2002
    Inventors: Hidetaka Magoshi, Nobuo Sasaki
  • Publication number: 20020114515
    Abstract: A key word is first automatically extracted from a character string group to be recognized, and entered. Then, a character is recognized by segmenting an individual character from a character string image to be recognized, and a character string corresponding to the extracted/entered key word is extracted. Then, a word area delimited by a key word is extracted from the character string image, and a word is recognized. Furthermore, a word recognition result is verified, and a final character string recognition result is output.
    Type: Application
    Filed: December 18, 2001
    Publication date: August 22, 2002
    Applicant: Fujitsu Limited
    Inventors: Yoshinobu Hotta, Katsuhito Fujimoto, Satoshi Naoi, Misako Suwa
  • Patent number: 6408094
    Abstract: A system and method in accordance with the present invention includes a scanning assembly and a storage device coupled to a programmed computer with a set of instructions for carrying out an assessment of a document image. The system and method operate by: processing the document image to obtain one or more attributes related to the geometrical integrity of the document image; selecting a threshold value from a database for each of the obtained attributes; and then comparing each of the obtained attributes against the threshold value selected for the obtained attribute to determine a difference for each and then evaluating one or more of the differences using predetermined criteria to provide evaluation results of the geometrical integrity of the document image.
    Type: Grant
    Filed: November 4, 1997
    Date of Patent: June 18, 2002
    Assignee: Eastman Kodak Company
    Inventors: Alexander David Mirzaoff, Thaddeus Francis Pawlicki
  • Publication number: 20020071606
    Abstract: A character recognition section generates character recognition result information resulting from character recognition of image information. An image information cutout section cuts out character recognition image information, corresponding to an area as to which the character recognition is performed, from the image information. A recognition result generation section generates recognition result information which is composed of the character recognition result information and the character recognition image information. A recognition result transmission section transmits the recognition result information to other terminals using electronic mail. As a result, an information communications apparatus of the invention can make transmissions of information to the wide area, without increasing the network load, which is to be used for the determination of whether or not a character recognition has been accurately performed.
    Type: Application
    Filed: November 29, 2001
    Publication date: June 13, 2002
    Applicant: MATSUSHITA GRAPHIC COMMUNICATION SYSTEMS, INC.
    Inventors: Shinichi Watanabe, Hideki Honma
  • Patent number: 6366908
    Abstract: A keyfact-based text retrieval method and a keyfact-based text index method that describes the formalized concept of a document by a pair comprising an object that is the head and a property that is the modifier and uses the information described by the pairs as index information for efficient document retrieval. A keyfact-based text retrieval system includes keyfact extracting, keyfact indexing, and keyfact retrieving. The keyfact extracting analyzes a document collection and a query and extracts keywords and keyfacts. The keywords do not have part-of-speech ambiguity and the keyfacts are extracted from the keywords. The keyfact indexing calculates the frequency of the keyfacts and generates a keyfact list of the document collection for a keyfact index structure. The keyfact retrieving receives a keyfact of the query and keyfacts of the document collection, defines a keyfact-based retrieval model in consideration of a weight factor of the keyfact pattern, and generates a retrieval result.
    Type: Grant
    Filed: December 30, 1999
    Date of Patent: April 2, 2002
    Assignee: Electronics and Telecommunications Research Institute
    Inventors: Kyung Taek Chong, Myung-Gil Jang, MiSeon Jun, Se Young Park
  • Publication number: 20020012462
    Abstract: An image processing method or device invented to reduce the ratio of erroneously recognized non-character elements in optical character recognition (OCR) regarding a color document that includes character images and other types of images, wherein the extracted character image data is checked to determine whether a color change exists in each character image, and wherein if no color change exists, the character image data is converted into character code data, but where a color change does exist, the character image data is not converted into character code data.
    Type: Application
    Filed: June 4, 2001
    Publication date: January 31, 2002
    Inventor: Yoko Fujiwara
  • Publication number: 20020009226
    Abstract: A handwritten character recognition apparatus has a character string input area of a size that allows a user to handwrite a plurality of characters thereon using a stylus. A coordinate detection unit extracts a coordinate string for each stroke that forms the handwritten character string. An input completion judgement unit judges an immediately preceding handwritten character string to be complete if a time difference between a last coordinate of an immediately preceding stroke and a first coordinate of a stroke being input is at least a predetermined time, when the first coordinate of the stroke is detected in a first area of the character string input area. A character segmentation unit segments a stroke string for each character from all the strokes of the previously input handwritten character string, from which a character recognition unit recognizes each character and outputs a character string which is the recognition result.
    Type: Application
    Filed: April 19, 2001
    Publication date: January 24, 2002
    Inventors: Ichiro Nakao, Yoshikatsu Ito
  • Patent number: 6339651
    Abstract: A method and system for recognizing the characters on surfaces where alphanumeric identification code (“ID” for short) may be present such as a license plate. The present system is particularly adapted for situations where visual distortions can occur, and utilizes a highly robust method for recognizing the characters of an ID. Multiple character recovery schemes are applied to account for a variety of conditions to ensure high accuracy in identifying the ID. Accuracy is greatly enhanced by taking a comprehensive approach where multiple criteria are taken into consideration before any conclusions are drawn. Special considerations are given to recognizing the ID as a whole and not just the individual characters.
    Type: Grant
    Filed: February 25, 1998
    Date of Patent: January 15, 2002
    Assignee: Kent Ridge Digital Labs
    Inventors: Qi Tian, Kong Wah Wan, Karianto Leman, Chade Meng Tan, Chun Biao Guo
  • Patent number: 6327385
    Abstract: A character segmentation system for segmenting out a character from a string of characters which are in touch with each other, which is capable of being executed on a small size hardware resource without influence of variation of touching condition due to difference of character font, comprises an image storing unit 110 for storing an electronic image of character string obtained by such means as optical scanning, a partial pattern dictionary 122 for storing partial pattern shapes used as features for specifying fonts of character, a partial pattern detecting unit 121 for extracting areas of the image of character string, which coincide with a partial pattern, a character font determining unit 123 for determining the font of character on the basis of positions of the areas of the image of character string, which coincide with the partial pattern, and the number of the areas, a feature extraction inhibited area dictionary 132 for storing areas in which feature extraction processing for respective fonts of ch
    Type: Grant
    Filed: November 10, 1998
    Date of Patent: December 4, 2001
    Assignee: NEC Corporation
    Inventor: Masaaki Kamitani
  • Patent number: 6327382
    Abstract: It is an object of the present invention to appropriately extract areas for character recognition from a color image. It is another object of the present invention to separate and extract characters from a background color in a color image if the background of the manuscript is not white and if the characters are printed in a portion having a color that is not commonly used all over the image. To achieve these objects, this invention binarizes an input color image in a plurality of stages and extracts areas from binary images obtained in each stage to enable areas and text sections to be appropriately extracted despite the unknown colors of the characters and background contained in the input color image.
    Type: Grant
    Filed: January 29, 1999
    Date of Patent: December 4, 2001
    Assignee: Canon Kabushiki Kaisha
    Inventors: Kitahiro Kaneda, Toshiaki Yagasaki
  • Patent number: 6327384
    Abstract: In a projection means, black pixel histograms of a binary stationary image are generated in both the vertical and the horizontal direction. In a text type judgment means, in accordance with these histograms, a determination is made of whether the image is vertical text or horizontal text. Based on the result of this determination, a pattern block extraction means extracts either a column or a row from the image. The block is further projected and divided into smaller blocks. Then projection is again applied to these divided blocks and patterns are extracted by a pattern extraction means. A judgment is made as to whether or not joining of the extracted patterns is to be performed and, if joining is required, they are joined by a pattern joining means. Finally, the offsets of all the extracted patterns are calculated, whereupon data (of extracted patterns) are sent to a pattern matching process.
    Type: Grant
    Filed: November 13, 1997
    Date of Patent: December 4, 2001
    Assignee: NEC Corporation
    Inventors: Kouichirou Hirao, Keiji Yamada, Takahiro Hongu, Takashi Mochizuki, Mitsutoshi Arai
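
The projection steps described in the abstract above can be sketched roughly as follows. This is an illustration under stated assumptions, not the patented method: the image is a list of rows of 0/1 values (1 = black), the direction test ("the histogram with more empty bins separates the text lines") is a guessed heuristic, all function names are hypothetical, and the pattern joining and offset steps are omitted.

```python
def projections(image):
    """Return (row_histogram, column_histogram) of black-pixel counts."""
    rows = [sum(row) for row in image]
    cols = [sum(col) for col in zip(*image)]
    return rows, cols


def runs(hist):
    """Return (start, end) pairs, end exclusive, of consecutive non-zero bins."""
    spans, start = [], None
    for i, value in enumerate(hist):
        if value and start is None:
            start = i
        elif not value and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(hist)))
    return spans


def text_direction(image):
    """Assumed heuristic: blank rows between lines suggest horizontal text."""
    rows, cols = projections(image)
    return "horizontal" if rows.count(0) >= cols.count(0) else "vertical"


def segment(image):
    """Split into line blocks first, then project each block again.

    Returns (direction, [(top, left, bottom, right), ...]) with exclusive
    bottom/right bounds.
    """
    direction = text_direction(image)
    rows, cols = projections(image)
    blocks = []
    if direction == "horizontal":
        for top, bottom in runs(rows):
            band = image[top:bottom]
            _, band_cols = projections(band)
            for left, right in runs(band_cols):
                blocks.append((top, left, bottom, right))
    else:
        for left, right in runs(cols):
            band = [row[left:right] for row in image]
            band_rows, _ = projections(band)
            for top, bottom in runs(band_rows):
                blocks.append((top, left, bottom, right))
    return direction, blocks
```

Projecting each extracted row or column a second time, rather than using the whole-image histogram, keeps gaps in one line from being masked by ink in another line.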
  • Patent number: 6289109
    Abstract: An apparatus for determining the location and content of data blocks on a mailpiece includes a computer connected to a structure for obtaining a digital bit map image of an outer surface of a mailpiece. The computer includes structure programmed for: finding each run of a plurality of black bits of each scan line of the bit map image and determining if any bit thereof neighbors at least one black bit of another scan line; combining the found run with each neighboring bit to form a piece; assigning a descriptive value to a block having at least one piece and comparing the descriptive value to a list of values to determine which type of data block the block having the descriptive value is.
    Type: Grant
    Filed: December 29, 1993
    Date of Patent: September 11, 2001
    Assignee: Pitney Bowes Inc.
    Inventors: Ronald E. Gocht, Leon A. Pintsov
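
The run-finding and combining steps in the abstract above resemble run-based connected-component labelling. The sketch below assumes a bitmap given as rows of 0/1 values and treats two runs as neighbours when they sit on adjacent scan lines and share at least one column; that neighbour rule, the union-find bookkeeping, and the function names are assumptions, and the later step of classifying blocks by a descriptive value is not shown.

```python
def find_runs(bitmap):
    """Return (row, start, end) for each run of black bits; end is exclusive."""
    found = []
    for y, line in enumerate(bitmap):
        x = 0
        while x < len(line):
            if line[x]:
                start = x
                while x < len(line) and line[x]:
                    x += 1
                found.append((y, start, x))
            else:
                x += 1
    return found


def group_runs_into_pieces(bitmap):
    """Merge runs that touch on adjacent scan lines into pieces."""
    run_list = find_runs(bitmap)
    parent = list(range(len(run_list)))

    def find(i):
        # Follow parent links to the representative, compressing the path.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, (y1, s1, e1) in enumerate(run_list):
        for j in range(i + 1, len(run_list)):
            y2, s2, e2 = run_list[j]
            if y2 > y1 + 1:
                break  # run_list is ordered by row, so no later run can touch
            if y2 == y1 + 1 and s1 < e2 and s2 < e1:
                parent[find(i)] = find(j)

    pieces = {}
    for i, run in enumerate(run_list):
        pieces.setdefault(find(i), []).append(run)
    return list(pieces.values())


def bounding_box(piece):
    """Return (top, left, bottom, right) with exclusive bottom/right."""
    top = min(y for y, _, _ in piece)
    bottom = max(y for y, _, _ in piece) + 1
    left = min(s for _, s, _ in piece)
    right = max(e for _, _, e in piece)
    return top, left, bottom, right
```

Working on runs instead of single pixels keeps the number of merge checks small for typical scanned text, where each scan line contains only a few runs.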
  • Patent number: 6282315
    Abstract: A method for entering data into a computer generated form including field areas of preselected height and width includes the steps of converting handwritten characters of arbitrary height which may be greater than the preselected height formed on the screen to computer generated characters and displaying the computer generated characters within a field area. Additionally, handwritten characters to be entered into several field areas are grouped, converted, and displayed in selected field areas.
    Type: Grant
    Filed: July 2, 1997
    Date of Patent: August 28, 2001
    Assignee: Samsung Electronics, Ltd.
    Inventor: Monty L. Boyer
  • Patent number: 6269188
    Abstract: The present invention is a computer-implemented method for calculating word accuracy. Word grouping accuracy values (260) are calculated (212) by using the character accuracy values (250) calculated by an OCR program present in a computer system. The present invention preferably uses these character accuracy values (250) to create a word grouping accuracy value (260). Various methods are employed to calculate the word accuracy (260), including binarizing the character accuracy values (250), modified averaging of the character accuracy values (250), and creating fuzzy visual displays of word grouping accuracy values (260). The calculated word grouping accuracy values (260) are then adjusted based upon known OCR strengths and weaknesses, and based upon comparisons to stored word lists and the application of language rules. In a system with multiple character recognition techniques, the system can compare the accuracy values (260) of different versions of the word groupings to find the most accurate version.
    Type: Grant
    Filed: March 12, 1998
    Date of Patent: July 31, 2001
    Assignee: Canon Kabushiki Kaisha
    Inventor: Hamadi Jamali
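
As a rough illustration of deriving a word-level score from per-character accuracy values, the sketch below shows a plain mean, a binarized score, and one possible "modified" average. The threshold of 0.8, the mean-plus-minimum formula, and all function names are hypothetical; the abstract does not specify them.

```python
def mean_accuracy(char_accuracies):
    """Plain average of the per-character accuracy values."""
    return sum(char_accuracies) / len(char_accuracies)


def binarized_accuracy(char_accuracies, threshold=0.8):
    """Fraction of characters whose accuracy meets an assumed threshold."""
    passed = [1 if a >= threshold else 0 for a in char_accuracies]
    return sum(passed) / len(passed)


def modified_average(char_accuracies):
    """Hypothetical modification: pull the mean toward the weakest character."""
    return (mean_accuracy(char_accuracies) + min(char_accuracies)) / 2


def best_version(versions, score=modified_average):
    """Pick the word text whose character accuracies score highest.

    versions: list of (word_text, char_accuracies) pairs, for example one
    pair per recognition technique.
    """
    return max(versions, key=lambda version: score(version[1]))[0]


if __name__ == "__main__":
    candidates = [("cat", [1.0, 0.5, 1.0]), ("cot", [1.0, 1.0, 1.0])]
    print(best_version(candidates))
```

Weighting toward the minimum reflects the intuition that one badly recognized character can make the whole word wrong; adjustments based on word lists or language rules, which the abstract also mentions, are not sketched here.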
  • Patent number: 6249604
    Abstract: A method for determining the boundaries of a symbol or word string within an image, including the steps of determining page orientation, isolating symbol strings from adjacent symbol strings, establishing a set of boundaries or references with respect to which measurements about, or further processing of, the symbol string may be made.
    Type: Grant
    Filed: February 14, 1994
    Date of Patent: June 19, 2001
    Assignee: Xerox Corporation
    Inventors: Daniel P. Huttenlocher, Peter C. Wayner, Michael J. Hopcroft
  • Patent number: 6249353
    Abstract: The image editing apparatus of the present invention reads a text image having a plurality of character line images along a predetermined direction, makes a histogram expressing the distribution characteristics of said text image, detects a character line image having a predetermined size based on said histogram, and performs an editing process for said character line image having said predetermined size.
    Type: Grant
    Filed: August 7, 1996
    Date of Patent: June 19, 2001
    Assignee: Minolta Co., Ltd.
    Inventors: Akinori Yoshida, Shigeru Sawada, Takao Fujiwara
  • Patent number: 6246794
    Abstract: A character reading method has enhanced character segmentation accuracy and character string recognition accuracy for reading correctly hand-written addresses on postal matters. The method extracts provisional character patterns from image information of the address character string (step 206), creates a table 219 of tentative character patterns and implements the character classification for the tentative character patterns (step 207), extracts, specifically for characters of the street number portion of the address character string, periphery information (vertical and horizontal lengths, vertical/horizontal length ratio, pattern spacings, etc.) of tentative character patterns (step 212), and segments the character string into characters accurately based on the information (step 215).
    Type: Grant
    Filed: December 11, 1996
    Date of Patent: June 12, 2001
    Assignee: Hitachi, Ltd.
    Inventors: Tatsuhiko Kagehiro, Masashi Koga, Hiroshi Sako, Hiromichi Fujisawa, Hisao Ogata, Yoshihiro Shima, Shigeru Watanabe, Masato Teramoto
  • Patent number: 6212294
    Abstract: An image processor which receives an image and assigns a position of an arbitrary pixel in the image. An image block is extracted from the received image, and start and end positions of a designated area of the received image are acquired. A designated image block in the extracted image block is designated for processing in accordance with the start and end positions of the designated area of the received image.
    Type: Grant
    Filed: February 28, 1997
    Date of Patent: April 3, 2001
    Assignee: Canon Kabushiki Kaisha
    Inventor: Hiroaki Ikeda
  • Patent number: 6189006
    Abstract: A full-text data base retrieving device retrieves a data base in accordance with a query. A full-text index has character location information representative of location of each of key character sequences of N characters that appear in the data base, where N is a positive integer. A query memory memorizes the query as a retrieval key character sequence. A separating section separates the retrieval key character sequence into a plurality of retrieval key character sequences of N characters to extract contexts as extracted contexts from the retrieval key character sequence in accordance with the retrieval key character sequences. A context classifying section classifies the extracted contexts into classified contexts having the classification numbers, respectively. An index retrieving section retrieves the full-text index in accordance with the sorts of the retrieval key character sequences and the classified contexts to read the character location information as a retrieval result out of the full-text index.
    Type: Grant
    Filed: March 2, 1999
    Date of Patent: February 13, 2001
    Assignee: NEC Corporation
    Inventor: Toshikazu Fukushima
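
The core of an N-character full-text index can be sketched as below: every N-character sequence in the text maps to the positions where it occurs, and a query is split into N-character sequences that are looked up and aligned by offset. The function names are hypothetical, and the context extraction and classification described in the abstract are not modelled.

```python
from collections import defaultdict


def build_index(text, n=2):
    """Map each n-character sequence to the list of positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(text) - n + 1):
        index[text[pos:pos + n]].append(pos)
    return index


def search(index, query, n=2):
    """Return the sorted start positions of query, using only the index."""
    if len(query) < n:
        raise ValueError("query must contain at least n characters")
    # Separate the query into overlapping n-character keys.
    grams = [query[i:i + n] for i in range(len(query) - n + 1)]
    candidates = set(index.get(grams[0], []))
    # A match at position p requires the k-th key to occur at p + k.
    for offset, gram in enumerate(grams[1:], start=1):
        positions = set(index.get(gram, []))
        candidates = {p for p in candidates if p + offset in positions}
    return sorted(candidates)


if __name__ == "__main__":
    idx = build_index("abracadabra", n=2)
    print(search(idx, "abra"))
```

Because consecutive keys overlap by N-1 characters, agreement at every offset guarantees that the whole query string occurs at the candidate position, so the original text never needs to be rescanned.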
  • Patent number: 6188790
    Abstract: An apparatus for recognizing characters read by a reading unit. A circumscribing rectangle of a read character is formed, and the degree of narrowness of that circumscribing rectangle is acquired. Characters having a degree of narrowness that is equal to or greater than a predetermined value are selected and blank areas are added to the circumscribing rectangle to yield a character area with a corrected degree of narrowness. The character is normalized by converting the character area to a specified size, and is recognized based on the normalized character. It is therefore possible to normalize even characters significantly elongated vertically or horizontally for easier recognition and to group their character patterns.
    Type: Grant
    Filed: February 26, 1997
    Date of Patent: February 13, 2001
    Assignee: Tottori Sanyo Electric Ltd.
    Inventors: Takatoshi Yoshikawa, Hiromitsu Kawajiri, Hiroshi Horii, Junji Tanaka
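
A minimal sketch of the narrowness correction described above, under several assumptions: the degree of narrowness is taken to be the ratio of the longer to the shorter side of the circumscribing rectangle, the threshold (3.0) and corrected ratio (2.0) are invented values, blank areas are added symmetrically, and normalization uses nearest-neighbour sampling to a square. None of these specifics come from the patent.

```python
import math


def crop_to_bounding_box(image):
    """Cut a 0/1 image (1 = black) down to its circumscribing rectangle."""
    rows = [y for y, row in enumerate(image) if any(row)]
    if not rows:
        raise ValueError("image contains no black pixels")
    cols = [x for x in range(len(image[0])) if any(row[x] for row in image)]
    return [list(row[cols[0]:cols[-1] + 1]) for row in image[rows[0]:rows[-1] + 1]]


def narrowness(glyph):
    """Assumed definition: longer side divided by shorter side."""
    h, w = len(glyph), len(glyph[0])
    return max(h, w) / min(h, w)


def pad_narrow(glyph, threshold=3.0, target=2.0):
    """Add blank margins so a very narrow glyph reaches the target ratio."""
    h, w = len(glyph), len(glyph[0])
    if narrowness(glyph) < threshold:
        return glyph
    if h > w:  # tall and thin: widen with blank columns
        extra = math.ceil(h / target) - w
        left = extra // 2
        return [[0] * left + list(row) + [0] * (extra - left) for row in glyph]
    extra = math.ceil(w / target) - h  # short and wide: add blank rows
    top = extra // 2
    blank = [0] * w
    return ([list(blank) for _ in range(top)]
            + [list(row) for row in glyph]
            + [list(blank) for _ in range(extra - top)])


def resize(glyph, size):
    """Nearest-neighbour scaling to a size x size square."""
    h, w = len(glyph), len(glyph[0])
    return [[glyph[y * h // size][x * w // size] for x in range(size)]
            for y in range(size)]


def normalize(image, size):
    return resize(pad_narrow(crop_to_bounding_box(image)), size)
```

Without the padding step, a one-pixel-wide vertical stroke would be stretched to fill the entire square and lose its shape; with padding it remains a thin stroke in the centre of the normalized area.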
  • Patent number: 6185330
    Abstract: A pattern matching encoding device for executing pattern matching encoding of binary still images and a pattern matching decoding device corresponding to the pattern matching encoding device are proposed. In the pattern matching encoding, each input pattern extracted from the input image is matched against library patterns in the library, and the input pattern is encoded using a matched library pattern as a reference pattern if the matched library pattern is found. The pattern matching encoding device comprises a pattern segmentation section for segmenting each of selected library patterns and the input pattern into two or more parts and thereby generating segmented library patterns and segmented input patterns, a matching section for matching each of the segmented input patterns against corresponding segmented library patterns, and a pattern combination section for generating a new library pattern by combining the segmented library patterns each of which has matched one of the segmented input patterns.
    Type: Grant
    Filed: March 18, 1998
    Date of Patent: February 6, 2001
    Assignee: NEC Corporation
    Inventor: Kouichirou Hirao
  • Patent number: 6185338
    Abstract: A character recognition method for recognizing characters on an article having multiple character-bearing areas, such as a license plate, first involves obtaining image data from an image of the article. The method then assigns at least one parameter to a selected character-bearing area on the article. The method then attempts to obtain a correct frame which expresses the correct positional relationship between the selected character-bearing area on the article with other character-bearing areas of the article, and then uses that correct frame to perform character recognition with respect to each of the character-bearing areas of the article. To obtain the correct frame, the invention compares the image data of the article with plural candidate frames. The plural candidate frames are calculated using the predetermined positional correlation between (1) the selected character-bearing area [as represented by the at least one parameter] and (2) other character-bearing areas of the article.
    Type: Grant
    Filed: March 21, 1997
    Date of Patent: February 6, 2001
    Assignee: Sharp Kabushiki Kaisha
    Inventor: Mitsuaki Nakamura
  • Patent number: 6137906
    Abstract: A reading system includes a computer and a mass storage device including software comprising instructions for causing a computer to accept an image file generated from optically scanning an image of a document. The software converts the image file into a converted text file that includes text information, and positional information associating the text with the position of its representation in the image file. The reading system therefore has the ability to display the image representation of the scanned image on a computer monitor and permit a user to control operation of the reader with respect to the displayed image representation of the document by using the locational information associated with the converted text file. Also described are techniques for dual highlighting spoken text and a technique for determining the nearest word to a position selected by use of a mouse or other pointing device operating on the image representation as displayed on the monitor.
    Type: Grant
    Filed: June 27, 1997
    Date of Patent: October 24, 2000
    Assignee: Kurzweil Educational Systems, Inc.
    Inventor: Mark S. Dionne
  • Patent number: 6115506
    Abstract: The invention provides a character recognition apparatus wherein wrong correction in slant correction processing of a character string is minimized to minimize erroneous recognition. A character slant estimation section receives an image, calculates slant angle candidates and evaluation values of them, and calculates a slant angle estimated value based on the evaluation values. An estimated value evaluation section receives the evaluation values, calculates an information amount of the evaluation values or the like, and outputs it as a validity of the slant angle estimated value. A slant correction section receives and normalizes the validity to a value from 0 to 1 and determines the normalized value as an execution coefficient for slant correction.
    Type: Grant
    Filed: May 4, 1998
    Date of Patent: September 5, 2000
    Assignee: NEC Corporation
    Inventor: Takafumi Koshinaka
  • Patent number: 6108444
    Abstract: A method and system of recognizing handwritten words in scanned documents, wherein by processing a document containing handwriting, features for word localization are extracted from handwritten words contained in said document through basis points taken from a single curve of text lines. The method is independent of page orientation, and does not assume that the individual lines of handwritten text are parallel, and the method does not require that word regions be aligned with text line orientation wherein intra-word statistics are derived from sample pages rather than using a fixed threshold. The method has applications in digital libraries, handwriting tokenization, document management and OCR systems.
    Type: Grant
    Filed: September 29, 1997
    Date of Patent: August 22, 2000
    Assignee: Xerox Corporation
    Inventor: Tanveer F. Syeda-Mahmood
  • Patent number: 6104833
    Abstract: An environment recognizing unit extracts the first through N-th states from an input image and calls data corresponding to the first through N-th states from the first through N-th pattern recognizing units to perform a recognizing process.
    Type: Grant
    Filed: January 3, 1997
    Date of Patent: August 15, 2000
    Assignee: Fujitsu Limited
    Inventors: Satoshi Naoi, Misako Suwa, Yoshinobu Hotta
  • Patent number: 6081616
    Abstract: A method for cutting character images from a line segment of pixel image data includes a first cutting layer step in which nontouching and nonoverlapping characters are cut from a line segment, and a second cutting layer step in which touching characters are cut from the line segment.
    Type: Grant
    Filed: July 18, 1997
    Date of Patent: June 27, 2000
    Assignee: Canon Kabushiki Kaisha
    Inventors: Mehrzad R. Vaezi, Christopher Allen Sherrick
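    Illustrative sketch (not taken from the patent): the abstract above describes a first cutting layer that separates nontouching, nonoverlapping characters. A common way to do that, assumed here, is to split the line wherever a column contains no character pixels. The Python below shows only that first layer. The function name and the 0/1 row-list image format are hypothetical, and the second layer for touching characters is not attempted.

```python
def cut_nontouching(line):
    """Return (start, end) column spans, end exclusive, of inked column runs.

    line is a list of equal-length rows; 1 marks a character pixel and
    0 marks background. Each span is one candidate character image.
    """
    if not line:
        return []
    width = len(line[0])
    # A column is "inked" if any row has a character pixel in it.
    inked = [any(row[x] for row in line) for x in range(width)]
    spans, start = [], None
    for x, on in enumerate(inked):
        if on and start is None:
            start = x                    # a new run of inked columns begins
        elif not on and start is not None:
            spans.append((start, x))     # a blank column closes the run
            start = None
    if start is not None:
        spans.append((start, width))     # run reaches the right edge
    return spans
```

    Spans that come out unusually wide would be natural candidates to hand to a second layer that deals with touching characters.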
  • Patent number: 6064769
    Abstract: A character extraction apparatus is provided for extracting character data for each character from a text image which is represented by first pixels corresponding to character images and second pixels corresponding to background images. The character extraction apparatus comprises a character row detecting means for detecting character rows from the text image and obtaining position data of each character row; a pixel array extracting means for extracting arrays of continuous first pixels in an area specified by the character row position data and computing position data of each of the arrays of continuous first pixels; a character array linking means for linking the arrays of continuous first pixels in the area based on the position data of the arrays of continuous first pixels; and a character extracting means for recognizing each set of arrays of continuous first pixels linked by the character array linking means as a character and outputting character data.
    Type: Grant
    Filed: November 5, 1998
    Date of Patent: May 16, 2000
    Inventors: Ichiro Nakao, Mariko Takenouchi, Saki Takakura, Satoshi Emura
  • Patent number: 6064767
    Abstract: A computer-implemented process identifies an unknown language used to create a document. A set of training documents is defined in a variety of known languages and formed from a variety of text styles. Black and white electronic pixel images are formed of text material forming the training documents and the document in the unknown language. A plurality of line strokes are defined from the black pixels and point features are extracted from the strokes that are effective to characterize each of the languages. Point features from the unknown language are compared with point features from the known languages to identify one of the known languages that best represents the unknown language.
    Type: Grant
    Filed: January 16, 1998
    Date of Patent: May 16, 2000
    Assignee: Regents of the University of California
    Inventors: Douglas W. Muir, Timothy R. Thomas
  • Patent number: 6035061
    Abstract: A title extracting apparatus scans black pixels in a document image and extracts rectangular regions that circumscribe connected regions of the black pixels as character rectangles. In addition, the title extracting apparatus unifies a plurality of character rectangles that adjoin and extracts rectangular regions that circumscribe the character rectangles as character string rectangles. Thereafter, the title extracting apparatus calculates points with the likelihood of being a title corresponding to attributes such as an underline attribute, a frame attribute, and a ruled line attribute of each character string rectangle, the positions of the character string rectangles in the document image, and the mutual position relation and extracts a character string rectangle with the highest points as a title rectangle. In the case of a tabulated document, the title extracting apparatus can extract a title rectangle from the inside of the table.
    Type: Grant
    Filed: August 7, 1996
    Date of Patent: March 7, 2000
    Assignee: Fujitsu Limited
    Inventors: Yutaka Katsuyama, Satoshi Naoi
  • Patent number: 6014460
    Abstract: A character string reading device for reading character strings from input image data comprises cut-out recognition means for cutting out a segment corresponding to one character from the image data to perform individual character recognition for every segment, a recognition result buffer for storing a recognition result of the cut-out recognition means, word searching means for searching a word string candidate corresponding to a combination of character candidates in the recognition result buffer, a word string candidate buffer for storing a search result of the word searching means, check portion determining means for determining a check target portion and a presumed character string of the check target portion on the basis of the result in the word string candidate buffer, and check means for judging the possibility of existence of the presumed character string on the check portion.
    Type: Grant
    Filed: December 19, 1995
    Date of Patent: January 11, 2000
    Assignee: NEC Corporation
    Inventors: Toshikazu Fukushima, Eiki Ishidera, Masahiko Hamanaka, Daisuke Nishiwaki
  • Patent number: 5999647
    Abstract: A character extraction apparatus is provided for extracting character data for each character from a text image which is represented by first pixels corresponding to character images and second pixels corresponding to background images. The character extraction apparatus comprises a character row detecting means for detecting character rows from the text image and obtaining position data of each character row; a pixel array extracting means for extracting arrays of continuous first pixels in an area specified by the character row position data and computing position data of each of the arrays of continuous first pixels; a character array linking means for linking the arrays of continuous first pixels in the area based on the position data of the arrays of continuous first pixels; and a character extracting means for recognizing each set of arrays of continuous first pixels linked by the character array linking means as a character and outputting character data.
    Type: Grant
    Filed: February 28, 1996
    Date of Patent: December 7, 1999
    Assignee: Matsushita Electric Industrial Co., Ltd.
    Inventors: Ichiro Nakao, Mariko Takenouchi, Saki Takakura, Satoshi Emura
  • Patent number: 5966473
    Abstract: Described is an image processing system which is operative to automatically determine a quadrilateral object such as a character frame, page mark, or position correction mark only with a mouse click. A field is automatically specified by displaying a scanned image of a form including a black frame on a display, clicking within a character frame at the left end for each recognition field, and clicking within a character frame at the right end of the same field. In this case, a field position/size determination program scans the image in the vertical and horizontal directions from the two clicked points to detect the inner wall of the black frame, and produces a histogram by establishing rectangles between two character frames to automatically detect the number of character frames in the field and the thickness of the black line between the character frames.
    Type: Grant
    Filed: October 14, 1997
    Date of Patent: October 12, 1999
    Assignee: International Business Machines Corporation
    Inventors: Hiroyasu Takahashi, Toshimichi Arima
  • Patent number: 5956433
    Abstract: A method and apparatus for removing spots from character images of a multi-character image read by an image scanner. A character image is cut out from the multi-character image. Separated segments in the cut-out character image are then detected. A respective segment of the detected, separated segments is deleted as a free spot if the number of detected segments exceeds a maximum segment number. After deleting a free spot, an attempt is then made to recognize a character in the character image. When a character cannot be recognized, a black pixel width is identified by analyzing the distribution of black pixel widths in the character image. Then, a circumscribed rectangle is defined in accordance with the identified black pixel width. Pixels of images lying outside the circumscribed rectangle are deleted from the character image as an externally contacted spot.
    Type: Grant
    Filed: March 18, 1997
    Date of Patent: September 21, 1999
    Assignee: Fujitsu Limited
    Inventor: Hisashi Sasaki
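    Illustrative sketch (not taken from the patent): the abstract above deletes a separated segment as a free spot when the number of detected segments exceeds a maximum. The Python below shows that step under two assumptions: segments are 4-connected groups of black pixels, and the smallest segments are the ones deleted. The abstract does not say which segment is chosen, and the function names are hypothetical.

```python
from collections import deque

def find_segments(img):
    """Return 4-connected groups of black (1) pixels as lists of (row, col)."""
    h = len(img)
    w = len(img[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    segments = []
    for r in range(h):
        for c in range(w):
            if img[r][c] and not seen[r][c]:
                seen[r][c] = True
                queue, seg = deque([(r, c)]), []
                while queue:                      # breadth-first flood fill
                    y, x = queue.popleft()
                    seg.append((y, x))
                    for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                        if 0 <= ny < h and 0 <= nx < w and img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
                segments.append(seg)
    return segments

def remove_free_spots(img, max_segments):
    """Return a copy of img keeping only the max_segments largest segments."""
    out = [row[:] for row in img]
    segments = sorted(find_segments(out), key=len, reverse=True)
    for seg in segments[max_segments:]:           # assumed rule: drop the smallest
        for y, x in seg:
            out[y][x] = 0
    return out
```

    The later step in the abstract, which uses the black pixel width distribution to remove externally contacted spots, is not covered by this sketch.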
  • Patent number: 5949555
    Abstract: An apparatus and method for efficiently compressing an image based upon an attribute of a partial area include a classification device for providing attribute classification information in which partial areas of an input image are classified into a variety of attributes and area information, and an image compression device for selecting an image compression method for each partial area in the above-described input image out of plural image compression methods and executing such compression.
    Type: Grant
    Filed: February 3, 1995
    Date of Patent: September 7, 1999
    Assignee: Canon Kabushiki Kaisha
    Inventors: Akihiko Sakai, Eiji Ohara, Michiko Hirayu, Yuka Nagai
  • Patent number: 5949906
    Abstract: A character string region extracting apparatus comprises an extracting section for extracting a plurality of primitives from image information in which a character and a graphic pattern other than the character are mixedly present, a character string candidate region forming section for generating character candidate regions from the primitives and connecting the character candidate regions, thereby forming at least one character string candidate region, a character recognizing section for subjecting the character candidate regions included in the character string candidate region to character recognition, and a character string region extracting section for extracting a character string region from the character string candidate region by the character recognition.
    Type: Grant
    Filed: December 7, 1995
    Date of Patent: September 7, 1999
    Assignee: Kabushiki Kaisha Toshiba
    Inventors: Hidekata Hontani, Shigeyoshi Shimotsuji
  • Patent number: 5930393
    Abstract: A technique for the enhancement of degraded document images to improve their display quality characteristics and image recognition accuracy. Images believed to be representative of the same symbol which occur in different positions over an image source (e.g., a facsimile page) are clustered together. Using the symbols within a particular cluster, an average character image outline for that cluster of symbols is derived and thereafter used to refine the matching of symbols within the cluster and to determine a final representative symbol for that cluster. The final representative symbols from the various resulting clusters are then used to replace all matching images throughout the image source. Advantageously, the display quality and recognition accuracy of the image source are enhanced after application of the present invention due to the resulting improvement of the images in the image source.
    Type: Grant
    Filed: August 11, 1997
    Date of Patent: July 27, 1999
    Assignee: Lucent Technologies Inc.
    Inventors: Tin Kam Ho, John D. Hobby
  • Patent number: 5920641
    Abstract: Linear structures are used to identify persons. In order to be able to combine a multiplicity of such linear structures in a database, their original images are analyzed and reconstructed using orthonormal basic functions. A preferred direction of the linear structure is determined for each pixel. A quality measure is used to evaluate the reliability of the analyzed data. Singularities (SI) and minutiae (MI) are extracted and stored.
    Type: Grant
    Filed: March 6, 1997
    Date of Patent: July 6, 1999
    Assignee: Siemens Nixdorf Informationssysteme Aktiengesellschaft
    Inventors: Birgit Ueberreiter, Joachim Dengler
  • Patent number: 5917941
    Abstract: After each complete stroke in a handwriting recognition process, a hypothesis is generated whether a word break is present between the previous stroke and the new stroke. This hypothesis is weighted with a probability of a word-break occurring between the strokes. This probability is determined from the geometrical relationships between characters. Subsequently, a word search is carried out on the basis of these weighted hypotheses, to identify the most likely candidates for the words represented by the written strokes. A user interface is provided that offers the user a limited list of alternative word recognitions for a group of characters. These recognitions undergo segmentation filtering, in accordance with the word breaks of the selected hypotheses, to present the user with only those alternatives having the same groupings of strokes.
    Type: Grant
    Filed: August 8, 1995
    Date of Patent: June 29, 1999
    Assignee: Apple Computer, Inc.
    Inventors: Brandyn Webb, Larry S. Yaeger
  • Patent number: 5912996
    Abstract: An input carrier sheet 12C for document distribution system 10 carries input symbols hand entered by the user into pre-existing constraint grids 12. The constraint grids may be printed in continuous tone or halftone. The print only partially covers the underlying carrier, permitting the exposed carrier to reflect light. The grids have sufficient pigment to be visible to the user, but insufficient pigment to form foreground pixels along with the hand-entered stroke when detected during the scanning. The signal (symbol)-to-noise (carrier) ratio is enhanced by reducing the pigment content of the constraint grids which increases the reflectivity of the grids. The S/N may be further enhanced by placing the strokes of the hand-entered symbols on top of the grid which occults some of the grid pigment. The S/N is further enhanced by highly reflective brightening agents in the grid print, and by aperture effect during scanning.
    Type: Grant
    Filed: March 6, 1997
    Date of Patent: June 15, 1999
    Assignee: Canon Kabushiki Kaisha
    Inventor: Roger D. Melen
  • Patent number: 5911005
    Abstract: The current invention is directed to further improving the character recognition process based upon the comparison of identification values in a sample image and a reference image by adjusting the identification value of the sample image. The adjustment is made based upon a predetermined feature of the sub-area or a mesh region of the images. The desired improvement in accuracy is obtained especially for recognizing handwritten characters.
    Type: Grant
    Filed: November 20, 1995
    Date of Patent: June 8, 1999
    Assignee: Ricoh Company, Ltd.
    Inventor: Yukinaka Uchiyama
  • Patent number: 5907630
    Abstract: An image extraction system includes a connected pattern extracting part for extracting partial patterns respectively having connected pixels from an image which is formed by a block frame having a table format and including one-character frames or a free format frame, characters, graphics or symbols, a one-character frame extracting part for extracting one-character frames from the image based on the partial patterns extracted by the connected pattern extracting part, a straight line extracting part for extracting straight lines from the partial patterns which are extracted by the connected pattern extracting part and is eliminated of the one-character frames by the one-character frame extracting part, a frame detecting part for detecting straight lines forming the frame from the straight lines extracted by the straight line extracting part, and a frame separating part for separating the straight lines detected by the frame detecting part from the partial patterns so as to extract the characters, graphics or
    Type: Grant
    Filed: August 26, 1996
    Date of Patent: May 25, 1999
    Assignee: Fujitsu Limited
    Inventors: Satoshi Naoi, Atsuko Asakawa, Maki Yabuki, Yoshinobu Hotta