Segmenting Individual Characters Or Words Patents (Class 382/177)
-
Patent number: 6526170
Abstract: A character recognition system is disclosed. In a feature extraction parameter storage section 22, a transformation matrix for reducing the number of dimensions of feature parameters and a codebook for quantization are stored. In an HMM storage section 23, a constitution and parameters of a Hidden Markov Model (HMM) for character string expression are stored. A feature extraction section 32 scans a word image given from an image storage means from left to right in a predetermined cycle with a slit sufficiently narrower than the character width, and thus outputs a feature symbol at each predetermined timing. A matching section 33 matches a feature symbol row and a probability maximization HMM state, thereby recognizing the character string.
Type: Grant
Filed: December 13, 1994
Date of Patent: February 25, 2003
Assignee: NEC Corporation
Inventor: Shinji Matsumoto
-
Patent number: 6519363
Abstract: This invention discloses a method for automatically segmenting and recognizing Chinese character strings continuously written by a user in a handwritten Chinese character processing system, comprising the steps of: creating a geometry model and a language model; finding all potential segmentation schemes in the Chinese character strings continuously written by a user based on the associated timing information and said geometry model; recognizing the groups of strokes as defined by each of the potential segmentation schemes and computing the probability characterizing the exactness of recognition results; correcting the probability characterizing the exactness of recognition results by said language model; and selecting the recognition result and the corresponding segmentation scheme having the maximum probability value.
Type: Grant
Filed: January 12, 2000
Date of Patent: February 11, 2003
Assignee: International Business Machines Corporation
Inventors: Hui Su, Donald T. Tang, Qian Ying Wang
-
Patent number: 6501856
Abstract: A scheme for detecting telop character displaying frames in video image which is capable of suppressing erroneous detection of frames without telop characters due to instability of image features is disclosed. In this scheme, each input frame constituting the video data is entered, and whether each input frame is a telop character displaying frame in which telop characters are displayed or not is judged, according to edge pairs detected from each input frame by detecting each two adjacent edge pixels for which intensity gradient directions are opposite on some scanning line used in judging an intensity gradient direction at each edge pixel and for which an intensity difference between said two adjacent edge pixels is within a prescribed range as one edge pair, edge pixels being pixels at which an intensity value locally changes by at least a prescribed amount with respect to a neighboring pixel among a plurality of pixels constituting each input frame.
Type: Grant
Filed: September 28, 2001
Date of Patent: December 31, 2002
Assignee: Nippon Telegraph and Telephone Corporation
Inventors: Hidetaka Kuwano, Hiroyuki Arai, Shoji Kurakake, Kenji Ogura, Toshiaki Sugimura, Minoru Mori, Minoru Takahata
-
Publication number: 20020172422
Abstract: Image size converter 4 converts the size of the image data stored in image input part 1 to an arbitrary size and stores the converted data. Image enhancer 5 uses the character frame design data stored in character frame information memory 3 to extract, from the image stored in image size converter 4, an image of a region containing character frames, and enhances and stores this extracted image. Image outline detector 6 forms an outline image from the image obtained by image enhancer 5. Character frame center detector 7 uses the outline image to detect the coordinates of the centers of the character frames of the input image data. Character frame remover 8 uses the character frame center coordinates and the character frame design data to remove the character frames, and outputs the result from character image output part 9.
Type: Application
Filed: May 14, 2002
Publication date: November 21, 2002
Applicant: NEC Corporation
Inventor: Daisuke Nishiwaki
-
Patent number: 6466694
Abstract: A processing device performs region identification of an input image, and then performs an intra-region recognition process. The type code of each region and the individual code of a recognition result are then displayed, so that a user can modify both of the results of the region identification and the recognition process at one time. Furthermore, the processing device displays an original image close to the recognition result. If no correct answer exists among recognition candidates, code is added to the original image, and the original image with the code added is handled as a recognition result.
Type: Grant
Filed: April 16, 1998
Date of Patent: October 15, 2002
Assignee: Fujitsu Limited
Inventors: Hiroshi Kamada, Katsuhito Fujimoto, Koji Kurokawa
-
Patent number: 6466211
Abstract: Data visualization apparatuses, computer-readable mediums, computer data signals embodied in a transmission medium, data visualization methods, and digital computer data visualization methods are provided. According to one aspect of the present invention, a data visualization apparatus includes an image device configured to provide a visual image; and digital processing circuitry coupled with the image device and configured to access data including a plurality of themes, to generate a thematic illustration corresponding to the themes and having a plurality of outer contour lines which are spaced at varying distances relative to a reference line, and to control the image device to depict the thematic illustration.
Type: Grant
Filed: October 22, 1999
Date of Patent: October 15, 2002
Assignee: Battelle Memorial Institute
Inventors: Susan L. Havre, Elizabeth G. Hetzler, Lucy T. Nowell, Paul D. Whitney, Feng Gao, James J. Thomas, Louis M. Martucci, W. Michelle Harris
-
Patent number: 6459810
Abstract: An exemplary embodiment of the invention is a method for forming variant search strings. The method includes receiving a search string and parsing the search string to locate a mistaken search string character. A mistaken search string character is a character which is confused with other characters. A variant search string is formed in response to a presence of a mistaken search string character in the search string. The search string and variant search string may then be used to search a database. Another exemplary embodiment of the invention is a system for forming variant search strings. The system includes a user interface for receiving a search string. A variant search string generator parses the search string to locate a mistaken search string character. The mistaken search string character is a character which is confused with other characters. The variant search string generator forms a variant search string in response to a presence of a mistaken search string character in the search string.
Type: Grant
Filed: September 3, 1999
Date of Patent: October 1, 2002
Assignee: International Business Machines Corporation
Inventor: Christopher T. Cring
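
The variant-string idea in the abstract above can be illustrated with a minimal Python sketch. The confusion table and the function name `variant_search_strings` are hypothetical; the abstract does not say which characters are treated as confusable or how variants are enumerated, so this expands every confusable position exhaustively.

```python
from itertools import product

# Hypothetical confusion table (an assumption, not taken from the patent).
CONFUSABLE = {"0": ["O"], "O": ["0"], "1": ["l", "I"], "l": ["1"], "I": ["1"]}


def variant_search_strings(search_string, confusable=CONFUSABLE):
    """Return the original search string followed by its variants.

    Each character that appears in the confusion table is replaced, in turn,
    by every character it may be mistaken for.
    """
    # Every position offers the original character first, then its look-alikes.
    options = [[ch] + confusable.get(ch, []) for ch in search_string]
    return ["".join(combo) for combo in product(*options)]
```

A caller would then run the database search once per returned string; a string with no confusable characters yields only itself.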
-
Patent number: 6456739
Abstract: A character image is inputted by use of a scanner, and recognized. The resultant character string of such recognition is represented on a display. The image serving as recognition source of the character designated on the display screen thereof, and the image in the vicinity of such image are represented. A character frame, which can discriminate the character image serving as recognition source, is edited in order to designate a new character image. This image and the inputted character information are registered on a character recognition dictionary correspondingly. Thereafter, the character recognition is carried out even with the utilization of such newly registered character. As a result, the recognition rate of the character recognition improves progressively.
Type: Grant
Filed: June 18, 1996
Date of Patent: September 24, 2002
Assignee: Canon Kabushiki Kaisha
Inventor: Hiroaki Ikeda
-
Publication number: 20020118876
Abstract: Character (or letter) information is extracted from source information, word information is extracted from the character information, and a database is created of the word information. Thereby, the created database is adapted for the technical field of the user or a field of interest to the user.
Type: Application
Filed: October 22, 2001
Publication date: August 29, 2002
Inventors: Hidetaka Magoshi, Nobuo Sasaki
-
Publication number: 20020114515
Abstract: A key word is first and automatically extracted from a character string group to be recognized, and entered. Then, a character is recognized by segmenting an individual character from a character string image to be recognized, and a character string corresponding to the extracted/entered key word is extracted. Then, a word area delimited by a key word is extracted from the character string image, and a word is recognized. Furthermore, a word recognition result is verified, and a final character string recognition result is output.
Type: Application
Filed: December 18, 2001
Publication date: August 22, 2002
Applicant: Fujitsu Limited
Inventors: Yoshinobu Hotta, Katsuhito Fujimoto, Satoshi Naoi, Misako Suwa
-
Patent number: 6408094
Abstract: A system and method in accordance with the present invention includes a scanning assembly and a storage device coupled to a programmed computer with a set of instructions for carrying out an assessment of a document image. The system and method operate by: processing the document image to obtain one or more attributes related to the geometrical integrity of the document image; selecting a threshold value from a database for each of the obtained attributes; and then comparing each of the obtained attributes against the threshold value selected for the obtained attribute to determine a difference for each and then evaluating one or more of the differences using predetermined criteria to provide evaluation results of the geometrical integrity of the document image.
Type: Grant
Filed: November 4, 1997
Date of Patent: June 18, 2002
Assignee: Eastman Kodak Company
Inventors: Alexander David Mirzaoff, Thaddeus Francis Pawlicki
-
Publication number: 20020071606
Abstract: A character recognition section generates character recognition result information resulting from character recognition of image information. An image information cutout section cuts out character recognition image information, corresponding to an area as to which the character recognition is performed, from the image information. A recognition result generation section generates recognition result information which is composed of the character recognition result information and the character recognition image information. A recognition result transmission section transmits the recognition result information to other terminals using electronic mail. As a result, an information communications apparatus of the invention can make transmissions of information to the wide area, without increasing the network load, which is to be used for the determination of whether or not a character recognition has been accurately performed.
Type: Application
Filed: November 29, 2001
Publication date: June 13, 2002
Applicant: MATSUSHITA GRAPHIC COMMUNICATION SYSTEMS, INC.
Inventors: Shinichi Watanabe, Hideki Honma
-
Patent number: 6366908
Abstract: A keyfact-based text retrieval method and a keyfact-based text index method that describes the formalized concept of a document by a pair comprising an object that is the head and a property that is the modifier and uses the information described by the pairs as index information for efficient document retrieval. A keyfact-based text retrieval system includes keyfact extracting, keyfact indexing, and keyfact retrieving. The keyfact extracting analyzes a document collection and a query and extracts keywords and keyfacts. The keywords do not have part-of-speech ambiguity and the keyfacts are extracted from the keywords. The keyfact indexing calculates the frequency of the keyfacts and generates a keyfact list of the document collection for a keyfact index structure. The keyfact retrieving receives a keyfact of the query and keyfacts of the document collection, defines a keyfact-based retrieval model in consideration of a weight factor of the keyfact pattern, and generates a retrieval result.
Type: Grant
Filed: December 30, 1999
Date of Patent: April 2, 2002
Assignee: Electronics and Telecommunications Research Institute
Inventors: Kyung Taek Chong, Myung-Gil Jang, MiSeon Jun, Se Young Park
-
Publication number: 20020012462
Abstract: An image processing method or device invented to reduce the ratio of erroneously recognized non-character elements in optical character recognition (OCR) regarding a color document that includes character images and other types of images, wherein the extracted character image data is checked to determine whether a color change exists in each character image, and wherein if no color change exists, the character image data is converted into character code data, but where a color change does exist, the character image data is not converted into character code data.
Type: Application
Filed: June 4, 2001
Publication date: January 31, 2002
Inventor: Yoko Fujiwara
-
Publication number: 20020009226
Abstract: A handwritten character recognition apparatus has a character string input area of a size that allows a user to handwrite a plurality of characters thereon using a stylus. A coordinate detection unit extracts a coordinate string for each stroke that forms the handwritten character string. An input completion judgement unit judges an immediately preceding handwritten character string to be complete if a time difference between a last coordinate of an immediately preceding stroke and a first coordinate of a stroke being input is at least a predetermined time, when the first coordinate of the stroke is detected in a first area of the character string input area. A character segmentation unit segments a stroke string for each character from all the strokes of the previously input handwritten character string, from which a character recognition unit recognizes each character and outputs a character string which is the recognition result.
Type: Application
Filed: April 19, 2001
Publication date: January 24, 2002
Inventors: Ichiro Nakao, Yoshikatsu Ito
-
Patent number: 6339651
Abstract: A method and system for recognizing the characters on surfaces where an alphanumeric identification code (“ID” for short) may be present, such as a license plate. The present system is particularly adapted for situations where visual distortions can occur, and utilizes a highly robust method for recognizing the characters of an ID. Multiple character recovery schemes are applied to account for a variety of conditions to ensure high accuracy in identifying the ID. Accuracy is greatly enhanced by taking a comprehensive approach where multiple criteria are taken into consideration before any conclusions are drawn. Special considerations are given to recognizing the ID as a whole and not just the individual characters.
Type: Grant
Filed: February 25, 1998
Date of Patent: January 15, 2002
Assignee: Kent Ridge Digital Labs
Inventors: Qi Tian, Kong Wah Wan, Karianto Leman, Chade Meng Tan, Chun Biao Guo
-
Patent number: 6327385
Abstract: A character segmentation system for segmenting out a character from a string of characters which are in touch with each other, which is capable of being executed on a small size hardware resource without influence of variation of touching condition due to difference of character font, comprises an image storing unit 110 for storing an electronic image of character string obtained by such means as optical scanning, a partial pattern dictionary 122 for storing partial pattern shapes used as features for specifying fonts of character, a partial pattern detecting unit 121 for extracting areas of the image of character string, which coincide with a partial pattern, a character font determining unit 123 for determining the font of character on the basis of positions of the areas of the image of character string, which coincide with the partial pattern, and the number of the areas, a feature extraction inhibited area dictionary 132 for storing areas in which feature extraction processing for respective fonts of ch...
Type: Grant
Filed: November 10, 1998
Date of Patent: December 4, 2001
Assignee: NEC Corporation
Inventor: Masaaki Kamitani
-
Patent number: 6327382
Abstract: It is an object of the present invention to appropriately extract areas for character recognition from a color image. It is another object of the present invention to separate and extract characters from a background color in a color image if the background of the manuscript is not white and if the characters are printed in a portion having a color that is not commonly used all over the image. To achieve these objects, this invention binarizes an input color image in a plurality of stages and extracts areas from binary images obtained in each stage to enable areas and text sections to be appropriately extracted despite the unknown colors of the characters and background contained in the input color image.
Type: Grant
Filed: January 29, 1999
Date of Patent: December 4, 2001
Assignee: Canon Kabushiki Kaisha
Inventors: Kitahiro Kaneda, Toshiaki Yagasaki
-
Patent number: 6327384
Abstract: In a projection means black pixel histograms of a binary stationary image are generated in both the vertical and the horizontal direction. In a text type judgment means, in accordance with these histograms, a determination is made of whether the image is vertical text or horizontal text. Based on the result of this determination, a pattern block extraction means extracts either a column or a row from the image. The block is further projected and divided into smaller blocks. Then projection is again applied to these divided blocks and patterns are extracted by a pattern extraction means. A judgment is made as to whether or not joining of the extracted patterns is to be performed and, if joining is required, they are joined by a pattern joining means and finally the offsets of all the extracted patterns are calculated, whereupon data (of extracted patterns) are sent to a pattern matching process.
Type: Grant
Filed: November 13, 1997
Date of Patent: December 4, 2001
Assignee: NEC Corporation
Inventors: Kouichirou Hirao, Keiji Yamada, Takahiro Hongu, Takashi Mochizuki, Mitsutoshi Arai
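
The projection step described in the abstract above can be sketched in Python as follows. The abstract does not state the criterion used to decide between vertical and horizontal text, so the gap-counting rule here is an assumed heuristic, and the names `projection_histograms`, `count_gaps`, and `text_direction` are hypothetical.

```python
def projection_histograms(image):
    """image is a list of rows; 1 marks a black pixel, 0 a white pixel."""
    row_hist = [sum(row) for row in image]         # black pixels per row
    col_hist = [sum(col) for col in zip(*image)]   # black pixels per column
    return row_hist, col_hist


def count_gaps(hist):
    """Count runs of empty bins that lie between two non-empty bins."""
    gaps, in_gap, seen_ink = 0, False, False
    for value in hist:
        if value > 0:
            if in_gap and seen_ink:
                gaps += 1
            seen_ink, in_gap = True, False
        elif seen_ink:
            in_gap = True
    return gaps


def text_direction(image):
    """Assumed heuristic: blank bands between text lines show up as gaps in
    the row histogram for horizontal text, and in the column histogram for
    vertical text."""
    row_hist, col_hist = projection_histograms(image)
    return "horizontal" if count_gaps(row_hist) >= count_gaps(col_hist) else "vertical"
```

The same histograms can be reused on each extracted row or column to split it into smaller blocks, which is the repeated projection the abstract mentions.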
-
Patent number: 6289109
Abstract: An apparatus for determining the location and content of data blocks on a mailpiece includes a computer connected to a structure for obtaining a digital bit map image of an outer surface of a mailpiece. The computer includes structure programmed for: finding each run of a plurality of black bits of each scan line of the bit map image and determining if any bit thereof neighbors at least one black bit of another scan line; combining the found run with each neighboring bit to form a piece; assigning a descriptive value to a block having at least one piece and comparing the descriptive value to a list of values to determine which type of data block the block having the descriptive value is.
Type: Grant
Filed: December 29, 1993
Date of Patent: September 11, 2001
Assignee: Pitney Bowes Inc.
Inventors: Ronald E. Gocht, Leon A. Pintsov
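
The run-finding and piece-forming steps in the abstract above resemble run-based connected-component grouping, which the following Python sketch illustrates. It is not the patented procedure: the 8-connectivity rule, the union-find bookkeeping, and the names `find_runs` and `group_runs_into_pieces` are assumptions, and the later block classification step is omitted.

```python
def find_runs(scan_line):
    """Return (start, end) pairs, end exclusive, for each run of black (1) bits."""
    runs, start = [], None
    for x, bit in enumerate(scan_line):
        if bit and start is None:
            start = x
        elif not bit and start is not None:
            runs.append((start, x))
            start = None
    if start is not None:
        runs.append((start, len(scan_line)))
    return runs


def group_runs_into_pieces(bitmap):
    """Group runs that neighbor a run on the previous scan line.

    Each run is stored as (line, start, end). Diagonal contact counts as
    neighboring (8-connectivity, an assumption).
    """
    parent = {}

    def find(run):
        while parent[run] != run:
            parent[run] = parent[parent[run]]  # path halving
            run = parent[run]
        return run

    previous = []
    for y, line in enumerate(bitmap):
        current = [(y, s, e) for s, e in find_runs(line)]
        for run in current:
            parent[run] = run
            for other in previous:
                # Column ranges overlap or touch diagonally.
                if run[1] <= other[2] and other[1] <= run[2]:
                    parent[find(run)] = find(other)
        previous = current

    pieces = {}
    for run in list(parent):
        pieces.setdefault(find(run), []).append(run)
    return sorted(sorted(piece) for piece in pieces.values())
```

Each returned piece is a list of runs, from which a bounding box or other descriptive value could be computed.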
-
Patent number: 6282315
Abstract: A method for entering data into a computer generated form including field areas of preselected height and width includes the steps of converting handwritten characters of arbitrary height which may be greater than the preselected height formed on the screen to computer generated characters and displaying the computer generated characters within a field area. Additionally, handwritten characters to be entered into several field areas are grouped, converted, and displayed in selected field areas.
Type: Grant
Filed: July 2, 1997
Date of Patent: August 28, 2001
Assignee: Samsung Electronics, Ltd.
Inventor: Monty L. Boyer
-
Patent number: 6269188
Abstract: The present invention is a computer-implemented method for calculating word accuracy. Word grouping accuracy values (260) are calculated (212) by using the character accuracy values (250) calculated by an OCR program present in a computer system. The present invention preferably uses these character accuracy values (250) to create a word grouping accuracy value (260). Various methods are employed to calculate the word accuracy (260), including binarizing the character accuracy values (250), modified averaging of the character accuracy values (250), and creating fuzzy visual displays of word grouping accuracy values (260). The calculated word grouping accuracy values (260) are then adjusted based upon known OCR strengths and weaknesses, and based upon comparisons to stored word lists and the application of language rules. In a system with multiple character recognition techniques, the system can compare the accuracy values (260) of different versions of the word groupings to find the most accurate version.
Type: Grant
Filed: March 12, 1998
Date of Patent: July 31, 2001
Assignee: Canon Kabushiki Kaisha
Inventor: Hamadi Jamali
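
As a rough illustration of deriving a word-level value from character-level values, the Python sketch below shows a binarized score and one possible weighted average. The abstract does not define its "modified averaging", so the down-weighting rule, the 0.9 threshold, the 0.5 penalty, and all function names are assumptions made only for this example.

```python
def binarized_word_accuracy(char_accuracies, threshold=0.9):
    """Return 1.0 if every character clears the (assumed) threshold, else 0.0."""
    return 1.0 if all(a >= threshold for a in char_accuracies) else 0.0


def averaged_word_accuracy(char_accuracies, threshold=0.9, weak_penalty=0.5):
    """Illustrative stand-in for a modified average: characters below the
    threshold contribute only a fraction of their accuracy value."""
    adjusted = [a if a >= threshold else a * weak_penalty for a in char_accuracies]
    return sum(adjusted) / len(adjusted)


def most_accurate_version(versions):
    """versions maps each candidate word to its list of character accuracies."""
    return max(versions, key=lambda word: averaged_word_accuracy(versions[word]))
```

With several recognition engines, `most_accurate_version` mirrors the comparison step in the abstract by keeping the candidate word whose score is highest.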
-
Patent number: 6249604
Abstract: A method for determining the boundaries of a symbol or word string within an image, including the steps of determining page orientation, isolating symbol strings from adjacent symbol strings, and establishing a set of boundaries or references with respect to which measurements about, or further processing of, the symbol string may be made.
Type: Grant
Filed: February 14, 1994
Date of Patent: June 19, 2001
Assignee: Xerox Corporation
Inventors: Daniel P. Huttenlocher, Peter C. Wayner, Michael J. Hopcroft
-
Patent number: 6249353
Abstract: The image editing apparatus of the present invention reads a text image having a plurality of character line images along a predetermined direction, makes a histogram expressing the distribution characteristics of said text image, detects each said character line image having a predetermined size based on said histogram, and performs an editing process for each said character line image having said predetermined size.
Type: Grant
Filed: August 7, 1996
Date of Patent: June 19, 2001
Assignee: Minolta Co., Ltd.
Inventors: Akinori Yoshida, Shigeru Sawada, Takao Fujiwara
-
Patent number: 6246794
Abstract: A character reading method has enhanced character segmentation accuracy and character string recognition accuracy for reading correctly hand-written addresses on postal matters. The method extracts provisional character patterns from image information of the address character string (step 206), creates a table 219 of tentative character patterns and implements the character classification for the tentative character patterns (step 207), extracts, specifically for characters of the street number portion of the address character string, periphery information (vertical and horizontal lengths, vertical/horizontal length ratio, pattern spacings, etc.) of tentative character patterns (step 212), and segments the character string into characters accurately based on the information (step 215).
Type: Grant
Filed: December 11, 1996
Date of Patent: June 12, 2001
Assignee: Hitachi, Ltd.
Inventors: Tatsuhiko Kagehiro, Masashi Koga, Hiroshi Sako, Hiromichi Fujisawa, Hisao Ogata, Yoshihiro Shima, Shigeru Watanabe, Masato Teramoto
-
Patent number: 6212294
Abstract: An image processor which receives an image and assigns position of an arbitrary pixel in the image. An image block is extracted from the received image, and start and end positions of a designated area of the received image are acquired. A designated image block in the extracted image block is designated for processing in accordance with the start and end positions of the designated area of the received image.
Type: Grant
Filed: February 28, 1997
Date of Patent: April 3, 2001
Assignee: Canon Kabushiki Kaisha
Inventor: Hiroaki Ikeda
-
Patent number: 6189006
Abstract: A full-text data base retrieving device retrieves a data base in accordance with a query. A full-text index has character location information representative of location of each of key character sequences of N characters that appear in the data base, where N is a positive integer. A query memory memorizes the query as a retrieval key character sequence. A separating section separates the retrieval key character sequence into a plurality of retrieval key character sequences of N characters to extract contexts as extracted contexts from the retrieval key character sequence in accordance with the retrieval key character sequences. A context classifying section classifies the extracted contexts into classified contexts having the classification numbers, respectively. An index retrieving section retrieves the full-text index in accordance with the sorts of the retrieval key character sequences and the classified contexts to read the character location information as a retrieval result out of the full-text index.
Type: Grant
Filed: March 2, 1999
Date of Patent: February 13, 2001
Assignee: NEC Corporation
Inventor: Toshikazu Fukushima
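
The N-character index at the core of the abstract above can be illustrated with a plain N-gram inverted index in Python. This sketch covers only the index and the splitting of the query into N-character sequences; the context extraction and classification steps are omitted, and the names `build_ngram_index` and `search` are hypothetical.

```python
from collections import defaultdict


def build_ngram_index(text, n=2):
    """Map every N-character sequence in text to the positions where it starts."""
    index = defaultdict(list)
    for pos in range(len(text) - n + 1):
        index[text[pos:pos + n]].append(pos)
    return index


def search(index, query, n=2):
    """Return the start positions of query, found by splitting it into
    N-character sequences and intersecting their shifted position lists."""
    if len(query) < n:
        raise ValueError("query must contain at least n characters")
    candidates = None
    for offset in range(len(query) - n + 1):
        gram = query[offset:offset + n]
        # Shift each hit back so it refers to where the whole query would start.
        positions = {p - offset for p in index.get(gram, [])}
        candidates = positions if candidates is None else candidates & positions
    return sorted(candidates)
```

For example, with `n=2` the query "cab" is split into "ca" and "ab", and only positions where "ab" starts one character after "ca" survive the intersection.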
-
Patent number: 6188790
Abstract: An apparatus for recognizing characters read by a reading unit. A circumscribing rectangle of a read character is formed, and the degree of narrowness of that circumscribing rectangle is acquired. Characters having a degree of narrowness that is equal to or greater than a predetermined value are selected and blank areas are added to the circumscribing rectangle to yield a character area with a corrected degree of narrowness. The character is normalized by converting the character area to a specified size, and is recognized based on the normalized character. It is therefore possible to normalize even characters significantly elongated vertically or horizontally for easier recognition and to group their character patterns.
Type: Grant
Filed: February 26, 1997
Date of Patent: February 13, 2001
Assignee: Tottori Sanyo Electric Ltd.
Inventors: Takatoshi Yoshikawa, Hiromitsu Kawajiri, Hiroshi Horii, Junji Tanaka
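
The narrowness correction in the abstract above can be sketched in Python as padding a cropped glyph with blank pixels. The abstract does not define the degree of narrowness or the corrected value, so this sketch assumes it is the ratio of the longer side to the shorter side and pads until that ratio drops to the threshold; `pad_narrow_character` and `max_narrowness` are hypothetical names, and the final resize to a fixed size is left out.

```python
import math


def pad_narrow_character(glyph, max_narrowness=2.0):
    """glyph is a list of rows of 0/1 pixels, cropped to its bounding rectangle.

    If the rectangle is at least max_narrowness times longer in one direction
    than the other, blank (0) pixels are added evenly on both sides of the
    short dimension.
    """
    height, width = len(glyph), len(glyph[0])
    if max(height, width) / min(height, width) < max_narrowness:
        return [row[:] for row in glyph]  # already acceptable; return a copy

    if height > width:
        # Tall, thin character such as "1" or "l": pad left and right.
        target = math.ceil(height / max_narrowness)
        left = (target - width) // 2
        right = target - width - left
        return [[0] * left + row + [0] * right for row in glyph]

    # Wide, flat character such as a dash: pad top and bottom.
    target = math.ceil(width / max_narrowness)
    top = (target - height) // 2
    bottom = target - height - top
    blank = [0] * width
    return ([blank[:] for _ in range(top)]
            + [row[:] for row in glyph]
            + [blank[:] for _ in range(bottom)])
```

Padding before scaling keeps a thin stroke from being stretched to fill the whole normalized cell, which is the effect the abstract attributes to the correction.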
-
Patent number: 6185330
Abstract: A pattern matching encoding device for executing pattern matching encoding of binary still images and a pattern matching decoding device corresponding to the pattern matching encoding device are proposed. In the pattern matching encoding, each input pattern extracted from the input image is matched against library patterns in the library, and the input pattern is encoded using a matched library pattern as a reference pattern if the matched library pattern is found. The pattern matching encoding device comprises a pattern segmentation section for segmenting each of selected library patterns and the input pattern into two or more parts and thereby generating segmented library patterns and segmented input patterns, a matching section for matching each of the segmented input patterns against corresponding segmented library patterns, and a pattern combination section for generating a new library pattern by combining the segmented library patterns each of which has matched one of the segmented input patterns.
Type: Grant
Filed: March 18, 1998
Date of Patent: February 6, 2001
Assignee: NEC Corporation
Inventor: Kouichirou Hirao
-
Patent number: 6185338
Abstract: A character recognition method for recognizing characters on an article having multiple character-bearing areas, such as a license plate, first involves obtaining image data from an image of the article. The method then assigns at least one parameter to a selected character-bearing area on the article. The method then attempts to obtain a correct frame which expresses the correct positional relationship between the selected character-bearing area on the article with other character-bearing areas of the article, and then uses that correct frame to perform character recognition with respect to each of the character-bearing areas of the article. To obtain the correct frame, the invention compares the image data of the article with plural candidate frames. The plural candidate frames are calculated using the predetermined positional correlation between (1) the selected character-bearing area [as represented by the at least one parameter] and (2) other character-bearing areas of the article.
Type: Grant
Filed: March 21, 1997
Date of Patent: February 6, 2001
Assignee: Sharp Kabushiki Kaisha
Inventor: Mitsuaki Nakamura
-
Patent number: 6137906
Abstract: A reading system includes a computer and a mass storage device including software comprising instructions for causing a computer to accept an image file generated from optically scanning an image of a document. The software converts the image file into a converted text file that includes text information, and positional information associating the text with the position of its representation in the image file. The reading system has the ability therefore to display the image representation of the scanned image on a computer monitor and permit a user to control operation of the reader with respect to the displayed image representation of the document by using the locational information associated with the converted text file. Also described are techniques for dual highlighting spoken text and a technique for determining the nearest word to a position selected by use of a mouse or other pointing device operating on the image representation as displayed on the monitor.
Type: Grant
Filed: June 27, 1997
Date of Patent: October 24, 2000
Assignee: Kurzweil Educational Systems, Inc.
Inventor: Mark S. Dionne
-
Patent number: 6115506
Abstract: The invention provides a character recognition apparatus wherein wrong correction in slant correction processing of a character string is minimized to minimize erroneous recognition. A character slant estimation section receives an image, calculates slant angle candidates and evaluation values of them, and calculates a slant angle estimated value based on the evaluation values. An estimated value evaluation section receives the evaluation values, calculates an information amount of the evaluation values or the like, and outputs it as a validity of the slant angle estimated value. A slant correction section receives and normalizes the validity to a value from 0 to 1 and determines the normalized value as an execution coefficient for slant correction.
Type: Grant
Filed: May 4, 1998
Date of Patent: September 5, 2000
Assignee: NEC Corporation
Inventor: Takafumi Koshinaka
-
Patent number: 6108444
Abstract: A method and system of recognizing handwritten words in scanned documents, wherein by processing a document containing handwriting, features for word localization are extracted from handwritten words contained in said document through basis points taken from a single curve of text lines. The method is independent of page orientation, and does not assume that the individual lines of handwritten text are parallel, and the method does not require that word regions be aligned with text line orientation wherein intra-word statistics are derived from sample pages rather than using a fixed threshold. The method has applications in digital libraries, handwriting tokenization, document management and OCR systems.
Type: Grant
Filed: September 29, 1997
Date of Patent: August 22, 2000
Assignee: Xerox Corporation
Inventor: Tanveer F. Syeda-Mahmood
-
Patent number: 6104833
Abstract: An environment recognizing unit extracts the first through N-th states from an input image and calls data corresponding to the first through N-th states from the first through N-th pattern recognizing units to perform a recognizing unit.
Type: Grant
Filed: January 3, 1997
Date of Patent: August 15, 2000
Assignee: Fujitsu Limited
Inventors: Satoshi Naoi, Misako Suwa, Yoshinobu Hotta
-
Patent number: 6081616
Abstract: A method for cutting character images from a line segment of pixel image data includes a first cutting layer step in which nontouching and nonoverlapping characters are cut from a line segment, and a second cutting layer step in which touching characters are cut from the line segment.
Type: Grant
Filed: July 18, 1997
Date of Patent: June 27, 2000
Assignee: Canon Kabushiki Kaisha
Inventors: Mehrzad R. Vaezi, Christopher Allen Sherrick
-
Patent number: 6064769Abstract: A character extraction apparatus is provided for extracting character data for each character from a text image which is represented by first pixels corresponding to character images and second pixels corresponding to background images. The character extraction apparatus comprises a character row detecting means for detecting character rows from the text image and obtaining position data of each character row; a pixel array extracting means for extracting arrays of continuous first pixels in an area specified by the character row position data and computing position data of each of the arrays of continuous first pixels; a character array linking means for linking the arrays of continuous first pixels in the area based on the position data of the arrays of continuous first pixels; and a character extracting means for recognizing each set of arrays of continuous first pixels linked by the character array linking means as a character and outputting character data.Type: GrantFiled: November 5, 1998Date of Patent: May 16, 2000Inventors: Ichiro Nakao, Mariko Takenouchi, Saki Takakura, Satoshi Emura
-
Patent number: 6064767Abstract: A computer-implemented process identifies an unknown language used to create a document. A set of training documents is defined in a variety of known languages and formed from a variety of text styles. Black and white electronic pixel images are formed of text material forming the training documents and the document in the unknown language. A plurality of line strokes are defined from the black pixels and point features are extracted from the strokes that are effective to characterize each of the languages. Point features from the unknown language are compared with point features from the known languages to identify one of the known languages that best represents the unknown language.Type: GrantFiled: January 16, 1998Date of Patent: May 16, 2000Assignee: Regents of the University of CaliforniaInventors: Douglas W. Muir, Timothy R. Thomas
-
Patent number: 6035061Abstract: A title extracting apparatus scans black pixels in a document image and extracts rectangular regions that circumscribe connected regions of the black pixels as character rectangles. In addition, the title extracting apparatus unifies a plurality of character rectangles that adjoin and extracts rectangular regions that circumscribe the character rectangles as character string rectangles. Thereafter, the title extracting apparatus calculates points with the likelihood of being a title corresponding to attributes such as an underline attribute, a frame attribute, and a ruled line attribute of each character string rectangle, the positions of the character string rectangles in the document image, and the mutual position relation and extracts a character string rectangle with the highest points as a title rectangle. In the case of a tabulated document, the title extracting apparatus can extract a title rectangle from the inside of the table.Type: GrantFiled: August 7, 1996Date of Patent: March 7, 2000Assignee: Fujitsu LimitedInventors: Yutaka Katsuyama, Satoshi Naoi
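The first two steps of this abstract (character rectangles around connected black pixels, then unified character string rectangles) can be sketched as follows. This assumes a binary image given as a list of rows with 1 for black, 8-connectivity, and a simple horizontal gap threshold for "adjoining"; none of these details come from the patent, and the title scoring step is not covered.

```python
from collections import deque


def character_rectangles(img):
    """Bounding boxes (x0, y0, x1, y1), inclusive, of 8-connected black regions."""
    h = len(img)
    w = len(img[0]) if img else 0
    seen = [[False] * w for _ in range(h)]
    boxes = []
    for y in range(h):
        for x in range(w):
            if not img[y][x] or seen[y][x]:
                continue
            seen[y][x] = True
            queue = deque([(x, y)])
            x0 = x1 = x
            y0 = y1 = y
            while queue:
                cx, cy = queue.popleft()
                x0, x1 = min(x0, cx), max(x1, cx)
                y0, y1 = min(y0, cy), max(y1, cy)
                # Visit the 8 neighbours that stay inside the image.
                for ny in range(max(cy - 1, 0), min(cy + 2, h)):
                    for nx in range(max(cx - 1, 0), min(cx + 2, w)):
                        if img[ny][nx] and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((nx, ny))
            boxes.append((x0, y0, x1, y1))
    return boxes


def string_rectangles(boxes, max_gap=2):
    """Merge boxes that overlap vertically and lie within max_gap columns.

    Single left-to-right pass; max_gap is an assumed threshold.
    """
    merged = []
    for box in sorted(boxes):
        if merged:
            px0, py0, px1, py1 = merged[-1]
            x0, y0, x1, y1 = box
            overlaps = y0 <= py1 and py0 <= y1
            if overlaps and x0 - px1 - 1 <= max_gap:
                merged[-1] = (px0, min(py0, y0), max(px1, x1), max(py1, y1))
                continue
        merged.append(box)
    return merged
```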
-
Patent number: 6014460Abstract: A character strings reading device for reading character strings from input image data comprises cut-out recognition means for cutting out a segment corresponding to one character from the image data to perform individual character recognition every segment, a recognition result buffer for storing a recognition result of the cut-out recognition means, word searching means for searching a word string candidate corresponding to a combination of character candidates in the recognition result buffer, a word string candidate buffer for storing a search result of the word searching means, check portion determining means for determining a check target portion and a presumed character string of the check target portion on the basis of the result in the word string candidate buffer, and check means for judging the possibility of existence of the presumed character string on the check portion.Type: GrantFiled: December 19, 1995Date of Patent: January 11, 2000Assignee: NEC CorporationInventors: Toshikazu Fukushima, Eiki Ishidera, Masahiko Hamanaka, Daisuke Nishiwaki
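As an illustration of the word searching step only, the sketch below assumes the recognition result buffer holds one string of character candidates per segment and that the lexicon is a plain list of words. Both representations and the function name are hypothetical; the check portion logic is not modelled.

```python
def word_candidates(char_candidates, lexicon):
    """Return lexicon words spellable from per-position character candidates."""
    return [
        word for word in lexicon
        if len(word) == len(char_candidates)
        and all(ch in cands for ch, cands in zip(word, char_candidates))
    ]
```

For example, with candidates `["cd", "ao", "tl"]` the words "cat", "dot" and "col" survive, while "dog" is rejected because "g" is not a candidate for the third segment.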
-
Patent number: 5999647Abstract: A character extraction apparatus is provided for extracting character data for each character from a text image which is represented by first pixels corresponding to character images and second pixels corresponding to background images. The character extraction apparatus comprises a character row detecting means for detecting character rows from the text image and obtaining position data of each character row; a pixel array extracting means for extracting arrays of continuous first pixels in an area specified by the character row position data and computing position data of each of the arrays of continuous first pixels; a character array linking means for linking the arrays of continuous first pixels in the area based on the position data of the arrays of continuous first pixels; and a character extracting means for recognizing each set of arrays of continuous first pixels linked by the character array linking means as a character and outputting character data.Type: GrantFiled: February 28, 1996Date of Patent: December 7, 1999Assignee: Matsushita Electric Industrial Co., Ltd.Inventors: Ichiro Nakao, Mariko Takenouchi, Saki Takakura, Satoshi Emura
-
Patent number: 5966473Abstract: Described is an image processing system which is operative to automatically determine a quadrilateral object such as a character frame, page mark, or position correction mark only with a mouse click. A field is automatically specified by displaying a scanned image of a form including a black frame on a display, clicking within a character frame at the left end for each recognition field, and clicking within a character frame at the right end of the same field. In this case, a field position/size determination program scans the image in the vertical and horizontal directions from the two clicked points to detect the inner wall of the black frame, and produces a histogram by establishing rectangles between two character frames to automatically detect the number of character frames in the field and the thickness of the black line between the character frames.Type: GrantFiled: October 14, 1997Date of Patent: October 12, 1999Assignee: International Business Machines CorporationInventors: Hiroyasu Takahashi, Toshimichi Arima
-
Patent number: 5956433Abstract: A method and apparatus for removing spots from character images of a multi-character image read by an image scanner. A character image is cut out from the multi-character image. Separated segments in the cut-out character image are then detected. A respective segment of the detected, separated segments is deleted as a free spot if the number of detected segments exceeds a maximum segment number. After deleting a free spot, an attempt is then made to recognize a character in the character image. When a character cannot be recognized, a black pixel width is identified by analyzing the distribution of black pixel widths in the character image. Then, a circumscribed rectangle is defined in accordance with the identified black pixel width. Pixels of images lying outside the circumscribed rectangle are deleted from the character image as an externally contacted spot.Type: GrantFiled: March 18, 1997Date of Patent: September 21, 1999Assignee: Fujitsu LimitedInventor: Hisashi Sasaki
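A small sketch of the free-spot rule described above, assuming each separated segment is given as a set of (x, y) pixels. The abstract does not say which segment is deleted when the count exceeds the maximum; dropping the smallest first is an assumption of this example.

```python
def delete_free_spots(segments, max_segments):
    """Drop the smallest segments until at most max_segments remain.

    segments: list of sets of (x, y) pixels, one set per separated segment.
    Choosing the smallest segment first is an assumption of this sketch.
    """
    kept = sorted(segments, key=len, reverse=True)[:max_segments]
    # Preserve the original ordering of the surviving segments.
    kept_ids = {id(s) for s in kept}
    return [s for s in segments if id(s) in kept_ids]
```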
-
Patent number: 5949555Abstract: An apparatus and method for compressing an image based upon an attribute of a partial area efficiently includes a classification device for providing attribute classification information in which partial areas of an input image are classified into a variety of attributes and area information, and an image compression device for selecting an image compression method for each partial area in the above-described input image out of plural image compression methods and executing such compression.Type: GrantFiled: February 3, 1995Date of Patent: September 7, 1999Assignee: Canon Kabushiki KaishaInventors: Akihiko Sakai, Eiji Ohara, Michiko Hirayu, Yuka Nagai
-
Patent number: 5949906Abstract: A character string region extracting apparatus comprises an extracting section for extracting a plurality of primitives from image information in which a character and a graphic pattern other than the character are mixedly present, a character string candidate region forming section for generating character candidate regions from the primitives and connecting the character candidate regions, thereby forming at least one character string candidate region, a character recognizing section for subjecting the character candidate regions included in the character string candidate region to character recognition, and a character string region extracting section for extracting a character string region from the character string candidate region by the character recognition.Type: GrantFiled: December 7, 1995Date of Patent: September 7, 1999Assignee: Kabushiki Kaisha ToshibaInventors: Hidekata Hontani, Shigeyoshi Shimotsuji
-
Patent number: 5930393Abstract: A technique for the enhancement of degraded document images to improve their display quality characteristics and image recognition accuracy. Images believed to be representative of the same symbol which occur in different positions over an image source (e.g., a facsimile page) are clustered together. Using the symbols within a particular cluster, an average character image outline for that cluster of symbols is derived and thereafter used to refine the matching of symbols within the cluster and to determine a final representative symbol for that cluster. The final representative symbols from the various resulting clusters are then used to replace all matching images throughout the image source. Advantageously, the display quality and recognition accuracy of the image source are enhanced after application of the present invention due to the resulting improvement of the images in the image source.Type: GrantFiled: August 11, 1997Date of Patent: July 27, 1999Assignee: Lucent Technologies Inc.Inventors: Tin Kam Ho, John D. Hobby
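The patent averages character outlines; as a simplified stand-in, the sketch below averages same-sized binary bitmaps pixel by pixel and thresholds the result to get one representative symbol per cluster. The bitmap representation, the threshold, and the function name are assumptions, and the clustering and matching refinement steps are not shown.

```python
def representative_symbol(cluster, threshold=0.5):
    """Average same-sized binary bitmaps and threshold the result."""
    if not cluster:
        raise ValueError("cluster must contain at least one bitmap")
    n = len(cluster)
    h, w = len(cluster[0]), len(cluster[0][0])
    return [
        # A pixel stays black if at least `threshold` of the samples agree.
        [1 if sum(img[y][x] for img in cluster) / n >= threshold else 0
         for x in range(w)]
        for y in range(h)
    ]
```

Isolated noise pixels that appear in only a minority of the samples are voted out, which is the intuition behind replacing every instance with the cluster representative.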
-
Patent number: 5920641Abstract: Linear structures are used to identify persons. In order to be able to combine a multiplicity of such linear structures in a database, their original images are analyzed and reconstructed using orthonormal basic functions. A preferred direction of the linear structure is determined for each pixel. A quality measure is used to evaluate the reliability of the analyzed data. Singularities (SI) and minutiae (MI) are extracted and stored.Type: GrantFiled: March 6, 1997Date of Patent: July 6, 1999Assignee: Siemens Nixdorf Informationssysteme AktiengesellschaftInventors: Birgit Ueberreiter, Joachim Dengler
-
Patent number: 5917941Abstract: After each complete stroke in a handwriting recognition process, a hypothesis is generated whether a word break is present between the previous stroke and the new stroke. This hypothesis is weighted with a probability of a word-break occurring between the strokes. This probability is determined from the geometrical relationships between characters. Subsequently, a word search is carried out on the basis of these weighted hypotheses, to identify the most likely candidates for the words represented by the written strokes. A user interface is provided that offers the user a limited list of alternative word recognitions for a group of characters. These recognitions undergo segmentation filtering, in accordance with the word breaks of the selected hypotheses, to present the user with only those alternatives having the same groupings of strokes.Type: GrantFiled: August 8, 1995Date of Patent: June 29, 1999Assignee: Apple Computer, Inc.Inventors: Brandyn Webb, Larry S. Yaeger
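One way to picture weighted word-break hypotheses is sketched below. It assumes the only geometric cue is the horizontal gap between consecutive strokes relative to a typical character width, and it maps that gap to a probability with a logistic curve; the curve, its parameters, and the function names are illustrative choices, not the method claimed in the patent.

```python
import math
from itertools import product


def break_probability(gap, char_width, steepness=6.0):
    """Map a horizontal gap between strokes to a word-break probability.

    The logistic form and the half-character-width midpoint are assumptions.
    """
    return 1.0 / (1.0 + math.exp(-steepness * (gap / char_width - 0.5)))


def ranked_segmentations(gaps, char_width, top=3):
    """Enumerate break/no-break hypotheses and rank them by joint probability."""
    probs = [break_probability(g, char_width) for g in gaps]
    scored = []
    for choice in product((False, True), repeat=len(gaps)):
        weight = 1.0
        for is_break, p in zip(choice, probs):
            weight *= p if is_break else 1.0 - p
        scored.append((weight, choice))
    scored.sort(key=lambda item: item[0], reverse=True)
    return scored[:top]
```

Exhaustive enumeration grows as 2^n in the number of gaps, so this is only practical for short stroke sequences; a real recognizer would prune hypotheses as strokes arrive.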
-
Patent number: 5912996Abstract: An input carrier sheet 12C for document distribution system 10 carries input symbols hand entered by the user into pre-existing constraint grids 12. The constraint grids may be printed in continuous tone or halftone. The print only partially covers the underlying carrier, permitting the exposed carrier to reflect light. The grids have sufficient pigment to be visible to the user, but insufficient pigment to form foreground pixels along with the hand-entered stroke when detected during the scanning. The signal (symbol)-to-noise (carrier) ratio is enhanced by reducing the pigment content of the constraint grids which increases the reflectivity of the grids. The S/N may be further enhanced by placing the strokes of the hand-entered symbols on top of the grid which occults some of the grid pigment. The S/N is further enhanced by highly reflective brightening agents in the grid print, and by aperture effect during scanning.Type: GrantFiled: March 6, 1997Date of Patent: June 15, 1999Assignee: Canon Kabushiki KaishaInventor: Roger D. Melen
-
Patent number: 5911005Abstract: The current invention is directed to further improve the character recognition process based upon the comparison of identification value in a sample image and a reference image by adjusting the identification value of the sample image. The adjustment is made based upon a predetermined feature of the sub-area or a mesh region of the images. The desired improvement in accuracy is obtained especially for recognizing handwritten characters.Type: GrantFiled: November 20, 1995Date of Patent: June 8, 1999Assignee: Ricoh Company, Ltd.Inventor: Yukinaka Uchiyama
-
Patent number: 5907630Abstract: An image extraction system includes a connected pattern extracting part for extracting partial patterns respectively having connected pixels from an image which is formed by a block frame having a table format and including one-character frames or a free format frame, characters, graphics or symbols, a one-character frame extracting part for extracting one-character frames from the image based on the partial patterns extracted by the connected pattern extracting part, a straight line extracting part for extracting straight lines from the partial patterns which are extracted by the connected pattern extracting part and is eliminated of the one-character frames by the one-character frame extracting part, a frame detecting part for detecting straight lines forming the frame from the straight lines extracted by the straight line extracting part, and a frame separating part for separating the straight lines detected by the frame detecting part from the partial patterns so as to extract the characters, graphics orType: GrantFiled: August 26, 1996Date of Patent: May 25, 1999Assignee: Fujitsu LimitedInventors: Satoshi Naoi, Atsuko Asakawa, Maki Yabuki, Yoshinobu Hotta