Abstract: A system identifies and discriminates between image regions that consist of text lines of alphanumeric characters and image regions that consist largely of non-alphanumeric line-drawing components. Only image components determined to be alphanumeric characters are submitted to an OCR program, which saves processing time and avoids errors. The system relies mainly on the principle that text blocks in an image are characterized by regularly spaced horizontal runs of white, corresponding to the inter-line spaces.
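The inter-line spacing principle can be illustrated with a minimal sketch. It is not the patented method: the binary image layout (lists of rows, 0 for white, 1 for black), the function names `blank_row_runs` and `looks_like_text`, and the `tolerance` parameter are all assumptions made for illustration.

```python
def blank_row_runs(image):
    """Return (start_row, height) for every run of all-white rows.

    Assumed layout: image is a list of rows, each a list of 0 (white) or 1 (black).
    """
    runs = []
    start = None
    for y, row in enumerate(image):
        if not any(row):                 # row contains no black pixel
            if start is None:
                start = y
        elif start is not None:
            runs.append((start, y - start))
            start = None
    if start is not None:                # white run reaching the bottom edge
        runs.append((start, len(image) - start))
    return runs


def looks_like_text(image, tolerance=1):
    """Guess whether a region is a text block (hypothetical heuristic).

    The region counts as text when it has at least two interior white runs
    whose start rows are spaced regularly, within `tolerance` rows.
    """
    interior = [(s, n) for s, n in blank_row_runs(image)
                if s > 0 and s + n < len(image)]   # drop top and bottom margins
    if len(interior) < 2:
        return False
    starts = [s for s, _ in interior]
    pitches = [b - a for a, b in zip(starts, starts[1:])]
    return max(pitches) - min(pitches) <= tolerance
```

Under these assumptions, a region for which `looks_like_text` returns True would be passed on to the OCR program, and any other region would be treated as line drawing and skipped.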
Abstract: A method and apparatus for eliminating color from a multi-color image document are described. The color information for every picture element (PEL) of the document image is provided concurrently. For each PEL, the image signals of all provided colors are analyzed, and the signal of the color that has the minimum contrast relative to the background of the image document is selected.
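The per-PEL selection rule can be sketched as follows. This is only an illustration under stated assumptions: each PEL is taken to be a tuple of color-channel intensities (for example red, green, blue in the range 0 to 255), the background is represented by a single known intensity level, and the function name `eliminate_color` is hypothetical. The abstract does not say how the background level is obtained.

```python
def eliminate_color(pixels, background):
    """Select, for every PEL, the channel signal closest to the background.

    Assumptions: pixels is a list of rows, each PEL a tuple of channel
    intensities; background is one intensity level for the blank document.
    Returns a single-channel image with the same shape.
    """
    result = []
    for row in pixels:
        # Minimum contrast = smallest absolute difference from the background.
        result.append([min(pel, key=lambda v: abs(v - background)) for pel in row])
    return result
```

With a white background of 255, a red-ink PEL such as (200, 30, 30) maps to 200 and nearly blends into the background, while a black-ink PEL such as (20, 20, 20) maps to 20 and stays dark, since no channel of black ink is close to the background.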