Separating Touching Or Overlapping Characters Patents (Class 382/178)
  • Patent number: 5818963
    Abstract: A method and system for identifying boundaries of characters in handwritten text by classifying segment strokes provides improved performance in a handwriting recognition system. A segment stroke is a portion of handwritten text which includes a boundary between two characters. The segment stroke is recognized by the same method used to recognize characters. Recognition of a segment stroke is accomplished by training a learning machine to act as a classifier which implements a discriminant function based on a polynomial expansion.
    Type: Grant
    Filed: November 4, 1996
    Date of Patent: October 6, 1998
    Inventors: Michael Murdock, Shay-Ping T. Wang
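As a rough illustration of the "discriminant function based on a polynomial expansion" mentioned in the abstract of 5818963, the sketch below expands a feature vector into its monomials up to a chosen degree and scores it with one weight vector per class. The function names, the degree, and the weights are hypothetical; the abstract does not specify the features or how the classifier is trained.

```python
from itertools import combinations_with_replacement
from math import prod
from typing import Dict, List, Sequence


def polynomial_terms(x: Sequence[float], degree: int = 2) -> List[float]:
    """Expand x into [1, x_i, x_i*x_j, ...] up to the given total degree."""
    terms = [1.0]
    for d in range(1, degree + 1):
        for idx in combinations_with_replacement(range(len(x)), d):
            terms.append(prod(x[i] for i in idx))
    return terms


def discriminant(weights: Sequence[float], x: Sequence[float], degree: int = 2) -> float:
    """Weighted sum of the polynomial terms (weights are assumed pre-trained)."""
    return sum(w * t for w, t in zip(weights, polynomial_terms(x, degree)))


def classify(class_weights: Dict[str, Sequence[float]], x: Sequence[float], degree: int = 2) -> str:
    """Pick the label whose discriminant value is largest."""
    return max(class_weights, key=lambda label: discriminant(class_weights[label], x, degree))
```

With two features and degree 2 the expansion has six terms (1, x0, x1, x0*x0, x0*x1, x1*x1), so each class needs six weights. A segment stroke and an ordinary character would simply be two labels scored by the same routine.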
  • Patent number: 5809166
    Abstract: An optical character recognition system cuts between touching characters. A first cut is made between touching characters based on white spaces, and the cut characters are subjected to character recognition processing. All characters not recognized are then cut again. A pair of adjacent vertical bars is detected in a vertical histogram of character image data of unrecognized characters, the vertical bars having a vertical component in the histogram that exceeds a predetermined vertical threshold. Horizontal crossings are detected in each of three discrete horizontal bands between the vertical bars. The vertical bars are classified according to the detected horizontal crossings, and, based on the classification, a decision is made whether or not to cut between the vertical bars, and where to cut between the vertical bars.
    Type: Grant
    Filed: October 28, 1997
    Date of Patent: September 15, 1998
    Assignee: Canon Kabushiki Kaisha
    Inventors: Hung Khei Huang, Toshiaki Yagasaki
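The first stages described in the abstract of 5809166 (a white-space cut, then a search for tall vertical bars in a vertical histogram) can be sketched roughly as follows. This is a simplified illustration, assuming a binary image stored as rows of 0/1 pixels; the function names and the threshold value are hypothetical, and the crossing-based classification of the bars is not modeled.

```python
from typing import List, Tuple

Image = List[List[int]]  # rows of pixels: 0 = white, 1 = black


def vertical_histogram(img: Image) -> List[int]:
    """Count black pixels in each column."""
    if not img:
        return []
    return [sum(row[x] for row in img) for x in range(len(img[0]))]


def white_space_cuts(hist: List[int]) -> List[Tuple[int, int]]:
    """Return half-open (start, end) column spans of ink separated by empty columns."""
    spans: List[Tuple[int, int]] = []
    start = None
    for x, count in enumerate(hist):
        if count > 0 and start is None:
            start = x
        elif count == 0 and start is not None:
            spans.append((start, x))
            start = None
    if start is not None:
        spans.append((start, len(hist)))
    return spans


def vertical_bars(hist: List[int], threshold: int) -> List[int]:
    """Columns whose black-pixel count exceeds the (assumed) vertical threshold."""
    return [x for x, count in enumerate(hist) if count > threshold]
```

In a fuller pipeline, spans that fail recognition would be re-examined: adjacent columns returned by `vertical_bars` bound the region in which a second cut is considered.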
  • Patent number: 5795716
    Abstract: A computer system for analyzing nucleic acid sequences is provided. The computer system is used to perform multiple methods for determining unknown bases by analyzing the fluorescence intensities of hybridized nucleic acid probes. The results of individual experiments are improved by processing nucleic acid sequences together. Comparative analysis of multiple experiments is also provided by displaying reference sequences in one area and sample sequences in another area on a display device.
    Type: Grant
    Filed: October 21, 1994
    Date of Patent: August 18, 1998
    Inventor: Mark S. Chee
  • Patent number: 5787194
    Abstract: Image processing apparatus for segmenting an input image into image portions each containing a single character, the apparatus comprising identification logic for identifying connected components in the input image; classification logic, including a neural network, for determining into which of a number of predefined classes a connected component falls, at least one of said classes indicating that the connected component is most likely to be a single character; merging logic and splitting logic for merging and splitting the connected components. The merging and splitting logic and the classification logic are arranged to operate so that the connected components are iteratively merged and/or split and the resulting split and/or merged connected components reclassified by the classification logic until an image segmentation is achieved which meets a predefined criterion.
    Type: Grant
    Filed: December 31, 1996
    Date of Patent: July 28, 1998
    Assignee: International Business Machines Corporation
    Inventor: Eyal Yair
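The identification step in the abstract of 5787194 relies on connected components. A minimal sketch of that step alone, assuming a binary image as rows of 0/1 pixels and 8-connectivity (the abstract states neither), is given below; the neural-network classification and the merge/split loop are not modeled, and the function name is hypothetical.

```python
from collections import deque
from typing import List, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), inclusive


def connected_components(img: List[List[int]]) -> List[Box]:
    """Return the bounding box of each 8-connected group of black pixels."""
    h = len(img)
    w = len(img[0]) if h else 0
    seen = [[False] * w for _ in range(h)]
    boxes: List[Box] = []
    for y in range(h):
        for x in range(w):
            if img[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill from this unvisited black pixel.
            seen[y][x] = True
            queue = deque([(y, x)])
            top, left, bottom, right = y, x, y, x
            while queue:
                cy, cx = queue.popleft()
                top, bottom = min(top, cy), max(bottom, cy)
                left, right = min(left, cx), max(right, cx)
                for dy in (-1, 0, 1):
                    for dx in (-1, 0, 1):
                        ny, nx = cy + dy, cx + dx
                        if 0 <= ny < h and 0 <= nx < w and img[ny][nx] == 1 and not seen[ny][nx]:
                            seen[ny][nx] = True
                            queue.append((ny, nx))
            boxes.append((top, left, bottom, right))
    return boxes
```

Each box would then be handed to a classifier; components judged too wide become split candidates and fragments become merge candidates, repeating until the segmentation meets the stopping criterion.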
  • Patent number: 5787196
    Abstract: Image processing apparatus is disclosed for splitting, for subsequent storage or processing by OCR apparatus, character images in digital form comprising connected characters. The image processing apparatus extracts an image skeleton from an input image and represents the topology of the image skeleton in a first data structure; generates, from the first data structure, second data structures representing the topologies of the results of possible splits of the input image skeleton; and evaluates the second data structures and selects one of the possible splits to thereby split the input image.
    Type: Grant
    Filed: November 21, 1996
    Date of Patent: July 28, 1998
    Assignee: International Business Machines Corporation
    Inventors: Eyal Yair, Oren Kagan
  • Patent number: 5768414
    Abstract: Method and apparatus for separating touching characters within an optical character recognition (OCR) computer (1). An input document (20) is scanned by scanner (2), forming a set of scan lines (3). A segmentation process (4) is performed on the scan lines (3) to create a set of segmented image boxes (5). Candidate characters within the image boxes (5) are classified by a classification module (6), based upon a library of stored models (7). When the candidate characters have a high degree of confidence, they are classified and coded into a binary form (8), such as ASCII. Those candidate characters that are not classified are processed by a touching character decision module (9) to determine whether a series of separation modules (10-14) is to be invoked. The execution of modules (10-13), followed by the reexecution of modules (4) and (6), may or may not cause all of the touching characters to be separated. Any touching characters that remain are subjected to one or more reprocessing cycles.
    Type: Grant
    Filed: December 22, 1995
    Date of Patent: June 16, 1998
    Assignee: Canon Kabushiki Kaisha
    Inventor: Hamadi Jamali
  • Patent number: 5751850
    Abstract: A method to segment, classify and clean an image is presented. It may be used in applications which have image data as their input that contains different classes of elements. The method will find, separate and classify those elements. Only significant elements must be kept for further processing and thus the amount of processed data may be significantly reduced.
    Type: Grant
    Filed: October 4, 1996
    Date of Patent: May 12, 1998
    Assignee: International Business Machines Corporation
    Inventor: Klaus Rindtorff
  • Patent number: 5727081
    Abstract: A method and system for forming an interpretation of an input expression, where the input expression is expressed in a medium, the interpretation is a sequence of symbols, and each symbol is a symbol in a known symbol set. In general, the system processes an acquired input data set representative of the input expression, to form a set of segments, which are then used to specify a set of consegmentations. Each consegmentation and each possible interpretation for the input expression is represented in a data structure. The data structure is graphically representable by a graph comprising a two-dimensional array of nodes arranged in rows and columns and selectively connected by directed arcs. Each path, extending through the nodes and along the directed arcs, represents one consegmentation and one possible interpretation for the input expression. All of the consegmentations and all of the possible interpretations for the input expression are represented by the set of paths extending through the graph.
    Type: Grant
    Filed: August 4, 1994
    Date of Patent: March 10, 1998
    Assignee: Lucent Technologies Inc.
    Inventors: Christopher John Burges, John Stewart Denker
  • Patent number: 5721790
    Abstract: A method for separating integer and fractional portions of a financial amount preparatory to recognition of the financial amount. This separating is accomplished based on determining the presence of at least one of a plurality of possible distinguishing separation characteristics, such as the presence of a period (decimal point), superscripted characters, or a fraction. The separated fractional portion is then categorized into one of a plurality of categories based on the nature of the fractional portion representation. The characters making up this fractional portion are then extracted based on this categorizing.
    Type: Grant
    Filed: January 11, 1993
    Date of Patent: February 24, 1998
    Assignee: Unisys Corporation
    Inventor: Norbert Klenner
  • Patent number: 5692069
    Abstract: An apparatus for performing character segmentation of digitized handwritten character data utilizes a vertical histogram processing unit to identify primary cuts to be made in the character data based on a vertical histogram. Character blocks generated after the primary cuts are analyzed to determine if the character blocks contain multiple characters. A slant histogram processing unit then generates a set of slant histograms for each of the character blocks to identify minima indicative of segmentation points. The slant histograms are evaluated by an evaluation processing unit to determine segmentation points for kerned characters based on zero-value minima. After segmenting the kerned characters to generate new character blocks, the evaluation processing unit evaluates the minima in accordance with a set of predetermined rules to identify further segmentation points for touching characters.
    Type: Grant
    Filed: March 17, 1995
    Date of Patent: November 25, 1997
    Assignee: Eastman Kodak Company
    Inventor: John Douglas Hanson
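To make the slant-histogram idea in the abstract of 5692069 concrete, the sketch below projects black pixels along slanted lines instead of vertical ones, so that kerned characters with no empty vertical column can still show a zero-valued bin. It assumes a binary image as rows of 0/1 pixels and a shear expressed as columns of shift per row; the names and the shear parameterization are hypothetical, and the rule-based evaluation of non-zero minima is not modeled.

```python
from typing import List


def slant_histogram(img: List[List[int]], slope: float) -> List[int]:
    """Project black pixels along lines sheared by `slope` columns per row."""
    h = len(img)
    w = len(img[0]) if h else 0
    # Shift each row so that a slanted line maps to a single histogram bin.
    shifts = [round(slope * (h - 1 - y)) for y in range(h)]
    lo = min(shifts, default=0)
    hi = max(shifts, default=0)
    hist = [0] * (w + hi - lo)
    for y, row in enumerate(img):
        for x, value in enumerate(row):
            if value:
                hist[x + shifts[y] - lo] += 1
    return hist


def zero_minima(hist: List[int]) -> List[int]:
    """Indices of empty bins that have ink on both sides."""
    inked = [i for i, count in enumerate(hist) if count > 0]
    if not inked:
        return []
    return [i for i in range(inked[0], inked[-1] + 1) if hist[i] == 0]
```

A slope of 0 reproduces the ordinary vertical histogram, so the same routine can serve both the primary cut and the set of slanted projections tried on each character block.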
  • Patent number: 5684891
    Abstract: This disclosure relates to a character recognition method and apparatus through which highly accurate character recognition is capable of being executed inexpensively and at high speed. Character recognition is raised in speed by executing segmentation of character images from an input image and character recognition from the segmented character images in parallel by separate processors without use being made of a special communication processor. After the character images have been segmented from the input image, the results of segmentation are evaluated and the character images are segmented further based upon the results of evaluation.
    Type: Grant
    Filed: November 14, 1994
    Date of Patent: November 4, 1997
    Assignee: Canon Kabushiki Kaisha
    Inventors: Tetsuomi Tanaka, Shugoro Ueno, Hiroaki Ikeda
  • Patent number: 5680479
    Abstract: In a character recognition system or the like, method and apparatus for selecting blocks of pixels from pixel image data so as to permit identification and grouping of similarly-typed pixels, such as text-type pixels and non-text-type pixels. Pixel image data is inputted and, if the pixel image data is not binary image data, then the pixel image data is converted into binary pixel image data. Blocks of pixel image data are selected by outlining contours of connected components in the pixel image data, determining whether the outlined connected components include text units or non-text units based on the size of the outlined connected components, selectively connecting text units widthwisely to form text lines based on proximity of adjacent text units, and selectively connecting text lines vertically to form text blocks based on proximity of adjacent text lines and on the position of non-text units between text lines. A hierarchical tree is formed based on the outlined connected components.
    Type: Grant
    Filed: April 24, 1992
    Date of Patent: October 21, 1997
    Assignee: Canon Kabushiki Kaisha
    Inventors: Shin-Ywan Wang, Mehrzad R. Vaezi, Christopher Allen Sherrick
  • Patent number: 5680478
    Abstract: A character recognition system or the like in which character identities are stored in accordance with a hierarchical order established during processing to separate text image areas from non-text image areas. To separate text image areas from non-text image areas, blocks of pixels are selected from pixel image data by outlining contours of connected components in the pixel image data, determining whether the outlined connected components include text units or non-text units, selectively connecting text units widthwisely to form text lines, and selectively connecting text lines vertically to form text blocks. After blocks of pixels have been so selected, text blocks are segmented into lines of pixel image data, and characters are cut from the lines of pixel image data so obtained. If desired, the characters may be cut by a two-step cutting process in which non-touching and non-overlapping characters are first cut out, and touching characters are then cut out.
    Type: Grant
    Filed: June 27, 1994
    Date of Patent: October 21, 1997
    Assignee: Canon Kabushiki Kaisha
    Inventors: Shin-Ywan Wang, Mehrzad R. Vaezi, Christopher Allen Sherrick
  • Patent number: 5600735
    Abstract: The present invention determines whether two discrete continuous segments of handwritten input S_1 (210) and S_2 (220) form part of the same handwritten input or are part of more than one, separate handwritten inputs. The present method calculates one or more distances disposed substantially parallel to the writing axis (210) and compares these distances to one or more predefined thresholds. The predefined thresholds specify minimum distance measures which must be exceeded by the substantially parallel distances for the discrete continuous segments to be judged as belonging to separate segments of handwritten input.
    Type: Grant
    Filed: May 10, 1994
    Date of Patent: February 4, 1997
    Assignee: Motorola, Inc.
    Inventor: John L. C. Seybold
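The threshold test in the abstract of 5600735 can be illustrated with a very small sketch. It assumes each segment is a list of (x, y) pen points, that writing runs left to right along the x axis, and that a single gap distance is compared to a single threshold; the abstract allows several distances and thresholds, and all names here are hypothetical.

```python
from typing import List, Tuple

Stroke = List[Tuple[float, float]]  # (x, y) pen samples


def axis_gap(s1: Stroke, s2: Stroke) -> float:
    """Gap along the x axis from the right edge of s1 to the left edge of s2."""
    return min(x for x, _ in s2) - max(x for x, _ in s1)


def are_separate_inputs(s1: Stroke, s2: Stroke, threshold: float) -> bool:
    """Judge the segments separate only when the gap exceeds the threshold."""
    return axis_gap(s1, s2) > threshold
```

A negative gap means the two segments overlap along the writing direction, which under this rule never counts as separate input for a non-negative threshold.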
  • Patent number: 5572602
    Abstract: In an image extraction system, an extracting part for extracting wide lines, an extracting part for extracting narrow lines and a frame detector detect a frame from a pattern which is extracted by a connected pattern extracting part. An attribute adder adds attributes of a character (graphic and symbol inclusive), frame, and a contact pattern of the character and frame to a partial pattern, and a separating part separates the frame from the contact pattern. An intersection calculator calculates intersections of the character and frame, and the calculated intersections are associated by an intersection associating part. An interpolator obtains a character region within the frame and interpolates this region based on the associated intersections. A connection confirming part confirms a connection of the pattern with respect to the extracted character pattern, and patterns confirmed of their connection are integrated in a connected pattern integrating part to thereby extract the character.
    Type: Grant
    Filed: February 7, 1994
    Date of Patent: November 5, 1996
    Assignee: Fujitsu Limited
    Inventors: Satoshi Naoi, Maki Yabuki, Atsuko Asakawa
  • Patent number: 5561720
    Abstract: Image-pattern-specific separation values are calculated in each case per column for a separation image of standardized width, with the aid of a separation classifier, and the column having the maximum separation value is defined as the right-hand separation column. On the basis of the separation column predetermined by the separation classifier, an attempt is made, starting from the upper pixel of the separation column, to find a separation path which, in the absence of a white column, is formed within a separation region, which is located on both sides of the separation column, partially by contour tracing and, if no white path can be found by contour tracing, partially by forced separation along the separation column.
    Type: Grant
    Filed: October 8, 1993
    Date of Patent: October 1, 1996
    Assignee: CGK Computer Gesellschaft Konstanz mbH
    Inventors: Wolfgang Lellmann, Xaver Muller
  • Patent number: 5535287
    Abstract: A method and apparatus for correctly cutting an unidentified read composite character image into separate and discrete character images suitable for use in recognizing characters in which the images have been wrongly joined to each other to form a composite image when read from an original. For the purpose of correctly determining the position where the composite image is to be cut into separate character images, the apparatus has a configuration tracing device for detecting coordinate information concerning the configuration of an image of interest in input image information from an original. A distance computing device computes, on the basis of the coordinate information obtained by the configuration tracing device, the distances between the ends of the configuration traced by the configuration tracing device as measured in the direction perpendicular to the train of characters, for successive points along a line parallel to the train of characters.
    Type: Grant
    Filed: February 28, 1994
    Date of Patent: July 9, 1996
    Assignee: Canon Kabushiki Kaisha
    Inventor: Touru Niki
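The distance computation in the abstract of 5535287 measures, at successive positions along the character train, how far apart the traced outline's ends are in the perpendicular direction. The sketch below approximates this on a binary image (rows of 0/1 pixels, horizontal text) by taking the vertical extent of ink in each column. Choosing the narrowest interior column as the cut candidate is an assumption made for illustration, since the abstract does not state the selection rule.

```python
from typing import List, Optional


def column_extents(img: List[List[int]]) -> List[int]:
    """Vertical distance between the topmost and bottommost ink in each column."""
    h = len(img)
    w = len(img[0]) if h else 0
    extents = []
    for x in range(w):
        rows = [y for y in range(h) if img[y][x]]
        extents.append(rows[-1] - rows[0] + 1 if rows else 0)
    return extents


def narrowest_interior_column(extents: List[int]) -> Optional[int]:
    """Index of the interior column with the smallest extent (first one on ties)."""
    interior = range(1, len(extents) - 1)
    return min(interior, key=lambda x: extents[x], default=None)
```

Two characters joined by a thin bridge of ink produce a sharp dip in the extent profile at the bridge, which is why a minimum is a plausible place to look for the cut.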
  • Patent number: 5528701
    Abstract: A method is disclosed for matching input data representing a continuous combination of input objects to a plurality of objects in a trie database structure. This data structure has a plurality of nodes partitioned into a plurality of levels. Each node in the trie includes a plurality of elements where each element corresponds to a respective one of the component objects. In addition, a hidden Markov model corresponding to the component object is associated with the element in the database. According to the method, the input object is applied to each of the hidden Markov models associated with the respective plurality of elements of a node to generate a respective plurality of acceptance values. The element which generates the largest acceptance value is identified with a segment of the input data. The component object for this element is recorded and the identified segment is deleted from the input data string.
    Type: Grant
    Filed: September 2, 1994
    Date of Patent: June 18, 1996
    Assignee: Panasonic Technologies, Inc.
    Inventor: Walid G. Aref
  • Patent number: 5517578
    Abstract: A note taking system that integrates word-processing functionality and computerized drawing functionality for processing ink strokes comprises novel methods that provide this functionality such as: a method for modeless operation of the note taking system that automatically switches between providing word-processing functionality and drawing functionality; a novel method for processing ink strokes as drawings; a unique method for processing ink strokes as writing; and other methods for parsing the ink strokes into words, lines, and paragraphs. The present invention also includes additional methods for manipulating figures such as a division between line and shape type figures, and a special handle performing either rotation or re-sizing.
    Type: Grant
    Filed: May 20, 1993
    Date of Patent: May 14, 1996
    Assignee: aha! software corporation
    Inventors: Dan Altman, Steven R. Kusmer, Gregory Stikeleather, Michael P. Thompson
  • Patent number: 5504822
    Abstract: Banking apparatus for reading numeric information on bank checks, drafts and like documents. An electronic black/white pixel image of the numeric information is analyzed by a system capable of reading unconstrained, constrained, printed and typed numeric characters, locating the division between dollar and cents amounts. The invention also reads overlapping, touching and not touching characters.
    Type: Grant
    Filed: July 23, 1992
    Date of Patent: April 2, 1996
    Inventor: Arthur W. Holt
  • Patent number: 5497432
    Abstract: A dividing step (a) divides into segments at least one character line forming input characters to be read. A partial-figure forming step (b) forms partial figures by combining segments from among the thus obtained segments. A character reading step (c) attempts to read each of said partial figures as a character. A network forming step (d) forms a network wherein partial figures, among the partial figures, which have been read in said character-reading step (c) are used as nodes and said nodes are connected with one another by links, and wherein said links respectively have appropriate weights. An optimum-path selecting step (e) selects an optimum path from among paths existing in said network so that said optimum path comprises nodes, from among said nodes, which respectively correspond to said input characters.
    Type: Grant
    Filed: August 23, 1993
    Date of Patent: March 5, 1996
    Assignee: Ricoh Company, Ltd.
    Inventor: Hirobumi Nishida
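Steps (b) through (e) in the abstract of 5497432 amount to a best-path search over candidate groupings of segments. The sketch below is a simplified stand-in rather than the patented network: it treats segment boundaries as positions, tries every run of up to `max_span` consecutive segments as a partial figure, and uses dynamic programming to pick the lowest-cost sequence of readable figures. The recognizer callback, the cost values, and `max_span` are hypothetical.

```python
from typing import Callable, List, Optional, Sequence, Tuple

# A recognizer maps a partial figure (tuple of segments) to (character, cost),
# or None when the figure cannot be read as a character.
Recognizer = Callable[[Tuple[str, ...]], Optional[Tuple[str, float]]]


def best_reading(
    segments: Sequence[str], recognize: Recognizer, max_span: int = 3
) -> Optional[Tuple[str, float]]:
    """Return the lowest-cost reading that covers every segment, or None."""
    n = len(segments)
    inf = float("inf")
    best = [inf] * (n + 1)  # best[i]: cheapest cost to read segments[:i]
    back: List[Optional[Tuple[int, str]]] = [None] * (n + 1)
    best[0] = 0.0
    for end in range(1, n + 1):
        for start in range(max(0, end - max_span), end):
            if best[start] == inf:
                continue
            result = recognize(tuple(segments[start:end]))
            if result is None:
                continue
            char, cost = result
            if best[start] + cost < best[end]:
                best[end] = best[start] + cost
                back[end] = (start, char)
    if best[n] == inf:
        return None
    # Walk the back-pointers to recover the chosen characters.
    chars = []
    pos = n
    while pos > 0:
        start, char = back[pos]
        chars.append(char)
        pos = start
    return "".join(reversed(chars)), best[n]
```

For example, if a recognizer reads the two-segment figure ("c", "l") as "d" more cheaply than "c" and "l" read separately, the search prefers the grouping that yields "d".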