Abstract: A method and system for spotting keywords in a handwritten document. The method comprises the steps of inputting an image of the handwritten document, performing word segmentation on the image to obtain segmented words, performing word matching, and outputting the spotted keywords. The word matching itself consists of the substeps of performing character segmentation on the segmented words, performing character recognition on the segmented characters, performing distance computations on the recognized characters using a Generalized Hidden Markov Model with ergodic topology to identify words based on character models, and performing non-keyword rejection using a classifier based on a combination of Gaussian Mixture Models, Hidden Markov Models and Support Vector Machines.
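The control flow of the steps listed in this abstract can be sketched as follows. This is only an illustrative skeleton, not the claimed implementation: every stage is passed in as a hypothetical callable (`segment_words`, `segment_chars`, `recognize_char`, `distance`, `accept`), and the toy stand-ins in the usage note replace the Generalized Hidden Markov Model distance and the GMM/HMM/SVM rejection classifier, which the abstract names but does not specify.

```python
def spot_keywords(image, keywords, segment_words, segment_chars,
                  recognize_char, distance, accept):
    """Return (word_position, keyword) pairs for the spotted keywords.

    All five stage arguments are hypothetical placeholders:
      segment_words(image)        -> iterable of word images
      segment_chars(word_image)   -> iterable of character images
      recognize_char(char_image)  -> recognized character label
      distance(chars, keyword)    -> number, smaller means closer
      accept(chars, keyword, d)   -> True to keep, False to reject
    """
    spotted = []
    # Word segmentation on the input image.
    for position, word_image in enumerate(segment_words(image)):
        # Character segmentation followed by character recognition.
        chars = [recognize_char(c) for c in segment_chars(word_image)]
        # Distance computation against every keyword; keep the closest.
        best = min(keywords, key=lambda kw: distance(chars, kw))
        score = distance(chars, best)
        # Non-keyword rejection.
        if accept(chars, best, score):
            spotted.append((position, best))
    return spotted


def toy_distance(chars, keyword):
    """Toy stand-in: mismatched positions plus the length difference."""
    mismatches = sum(a != b for a, b in zip(chars, keyword))
    return mismatches + abs(len(chars) - len(keyword))
```

As a usage illustration with a text string standing in for the document image, `spot_keywords("the lnvoice total is due", ["invoice", "total"], str.split, list, lambda c: c, toy_distance, lambda chars, kw, d: d <= 2)` keeps the second and third words and rejects the rest.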
Abstract: An image binarization method and device for converting a grayscale image into a black-and-white binary image are provided. The grayscale image is divided into sub-images that are created dynamically, pixel by pixel, each containing a given pixel as well as its neighboring pixels. A threshold for each pixel is determined based on the color values of all the pixels in its sub-image. Therefore, at one color value the given pixel is converted to white, and at another color value it is converted to black. The foregoing is carried out pixel by pixel in a dynamic fashion, evaluating each pixel relative to its neighboring pixels, in order to produce a binary image.
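A minimal sketch of per-pixel local thresholding of this kind is given below. The abstract does not say how the threshold is derived from the sub-image, so several choices here are assumptions made for illustration: the sub-image is a square window of a given `radius` clipped at the image border, the threshold is the mean of the window, and a pixel at or above the threshold becomes white (255) while a pixel below it becomes black (0). The function name `binarize_local_mean` is hypothetical.

```python
def binarize_local_mean(gray, radius=1):
    """Binarize a grayscale image given as a list of rows of 0-255 ints.

    Assumption: the threshold for each pixel is the mean of the square
    window centered on it (clipped at the borders). The abstract only
    says the threshold depends on all pixels of the sub-image.
    """
    height = len(gray)
    width = len(gray[0]) if height else 0
    binary = [[0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Build the sub-image around the current pixel.
            y0, y1 = max(0, y - radius), min(height, y + radius + 1)
            x0, x1 = max(0, x - radius), min(width, x + radius + 1)
            window = [gray[j][i] for j in range(y0, y1) for i in range(x0, x1)]
            # Threshold from all pixels of the sub-image (assumed: mean).
            threshold = sum(window) / len(window)
            # Assumed rule: at or above the threshold is white, else black.
            binary[y][x] = 255 if gray[y][x] >= threshold else 0
    return binary
```

Because the threshold moves with the neighborhood, the same gray value can end up on either side: under these assumptions a pixel of value 100 is converted to white in `[[100, 20], [20, 20]]` but to black in `[[100, 200], [200, 200]]`.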