Abstract: A method, computing device, and associated computer-readable storage media containing instructions for binarizing a grayscale image by manually determining a first threshold that yields optimal binarization values for one or more images in a set of images, calculating the histogram of each of the images determined using the first threshold, calculating a set of statistical parameters of each histogram, such as the mean, standard deviation, and variance, determining a second threshold as a function of the set of statistical parameters, and comparing each pixel of the grayscale image to the second threshold. The second threshold T may be a function of the mean m, standard deviation s, and variance v, and is calculated by fitting a third degree polynomial curve T = a0 + a1·m + a2·s + a3·v, where the coefficients A = [a0 a1 a2 a3]^T are found using a minimum mean square error algorithm.
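The threshold fit described in the abstract above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes 8-bit grayscale images held in NumPy arrays, assumes the histogram is taken over each grayscale training image, uses ordinary least squares (`np.linalg.lstsq`) as the minimum mean square error solver, and assumes pixels above T map to 1. The function names `histogram_stats`, `fit_threshold_model`, and `binarize` are hypothetical.

```python
import numpy as np

def histogram_stats(image):
    """Return (mean, standard deviation, variance) of an 8-bit image's histogram."""
    hist = np.bincount(image.ravel(), minlength=256).astype(float)
    p = hist / hist.sum()                      # normalized histogram
    levels = np.arange(256)
    m = float((levels * p).sum())              # mean gray level
    v = float((((levels - m) ** 2) * p).sum()) # variance
    return m, float(np.sqrt(v)), v

def fit_threshold_model(images, manual_thresholds):
    """Fit A = [a0, a1, a2, a3] so that T ~ a0 + a1*m + a2*s + a3*v.

    Assumption: least squares stands in for the abstract's
    'minimum mean square error algorithm'.
    """
    stats = np.array([histogram_stats(img) for img in images])
    X = np.column_stack([np.ones(len(images)), stats])  # rows: [1, m, s, v]
    y = np.asarray(manual_thresholds, dtype=float)
    A, *_ = np.linalg.lstsq(X, y, rcond=None)
    return A

def binarize(image, A):
    """Compute the second threshold T from the image's statistics and apply it."""
    m, s, v = histogram_stats(image)
    T = A[0] + A[1] * m + A[2] * s + A[3] * v
    # Assumption: pixels brighter than T become 1, all others 0.
    return (image > T).astype(np.uint8), T
```

At least four training images with distinct statistics are needed for the four coefficients to be determined; with more images, the fit averages out inconsistencies in the manually chosen thresholds.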
Abstract: A method, device and computer readable storage media for enhancing an image for optical character recognition by detecting the edges of the image to create an edge detected image, binarizing the edge detected image to create a binary edge image for processing, dilating the binary edge image to create a dilated binary edge image, taking the XOR difference between the binary edge image and the dilated binary edge image to obtain a text boundary, superimposing the text boundary on the image and determining the pixels of the image that are covered by the text boundary, calculating the average grayscale value of the pixels of the image that are covered by the text boundary, and setting background pixels of the image to the calculated average grayscale value of the pixels of the image that are covered by the text boundary.
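The enhancement steps in the abstract above can be sketched as follows. This is a hedged illustration under several assumptions the abstract does not state: a Sobel gradient magnitude serves as the edge detector, a fixed `edge_threshold` binarizes the edge image, dilation uses a 3x3 structuring element, and "background pixels" defaults to every pixel not covered by the dilated edge mask unless the caller supplies a `background_mask`. All function and parameter names are hypothetical.

```python
import numpy as np

def sobel_magnitude(gray):
    """Sobel gradient magnitude (assumed edge detector; the abstract names none)."""
    g = np.pad(gray.astype(float), 1, mode="edge")
    gx = (g[:-2, 2:] + 2 * g[1:-1, 2:] + g[2:, 2:]) \
       - (g[:-2, :-2] + 2 * g[1:-1, :-2] + g[2:, :-2])
    gy = (g[2:, :-2] + 2 * g[2:, 1:-1] + g[2:, 2:]) \
       - (g[:-2, :-2] + 2 * g[:-2, 1:-1] + g[:-2, 2:])
    return np.hypot(gx, gy)

def dilate3x3(mask):
    """Binary dilation with a 3x3 square structuring element (assumed size)."""
    h, w = mask.shape
    p = np.pad(mask, 1, mode="constant", constant_values=False)
    out = np.zeros_like(mask)
    for dy in range(3):
        for dx in range(3):
            out |= p[dy:dy + h, dx:dx + w]
    return out

def enhance_for_ocr(gray, edge_threshold=100.0, background_mask=None):
    """Flatten the background to the mean gray level found on the text boundary."""
    edges = sobel_magnitude(gray) > edge_threshold   # binary edge image
    dilated = dilate3x3(edges)                       # dilated binary edge image
    boundary = edges ^ dilated                       # XOR difference: text boundary
    if not boundary.any():
        return gray.copy()                           # no edges found, nothing to do
    avg = float(gray[boundary].mean())               # mean of pixels under the boundary
    if background_mask is None:
        # Assumption: background = pixels not covered by the dilated edges.
        background_mask = ~dilated
    out = gray.copy()
    out[background_mask] = int(round(avg))
    return out
```

With the default background mask, the interior of strokes thicker than the edge response would also be overwritten, so a caller working with bold or large text would likely need to pass its own `background_mask`.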