Search Patents
  • Publication number: 20020076111
    Abstract: Following scanning of a document image, and optical character recognition (OCR) processing, the outputted OCR text is processed to determine a text format (typeface and font size) to match the OCR text to the originally scanned image. The text format is identified by matching word sizes rather than individual character sizes. In particular, for each word and for each of a plurality of candidate typefaces, a scaling factor is calculated to match a typeface rendering of the word to the width of the word in the originally scanned image. After all of the scaling factors have been calculated, a cluster analysis is performed to identify close clusters of scaling factors for a typeface, indicative of a good typeface fit at a constant scaling factor (font size).
    Type: Application
    Filed: December 18, 2000
    Publication date: June 20, 2002
    Applicant: Xerox Corporation
    Inventors: Christopher R. Dance, Mauritius Seeger
  • Publication number: 20120177295
    Abstract: A method for enhancing the accuracy of Optical Character Recognition (OCR) algorithms by detection of differences between a digital image of a document and a text file corresponding to the digital image, created by the OCR algorithm. The method includes calculating the transformation between the first and second digital images such as geometrical distortion, local brightness and contrast differences and blurring due to the optical imaging process. The method estimates the parameters of these transformations so that the transformations can be applied to at least one of the images, rendering it as similar as possible to the other image. The method further compares the two images in order to find differences. The method further displays the differences on a display device and analyzes the differences. The analysis results are fed back to the OCR algorithm.
    Type: Application
    Filed: January 7, 2011
    Publication date: July 12, 2012
    Inventors: Yuval Gronau, Edo Likhovski
  • Patent number: 8755604
    Abstract: A system and method may include a processor that groups the glyphs of a document into font character models. OCR processing may be performed to identify the ASCII value of the font character models, with the results mapped to the glyphs contained within those models, thereby identifying the text of the original document. This results in fewer calls to an OCR engine, thereby providing a significant speedup. Further, when a model is assigned differing text values by the OCR engine, the system and method may identify the value most likely to be correct, thereby improving the accuracy of the output text.
    Type: Grant
    Filed: June 5, 2009
    Date of Patent: June 17, 2014
    Assignee: CVISION Technologies, Inc.
    Inventors: Ari David Gross, Raphael Meyers, Navdeep Tinna, Yunhao Shi
  • Patent number: 12205370
    Abstract: Systems and methods for computer-implemented pre-optimization of input data before further processing thereof by a computer-implemented analyzation process, such as optical character recognition (OCR). A cooperative model is employed that combines one or more supervised-learning based inspector sub-models, and one or more filter sub-models that operate in series with the inspector sub-model(s). The inspectors first receive the input data and calculate one or more predicted transformation parameters, which are then used to perform transformations on the input data. The inspector-transformed data is then passed to the filters, which derive respective convolution kernels and apply same to the inspector-transformed data before passing same to the OCR or other analyzation process. The inspectors may be pretrained with different training data.
    Type: Grant
    Filed: May 13, 2021
    Date of Patent: January 21, 2025
    Inventor: Ian Jeffrey Wilkins
  • Publication number: 20090263019
    Abstract: Disclosed embodiments of the invention provide automated global optimization methods and systems of OCR, tailored to each document being digitized. A document-specific database is created from an OCR scan of a document of interest, which contains an exhaustive listing of words in the document. Images of each word, taken from all the fonts encountered, are entered into the database and mapped to a corresponding textual representation. After entry of a first instance of an image of a word written in a particular font, each new occurrence of the word in that font can be quickly recognized by image processing techniques. The disclosed methods and systems may be used in conjunction with adaptive character recognition training and word recognition training of the OCR engines.
    Type: Application
    Filed: April 16, 2008
    Publication date: October 22, 2009
    Inventors: Asaf Tzadok, Eugeniusz Walach
  • Publication number: 20140168478
    Abstract: An electronic device and method capture multiple images of a scene of real world at several zoom levels, the scene of real world containing text of one or more sizes. Then the electronic device and method extract from each of the multiple images, one or more text regions, followed by analyzing an attribute that is relevant to OCR in one or more versions of a first text region as extracted from one or more of the multiple images. When an attribute has a value that meets a limit of optical character recognition (OCR) in a version of the first text region, the version of the first text region is provided as input to OCR.
    Type: Application
    Filed: March 15, 2013
    Publication date: June 19, 2014
    Applicant: QUALCOMM INCORPORATED
    Inventors: Pawan Kumar Baheti, Abhijeet S. Bisain, Rajiv Soundararajan, Dhananjay Ashok Gore
  • Patent number: 6741745
    Abstract: Following scanning of a document image, and optical character recognition (OCR) processing, the outputted OCR text is processed to determine a text format (typeface and font size) to match the OCR text to the originally scanned image. The text format is identified by matching word sizes rather than individual character sizes. In particular, for each word and for each of a plurality of candidate typefaces, a scaling factor is calculated to match a typeface rendering of the word to the width of the word in the originally scanned image. After all of the scaling factors have been calculated, a cluster analysis is performed to identify close clusters of scaling factors for a typeface, indicative of a good typeface fit at a constant scaling factor (font size).
    Type: Grant
    Filed: December 18, 2000
    Date of Patent: May 25, 2004
    Assignee: Xerox Corporation
    Inventors: Christopher R. Dance, Mauritius Seeger
  • Publication number: 20090208103
    Abstract: An optical character recognition (OCR) system that includes a user-input function for receiving a user input sample for executing said OCR system for optically recognizing a document to generate an output file using the user input sample as a reference.
    Type: Application
    Filed: April 22, 2007
    Publication date: August 20, 2009
    Inventor: Bo-In Lin
  • Publication number: 20070140595
    Abstract: Optical character recognition (OCR) for images such as a street scene image is generally a difficult problem because of the variety of fonts, styles, colors, sizes, orientations, occlusions and partial occlusions that can be observed in the textual content of such scenes. However, a database query can provide useful information that can assist the OCR process. For instance, a query to a digital mapping database can provide information such as one or more businesses in a vicinity, the street name, and a range of possible addresses. In accordance with an embodiment of the present invention, this mapping information is used as prior information or constraints for an OCR engine that is interpreting the corresponding street scene image, resulting in much greater accuracy of the digital map data provided to the user.
    Type: Application
    Filed: December 16, 2005
    Publication date: June 21, 2007
    Inventors: Bret Taylor, Luc Vincent
  • Publication number: 20130230208
    Abstract: A mobile device can receive OCR library information associated with a coarse position. The coarse position can be determined by the mobile device, or by a network server configured to communicate with the mobile device. A camera on the mobile device can obtain images of human-readable information in an area near the coarse position. The view finder image can be processed with an OCR engine that is utilizing the OCR library information to determine one or more location string values. A location database can be searched based on the location string values. The position of the mobile device can be estimated and displayed. The position estimated can be adjusted based on the proximity of the mobile device to other features in the image.
    Type: Application
    Filed: March 2, 2012
    Publication date: September 5, 2013
    Applicant: QUALCOMM INCORPORATED
    Inventors: Rajarshi Gupta, Saumitra Mohan Das, Hui Chao
  • Patent number: 9080882
    Abstract: A mobile device can receive OCR library information associated with a coarse position. The coarse position can be determined by the mobile device, or by a network server configured to communicate with the mobile device. A camera on the mobile device can obtain images of human-readable information in an area near the coarse position. The view finder image can be processed with an OCR engine that is utilizing the OCR library information to determine one or more location string values. A location database can be searched based on the location string values. The position of the mobile device can be estimated and displayed. The position estimated can be adjusted based on the proximity of the mobile device to other features in the image.
    Type: Grant
    Filed: March 2, 2012
    Date of Patent: July 14, 2015
    Assignee: QUALCOMM Incorporated
    Inventors: Rajarshi Gupta, Saumitra Mohan Das, Hui Chao
  • Patent number: 8200009
    Abstract: An optical character recognition (OCR) system that includes a user-input function for receiving a user input sample for executing said OCR system for optically recognizing a document to generate an output file using the user input sample as a reference.
    Type: Grant
    Filed: April 22, 2007
    Date of Patent: June 12, 2012
    Inventor: Bo-In Lin
  • Patent number: 8472727
    Abstract: A method for enhancing the accuracy of Optical Character Recognition (OCR) algorithms by detection of differences between a digital image of a document and a text file corresponding to the digital image, created by the OCR algorithm. The method includes calculating the transformation between the first and second digital images such as geometrical distortion, local brightness and contrast differences and blurring due to the optical imaging process. The method estimates the parameters of these transformations so that the transformations can be applied to at least one of the images, rendering it as similar as possible to the other image. The method further compares the two images in order to find differences. The method further displays the differences on a display device and analyzes the differences. The analysis results are fed back to the OCR algorithm.
    Type: Grant
    Filed: January 7, 2011
    Date of Patent: June 25, 2013
    Inventors: Yuval Gronau, Edo Likhovski
  • Patent number: 9317764
    Abstract: An electronic device and method capture multiple images of a scene of real world at several zoom levels, the scene of real world containing text of one or more sizes. Then the electronic device and method extract from each of the multiple images, one or more text regions, followed by analyzing an attribute that is relevant to OCR in one or more versions of a first text region as extracted from one or more of the multiple images. When an attribute has a value that meets a limit of optical character recognition (OCR) in a version of the first text region, the version of the first text region is provided as input to OCR.
    Type: Grant
    Filed: March 15, 2013
    Date of Patent: April 19, 2016
    Assignee: QUALCOMM Incorporated
    Inventors: Pawan Kumar Baheti, Abhijeet S. Bisain, Rajiv Soundararajan, Dhananjay Ashok Gore
  • Publication number: 20210295030
    Abstract: In example implementations, an apparatus is provided. The apparatus includes an imaging device, a display, a zonal optical character recognition (OCR) device, and a processor. The imaging device is to scan a document to generate a scanned document. The display is to provide a graphical user interface (GUI) to display the scanned document and an electronic form. The zonal OCR device is to scan a selected area of the scanned document. The processor is communicatively coupled to the imaging device, the display, and the zonal OCR device. The processor is to receive a selection of the selected area on the display, to receive a selection of a field on the electronic form, to obtain data from the selected area of the scanned document that is scanned, and to enter the data in the field of the electronic form that is selected.
    Type: Application
    Filed: December 12, 2018
    Publication date: September 23, 2021
    Applicant: Hewlett-Packard Development Company, L.P.
    Inventors: Peter G. Hwang, Timothy P. Blair, Jordi Padros Dominguez
  • Publication number: 20200210743
    Abstract: Representative embodiments disclose mechanisms to create a text stream from raw OCR outputs. The raw OCR output comprises a plurality of bounding boxes, each bounding box defining a region containing text which has been recognized by the OCR system. A weight matrix is calculated that comprises a weight for each pair of bounding boxes, each weight representing the probability that the pair of bounding boxes belongs to the same cluster. The bounding boxes are then clustered according to the weights. The resulting clusters are first ordered using a first ordering criterion. The bounding boxes within each cluster are then ordered according to a second ordering criterion. The ordered clusters and bounding boxes are then arranged into a text stream.
    Type: Application
    Filed: December 27, 2018
    Publication date: July 2, 2020
    Inventors: Yan Wang, Arun Sacheti, Vishal Chhabilbhai Thakkar, Surendra Srinivas Ulabala, Shloak Jain
  • Patent number: 8014604
    Abstract: Disclosed embodiments of the invention provide automated global optimization methods and systems of OCR, tailored to each document being digitized. A document-specific database is created from an OCR scan of a document of interest, which contains an exhaustive listing of words in the document. Images of each word, taken from all the fonts encountered, are entered into the database and mapped to a corresponding textual representation. After entry of a first instance of an image of a word written in a particular font, each new occurrence of the word in that font can be quickly recognized by image processing techniques. The disclosed methods and systems may be used in conjunction with adaptive character recognition training and word recognition training of the OCR engines.
    Type: Grant
    Filed: April 16, 2008
    Date of Patent: September 6, 2011
    Assignee: International Business Machines Corporation
    Inventors: Asaf Tzadok, Eugeniusz Walach
  • Publication number: 20160307061
    Abstract: Methods and systems for bootstrapping an OCR engine for license plate recognition. One or more OCR engines can be trained utilizing purely synthetically generated characters. A subset of classifiers, which require augmentation with real examples, along with how many real examples are required for each, can be identified. The OCR engine can then be deployed to the field with constraints on automation based on this analysis to operate in a “bootstrapping” period wherein some characters are automatically recognized while others are sent for human review. The previously determined number of real examples required for augmenting the subset of classifiers can be collected. Each subset of identified classifiers can then be retrained as the number of real examples required becomes available.
    Type: Application
    Filed: April 16, 2015
    Publication date: October 20, 2016
    Inventors: Orhan Bulan, Claude Fillion, Aaron M. Burry, Vladimir Kozitsky
  • Publication number: 20030085162
    Abstract: An embodiment of the present invention generally comprises a mailpiece sorting apparatus including a customer specific keyword database and a method of post processing OCR reject mailpieces. Mailpieces that the OCR cannot read and determine the recipient for (“rejects”) are post processed using the customer specific keyword database which contains information regarding addressee field that is particular to the customer. Address cleansing is performed on the information obtained from the OCR system and an addressee match is attempted. If a match is made, the mailpiece is delivered to an appropriate sort bin. If a match is not made then the mailpiece is delivered to a reject bin. The method provides for better automated throughput of sorted mailpieces.
    Type: Application
    Filed: November 7, 2001
    Publication date: May 8, 2003
    Applicant: Pitney Bowes Incorporated
    Inventors: Edward P. Daniels, Robert K. Gottlieb
  • Publication number: 20200302206
    Abstract: A medical device monitoring system and method extract information from screen images from medical device controllers, with a single OCR process invocation per screen image, despite critical information appearing in different screen locations, depending on which medical device controller's screen image is processed. For example, different software versions of the medical device controllers might display the same type of information in different screen locations. Copies of the critical screen information, one copy from each different screen location, are made in a mosaic image, and then the mosaic image is OCR processed to produce text results. Text is selectively extracted from the OCR text results, depending on contents of a selector field on the screen image, such as a software version number or a heart pump model identifier.
    Type: Application
    Filed: March 21, 2019
    Publication date: September 24, 2020
    Inventors: Paul Roland Lemay, Alessandro Simone Agnello
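
Illustrative sketch (not taken from any of the listed patents): the abstracts for 20020076111 and 6741745 describe computing, for each word and each candidate typeface, a scaling factor that matches the rendered word width to the scanned word width, then looking for tight clusters of scaling factors. The Python below is a minimal sketch of that idea under loud assumptions: the per-character width table is made up, and the relative spread of the scaling factors (median absolute deviation over median) is used as a crude stand-in for the cluster analysis, whose details the abstracts do not give. All function and variable names are hypothetical.

```python
from statistics import median

# Hypothetical per-character advance widths at a font size of 1.0.
# Real metrics would come from font files; these numbers are invented.
TYPEFACE_WIDTHS = {
    "narrow": {"i": 0.2, "l": 0.2, "m": 0.7, "w": 0.7, "o": 0.5, "d": 0.5, "r": 0.3},
    "mono": {ch: 0.6 for ch in "ilmwodr"},
}


def rendered_width(word, widths):
    """Width of the word when rendered in a typeface at font size 1.0."""
    return sum(widths[ch] for ch in word)


def scaling_factors(words, widths):
    """One scaling factor per word: observed width / unit-size rendered width.

    `words` is a list of (text, observed_width_in_scan) pairs.
    """
    return [observed / rendered_width(text, widths) for text, observed in words]


def relative_spread(factors):
    """Median absolute deviation divided by the median.

    A small value means the factors sit close together, which is the
    assumed proxy here for "a close cluster of scaling factors".
    """
    mid = median(factors)
    return median(abs(f - mid) for f in factors) / mid


def best_typeface(words, typefaces):
    """Return (typeface name, estimated font size) with the tightest factors."""
    scored = {}
    for name, widths in typefaces.items():
        factors = scaling_factors(words, widths)
        scored[name] = (relative_spread(factors), median(factors))
    winner = min(scored, key=lambda name: scored[name][0])
    return winner, scored[winner][1]


# Synthetic "scanned" word widths, generated from the mono table at size 12.
observed = [
    (w, 12 * rendered_width(w, TYPEFACE_WIDTHS["mono"]))
    for w in ["mill", "word", "owl", "dim"]
]
name, size = best_typeface(observed, TYPEFACE_WIDTHS)
print(name, round(size, 2))
```

Matching on whole-word widths, as the abstracts describe, means a single scaling factor per word; a typeface whose factors agree across many words is a plausible fit at one constant font size, while a poor fit scatters the factors.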