Search Patents
  • Publication number: 20110213685
    Abstract: A number of different tags are input in a fax cover sheet that tell an OCR system not only the identity of the supplier, but also to which client the document should be routed. The OCR system identifies a number of these tags and compares them to stored supplier data to validate to which supplier the document belongs. If the system cannot validate the document, it is routed to a GUI for manual sorting. If there is no coversheet, the system relies upon the OCR system to locate keywords on the document and caller ID information to suggest a correct supplier. The OCR system also clips a separate, horizontal slice of the document (‘snippet’) that corresponds to the display of any line item and places it in a data base for future reference and reporting. The application collects and associates all corresponding snippets to their originating line items.
    Type: Application
    Filed: March 10, 2011
    Publication date: September 1, 2011
    Inventors: Joseph FLYNN, Kerry Edward Koitzsch, Wassim G. Jraige
  • Publication number: 20150178855
    Abstract: A number of different tags are input in a fax cover sheet that tell an OCR system not only the identity of the supplier, but also to which client the document should be routed. The OCR system identifies a number of these tags and compares them to stored supplier data to validate to which supplier the document belongs. If the system cannot validate the document, it is routed to a GUI for manual sorting. If there is no coversheet, the system relies upon the OCR system to locate keywords on the document and caller ID information to suggest a correct supplier. The OCR system also clips a separate, horizontal slice of the document (‘snippet’) that corresponds to the display of any line item and places it in a data base for future reference and reporting. The application collects and associates all corresponding snippets to their originating line items.
    Type: Application
    Filed: February 27, 2015
    Publication date: June 25, 2015
    Inventors: Joseph FLYNN, Kerry Edward KOITZSCH, Wassim G. JRAIGE
  • Publication number: 20150139506
    Abstract: The technology of the present disclosure includes computer-implemented methods, computer program products, and systems to filter images before transmitting to a system for optical character recognition (“OCR”). A user computing device obtains a first image of the card from the digital scan of a physical card and analyzes features of the first image, the analysis being sufficient to determine if the first image is likely to be usable by an OCR algorithm. If the user computing device determines that the first image is likely to be usable, then the first image is transmitted to an OCR system associated with the OCR algorithm. Upon a determination that the first image is unlikely to be usable, a second image of the card from the digital scan of the physical card is analyzed. The optical character recognition system performs an optical character recognition algorithm on the filtered card.
    Type: Application
    Filed: October 27, 2014
    Publication date: May 21, 2015
    Inventors: Xiaohang Wang, Alessandro Bissacco, Glen Berntson, Marria Nazif, Justin Scheiner, Sam Shih, Mark Leslie Snyder, Daniel Talavera
  • Publication number: 20180157906
    Abstract: There is disclosed a method of analyzing a digital image of a document (to determine, as an example, a document's suitability for server-based OCR processing) in a computer system that includes a user electronic device (for acquiring or storing a digital image of a document) connectable to a server (for executing the server-based OCR processing of the digital image to create a recognized-text document). The method is executable by the user electronic device and comprises: acquiring the digital image of the document; analyzing an OCR quality parameter associated with a compressed digital image to be created from the digital image using a compression algorithm and a compression parameter; in response to the OCR quality parameter being above or equal to a pre-determined threshold: transmitting the compressed digital image to the server.
    Type: Application
    Filed: December 13, 2016
    Publication date: June 7, 2018
    Inventors: Vasily Loginov, Ivan Zagaynov
  • Patent number: 12008829
    Abstract: A method to improve the efficacy of optical character recognition (OCR) includes scanning an electronically stored representation of a whole or partial document, identifying an image having text in the electronically stored representation of a whole or partial document, identifying the text within the image, and generating a plurality of bounding boxes around the identified text using blob detection. The method also includes grouping together certain text bounding boxes of the plurality of text bounding boxes that are vertically aligned with each other to generate a plurality of aligned text bounding boxes and performing OCR on the aligned text bounding boxes to generate a plurality of OCR groups of text. In addition, the method includes generating a resultant representation of a whole or partial document electronically using the plurality of OCR groups of text and saving the resultant representation of a whole or partial document electronically.
    Type: Grant
    Filed: February 16, 2022
    Date of Patent: June 11, 2024
    Assignee: VASTEC, INC.
    Inventor: Willem H. Reinpoldt, III
  • Publication number: 20100169077
    Abstract: Disclosed is a method, system and computer readable recording medium for correcting an OCR result. According to an exemplary embodiment of the present invention, there is provided a method for correcting an OCR result, the method including performing character recognition on content including character information using an OCR technique, removing extra carriage return information from the content, outputting the character recognition result, and correcting word spacing on the outputted result.
    Type: Application
    Filed: December 30, 2009
    Publication date: July 1, 2010
    Applicant: NHN Corporation
    Inventors: Byoung Seok YANG, Hee Cheol Seo, Do Gil Lee, Ki Joon Sung
  • Patent number: 10108879
    Abstract: The present disclosure includes techniques for selecting a candidate presentation style for individual documents for inclusion in an aggregate training data set for a document type that may be used to train an OCR processing engine prior to identifying text in an image of a document of the document type. In one embodiment, text input corresponding to a text sample in a document is received, and an image of the text sample in the document is received. For each of a plurality of candidate presentation styles, an OCR processing engine is trained using a training data set corresponding to the given candidate presentation style, and the OCR processing engine is used, as trained, to identify text in the received image. The OCR processing results for each candidate presentation style are compared to the received text input. A candidate presentation style for the document is selected based on the comparisons.
    Type: Grant
    Filed: September 21, 2016
    Date of Patent: October 23, 2018
    Assignee: Intuit Inc.
    Inventors: Eugene Krivopaltsev, Sreeneel K. Maddika, Vijay S. Yellapragada
  • Publication number: 20080212901
    Abstract: A character based system and method for correcting low confidence characters from an OCR system facilitates operator review, editing and correction of character and field level data generated by an OCR system without the need for an application that is installed at the operator workstation. The system creates a data structure of OCR information and provides that information to an operator through an HTML interface that is rendered using HTML and JavaScript. The data structure includes an OCR confidence level for each character and/or field and the operator is prompted to review only those characters/fields that meet a predetermined threshold for the confidence level. The operator can use an input key (e.g., TAB or ENTER) to navigate to each character/field with a low confidence level and thereby correct or validate each low confidence character/field as appropriate.
    Type: Application
    Filed: March 3, 2008
    Publication date: September 4, 2008
    Applicant: H.B.P. OF SAN DIEGO, INC.
    Inventors: Tom Castiglia, Mark Walter
  • Publication number: 20170185833
    Abstract: The technology of the present disclosure includes computer-implemented methods, computer program products, and systems to filter images before transmitting to a system for optical character recognition (“OCR”). A user computing device obtains a first image of the card from the digital scan of a physical card and analyzes features of the first image, the analysis being sufficient to determine if the first image is likely to be usable by an OCR algorithm. If the user computing device determines that the first image is likely to be usable, then the first image is transmitted to an OCR system associated with the OCR algorithm. Upon a determination that the first image is unlikely to be usable, a second image of the card from the digital scan of the physical card is analyzed. The optical character recognition system performs an optical character recognition algorithm on the filtered card.
    Type: Application
    Filed: March 13, 2017
    Publication date: June 29, 2017
    Inventors: Xiaohang Wang, Alessandro Bissacco, Glenn Merlind Berntson, Marria Nazif, Justin Scheiner, Sam Shih, Mark Leslie Snyder, Daniel Talavera
  • Patent number: 8996416
    Abstract: A number of different tags are input in a fax cover sheet that tell an OCR system not only the identity of the supplier, but also to which client the document should be routed. The OCR system identifies a number of these tags and compares them to stored supplier data to validate to which supplier the document belongs. If the system cannot validate the document, it is routed to a GUI for manual sorting. If there is no coversheet, the system relies upon the OCR system to locate keywords on the document and caller ID information to suggest a correct supplier. The OCR system also clips a separate, horizontal slice of the document (‘snippet’) that corresponds to the display of any line item and places it in a data base for future reference and reporting. The application collects and associates all corresponding snippets to their originating line items.
    Type: Grant
    Filed: March 10, 2011
    Date of Patent: March 31, 2015
    Assignee: Lavante, Inc.
    Inventors: Joseph Flynn, Kerry Edward Koitzsch, Wassim G. Jraige
  • Patent number: 8468013
    Abstract: Disclosed is a method, system and computer readable recording medium for correcting an OCR result. According to an exemplary embodiment of the present invention, there is provided a method for correcting an OCR result, the method including performing character recognition on content including character information using an OCR technique, removing extra carriage return information from the content, outputting the character recognition result, and correcting word spacing on the outputted result.
    Type: Grant
    Filed: December 30, 2009
    Date of Patent: June 18, 2013
    Assignee: NHN Corporation
    Inventors: Byoung Seok Yang, Hee Cheol Seo, Do Gil Lee, Ki Joon Sung
  • Patent number: 9740929
    Abstract: The technology of the present disclosure includes computer-implemented methods, computer program products, and systems to filter images before transmitting to a system for optical character recognition (“OCR”). A user computing device obtains a first image of the card from the digital scan of a physical card and analyzes features of the first image, the analysis being sufficient to determine if the first image is likely to be usable by an OCR algorithm. If the user computing device determines that the first image is likely to be usable, then the first image is transmitted to an OCR system associated with the OCR algorithm. Upon a determination that the first image is unlikely to be usable, a second image of the card from the digital scan of the physical card is analyzed. The optical character recognition system performs an optical character recognition algorithm on the filtered card.
    Type: Grant
    Filed: March 13, 2017
    Date of Patent: August 22, 2017
    Assignee: GOOGLE INC.
    Inventors: Xiaohang Wang, Alessandro Bissacco, Glenn Merlind Berntson, Marria Nazif, Justin Scheiner, Sam Shih, Mark Leslie Snyder, Daniel Talavera
  • Patent number: 10198628
    Abstract: There is disclosed a method of analyzing a digital image of a document (to determine, as an example, a document's suitability for server-based OCR processing) in a computer system that includes a user electronic device (for acquiring or storing a digital image of a document) connectable to a server (for executing the server-based OCR processing of the digital image to create a recognized-text document). The method is executable by the user electronic device and comprises: acquiring the digital image of the document; analyzing an OCR quality parameter associated with a compressed digital image to be created from the digital image using a compression algorithm and a compression parameter; in response to the OCR quality parameter being above or equal to a pre-determined threshold: transmitting the compressed digital image to the server.
    Type: Grant
    Filed: December 13, 2016
    Date of Patent: February 5, 2019
    Assignee: ABBYY DEVELOPMENT LLC
    Inventors: Vasily Loginov, Ivan Zagaynov
  • Patent number: 8903136
    Abstract: The technology of the present disclosure includes computer-implemented methods, computer program products, and systems to filter images before transmitting to a system for optical character recognition (“OCR”). A user computing device obtains a first image of the card from the digital scan of a physical card and analyzes features of the first image, the analysis being sufficient to determine if the first image is likely to be usable by an OCR algorithm. If the user computing device determines that the first image is likely to be usable, then the first image is transmitted to an OCR system associated with the OCR algorithm. Upon a determination that the first image is unlikely to be usable, a second image of the card from the digital scan of the physical card is analyzed. The optical character recognition system performs an optical character recognition algorithm on the filtered card.
    Type: Grant
    Filed: December 18, 2013
    Date of Patent: December 2, 2014
    Assignee: Google Inc.
    Inventors: Xiaohang Wang, Alessandro Bissacco, Glen Berntson, Marria Nazif, Justin Scheiner, Sam Shih, Mark Leslie Snyder, Daniel Talavera
  • Patent number: 7627177
    Abstract: A system is presented for scanning entire books or documents all at once using an adaptive process where the book or document has known fonts and unknown fonts. The known fonts are processed through a verification system where sure words and error words are determined. Both the sure words and error words are sent to OCR training where they are re-OCR'ed and repeatedly verified until they meet predetermined quality criteria. Characters or words not meeting the predetermined quality criteria receive additional OCR training until all the characters and words pass the predetermined quality criteria. Unknown fonts are scanned and clustered together by shape. Outliers in the shapes are manually keyed in. Those symbols that are manually classified go to OCR training and then to the known type optimization process.
    Type: Grant
    Filed: November 24, 2008
    Date of Patent: December 1, 2009
    Assignee: International Business Machines Corporation
    Inventors: Asaf Tzadok, Eugeniusz Walach
  • Publication number: 20180107892
    Abstract: A camera system with dual embedded optical character recognition (OCR) engines. The camera system includes a camera module for capturing an image of a vehicle, the image including a license plate with a license plate number containing characters; a first OCR engine that produces a first read and first confidence level by extracting the characters from the license plate; and a second OCR engine, different from the first OCR engine, that produces a second read and second confidence level by extracting the characters from the license plate. The camera system further includes a comparator for comparing the first read to the second read. If the first read and the second read match, the system produces the matching read as a final read. If the first read and the second read do not match, a fusion module produces a final read using the first read, the first confidence level, the second read, and the second confidence level.
    Type: Application
    Filed: April 6, 2016
    Publication date: April 19, 2018
    Inventors: Peter ISTENES, Stephanie R. SCHUMACHER, Benjamin W. WATSON
  • Patent number: 7480411
    Abstract: A system/method is presented for scanning entire books or documents all at once using an adaptive process where the book or document has known fonts and unknown fonts. The known fonts are processed through a verification system where sure words and error words are determined. Both the sure words and error words are sent to OCR training where they are re-OCR'ed and repeatedly verified until they meet predetermined quality criteria. Characters or words not meeting the predetermined quality criteria receive additional OCR training until all the characters and words pass the predetermined quality criteria. Unknown fonts are scanned and clustered together by shape. Outliers in the shapes are manually keyed in. Those symbols that are manually classified go to OCR training and then to the known type optimization process.
    Type: Grant
    Filed: March 3, 2008
    Date of Patent: January 20, 2009
    Assignee: International Business Machines Corporation
    Inventors: Asaf Tzadok, Eugeniusz Walach
  • Patent number: 9626556
    Abstract: The technology of the present disclosure includes computer-implemented methods, computer program products, and systems to filter images before transmitting to a system for optical character recognition (“OCR”). A user computing device obtains a first image of the card from the digital scan of a physical card and analyzes features of the first image, the analysis being sufficient to determine if the first image is likely to be usable by an OCR algorithm. If the user computing device determines that the first image is likely to be usable, then the first image is transmitted to an OCR system associated with the OCR algorithm. Upon a determination that the first image is unlikely to be usable, a second image of the card from the digital scan of the physical card is analyzed. The optical character recognition system performs an optical character recognition algorithm on the filtered card.
    Type: Grant
    Filed: October 27, 2014
    Date of Patent: April 18, 2017
    Assignee: GOOGLE INC.
    Inventors: Xiaohang Wang, Alessandro Bissacco, Glenn Merlind Berntson, Marria Nazif, Justin Scheiner, Sam Shih, Mark Leslie Snyder, Daniel Talavera
  • Publication number: 20210365836
    Abstract: Systems and methods for computer-implemented pre-optimization of input data before further processing thereof by a computer-implemented analyzation process, such as optical character recognition (OCR). A cooperative model is employed that combines one or more supervised-learning based inspector sub-models, and one or more filter sub-models that operate in series with the inspector sub-model(s). The inspectors first receive the input data and calculate predicted transformation parameters, which are then used to perform transformations on the input data. The inspector-transformed data is then passed to the filters, which derive respective convolution kernels and apply same to the inspector-transformed data before passing same to the OCR or other analyzation process. The inspectors may be pretrained with different training data.
    Type: Application
    Filed: May 13, 2021
    Publication date: November 25, 2021
    Inventor: Ian Jeffrey Wilkins
  • Patent number: 3969700
    Abstract: A data processing system is disclosed for selecting the correct form of a garbled input word misread by an optical character reader so as to change the number of characters in the word by character splitting or concatenation. Dictionary words are stored in the system, having characters which are flagged for segmentation or concatenation OCR misread propensity. The OCR word and a dictionary word are loaded into a pair of associated shift registers, aligning their letters on one end. The dictionary word characters are inspected for error propensity flags. When a splitting propensity, for example, is found for a character, special conditional probability values are accessed from a storage and a calculation is performed of the probability that the first character of the dictionary word was split by the OCR into the first and second characters of the OCR word. This regional context probability is compared with the probability of a simple substitution error for the characters.
    Type: Grant
    Filed: July 30, 1975
    Date of Patent: July 13, 1976
    Assignee: International Business Machines Corporation
    Inventors: Ellen Willis Bollinger, Anne Marie Chaires, Jean Marie Ciconte, Allen Harold Ett, John Joseph Hilliard, Walter Steven Rosenbaum
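
The compare-then-fuse flow described in the abstract of publication 20180107892 above (two OCR engines, a comparator, and a fusion module) can be sketched roughly as follows. This is a minimal illustration and not the patented method: the `Read` type, the function names, and in particular the fusion rule (keep the read with the higher confidence) are assumptions made for the example, since the abstract does not say how the fusion module combines the two reads.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Read:
    """One engine's result for a plate (hypothetical structure)."""
    text: str          # characters extracted from the license plate
    confidence: float  # engine-reported confidence, assumed to lie in [0, 1]


def fuse(first: Read, second: Read) -> str:
    # Assumed fusion rule: keep the higher-confidence read.
    # Ties favor the first engine. The abstract does not specify this rule.
    return first.text if first.confidence >= second.confidence else second.text


def final_read(first: Read, second: Read) -> str:
    # Comparator step: if both engines agree, that read is final.
    if first.text == second.text:
        return first.text
    # Otherwise hand both reads and both confidence levels to the fusion step.
    return fuse(first, second)
```

A real fusion module could work per character or weight engines by past accuracy; the whole-string rule here is only the simplest choice that uses all four inputs named in the abstract.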