Search Patents
  • Publication number: 20110103688
    Abstract: A system and/or method for increasing the accuracy of optical character recognition (OCR) for at least one item, comprising: obtaining OCR results of OCR scanning from at least one OCR module; creating at least one OCR seed using at least a portion of the OCR results; creating at least one OCR learn set using at least a portion of the OCR seed; and applying the OCR learn set to the at least one item to obtain additional optical character recognition (OCR) results.
    Type: Application
    Filed: November 2, 2009
    Publication date: May 5, 2011
    Inventors: Harry Urbschat, Ralph Meier, Thorsten Wanschura, Johannes Hausmann
  • Patent number: 9152883
    Abstract: A system and/or method for increasing the accuracy of optical character recognition (OCR) for at least one item, comprising: obtaining OCR results of OCR scanning from at least one OCR module; creating at least one OCR seed using at least a portion of the OCR results; creating at least one OCR learn set using at least a portion of the OCR seed; and applying the OCR learn set to the at least one item to obtain additional optical character recognition (OCR) results.
    Type: Grant
    Filed: November 2, 2009
    Date of Patent: October 6, 2015
    Inventors: Harry Urbschat, Ralph Meier, Thorsten Wanschura, Johannes Hausmann
  • Patent number: 7171061
    Abstract: Systems and methods for triage of passages of text output from an OCR system by use of trainable models of the accuracy of the OCR system based on attributes of individual characters. The systems and methods according to this invention automatically triage an OCR-output text passage by determining at least one OCR-output character attribute for each OCR-output character, determining an error rate for the OCR-output text passage using a triage model and the determined at least one OCR-output character attribute, and comparing the determined error rate for the OCR-output text passage with an OCR-output text passage threshold error rate to perform an OCR-output text passage triage decision. Triage decision includes for example, sending OCR results directly to an end user without any post-OCR processing, sending the OCR results through a post-OCR inspection and processing stage, sending the original document image to be completely keyed in manually, and a combination thereof.
    Type: Grant
    Filed: July 12, 2002
    Date of Patent: January 30, 2007
    Assignee: Xerox Corporation
    Inventors: Prateek Sarkar, Henry S. Baird, John R. Henderson
  • Publication number: 20180217973
    Abstract: The present disclosure discloses methods and systems for creating a multi-layered Optical Character Recognition (OCR) document, the multi-layered OCR document facilitates selection of the desired text from the multi-layered OCR document. The method includes receiving a scanned image corresponding to a document, the document includes text information. A binary image is generated from the scanned image. Then, a morphological dilation operation is performed to create one or more text groups, using a horizontal structuring element and a vertical structuring element. Thereafter, OCR operation is applied on each text group to generate a corresponding OCR layer. The one or more OCR layers are then combined while creating a multi-layered OCR document. Finally, the combined OCR layers are superimposed as invisible text layers over the scanned image to create the multi-layered OCR document.
    Type: Application
    Filed: January 27, 2017
    Publication date: August 2, 2018
    Inventors: Sainarayanan Gopalakrishnan, Rajasekar Kanagasabai, Sudhagar Subbaian
  • Publication number: 20040010758
    Abstract: Systems and methods for triage of passages of text output from an OCR system by use of trainable models of the accuracy of the OCR system based on attributes of individual characters.
    Type: Application
    Filed: July 12, 2002
    Publication date: January 15, 2004
    Inventors: Prateek Sarkar, Henry S. Baird, John R. Henderson
  • Publication number: 20210319248
    Abstract: An OCR system which acquires character data from a form (50) through OCR processing is characterized by: managing an OCR information table (34e) in which an issuer name of an issuer on the form (50) is associated with a font name of a font used in the OCR processing; and, when the OCR processing is performed on an issuer-recorded content reading target area in the form (50), performing the OCR processing (S156) in the font indicated by the font name associated in the OCR information table with the issuer name of the issuer of the form (50).
    Type: Application
    Filed: August 8, 2019
    Publication date: October 14, 2021
    Applicant: KYOCERA Document Solutions Inc.
    Inventor: Tomohiro Kawasaki
  • Patent number: 10049097
    Abstract: The present disclosure discloses methods and systems for creating a multi-layered Optical Character Recognition (OCR) document, the multi-layered OCR document facilitates selection of the desired text from the multi-layered OCR document. The method includes receiving a scanned image corresponding to a document, the document includes text information. A binary image is generated from the scanned image. Then, a morphological dilation operation is performed to create one or more text groups, using a horizontal structuring element and a vertical structuring element. Thereafter, OCR operation is applied on each text group to generate a corresponding OCR layer. The one or more OCR layers are then combined while creating a multi-layered OCR document. Finally, the combined OCR layers are superimposed as invisible text layers over the scanned image to create the multi-layered OCR document.
    Type: Grant
    Filed: January 27, 2017
    Date of Patent: August 14, 2018
    Assignee: Xerox Corporation
    Inventors: Sainarayanan Gopalakrishnan, Rajasekar Kanagasabai, Sudhagar Subbaian
  • Patent number: 11972197
    Abstract: An OCR system which acquires character data from a form (50) through OCR processing is characterized by: managing an OCR information table (34e) in which an issuer name of an issuer on the form (50) is associated with a font name of a font used in the OCR processing; and, when the OCR processing is performed on an issuer-recorded content reading target area in the form (50), performing the OCR processing (S156) in the font indicated by the font name associated in the OCR information table with the issuer name of the issuer of the form (50).
    Type: Grant
    Filed: August 8, 2019
    Date of Patent: April 30, 2024
    Assignee: KYOCERA Document Solutions Inc.
    Inventor: Tomohiro Kawasaki
  • Publication number: 20070047847
    Abstract: A document OCR implementing device, includes a reading part configured to read a document and form a recognition image; an obtaining part configured to perform image processing of the recognition image and obtain a state of the recognition image; a plurality of OCR engines configured to perform a character recognition process of the recognition image; and a designating part configured to designate the OCR engine by combining the recognition image and the OCR engine; wherein the character recognition process is implemented by using the OCR engine designated by the designating part.
    Type: Application
    Filed: August 25, 2006
    Publication date: March 1, 2007
    Inventor: Kiyoshi Kasatani
  • Publication number: 20120134589
    Abstract: An image of a known text sample having a text type is generated. The image of the known text sample is input into each OCR engine of a number of OCR engines. Output text corresponding to the image of the known text sample is received from each OCR engine. For each OCR engine, the output text received from the OCR engine is compared with the known text sample, to determine a confidence value of the OCR engine for the text type of the known text sample.
    Type: Application
    Filed: November 27, 2010
    Publication date: May 31, 2012
    Inventor: Prakash Reddy
  • Publication number: 20170109594
    Abstract: The present disclosure is directed to systems, methods, and devices that enable the revising of Optical Character Recognition (OCR) data by indexing and displaying potential error locations within the OCR data. The primary method for revising the OCR data includes a terminal device indexing, displaying, receiving editing operations for, and editing the OCR data. The terminal device is configured to revise OCR data and includes an OCR review element, which, in some embodiments, is a software stored on a non-transitory, computer-readable medium, that is executed by a processing unit to cause the terminal device to index, display, receive editing operations for, and edit the OCR data.
    Type: Application
    Filed: October 20, 2015
    Publication date: April 20, 2017
    Inventors: Allan Sahagun, Jacek Joseph Matysiak
  • Publication number: 20100329537
    Abstract: A computer-implemented method is provided of identifying an optical character recognition (OCR) font to assist an operator in setting up a bank remittance coupon application. The computer-implemented method comprises electronically on a processor reading an OCR font in a zone of the bank remittance coupon, electronically on a processor comparing the read OCR font with an OCR font stored in a look-up table to determine if the read OCR font meets predetermined criteria, and storing the read OCR font into a configuration file to setup at least a portion of the bank remittance coupon application when the read OCR font meets the predetermined criteria.
    Type: Application
    Filed: June 25, 2009
    Publication date: December 30, 2010
    Inventor: Michael E. Gardi
  • Patent number: 9760786
    Abstract: The present disclosure is directed to systems, methods, and devices that enable the revising of Optical Character Recognition (OCR) data by indexing and displaying potential error locations within the OCR data. The primary method for revising the OCR data includes a terminal device indexing, displaying, receiving editing operations for, and editing the OCR data. The terminal device is configured to revise OCR data and includes an OCR review element, which, in some embodiments, is a software stored on a non-transitory, computer-readable medium, that is executed by a processing unit to cause the terminal device to index, display, receive editing operations for, and edit the OCR data.
    Type: Grant
    Filed: October 20, 2015
    Date of Patent: September 12, 2017
    Assignee: KYOCERA Document Solutions Inc.
    Inventors: Allan Sahagun, Jacek Joseph Matysiak
  • Publication number: 20160259991
    Abstract: Embodiments of the present disclosure disclose a method for performing Optical Character Recognition (OCR) of an article. The method comprises acquiring an image of the article. The image of the article is scanned using predetermined scan settings. Then, textual regions of the scanned image of the article are identified. The OCR of the at least one of the textual regions is performed using predetermined OCR settings. One or more textual regions of the textual regions are marked upon determining an error in performing the OCR of the one or more textual regions. The OCR of the one or more textual regions is iterated as per one or more predefined OCR scanning parameters based on an OCR quality of the one or more textual regions upon marking the one or more textual regions.
    Type: Application
    Filed: June 22, 2015
    Publication date: September 8, 2016
    Applicant: Wipro Limited
    Inventors: Tomson Ganapathiplackal George, Sudheesh Joseph
  • Publication number: 20150049947
    Abstract: Dynamically configuring OCR processing may include determining a device type and determining whether to perform optical character recognition (OCR) processing of the received image locally based on one or more OCR parameters. Example OCR parameters may include the device type, the image type, the size of the received image, the available amount of the memory, the measured/benchmarked throughput of OCR processing on the device relative to an OCR server throughput and network throughput, and/or the current level of network connectivity. If it is determined that OCR processing of the received image should be performed locally, the device may compute one or more name-value pairs corresponding to the received image and transmit the name-value pairs to a remote data server for processing.
    Type: Application
    Filed: August 13, 2013
    Publication date: February 19, 2015
    Applicant: Bank of America Corporation
    Inventors: Georgios Katsaros, Donald Werner Schoppe, Bryan Anthony VonCannon, Pavan Chayanam
  • Patent number: 8452099
    Abstract: An image of a known text sample having a text type is generated. The image of the known text sample is input into each OCR engine of a number of OCR engines. Output text corresponding to the image of the known text sample is received from each OCR engine. For each OCR engine, the output text received from the OCR engine is compared with the known text sample, to determine a confidence value of the OCR engine for the text type of the known text sample.
    Type: Grant
    Filed: November 27, 2010
    Date of Patent: May 28, 2013
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Prakash Reddy
  • Patent number: 9984287
    Abstract: Embodiments of the present disclosure disclose a method for performing Optical Character Recognition (OCR) of an article. The method comprises acquiring an image of the article. The image of the article is scanned using predetermined scan settings. Then, textual regions of the scanned image of the article are identified. The OCR of the at least one of the textual regions is performed using predetermined OCR settings. One or more textual regions of the textual regions are marked upon determining an error in performing the OCR of the one or more textual regions. The OCR of the one or more textual regions is iterated as per one or more predefined OCR scanning parameters based on an OCR quality of the one or more textual regions upon marking the one or more textual regions.
    Type: Grant
    Filed: June 22, 2015
    Date of Patent: May 29, 2018
    Assignee: Wipro Limited
    Inventors: Tomson Ganapathiplackal George, Sudheesh Joseph
  • Publication number: 20200380286
    Abstract: Disclosed herein are computer-implemented methods, computer-implemented systems, and non-transitory, computer-readable media for automatic Optical Character Recognition (OCR) correction. One computer-implemented method includes evaluating an OCR result using a trained Long short-term memory (LSTM) neural network language model to determine whether correction to the OCR result is required. If correction to the OCR result is required, a most similar text relative to the OCR result is determined from a name and address corpus using a modified edit distance technique. The OCR result is corrected with the determined most similar text.
    Type: Application
    Filed: February 14, 2020
    Publication date: December 3, 2020
    Applicant: Alibaba Group Holding Limited
    Inventor: Ruoyu Li
  • Patent number: 7769249
    Abstract: A document OCR implementing device, includes a reading part configured to read a document and form a recognition image; an obtaining part configured to perform image processing of the recognition image and obtain a state of the recognition image; a plurality of OCR engines configured to perform a character recognition process of the recognition image; and a designating part configured to designate the OCR engine by combining the recognition image and the OCR engine; wherein the character recognition process is implemented by using the OCR engine designated by the designating part.
    Type: Grant
    Filed: August 25, 2006
    Date of Patent: August 3, 2010
    Assignee: Ricoh Company, Limited
    Inventor: Kiyoshi Kasatani
  • Patent number: 8983190
    Abstract: Dynamically configuring OCR processing may include determining a device type and determining whether to perform optical character recognition (OCR) processing of the received image locally based on one or more OCR parameters. Example OCR parameters may include the device type, the image type, the size of the received image, the available amount of the memory, the measured/benchmarked throughput of OCR processing on the device relative to an OCR server throughput and network throughput, and/or the current level of network connectivity. If it is determined that OCR processing of the received image should be performed locally, the device may compute one or more name-value pairs corresponding to the received image and transmit the name-value pairs to a remote data server for processing.
    Type: Grant
    Filed: August 13, 2013
    Date of Patent: March 17, 2015
    Assignee: Bank of America Corporation
    Inventors: Georgios Katsaros, Donald Werner Schoppe, Bryan Anthony VonCannon, Pavan Chayanam
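
Illustrative sketch (not taken from any of the listed patents): the abstract of Patent 8452099 describes feeding an image of a known text sample to several OCR engines and comparing each engine's output with the known text to obtain a per-engine confidence value. The Python below shows one way that comparison might look. The engines here are hypothetical stand-in functions that receive the text itself in place of a rendered image, and the similarity measure (difflib's SequenceMatcher ratio) is an assumption; the abstract does not say how the comparison is scored.

```python
from difflib import SequenceMatcher
from typing import Callable, Dict

# Hypothetical stand-in: an "engine" is any callable that maps an image to text.
# A plain string plays the role of the image so the sketch runs without OCR software.
OcrEngine = Callable[[str], str]


def engine_confidences(
    known_text: str, image: str, engines: Dict[str, OcrEngine]
) -> Dict[str, float]:
    """Score each engine by how closely its output matches the known text."""
    scores = {}
    for name, engine in engines.items():
        output = engine(image)
        # Assumed metric: similarity ratio in [0, 1]; 1.0 means an exact match.
        scores[name] = SequenceMatcher(None, known_text, output).ratio()
    return scores


def perfect_engine(image: str) -> str:
    """Toy engine that reads every character correctly."""
    return image


def noisy_engine(image: str) -> str:
    """Toy engine that confuses 'l' with '1' and 'O' with '0'."""
    return image.replace("l", "1").replace("O", "0")


known = "Hello OCR World"
scores = engine_confidences(
    known, known, {"perfect": perfect_engine, "noisy": noisy_engine}
)
# Pick the engine with the highest confidence for this text type.
best = max(scores, key=scores.get)
```

With a real setup, the known sample would be rendered to an image for each text type of interest (for example a given font or language), and the resulting scores could be stored per text type to choose an engine later.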