Correcting Alphanumeric Recognition Errors Patents (Class 382/310)
  • Patent number: 11468128
    Abstract: A search engine optimizer works independently and in parallel with a browser and search engine supercomputer to gather, analyze, and distill input information interactively. The optimizer reorganizes the input and provides an optimized version as an output. The optimized version of the input (i.e., the output) is sent to the search engine, which responds to the end user with search results. The optimizer recognizes each request as a pattern and stores the pattern in an advanced Glyph format. This permits the optimizer to identify a left and right side check mate combination required to achieve certitude.
    Type: Grant
    Filed: December 24, 2014
    Date of Patent: October 11, 2022
    Inventor: Richard Paiz
  • Patent number: 11205091
    Abstract: An object information registration apparatus that registers information of a first object that is a reference object of object recognition holds a first object image that is an image of the first object and recognition method information related to the first object, selects one or more partial regions included in the first object image, sets a recognition method corresponding to each of the one or more partial regions, acquires feature information of each of the one or more partial regions from the first object image based on the set recognition method, and stores the one or more partial regions, the set recognition method, and the acquired feature information in the recognition method information in association with each other.
    Type: Grant
    Filed: March 30, 2020
    Date of Patent: December 21, 2021
    Assignee: HITACHI, LTD.
    Inventors: Taiki Yano, Nobutaka Kimura, Nobuhiro Chihara, Fumiko Beniyama
  • Patent number: 11144797
    Abstract: An image processing apparatus comprises a detection circuit that detects a target photographic subject in an obtained image by referencing dictionary data acquired by machine learning corresponding to the target photographic subject; a selection unit that selects one of a plurality of dictionary data items corresponding to the target photographic subject; and a control circuit that, in a case where a detection evaluation value obtained when the photographic subject is detected using the dictionary data selected by the selection unit is lower than a predetermined value, controls the detection circuit to detect the target photographic subject by using both the selected dictionary data and dictionary data different from the selected dictionary data.
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: October 12, 2021
    Assignee: CANON KABUSHIKI KAISHA
    Inventor: Ryosuke Tsuji
  • Patent number: 11087122
    Abstract: A minimum edit cost is representative of a cost of edit operations performed on a candidate string detected in an image to satisfy characteristics of a model string. An attempt is made to compute the minimum edit cost between the candidate string and the model string. Upon determining that the candidate string includes a blank character at a first character position between two consecutive non-blank candidate characters, and in response to determining that there is a non-blank model character at a second character position of the model string and that the second character position is associated with the first character position of the blank character in the candidate string, an indication that the minimum edit cost between the candidate string and the model string cannot be computed and that the candidate string is not a match to the model string is output.
    Type: Grant
    Filed: April 18, 2019
    Date of Patent: August 10, 2021
    Assignee: MATROX ELECTRONIC SYSTEMS LTD.
    Inventor: Dominique Rivard
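A minimal Python sketch of the rejection rule described in the abstract of patent 11087122 above. It assumes a blank candidate character is represented by a space and that candidate and model positions are associated simply by index; the function name, the fallback dynamic-programming edit cost, and those conventions are illustrative, not taken from the patent.

```python
def edit_cost_or_reject(candidate, model):
    """Return an edit cost, or None when the cost cannot be computed:
    a blank sits between two non-blank candidate characters at a position
    where the model string has a non-blank character (assumed conventions:
    ' ' marks a blank, positions are associated by index)."""
    for i, ch in enumerate(candidate):
        if ch != " ":
            continue
        between_nonblanks = (0 < i < len(candidate) - 1
                             and candidate[i - 1] != " "
                             and candidate[i + 1] != " ")
        model_has_char = i < len(model) and model[i] != " "
        if between_nonblanks and model_has_char:
            return None   # not a match; no edit cost is output

    # Plain dynamic-programming edit distance as the fallback cost.
    prev = list(range(len(model) + 1))
    for i, c in enumerate(candidate, 1):
        cur = [i]
        for j, m in enumerate(model, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (c != m)))  # substitution
        prev = cur
    return prev[-1]

print(edit_cost_or_reject("A 1", "AB1"))   # None: blank where 'B' is required
print(edit_cost_or_reject("AB1", "AB1"))   # 0
```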
  • Patent number: 10671973
    Abstract: The present disclosure discloses a method for automatic processing of forms using augmented reality. In an embodiment, a filled-out application form including one or more fields is scanned in augmented reality mode and one or more images of it are captured. The captured images are used to identify the form type by comparing them with an original application form and to generate an electronic version of the form. Subsequently, filled-out content in the images is extracted and compared with a retrieved configuration file, which has the same type as the identified type of the filled-out application form. Based on this comparison, one or more messages are generated and superimposed on the electronic version of the form in the augmented reality mode, and both are displayed.
    Type: Grant
    Filed: January 3, 2013
    Date of Patent: June 2, 2020
    Assignee: XEROX CORPORATION
    Inventors: Kovendhan Ponnavaikko, Nischal M Piratla, Sivasubramanian Kandaswamy, Anuradha Rukmangathan, Raja Srinivasan
  • Patent number: 10606947
    Abstract: A speech recognition apparatus includes a predictor configured to predict a word class of a word following a word sequence that has been previously searched for based on the word sequence that has been previously searched for; and a decoder configured to search for a candidate word corresponding to a speech signal, extend the word sequence that has been previously searched for using the candidate word that has been searched for, and adjust a probability value of the extended word sequence based on the predicted word class.
    Type: Grant
    Filed: October 21, 2016
    Date of Patent: March 31, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Ji Hyun Lee
  • Patent number: 10304000
    Abstract: Systems and methods are disclosed for selecting cohorts. In one implementation, a model-assisted selection system for identifying candidates for placement into a cohort includes a data interface and at least one processing device. The at least one processing device is programmed to access, via the data interface, a database from which feature vectors associated with an individual from among a population of individuals can be derived; derive, for the individual, one or more feature vectors from the database; provide the one or more feature vectors to a model; receive an output from the model; and determine whether the individual from among the population of individuals is a candidate for the cohort based on the output received from the model.
    Type: Grant
    Filed: April 12, 2018
    Date of Patent: May 28, 2019
    Assignee: Flatiron Health, Inc.
    Inventors: Benjamin Edward Birnbaum, Joshua Daniel Haimson, Lucy Dao-Ke He, Katharina Nicola Seidl-Rathkopf, Monica Nayan Agrawal, Nathan Nussbaum
  • Patent number: 9990268
    Abstract: A system and method for detection of duplicate bug reports. A receiver is configured to receive a first bug report and a word matrix. An extractor extracts keywords from the first bug report for creating a first search string. A comparator compares each of the keywords from the first search string with the word matrix for identifying dissimilar duplicate words. The duplicate bug detector further includes an expander to expand the first search string by including the dissimilar duplicate words for creating the second search string and a searcher to search a bug repository with the first search string and the second search string for identifying similar duplicate bug reports and dissimilar duplicate bug reports.
    Type: Grant
    Filed: March 9, 2016
    Date of Patent: June 5, 2018
    Assignee: Infosys Limited
    Inventors: Satya Prateek Bommaraju, Anjaneyulu Pasala, Shivani Rao
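A small Python sketch of the expand-then-search flow described in the abstract of patent 9990268 above, assuming the word matrix can be approximated by a plain synonym dictionary and the bug repository by an in-memory list; the stopword list, overlap scoring, and all names are illustrative.

```python
from typing import Dict, List, Set

STOPWORDS = {"the", "a", "an", "is", "in", "on", "when", "and", "to"}

def extract_keywords(report: str) -> Set[str]:
    """First search string: content words taken from the bug report."""
    return {w.lower().strip(".,") for w in report.split()} - STOPWORDS

def expand(keywords: Set[str], word_matrix: Dict[str, List[str]]) -> Set[str]:
    """Second search string: add the dissimilar duplicate words the matrix maps to."""
    expanded = set(keywords)
    for kw in keywords:
        expanded.update(word_matrix.get(kw, []))
    return expanded

def search(repository: List[str], terms: Set[str], min_overlap: int = 2) -> List[str]:
    """Return stored reports sharing at least `min_overlap` terms with the query."""
    return [r for r in repository
            if len(extract_keywords(r) & terms) >= min_overlap]

word_matrix = {"crash": ["abort", "terminate"], "login": ["signin", "authentication"]}
repo = ["App may abort on signin screen", "Printer jams when duplexing"]
first = extract_keywords("Crash when login fails")
second = expand(first, word_matrix)
# The dissimilar duplicate is found only with the expanded search string.
print(search(repo, first), search(repo, second))
```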
  • Patent number: 9928273
    Abstract: In a method for searching a computer database, a processor receives a set of data containing at least a first character. A processor creates a converted set of data by converting the first character in the received set of data into a second character, wherein the second character represents the first character and one or more additional characters based upon a predetermined equivalency. A processor searches the computer database for a previously stored data entry using the converted set of data. A processor returns a retrieved result of the searching.
    Type: Grant
    Filed: August 19, 2013
    Date of Patent: March 27, 2018
    Assignee: International Business Machines Corporation
    Inventors: Adrian M. Boyko, William J. Oliver
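A minimal sketch of the conversion-based search described in the abstract of patent 9928273 above. The equivalency table, the choice of representatives, and the dictionary standing in for the computer database are illustrative assumptions.

```python
# Assumed equivalency classes: visually confusable characters collapse
# to one representative (illustrative, not taken from the patent).
EQUIVALENTS = {"O": "0", "I": "1", "|": "1", "S": "5"}

def canonical(text):
    """Convert each character into its equivalence-class representative."""
    return "".join(EQUIVALENTS.get(ch, ch) for ch in text.upper())

# Tiny stand-in for the database: entries indexed by their converted form.
database = {}
for entry in ["B00K-15", "BOOK-IS"]:       # both collapse to the same key
    database.setdefault(canonical(entry), []).append(entry)

def search(query):
    """Search with the converted set of data rather than the raw query."""
    return database.get(canonical(query), [])

print(search("BOOK-15"))   # ['B00K-15', 'BOOK-IS']
```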
  • Patent number: 9858257
    Abstract: A machine learning engine may correlate contextual information associated with a misspelling in a publication with a likelihood that the misspelling is intentional in nature. Training data may be generated by analyzing one or more past publications to identify misspellings and labeling the misspellings as intentional. A contextual indicators application may analyze the context in which intentional misspellings have been previously included within publications to identify indicators of future misspellings being intentional. A machine learning engine may use the training data and indicators to generate an intentional linguistic deviation (ILD) prediction model to determine whether a new misspelling is an intentional misspelling. The machine learning engine may also determine weights for individual indicators that may calibrate the influence of the respective individual indicators.
    Type: Grant
    Filed: July 20, 2016
    Date of Patent: January 2, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Janna S. Hamaker, Sravan Babu Bodapati, John Hambacher, Gururaj Narayanan, Sriraghavendra Ramaswamy
  • Patent number: 9786272
    Abstract: According to an embodiment, a decoder includes a token operating unit, a node adder, and a connection detector. The token operating unit is configured to, every time a signal or a feature is input, propagate each of a plurality of tokens, which is an object assigned with a state of a path being searched, according to a digraph until a state or a transition assigned with a non-empty input symbol is reached. The node adder is configured to, in each instance of token propagating, add, in a lattice, a node corresponding to a state assigned to each of the plurality of tokens. The connection detector is configured to refer to the digraph and detect a node that is connected to a node added in an i-th instance in the lattice and that is added in an i+1-th instance in the lattice.
    Type: Grant
    Filed: December 18, 2014
    Date of Patent: October 10, 2017
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventor: Manabu Nagao
  • Patent number: 9659018
    Abstract: An MFP (Multifunction Peripheral) is a file name producing apparatus that produces a file name of an image. The MFP selects a candidate character string, which is a file name candidate and in which a head character is a space, from the character strings extracted from the image. The MFP deletes the space that is the head character of the candidate character string, or changes all the characters constituting the candidate character string to other characters. The MFP produces the character string, which is corrected by the deletion or the change, as the file name of the image. Therefore, the proper file name can be produced.
    Type: Grant
    Filed: September 28, 2012
    Date of Patent: May 23, 2017
    Assignee: KONICA MINOLTA BUSINESS TECHNOLOGIES, INC.
    Inventor: Takuya Kawano
  • Patent number: 9646061
    Abstract: Methods and arrangements for performing a fuzzy search. A contemplated method includes: establishing an edit distance threshold for the fuzzy search; generating an index of items to be searched, via: storing at least one string; and creating substrings corresponding to the at least one string; providing a query string for use in searching; creating substrings corresponding to the query string; comparing substrings of the query string with substrings in the index; designating at least one candidate string based on said comparing; verifying whether each candidate string satisfies the edit distance threshold; and outputting at least one matching string for each candidate string that satisfies the edit distance threshold. Other variants and embodiments are broadly contemplated herein.
    Type: Grant
    Filed: January 22, 2015
    Date of Patent: May 9, 2017
    Assignee: International Business Machines Corporation
    Inventors: Manoj Kumar Agarwal, Rajeev Gupta
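A compact Python sketch of the substring-index-plus-verification scheme the abstract of patent 9646061 above describes, using fixed-length substrings (2-grams) and a standard dynamic-programming edit distance; the substring length, the threshold, and the names are assumptions for illustration.

```python
from collections import defaultdict

def qgrams(s: str, q: int = 2) -> set:
    """Substrings of length q, used both for indexing and for querying."""
    return {s[i:i + q] for i in range(max(len(s) - q + 1, 1))}

def edit_distance(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def build_index(strings):
    """Index of items to be searched, keyed by their substrings."""
    index = defaultdict(set)
    for s in strings:
        for g in qgrams(s):
            index[g].add(s)
    return index

def fuzzy_search(query: str, index, threshold: int = 1):
    # Candidate strings share at least one substring with the query ...
    candidates = set().union(*(index.get(g, set()) for g in qgrams(query)))
    # ... and matches are the candidates satisfying the edit distance threshold.
    return [c for c in candidates if edit_distance(query, c) <= threshold]

index = build_index(["invoice", "involve", "device", "advice"])
print(fuzzy_search("invoise", index))   # ['invoice'] for threshold 1
```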
  • Patent number: 9444968
    Abstract: An image forming apparatus includes a print device, a sign printing circuit, and a sign analyzing circuit. The print device performs a print job onto a print medium. The sign printing circuit causes the print device to print a sign where a glyph is present in a font corresponding to a code based on the code included in print data. The sign analyzing circuit analyzes whether or not the glyph is invalid data where the sign printing circuit is not able to cause the print device to print. When the sign analyzing circuit analyzes that the glyph present in the font corresponding to the code of a target sign is the invalid data, the sign printing circuit causes the print device to print a specific alternative sign as an alternative to the target sign.
    Type: Grant
    Filed: September 1, 2015
    Date of Patent: September 13, 2016
    Assignee: KYOCERA Document Solutions Inc.
    Inventor: Keizen Kanazawa
  • Patent number: 9076061
    Abstract: According to one aspect, embodiments of the invention provide a system and method for utilizing the effort expended by a user in responding to a CAPTCHA request to automatically transcribe text from images in order to verify, retrieve and/or update geographic data associated with geographic locations at which the images were recorded.
    Type: Grant
    Filed: April 12, 2012
    Date of Patent: July 7, 2015
    Assignee: Google Inc.
    Inventors: Marco Zennaro, Luc Vincent, Kong Man Cheung, David Abraham
  • Patent number: 9057618
    Abstract: Systems and methods provide approximations of latitude and longitude coordinates of objects, for example a business, in street level images. The images may be collected by a camera. An image of a business is collected along with GPS coordinates and direction of the camera. Depth maps of the images may be generated, for example, based on laser depth detection or displacement of the business between two images caused by a change in the position of the camera. After identifying a business in one or more images, the distance from the camera to a point or area relative to the business in the one or more images may be determined based on the depth maps. Using this distance and the direction of the camera which collected the one or more images and GPS coordinates of the camera, the approximate GPS coordinates of the business may be determined.
    Type: Grant
    Filed: September 24, 2013
    Date of Patent: June 16, 2015
    Assignee: Google Inc.
    Inventors: Abhijit S. Ogale, Stephane Lafon, Andrea Frome
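The last geometric step described in the abstract of patent 9057618 above, from the camera's GPS position, its heading, and a distance taken from a depth map to approximate coordinates of the object, can be sketched with a small-distance flat-earth approximation; the depth-map construction itself is not reproduced, and the constants and names are illustrative.

```python
import math

EARTH_RADIUS_M = 6_371_000.0

def approximate_position(cam_lat: float, cam_lon: float,
                         heading_deg: float, distance_m: float):
    """Project a point `distance_m` metres from the camera along its heading
    (degrees clockwise from north); adequate for street-level ranges."""
    north = distance_m * math.cos(math.radians(heading_deg))
    east = distance_m * math.sin(math.radians(heading_deg))
    dlat = math.degrees(north / EARTH_RADIUS_M)
    dlon = math.degrees(east / (EARTH_RADIUS_M * math.cos(math.radians(cam_lat))))
    return cam_lat + dlat, cam_lon + dlon

# Camera at an intersection, storefront 25 m away, bearing 90 degrees (due east).
print(approximate_position(37.7749, -122.4194, 90.0, 25.0))
```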
  • Patent number: 9031309
    Abstract: The recognition rate is improved and recognition errors are suppressed when recognizing magnetic ink characters. A character recognition unit calculates a total difference by calculating the total of the differences between the character waveform data and the reference waveform data for each magnetic ink character within the area of one character; calculates a partial difference by summing the differences between character waveform data and reference waveform data in a target area, which is the area corresponding to a stroke that is 2 mesh or more wide in the area of one character; executes a correction process that reduces the value of the partial difference; and recognizes the candidate character as the magnetic ink character that was read when the total difference after the correction process is less than or equal to a threshold value.
    Type: Grant
    Filed: June 12, 2013
    Date of Patent: May 12, 2015
    Assignee: Seiko Epson Corporation
    Inventor: Yoshiaki Kinoshita
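A rough Python sketch of the total/partial waveform-difference test described in the abstract of patent 9031309 above. The correction factor, the threshold, the toy waveforms, and the way the correction is folded back into the total are illustrative assumptions, not values or formulas from the patent.

```python
def match_candidate(char_waveform, reference_waveform, target_slice,
                    correction_factor=0.5, threshold=20.0):
    """Decide whether scanned MICR waveform data matches a candidate
    character's reference waveform."""
    diffs = [abs(c - r) for c, r in zip(char_waveform, reference_waveform)]
    total_diff = sum(diffs)
    # Partial difference: the part of the one-character window covering
    # a wide (>= 2 mesh) stroke, given here as a slice.
    partial_diff = sum(diffs[target_slice])
    # Correction process: reduce the contribution of the partial difference.
    corrected_total = total_diff - partial_diff * (1.0 - correction_factor)
    return corrected_total <= threshold

reference = [0, 5, 20, 20, 5, 0, 0, 10]
scanned   = [0, 6, 32, 30, 6, 0, 1, 11]
# Raw total difference is 26 (> threshold); corrected total is 15 (<= threshold).
print(match_candidate(scanned, reference, target_slice=slice(2, 4)))   # True
```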
  • Patent number: 8995795
    Abstract: Textual errors in digital volumes in a corpus are corrected by comparing a set of similar digital volumes, the set including a basis volume and a plurality of comparison volumes. The basis volume is compared with the comparison volumes to identify sequences of text that are identical across all of the candidate volumes and mismatched sequences of text that contain different text in at least one of the candidate volumes. The correct text for at least some of the mismatched sequences is resolved by comparing the different text in the different candidate volumes. The mismatched sequences are replaced by the resolved correct text, thereby correcting errors in the candidate volumes.
    Type: Grant
    Filed: February 14, 2012
    Date of Patent: March 31, 2015
    Assignee: Google Inc.
    Inventor: Dana L. Dickinson
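The resolution step described in the abstract of patent 8995795 above (choosing correct text for a mismatched sequence by comparing the candidate volumes) can be sketched as a majority vote. This assumes the volumes have already been aligned token-for-token, which the sketch does not reproduce, and the names are illustrative.

```python
from collections import Counter

def correct_by_consensus(volumes):
    """Resolve mismatches across OCR'd copies of the same text by majority
    vote; assumes the copies are already aligned token-for-token."""
    corrected = []
    for tokens in zip(*[v.split() for v in volumes]):
        if len(set(tokens)) == 1:          # identical across all copies
            corrected.append(tokens[0])
        else:                              # mismatched sequence: take the majority
            corrected.append(Counter(tokens).most_common(1)[0][0])
    return " ".join(corrected)

copies = [
    "It was the best of tirnes",
    "It was the hest of times",
    "It was the best of times",
]
print(correct_by_consensus(copies))   # "It was the best of times"
```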
  • Patent number: 8996476
    Abstract: Apparatus, methods and media for correcting a defective check processing datum. The apparatus may include, and the methods and media may involve, a receiver that is configured to receive from memory a first transaction record. The transaction record may include Magnetic Ink Character Recognition (MICR) line data. The MICR line data may be electronically read from a check. The transaction record may include non-MICR data. The non-MICR data may be electronically read from the check. The apparatus may include, and the methods and media may involve, a processor that is configured to identify a defective datum among the MICR line data. The processor may identify a portion of the non-MICR data that corresponds to the defective datum. The processor may store in memory a second transaction record. The second transaction record may include corrected data that includes an element that is derived from the identified portion of the non-MICR data.
    Type: Grant
    Filed: August 20, 2012
    Date of Patent: March 31, 2015
    Assignee: Bank of America Corporation
    Inventors: Geoffrey R. Williams, Timmy L. Gauvin, Kerry M. Cantley, Deborah N. Bennett, Eric S. Sandoz, II, James F. Barrett, II, Joshua A. Beaudry
  • Patent number: 8971670
    Abstract: A system includes preparing respective proof reading tools for performing carpet proof reading and side-by-side proof reading of text data, and recording a log of time to perform proof reading operations by using the first and second proof reading tools. The method further includes estimating, based on times stored in a log, times to perform proof reading of a character using 1) the first proof reading tool followed by using the second proof reading tool, and 2) the second proof reading tool. The method further includes determining for each character value, based on the estimated times, to use the first proof reading tool along with using the second proof reading tool or to use the second proof reading tool without using the first proof reading tool.
    Type: Grant
    Filed: November 6, 2012
    Date of Patent: March 3, 2015
    Assignee: International Business Machines Corporation
    Inventors: Takashi Itoh, Toshinari Itoki, Takayuki Osogami
  • Patent number: 8965125
    Abstract: Character code data and vector drawing data are both listed and provided in a re-editable manner. Electronic data is generated in which information obtained by vectorizing character areas in an image and information obtained by recognizing characters in the image are stored in respective storage locations. As for the electronic data generated in this manner, because the character code data and vector drawing data generated from the input image are both presented by a display and edit program, a user can immediately utilize both types of data.
    Type: Grant
    Filed: September 24, 2013
    Date of Patent: February 24, 2015
    Assignee: Canon Kabushiki Kaisha
    Inventors: Taeko Yamazaki, Tomotoshi Kanatsu, Makoto Enomoto, Kitahiro Kaneda
  • Patent number: 8953910
    Abstract: A method includes preparing respective proof reading tools for performing carpet proof reading and side-by-side proof reading of text data, recording a log of time to perform proof reading operations by using the first and second proof reading tools. The method further includes estimating, based on times stored in a log, times to perform proof reading of a character using 1) the first proof reading tool followed by using the second proof reading tool, and 2) the second proof reading tool. The method further includes determining for each character value, based on the estimated times, to use the first proof reading tool along with using the second proof reading tool or to use the second proof reading tool without using the first proof reading tool.
    Type: Grant
    Filed: September 28, 2012
    Date of Patent: February 10, 2015
    Assignee: International Business Machines Corporation
    Inventors: Takashi Itoh, Toshinari Itoki, Takayuki Osogami
  • Patent number: 8934676
    Abstract: A method and system for achieving accurate segmentation of characters with respect to a license plate image within a tight bounding box image. A vehicle image can be captured by an image capturing unit and processed utilizing an ALPR unit. A vertical projection histogram can be calculated to produce an initial character boundary (cuts) and local statistical information can be employed to split a large cut and insert a missing character. The cut can be classified as a valid and/or a suspect character and the suspect character can be analyzed. The suspect character can be normalized and passed to an OCR module for decoding and generating a confidence quote with every conclusion. The non-character images can be rejected at the OCR level by enforcing a confidence threshold. An adjoining suspect narrow character can be combined and the OCR confidence of the combined character can be assessed.
    Type: Grant
    Filed: July 2, 2012
    Date of Patent: January 13, 2015
    Assignee: Xerox Corporation
    Inventors: Aaron Michael Burry, Claude Fillion, Vladimir Kozitsky
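A minimal sketch of the vertical projection histogram that produces the initial character boundaries (cuts) mentioned in the abstract of patent 8934676 above; splitting large cuts, inserting missing characters, and the OCR confidence handling are not reproduced, and the toy bitmap is illustrative.

```python
def character_cuts(binary_plate):
    """Initial character boundaries from a vertical projection histogram:
    columns whose ink count is zero separate candidate characters.
    `binary_plate` is a row-major list of 0/1 lists (1 = ink)."""
    width = len(binary_plate[0])
    histogram = [sum(row[x] for row in binary_plate) for x in range(width)]
    cuts, start = [], None
    for x, count in enumerate(histogram):
        if count > 0 and start is None:
            start = x                      # character region begins
        elif count == 0 and start is not None:
            cuts.append((start, x))        # character region ends
            start = None
    if start is not None:
        cuts.append((start, width))
    return cuts

plate = [
    [0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0],
    [0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0],
]
print(character_cuts(plate))   # [(1, 3), (5, 6), (8, 11)]
```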
  • Patent number: 8792751
    Abstract: Embodiments of a computer system, a method and a computer-program product (e.g., software) for use with the computer system are described. These embodiments allow a user to provide an image of a document for use with software, such as an image of a financial document for use with financial software. In particular, the user can provide the image of the document, for example, by taking a picture of the document using a cellular telephone. This image may be converted into an electronic format that is suitable for text and numerical processing using a character-recognition technique, such as optical character recognition or intelligent character recognition. Errors in the electronic version of the document, if present, may be identified and corrected by comparing the electronic version to information maintained by a third party. This information may be accessed based at least on one or more items in the electronic version of the document.
    Type: Grant
    Filed: July 27, 2009
    Date of Patent: July 29, 2014
    Assignee: Intuit Inc.
    Inventors: Amir Eftekhari, Erikheath A. Thomas, Carol Ann Howe, George Thomas Ericksen, Gerald B. Huff, Gang Wang
  • Patent number: 8744135
    Abstract: Searchable annotated formatted documents are produced by correlating documents stored as photographic or scanned graphic representations of an actual document (evidence, report, court order, etc.) with textual versions of the same documents. A produced document provides additional details in a data structure that supports citation annotation as well as other types of analysis of a document. The data structure also supports generation of citation reports and corpus reports. Disclosed are methods of creating searchable annotated formatted documents, including citation and corpus reports, by correlating and correcting text files with photographic or scanned graphics of the original documents; data structures for correlating and correcting text files with graphic images; generation of citation reports, concordance reports, and corpus reports; and data structures for generating those reports. Multiple document data structures are used to create multiple citation documents and reports.
    Type: Grant
    Filed: October 4, 2010
    Date of Patent: June 3, 2014
    Inventor: Kendyl A. Román
  • Patent number: 8682075
    Abstract: Data representing an image of text is received, as is data representing the text in non-image form. A valid content boundary within the image of the text is determined. For each character within the text in the non-image form, a location of the character within the image of the text is determined. Where the location of the character within the image of the text falls outside the valid content boundary, the character is removed from the data representing the text in the non-image form.
    Type: Grant
    Filed: December 28, 2010
    Date of Patent: March 25, 2014
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Prakash Reddy
  • Patent number: 8649566
    Abstract: A method for detecting motion quality error of printed documents having text in a printing system includes: printing a document having text lines, each text line comprising a plurality of characters; scanning the printed document to generate a scanned image; detecting positions in a process direction of the printing system of one of text lines and characters in the scanned image; determining position errors in the process direction in the printed document based on the detected positions in the scanned image; determining at least one motion quality defect of the printing system in the process direction based on the determined position errors; and initiating an activity associated with said printing system in response to a motion quality error having been determined. A system for detecting motion quality error of printed documents is also disclosed.
    Type: Grant
    Filed: July 19, 2010
    Date of Patent: February 11, 2014
    Assignee: Xerox Corporation
    Inventors: Beilei Xu, Wencheng Wu, Peter Paul, Palghat Ramesh
  • Patent number: 8571359
    Abstract: Character code data and vector drawing data are both listed and provided in a re-editable manner. Electronic data is generated in which information obtained by vectorizing character areas in an image and information obtained by recognizing characters in the image are stored in respective storage locations. As for the electronic data generated in this manner, because the character code data and vector drawing data generated from the input image are both presented by a display and edit program, a user can immediately utilize both types of data.
    Type: Grant
    Filed: June 17, 2009
    Date of Patent: October 29, 2013
    Assignee: Canon Kabushiki Kaisha
    Inventors: Taeko Yamazaki, Tomotoshi Kanatsu, Makoto Enomoto, Kitahiro Kaneda
  • Patent number: 8509571
    Abstract: An object of the present invention is that, even when a plurality of images exist in which the positions or sizes of character patterns indicating the identical object differ from each other, those patterns can be treated as character patterns indicating the identical object. An image and supplementary information of the image, such as a photographing point and time, are input by an image input section (101) and are stored in an image data storage section (102). Character recognition in the image is performed by a character recognition section (103), and the recognition result is stored in a character recognition result storage section (104). An analysis section (106) extracts object character information relevant to an object from the image, the supplementary information, and the character recognition result on the basis of the analysis conditions input in a designation section (105) to thereby analyze an object, and the analysis result is output to a result output section (107).
    Type: Grant
    Filed: April 30, 2009
    Date of Patent: August 13, 2013
    Assignee: Panasonic Corporation
    Inventors: Mariko Takenouchi, Saki Takakura
  • Patent number: 8499046
    Abstract: Techniques for capturing images of business cards, uploading the images to a designated computing device for processing and recognition are disclosed. A mechanism is provided to update extracted data from the images when there are any changes. Depending on implementation, there are a number of ways to capture images of business cards (e.g., via a phone camera, a PC camera, or a scanning device). A transmission means is provided to transport the images to the designated computing device for centralized management of integrated contact information for individual users. As a result, a user may access his/her updatable integrated contact information database anywhere anytime from a chosen device.
    Type: Grant
    Filed: May 6, 2009
    Date of Patent: July 30, 2013
    Inventor: Joe Zheng
  • Patent number: 8452133
    Abstract: To remove an underline even if a business document includes a chart or even if the underline touches a character string, provided is an underline removal apparatus that removes an underline area from binary image data including the underline area touching a character string, the underline removal apparatus including: an underline search processing unit that executes a line template matching process by setting a point on the binary image data as a starting point to set a rectangular line template, tracing pixels included in the line template, and extracting a polyline indicating underline position coordinates; and an underline removal processing unit that uses the polyline to execute a process of obtaining background borderline coordinates between the underline area and a background area and character borderline coordinates between the underline area and the character string obtained by applying an interpolation process to a part in the underline area touching the character string, and to execute a process of removing the underline area.
    Type: Grant
    Filed: March 3, 2010
    Date of Patent: May 28, 2013
    Assignee: Hitachi Solutions, Ltd.
    Inventor: Mitsuharu Oba
  • Patent number: 8447143
    Abstract: An image processing apparatus that includes a character recognition component, a determining component and a generating component is provided. The determining component determines, when document data is generated that contains first data representing the document and representing the entity in which the characters are mixed and second data containing character code data of the characters recognized by the character recognition component and representing a character block displaying the characters represented by the character code data, whether to hide the character block represented by the second data behind the entity represented by the first data or to display the character block represented by the second data in front of the entity represented by the first data when the document represented by the document data is displayed, based on lightness or distribution of the lightness of a background region around the characters of the entity or the like.
    Type: Grant
    Filed: January 18, 2011
    Date of Patent: May 21, 2013
    Assignee: Fuji Xerox Co., Ltd.
    Inventors: Chihiro Matsuguma, Hiroyoshi Uejo, Kazuhiro Ohya, Katsuya Koyanagi, Shintaro Adachi
  • Patent number: 8315484
    Abstract: The present invention provides a method and system for confirming uncertainly recognized words as reported by an Optical Character Recognition process by using spelling alternatives as search arguments for an Internet search engine. The measured number of hits for each spelling alternative is used to provide a confirmation measure for the most probable spelling alternative. Whenever the confirmation measure is inconclusive, a plurality of search strategies are used to reach a measured result comprising zero hits except for one spelling alternative that is used as the correct alternative.
    Type: Grant
    Filed: February 15, 2007
    Date of Patent: November 20, 2012
    Assignee: Lumex AS
    Inventors: Hans Christian Meyer, Mats Stefan Carlin, Knut Tharald Fosseide
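A small sketch of the hit-counting confirmation described in the abstract of patent 8315484 above. `hit_count` is a caller-supplied stand-in for a web search API (no real search-engine client is assumed), and the confirmation measure shown is an illustrative choice.

```python
def confirm_uncertain_word(alternatives, hit_count):
    """Pick the spelling alternative with the most search-engine hits.
    Returns (best_alternative, confirmation), where confirmation is the
    best count's share of all hits; (None, 0.0) means inconclusive."""
    counts = {alt: hit_count(alt) for alt in alternatives}
    total = sum(counts.values())
    if total == 0:
        return None, 0.0              # inconclusive; try another search strategy
    best = max(counts, key=counts.get)
    return best, counts[best] / total

# Stand-in corpus instead of a live search engine.
corpus = "the quick brown fox jumps over the lazy dog " * 3 + "qulck "
fake_hits = lambda w: corpus.split().count(w)
print(confirm_uncertain_word(["quick", "qulck", "qu1ck"], fake_hits))
```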
  • Patent number: 8290312
    Abstract: An information processing apparatus for processing image data including character information, the processing of image data including a process of inserting interpretation information corresponding to the character information. The information processing apparatus includes an image data acquisition unit, an interpretation information retrieval unit, an area information definition unit, and an insertion style determination unit. The image data acquisition unit acquires the image data including the character information composed of a plurality of characters having a first string of characters. The interpretation information retrieval unit retrieves first interpretation information to be attached to the first string of characters. The area information definition unit computes insertable area information on a first insertable area, usable for inserting the first interpretation information, based on coordinate data of characters in the acquired image data.
    Type: Grant
    Filed: June 1, 2009
    Date of Patent: October 16, 2012
    Assignee: Ricoh Company, Ltd.
    Inventor: Yoshihisa Ohguro
  • Patent number: 8265396
    Abstract: The present invention provides for the recovery of characters entered into at least one data entry zone of a data entry window. A method in accordance with an embodiment includes: storing a first image of the data entry window during data entry; subtracting a reference image from the first image to obtain a delta image, wherein the reference image is an image of the data entry window without data entered; identifying at least one non empty zone of the delta image and the location of the at least one data entry zone on the data entry window from the location of the at least one non empty zone on the delta image; extracting at least one character by applying optical character recognition to the at least one non empty zone; and inputting the at least one character into the location of the at least one data entry zone.
    Type: Grant
    Filed: December 3, 2008
    Date of Patent: September 11, 2012
    Assignee: International Business Machines Corporation
    Inventors: Frederic Bauchot, Jean-Luc Collet, Gerard Marmigere, Joaquin Picon
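A pure-Python sketch of the delta-image and zone-finding steps from the abstract of patent 8265396 above; the OCR call is left as a placeholder argument, images are row-major grayscale lists, and the noise floor is an illustrative value.

```python
def delta_image(filled, reference):
    """Pixel-wise absolute difference between the filled-in window and the
    empty reference window (both row-major grayscale lists)."""
    return [[abs(a - b) for a, b in zip(fr, rr)]
            for fr, rr in zip(filled, reference)]

def non_empty_zones(delta, noise=8):
    """Vertical bands of rows where the delta contains ink above a noise
    floor; each band approximates one data entry zone."""
    zones, start = [], None
    for y, row in enumerate(delta):
        has_ink = any(px > noise for px in row)
        if has_ink and start is None:
            start = y
        elif not has_ink and start is not None:
            zones.append((start, y))
            start = None
    if start is not None:
        zones.append((start, len(delta)))
    return zones

def recover_characters(filled, reference, ocr):
    """`ocr` is a placeholder for a real OCR call on each cropped zone."""
    delta = delta_image(filled, reference)
    return [ocr(delta[y0:y1]) for y0, y1 in non_empty_zones(delta)]

blank  = [[255] * 6 for _ in range(4)]
filled = [[255] * 6,
          [255, 40, 40, 255, 255, 255],
          [255] * 6,
          [255, 255, 60, 60, 255, 255]]
print(non_empty_zones(delta_image(filled, blank)))   # [(1, 2), (3, 4)]
```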
  • Patent number: 8260455
    Abstract: An address label rework station according to the invention includes a conveyor which can simultaneously transport a series of spaced parcels along a conveyor path in a substantially horizontal stream. The station includes a camera, a work space, a microphone usable by an operator in the work space, and a targeting device for directing a human operator in the work space. The station also includes a sensor system, a labeler, and a printer to print shipping information. The station further includes a computer configured to receive and recognize image and voice data, generate a label with a recognized shipping address, and control the conveyor so that the labeler applies a new label to the parcel at the position selected using the targeting device.
    Type: Grant
    Filed: December 3, 2009
    Date of Patent: September 4, 2012
    Assignee: Siemens Industry, Inc.
    Inventors: Dale E. Redford, Michael D. Carpenter, James M. Pippin
  • Patent number: 8249399
    Abstract: A method for optical character recognition (OCR) verification, the method includes: receiving a first character image that was obtained from applying an OCR process on a document, wherein the first character image is classified, by the OCR, as being associated with a first character; receiving a first character code of a text; replacing the first character code by the first character image; and evaluating a correctness of the OCR based upon a response of a user to a display of the text in which the first character code has been replaced by the first character image.
    Type: Grant
    Filed: September 16, 2008
    Date of Patent: August 21, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ella Barkan, Dan Shmuel Chevion, Boaz Ophir, Doron Tal
  • Patent number: 8170289
    Abstract: Systems and methods for character-by-character alignment of two character sequences (such as OCR output from a scanned document and an electronic version of the same document) using a Hidden Markov Model (HMM) in a hierarchical fashion are disclosed. The method may include aligning two character sequences utilizing multiple hierarchical levels. For each hierarchical level above a final hierarchical level, the aligning may include parsing character subsequences from the two character sequences, performing an alignment of the character subsequences, and designating aligned character subsequences as the anchors, the parsing and performing the alignment being between the anchors generated from an immediately previous hierarchical level if the current hierarchical level is below the first hierarchical level. For the final hierarchical level, the aligning includes performing a character-by-character alignment of characters between anchors generated from the immediately previous hierarchical level.
    Type: Grant
    Filed: September 21, 2005
    Date of Patent: May 1, 2012
    Assignee: Google Inc.
    Inventors: Shaolei Feng, Raghavan Manmatha
  • Patent number: 8155444
    Abstract: Converting text may be provided. A user selectable element may be used to select a text. The selected text may include a first text within an electronic document and a second text within an image. The second text within the image may be converted to character information by receiving the image. The image may have image character information and an image type. An aspect of the received image may be adjusted based on the image type. Optical character recognition may be performed on the adjusted image to extract character information. The character information may include characters and corresponding location information for the characters. The extracted character information may be evaluated to improve the recognition quality of the extracted character information as compared to the image character information.
    Type: Grant
    Filed: January 15, 2007
    Date of Patent: April 10, 2012
    Assignee: Microsoft Corporation
    Inventors: Alex J. Simmons, Radoslav P. Nickolov, Peter Baer, Vincent Lascaux, Igor Kofman
  • Patent number: 8150161
    Abstract: Embodiments of a computer system, a method, and a computer-program product (e.g., software) for use with the computer system are described. These embodiments may be used to identify and correct errors in financial information that was extracted using character-recognition software, such as optical character recognition software and/or intelligent character recognition software. In particular, potential errors may be identified by comparing the financial information for a current financial transaction of a user with expected financial information from one or more previous financial transactions of the user. Error metrics for these potential errors may be determined and used to correct at least some of the potential errors. For example, values of the Levenshtein edit distance may be determined based on the comparison, and one or more potential errors associated with one or more minimum values of the Levenshtein edit distance may be corrected.
    Type: Grant
    Filed: September 22, 2008
    Date of Patent: April 3, 2012
    Inventors: William T. Laaser, Rajalakshmi Ganesan, James A. Schneider
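A short sketch of the correction step described in the abstract of patent 8150161 above: compare an extracted field against expected values from previous transactions using the Levenshtein edit distance and correct to the closest one when the distance is small. The tolerance and names are illustrative, and the patent's error-metric machinery is not reproduced.

```python
from functools import lru_cache

def correct_field(extracted: str, expected_values, max_distance: int = 2):
    """Replace an OCR-extracted field with the closest expected value from
    previous transactions when the Levenshtein distance is small enough;
    otherwise keep the extracted text and flag it for review."""

    @lru_cache(maxsize=None)
    def lev(a: str, b: str) -> int:
        if not a:
            return len(b)
        if not b:
            return len(a)
        return min(lev(a[1:], b) + 1,                       # deletion
                   lev(a, b[1:]) + 1,                       # insertion
                   lev(a[1:], b[1:]) + (a[0] != b[0]))      # substitution

    best = min(expected_values, key=lambda v: lev(extracted, v))
    if lev(extracted, best) <= max_distance:
        return best, True          # corrected
    return extracted, False        # potential error left for manual review

print(correct_field("ACME C0RP.", ["ACME CORP.", "ACME SUPPLY"]))
```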
  • Patent number: 8131085
    Abstract: Techniques for shape clustering and applications in processing various documents, including an output of an optical character recognition (OCR) process. The output of an OCR process is classified into a plurality of clusters of clip images and a representative image for each cluster is generated to identify clusters whose clip images were incorrectly assigned character codes by the OCR process.
    Type: Grant
    Filed: July 15, 2011
    Date of Patent: March 6, 2012
    Assignee: Google Inc.
    Inventors: Luc Vincent, Raymond W. Smith
  • Patent number: 8103132
    Abstract: A method for correcting results of OCR or other scanned symbols. Initially scanning and performing OCR classification on a document. Clustering character/symbol classifications resulting from the OCR based on shapes. Creating super-symbols based on at least a first difference in the shapes of the clustered characters/symbols exceeding a first threshold. A carpet of super-symbols, emphasizing localized differences in similar symbols, is displayed for analysis testing.
    Type: Grant
    Filed: March 31, 2008
    Date of Patent: January 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Asaf Tzadok, Eugeniusz Walach
  • Patent number: 8032372
    Abstract: A computer program product for computing a correction rate predictor for medical record dictations, the computer program product residing on a computer-readable medium includes computer-readable instructions for causing a computer to obtain a draft medical transcription of at least a portion of a dictation, the dictation being from medical personnel and concerning a patient, determine features of the dictation to produce a feature set comprising a combination of features of the dictation, the features being relevant to a quantity of transcription errors in the transcription, analyze the feature set to compute a predicted correction rate associated with the dictation and use the predicted correction rate to determine whether to provide at least a portion of the transcription to a transcriptionist.
    Type: Grant
    Filed: September 13, 2005
    Date of Patent: October 4, 2011
    Assignee: eScription, Inc.
    Inventors: Roger Scott Zimmerman, George Zavaliagkos
  • Patent number: 7912289
    Abstract: Image text enhancement techniques are described. In an implementation, graphically represented text included in an original image is converted into process capable text. The process capable text may be used to generate a text image which may replace the original text to enhance the image. In further implementations the process capable text may be translated from a first language to a second language for inclusion in the enhanced image.
    Type: Grant
    Filed: May 1, 2007
    Date of Patent: March 22, 2011
    Assignee: Microsoft Corporation
    Inventors: Aman Kansal, Feng Zhao
  • Patent number: 7849398
    Abstract: A method is provided for selecting fields of an electronic form for automatic population with candidate text segments. The candidate text segments can be obtained by capturing an image of a document, applying optical character recognition to the captured image to identify textual content, and tagging candidate text segments in the textual content for fields of the form. The method includes, for each of a plurality of fields of the form, computing a field exclusion function based on at least one parameter selected from a text length parameter, an optical character recognition error rate, a tagging error rate, and a field relevance parameter; and determining whether to select the field for automatic population based on the computed field exclusion function.
    Type: Grant
    Filed: April 26, 2007
    Date of Patent: December 7, 2010
    Assignee: Xerox Corporation
    Inventors: Sebastien Dabet, Marco Bressan, Hervé Poirier
  • Patent number: 7813576
    Abstract: A difference image is obtained between two images which are objects of comparative viewing, regardless of whether the images are processed images. A judgment means judges whether the two images are processed images, based on process confirmation data attached thereto. A correction means corrects images which have been judged to be processed images by the judgment means to a state equivalent to that of the images prior to image processes. The correction is performed based on image processing condition data which are attached to the processed images. A positional alignment means aligns the positions of the two images. An inter image calculation means performs inter image calculation between unprocessed or corrected images.
    Type: Grant
    Filed: November 24, 2003
    Date of Patent: October 12, 2010
    Assignee: FUJIFILM Corporation
    Inventor: Akira Oosawa
  • Patent number: 7778447
    Abstract: A method, device, and computer program for mobile object information management include obtaining a first image by photographing identification information of a mobile object, executing a character recognition process on the first image to obtain a first character recognition result, determining accuracy of the first character recognition result, registering, as the identification information corresponding to the mobile object, a plurality of first character recognition results, for each of which the accuracy is determined as low, and outputting the first character recognition results registered.
    Type: Grant
    Filed: May 18, 2004
    Date of Patent: August 17, 2010
    Assignee: Fujitsu Limited
    Inventors: Kunikazu Takahashi, Kazuyuki Yasutake, Nakaba Yuhara
  • Publication number: 20100169077
    Abstract: Disclosed is a method, system and computer readable recording medium for correcting an OCR result. According to an exemplary embodiment of the present invention, there is provided a method for correcting an OCR result, the method including performing character recognition on content including character information using an OCR technique, removing extra carriage return information from the content, outputting the character recognition result, and correcting word spacing on the outputted result.
    Type: Application
    Filed: December 30, 2009
    Publication date: July 1, 2010
    Applicant: NHN Corporation
    Inventors: Byoung Seok YANG, Hee Cheol Seo, Do Gil Lee, Ki Joon Sung
  • Patent number: 7729541
    Abstract: A method is provided for converting a two-dimensional image or bitmap of a handwritten manuscript into three-dimensional data. The three-dimensional data can be used to automatically recognize features of the manuscript, such as characters or words. The method includes the steps of: converting the two-dimensional image into three-dimensional volumetric data; filtering the three-dimensional volumetric data; and processing the filtered three-dimensional volumetric data to resolve features of the two-dimensional image. The method can be used, for example, to differentiate between ascenders, descenders, loops, curls, and endpoints that define the overall letter forms in handwritten text, manuscripts or signatures.
    Type: Grant
    Filed: March 16, 2005
    Date of Patent: June 1, 2010
    Assignee: Arizona Board of Regents, A Body Corporate, Acting for and on Behalf of Arizona State University
    Inventors: Anshuman Razdan, John Femiani
  • Patent number: 7715633
    Abstract: In order to accurately recognize the content of information indicated in a medium based on image data obtained by reading the medium, the present invention comprises an extraction unit for extracting each of plural information items from image data obtained by reading a medium in which each of the plural information items, satisfying a predetermined relationship, is indicated in plural areas; a recognition unit for recognizing the content of each of the plural information items; and a confirmation unit which evaluates whether or not the content of the plural information items recognized by the recognition unit is correct based on the predetermined relationship, confirms the content of the plural information items as recognized by the recognition unit if correct, and, if incorrect, corrects the recognized content based on the predetermined relationship to confirm the content of the plural information items.
    Type: Grant
    Filed: April 26, 2006
    Date of Patent: May 11, 2010
    Assignees: Fujitsu Limited, Fujitsu Frontech Limited
    Inventors: Koichi Kanamoto, Shinichi Eguchi