Correcting Alphanumeric Recognition Errors Patents (Class 382/310)
  • Patent number: 11468128
    Abstract: A search engine optimizer works independently and in parallel with a browser and search engine supercomputer to gather, analyze, and distill input information interactively. The optimizer reorganizes the input and provides an optimized version as an output. The optimized version of the input (i.e., the output) is sent to the search engine, which responds to the end user with search results. The optimizer recognizes each request as a pattern and stores the pattern in an advanced Glyph format. This permits the optimizer to identify a left and right side check mate combination required to achieve certitude.
    Type: Grant
    Filed: December 24, 2014
    Date of Patent: October 11, 2022
    Inventor: Richard Paiz
  • Patent number: 11205091
    Abstract: An object information registration apparatus that registers information of a first object that is a reference object of object recognition holds a first object image that is an image of the first object and recognition method information related to the first object, selects one or more partial regions included in the first object image, sets a recognition method corresponding to each of the one or more partial regions, acquires feature information of each of the one or more partial regions from the first object image based on the set recognition method, and stores the one or more partial regions, the set recognition method, and the acquired feature information in the recognition method information in association with each other.
    Type: Grant
    Filed: March 30, 2020
    Date of Patent: December 21, 2021
    Assignee: HITACHI, LTD.
    Inventors: Taiki Yano, Nobutaka Kimura, Nobuhiro Chihara, Fumiko Beniyama
  • Patent number: 11144797
    Abstract: An image processing apparatus comprises a detection circuit that detects a target photographic subject in an obtained image by referencing dictionary data acquired by machine learning corresponding to the target photographic subject; a selection unit that selects one of a plurality of dictionary data items corresponding to the target photographic subject; and a control circuit that, in a case where a detection evaluation value obtained when the photographic subject is detected using the dictionary data selected by the selection unit is lower than a predetermined value, controls the detection circuit to detect the target photographic subject by using both the selected dictionary data and dictionary data different from the selected dictionary data.
    Type: Grant
    Filed: June 28, 2019
    Date of Patent: October 12, 2021
    Assignee: CANON KABUSHIKI KAISHA
    Inventor: Ryosuke Tsuji
  • Patent number: 11087122
    Abstract: A minimum edit cost is representative of a cost of edit operations performed on a candidate string detected in an image to satisfy characteristics of a model string. An attempt is made to compute the minimum edit cost between the candidate string and the model string. Upon determining that the candidate string includes a blank character at a first character position between two consecutive non-blank candidate characters, and in response to determining that there is a non-blank model character at a second character position of the model string and that the second character position is associated with the first character position of the blank character in the candidate string, an indication that the minimum edit cost between the candidate string and the model string cannot be computed and that the candidate string is not a match to the model string is output.
    Type: Grant
    Filed: April 18, 2019
    Date of Patent: August 10, 2021
    Assignee: MATROX ELECTRONIC SYSTEMS LTD.
    Inventor: Dominique Rivard
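A minimal Python sketch of the rejection rule described in the abstract of patent 11087122 above. It assumes a blank candidate character is represented by a space and that candidate and model positions are associated simply by index; the function name, the fallback dynamic-programming edit cost, and those conventions are illustrative, not taken from the patent.

```python
def edit_cost_or_reject(candidate, model):
    """Return an edit cost, or None when the cost cannot be computed:
    a blank sits between two non-blank candidate characters at a position
    where the model string has a non-blank character (assumed conventions:
    ' ' marks a blank, positions are associated by index)."""
    for i, ch in enumerate(candidate):
        if ch != " ":
            continue
        between_nonblanks = (0 < i < len(candidate) - 1
                             and candidate[i - 1] != " "
                             and candidate[i + 1] != " ")
        model_has_char = i < len(model) and model[i] != " "
        if between_nonblanks and model_has_char:
            return None   # not a match; no edit cost is output

    # Plain dynamic-programming edit distance as the fallback cost.
    prev = list(range(len(model) + 1))
    for i, c in enumerate(candidate, 1):
        cur = [i]
        for j, m in enumerate(model, 1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (c != m)))  # substitution
        prev = cur
    return prev[-1]

print(edit_cost_or_reject("A 1", "AB1"))   # None: blank where 'B' is required
print(edit_cost_or_reject("AB1", "AB1"))   # 0
```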
  • Patent number: 10671973
    Abstract: The present disclosure discloses a method for automatic processing of forms using augmented reality. In an embodiment, a filled-out application form including one or more fields is scanned in augmented reality mode and one or more images of it are captured. The captured images are used to identify the form type by comparing them with an original application form and to generate an electronic version of the form. Subsequently, filled-out content in the images is extracted and compared with a retrieved configuration file, which has the same type as the identified type of the filled-out application form. Based on this comparison, one or more messages are generated and superimposed on the electronic version of the form in the augmented reality mode, and both are displayed.
    Type: Grant
    Filed: January 3, 2013
    Date of Patent: June 2, 2020
    Assignee: XEROX CORPORATION
    Inventors: Kovendhan Ponnavaikko, Nischal M Piratla, Sivasubramanian Kandaswamy, Anuradha Rukmangathan, Raja Srinivasan
  • Patent number: 10606947
    Abstract: A speech recognition apparatus includes a predictor configured to predict a word class of a word following a word sequence that has been previously searched for based on the word sequence that has been previously searched for; and a decoder configured to search for a candidate word corresponding to a speech signal, extend the word sequence that has been previously searched for using the candidate word that has been searched for, and adjust a probability value of the extended word sequence based on the predicted word class.
    Type: Grant
    Filed: October 21, 2016
    Date of Patent: March 31, 2020
    Assignee: Samsung Electronics Co., Ltd.
    Inventor: Ji Hyun Lee
  • Patent number: 10304000
    Abstract: Systems and methods are disclosed for selecting cohorts. In one implementation, a model-assisted selection system for identifying candidates for placement into a cohort includes a data interface and at least one processing device. The at least one processing device is programmed to access, via the data interface, a database from which feature vectors associated with an individual from among a population of individuals can be derived; derive, for the individual, one or more feature vectors from the database; provide the one or more feature vectors to a model; receive an output from the model; and determine whether the individual from among the population of individuals is a candidate for the cohort based on the output received from the model.
    Type: Grant
    Filed: April 12, 2018
    Date of Patent: May 28, 2019
    Assignee: Flatiron Health, Inc.
    Inventors: Benjamin Edward Birnbaum, Joshua Daniel Haimson, Lucy Dao-Ke He, Katharina Nicola Seidl-Rathkopf, Monica Nayan Agrawal, Nathan Nussbaum
  • Patent number: 9990268
    Abstract: A system and method for detection of duplicate bug reports. A receiver is configured to receive a first bug report and a word matrix. An extractor extracts keywords from the first bug report for creating a first search string. A comparator compares each of the keywords from the first search string with the word matrix for identifying dissimilar duplicate words. The duplicate bug detector further includes an expander to expand the first search string by including the dissimilar duplicate words for creating the second search string and a searcher to search a bug repository with the first search string and the second search string for identifying similar duplicate bug reports and dissimilar duplicate bug reports.
    Type: Grant
    Filed: March 9, 2016
    Date of Patent: June 5, 2018
    Assignee: Infosys Limited
    Inventors: Satya Prateek Bommaraju, Anjaneyulu Pasala, Shivani Rao
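A small Python sketch of the expand-then-search flow described in the abstract of patent 9990268 above, assuming the word matrix can be approximated by a plain synonym dictionary and the bug repository by an in-memory list; the stopword list, overlap scoring, and all names are illustrative.

```python
from typing import Dict, List, Set

STOPWORDS = {"the", "a", "an", "is", "in", "on", "when", "and", "to"}

def extract_keywords(report: str) -> Set[str]:
    """First search string: content words taken from the bug report."""
    return {w.lower().strip(".,") for w in report.split()} - STOPWORDS

def expand(keywords: Set[str], word_matrix: Dict[str, List[str]]) -> Set[str]:
    """Second search string: add the dissimilar duplicate words the matrix maps to."""
    expanded = set(keywords)
    for kw in keywords:
        expanded.update(word_matrix.get(kw, []))
    return expanded

def search(repository: List[str], terms: Set[str], min_overlap: int = 2) -> List[str]:
    """Return stored reports sharing at least `min_overlap` terms with the query."""
    return [r for r in repository
            if len(extract_keywords(r) & terms) >= min_overlap]

word_matrix = {"crash": ["abort", "terminate"], "login": ["signin", "authentication"]}
repo = ["App may abort on signin screen", "Printer jams when duplexing"]
first = extract_keywords("Crash when login fails")
second = expand(first, word_matrix)
# The dissimilar duplicate is found only with the expanded search string.
print(search(repo, first), search(repo, second))
```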
  • Patent number: 9928273
    Abstract: In a method for searching a computer database, a processor receives a set of data containing at least a first character. A processor creates a converted set of data by converting the first character in the received set of data into a second character, wherein the second character represents the first character and one or more additional characters based upon a predetermined equivalency. A processor searches the computer database for a previously stored data entry using the converted set of data. A processor returns a retrieved result of the searching.
    Type: Grant
    Filed: August 19, 2013
    Date of Patent: March 27, 2018
    Assignee: International Business Machines Corporation
    Inventors: Adrian M. Boyko, William J. Oliver
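A minimal sketch of the conversion-based search described in the abstract of patent 9928273 above. The equivalency table, the choice of representatives, and the dictionary standing in for the computer database are illustrative assumptions.

```python
# Assumed equivalency classes: visually confusable characters collapse
# to one representative (illustrative, not taken from the patent).
EQUIVALENTS = {"O": "0", "I": "1", "|": "1", "S": "5"}

def canonical(text):
    """Convert each character into its equivalence-class representative."""
    return "".join(EQUIVALENTS.get(ch, ch) for ch in text.upper())

# Tiny stand-in for the database: entries indexed by their converted form.
database = {}
for entry in ["B00K-15", "BOOK-IS"]:       # both collapse to the same key
    database.setdefault(canonical(entry), []).append(entry)

def search(query):
    """Search with the converted set of data rather than the raw query."""
    return database.get(canonical(query), [])

print(search("BOOK-15"))   # ['B00K-15', 'BOOK-IS']
```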
  • Patent number: 9858257
    Abstract: A machine learning engine may correlate contextual information associated with a misspelling in a publication with a likelihood that the misspelling is intentional in nature. Training data may be generated by analyzing one or more past publications to identify misspellings and labeling the misspellings as intentional. A contextual indicators application may analyze the context in which intentional misspellings have been previously included within publications to identify indicators of future misspellings being intentional. A machine learning engine may use the training data and indicators to generate an intentional linguistic deviation (ILD) prediction model to determine whether a new misspelling is an intentional misspelling. The machine learning engine may also determine weights for individual indicators that may calibrate the influence of the respective individual indicators.
    Type: Grant
    Filed: July 20, 2016
    Date of Patent: January 2, 2018
    Assignee: Amazon Technologies, Inc.
    Inventors: Janna S. Hamaker, Sravan Babu Bodapati, John Hambacher, Gururaj Narayanan, Sriraghavendra Ramaswamy
  • Patent number: 9786272
    Abstract: According to an embodiment, a decoder includes a token operating unit, a node adder, and a connection detector. The token operating unit is configured to, every time a signal or a feature is input, propagate each of a plurality of tokens, which is an object assigned with a state of a path being searched, according to a digraph until a state or a transition assigned with a non-empty input symbol is reached. The node adder is configured to, in each instance of token propagating, add, in a lattice, a node corresponding to a state assigned to each of the plurality of tokens. The connection detector is configured to refer to the digraph and detect a node that is connected to a node added in an i-th instance in the lattice and that is added in an i+1-th instance in the lattice.
    Type: Grant
    Filed: December 18, 2014
    Date of Patent: October 10, 2017
    Assignee: KABUSHIKI KAISHA TOSHIBA
    Inventor: Manabu Nagao
  • Patent number: 9659018
    Abstract: An MFP (Multifunction Peripheral) is a file name producing apparatus that produces a file name of an image. The MFP selects a candidate character string, which is a file name candidate and in which a head character is a space, from the character strings extracted from the image. The MFP deletes the space that is the head character of the candidate character string, or changes all the characters constituting the candidate character string to other characters. The MFP produces the character string, which is corrected by the deletion or the change, as the file name of the image. Therefore, the proper file name can be produced.
    Type: Grant
    Filed: September 28, 2012
    Date of Patent: May 23, 2017
    Assignee: KONICA MINOLTA BUSINESS TECHNOLOGIES, INC.
    Inventor: Takuya Kawano
  • Patent number: 9646061
    Abstract: Methods and arrangements for performing a fuzzy search. A contemplated method includes: establishing an edit distance threshold for the fuzzy search; generating an index of items to be searched, via: storing at least one string; and creating substrings corresponding to the at least one string; providing a query string for use in searching; creating substrings corresponding to the query string; comparing substrings of the query string with substrings in the index; designating at least one candidate string based on said comparing; verifying whether each candidate string satisfies the edit distance threshold; and outputting at least one matching string for each candidate string that satisfies the edit distance threshold. Other variants and embodiments are broadly contemplated herein.
    Type: Grant
    Filed: January 22, 2015
    Date of Patent: May 9, 2017
    Assignee: International Business Machines Corporation
    Inventors: Manoj Kumar Agarwal, Rajeev Gupta
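A compact Python sketch of the substring-index-plus-verification scheme the abstract of patent 9646061 above describes, using fixed-length substrings (2-grams) and a standard dynamic-programming edit distance; the substring length, the threshold, and the names are assumptions for illustration.

```python
from collections import defaultdict

def qgrams(s: str, q: int = 2) -> set:
    """Substrings of length q, used both for indexing and for querying."""
    return {s[i:i + q] for i in range(max(len(s) - q + 1, 1))}

def edit_distance(a: str, b: str) -> int:
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1, cur[j - 1] + 1, prev[j - 1] + (ca != cb)))
        prev = cur
    return prev[-1]

def build_index(strings):
    """Index of items to be searched, keyed by their substrings."""
    index = defaultdict(set)
    for s in strings:
        for g in qgrams(s):
            index[g].add(s)
    return index

def fuzzy_search(query: str, index, threshold: int = 1):
    # Candidate strings share at least one substring with the query ...
    candidates = set().union(*(index.get(g, set()) for g in qgrams(query)))
    # ... and matches are the candidates satisfying the edit distance threshold.
    return [c for c in candidates if edit_distance(query, c) <= threshold]

index = build_index(["invoice", "involve", "device", "advice"])
print(fuzzy_search("invoise", index))   # ['invoice'] for threshold 1
```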
  • Patent number: 9444968
    Abstract: An image forming apparatus includes a print device, a sign printing circuit, and a sign analyzing circuit. The print device performs a print job onto a print medium. The sign printing circuit causes the print device to print a sign where a glyph is present in a font corresponding to a code based on the code included in print data. The sign analyzing circuit analyzes whether or not the glyph is invalid data where the sign printing circuit is not able to cause the print device to print. When the sign analyzing circuit analyzes that the glyph present in the font corresponding to the code of a target sign is the invalid data, the sign printing circuit causes the print device to print a specific alternative sign as an alternative to the target sign.
    Type: Grant
    Filed: September 1, 2015
    Date of Patent: September 13, 2016
    Assignee: KYOCERA Document Solutions Inc.
    Inventor: Keizen Kanazawa
  • Patent number: 9076061
    Abstract: According to one aspect, embodiments of the invention provide a system and method for utilizing the effort expended by a user in responding to a CAPTCHA request to automatically transcribe text from images in order to verify, retrieve and/or update geographic data associated with geographic locations at which the images were recorded.
    Type: Grant
    Filed: April 12, 2012
    Date of Patent: July 7, 2015
    Assignee: Google Inc.
    Inventors: Marco Zennaro, Luc Vincent, Kong Man Cheung, David Abraham
  • Patent number: 9057618
    Abstract: Systems and methods provide approximations of latitude and longitude coordinates of objects, for example a business, in street level images. The images may be collected by a camera. An image of a business is collected along with GPS coordinates and direction of the camera. Depth maps of the images may be generated, for example, based on laser depth detection or displacement of the business between two images caused by a change in the position of the camera. After identifying a business in one or more images, the distance from the camera to a point or area relative to the business in the one or more images may be determined based on the depth maps. Using this distance and the direction of the camera which collected the one or more images and GPS coordinates of the camera, the approximate GPS coordinates of the business may be determined.
    Type: Grant
    Filed: September 24, 2013
    Date of Patent: June 16, 2015
    Assignee: Google Inc.
    Inventors: Abhijit S. Ogale, Stephane Lafon, Andrea Frome
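The last geometric step described in the abstract of patent 9057618 above, from the camera's GPS position, its heading, and a distance taken from a depth map to approximate coordinates of the object, can be sketched with a small-distance flat-earth approximation; the depth-map construction itself is not reproduced, and the constants and names are illustrative.

```python
import math

EARTH_RADIUS_M = 6_371_000.0

def approximate_position(cam_lat: float, cam_lon: float,
                         heading_deg: float, distance_m: float):
    """Project a point `distance_m` metres from the camera along its heading
    (degrees clockwise from north); adequate for street-level ranges."""
    north = distance_m * math.cos(math.radians(heading_deg))
    east = distance_m * math.sin(math.radians(heading_deg))
    dlat = math.degrees(north / EARTH_RADIUS_M)
    dlon = math.degrees(east / (EARTH_RADIUS_M * math.cos(math.radians(cam_lat))))
    return cam_lat + dlat, cam_lon + dlon

# Camera at an intersection, storefront 25 m away, bearing 90 degrees (due east).
print(approximate_position(37.7749, -122.4194, 90.0, 25.0))
```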
  • Patent number: 9031309
    Abstract: The recognition rate is improved and recognition errors are suppressed when recognizing magnetic ink characters. A character recognition unit calculates a total difference by calculating the total of the differences between the character waveform data and the reference waveform data for each magnetic ink character within the area of one character; calculates a partial difference by summing the differences between character waveform data and reference waveform data in a target area, which is the area corresponding to a stroke that is 2 mesh or more wide in the area of one character; executes a correction process that reduces the value of the partial difference; and recognizes the candidate character as the magnetic ink character that was read when the total difference after the correction process is less than or equal to a threshold value.
    Type: Grant
    Filed: June 12, 2013
    Date of Patent: May 12, 2015
    Assignee: Seiko Epson Corporation
    Inventor: Yoshiaki Kinoshita
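A rough Python sketch of the total/partial waveform-difference test described in the abstract of patent 9031309 above. The correction factor, the threshold, the toy waveforms, and the way the correction is folded back into the total are illustrative assumptions, not values or formulas from the patent.

```python
def match_candidate(char_waveform, reference_waveform, target_slice,
                    correction_factor=0.5, threshold=20.0):
    """Decide whether scanned MICR waveform data matches a candidate
    character's reference waveform."""
    diffs = [abs(c - r) for c, r in zip(char_waveform, reference_waveform)]
    total_diff = sum(diffs)
    # Partial difference: the part of the one-character window covering
    # a wide (>= 2 mesh) stroke, given here as a slice.
    partial_diff = sum(diffs[target_slice])
    # Correction process: reduce the contribution of the partial difference.
    corrected_total = total_diff - partial_diff * (1.0 - correction_factor)
    return corrected_total <= threshold

reference = [0, 5, 20, 20, 5, 0, 0, 10]
scanned   = [0, 6, 32, 30, 6, 0, 1, 11]
# Raw total difference is 26 (> threshold); corrected total is 15 (<= threshold).
print(match_candidate(scanned, reference, target_slice=slice(2, 4)))   # True
```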
  • Patent number: 8995795
    Abstract: Textual errors in digital volumes in a corpus are corrected by comparing a set of similar digital volumes, the set including a basis volume and a plurality of comparison volumes. The basis volume is compared with the comparison volumes to identify sequences of text that are identical across all of the candidate volumes and mismatched sequences of text that contain different text in at least one of the candidate volumes. The correct text for at least some of the mismatched sequences is resolved by comparing the different text in the different candidate volumes. The mismatched sequences are replaced by the resolved correct text, thereby correcting errors in the candidate volumes.
    Type: Grant
    Filed: February 14, 2012
    Date of Patent: March 31, 2015
    Assignee: Google Inc.
    Inventor: Dana L. Dickinson
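The resolution step described in the abstract of patent 8995795 above (choosing correct text for a mismatched sequence by comparing the candidate volumes) can be sketched as a majority vote. This assumes the volumes have already been aligned token-for-token, which the sketch does not reproduce, and the names are illustrative.

```python
from collections import Counter

def correct_by_consensus(volumes):
    """Resolve mismatches across OCR'd copies of the same text by majority
    vote; assumes the copies are already aligned token-for-token."""
    corrected = []
    for tokens in zip(*[v.split() for v in volumes]):
        if len(set(tokens)) == 1:          # identical across all copies
            corrected.append(tokens[0])
        else:                              # mismatched sequence: take the majority
            corrected.append(Counter(tokens).most_common(1)[0][0])
    return " ".join(corrected)

copies = [
    "It was the best of tirnes",
    "It was the hest of times",
    "It was the best of times",
]
print(correct_by_consensus(copies))   # "It was the best of times"
```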
  • Patent number: 8996476
    Abstract: Apparatus, methods and media for correcting a defective check processing datum. The apparatus may include, and the methods and media may involve, a receiver that is configured to receive from memory a first transaction record. The transaction record may include Magnetic Ink Character Recognition (MICR) line data. The MICR line data may be electronically read from a check. The transaction record may include non-MICR data. The non-MICR data may be electronically read from the check. The apparatus may include, and the methods and media may involve, a processor that is configured to identify a defective datum among the MICR line data. The processor may identify a portion of the non-MICR data that corresponds to the defective datum. The processor may store in memory a second transaction record. The second transaction record may include corrected data that includes an element that is derived from the identified portion of the non-MICR data.
    Type: Grant
    Filed: August 20, 2012
    Date of Patent: March 31, 2015
    Assignee: Bank of America Corporation
    Inventors: Geoffrey R. Williams, Timmy L. Gauvin, Kerry M. Cantley, Deborah N. Bennett, Eric S. Sandoz, II, James F. Barrett, II, Joshua A. Beaudry
  • Patent number: 8971670
    Abstract: A system includes preparing respective proof reading tools for performing carpet proof reading and side-by-side proof reading of text data, and recording a log of time to perform proof reading operations by using the first and second proof reading tools. The method further includes estimating, based on times stored in a log, times to perform proof reading of a character using 1) the first proof reading tool followed by using the second proof reading tool, and 2) the second proof reading tool. The method further includes determining for each character value, based on the estimated times, to use the first proof reading tool along with using the second proof reading tool or to use the second proof reading tool without using the first proof reading tool.
    Type: Grant
    Filed: November 6, 2012
    Date of Patent: March 3, 2015
    Assignee: International Business Machines Corporation
    Inventors: Takashi Itoh, Toshinari Itoki, Takayuki Osogami
  • Patent number: 8965125
    Abstract: Character code data and vector drawing data are both listed and provided in a re-editable manner. Electronic data is generated in which information obtained by vectorizing character areas in an image and information obtained by recognizing characters in the image are stored in respective storage locations. As for the electronic data generated in this manner, because the character code data and vector drawing data generated from the input image are both presented by a display and edit program, a user can immediately utilize both types of data.
    Type: Grant
    Filed: September 24, 2013
    Date of Patent: February 24, 2015
    Assignee: Canon Kabushiki Kaisha
    Inventors: Taeko Yamazaki, Tomotoshi Kanatsu, Makoto Enomoto, Kitahiro Kaneda
  • Patent number: 8953910
    Abstract: A method includes preparing respective proof reading tools for performing carpet proof reading and side-by-side proof reading of text data, recording a log of time to perform proof reading operations by using the first and second proof reading tools. The method further includes estimating, based on times stored in a log, times to perform proof reading of a character using 1) the first proof reading tool followed by using the second proof reading tool, and 2) the second proof reading tool. The method further includes determining for each character value, based on the estimated times, to use the first proof reading tool along with using the second proof reading tool or to use the second proof reading tool without using the first proof reading tool.
    Type: Grant
    Filed: September 28, 2012
    Date of Patent: February 10, 2015
    Assignee: International Business Machines Corporation
    Inventors: Takashi Itoh, Toshinari Itoki, Takayuki Osogami
  • Patent number: 8934676
    Abstract: A method and system for achieving accurate segmentation of characters with respect to a license plate image within a tight bounding box image. A vehicle image can be captured by an image capturing unit and processed utilizing an ALPR unit. A vertical projection histogram can be calculated to produce an initial character boundary (cuts) and local statistical information can be employed to split a large cut and insert a missing character. The cut can be classified as a valid and/or a suspect character and the suspect character can be analyzed. The suspect character can be normalized and passed to an OCR module for decoding and generating a confidence quote with every conclusion. The non-character images can be rejected at the OCR level by enforcing a confidence threshold. An adjoining suspect narrow character can be combined and the OCR confidence of the combined character can be assessed.
    Type: Grant
    Filed: July 2, 2012
    Date of Patent: January 13, 2015
    Assignee: Xerox Corporation
    Inventors: Aaron Michael Burry, Claude Fillion, Vladimir Kozitsky
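A minimal sketch of the vertical projection histogram that produces the initial character boundaries (cuts) mentioned in the abstract of patent 8934676 above; splitting large cuts, inserting missing characters, and the OCR confidence handling are not reproduced, and the toy bitmap is illustrative.

```python
def character_cuts(binary_plate):
    """Initial character boundaries from a vertical projection histogram:
    columns whose ink count is zero separate candidate characters.
    `binary_plate` is a row-major list of 0/1 lists (1 = ink)."""
    width = len(binary_plate[0])
    histogram = [sum(row[x] for row in binary_plate) for x in range(width)]
    cuts, start = [], None
    for x, count in enumerate(histogram):
        if count > 0 and start is None:
            start = x                      # character region begins
        elif count == 0 and start is not None:
            cuts.append((start, x))        # character region ends
            start = None
    if start is not None:
        cuts.append((start, width))
    return cuts

plate = [
    [0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0],
    [0, 1, 0, 0, 0, 1, 0, 0, 1, 0, 1, 0],
    [0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 1, 0],
]
print(character_cuts(plate))   # [(1, 3), (5, 6), (8, 11)]
```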
  • Patent number: 8792751
    Abstract: Embodiments of a computer system, a method and a computer-program product (e.g., software) for use with the computer system are described. These embodiments allow a user to provide an image of a document for use with software, such as an image of a financial document for use with financial software. In particular, the user can provide the image of the document, for example, by taking a picture of the document using a cellular telephone. This image may be converted into an electronic format that is suitable for text and numerical processing using a character-recognition technique, such as optical character recognition or intelligent character recognition. Errors in the electronic version of the document, if present, may be identified and corrected by comparing the electronic version to information maintained by a third party. This information may be accessed based at least on one or more items in the electronic version of the document.
    Type: Grant
    Filed: July 27, 2009
    Date of Patent: July 29, 2014
    Assignee: Intuit Inc.
    Inventors: Amir Eftekhari, Erikheath A. Thomas, Carol Ann Howe, George Thomas Ericksen, Gerald B. Huff, Gang Wang
  • Patent number: 8744135
    Abstract: Searchable annotated formatted documents are produced by correlating documents stored as photographic or scanned graphic representations of an actual document (evidence, report, court order, etc.) with textual versions of the same documents. A produced document provides additional details in a data structure that supports citation annotation as well as other types of analysis of a document. The data structure also supports generation of citation reports and corpus reports. Disclosed are methods of creating searchable annotated formatted documents, including citation and corpus reports, by correlating and correcting text files with photographic or scanned graphics of the original documents; data structures for correlating and correcting text files with graphic images; generation of citation reports, concordance reports, and corpus reports; and data structures for generating those reports. Multiple document data structures are used to create multiple citation documents and reports.
    Type: Grant
    Filed: October 4, 2010
    Date of Patent: June 3, 2014
    Inventor: Kendyl A. Román
  • Patent number: 8682075
    Abstract: Data representing an image of text is received, as is data representing the text in non-image form. A valid content boundary within the image of the text is determined. For each character within the text in the non-image form, a location of the character within the image of the text is determined. Where the location of the character within the image of the text falls outside the valid content boundary, the character is removed from the data representing the text in the non-image form.
    Type: Grant
    Filed: December 28, 2010
    Date of Patent: March 25, 2014
    Assignee: Hewlett-Packard Development Company, L.P.
    Inventor: Prakash Reddy
  • Patent number: 8649566
    Abstract: A method for detecting motion quality error of printed documents having text in a printing system includes: printing a document having text lines, each text line comprising a plurality of characters; scanning the printed document to generate a scanned image; detecting positions in a process direction of the printing system of one of text lines and characters in the scanned image; determining position errors in the process direction in the printed document based on the detected positions in the scanned image; determining at least one motion quality defect of the printing system in the process direction based on the determined position errors; and initiating an activity associated with said printing system in response to a motion quality error having been determined. A system for detecting motion quality error of printed documents is also disclosed.
    Type: Grant
    Filed: July 19, 2010
    Date of Patent: February 11, 2014
    Assignee: Xerox Corporation
    Inventors: Beilei Xu, Wencheng Wu, Peter Paul, Palghat Ramesh
  • Patent number: 8571359
    Abstract: Character code data and vector drawing data are both listed and provided in a re-editable manner. Electronic data is generated in which information obtained by vectorizing character areas in an image and information obtained by recognizing characters in the image are stored in respective storage locations. As for the electronic data generated in this manner, because the character code data and vector drawing data generated from the input image are both presented by a display and edit program, a user can immediately utilize both types of data.
    Type: Grant
    Filed: June 17, 2009
    Date of Patent: October 29, 2013
    Assignee: Canon Kabushiki Kaisha
    Inventors: Taeko Yamazaki, Tomotoshi Kanatsu, Makoto Enomoto, Kitahiro Kaneda
  • Patent number: 8509571
    Abstract: An object of the present invention is that, even when a plurality of images exist in which the positions or sizes of character patterns indicating the identical object differ from each other, those patterns can be treated as character patterns indicating the identical object. An image and supplementary information of the image, such as a photographing point and time, are input by an image input section (101) and are stored in an image data storage section (102). Character recognition in the image is performed by a character recognition section (103), and the recognition result is stored in a character recognition result storage section (104). An analysis section (106) extracts object character information relevant to an object from the image, the supplementary information, and the character recognition result on the basis of the analysis conditions input in a designation section (105) to thereby analyze an object, and the analysis result is output to a result output section (107).
    Type: Grant
    Filed: April 30, 2009
    Date of Patent: August 13, 2013
    Assignee: Panasonic Corporation
    Inventors: Mariko Takenouchi, Saki Takakura
  • Patent number: 8499046
    Abstract: Techniques for capturing images of business cards, uploading the images to a designated computing device for processing and recognition are disclosed. A mechanism is provided to update extracted data from the images when there are any changes. Depending on implementation, there are a number of ways to capture images of business cards (e.g., via a phone camera, a PC camera, or a scanning device). A transmission means is provided to transport the images to the designated computing device for centralized management of integrated contact information for individual users. As a result, a user may access his/her updatable integrated contact information database anywhere anytime from a chosen device.
    Type: Grant
    Filed: May 6, 2009
    Date of Patent: July 30, 2013
    Inventor: Joe Zheng
  • Patent number: 8452133
    Abstract: To remove an underline even if a business document includes a chart or even if the underline touches a character string, provided is an underline removal apparatus that removes an underline area from binary image data including the underline area touching a character string, the underline removal apparatus including: an underline search processing unit that executes a line template matching process by setting a point on the binary image data as a starting point to set a rectangular line template, tracing pixels included in the line template, and extracting a polyline indicating underline position coordinates; and an underline removal processing unit that uses the polyline to execute a process of obtaining background borderline coordinates between the underline area and a background area and character borderline coordinates between the underline area and the character string obtained by applying an interpolation process to a part in the underline area touching the character string, and to execute a process of removing the underline area.
    Type: Grant
    Filed: March 3, 2010
    Date of Patent: May 28, 2013
    Assignee: Hitachi Solutions, Ltd.
    Inventor: Mitsuharu Oba
  • Patent number: 8447143
    Abstract: An image processing apparatus that includes a character recognition component, a determining component and a generating component is provided. The determining component determines, when document data is generated that contains first data representing the document and representing the entity in which the characters are mixed and second data containing character code data of the characters recognized by the character recognition component and representing a character block displaying the characters represented by the character code data, whether to hide the character block represented by the second data behind the entity represented by the first data or to display the character block represented by the second data in front of the entity represented by the first data when the document represented by the document data is displayed, based on lightness or distribution of the lightness of a background region around the characters of the entity or the like.
    Type: Grant
    Filed: January 18, 2011
    Date of Patent: May 21, 2013
    Assignee: Fuji Xerox Co., Ltd.
    Inventors: Chihiro Matsuguma, Hiroyoshi Uejo, Kazuhiro Ohya, Katsuya Koyanagi, Shintaro Adachi
  • Patent number: 8315484
    Abstract: The present invention provides a method and system for confirming uncertainly recognized words as reported by an Optical Character Recognition process by using spelling alternatives as search arguments for an Internet search engine. The measured number of hits for each spelling alternative is used to provide a confirmation measure for the most probable spelling alternative. Whenever the confirmation measure is inconclusive, a plurality of search strategies are used to reach a measured result comprising zero hits except for one spelling alternative that is used as the correct alternative.
    Type: Grant
    Filed: February 15, 2007
    Date of Patent: November 20, 2012
    Assignee: Lumex AS
    Inventors: Hans Christian Meyer, Mats Stefan Carlin, Knut Tharald Fosseide
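A small sketch of the hit-counting confirmation described in the abstract of patent 8315484 above. `hit_count` is a caller-supplied stand-in for a web search API (no real search-engine client is assumed), and the confirmation measure shown is an illustrative choice.

```python
def confirm_uncertain_word(alternatives, hit_count):
    """Pick the spelling alternative with the most search-engine hits.
    Returns (best_alternative, confirmation), where confirmation is the
    best count's share of all hits; (None, 0.0) means inconclusive."""
    counts = {alt: hit_count(alt) for alt in alternatives}
    total = sum(counts.values())
    if total == 0:
        return None, 0.0              # inconclusive; try another search strategy
    best = max(counts, key=counts.get)
    return best, counts[best] / total

# Stand-in corpus instead of a live search engine.
corpus = "the quick brown fox jumps over the lazy dog " * 3 + "qulck "
fake_hits = lambda w: corpus.split().count(w)
print(confirm_uncertain_word(["quick", "qulck", "qu1ck"], fake_hits))
```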
  • Patent number: 8290312
    Abstract: An information processing apparatus for processing image data including character information, the processing of image data including a process of inserting interpretation information corresponding to the character information. The information processing apparatus includes an image data acquisition unit, an interpretation information retrieval unit, an area information definition unit, and an insertion style determination unit. The image data acquisition unit acquires the image data including the character information composed of a plurality of characters having a first string of characters. The interpretation information retrieval unit retrieves first interpretation information to be attached to the first string of characters. The area information definition unit computes insertable area information on a first insertable area, usable for inserting the first interpretation information, based on coordinate data of characters in the acquired image data.
    Type: Grant
    Filed: June 1, 2009
    Date of Patent: October 16, 2012
    Assignee: Ricoh Company, Ltd.
    Inventor: Yoshihisa Ohguro
  • Patent number: 8265396
    Abstract: The present invention provides for the recovery of characters entered into at least one data entry zone of a data entry window. A method in accordance with an embodiment includes: storing a first image of the data entry window during data entry; subtracting a reference image from the first image to obtain a delta image, wherein the reference image is an image of the data entry window without data entered; identifying at least one non empty zone of the delta image and the location of the at least one data entry zone on the data entry window from the location of the at least one non empty zone on the delta image; extracting at least one character by applying optical character recognition to the at least one non empty zone; and inputting the at least one character into the location of the at least one data entry zone.
    Type: Grant
    Filed: December 3, 2008
    Date of Patent: September 11, 2012
    Assignee: International Business Machines Corporation
    Inventors: Frederic Bauchot, Jean-Luc Collet, Gerard Marmigere, Joaquin Picon
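A pure-Python sketch of the delta-image and zone-finding steps from the abstract of patent 8265396 above; the OCR call is left as a placeholder argument, images are row-major grayscale lists, and the noise floor is an illustrative value.

```python
def delta_image(filled, reference):
    """Pixel-wise absolute difference between the filled-in window and the
    empty reference window (both row-major grayscale lists)."""
    return [[abs(a - b) for a, b in zip(fr, rr)]
            for fr, rr in zip(filled, reference)]

def non_empty_zones(delta, noise=8):
    """Vertical bands of rows where the delta contains ink above a noise
    floor; each band approximates one data entry zone."""
    zones, start = [], None
    for y, row in enumerate(delta):
        has_ink = any(px > noise for px in row)
        if has_ink and start is None:
            start = y
        elif not has_ink and start is not None:
            zones.append((start, y))
            start = None
    if start is not None:
        zones.append((start, len(delta)))
    return zones

def recover_characters(filled, reference, ocr):
    """`ocr` is a placeholder for a real OCR call on each cropped zone."""
    delta = delta_image(filled, reference)
    return [ocr(delta[y0:y1]) for y0, y1 in non_empty_zones(delta)]

blank  = [[255] * 6 for _ in range(4)]
filled = [[255] * 6,
          [255, 40, 40, 255, 255, 255],
          [255] * 6,
          [255, 255, 60, 60, 255, 255]]
print(non_empty_zones(delta_image(filled, blank)))   # [(1, 2), (3, 4)]
```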
  • Patent number: 8260455
    Abstract: An address label rework station according to the invention includes a conveyor which can simultaneously transport a series of spaced parcels along a conveyor path in a substantially horizontal stream. The station includes a camera, a work space, a microphone usable by an operator in the work space, and a targeting device for directing a human operator in the work space. The station also includes a sensor system, a labeler, and a printer to print shipping information. The station further includes a computer configured to receive and recognize image and voice data, generate a label with a recognized shipping address, and control the conveyor so that the labeler applies a new label to the parcel at the position selected using the targeting device.
    Type: Grant
    Filed: December 3, 2009
    Date of Patent: September 4, 2012
    Assignee: Siemens Industry, Inc.
    Inventors: Dale E. Redford, Michael D. Carpenter, James M. Pippin
  • Patent number: 8249399
    Abstract: A method for optical character recognition (OCR) verification, the method includes: receiving a first character image that was obtained from applying an OCR process on a document, wherein the first character image is classified, by the OCR, as being associated with a first character; receiving a first character code of a text; replacing the first character code by the first character image; and evaluating a correctness of the OCR based upon a response of a user to a display of the text in which the first character code has been replaced by the first character image.
    Type: Grant
    Filed: September 16, 2008
    Date of Patent: August 21, 2012
    Assignee: International Business Machines Corporation
    Inventors: Ella Barkan, Dan Shmuel Chevion, Boaz Ophir, Doron Tal
  • Patent number: 8170289
    Abstract: Systems and methods for character-by-character alignment of two character sequences (such as OCR output from a scanned document and an electronic version of the same document) using a Hidden Markov Model (HMM) in a hierarchical fashion are disclosed. The method may include aligning two character sequences utilizing multiple hierarchical levels. For each hierarchical level above a final hierarchical level, the aligning may include parsing character subsequences from the two character sequences, performing an alignment of the character subsequences, and designating aligned character subsequences as the anchors, the parsing and performing the alignment being between the anchors generated from an immediately previous hierarchical level if the current hierarchical level is below the first hierarchical level. For the final hierarchical level, the aligning includes performing a character-by-character alignment of characters between anchors generated from the immediately previous hierarchical level.
    Type: Grant
    Filed: September 21, 2005
    Date of Patent: May 1, 2012
    Assignee: Google Inc.
    Inventors: Shaolei Feng, Raghavan Manmatha
  • Patent number: 8155444
    Abstract: Converting text may be provided. A user selectable element may be used to select a text. The selected text may include a first text within an electronic document and a second text within an image. The second text within the image may be converted to character information by receiving the image. The image may have image character information and an image type. An aspect of the received image may be adjusted based on the image type. Optical character recognition may be performed on the adjusted image to extract character information. The character information may include characters and corresponding location information for the characters. The extracted character information may be evaluated to improve the recognition quality of the extracted character information as compared to the image character information.
    Type: Grant
    Filed: January 15, 2007
    Date of Patent: April 10, 2012
    Assignee: Microsoft Corporation
    Inventors: Alex J. Simmons, Radoslav P. Nickolov, Peter Baer, Vincent Lascaux, Igor Kofman
  • Patent number: 8150161
    Abstract: Embodiments of a computer system, a method, and a computer-program product (e.g., software) for use with the computer system are described. These embodiments may be used to identify and correct errors in financial information that was extracted using character-recognition software, such as optical character recognition software and/or intelligent character recognition software. In particular, potential errors may be identified by comparing the financial information for a current financial transaction of a user with expected financial information from one or more previous financial transactions of the user. Error metrics for these potential errors may be determined and used to correct at least some of the potential errors. For example, values of the Levenshtein edit distance may be determined based on the comparison, and one or more potential errors associated with one or more minimum values of the Levenshtein edit distance may be corrected.
    Type: Grant
    Filed: September 22, 2008
    Date of Patent: April 3, 2012
    Inventors: William T. Laaser, Rajalakshmi Ganesan, James A. Schneider
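A short sketch of the correction step described in the abstract of patent 8150161 above: compare an extracted field against expected values from previous transactions using the Levenshtein edit distance and correct to the closest one when the distance is small. The tolerance and names are illustrative, and the patent's error-metric machinery is not reproduced.

```python
from functools import lru_cache

def correct_field(extracted: str, expected_values, max_distance: int = 2):
    """Replace an OCR-extracted field with the closest expected value from
    previous transactions when the Levenshtein distance is small enough;
    otherwise keep the extracted text and flag it for review."""

    @lru_cache(maxsize=None)
    def lev(a: str, b: str) -> int:
        if not a:
            return len(b)
        if not b:
            return len(a)
        return min(lev(a[1:], b) + 1,                       # deletion
                   lev(a, b[1:]) + 1,                       # insertion
                   lev(a[1:], b[1:]) + (a[0] != b[0]))      # substitution

    best = min(expected_values, key=lambda v: lev(extracted, v))
    if lev(extracted, best) <= max_distance:
        return best, True          # corrected
    return extracted, False        # potential error left for manual review

print(correct_field("ACME C0RP.", ["ACME CORP.", "ACME SUPPLY"]))
```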
  • Patent number: 8131085
    Abstract: Techniques for shape clustering and applications in processing various documents, including an output of an optical character recognition (OCR) process. The output of an OCR process is classified into a plurality of clusters of clip images and a representative image for each cluster is generated to identify clusters whose clip images were incorrectly assigned character codes by the OCR process.
    Type: Grant
    Filed: July 15, 2011
    Date of Patent: March 6, 2012
    Assignee: Google Inc.
    Inventors: Luc Vincent, Raymond W. Smith
  • Patent number: 8103132
    Abstract: A method for correcting results of OCR or other scanned symbols. Initially scanning and performing OCR classification on a document. Clustering character/symbol classifications resulting from the OCR based on shapes. Creating super-symbols based on at least a first difference in the shapes of the clustered characters/symbols exceeding a first threshold. A carpet of super-symbols, emphasizing localized differences in similar symbols, is displayed for analysis testing.
    Type: Grant
    Filed: March 31, 2008
    Date of Patent: January 24, 2012
    Assignee: International Business Machines Corporation
    Inventors: Asaf Tzadok, Eugeniusz Walach
  • Patent number: 8032372
    Abstract: A computer program product for computing a correction rate predictor for medical record dictations, the computer program product residing on a computer-readable medium includes computer-readable instructions for causing a computer to obtain a draft medical transcription of at least a portion of a dictation, the dictation being from medical personnel and concerning a patient, determine features of the dictation to produce a feature set comprising a combination of features of the dictation, the features being relevant to a quantity of transcription errors in the transcription, analyze the feature set to compute a predicted correction rate associated with the dictation and use the predicted correction rate to determine whether to provide at least a portion of the transcription to a transcriptionist.
    Type: Grant
    Filed: September 13, 2005
    Date of Patent: October 4, 2011
    Assignee: eScription, Inc.
    Inventors: Roger Scott Zimmerman, George Zavaliagkos
  • Patent number: 7912289
    Abstract: Image text enhancement techniques are described. In an implementation, graphically represented text included in an original image is converted into process capable text. The process capable text may be used to generate a text image which may replace the original text to enhance the image. In further implementations the process capable text may be translated from a first language to a second language for inclusion in the enhanced image.
    Type: Grant
    Filed: May 1, 2007
    Date of Patent: March 22, 2011
    Assignee: Microsoft Corporation
    Inventors: Aman Kansal, Feng Zhao
  • Patent number: 7849398
    Abstract: A method is provided for selecting fields of an electronic form for automatic population with candidate text segments. The candidate text segments can be obtained by capturing an image of a document, applying optical character recognition to the captured image to identify textual content, and tagging candidate text segments in the textual content for fields of the form. The method includes, for each of a plurality of fields of the form, computing a field exclusion function based on at least one parameter selected from a text length parameter, an optical character recognition error rate, a tagging error rate, and a field relevance parameter; and determining whether to select the field for automatic population based on the computed field exclusion function.
    Type: Grant
    Filed: April 26, 2007
    Date of Patent: December 7, 2010
    Assignee: Xerox Corporation
    Inventors: Sebastien Dabet, Marco Bressan, Hervé Poirier
  • Patent number: 7813576
    Abstract: A difference image is obtained between two images which are objects of comparative viewing, regardless of whether the images are processed images. A judgment means judges whether the two images are processed images, based on process confirmation data attached thereto. A correction means corrects images which have been judged to be processed images by the judgment means to a state equivalent to that of the images prior to image processes. The correction is performed based on image processing condition data which are attached to the processed images. A positional alignment means aligns the positions of the two images. An inter image calculation means performs inter image calculation between unprocessed or corrected images.
    Type: Grant
    Filed: November 24, 2003
    Date of Patent: October 12, 2010
    Assignee: FUJIFILM Corporation
    Inventor: Akira Oosawa
  • Patent number: 7778447
    Abstract: A method, device, and computer program for mobile object information management include obtaining a first image by photographing identification information of a mobile object, executing a character recognition process on the first image to obtain a first character recognition result, determining accuracy of the first character recognition result, registering, as the identification information corresponding to the mobile object, a plurality of first character recognition results, for each of which the accuracy is determined as low, and outputting the first character recognition results registered.
    Type: Grant
    Filed: May 18, 2004
    Date of Patent: August 17, 2010
    Assignee: Fujitsu Limited
    Inventors: Kunikazu Takahashi, Kazuyuki Yasutake, Nakaba Yuhara
  • Publication number: 20100169077
    Abstract: Disclosed is a method, system and computer readable recording medium for correcting an OCR result. According to an exemplary embodiment of the present invention, there is provided a method for correcting an OCR result, the method including performing character recognition on content including character information using an OCR technique, removing extra carriage return information from the content, outputting the character recognition result, and correcting word spacing on the outputted result.
    Type: Application
    Filed: December 30, 2009
    Publication date: July 1, 2010
    Applicant: NHN Corporation
    Inventors: Byoung Seok YANG, Hee Cheol Seo, Do Gil Lee, Ki Joon Sung
  • Patent number: 7729541
    Abstract: A method is provided for converting a two-dimensional image or bitmap of a handwritten manuscript into three-dimensional data. The three-dimensional data can be used to automatically recognize features of the manuscript, such as characters or words. The method includes the steps of: converting the two-dimensional image into three-dimensional volumetric data; filtering the three-dimensional volumetric data; and processing the filtered three-dimensional volumetric data to resolve features of the two-dimensional image. The method can be used, for example, to differentiate between ascenders, descenders, loops, curls, and endpoints that define the overall letter forms in handwritten text, manuscripts or signatures.
    Type: Grant
    Filed: March 16, 2005
    Date of Patent: June 1, 2010
    Assignee: Arizona Board of Regents, A Body Corporate, Acting for and on Behalf of Arizona State University
    Inventors: Anshuman Razdan, John Femiani
  • Patent number: 7715633
    Abstract: In order to accurately recognize the content of information indicated in a medium based on image data obtained by reading the medium, the present invention comprises an extraction unit for extracting each of plural information items from image data obtained by reading a medium in which each of the plural information items, satisfying a predetermined relationship, is indicated in plural areas; a recognition unit for recognizing the content of each of the plural information items; and a confirmation unit which evaluates whether or not the content of the plural information items recognized by the recognition unit is correct based on the predetermined relationship, confirms the content of the plural information items as recognized by the recognition unit if correct, and, if incorrect, corrects the recognized content based on the predetermined relationship to confirm the content of the plural information items.
    Type: Grant
    Filed: April 26, 2006
    Date of Patent: May 11, 2010
    Assignees: Fujitsu Limited, Fujitsu Frontech Limited
    Inventors: Koichi Kanamoto, Shinichi Eguchi