Patents Assigned to ABBYY Development LLC
  • Patent number: 8861856
    Abstract: The invention relates to methods for determining a logical structure of a document. The system stores a collection of models, each of which describes one or more possible logical structures. At least one document hypothesis is generated for the whole document. For each document hypothesis, the system verifies the document hypothesis on each page, for example, by generating at least one block hypothesis for each block in the document based on the document hypothesis and selecting a best block hypothesis for each block. The system then selects the model that corresponds to a best document hypothesis, that is, the document hypothesis that has a best degree of correspondence with the selected best block hypotheses for the document, and forms a representation of the document based on the best document hypothesis.
    Type: Grant
    Filed: August 28, 2012
    Date of Patent: October 14, 2014
    Assignee: ABBYY Development LLC
    Inventors: Dmitry Deryagin, Konstantin Anisimovich
  • Patent number: 8855413
    Abstract: Described is a method for identifying text or other information in one or more images and reflowing images of individual elements of text at a word boundary or character boundary on devices of different sizes. The text may be rescaled while retaining the look and feel of the original text. The size may be scaled according to one or more parameters. Text may be captured in a plurality of images and merged together to form a single document or document-like collection. Text may be fully recognized, indexed, sorted and/or be made searchable. Text may be wrapped around objects and features identified as non-text or non-informational elements in an image. Borders or edges between successive elements of text may be smoothed, combined, overlapped and/or blended. Backgrounds of text may be adjusted to make the appearance of successive elements aesthetically pleasing or as close to the original as possible.
    Type: Grant
    Filed: May 13, 2011
    Date of Patent: October 7, 2014
    Assignee: ABBYY Development LLC
    Inventor: Ding-Yuan Tang
  • Patent number: 8805093
    Abstract: In one embodiment, the invention provides a method for a machine to perform machine-readable form pre-recognition analysis. The method comprises preliminarily assigning at least one graphic image in a form for identification of form type, preliminarily creating at least one model of the said graphic image for identification of the form type, parsing a form image into regions, and determining an image form type for the form image, comprising: (a) detecting on the form image at least one of said graphic images for identification of the form type, (b) performing a primary identification of the form image type based on a comparison of the detected graphic image with the said model, and (c) performing a profound analysis using supplementary data if said primary identification results in multiple possibilities for the form image type.
    Type: Grant
    Filed: December 22, 2010
    Date of Patent: August 12, 2014
    Assignee: ABBYY Development LLC
    Inventors: Konstantin Zuev, Irina Filimonova, Sergey Zlobin
  • Patent number: 8787690
    Abstract: The invention provides various methods and techniques for binarizing an image, generally in advance of further processing such as optical character recognition (OCR). One step includes establishing boundaries of image objects of an image and classifying each image object as either suspect or non-suspect. Another step includes creating a local binarization threshold map that may include or store threshold binarization values associated with image objects classified as non-suspect. Yet another step includes expanding the local binarization threshold map to cover the entire image thereby creating a global binarization threshold map for the entire image. The methods and techniques are capable of identifying and working with separation objects and incuts in images.
    Type: Grant
    Filed: December 16, 2011
    Date of Patent: July 22, 2014
    Assignee: ABBYY Development LLC
    Inventor: Olga Kacher
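
The abstract above describes expanding a local binarization threshold map to cover the whole image but does not say how the expansion is done. The following is a minimal sketch under an assumed rule: every pixel without a threshold takes the value of the nearest pixel that has one (breadth-first, 4-connected). The function names `expand_threshold_map` and `binarize`, and the convention that darker pixels are ink, are illustrative assumptions and not taken from the patent.

```python
from collections import deque


def expand_threshold_map(local_map):
    """Fill a sparse threshold map so that every cell has a threshold.

    local_map is a 2D list holding a threshold where one is known and
    None elsewhere. Each None cell receives the threshold of the nearest
    known cell (assumed rule: 4-connected breadth-first distance).
    """
    height, width = len(local_map), len(local_map[0])
    full = [row[:] for row in local_map]
    queue = deque((r, c) for r in range(height) for c in range(width)
                  if full[r][c] is not None)
    if not queue:
        raise ValueError("at least one known threshold is required")
    while queue:
        r, c = queue.popleft()
        for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            nr, nc = r + dr, c + dc
            if 0 <= nr < height and 0 <= nc < width and full[nr][nc] is None:
                full[nr][nc] = full[r][c]  # propagate the nearest threshold
                queue.append((nr, nc))
    return full


def binarize(gray, thresholds):
    """Mark a pixel as ink (1) when it is darker than its local threshold."""
    return [[1 if g < t else 0 for g, t in zip(gray_row, thr_row)]
            for gray_row, thr_row in zip(gray, thresholds)]


# Thresholds are known only for two "non-suspect" pixels in this toy image.
local = [[100, None, None],
         [None, None, 200]]
global_map = expand_threshold_map(local)
gray = [[90, 150, 150],
        [120, 150, 250]]
binary = binarize(gray, global_map)
```

A real system would more likely interpolate smoothly between known thresholds; nearest-value filling is used here only because it keeps the sketch short.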
  • Publication number: 20140188456
    Abstract: A method for providing the appropriate meaning of an entry in a text is described. The method includes the steps of determining if there are alternative meanings of the entry in an electronic dictionary and, if there are alternative meanings, determining the dictionary markup theme associated with each of the alternative meanings of the entry. Also, the theme associated with the text is determined. For a hierarchical structure associated with themes of entries in the electronic dictionary, the distance between the theme of the text and the dictionary markup theme of each of the alternative meanings of the entry is compared. Based on the distance between the theme of the text and the dictionary markup theme of the alternative meanings of the entry, the appropriate meaning is selected.
    Type: Application
    Filed: December 27, 2012
    Publication date: July 3, 2014
    Applicant: ABBYY Development LLC
    Inventors: Alexander Rylov, Ivan Arkhipov
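
The selection by theme distance described in the abstract above can be sketched as follows. This is a minimal illustration, assuming the theme hierarchy is a tree and that distance is the number of edges between two themes; the `PARENT` table, the theme names, and the function names are all hypothetical and do not come from the application.

```python
# Hypothetical theme hierarchy: child -> parent (None marks the root).
PARENT = {
    "General": None,
    "Science": "General",
    "Finance": "General",
    "Geography": "Science",
    "Computing": "Science",
    "Banking": "Finance",
}


def path_to_root(theme):
    """Return the themes from `theme` up to the root, inclusive."""
    path = []
    while theme is not None:
        path.append(theme)
        theme = PARENT[theme]
    return path


def theme_distance(a, b):
    """Number of edges between two themes in the tree."""
    steps_from_a = {t: i for i, t in enumerate(path_to_root(a))}
    for steps_from_b, t in enumerate(path_to_root(b)):
        if t in steps_from_a:  # first common ancestor
            return steps_from_a[t] + steps_from_b
    raise ValueError("themes are not in the same hierarchy")


def pick_meaning(text_theme, meanings):
    """meanings: list of (gloss, dictionary_theme); pick the closest theme."""
    return min(meanings, key=lambda m: theme_distance(text_theme, m[1]))


# Two alternative meanings of the entry "bank", tagged with assumed themes.
meanings = [("river edge", "Geography"), ("financial institution", "Banking")]
best = pick_meaning("Finance", meanings)
```

For a text whose theme is "Finance", the "Banking" sense is one edge away and the "Geography" sense is three edges away, so the financial sense is selected.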
  • Patent number: 8750571
    Abstract: Embodiments of the invention disclose techniques for processing of machine-readable forms of unfixed or flexible format. An auxiliary brief description may be optionally specified to determine the spatial orientation of the image. A method of searching for elements of a document comprises the following main operations in addition to the operations of preliminary image processing: selecting the varieties of structural description from several available variants, determining the orientation of the image, selecting the text objects, where the text must be recognized, and determining the minimal required volume of recognition, recognizing the text objects, searching for elements of the form. Searching for elements of the form comprises the following actions: selecting a searched element in the structural description, gaining the algorithm of search constraints from the structural description, searching for the element, testing the obtained variants.
    Type: Grant
    Filed: August 9, 2013
    Date of Patent: June 10, 2014
    Assignee: ABBYY Development LLC
    Inventors: Konstantin Zuev, Diar Tuganbaev, Irina Filimonova
  • Patent number: 8731233
    Abstract: A system is proposed for automated document processing, comprising a document, consisting of two sections—a main section, containing data in printed character form, and a supplementary section in a machine-readable form; a document forming means; a document inputting means; a character recognition means; a main and supplementary data comparison means. Said system uses the supplementary section data to confirm the main section data. The supplementary section data can fully or partly duplicate the main section data, supplement it and also comprise other additional data. The supplementary machine-readable section can be realized in a form of coded consecutive characters, printed graphic image (bar-code), magnetic, optical, microprocessor or other kind of data storage means. For enhancing security of documents all or a part of data can be coded prior to introduction into the supplementary section.
    Type: Grant
    Filed: January 22, 2003
    Date of Patent: May 20, 2014
    Assignee: ABBYY Development LLC
    Inventors: Konstantin Anisimovich, Konstantin Zuev, Andrey Lubenets
  • Patent number: 8724930
    Abstract: Embodiments of the present invention disclose a copying method that combines optical character recognition (OCR) technology and a search in order to improve the quality of a copy despite the presence of degrading factors. In one embodiment, the search comprises an Internet search and is used to reconstruct/enhance the copy digitally before outputting the copy to print or some other digital medium. Advantageously, a copy produced using the techniques of the present invention may be at least equal to if not better than the original document copied.
    Type: Grant
    Filed: June 1, 2009
    Date of Patent: May 13, 2014
    Assignee: ABBYY Development LLC
    Inventor: Ding-Yuan Tang
  • Publication number: 20140126812
    Abstract: A method for detecting a junction in a received image of the line of text to update a junction list with descriptive data is provided. The method includes creating a color histogram based on a number of color pixels in the received image of the line of text and detecting, based at least in part on the received image of the line of text, a rung within the received image of the line of text. The method also includes identifying a horizontal position of the detected rung in the received image of the line of text and identifying a gateway on the color histogram, wherein the identified gateway is associated with the detected rung. The junction list is updated with data including a description of the identified gateway.
    Type: Application
    Filed: October 14, 2013
    Publication date: May 8, 2014
    Applicant: ABBYY Development LLC
    Inventors: Yuri Chulinin, Oleg Senkevich
  • Patent number: 8660371
    Abstract: In one embodiment, there is provided a method for an Optical Character Recognition (OCR) system. The method comprises: recognizing an input character based on a plurality of classifiers, wherein each classifier generates an output by comparing the input character with a plurality of trained patterns; grouping the plurality of classifiers based on a classifier grouping criterion; and combining the output of each of the plurality of classifiers based on the grouping.
    Type: Grant
    Filed: May 6, 2010
    Date of Patent: February 25, 2014
    Assignee: ABBYY Development LLC
    Inventor: Diar Tuganbaev
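
The abstract above names three steps (several classifiers score a character, the classifiers are grouped, and outputs are combined based on the grouping) without giving the grouping criterion or the combination rule. The sketch below is one possible reading under stated assumptions: scores are averaged within a group and the group averages are summed with per-group weights. The classifier names, group names, weights, and the `combine` function are hypothetical.

```python
from collections import defaultdict


def combine(outputs, groups, group_weights):
    """Combine classifier scores group by group.

    outputs:       {classifier_name: {character: score}}
    groups:        {classifier_name: group_name}
    group_weights: {group_name: weight}
    Returns (best_character, {character: combined_score}).
    """
    per_group = defaultdict(lambda: defaultdict(list))
    for name, scores in outputs.items():
        for char, score in scores.items():
            per_group[groups[name]][char].append(score)

    total = defaultdict(float)
    for group, chars in per_group.items():
        for char, values in chars.items():
            # Assumed rule: average inside a group, weighted sum across groups.
            total[char] += group_weights[group] * sum(values) / len(values)
    return max(total, key=total.get), dict(total)


outputs = {
    "raster": {"O": 0.6, "0": 0.4},
    "contour": {"O": 0.5, "0": 0.5},
    "structural": {"O": 0.2, "0": 0.8},
}
groups = {"raster": "shape", "contour": "shape", "structural": "structure"}
group_weights = {"shape": 0.5, "structure": 0.5}
best_char, combined = combine(outputs, groups, group_weights)
```

Averaging inside a group keeps two similar classifiers from outvoting a single dissimilar one, which is one plausible motivation for grouping before combining.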
  • Patent number: 8606015
    Abstract: Disclosed is a method of bit-mapped image analysis that comprises a whole image data representation via its component objects. The objects are assigned to different levels of complexity. The objects may be hierarchically connected by spatially-parametrical links. The method comprises preliminarily generating a classifier of image objects consisting of one or more levels differing in complexity; parsing the image into objects; attaching each object to one or more predetermined levels; establishing hierarchical links between objects of different levels; establishing links between objects within the same level; and performing an object feature analysis. Object feature analysis comprises generating and examining a hypothesis about object features and correcting the concerned object's features of the same and other levels in response to results of hypothesis examination. Object feature analysis may also comprise execution of a recursive X-Y cut within the same level.
    Type: Grant
    Filed: December 21, 2011
    Date of Patent: December 10, 2013
    Assignee: ABBYY Development LLC
    Inventors: Konstantin Anisimovich, Dmitry Deryagin, Vladimir Rybkin
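
The recursive X-Y cut mentioned at the end of the abstract above is a well-known page segmentation technique: a region is split at blank rows or columns, and each part is split again until no blank gap remains. The sketch below shows the general technique on a tiny binary image, not the patented method; the function names and the `min_gap` parameter are assumptions made for illustration.

```python
def widest_gap(filled):
    """filled: sorted indices that contain ink.

    Return (start, end) of the widest run of blank indices lying between
    two inked indices (end is exclusive), or None if there is no such run.
    """
    best = None
    for prev, cur in zip(filled, filled[1:]):
        width = cur - prev - 1
        if width > 0 and (best is None or width > best[1] - best[0]):
            best = (prev + 1, cur)
    return best


def xy_cut(img, top, bottom, left, right, min_gap=1):
    """Split region [top, bottom) x [left, right) of a binary image.

    img is a list of rows of 0/1 values. Returns the leaf blocks as
    (top, bottom, left, right) tuples in reading order.
    """
    rows = [r for r in range(top, bottom) if any(img[r][left:right])]
    cols = [c for c in range(left, right)
            if any(img[r][c] for r in range(top, bottom))]
    if not rows or not cols:
        return []  # nothing but background
    # Shrink to the bounding box of the ink.
    top, bottom = rows[0], rows[-1] + 1
    left, right = cols[0], cols[-1] + 1

    row_gap = widest_gap(rows)
    if row_gap and row_gap[1] - row_gap[0] >= min_gap:
        a, b = row_gap  # horizontal cut through blank rows a..b-1
        return (xy_cut(img, top, a, left, right, min_gap)
                + xy_cut(img, b, bottom, left, right, min_gap))
    col_gap = widest_gap(cols)
    if col_gap and col_gap[1] - col_gap[0] >= min_gap:
        a, b = col_gap  # vertical cut through blank columns a..b-1
        return (xy_cut(img, top, bottom, left, a, min_gap)
                + xy_cut(img, top, bottom, b, right, min_gap))
    return [(top, bottom, left, right)]


# Two blocks side by side above one full-width block.
img = [
    [1, 1, 0, 1, 1],
    [1, 1, 0, 1, 1],
    [0, 0, 0, 0, 0],
    [1, 1, 1, 1, 1],
]
blocks = xy_cut(img, 0, 4, 0, 5)
```

Raising `min_gap` makes the cut ignore narrow gaps such as the spaces between characters, so that only larger layout gaps separate blocks.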
  • Patent number: 8571264
    Abstract: A method and system for recognizing all varieties of objects in an image by using structure models are disclosed. Structural elements are sought when comparing a structural model with an image but only within a framework of one or more generated hypotheses. The method for identifying objects includes preliminarily creating a structural model of objects by specifying a plurality of basic geometric structural elements corresponding to one or more portions of the object, recording a spatial characteristic of each identified basic geometric structural element, and recording a relational characteristic for each specified basic geometric structural element. Objects in the image are isolated and a list of hypotheses for each object is provided. Hypotheses are tested by determining if the corresponding group of basic geometric structural elements corresponds to another supposed object described in a classifier. Results of testing of hypotheses may be saved and the results may be used to identify objects.
    Type: Grant
    Filed: August 15, 2011
    Date of Patent: October 29, 2013
    Assignee: ABBYY Development LLC
    Inventors: Konstantin Anisimovich, Vadim Tereshchenko, Alexander Shamis
  • Patent number: 8571262
    Abstract: Embodiments of the invention disclose techniques for processing of machine-readable forms of unfixed or flexible format. An auxiliary brief description may be optionally specified to determine the spatial orientation of the image. A method of searching for elements of a document comprises the following main operations in addition to the operations of preliminary image processing: selecting the varieties of structural description from several available variants, determining the orientation of the image, selecting the text objects, where the text must be recognized, and determining the minimal required volume of recognition, recognizing the text objects, searching for elements of the form. Searching for elements of the form comprises the following actions: selecting a searched element in the structural description, gaining the algorithm of search constraints from the structural description, searching for the element, testing the obtained variants.
    Type: Grant
    Filed: September 8, 2010
    Date of Patent: October 29, 2013
    Assignee: ABBYY Development LLC
    Inventors: Konstantin Zuev, Diar Tuganbaev, Irina Filimonova
  • Patent number: 8559718
    Abstract: A method is described for creating a scheme for dividing a text line of Chinese, Japanese or Korean (CJK) characters into character cells prior to applying classifiers and recognizing individual characters. Gaps between characters are found as a window is moved down the length of a text line. A histogram is built based on distances from the start of the window to a respective gap as the window is moved. The window is moved to the end of each gap after each gap is found and distances measured. This is repeated until the window reaches the end of the text line. A linear division graph (LDG) is constructed according to the detected gaps. Penalties for certain distances are applied. An optimum path is one with a minimal penalty sum and can be used as a scheme for dividing a text line into character cells.
    Type: Grant
    Filed: April 27, 2012
    Date of Patent: October 15, 2013
    Assignee: ABBYY Development LLC
    Inventor: Yuri Chulinin
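
The minimal-penalty path through a linear division graph, as described in the abstract above, can be illustrated with a short dynamic program. This is a sketch under assumptions: candidate cut positions are given as x-coordinates, the penalty for a cell is the squared deviation of its width from an expected cell width, and cells wider than `max_ratio` times that width are disallowed. The penalty function, the parameter names, and `best_division` are hypothetical; the abstract says only that penalties are applied for certain distances.

```python
def best_division(cuts, cell_width, max_ratio=1.5):
    """Choose cut positions that minimise the total penalty.

    cuts: sorted x-positions of candidate cuts, including the start and
          the end of the text line.
    Returns (total_penalty, chosen_cuts).
    """
    n = len(cuts)
    inf = float("inf")
    cost = [inf] * n   # cost[j]: minimal penalty of any path ending at cut j
    prev = [-1] * n    # back-pointers used to recover the path
    cost[0] = 0.0
    for j in range(1, n):
        for i in range(j):
            width = cuts[j] - cuts[i]
            if width > max_ratio * cell_width:
                continue  # too wide to be a single character cell
            penalty = (width - cell_width) ** 2  # assumed penalty function
            if cost[i] + penalty < cost[j]:
                cost[j] = cost[i] + penalty
                prev[j] = i
    if cost[-1] == inf:
        raise ValueError("no admissible division of the line")
    path = []
    k = n - 1
    while k != -1:
        path.append(cuts[k])
        k = prev[k]
    return cost[-1], path[::-1]


# The gap at x=14 lies inside a character and should be skipped.
total_penalty, chosen = best_division([0, 10, 14, 20, 30], cell_width=10)
```

Because every edge of the graph points from an earlier cut to a later one, a single left-to-right pass is enough to find the optimal path.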
  • Patent number: 8548259
    Abstract: Techniques and methods are disclosed herein for combining and weighting of values from and associated with classifiers. Classifiers are used to recognize characters as part of an optical character recognition (OCR) system. Various methods of normalization facilitate combining of results of classifiers. For example, weight values may be entered into a weight table having two columns, one that includes weights from comparing patterns with images of correct characters, the other column includes weights from comparing patterns with images of incorrect characters.
    Type: Grant
    Filed: October 24, 2012
    Date of Patent: October 1, 2013
    Assignee: ABBYY Development LLC
    Inventor: Diar Tuganbaev
  • Patent number: 8548241
    Abstract: Described herein is a method for segmenting a document image into a picture component, a special or significant picture component, and a non-picture component. The non-picture component is compressed and may include character blocks. Separately, picture components are compressed with a lossy algorithm or with a preliminary defined compression ratio. Subsequently, the compressed picture component, significant picture component and the compressed non-picture component are saved in memory or in a storage location so that the document image may be recomposed based on the compressed picture component or compressed significant picture component and the compressed non-picture component.
    Type: Grant
    Filed: July 10, 2012
    Date of Patent: October 1, 2013
    Assignee: ABBYY Development LLC
    Inventors: German Zyuzin, Maksim Pikhenko, Vyacheslav Sapronenko