Patents Assigned to ABBYY Development LLC
-
Patent number: 10198628
Abstract: There is disclosed a method of analyzing a digital image of a document (to determine, as an example, the document's suitability for server-based OCR processing) in a computer system that includes a user electronic device (for acquiring or storing a digital image of a document) connectable to a server (for executing the server-based OCR processing of the digital image to create a recognized-text document). The method is executable by the user electronic device and comprises: acquiring the digital image of the document; analyzing an OCR quality parameter associated with a compressed digital image to be created from the digital image using a compression algorithm and a compression parameter; and, in response to the OCR quality parameter being above or equal to a pre-determined threshold, transmitting the compressed digital image to the server.
Type: Grant
Filed: December 13, 2016
Date of Patent: February 5, 2019
Assignee: ABBYY DEVELOPMENT LLC
Inventors: Vasily Loginov, Ivan Zagaynov
-
Patent number: 10200448
Abstract: An original image of a physical document is received by a client device, and a reduced file containing data indicative of a document type, substantially smaller in size than the original image, is created based on the original image. The client device sends a request including the reduced file to a server for information pertaining to a document type for the physical document, receives location information based on the document type from the server for at least one portion of the original image that contains at least one content item for the physical document, and extracts the at least one portion of the original image based on the location information to generate at least one extracted portion of the image. The client device sends a second request including the at least one extracted portion of the image to the server for the at least one content item. Responsive to receiving the at least one content item from the server device, the client device provides the at least one content item for display.
Type: Grant
Filed: December 14, 2016
Date of Patent: February 5, 2019
Assignee: ABBYY DEVELOPMENT LLC
Inventor: Vladimir Rybkin
-
Patent number: 10169305
Abstract: A document marking projection system receives a target document comprising text content, determines a set of similar documents using an index of stored documents, where the set of similar documents are similar to the target document, and selects a first similar document from the set of similar documents that is most similar to the target document. The document marking projection system determines one or more portions of text content in the first similar document that are different from respective one or more portions of text content in the target document, determines a first location of a first marking within the first similar document, determines a projected marking for the target document in view of one or more differences between the first portion of the text content in the first similar document and a respective portion of the text content in the target document, and stores the projected marking for the target document.
Type: Grant
Filed: June 16, 2017
Date of Patent: January 1, 2019
Assignee: ABBYY Development LLC
Inventors: Evgeny Indenbom, Sergey Kolotienko
-
Patent number: 10152648
Abstract: There is disclosed a method of determining a document type associated with a digital document, the method executable by an electronic device. A processor of the electronic device is configured to execute a plurality of machine learning algorithm (MLA) classifiers, each of the plurality of MLA classifiers having been trained to identify a specific document type. The plurality of MLA classifiers is ranked in a hierarchical order of execution of the plurality of MLA classifiers. A method of training the plurality of MLA classifiers is also disclosed.
Type: Grant
Filed: June 29, 2016
Date of Patent: December 11, 2018
Assignee: ABBYY DEVELOPMENT LLC
Inventor: Irina Zosimovna Filimonova
-
Patent number: 10140691
Abstract: A distortion correction component of a mobile device receives an image of a spread open multi-page document, determines a binding edge line of the spread open multi-page document, and determines a first set of substantially vertical straight lines lying left of the binding edge line and a second set of substantially vertical straight lines lying right of the binding edge line. The distortion correction component then determines a first vanishing point based on the first set of substantially vertical straight lines and a second vanishing point based on the second set of substantially vertical straight lines. A first quadrangle is determined based on the first vanishing point and a second quadrangle is determined based on the second vanishing point. A corrected image for the first page is generated based on the first quadrangle and a corrected image for the second page is generated based on the second quadrangle.
Type: Grant
Filed: May 18, 2016
Date of Patent: November 27, 2018
Assignee: ABBYY Development LLC
Inventor: Ivan Germanovich Zagaynov
-
Patent number: 10115036
Abstract: A page orientation component of an image processing device receives an image of a document, transforms the image to a binarized image by performing a binarization operation on the image, and identifies a portion of the binarized image that comprises one or more rows of textual content. The page orientation component identifies a plurality of horizontal runs of white pixels and a plurality of vertical runs of white pixels in the one or more rows of textual content in the portion of the binarized image. The page orientation component generates a first histogram for the plurality of horizontal runs of white pixels, and a second histogram for the plurality of vertical runs of white pixels, and determines an orientation of the one or more rows of textual content in the image based on the first histogram and the second histogram.
Type: Grant
Filed: June 16, 2016
Date of Patent: October 30, 2018
Assignee: ABBYY Development LLC
Inventors: Ivan Germanovich Zagaynov, Vladimir Yurievich Rybkin
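Illustrative sketch (not taken from the patent): the two run-length histograms named in the abstract above could be computed roughly as follows. The image representation (a list of rows with 1 for white and 0 for black) and the function names `run_lengths` and `white_run_histograms` are assumptions made for this example. The abstract does not say how the two histograms are compared, so no orientation decision is attempted here.

```python
from collections import Counter


def run_lengths(line, value=1):
    """Return the lengths of consecutive runs of `value` in a 1-D sequence."""
    runs = []
    count = 0
    for pixel in line:
        if pixel == value:
            count += 1
        elif count:
            runs.append(count)
            count = 0
    if count:  # a run that reaches the end of the line
        runs.append(count)
    return runs


def white_run_histograms(binary):
    """Histogram white-pixel run lengths along rows and along columns.

    `binary` is a list of equal-length rows; 1 means white (assumed encoding).
    Returns (horizontal_histogram, vertical_histogram) as Counters that map
    run length -> number of runs of that length.
    """
    horizontal = Counter()
    for row in binary:
        horizontal.update(run_lengths(row))
    vertical = Counter()
    for column in zip(*binary):  # transpose to walk the columns
        vertical.update(run_lengths(column))
    return horizontal, vertical
```

A caller would pass the binarized text region and then compare the two returned histograms with whatever decision rule it chooses.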
-
Patent number: 10108856
Abstract: The present disclosure provides methods of optical character recognition for extracting information from a patterned document, which has at least one static element and at least one information field. Related computer systems and computer-readable non-transitory storage media are also disclosed.
Type: Grant
Filed: June 28, 2016
Date of Patent: October 23, 2018
Assignee: ABBYY Development LLC
Inventor: Aleksey Ivanovich Kalyuzhny
-
Patent number: 10108815
Abstract: Systems and methods for redacting certain content (e.g., content representing private, privileged, confidential, or otherwise sensitive information) from electronic documents. An example method comprises: identifying, by a computing device, two or more layers in an electronic document; processing each of the identified layers to produce a layer text representing one or more objects comprised by the layer; combining the produced layer texts to produce a combined text of the electronic document; and identifying, within the combined text of the electronic document, a target character string corresponding, in view of a specified search function, to a specified character string.
Type: Grant
Filed: October 7, 2014
Date of Patent: October 23, 2018
Assignee: ABBYY Development LLC
Inventor: Ivan Yurievich Korneev
-
Patent number: 10068155
Abstract: A method of verifying optical character recognition (OCR) results may involve: performing OCR on one or more initial images of a document and displaying initial OCR results of the document to a user; receiving a feedback from the user regarding an error location in the initial OCR results, the error location being a location of a misspelled character sequence; receiving an additional image of the document, which corresponds to the error location, and performing OCR of the additional image to produce additional OCR results; identifying a cluster of character sequences, which correspond to the error location, using the initial OCR results and the additional OCR results; identifying an order of character sequences in the cluster of character sequences based on their respective probability values; and displaying to the user modified optical character recognition results, which contain in the error location a corrected character sequence.
Type: Grant
Filed: September 26, 2016
Date of Patent: September 4, 2018
Assignee: ABBYY Development LLC
Inventor: Aleksey Ivanovich Kalyuzhny
-
Patent number: 10068156
Abstract: The current document is directed to methods and systems for identifying symbols corresponding to symbol images in a scanned-document image or other text-containing image, with the symbols corresponding to Chinese or Japanese characters, to Korean morpho-syllabic blocks, or to symbols of other languages that use a large number of symbols for writing and printing. In one implementation, the methods and systems to which the current document is directed create and store a decision tree, the nodes of which include classifiers that each recognizes the symbol that corresponds to a symbol image. Input of a symbol image to the decision tree and processing of the symbol image through one or more nodes of the decision tree returns a symbol corresponding to the symbol image.
Type: Grant
Filed: March 19, 2015
Date of Patent: September 4, 2018
Assignee: ABBYY Development LLC
Inventors: Yuri Chulinin, Yury Vatlin
-
Patent number: 10043092
Abstract: Systems and methods for performing optical character recognition (OCR) are disclosed. An example method may include receiving a current image that overlaps with a previous image of a series of images of an original document; performing OCR of the current image to produce an OCR text; identifying a plurality of textual artifacts in the images that are each represented by a sequence of symbols having a frequency of occurrence within the OCR text falling below a threshold frequency; identifying corresponding base points that are each associated with a textual artifact; identifying parameters of a coordinate transformation converting coordinates of the previous image into coordinates of the current image; associating part of the OCR text with a cluster of symbol sequences, the symbol sequences being produced by processing previously received images; identifying a median string representing the cluster; and producing a resulting OCR text representing a portion of the original document.
Type: Grant
Filed: May 31, 2016
Date of Patent: August 7, 2018
Assignee: ABBYY DEVELOPMENT LLC
Inventor: Aleksey Kalyuzhny
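Illustrative sketch (not taken from the patent): one common way to pick a "median string" for a cluster of OCR readings is the set median, that is, the cluster member with the smallest total edit distance to all other members. The abstract above does not say which definition or distance the patent uses, so the Levenshtein distance and the names `edit_distance` and `set_median` are assumptions made for this example.

```python
def edit_distance(a, b):
    """Levenshtein distance between strings a and b (row-by-row dynamic programming)."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(
                prev[j] + 1,               # delete ca
                cur[j - 1] + 1,            # insert cb
                prev[j - 1] + (ca != cb),  # substitute, free if the symbols match
            ))
        prev = cur
    return prev[-1]


def set_median(cluster):
    """Return the cluster member minimizing the summed distance to all members.

    Ties are resolved in favor of the earliest member, because min() keeps
    the first minimal element it encounters.
    """
    return min(cluster, key=lambda s: sum(edit_distance(s, t) for t in cluster))


# Hypothetical readings of the same word taken from overlapping frames.
readings = ["invoice", "inv0ice", "invoice", "lnvoice"]
best = set_median(readings)
```

The set median is restricted to strings that actually occur in the cluster; a generalized median, which may be a string not seen in any frame, would need a different search.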
-
Patent number: 10019245
Abstract: For resolving an initialization order of static objects located in a plurality of object files using a processor device, for each object file, the objects in the object file are categorized as defined static objects or undefined objects. A directed graph is created of the plurality of object files. Then, topological sorting is applied to the directed graph to yield the order of the plurality of object files that ensures a correct initialization of the static objects.
Type: Grant
Filed: October 7, 2014
Date of Patent: July 10, 2018
Assignee: ABBYY DEVELOPMENT LLC
Inventors: Eugene Egorov, German Zyuzin
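Illustrative sketch (not taken from the patent): the abstract above names the ingredients (per-file sets of defined and undefined objects, a directed graph of object files, topological sorting) but not the details. The example below assumes that an edge runs from the file that defines a static object to every file that references it as undefined, and it uses Kahn's algorithm for the sort. The function name `initialization_order` and the dictionary inputs are assumptions made for this example.

```python
from collections import deque


def initialization_order(defines, uses):
    """Order object files so that providers are initialized before consumers.

    defines: dict mapping file name -> set of static objects defined in it
    uses:    dict mapping file name -> set of objects it references but does
             not define (the "undefined objects" of that file)
    Raises ValueError when the dependencies are cyclic.
    """
    files = sorted(set(defines) | set(uses))
    owner = {obj: f for f, objs in defines.items() for obj in objs}

    successors = {f: set() for f in files}  # provider -> consumers
    indegree = {f: 0 for f in files}
    for consumer, objs in uses.items():
        for obj in objs:
            provider = owner.get(obj)
            if provider is None or provider == consumer:
                continue  # resolved elsewhere, or no cross-file dependency
            if consumer not in successors[provider]:
                successors[provider].add(consumer)
                indegree[consumer] += 1

    # Kahn's algorithm: repeatedly emit files with no unmet dependencies.
    ready = deque(f for f in files if indegree[f] == 0)
    order = []
    while ready:
        current = ready.popleft()
        order.append(current)
        for nxt in sorted(successors[current]):
            indegree[nxt] -= 1
            if indegree[nxt] == 0:
                ready.append(nxt)

    if len(order) != len(files):
        raise ValueError("cyclic dependency between object files")
    return order
```

Sorting the file names and successors only makes the output deterministic; any order that respects the graph's edges would satisfy the dependency constraint.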
-
Patent number: 9996760
Abstract: Systems and methods are described for receiving a current image that partially overlaps with a previous image of a series of images of an original document; performing optical character recognition (OCR) of the current image, producing an OCR text and a corresponding text layout; identifying textual artifacts in the current and previous images, each represented by a sequence of symbols having a frequency of occurrence within the OCR text below a threshold frequency; identifying corresponding base points associated with textual artifacts; identifying parameters of a coordinate transformation converting coordinates of the previous image into coordinates of the current image; associating part of the OCR text with a cluster of symbol sequences, wherein the symbol sequences are produced by processing previously received images; identifying an order of clusters of symbol sequences reflecting a layout of the original document; and producing a resulting OCR text representing a portion of the original document.
Type: Grant
Filed: May 31, 2016
Date of Patent: June 12, 2018
Assignee: ABBYY DEVELOPMENT LLC
Inventor: Aleksey Kalyuzhny
-
Patent number: 9959475
Abstract: The subject matter of this specification can be implemented in, among other things, a method that includes identifying edges of a section of a document in a source image that includes at least one row of text. The method includes identifying characters in the document. The method includes identifying word portions. The method includes generating polynomials that approximate points of the characters within the word portions. The method includes generating a second polynomial that approximates the points of the characters of word portions. The method includes identifying a stretching coefficient of the row of text based on a length of the section between the edges relative to a length of the second polynomial. The method includes mapping portions of the source image along the row of text to new positions in a corrected image based on the second polynomial and the stretching coefficient.
Type: Grant
Filed: June 28, 2016
Date of Patent: May 1, 2018
Assignee: ABBYY DEVELOPMENT LLC
Inventors: Maksim Petrovich Kalenkov, Dmitrij Yurievich Chubanov
-
Patent number: 9922247
Abstract: Systems and methods for enhancing and comparing documents. An example method comprises: comparing document images to identify a first document image of a reference document that corresponds with a second document image of a related document; transforming the second document image based on a layout of the first document image; and performing character recognition of the second document image.
Type: Grant
Filed: January 2, 2015
Date of Patent: March 20, 2018
Assignee: ABBYY DEVELOPMENT LLC
Inventors: Ivan Khintsitskiy, Andrey Isaev
-
Patent number: 9911034
Abstract: The current application is directed to methods and systems that convert document images, which contain Arabic text and text in other languages in which symbols are joined together to produce continuous words and portions of words, into corresponding electronic documents. In one implementation, a document-image-processing method and system to which the current application is directed employs numerous techniques and features that render efficiently computable an otherwise intractable or impractical document-image-to-electronic-document conversion. These techniques and features include transformation of text-image morphemes and words into feature symbols with associated parameters, efficiently identifying similar morphemes and words in an electronic store of standard-feature-symbol-encoded morphemes and words, and identifying candidate inter-character division points and corresponding traversal paths using the similar morphemes and words identified in the word store.
Type: Grant
Filed: June 18, 2013
Date of Patent: March 6, 2018
Assignee: ABBYY DEVELOPMENT LLC
Inventor: Yury Georgievich Chulinin
-
Patent number: 9892114
Abstract: The current document is directed to methods and systems for identifying symbols corresponding to symbol images in a scanned-document image or other text-containing image, with the symbols corresponding to Chinese or Japanese characters, to Korean morpho-syllabic blocks, or to symbols of other languages that use a large number of symbols for writing and printing. In one implementation, the methods and systems to which the current document is directed carry out an initial processing step on one or more scanned images to identify a subset of the total number of symbols frequently used in the scanned document image or images. One or more lists of graphemes for the language of the text are then ordered in most-likely-occurring to least-likely-occurring order to facilitate a second optical-character-recognition step in which symbol images extracted from the one or more scanned-document images are associated with one or more graphemes most likely to correspond to the scanned symbol image.
Type: Grant
Filed: October 7, 2014
Date of Patent: February 13, 2018
Assignee: ABBYY DEVELOPMENT LLC
Inventor: Yuri Chulinin
-
Patent number: 9886320
Abstract: An algorithm is disclosed for assigning priorities to tasks that users queue for processing, based on how heavily each task's user used the system resources in the past, including the number of tasks queued by the user in the past, the volume of these tasks, and the amount of processor time used. In the OCR context, the tasks are graphic files placed on servers and chosen for processing in accordance with the assigned priorities.
Type: Grant
Filed: June 24, 2016
Date of Patent: February 6, 2018
Assignee: ABBYY DEVELOPMENT LLC
Inventors: Vasily Vladimirovich Panferov, Dmitry Konstantinovich Mesheryakov
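Illustrative sketch (not taken from the patent): the abstract above lists three inputs to the priority (number of past tasks, their volume, and processor time) without giving a formula. The example below assumes a simple weighted sum in which lighter past usage means earlier processing. The weights, the `UsageHistory` fields, and the function names are all assumptions made for this example.

```python
import heapq
from dataclasses import dataclass


@dataclass
class UsageHistory:
    """Past resource usage of one user (field names are hypothetical)."""
    tasks_queued: int = 0
    pages_processed: int = 0  # stands in for the "volume" of past tasks
    cpu_seconds: float = 0.0


def usage_score(h, w_tasks=1.0, w_volume=0.1, w_cpu=0.01):
    """Weighted sum of past usage; the weights are arbitrary assumptions."""
    return (w_tasks * h.tasks_queued
            + w_volume * h.pages_processed
            + w_cpu * h.cpu_seconds)


def processing_order(tasks, history):
    """Return task ids ordered so that lighter past users are served first.

    tasks:   list of (task_id, user) pairs in arrival order
    history: dict mapping user -> UsageHistory; unknown users count as unused
    Arrival position breaks ties, so equal-score tasks stay first come,
    first served.
    """
    heap = []
    for position, (task_id, user) in enumerate(tasks):
        score = usage_score(history.get(user, UsageHistory()))
        heapq.heappush(heap, (score, position, task_id))
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]
```

With this scoring, a task from a user with no recorded history is scheduled ahead of tasks from users who have already consumed many resources.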
-
Patent number: 9858506
Abstract: The current document is directed to methods and systems that convert document images containing mathematical expressions into corresponding electronic documents. In one implementation, an image or sub-image containing a mathematical expression is recursively partitioned into blocks separated by white-space stripes. Horizontal and vertical partitioning are alternately and recursively applied to the image or sub-image containing a mathematical expression until the lowest-level blocks obtained by partitioning correspond to symbols recognizable by character-recognition methods. Graph-based analysis of the recognized symbols provides a basis for encoding an equivalent representation of the mathematical expression contained in the image or sub-image.
Type: Grant
Filed: April 6, 2015
Date of Patent: January 2, 2018
Assignee: ABBYY DEVELOPMENT LLC
Inventors: Dmitry Isupov, Anton Masalovitch
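Illustrative sketch (not taken from the patent): the alternating horizontal and vertical partitioning on white-space stripes described in the abstract above resembles the classic recursive XY-cut. The example below assumes a binary image given as a list of rows with 1 for ink and 0 for white space, and it stops when a block cannot be split in either direction, rather than testing whether the block is a recognizable symbol as the abstract describes. The names `ink_spans` and `xy_cut` are assumptions made for this example.

```python
def ink_spans(profile):
    """Return (start, end) index pairs of consecutive True entries in profile."""
    spans, start = [], None
    for i, has_ink in enumerate(profile):
        if has_ink and start is None:
            start = i
        elif not has_ink and start is not None:
            spans.append((start, i))
            start = None
    if start is not None:
        spans.append((start, len(profile)))
    return spans


def xy_cut(image, top, bottom, left, right, horizontal=True, stuck=False):
    """Recursively split a region into blocks separated by blank stripes.

    Regions are half-open: rows top..bottom-1 and columns left..right-1.
    `horizontal=True` cuts along blank rows, otherwise along blank columns.
    `stuck` records that the other direction already failed to split.
    Returns a list of (top, bottom, left, right) leaf blocks.
    """
    if horizontal:
        profile = [any(image[r][left:right]) for r in range(top, bottom)]
        blocks = [(top + s, top + e, left, right) for s, e in ink_spans(profile)]
    else:
        profile = [any(image[r][c] for r in range(top, bottom))
                   for c in range(left, right)]
        blocks = [(top, bottom, left + s, left + e) for s, e in ink_spans(profile)]

    if not blocks:  # the region is entirely blank
        return []
    if len(blocks) == 1:
        if stuck:  # neither direction splits: treat the block as a leaf
            return blocks
        return xy_cut(image, *blocks[0], horizontal=not horizontal, stuck=True)

    leaves = []
    for block in blocks:
        leaves.extend(xy_cut(image, *block, horizontal=not horizontal))
    return leaves
```

Each leaf block would then be handed to a character recognizer; the graph-based analysis mentioned in the abstract is outside the scope of this sketch.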
-
Patent number: 9817821
Abstract: Methods are described for translation of one or more words in a source language into a target language based on context, history and meaning of portions of the source text. Translation may involve selection of electronic dictionaries when translating from a source language to one or more target languages. Various aspects of history, context and structures of words that reflect lexical, morphological, syntactic, and semantic properties facilitate selection or presentation of translations and options to a user. The methods are applicable to genre classification, topic detection, news analysis, authorship analysis, internet searches, and creating corpora for other tasks.
Type: Grant
Filed: December 19, 2012
Date of Patent: November 14, 2017
Assignee: ABBYY DEVELOPMENT LLC
Inventors: Alexander Gennadievich Rylov, Ivan Sergeevich Arkhipov