Patents Assigned to ABBYY Development LLC
-
Patent number: 9811885
Abstract: Disclosed are systems, computer-readable mediums, and methods for detecting glare in a frame of image data. A frame of image data is preprocessed. A set of connected components in the preprocessed frame is determined. A set of statistics is calculated for one or more connected components in the set of connected components. A decision for the one or more connected components is made, using the calculated set of statistics, if the connected component is a light spot over text. Whether glare is present in the frame is determined.
Type: Grant
Filed: August 4, 2016
Date of Patent: November 7, 2017
Assignee: ABBYY DEVELOPMENT LLC
Inventors: Konstantin Bocharov, Mikhail Kostyukov
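The pipeline in this abstract (preprocess, find connected components, compute statistics, decide) can be illustrated with a small sketch. This is not the patented method: the binarization step, the two statistics (area and bounding-box fill ratio), the thresholds, and the function names `connected_components` and `frame_has_glare` are all assumptions chosen for illustration, and the sketch does not check whether the bright spot lies over text.

```python
from collections import deque


def connected_components(mask):
    """Return the 4-connected components of True cells as lists of (row, col)."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    components = []
    for r in range(rows):
        for c in range(cols):
            if not mask[r][c] or seen[r][c]:
                continue
            seen[r][c] = True
            queue, component = deque([(r, c)]), []
            while queue:
                y, x = queue.popleft()
                component.append((y, x))
                for ny, nx in ((y + 1, x), (y - 1, x), (y, x + 1), (y, x - 1)):
                    if (0 <= ny < rows and 0 <= nx < cols
                            and mask[ny][nx] and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            components.append(component)
    return components


def frame_has_glare(frame, bright=240, min_area=4, min_fill=0.5):
    """Toy decision rule; all thresholds are hypothetical."""
    # Preprocessing: binarize the grayscale frame into a mask of bright pixels.
    mask = [[pixel >= bright for pixel in row] for row in frame]
    for component in connected_components(mask):
        # Statistics: area and how densely the component fills its bounding box.
        ys = [y for y, _ in component]
        xs = [x for _, x in component]
        area = len(component)
        box = (max(ys) - min(ys) + 1) * (max(xs) - min(xs) + 1)
        if area >= min_area and area / box >= min_fill:
            return True
    return False
```

Here a frame is a non-empty list of rows of grayscale values in the range 0 to 255; a compact bright blob is reported as glare, while an isolated bright pixel is not.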
-
Patent number: 9811726
Abstract: Disclosed are systems, computer-readable mediums, and methods for determining that text contains Chinese, Japanese, or Korean characters. One method includes determining a language hypothesis for each text fragment in a plurality of text fragments identified from connected components in a document image. The method further includes selecting a first subset of text fragments from the plurality of text fragments based on ratings for the language hypothesis of each text fragment in the plurality of text fragments. The method further includes verifying, by a processor, the language hypothesis of one or more text fragments in the first subset of text fragments based on optical character recognition of the one or more text fragments. The method further includes determining, by the processor, that Chinese, Japanese, or Korean (CJK) characters are present in the document image based on the verification of the language hypothesis of each of the one or more text fragments.
Type: Grant
Filed: June 26, 2016
Date of Patent: November 7, 2017
Assignee: ABBYY DEVELOPMENT LLC
Inventors: Mikhail Yurievich Atroshchenko, Dmitry Georgievich Deryagin, Yuri Georgievich Chulinin
-
Patent number: 9792895
Abstract: Disclosed are systems, methods and computer program products for using prior frame data for OCR processing of frames in video sources to detect natural language text therein. An example includes receiving a frame from a video source and retrieving prior frame data associated with the video source. The OCR-processing includes using prior frame data to detect blobs similar to blobs described in the prior frame data; using detected similar blobs to detect in the frame character candidates similar to character candidates described in the prior frame data; using detected similar character candidates to detect in the frame text candidates similar to text candidates described in the prior frame data; and using detected similar text candidates to detect in the frame text strings similar to text strings described in the prior frame data.
Type: Grant
Filed: September 24, 2015
Date of Patent: October 17, 2017
Assignee: ABBYY DEVELOPMENT LLC
Inventors: Ivan Khintsitskiy, Andrey Isaev, Sergey Fedorov
-
Patent number: 9772995
Abstract: Disclosed are systems, computer-readable mediums, and methods for providing a meaning of an entry in a text. A lexico-morphological analysis is performed on the text. A syntactical analysis is performed on the text. A semantic analysis is performed on the text. A syntactical structure and a semantic structure for the entry are chosen. One or more syntactic links between each alternative meaning of words in proximity to the entry are determined. A weight is determined. One or more semantic links between each word in proximity to the entry are determined. For each semantic link, a weight associated with the semantic link is determined. The meaning of the entry is determined based on the weights associated with each semantic and syntactic link.
Type: Grant
Filed: December 24, 2013
Date of Patent: September 26, 2017
Assignee: ABBYY DEVELOPMENT LLC
Inventors: Alexander Gennadievich Rylov, Ivan Sergeevich Arkhipov
-
Patent number: 9767388
Abstract: An improved method for verifying whether a character-recognition technology has correctly identified which characters are represented by character images involves displaying the uncertain character images in place of their respective hypothesis characters in a document being read by a verifier. The verifier may mark incorrectly spelled words containing the uncertain character images. Based on the markings, a system adjusts a confidence level associated with the hypothesis about the uncertain character in order to obtain a confirmed hypothesis linked to the uncertain character.
Type: Grant
Filed: October 7, 2014
Date of Patent: September 19, 2017
Assignee: ABBYY DEVELOPMENT LLC
Inventors: Aram Bengurovich Pakhchanian, Michael Pavlovich Pogosskiy
-
Patent number: 9754187
Abstract: For extracting data from a document with fixed structure, we recognize key words in an image of the document; identify reference objects based on these key words; create templates based on the identified reference objects; match the created templates against the image of the document while recognizing fields in the image of the document using these templates; and select the best template using quality of the recognized field.
Type: Grant
Filed: December 16, 2014
Date of Patent: September 5, 2017
Assignee: ABBYY DEVELOPMENT LLC
Inventors: Vasily Vladimirovich Panferov, Andrey Anatolievich Isaev
-
Patent number: 9740692
Abstract: Disclosed are systems, computer-readable mediums, and methods for creating a flexible structure description. To create the flexible structure description an image of a document of a particular document type that contains a table is received. An entry describing an item in the table is received. Title elements within the document are searched for based upon the entry. Data fields and anchor elements are detected for the entry. A flexible structure description for the particular document type is generated that includes a set of search elements for each data field in the image of the document and the title elements. The flexible structure description is matched against the image. Data from the image is extracted based upon the matching of the flexible structure description against the image.
Type: Grant
Filed: November 5, 2014
Date of Patent: August 22, 2017
Assignee: ABBYY Development LLC
Inventors: Sergei Golubev, Irene Filimonova, Sergey Zlobin
-
Patent number: 9740927
Abstract: Systems and methods for identifying screenshots within document images. An example method comprises: receiving an image of at least a part of a document; identifying, within the image, a polygonal object having a visually distinct border comprising a plurality of edges of one or more intersecting rectangles; asserting a screenshot image hypothesis with respect to the identified polygonal object; and responsive to evaluating at least one condition associated with one or more attributes of the identified polygonal object, classifying the identified polygonal object as a screenshot image.
Type: Grant
Filed: December 9, 2014
Date of Patent: August 22, 2017
Assignee: ABBYY Development LLC
Inventor: Dmitry Deryagin
-
Patent number: 9710704
Abstract: Systems and methods for finding and presenting differences between documents are provided. One method includes identifying one or more differences between a first document and at least one second document. The method further includes determining each of the one or more differences to be either a significant difference or an insignificant difference. The determination of whether each of the one or more differences is a significant difference or an insignificant difference may be performed in an automated manner without intervention from a user. The method further includes providing an identification of the significant differences to the user. The method further includes either hiding the insignificant differences from the user or providing an identification of the insignificant differences in a different manner than a manner in which the identification of the significant differences is provided.
Type: Grant
Filed: December 3, 2014
Date of Patent: July 18, 2017
Assignee: ABBYY DEVELOPMENT LLC
Inventors: Vasily Vladimirovich Panferov, Andrey Anatolievich Isaev, Catherine Yurievna Bobrova, Olga Zhukovskaya
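A rough sketch of the idea of separating significant from insignificant differences, using Python's standard `difflib`. The abstract does not say how significance is decided; the rule used here (a change is insignificant when it affects only letter case or punctuation) and the names `classify_differences` and `_normalize` are assumptions made for illustration.

```python
import difflib
import string


def _normalize(words):
    """Lowercase the words and strip ASCII punctuation (hypothetical rule)."""
    text = " ".join(words).lower()
    return text.translate(str.maketrans("", "", string.punctuation))


def classify_differences(first, second):
    """Split word-level differences into (significant, insignificant) lists."""
    a, b = first.split(), second.split()
    significant, insignificant = [], []
    matcher = difflib.SequenceMatcher(a=a, b=b, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        old, new = a[i1:i2], b[j1:j2]
        # A change that disappears after normalization is treated as cosmetic.
        same = _normalize(old) == _normalize(new)
        bucket = insignificant if same else significant
        bucket.append((" ".join(old), " ".join(new)))
    return significant, insignificant
```

A caller could then show the first list to the user and hide the second, or render the second in a less prominent style.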
-
Patent number: 9684843
Abstract: A data capture component of a mobile device receives information for an identification of a data field in a physical document. The data capture component receives a video stream comprising a plurality of frames, wherein each frame comprises a portion of the physical document. A frame is selected from the plurality of frames in the video stream. One or more text regions in the frame are identified. Each of the identified text region(s) in the frame is processed to identify data of each of the identified text region(s) and to select data of one of the identified text region(s) that corresponds to a set of attributes associated with the data field. The selected data is then compared with data of text regions of a subsequent frame. If the data of the text regions of the subsequent frame is a closer match to the set of attributes, the selected data is updated. A display field is then provided with the selected data for presentation in a user interface.
Type: Grant
Filed: December 14, 2015
Date of Patent: June 20, 2017
Assignee: ABBYY DEVELOPMENT LLC
Inventor: Andrey Isaev
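The frame-by-frame update described in this abstract can be sketched as a running best match. The sketch assumes each frame has already been reduced to a list of recognized text regions, models the "set of attributes" as a single regular expression, and scores a region by how much of its text the longest match covers. These choices, and the names `match_score` and `best_field_value`, are hypothetical and not taken from the patent.

```python
import re


def match_score(text, pattern):
    """Fraction of the text covered by the longest match of the pattern."""
    longest = max((len(m.group()) for m in re.finditer(pattern, text)), default=0)
    return longest / len(text) if text else 0.0


def best_field_value(frames, pattern):
    """Keep the region text that best matches the pattern across all frames."""
    selected, selected_score = None, 0.0
    for regions in frames:          # frames arrive in video order
        for text in regions:        # recognized text of each region
            score = match_score(text, pattern)
            if score > selected_score:
                # A later frame produced a closer match: update the selection.
                selected, selected_score = text, score
    return selected
```

For example, with an amount-like pattern such as `\d+\.\d{2}`, a misrecognized region in an early frame is replaced once a later frame yields a cleaner reading.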
-
Patent number: 9648208
Abstract: Systems and methods for improving the quality of document images are provided. One method includes identifying a plurality of image fragments within a previously received document image that includes text. The method further includes separating the plurality of image fragments into a plurality of classes. Each class includes a subset of the plurality of image fragments that are substantially similar to one another. The method further includes, for each of the plurality of classes: (1) processing a class of image fragments to generate a combined and substantially enlarged image fragment for the class; and (2) filtering the combined and substantially enlarged image fragment to generate a filtered image fragment for the class. The method further includes generating an improved document image by replacing or modifying the image fragments within the document image based on the filtered image fragments.
Type: Grant
Filed: June 25, 2014
Date of Patent: May 9, 2017
Assignee: ABBYY Development LLC
Inventors: Mikhail Kostyukov, Ivan Zagaynov
-
Patent number: 9633256
Abstract: The current document is directed to methods and systems for identifying symbols corresponding to symbol images in a scanned-document image or other text-containing image, with the symbols corresponding to Chinese or Japanese characters, to Korean morpho-syllabic blocks, or to symbols of other languages that use a large number of symbols for writing and printing. In one implementation, the methods and systems to which the current document is directed carry out an initial processing step on one or more scanned images to identify, for each symbol image within a scanned document, a set of graphemes that match, with high frequency, symbol patterns that, in turn, match the symbol image. The set of graphemes identified for a symbol image is associated with the symbol image as a set of candidate graphemes for the symbol image. The set of candidate graphemes are then used, in one or more subsequent steps, to associate each symbol image with a most likely corresponding symbol code.
Type: Grant
Filed: December 10, 2014
Date of Patent: April 25, 2017
Assignee: ABBYY Development LLC
Inventor: Yuri Chulinin
-
Patent number: 9633257
Abstract: Automatic classification of different types of documents is disclosed. An image of a form or document is captured. The document is assigned to one or more type definitions by identifying one or more objects within the image of the document. A matching model is selected via identification of the document image. In the case of multiple identifications, a profound analysis of the document type is performed, either automatically or manually. An automatic classifier may be trained with document samples of each of a plurality of document classes or document types where the types are known in advance or a system of classes may be formed automatically without a priori information about types of samples. An automatic classifier determines possible features and calculates a range of feature values and possible other feature parameters for each type or class of document. A decision tree, based on rules specified by a user, may be used for classifying documents.
Type: Grant
Filed: June 25, 2014
Date of Patent: April 25, 2017
Assignee: ABBYY DEVELOPMENT LLC
Inventors: Irina Filimonova, Sergey Zlobin, Andrey Myakutin
-
Patent number: 9626601
Abstract: Systems and methods for identifying transformations to be applied to at least part of a document image for improving the OCR quality. An example method comprises: constructing, by a computer system, an ordered list of transformations to be applied to an image comprising a character string, each transformation corresponding to a hypothesis asserted with respect to one or more characteristics of the image; applying, to the image, a leading transformation on the list to produce a transformed image; evaluating a quality of the transformed image to produce a quality estimate; and updating the list in view of the quality estimate.
Type: Grant
Filed: December 16, 2014
Date of Patent: April 18, 2017
Assignee: ABBYY Development LLC
Inventor: Sergey Kuznetsov
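The loop of "apply the leading transformation, evaluate, update the list" might look roughly like the sketch below. The abstract does not specify how the list is updated; here a transformation is simply consumed once tried, and its result is kept only if the quality estimate improves. The function name `improve_image`, the `max_steps` limit, and the acceptance rule are assumptions, and plain strings stand in for images in the usage note.

```python
def improve_image(image, transformations, quality, max_steps=10):
    """Try transformations in order, keeping each one that raises the quality.

    transformations: list of (name, function) pairs, highest priority first.
    quality: function returning a numeric quality estimate for an image.
    """
    pending = list(transformations)       # ordered list of hypotheses
    best, best_quality = image, quality(image)
    applied = []
    for _ in range(max_steps):
        if not pending:
            break
        name, transform = pending.pop(0)  # leading transformation on the list
        candidate = transform(best)
        estimate = quality(candidate)
        if estimate > best_quality:
            # Hypothesis confirmed: continue from the transformed image.
            best, best_quality = candidate, estimate
            applied.append(name)
        # Otherwise the hypothesis is rejected and dropped from the list.
    return best, best_quality, applied
```

As a toy run, take the string `"  HELLO  "` as the image, a quality function that counts lowercase letters minus spaces, and the transformations `str.swapcase`, `str.upper`, and `str.strip` in that order: the first and third are accepted and the second is rejected.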
-
Patent number: 9626555
Abstract: Methods, systems, and apparatus, including computer programs encoded on a computer storage medium, for classifying one or more document images based on its content by determining blocks layout of the document image; recognizing the document image to obtain digital content data representing text content or the potential graphical content of the image; calculating feature values of the document image for features based on the digital content data and the blocks layout; and classifying the document image as belonging to one of document classes based on the calculated feature values.
Type: Grant
Filed: December 16, 2014
Date of Patent: April 18, 2017
Assignee: ABBYY DEVELOPMENT LLC
Inventors: Anatoly Smirnov, Vasily Panferov, Andrey Isaev
-
Patent number: 9613299
Abstract: Methods and systems for performing character recognition of a document image include analyzing verification performed by a user on a recognized text obtained by character recognition of a document image, identifying analogous changes of a first incorrect character for a first correct character, and prompting the user to initiate a training of a recognition pattern based on the identified analogous changes.
Type: Grant
Filed: December 11, 2014
Date of Patent: April 4, 2017
Assignee: ABBYY Development LLC
Inventors: Michael Krivosheev, Natalia Kolodkina, Alexander Makushev
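Identifying "analogous changes" could be approximated by counting repeated character substitutions between the recognized text and the user-verified text. The sketch below assumes the two texts have equal length (substitutions only, no insertions or deletions); the threshold `min_count` and the function name `analogous_corrections` are hypothetical.

```python
from collections import Counter


def analogous_corrections(recognized, verified, min_count=3):
    """Return (wrong, right) character pairs the user corrected repeatedly."""
    if len(recognized) != len(verified):
        raise ValueError("this sketch only handles same-length texts")
    # Count every position where the user replaced one character with another.
    counts = Counter(
        (wrong, right)
        for wrong, right in zip(recognized, verified)
        if wrong != right
    )
    return [(pair, n) for pair, n in counts.most_common() if n >= min_count]
```

A non-empty result would be the point at which an application might prompt the user to train the recognizer on the affected characters.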
-
Patent number: 9589185
Abstract: The current document is directed to methods and systems for identifying symbols corresponding to symbol images in a scanned-document image or other text-containing image, with the symbols corresponding to Chinese or Japanese characters, to Korean morpho-syllabic blocks, or to symbols of other languages that use a large number of symbols for writing and printing. In one implementation, the methods and systems to which the current document is directed carry out an initial processing step on one or more scanned images to identify a set of graphemes that most likely correspond to each symbol image that occurs in the scanned document image. The graphemes are selected for a symbol image based on accumulated votes generated from symbol patterns identified as likely related to the symbol image using one or more decision forests.
Type: Grant
Filed: October 12, 2015
Date of Patent: March 7, 2017
Assignee: ABBYY Development LLC
Inventors: Yury Georgievich Chulinin, Oleg Senkevich
-
Patent number: 9519641
Abstract: Methods are described for efficient and substantially instant recognition and translation of text in photographs. A user is able to select an area of interest for subsequent processing. Optical character recognition (OCR) may be performed on a wider area than that selected for determining the subject domain of the text. Translation to one or more target languages is performed. Manual corrections may be made at various stages of processing. Variations of translation are presented and made available for substitution of a word or expression in the target language. Translated text is made available for further uses or for immediate access.
Type: Grant
Filed: October 15, 2012
Date of Patent: December 13, 2016
Assignee: ABBYY Development LLC
Inventors: Ekaterina Solntseva, Konstantin Tarachyov
-
Patent number: 9519404
Abstract: Aspects of the present disclosure relate to image segmentation for data verification. A method of the disclosure comprises: receiving, using a processing device, an image of at least a part of a document; identifying a first image region in the image that corresponds to data to be verified by a user; extracting data from the image of at least the part of the document; partitioning the image into a plurality of image segments based on positioning information related to the first image region, wherein the plurality of image segments comprises a first image segment and a second image segment, and wherein the second image segment comprises the first image region; and presenting data extracted from the first image region in association with the first image segment and the second image segment.
Type: Grant
Filed: May 15, 2015
Date of Patent: December 13, 2016
Assignee: ABBYY Development LLC
Inventor: Diana Kanivets
-
Patent number: D771077
Type: Grant
Filed: December 5, 2012
Date of Patent: November 8, 2016
Assignee: ABBYY Development LLC
Inventor: Anatoly Ryzhkov