Abstract: Provided are methods and systems for recognizing characters such as mathematical expressions or chemical formulas. An example method comprises the steps of: receiving and processing an image by a pre-processing module to obtain one or more candidate regions; extracting features of each of the candidate regions by a feature extracting module such as a convolutional neural network (CNN); encoding the features into a distributive representation for each of the candidate regions separately using an encoding module such as a first long short-term memory (LSTM) based neural network; decoding the distributive representation into output representations using a decoding module such as a second LSTM-based recurrent neural network; and combining the output representations into an output expression, which is output in a computer-readable format or a markup language.
Type: Grant
Filed: June 20, 2016
Date of Patent: January 30, 2018
Assignee: Machine Learning Works, LLC
Inventors: Pavel Savchenkov, Evgeny Savinov, Mikhail Trofimov, Sergey Kiyan, Aleksei Esin
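
The five stages named in the abstract (pre-processing, feature extraction, encoding, decoding, combining) can be illustrated as a data flow. The following is a minimal sketch only, not the patented implementation: every function name is hypothetical, and the CNN and both LSTM networks are replaced by trivial hand-written stand-ins so that the example runs with the Python standard library alone. The toy "glyphs" (a one-column bar read as `1`, a one-row bar read as `-`) and the `$...$` output wrapper are likewise assumptions made for illustration.

```python
from typing import List

Image = List[List[int]]  # binary image: 1 = ink, 0 = background


def find_candidate_regions(image: Image) -> List[Image]:
    """Pre-processing stand-in: split the image at blank columns."""
    if not image:
        return []
    width = len(image[0])
    regions: List[Image] = []
    start = None
    for x in range(width + 1):
        # Treat the position just past the right edge as a blank column.
        has_ink = x < width and any(row[x] for row in image)
        if has_ink and start is None:
            start = x
        elif not has_ink and start is not None:
            regions.append([row[start:x] for row in image])
            start = None
    return regions


def extract_features(region: Image) -> List[float]:
    """Feature-extractor stand-in (a CNN in the abstract): ink per column."""
    return [float(sum(column)) for column in zip(*region)]


def encode(features: List[float]) -> List[float]:
    """Encoder stand-in (first LSTM in the abstract): fold the feature
    sequence into a fixed-length summary [sequence length, total ink]."""
    return [float(len(features)), sum(features)]


def decode(representation: List[float]) -> List[str]:
    """Decoder stand-in (second LSTM in the abstract): emit output tokens."""
    width, ink = representation
    if width == 1:
        return ["1"]   # a single-column stroke
    if ink == width:
        return ["-"]   # one ink pixel per column: a horizontal stroke
    return ["?"]       # anything else is unknown to this toy decoder


def recognize(image: Image) -> str:
    """Run each candidate region through the stages separately, then
    combine the per-region outputs into one markup string."""
    tokens: List[str] = []
    for region in find_candidate_regions(image):
        tokens.extend(decode(encode(extract_features(region))))
    return "$" + " ".join(tokens) + "$"


# A 3x7 toy image: vertical bar, horizontal bar, vertical bar.
SAMPLE = [
    [1, 0, 0, 0, 0, 0, 1],
    [1, 0, 1, 1, 1, 0, 1],
    [1, 0, 0, 0, 0, 0, 1],
]

if __name__ == "__main__":
    print(recognize(SAMPLE))
```

Each region is encoded and decoded independently, mirroring the abstract's statement that the representation is produced "for each of the candidate regions separately"; only the final step joins the results. In a real system the three stand-ins would be trained networks, and the region splitter would need to handle touching or vertically stacked symbols, which this sketch does not attempt.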