Abstract: A method for extracting information from an unstructured data source includes: scraping, by at least one processor, a plurality of texts from the unstructured data source; extracting, by the at least one processor, a chunk of relevant text from the plurality of texts; summarizing, by the at least one processor, using a pre-trained summarizer, the chunk of relevant text to obtain semi-structured information comprising a set of sentences that summarize the chunk of relevant text; and postprocessing, by the at least one processor, the semi-structured information to obtain structured information. The method can be executed with high efficiency, in particular on massively parallel hardware.
Type:
Grant
Filed:
April 11, 2022
Date of Patent:
November 19, 2024
Assignee:
AtomLeap GmbH
Inventors:
Raghavendran Pownraju Mahendravarman, Berinike C. K. Tech, Arsen Hnatiuk, Robin P. G. Tech
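
The four steps named in the abstract (scraping, extracting, summarizing, postprocessing) can be illustrated with a minimal Python sketch. This is not the patented implementation. Every function name, the keyword filter, the regular expressions, and the `founded`/`employees` fields are hypothetical choices made for illustration. In particular, `summarize` is a trivial stand-in that keeps the leading sentences; the abstract calls for a pre-trained summarizer, which would replace it in practice.

```python
import re
from html.parser import HTMLParser


class _TextCollector(HTMLParser):
    """Collects visible text nodes, skipping <script> and <style> content."""

    def __init__(self):
        super().__init__()
        self.texts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip and data.strip():
            self.texts.append(data.strip())


def scrape_texts(html):
    """Step 1: scrape a plurality of texts from an unstructured source."""
    parser = _TextCollector()
    parser.feed(html)
    parser.close()
    return parser.texts


def extract_relevant_chunk(texts, keywords):
    """Step 2: keep texts that mention any keyword (hypothetical relevance rule)."""
    lowered = [k.lower() for k in keywords]
    return " ".join(t for t in texts if any(k in t.lower() for k in lowered))


def summarize(chunk, max_sentences=2):
    """Step 3: placeholder for the pre-trained summarizer (assumption).

    Returns the first sentences of the chunk as the semi-structured output.
    """
    sentences = [s.strip() for s in re.split(r"(?<=[.!?])\s+", chunk) if s.strip()]
    return sentences[:max_sentences]


def postprocess(sentences):
    """Step 4: turn summary sentences into a structured record (hypothetical fields)."""
    record = {"founded": None, "employees": None}
    for sentence in sentences:
        match = re.search(r"founded in (\d{4})", sentence, re.IGNORECASE)
        if match:
            record["founded"] = int(match.group(1))
        match = re.search(r"(\d+) employees", sentence, re.IGNORECASE)
        if match:
            record["employees"] = int(match.group(1))
    return record


def run_pipeline(html, keywords):
    """Chain the four steps on a single document."""
    texts = scrape_texts(html)
    chunk = extract_relevant_chunk(texts, keywords)
    return postprocess(summarize(chunk))


# Made-up sample page used only to exercise the sketch.
SAMPLE = (
    "<html><head><style>p{color:red}</style></head><body>"
    "<p>Acme was founded in 2015.</p>"
    "<p>Cookie notice: we use cookies.</p>"
    "<p>Acme has 40 employees in Berlin.</p>"
    "</body></html>"
)

if __name__ == "__main__":
    print(run_pipeline(SAMPLE, ["Acme"]))
```

Because each document passes through the steps independently, a batch of documents could be processed in parallel, which is consistent with the abstract's remark about massively parallel hardware; the sketch itself runs sequentially.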