Abstract: Systems and methods provided herein extend task implementation in the web crawling process at the step where a customer submits a request to a web crawler. The systems and methods allow a more complex request for a web crawler to be defined in order to receive more specific data. In one aspect, a method for data extraction and gathering from a Network by a Service provider infrastructure includes the following steps: checking the parameters of a request received from a User's Device, adjusting the request parameters according to pre-established Scraping logic, selecting a Proxy according to the criteria of the pre-established Scraping logic, sending the adjusted request to the Target through the selected Proxy, checking metadata received from the Target, and forwarding the data to the User's Device.
Type:
Grant
Filed:
September 11, 2020
Date of Patent:
March 30, 2021
Assignee:
metacluster lt, UAB
Inventors:
Eivydas Vilcinskas, Martynas Juravicius, Giedrius Stalioraitis
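The six steps named in the abstract above (check parameters, adjust, select proxy, send, check metadata, forward) can be sketched roughly as follows. This is only an illustrative outline, not the patented implementation: every name here (`check_parameters`, `adjust_request`, `select_proxy`, `metadata_ok`, `handle_request`, the `logic` dictionary keys, and the injected `send` callable) is hypothetical, and matching a proxy by country is an assumed stand-in for whatever criteria the Scraping logic actually applies.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Proxy:
    address: str
    country: str


@dataclass
class Response:
    status: int
    body: str


def check_parameters(request: Dict) -> None:
    # Step 1: reject requests that lack an absolute http(s) URL.
    url = str(request.get("url", ""))
    if not url.startswith(("http://", "https://")):
        raise ValueError("request needs an absolute http(s) URL")


def adjust_request(request: Dict, logic: Dict) -> Dict:
    # Step 2: fill in defaults from the (hypothetical) scraping logic;
    # values supplied by the user take precedence.
    adjusted = dict(request)
    headers = dict(logic.get("default_headers", {}))
    headers.update(request.get("headers", {}))
    adjusted["headers"] = headers
    adjusted.setdefault("country", logic.get("default_country"))
    return adjusted


def select_proxy(proxies: List[Proxy], request: Dict) -> Proxy:
    # Step 3: assumed criterion, the first proxy in the requested country.
    for proxy in proxies:
        if proxy.country == request["country"]:
            return proxy
    raise LookupError("no proxy matches the request criteria")


def metadata_ok(response: Response) -> bool:
    # Step 5: assumed check, only HTTP 200 counts as a usable reply.
    return response.status == 200


def handle_request(
    request: Dict,
    logic: Dict,
    proxies: List[Proxy],
    send: Callable[[Dict, Proxy], Response],
) -> str:
    check_parameters(request)
    adjusted = adjust_request(request, logic)
    proxy = select_proxy(proxies, adjusted)
    response = send(adjusted, proxy)  # Step 4: transport is injected
    if not metadata_ok(response):
        raise RuntimeError(f"target replied with status {response.status}")
    return response.body  # Step 6: data forwarded to the caller
```

Passing `send` in as a parameter keeps the sketch free of any particular HTTP library and lets the flow be exercised with a fake transport.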
Abstract: The task logic of HTTP/HTTPS session statistics interception and collection is moved to the client side instead of the proxy layer. The encrypted HTTPS tunnel is terminated at the client end, making the actual content or data in transit invisible to both the proxies and the smart proxy rotator (SPR). The client's scraping software has a plug-in installed that expands its functionality. HTTP/HTTPS session quality metrics are intercepted and collected at the client side, then sent to the SPR. A proxy usage mark “can be used” is obtained from the SPR for the currently analyzed proxy, based on the results of the metrics analysis.
Type:
Grant
Filed:
October 1, 2019
Date of Patent:
April 28, 2020
Assignee:
metacluster lt, UAB
Inventors:
Martynas Juravicius, Eivydas Vilcinskas
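The division of labor described in the abstract above (metrics gathered on the client, verdict issued by the SPR) might look something like the sketch below. It is a simplified assumption-laden illustration, not the patented design: `SessionMetrics`, `SmartProxyRotator`, `fetch_with_metrics`, the two thresholds, and the rule that an unseen proxy is usable are all hypothetical, and the SPR is modeled as an in-process object rather than a remote service.

```python
import time
from dataclasses import dataclass
from typing import Callable, Dict, List, Tuple


@dataclass
class SessionMetrics:
    proxy: str
    status: int
    elapsed_s: float


class SmartProxyRotator:
    """Stand-in for the SPR: it receives metrics only, never page content."""

    def __init__(self, max_elapsed_s: float = 2.0, min_good_ratio: float = 0.5):
        # Both thresholds are made-up values for illustration.
        self.max_elapsed_s = max_elapsed_s
        self.min_good_ratio = min_good_ratio
        self._history: Dict[str, List[SessionMetrics]] = {}

    def report(self, metrics: SessionMetrics) -> None:
        self._history.setdefault(metrics.proxy, []).append(metrics)

    def can_be_used(self, proxy: str) -> bool:
        history = self._history.get(proxy, [])
        if not history:
            return True  # assumption: a proxy with no history is allowed
        good = sum(
            1
            for m in history
            if m.status == 200 and m.elapsed_s <= self.max_elapsed_s
        )
        return good / len(history) >= self.min_good_ratio


def fetch_with_metrics(
    proxy: str,
    fetch: Callable[[str], Tuple[int, str]],
    spr: SmartProxyRotator,
) -> str:
    # Client-side plug-in role: time the session, keep the body local,
    # and send only the quality metrics to the SPR.
    start = time.monotonic()
    status, body = fetch(proxy)
    elapsed = time.monotonic() - start
    spr.report(SessionMetrics(proxy, status, elapsed))
    return body
```

Note that `report` never receives the response body, which mirrors the abstract's point that content in transit stays invisible to the SPR.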