Patents by Inventor Martynas Juravicius
Martynas Juravicius has filed for patents to protect the following inventions. This listing includes patent applications that are pending as well as patents that have already been granted by the United States Patent and Trademark Office (USPTO).
-
Publication number: 20230018983
Abstract: Embodiments disclose a system that allows for improved generation of web requests for scraping that, because of the nature of the requests and time and manner they are sent out, appear more organic, as in human generated, than conventional automated scraping systems. The system then manages how a client request to scrape a target website is made to the site, masking the request in a manner that makes it appear to the Web server as if the request is not generated by an automated system. In this way, by appearing more organic, Web servers may be less likely to block requests from the disclosed system or may take longer to block requests from the disclosed system. By avoiding Web servers blocking requests and extending the lifetime of IP proxies before they are blocked, embodiments can use a limited IP proxy address space more efficiently.
Type: Application
Filed: July 12, 2021
Publication date: January 19, 2023
Applicant: Metacluster LT, UAB
Inventors: Eivydas Vilcinskas, Arnas Petruskevicius, Giedrius Stalioraitis, Martynas Juravicius, Rimantas Stankevicius
-
Publication number: 20220414166
Abstract: ADVANCED RESPONSE PROCESSING IN WEB DATA COLLECTION discloses processor-implemented apparatuses, methods, and systems of processing unstructured raw HTML responses collected in the context of a data collection service, the method comprising, in one embodiment, receiving raw unstructured HTML documents and extracting text data with associated meta information that may comprise style and formatting information. In some embodiments data field tags and values may be assigned to the text blocks extracted, classifying the data based on the processing of Machine Learning algorithms. Additionally, blocks of extracted data may be grouped and re-grouped together and presented as a single data point. In another embodiment the system may aggregate and present the text data with the associated meta information in a structured format. In certain embodiments the Machine Learning model may be a model trained on a pre-created training data set labeled manually or in an automatic fashion.
Type: Application
Filed: July 1, 2022
Publication date: December 29, 2022
Applicant: Metacluster LT, UAB
Inventors: Martynas JURAVICIUS, Andrius KUKSTA
-
Publication number: 20220414397
Abstract: Systems and methods that allow examination of response data collected from content providers and provide for classification and routing according to the classification. The process of classification employs an unsupervised, or partially unsupervised, Machine Learning classifier model for identifying data collection responses that contains no data, mangled data, or a block, for assigning a classification correspondingly and for feeding the classification decision back to a data collection platform.
Type: Application
Filed: August 30, 2022
Publication date: December 29, 2022
Applicant: METACLUSTER LT, UAB
Inventors: Martynas JURAVICIUS, Andrius KUKSTA
-
Patent number: 11470174
Abstract: Systems and methods of task implementation are extended as provided herein and target the web crawling process through a step of submitting a request by a customer to a web crawler. The systems and methods allow a more complex request for a web crawler to be defined in order to receive more specific data. In one aspect, a method for data extraction and gathering from a Network by a Service provider infrastructure include the following steps: checking the parameters of a request received from a User's Device, adjusting the request parameters according to pre-established Scraping logic, selecting a Proxy according to the criteria of the pre-established Scraping logic, sending the adjusted request to the Target through the selected Proxy, checking metadata received from the Target, and forwarding the data to the User's device.
Type: Grant
Filed: April 22, 2022
Date of Patent: October 11, 2022
Assignee: METACLUSTER LT, UAB
Inventors: Eivydas Vilcinskas, Martynas Juravicius, Giedrius Stalioraitis
-
Patent number: 11468137
Abstract: Systems and methods to intelligently optimize data collection requests are disclosed. In one embodiment, systems are configured to identify and select a complete set of suitable parameters to execute the data collection requests. In another embodiment, systems are configured to identify and select a partial set of suitable parameters to execute the data collection requests. The present embodiments can implement machine learning algorithms to identify and select the suitable parameters according to the nature of the data collection requests and the targets. Moreover, the embodiments provide systems and methods to generate feedback data based upon the effectiveness of the data collection parameters. Furthermore, the embodiments provide systems and methods to score the set of suitable parameters based on the feedback data and the overall cost, which are then stored in an internal database.
Type: Grant
Filed: March 22, 2022
Date of Patent: October 11, 2022
Assignee: METACLUSTER LT, UAB
Inventors: Martynas Juravicius, Erikas Bulba, Mantas Briliauskas
-
Publication number: 20220318564
Abstract: Systems and methods that allow examination of response data collected from content providers and provide for classification and routing according to the classification. The process of classification employs an unsupervised, or partially unsupervised, Machine Learning classifier model for identifying data collection responses that contains no data, mangled data, or a block, for assigning a classification correspondingly and for feeding the classification decision back to a data collection platform.
Type: Application
Filed: March 30, 2021
Publication date: October 6, 2022
Applicant: METACLUSTER LT, UAB
Inventors: Martynas JURAVICIUS, Andrius KUKSTA
-
Patent number: 11461588
Abstract: Systems and methods that allow examination of response data collected from content providers and provide for classification and routing according to the classification. The process of classification employs an unsupervised, or partially unsupervised, Machine Learning classifier model for identifying data collection responses that contains no data, mangled data, or a block, for assigning a classification correspondingly and for feeding the classification decision back to a data collection platform.
Type: Grant
Filed: March 30, 2021
Date of Patent: October 4, 2022
Assignee: METACLUSTER LT, UAB
Inventors: Martynas Juravicius, Andrius Kuksta
-
Patent number: 11416291
Abstract: Embodiments disclose a system that allows for improved generation of web requests for scraping that, because of the nature of the requests and time and manner they are sent out, appear more organic, as in human generated, than conventional automated scraping systems. The system then manages how a client request to scrape a target website is made to the site, masking the request in a manner that makes it appear to the Web server as if the request is not generated by an automated system. In this way, by appearing more organic, Web servers may be less likely to block requests from the disclosed system or may take longer to block requests from the disclosed system. By avoiding Web servers blocking requests and extending the lifetime of IP proxies before they are blocked, embodiments can use a limited IP proxy address space more efficiently.
Type: Grant
Filed: July 12, 2021
Date of Patent: August 16, 2022
Assignee: Metacluster LT, UAB
Inventors: Eivydas Vilcinskas, Arnas Petruskevicius, Giedrius Stalioraitis, Martynas Juravicius, Rimantas Stankevicius
-
Patent number: 11416564
Abstract: Embodiments disclose a system that allows for improved generation of web requests for scraping that, because of the nature of the requests and time and manner they are sent out, appear more organic, as in human generated, than conventional automated scraping systems. The system then manages how a client request to scrape a target website is made to the site, masking the request in a manner that makes it appear to the Web server as if the request is not generated by an automated system. In this way, by appearing more organic, Web servers may be less likely to block requests from the disclosed system or may take longer to block requests from the disclosed system. By avoiding Web servers blocking requests and extending the lifetime of IP proxies before they are blocked, embodiments can use a limited IP proxy address space more efficiently.
Type: Grant
Filed: July 12, 2021
Date of Patent: August 16, 2022
Assignee: Metacluster LT, UAB
Inventors: Eivydas Vilcinskas, Arnas Petruskevicius, Giedrius Stalioraitis, Martynas Juravicius, Rimantas Stankevicius
-
Publication number: 20220247829
Abstract: Systems and methods of task implementation are extended as provided herein and target the web crawling process through a step of submitting a request by a customer to a web crawler. The systems and methods allow a more complex request for a web crawler to be defined in order to receive more specific data. In one aspect, a method for data extraction and gathering from a Network by a Service provider infrastructure include the following steps: checking the parameters of a request received from a User's Device, adjusting the request parameters according to pre-established Scraping logic, selecting a Proxy according to the criteria of the pre-established Scraping logic, sending the adjusted request to the Target through the selected Proxy, checking metadata received from the Target, and forwarding the data to the User's device.
Type: Application
Filed: April 22, 2022
Publication date: August 4, 2022
Applicant: METACLUSTER LT, UAB
Inventors: Eivydas Vilcinskas, Martynas Juravicius, Giedrius Stalioraitis
-
Patent number: 11379542
Abstract: ADVANCED RESPONSE PROCESSING IN WEB DATA COLLECTION discloses processor-implemented apparatuses, methods, and systems of processing unstructured raw HTML responses collected in the context of a data collection service, the method comprising, in one embodiment, receiving raw unstructured HTML documents and extracting text data with associated meta information that may comprise style and formatting information. In some embodiments data field tags and values may be assigned to the text blocks extracted, classifying the data based on the processing of Machine Learning algorithms. Additionally, blocks of extracted data may be grouped and re-grouped together and presented as a single data point. In another embodiment the system may aggregate and present the text data with the associated meta information in a structured format. In certain embodiments the Machine Learning model may be a model trained on a pre-created training data set labeled manually or in an automatic fashion.
Type: Grant
Filed: June 25, 2021
Date of Patent: July 5, 2022
Assignee: Metacluster LT, UAB
Inventors: Martynas Juravicius, Andrius Kuksta
-
Patent number: 11372937
Abstract: Embodiments disclose a system that allows for improved generation of web requests for scraping that, because of the nature of the requests and time and manner they are sent out, appear more organic, as in human generated, than conventional automated scraping systems. The system then manages how a client request to scrape a target website is made to the site, masking the request in a manner that makes it appear to the Web server as if the request is not generated by an automated system. In this way, by appearing more organic, Web servers may be less likely to block requests from the disclosed system or may take longer to block requests from the disclosed system. By avoiding Web servers blocking requests and extending the lifetime of IP proxies before they are blocked, embodiments can use a limited IP proxy address space more efficiently.
Type: Grant
Filed: July 12, 2021
Date of Patent: June 28, 2022
Assignee: Metacluster LT, UAB
Inventors: Eivydas Vilcinskas, Arnas Petruskevicius, Giedrius Stalioraitis, Martynas Juravicius, Rimantas Stankevicius
-
Patent number: 11343342
Abstract: Systems and methods of task implementation are extended as provided herein and target the web crawling process through a step of submitting a request by a customer to a web crawler. The systems and methods allow a more complex request for a web crawler to be defined in order to receive more specific data. In one aspect, a method for data extraction and gathering from a Network by a Service provider infrastructure include the following steps: checking the parameters of a request received from a User's Device, adjusting the request parameters according to pre-established Scraping logic, selecting a Proxy according to the criteria of the pre-established Scraping logic, sending the adjusted request to the Target through the selected Proxy, checking metadata received from the Target, and forwarding the data to the User's device.
Type: Grant
Filed: June 30, 2021
Date of Patent: May 24, 2022
Assignee: METACLUSTER LT, UAB
Inventors: Eivydas Vilcinskas, Martynas Juravicius, Giedrius Stalioraitis
-
Patent number: 11314833
Abstract: Systems and methods to intelligently optimize data collection requests are disclosed. In one embodiment, systems are configured to identify and select a complete set of suitable parameters to execute the data collection requests. In another embodiment, systems are configured to identify and select a partial set of suitable parameters to execute the data collection requests. The present embodiments can implement machine learning algorithms to identify and select the suitable parameters according to the nature of the data collection requests and the targets. Moreover, the embodiments provide systems and methods to generate feedback data based upon the effectiveness of the data collection parameters. Furthermore, the embodiments provide systems and methods to score the set of suitable parameters based on the feedback data and the overall cost, which are then stored in an internal database.
Type: Grant
Filed: November 9, 2021
Date of Patent: April 26, 2022
Assignee: METACLUSTER LT, UAB
Inventors: Martynas Juravicius, Erikas Bulba, Mantas Briliauskas
-
Publication number: 20220100808
Abstract: Systems and methods of task implementation are extended as provided herein and target the web crawling process through a step of submitting a request by a customer to a web crawler. The systems and methods allow a request for a web crawler to be enriched with a customized browsing profile in order to be categorized as an organic human user to obtain targeted content.
Type: Application
Filed: September 29, 2020
Publication date: March 31, 2022
Applicant: metacluster lt, UAB
Inventor: Martynas Juravicius
-
Patent number: 11281730
Abstract: Embodiments disclose a system that allows for improved generation of web requests for scraping that, because of the nature of the requests and time and manner they are sent out, appear more organic, as in human generated, than conventional automated scraping systems. The system then manages how a client request to scrape a target website is made to the site, masking the request in a manner that makes it appear to the Web server as if the request is not generated by an automated system. In this way, by appearing more organic, Web servers may be less likely to block requests from the disclosed system or may take longer to block requests from the disclosed system. By avoiding Web servers blocking requests and extending the lifetime of IP proxies before they are blocked, embodiments can use a limited IP proxy address space more efficiently.
Type: Grant
Filed: July 12, 2021
Date of Patent: March 22, 2022
Assignee: Metacluster LT, UAB
Inventors: Eivydas Vilcinskas, Arnas Petruskevicius, Giedrius Stalioraitis, Martynas Juravicius, Rimantas Stankevicius
-
Publication number: 20220086248
Abstract: Systems and methods of task implementation are extended as provided herein and target the web crawling process through a step of submitting a request by a customer to a web crawler. The systems and methods allow a more complex request for a web crawler to be defined in order to receive more specific data. In one aspect, a method for data extraction and gathering from a Network by a Service provider infrastructure include the following steps: checking the parameters of a request received from a User's Device, adjusting the request parameters according to pre-established Scraping logic, selecting a Proxy according to the criteria of the pre-established Scraping logic, sending the adjusted request to the Target through the selected Proxy, checking metadata received from the Target, and forwarding the data to the User's device.
Type: Application
Filed: June 30, 2021
Publication date: March 17, 2022
Applicant: METACLUSTER LT, UAB
Inventors: Eivydas Vilcinskas, Martynas Juravicius, Giedrius Stalioraitis
-
Patent number: 11204971
Abstract: Embodiments disclose a system that allows for improved generation of web requests for scraping that, because of the nature of the requests and time and manner they are sent out, appear more organic, as in human generated, than conventional automated scraping systems. The system then manages how a client request to scrape a target website is made to the site, masking the request in a manner that makes it appear to the Web server as if the request is not generated by an automated system. In this way, by appearing more organic, Web servers may be less likely to block requests from the disclosed system or may take longer to block requests from the disclosed system. By avoiding Web servers blocking requests and extending the lifetime of IP proxies before they are blocked, embodiments can use a limited IP proxy address space more efficiently.
Type: Grant
Filed: July 12, 2021
Date of Patent: December 21, 2021
Assignee: Metacluster LT, UAB
Inventors: Eivydas Vilcinskas, Arnas Petruskevicius, Giedrius Stalioraitis, Martynas Juravicius, Rimantas Stankevicius
-
Patent number: 11140235
Abstract: Systems and methods of task implementation are extended as provided herein and target the web crawling process through a step of submitting a request by a customer to a web crawler. The systems and methods allow a more complex request for a web crawler to be defined in order to receive more specific data. In one aspect, a method for data extraction and gathering from a Network by a Service provider infrastructure include the following steps: checking the parameters of a request received from a User's Device, adjusting the request parameters according to pre-established Scraping logic, selecting a Proxy according to the criteria of the pre-established Scraping logic, sending the adjusted request to the Target through the selected Proxy, checking metadata received from the Target, and forwarding the data to the User's device.
Type: Grant
Filed: February 25, 2021
Date of Patent: October 5, 2021
Assignee: METACLUSTER LT, UAB
Inventors: Eivydas Vilcinskas, Martynas Juravicius, Giedrius Stalioraitis
-
Patent number: 10965770
Abstract: Systems and methods of task implementation are extended as provided herein and target the web crawling process through a step of submitting a request by a customer to a web crawler. The systems and methods allow a more complex request for a web crawler to be defined in order to receive more specific data. In one aspect, a method for data extraction and gathering from a Network by a Service provider infrastructure include the following steps: checking the parameters of a request received from a User's Device, adjusting the request parameters according to pre-established Scraping logic, selecting a Proxy according to the criteria of the pre-established Scraping logic, sending the adjusted request to the Target through the selected Proxy, checking metadata received from the Target, and forwarding the data to the User's device.
Type: Grant
Filed: September 11, 2020
Date of Patent: March 30, 2021
Assignee: metacluster lt, UAB
Inventors: Eivydas Vilcinskas, Martynas Juravicius, Giedrius Stalioraitis