Abstract: A smart spider and web scraping system with custom templates, artificial intelligence for custom attribute extraction, artificial intelligence for providing quick fixes for machine learning extracted web page data, and direct HTTP request extraction without crawling.
Type:
Application
Filed:
October 24, 2024
Publication date:
May 1, 2025
Applicant:
Zyte Group Limited
Inventors:
Mikhail Korobov, Konstantin Lopukhin, Kevin Bernal, Javier Casas, Rakesh Mehta, Cristi Constantin, Iván Sánchez, Nikita Vostretsov, Taras Shevchenko
Abstract: A web scraping system configured with artificial intelligence and image object detection. The system processes a web page with a neural network to perform object detection to obtain structured data, including text, image and other kinds of data, from web pages. The neural network allows the system to efficiently process visual information (including screenshots), text content and HTML structure to achieve good quality and decrease extraction time.