minimal_web_scraper.main
Overview
Functions
- minimal_web_scraper.main.download(target_url: str, timeout: int = 1) → tuple[bytes, str | None]
Download the HTML content of the URL using the requests library.
A custom header is sent with the request.
- Parameters:
target_url (str) – url to download
- Raises:
ValueError – if the target_url is not a valid URL.
Exceptions raised by the requests and urlparse libraries may also propagate.
- Returns:
the content of the HTML page (bytes) and its encoding (str, or None)
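The documented contract (a ValueError for an invalid URL, and a `(bytes, str | None)` return value) can be illustrated with a small standard-library sketch. Both `is_valid_url` and `decode_page` are hypothetical helpers that are not part of this module, and the UTF-8 fallback for a `None` encoding is an assumption rather than documented behaviour.

```python
from __future__ import annotations

from urllib.parse import urlparse


def is_valid_url(target_url: str) -> bool:
    """Hypothetical check: accept only http(s) URLs that include a host.

    The real validation inside download() is not documented; this only
    shows one way a ValueError condition could be detected.
    """
    parts = urlparse(target_url)
    return parts.scheme in ("http", "https") and bool(parts.netloc)


def decode_page(content: bytes, encoding: str | None) -> str:
    """Decode a (content, encoding) pair such as the one download() returns.

    Assumption: fall back to UTF-8 when the encoding is None, and replace
    bytes that cannot be decoded instead of raising.
    """
    return content.decode(encoding or "utf-8", errors="replace")


# Possible usage with the documented function (requires network access):
# content, encoding = download("https://example.com", timeout=5)
# html = decode_page(content, encoding)
```

Keeping the decoding step outside `download` leaves the caller free to choose a fallback when no encoding is reported.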
- minimal_web_scraper.main.scrape(url: str) → Any
Orchestrate the download and parsing of the resource at the URL.
- Parameters:
url (str) – URL to parse
- Returns:
information extracted by an implemented parsers.BaseParser.parse()
- Raises:
parsers.exceptions.ParserNotFound()
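The orchestration described for scrape() (select a parser, download, parse, and fail with ParserNotFound when no parser applies) might look roughly like the sketch below. Everything in it is an assumption made for illustration: the classes are local stand-ins for parsers.BaseParser and parsers.exceptions.ParserNotFound, the `parse(content, encoding)` signature is guessed, and the lookup by hostname, the `PARSERS` registry, `TitleParser`, and the injected `fetch` callable are all hypothetical.

```python
from __future__ import annotations

from typing import Any, Callable
from urllib.parse import urlparse


class ParserNotFound(Exception):
    """Local stand-in for parsers.exceptions.ParserNotFound."""


class BaseParser:
    """Local stand-in for parsers.BaseParser; subclasses implement parse()."""

    def parse(self, content: bytes, encoding: str | None) -> Any:
        raise NotImplementedError


class TitleParser(BaseParser):
    """Hypothetical parser that extracts the text of the <title> element."""

    def parse(self, content: bytes, encoding: str | None) -> Any:
        text = content.decode(encoding or "utf-8", errors="replace")
        start = text.find("<title>")
        end = text.find("</title>")
        if start == -1 or end == -1:
            return None
        return text[start + len("<title>"):end].strip()


# Hypothetical registry: which parser handles which hostname.
PARSERS: dict[str, BaseParser] = {"example.com": TitleParser()}


def scrape_sketch(
    url: str,
    fetch: Callable[[str], tuple[bytes, str | None]],
) -> Any:
    """Pick a parser for the URL's host, fetch the page, and parse it."""
    host = urlparse(url).hostname or ""
    parser = PARSERS.get(host)
    if parser is None:
        raise ParserNotFound(f"no parser registered for {host!r}")
    content, encoding = fetch(url)
    return parser.parse(content, encoding)
```

Passing the downloader in as `fetch` keeps the sketch runnable offline; a function with the same shape as download() could be supplied in its place.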
Attributes
- minimal_web_scraper.main.headers