minimal_web_scraper.parsers.base
Overview
BaseParser | Base class for creating custom parsers.
Classes
- class minimal_web_scraper.parsers.base.BaseParser
Base class for creating custom parsers.
Subclasses must override parse() and scope_urls. Parsers are not intended to be instantiated.
Overview
Attributes
scope_urls | Define which URLs the parser is intended to parse.
Members
- scope_urls: list[str]
Define which URLs the parser is intended to parse.
- abstract classmethod parse(html_content: bytes, encoding: str | None) → Any
Abstract method to parse HTML chunks.
- Parameters:
html_content – the raw HTML to parse
encoding – the associated encoding of the HTML
- Returns:
the extracted elements
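The following is a minimal sketch of what a custom parser might look like, based only on the interface documented above. To keep it self-contained, it defines a local stand-in for BaseParser with the same documented shape instead of importing the real class. The names LinkParser and _LinkCollector, the example URL, and the fallback to UTF-8 when encoding is None are all assumptions made for illustration. The exact format expected in scope_urls (full URLs, prefixes, or patterns) is not stated here, so a plain URL string is used.

```python
from __future__ import annotations

from abc import ABC, abstractmethod
from html.parser import HTMLParser
from typing import Any


class BaseParser(ABC):
    """Local stand-in mirroring the documented interface (an assumption,
    not the library's actual implementation)."""

    scope_urls: list[str]

    @classmethod
    @abstractmethod
    def parse(cls, html_content: bytes, encoding: str | None) -> Any:
        """Parse an HTML chunk and return the extracted elements."""


class _LinkCollector(HTMLParser):
    """Hypothetical helper: collects href values from <a> tags."""

    def __init__(self) -> None:
        super().__init__()
        self.links: list[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                self.links.append(value)


class LinkParser(BaseParser):
    """Hypothetical parser that extracts link targets from a page."""

    # Assumed format: a plain URL string.
    scope_urls = ["https://example.com/"]

    @classmethod
    def parse(cls, html_content: bytes, encoding: str | None) -> list[str]:
        # Assumption: fall back to UTF-8 when no encoding is supplied.
        text = html_content.decode(encoding or "utf-8", errors="replace")
        collector = _LinkCollector()
        collector.feed(text)
        collector.close()
        return collector.links


# Parsers are used through the class itself, never instantiated.
links = LinkParser.parse(b'<a href="/a">A</a><p>x</p><a href="/b">B</a>', None)
```

Because parse() is a classmethod, the subclass is called directly, which matches the note that parsers are not intended to be instantiated.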