Installation#

This page guides you through installing Python and setting up your project environment. If you already have Python on your machine and know how to use a virtual environment, you can skip to Download the library.

The project name used in this example is my_first_scraper; feel free to replace it with your own project name.

Git#

You will also need Git for this tutorial, because the library is installed directly from its Git repository.

Check Git version#

Before installing Git, you may want to check whether it is already installed. Most Linux distributions come with Git.

Open a terminal:

$ git --version

If Git is installed, the command should return something like git version X.X.X. If so, you can skip to Python; otherwise, continue with the next section.
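If Python is already available on the machine, the same check can be scripted, which may help when preparing several machines. The following is a minimal sketch using only the standard library; the helper name git_version is illustrative and not part of any tool described in this guide.

```python
import shutil
import subprocess
from typing import Optional


def git_version() -> Optional[str]:
    """Return the output of `git --version`, or None if Git is unavailable."""
    # shutil.which looks the executable up on PATH, as the shell would.
    git_path = shutil.which("git")
    if git_path is None:
        return None
    result = subprocess.run(
        [git_path, "--version"],
        capture_output=True,
        text=True,
        check=False,  # a failing command is reported as None, not as an exception
    )
    if result.returncode != 0:
        return None
    return result.stdout.strip()


if __name__ == "__main__":
    version = git_version()
    print(version if version else "Git was not found on PATH")
```

A return value of None corresponds to the case where Git still has to be installed.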

Git Installation#

Official guide: https://git-scm.com/book/en/v2/Getting-Started-Installing-Git

Windows#

Download Git from the official site: https://git-scm.com/download/win.

Follow the installer's instructions, then verify the installation by running the command in Check Git version.

Linux#

On Debian-based distributions (such as Ubuntu), execute the following commands:

$ sudo apt update
$ sudo apt install git

Verify the installation by running the command in Check Git version.

Python#

Check Python version#

Before installing Python, you may want to check whether it is already installed. Most Linux distributions come with Python.

Open a terminal:

$ python --version

If Python is installed, the command should return something like Python 3.10 or a higher version. If so, you can skip to Project creation; otherwise, continue with the next section.
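The version can also be checked from inside Python itself. The sketch below assumes that 3.10 is the minimum version, based on the example output above; the names MINIMUM_VERSION and meets_minimum are illustrative.

```python
import sys
from typing import Sequence, Tuple

# Assumption: 3.10 is treated as the minimum, following the example output above.
MINIMUM_VERSION: Tuple[int, int] = (3, 10)


def meets_minimum(
    version: Sequence[int], minimum: Tuple[int, int] = MINIMUM_VERSION
) -> bool:
    """Return True if the (major, minor, ...) version is at least `minimum`."""
    # Only major and minor are compared; tuples compare element by element.
    return tuple(version[:2]) >= minimum


if __name__ == "__main__":
    current = sys.version_info
    print(f"Running Python {current.major}.{current.minor}.{current.micro}")
    print("OK" if meets_minimum(current) else "Older than the assumed minimum")
```

Comparing tuples rather than version strings avoids ordering mistakes such as "3.9" sorting after "3.10".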

Python Installation#

Alternative guide:

Windows#

Official guides for Windows:

Download the Python Windows installer for your system and follow its instructions. I recommend checking the Add Python to PATH box so that Python is available from the command line.

Linux#

If your distribution is not represented here, use its package manager.

$ sudo apt update
$ sudo apt install python3

You can choose which version to install by appending its number to the package name. For example, to install Python 3.11 on Debian:

$ sudo apt install python3.11

Verify that Python is installed by running the command in Check Python version. On Debian-based systems, the interpreter may be available as python3 rather than python.

Project creation#

To create the project folder:

$ mkdir my_first_scraper
$ cd my_first_scraper

Virtual environment#

Official documentation of venv: https://docs.python.org/3/library/venv.html.

It is recommended to use a virtual environment to manage dependencies:

$ python -m venv venv

Your project directory should now look like this:

my_first_scraper
|
- venv
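The environment can also be created from Python with the standard-library venv module, which is what python -m venv runs. This is a minimal sketch; the helper name create_project_env is illustrative, and the folder names follow the example above.

```python
import venv
from pathlib import Path


def create_project_env(project_dir: Path, with_pip: bool = True) -> Path:
    """Create <project_dir>/venv and return the path of the environment."""
    env_dir = project_dir / "venv"
    # Create the project folder first; this is a no-op if it already exists.
    project_dir.mkdir(parents=True, exist_ok=True)
    # with_pip=True installs pip into the new environment,
    # matching the default behaviour of `python -m venv`.
    venv.create(env_dir, with_pip=with_pip)
    return env_dir
```

Calling create_project_env(Path("my_first_scraper")) from the parent folder should produce the same layout as the commands above.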

minimal-web-scraper installation#

Activation of the environment#

In a terminal, activate the newly created virtual environment. On Windows, run the command below; on Linux, run source venv/bin/activate instead:

$ venv\Scripts\activate.bat

You should now see (venv) prepended to your command-line prompt.

Note

PowerShell may refuse to run the activation script and throw an error. See about Execution policies.

To check which Python interpreter is active, execute the following command in PowerShell (in cmd, where python gives similar information):

(venv)$ Get-Command python

If the returned path is not inside the project's venv sub-directory, run the activation command above again.
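A shell-independent way to run the same check is to ask the interpreter itself. The sketch below relies only on the standard sys module and should behave the same in cmd, PowerShell, and Linux shells; the helper name in_virtualenv is illustrative.

```python
import sys


def in_virtualenv() -> bool:
    """Return True when the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base Python installation.
    return sys.prefix != sys.base_prefix


if __name__ == "__main__":
    print("Interpreter:", sys.executable)
    print("Virtual environment active:", in_virtualenv())
```

When the environment is active, the printed interpreter path should be located inside the project's venv folder.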

Download the library#

To use the library, you first need to install it. The standard tool for that is pip:

(venv)$ pip install git+https://github.com/Gamma120/minimal-web-scraper.git

Verify the library is installed:

(venv)$ pip list

minimal-web-scraper should appear in the returned list.
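The installation can also be verified programmatically with importlib.metadata from the standard library (Python 3.8 and later). This sketch assumes the package is registered under the name shown by pip list, minimal-web-scraper; the helper name installed_version is illustrative.

```python
from importlib import metadata
from typing import Optional


def installed_version(distribution: str) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(distribution)
    except metadata.PackageNotFoundError:
        return None


if __name__ == "__main__":
    # Assumption: the distribution name matches the one listed by `pip list`.
    name = "minimal-web-scraper"
    version = installed_version(name)
    if version is None:
        print(f"{name} is not installed in this environment")
    else:
        print(f"{name} {version} is installed")
```

Run it with the virtual environment activated, since the lookup only covers packages visible to the interpreter that executes the script.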

Next#

Next, you will see how to use the library to create your first scraper.