KTH Logo

Working with Python Virtual Environments


Note: This post has been updated to reflect the modules on Dardel (December 2021).


When we use Python in our work or personal projects, it is often necessary to use a number of packages that are not distributed as standard Python libraries. We therefore need to install those packages based on the specific requirements of every project. In the scenario of working on multiple projects, it is not uncommon that different projects have conflicting requirements in terms of packages. For example, project A may require version 1.0 of a certain package, while project B may require version 2.0 of the same package. A solution to this conflict is to separate the packages for different projects or purposes with the help of a so-called “virtual environment”.

A Python virtual environment is an isolated run-time environment that makes it possible to install and execute Python packages without interfering with the outside world. Without a virtual environment, Python packages are installed either in the system site directory, which can be located via the following command:

$ python -c 'import site; print(site.getsitepackages())'

or in the so-called Python user base, which is usually in the “$HOME/.local” folder. A Python package installed in this way can have only one version, and it is therefore not possible to work with two or more projects that have conflicting requirements regarding the versions of a certain Python package. With the help of a virtual environment, we can have different Python site directories for different projects and have those site directories isolated from each other and from the system site directory.

This blog post will briefly introduce two ways of creating and managing a Python virtual environment: venv or conda.

(more…)

Skip the configuration, get to the cluster: Docker way

Sometimes it can be a daunting task to get all the Kerberos and SSH configurations right on your first attempt at using the PDC systems. Nearly every day PDC Support receives a number of help requests and questions from researchers who have run into configuration problems. So PDC has introduced an alternative way of logging in to our clusters by using Docker containers with pre-configured Kerberos and SSH files.

What is Docker?

Docker is a tool used to deploy and run single or many applications within what are known as containers. Containers are packed with all the parts that are needed to run an application, such as the actual application, any relevant tools, libraries or other necessary information. Each Docker container is delivered as a single package with all the necessary material included. This means that the person using the Docker container does not have to worry about installing any new tools/libraries or configuring them before using them. Since everything is preloaded, the applications inside the containers are ready-to-use and can be executed regardless of any customized settings on the host machine.

Later in this blog article, we will talk about a Docker container that we have developed for the purpose of logging in to PDC clusters.

(more…)