18  Available Modules & Packages

Modules and packages allow code re-use and distribution. There are several modules and packages available which provide definitions (functions, constants, classes) for a variety of use cases.

There are three main sources of modules and packages for use in Python code.

18.1 Built-in

These are built-in objects that are always available. They are loaded by default and do not need the use of import.
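As a minimal sketch of what "always available" means, the snippet below calls a few built-in functions without importing anything. The final check imports the builtins module only to show that the built-in names live there; the variable names are illustrative.

```python
import builtins

# Built-in functions and types can be used without any import statement.
numbers = [3, 1, 2]
total = sum(numbers)
largest = max(numbers)
ordered = sorted(numbers)

# The same objects also live in the `builtins` module, which Python
# makes available automatically.
same_object = builtins.len is len

print(total, largest, ordered, same_object)
```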

18.2 Standard library

The standard library consists of several modules (files and packages) available with a standard installation.

These have to be loaded using import and are not loaded by default like the built-in objects.

Some of these, like math and sys, are written in C for speed.

The full list of available modules, with documentation, can be found in the Python docs: library reference.
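A short sketch of loading standard library modules with import is given below. It uses only math, sys and statistics; the values are arbitrary examples.

```python
import math
import sys

# Names from a standard library module are reached through the module name.
root = math.sqrt(16)
circle_area = math.pi * 2 ** 2  # area of a circle with radius 2

# `from ... import ...` binds selected names directly.
from statistics import mean
average = mean([1, 2, 3, 4])

print(root, round(circle_area, 2), average)
print(sys.version_info.major)   # major version of the running interpreter
```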

18.2.1 Frequently used modules

| Topic | Module | Description |
|---|---|---|
| Python Runtime Services | sys | System-specific parameters and functions |
| Generic Operating System Services | os | Miscellaneous operating system interfaces |
| | io | Core tools for working with streams |
| | time | Time access and conversions |
| | argparse | Parser for command-line options, arguments and sub-commands |
| File and Directory Access | pathlib | Object-oriented filesystem paths |
| | os.path | Common pathname manipulations |
| | glob | Unix style pathname pattern expansion |
| | shutil | High-level file operations |
| Data Persistence | sqlite3 | DB-API 2.0 interface for SQLite databases |
| File Formats | csv | CSV File Reading and Writing |
| Functional Programming Modules | itertools | Functions creating iterators for efficient looping |
| | functools | Higher-order functions and operations on callable objects |
| | operator | Standard operators as functions |
| Data Types | datetime | Basic date and time types |
| | zoneinfo | IANA time zone support |
| Text Processing Services | re | Regular expression operations |
| Numeric and Mathematical Modules | math | Basic math |
| | statistics | Statistics |
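To give a feel for a few of the modules listed above, here is a small sketch that touches pathlib, itertools, functools, operator, re and datetime. The path and the strings are made-up examples, not taken from any real project.

```python
from datetime import date
from functools import reduce
from itertools import combinations
from operator import mul
from pathlib import PurePosixPath
import re

# pathlib: take a path apart without touching the filesystem.
report = PurePosixPath("data/2024/report.csv")  # hypothetical path
print(report.suffix, report.stem, report.parent)

# itertools: all 2-element combinations of a sequence.
pairs = list(combinations("abc", 2))

# functools + operator: fold a sequence with a standard operator.
product = reduce(mul, [1, 2, 3, 4])

# re: extract all runs of digits from a string.
digits = re.findall(r"\d+", "room 12, floor 3")

# datetime: date arithmetic (February 2024 is in a leap year).
days = (date(2024, 3, 1) - date(2024, 2, 1)).days

print(pairs, product, digits, days)
```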

18.3 Python package index (PyPI)

PyPI is the public repository of open source packages contributed by the Python community. These are external packages which have to be installed before they can be loaded using import.

As Python is one of the most popular languages, it is easy to find a package for almost every use case by searching the web or PyPI.

pip is the installer for external packages on PyPI.
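Because external packages must be installed before they can be imported, it can be handy to check what is present in the current environment. The sketch below uses importlib.metadata from the standard library; the helper name installed_version and the deliberately non-existent package name are assumptions made for illustration.

```python
from importlib import metadata


def installed_version(package_name):
    """Return the installed version of a distribution, or None if absent."""
    try:
        return metadata.version(package_name)
    except metadata.PackageNotFoundError:
        return None


# The last name is a hypothetical package that should not exist.
for name in ("pip", "numpy", "surely-not-a-real-package-xyz"):
    version = installed_version(name)
    if version is None:
        print(f"{name}: not installed (python3 -m pip install {name})")
    else:
        print(f"{name}: {version}")
```

What this prints depends on the environment it runs in.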

18.3.1 Some important packages

| Topic | Module | Description |
|---|---|---|
| Installation | pip | Python package installer |
| Jupyter | jupyterlab | JupyterLab, interactive notebooks |
| Scientific Computing | numpy | Fundamental package for scientific computing with Python |
| | scipy | Mathematics, science, and engineering. It includes modules for statistics, optimization, integration, linear algebra, Fourier transforms, signal and image processing, ODE solvers, and more. |
| Data Analysis | pandas | Data structures and operations helpful for data analysis. Dataframes, series, … |
| Data Visualization | matplotlib | Generic visualization |
| | seaborn | Based on matplotlib, for statistical visualizations |

18.4 Virtual Environments

Virtual environments are used to create separate installation locations for external packages, which helps with the following:

  • keep installation of external packages organized for different tasks
  • keep track and manage version requirements for external packages needed for specific projects

At a high level, a virtual environment isolates external packages from the base installation (built-ins and the standard library).

18.4.1 Why use a virtual env?

The external packages keep releasing new versions for bug fixes, enhancements and new features.

When working on projects which depend on external packages, it is critical to keep track of the versions of those packages, as the latest version might no longer support a feature that the project code needs.

When installing packages in the base installation, there might be a conflict between the dependencies of different projects. Some projects might depend on one version of a package while other projects depend on a different version of the same package.

Some projects might need a specific version of Python itself.

Some packages might be needed only for one-time tasks. Installing them in the base installation adds to the complexity of managing packages which are no longer needed.

All these reasons led to the development of virtual environments, and their use is recommended.

18.4.2 Usage

  • How to create a virtual environment?
    • the venv module is part of the standard library and can be run from a shell such as bash
    • unix/mac: python3 -m venv <path to new venv>
    • win: py [-v] -m venv <path to new venv>

This creates a folder with the given name at the given path. It is often helpful to give virtual environments a name starting with a dot, e.g. .py-venv. The name identifies the folder as a Python virtual environment, and on Unix-like systems the leading dot keeps the folder hidden, which keeps the folder view clean since the venv folder is seldom used directly.

It is recommended to specify the Python version explicitly in these commands when creating and restoring a venv.
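The same venv module can also be called from Python code. The sketch below is only an illustration under a few assumptions: it creates the environment in a temporary directory so nothing is left behind, and it passes with_pip=False to keep the example fast, whereas the command-line form installs pip by default.

```python
import tempfile
import venv
from pathlib import Path

# Create the environment inside a throwaway directory; in a real
# project the path would be chosen by you, e.g. <project root>/.py-venv.
with tempfile.TemporaryDirectory() as tmp:
    env_dir = Path(tmp) / ".py-venv"
    # with_pip=False skips bootstrapping pip to keep this sketch fast.
    venv.create(env_dir, with_pip=False)
    # Every venv contains a pyvenv.cfg file describing the environment.
    config_exists = (env_dir / "pyvenv.cfg").is_file()
    created = sorted(p.name for p in env_dir.iterdir())

print(config_exists)
print(created)
```

The listed folder names differ between platforms (for example bin on Linux/Mac and Scripts on Windows).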

  • How to activate a venv using bash?
    • Linux/Mac:
      • source <venv path>/bin/activate
    • Windows:
      • source <venv path>/Scripts/activate
    • Editors such as VSCode allow selecting the Python interpreter used to run a project
      • the venv can be set to be activated by default

Once the venv is activated, the associated Python version and external packages installed in the virtual environment are used. Installing packages using pip installs them into the activated virtual environment.
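One way to confirm from inside Python that a virtual environment is in use is to compare sys.prefix with sys.base_prefix. The helper name in_virtual_environment below is an assumption made for this sketch.

```python
import sys


def in_virtual_environment():
    # Inside a venv, sys.prefix points at the environment, while
    # sys.base_prefix still points at the base installation.
    return sys.prefix != sys.base_prefix


print(sys.executable)            # interpreter currently running
print(in_virtual_environment())  # depends on how the script was started
```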

18.4.3 Project dependencies

  • venv along with pip can be used to manage project dependencies

  • venv keeps installed packages isolated in a virtual environment

  • pip is used to create a list of required external modules installed in venv

    • windows: py [-v] -m pip freeze > py-requirements.txt
    • unix/mac: python3 -m pip freeze > py-requirements.txt
    • version dependencies can be managed as well
  • pip is used to re-create venv by installing required external modules contained in <py-requirements.txt> file in venv

    • windows: py [-v] -m pip install -r py-requirements.txt
    • unix/mac: python3 -m pip install -r py-requirements.txt
  • resources: pip user guide

While using pip or venv it is advisable to use the full command with a specific Python version. This ensures that the pip and venv associated with that interpreter are used.

pip freeze records each package as <package/module name>==<version number installed>. The pip documentation has more details on how to manage packages for a project using pip.
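To make the name==version format concrete, here is a rough sketch that reads pinned lines into a dictionary. It handles only the simple pinned form described above, not the full requirements file syntax; the function name parse_pinned_requirements and the version numbers in the sample are illustrative assumptions.

```python
def parse_pinned_requirements(text):
    """Map package name to version for lines of the form name==version."""
    pinned = {}
    for raw_line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        line = raw_line.split("#", 1)[0].strip()
        if not line or "==" not in line:
            continue  # skip blank lines and anything that is not pinned
        name, version = line.split("==", 1)
        pinned[name.strip()] = version.strip()
    return pinned


# Illustrative file content; the version numbers are examples only.
sample = """\
numpy==1.26.4
pandas==2.2.1  # pinned for this project
"""

print(parse_pinned_requirements(sample))
```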

18.4.4 Structuring venvs

There are a couple of commonly used strategies for organizing and structuring virtual environments.

  • Centralized approach: keep all virtual environments in a central location
    • related projects can share virtual environments
    • typically location is $HOME/.venvs/<name of venv>
  • De-centralized approach: each project has its own virtual environment
    • simple but effective as there is no need to manage conflicts
    • similar projects can be grouped in a root directory with a common venv if needed

Given the disk space used by virtual environments and the simplicity of management, the de-centralized approach usually makes more sense unless there is a specific reason to share environments.

A sample folder structure for a small project with venv and git could look like the following.

  • project root
    • .git/
    • .py-venv/
    • docs/
    • main.py
    • inputs.py
    • category_1_funcs.py
    • category_2_funcs.py
    • py-requirements.txt
    • .gitignore
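The layout above can be sketched with pathlib as shown below. This is only an illustration: the scaffold function is a hypothetical helper, the .git/ and .py-venv/ folders are left out because git and venv create them, and writing .py-venv/ into .gitignore reflects the common practice of keeping the venv out of version control, which is an assumption here.

```python
import tempfile
from pathlib import Path

LAYOUT_DIRS = ["docs"]
LAYOUT_FILES = [
    "main.py",
    "inputs.py",
    "category_1_funcs.py",
    "category_2_funcs.py",
    "py-requirements.txt",
]


def scaffold(root):
    """Create the empty project skeleton and return the entry names."""
    root = Path(root)
    for name in LAYOUT_DIRS:
        (root / name).mkdir(parents=True, exist_ok=True)
    for name in LAYOUT_FILES:
        (root / name).touch()
    # Assumption: the venv folder is excluded from version control.
    (root / ".gitignore").write_text(".py-venv/\n")
    return sorted(p.name for p in root.iterdir())


# Build the skeleton in a temporary directory so nothing is left behind.
with tempfile.TemporaryDirectory() as tmp:
    entries = scaffold(Path(tmp) / "project")

print(entries)
```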