23 Workflow
23.1 Overview
Designing a workflow in the context of programming can be split into 2 dimensions.
- Program: micro level, workflow while writing code
- Project: macro level, workflow of overall project
23.2 Program
While writing code there are 2 major dimensions.
- Editing tools: like prediction, linting etc.
  - these only help with basic checks and completion
  - these are managed at project level through editor settings
- Elements of code related to properties of a good program
This part of workflow design is related to the 2nd dimension, which is dealt with through refactoring.
23.2.1 Refactoring
Refactoring means reviewing and changing code to attain properties of a good program, without changing the actual output.
Recommended properties of a good program are: readable, testable, modular, extensible, efficient
Some practical aspects of writing programs are
- There are multiple ways to solve the same problem
- It is a cyclical process of writing code and refactoring
- More often than not, refactoring opportunities become apparent only while reviewing code
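As a minimal sketch of what refactoring without changing the output can look like, the example below keeps a first working version next to a refactored one and checks that both give the same result. The function names and the tax rate are hypothetical and only serve as illustration.

```python
TAX_RATE = 0.2  # hypothetical rate, used only for illustration


def totals_before(net_a: float, net_b: float, net_c: float) -> list[float]:
    """First working version: the tax calculation is written out three times."""
    gross_a = round(net_a + net_a * 0.2, 2)
    gross_b = round(net_b + net_b * 0.2, 2)
    gross_c = round(net_c + net_c * 0.2, 2)
    return [gross_a, gross_b, gross_c]


def add_tax(net: float, rate: float = TAX_RATE) -> float:
    """Return the gross amount for a net amount, rounded to 2 decimals."""
    return round(net + net * rate, 2)


def totals_after(*nets: float) -> list[float]:
    """Refactored version: one helper, any number of amounts."""
    return [add_tax(net) for net in nets]


sample = (10.0, 24.5, 99.99)
# Refactoring must not change the output, so the two versions are compared.
assert totals_before(*sample) == totals_after(*sample)
print(totals_after(*sample))
```

The refactored version is shorter, the rate is defined in one place, and the helper can be tested on its own.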
23.2.2 Recommendations
- Follow naming conventions
- Use doc strings
- Use and follow type annotations
  - make exceptions only if necessary
- Use comments where necessary
- Consider putting any task that is repeated more than a couple of times into a function
- Functions should have as few responsibilities as possible, ideally a single responsibility
- Use appropriate data types
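The recommendations above might look as follows in practice. This is only a sketch: the function names are hypothetical, and `Counter` is chosen here as one example of a data type that fits the task of counting.

```python
from collections import Counter
from typing import Optional


def count_words(text: str) -> Counter:
    """Count how often each word occurs in ``text``.

    Words are separated by whitespace and compared case-insensitively.
    """
    return Counter(text.lower().split())


def most_frequent_word(text: str) -> Optional[str]:
    """Return the most frequent word in ``text``, or None if there are no words."""
    counts = count_words(text)
    if not counts:
        return None
    # most_common(1) returns a list holding one (word, count) pair
    word, _count = counts.most_common(1)[0]
    return word


print(most_frequent_word("the cat and the dog"))
```

Each function has one responsibility, the names describe what is returned, and the comment is kept for the one line whose intent is not obvious from the code.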
23.2.3 Sample workflow
Below is a workflow for writing code, to cover aspects that editing tools cannot cover.
- Step 1: focus on getting the code to produce the correct result using recommended practices
- Step 2: review the code for opportunities for refactoring
- Step 3: refactor code
- Step 4: go to step 1
Note that the process as written repeats indefinitely, like a recursion without a base case. The base case is reached when no further refactoring is needed, which depends on the size of the project and on skills and experience. For small projects 2 to 3 cycles should be enough. For larger projects the requirements expand quickly.
23.3 Projects
Learning and practising project workflow management from the very start, even for the smallest of projects, is recommended as it has many advantages:
- Workflow becomes operationally more organized and efficient
- Reduces errors
- Allows more time on design and thought
Section 23.3.2 discusses some key considerations in managing programming projects in general, from the perspective of a user of programming. Python related solutions are discussed at the respective places.
Section 23.3.3 illustrates a Python specific sample project structure which can work as a starter template.
The Python documentation has a section dedicated to structuring Python projects. Although it is aimed at developers who need to publish their packages on PyPI, it still gives good background on Python project management in general.
23.3.1 Tools: Settings
While programming, most of the interaction is with the editor. All the underlying tools, like the terminal, Python, git etc., are accessible through the editor.
Managing the tools and related settings, including extensions, is an essential part of project workflow.
VSCode related settings can easily be managed through use of workspaces and profiles.
Tasks in VSCode provide automation related to projects.
Settings Sync can be used to keep the settings in the cloud, which makes it easy to switch between computers.
23.3.2 Components
23.3.2.1 Dependencies
There are 2 key dependencies of a Python project.
- Python version used: document using a pyproject.toml file
- External packages and their versions used in the project
  - virtual environments provide solutions
23.3.2.1.1 Python version
Python code, once Python is installed, is largely independent of the machine and operating system. Python is available for most commonly used combinations of computer system and operating system.
If a project runs on a given Python version on one PC, it will generally run on another PC with a different architecture and operating system, provided the same Python version can be installed on it. Exceptions can arise when a project relies on packages or features that are specific to a platform.
As one starts to use programming, recording the Python version should be sufficient to be able to reproduce the project later. To be extra safe, the machine and operating system can be documented too.
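The details mentioned above can be read from within Python itself. The sketch below uses the standard library `platform` module; the function name and the dictionary keys are hypothetical, chosen only for this illustration.

```python
import platform


def environment_summary() -> dict[str, str]:
    """Collect the details worth recording alongside a project."""
    return {
        "python_version": platform.python_version(),
        "implementation": platform.python_implementation(),
        "operating_system": platform.system(),
        "machine": platform.machine(),
    }


# Print one line per detail; the values depend on the computer used.
for key, value in environment_summary().items():
    print(f"{key}: {value}")
```

The printed values can be copied into the project documentation or used to fill in the Python version in pyproject.toml.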
The pyproject.toml file is currently used by Python developers to store metadata about a Python project for creating and distributing packages. It can also be used to store Python version details and some other basic metadata about the project in a structured way. Note that the file must be named pyproject.toml, as automation tools that may be used later look for a file with this specific name.
More details can be found at Python Docs: Packaging: pyproject.toml
23.3.2.1.2 External packages
As a user of programming, one will use external modules and packages in most projects to find a solution to a problem.
The related dependencies and solutions have been discussed in Section 18.4 on virtual environments.
23.3.2.2 Documentation
Documentation is a critical part of any code project as it helps the author and users throughout the lifecycle of the project. The most common situation is that one's own code becomes hard to understand after some time. Documentation helps in this situation too.
There are 3 key areas of documentation.
- Doc strings: document functions, classes
- Comments: in the code itself to explain key concepts and logic applied
- readme.md: documentation for the project at high level
The VSCode extension autoDocstring - Python Docstring Generator can be used to assist in creating docstrings. Using such tools helps in applying best practices that have evolved from the experience of developers.
There are tools like Sphinx to automate parts of documentation of Python projects. These are generally needed for large coding projects.
One of the most important aspects is to structure the code so that it documents itself. For example, name objects, files and folders well so that they are self explanatory, and organize function definitions and calls such that they explain the flow of the logic implemented. Giving thought to these aspects complements documentation.
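A small sketch of code that documents itself is given below. The task (averaging valid scores) and all names are hypothetical; the point is that reading `main()` from top to bottom already tells the flow of the logic.

```python
def parse_scores(raw_lines: list[str]) -> list[float]:
    """Convert non-empty text lines to numbers."""
    return [float(line) for line in raw_lines if line.strip()]


def remove_invalid_scores(scores: list[float]) -> list[float]:
    """Keep only scores in the valid range 0 to 100."""
    return [score for score in scores if 0 <= score <= 100]


def average(scores: list[float]) -> float:
    """Return the arithmetic mean of a non-empty list."""
    return sum(scores) / len(scores)


def main() -> None:
    raw_lines = ["72", "", "105", "88"]  # hypothetical input
    scores = parse_scores(raw_lines)
    valid_scores = remove_invalid_scores(scores)
    print(f"Average of valid scores: {average(valid_scores)}")


if __name__ == "__main__":
    main()
```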
23.3.2.3 Version control
Uses of version control systems have been discussed in Chapter 6. While using git, below are some things that should be part of the workflow.
- Regularly maintain the .gitignore file
  - the .py-venv folder is large and need not be tracked as it can be restored using the requirements file
  - anticipate and add directory and file patterns at the start of the project, as it is inefficient to untrack files/folders later
- Do regular structured commits with helpful messages
23.3.3 Sample structure
- project root
  - .git/
  - .py-venv/
  - docs/
  - src/
    - inputs.py
    - category_1_funcs.py
    - category_2_funcs.py
  - main.py or app.py
  - pyproject.toml
  - py-requirements.txt
  - readme.md
  - .gitignore
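The structure above can be created by hand or with a short script. The sketch below uses `pathlib` and builds the skeleton inside a temporary folder so that it can be run safely. Placing inputs.py and the category files inside src/ is an assumption about the intended layout, the `__pycache__/` pattern is an added suggestion, and the function name is hypothetical. The .git/ and .py-venv/ folders are left out because git and the virtual environment tool create them.

```python
import tempfile
from pathlib import Path

# Folders and files from the sample structure. Placing inputs.py and the
# category files inside src/ is an assumption about the intended layout.
DIRECTORIES = ["docs", "src"]
FILES = [
    "src/inputs.py",
    "src/category_1_funcs.py",
    "src/category_2_funcs.py",
    "main.py",
    "pyproject.toml",
    "py-requirements.txt",
    "readme.md",
    ".gitignore",
]


def create_skeleton(root: Path) -> None:
    """Create empty folders and files for a new project under ``root``."""
    for directory in DIRECTORIES:
        (root / directory).mkdir(parents=True, exist_ok=True)
    for file_name in FILES:
        (root / file_name).touch(exist_ok=True)
    # Keep the virtual environment folder and Python caches out of version control.
    (root / ".gitignore").write_text(".py-venv/\n__pycache__/\n", encoding="utf-8")


with tempfile.TemporaryDirectory() as tmp:
    project_root = Path(tmp) / "sample_project"
    create_skeleton(project_root)
    # List everything that was created, relative to the project root.
    created = sorted(
        path.relative_to(project_root).as_posix()
        for path in project_root.rglob("*")
    )
    print("\n".join(created))
```

For a real project, `create_skeleton` would be called with the actual project folder instead of a temporary one.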
Since main.py (or app.py) can directly call functions from directories beneath it, e.g. import src.category_1_funcs as <category_1> and use functions as <category_1>.<function_name>(), there is no need for using packages with __init__.py for very small projects.
The only drawback is that you cannot cross reference objects from files across folders, unless they are in a flat hierarchy below.
Functions can further be put into subdirectories.
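The import pattern described above can be tried out with the sketch below. It writes a tiny project into a temporary folder, without any __init__.py file, and runs main.py from the project root as a user would. The `greet` function and its message are hypothetical. The sketch assumes that no other package named `src` is installed in the Python environment.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# Hypothetical content for src/category_1_funcs.py
FUNCS_SOURCE = '''
def greet(name):
    """Return a greeting for ``name``."""
    return f"Hello, {name}!"
'''

# Content for main.py, using the import style described in the text
MAIN_SOURCE = '''
import src.category_1_funcs as category_1

print(category_1.greet("workflow"))
'''

with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "src").mkdir()
    # Note: no src/__init__.py is created.
    (root / "src" / "category_1_funcs.py").write_text(FUNCS_SOURCE, encoding="utf-8")
    (root / "main.py").write_text(MAIN_SOURCE, encoding="utf-8")

    # Run main.py with the project root as the working directory.
    result = subprocess.run(
        [sys.executable, "main.py"],
        cwd=root,
        capture_output=True,
        text=True,
        check=True,
    )
    output = result.stdout.strip()

print(output)
```

Because main.py sits in the project root, Python finds the src folder next to it and treats it as a package even without __init__.py.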