22  Regular Expressions

Regular expressions, commonly referred to as “regex”, are an abstract specification for writing patterns that drive advanced search-and-replace operations on strings. They can be thought of as the specification of a mini-language for string (text) operations.

There are multiple implementations of this abstract specification. They share many common features but differ in subtle ways to suit different use cases.

Python itself has more than one flavor/implementation of regular expressions: the re module in the standard library, and third-party packages such as regex.

The key idea is to provide much more powerful pattern searching than regular string methods do. For example, to find all PDF files in a folder you can search the file names with the pattern \.pdf$ (the familiar *.pdf form is shell-style wildcard syntax, a separate and simpler pattern language, not a regular expression). Similarly, the pattern ^project finds all files whose names start with the word project.
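As a minimal sketch of the two searches described above, the following uses Python's built-in re module on a made-up list of file names (the names are assumptions for illustration only). The "ends with .pdf" search is written in regex syntax as \.pdf$, where the backslash makes the dot literal and $ anchors the match at the end of the name.

```python
import re

# Hypothetical file names, used only for illustration.
file_names = [
    "project_plan.pdf",
    "notes.txt",
    "project_budget.xlsx",
    "summary.pdf",
    "pdf_reader.py",
]

# r"\.pdf$": a literal dot followed by "pdf", anchored at the end of the name.
pdf_files = [name for name in file_names if re.search(r"\.pdf$", name)]

# r"^project": the word "project", anchored at the start of the name.
project_files = [name for name in file_names if re.search(r"^project", name)]

print(pdf_files)      # ['project_plan.pdf', 'summary.pdf']
print(project_files)  # ['project_plan.pdf', 'project_budget.xlsx']
```

Note that pdf_reader.py is not picked up by the first pattern, because the anchor requires .pdf to sit at the very end of the name.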

Regular expressions provide much more advanced features but are generally slower than regular string methods. The Python documentation (the Regular Expression HOWTO) has a section dedicated to choosing between string methods and regular expressions.
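To make the trade-off concrete, here is a small sketch, again with a hypothetical file name. The first two checks can be done equally well with string methods, which are usually the simpler choice. The last one, finding four consecutive digits, has no single string-method equivalent and is where a regex starts to pay off.

```python
import re

file_name = "project_report_2023.pdf"  # hypothetical file name

# Simple prefix/suffix checks: string methods and regex give the same answer.
has_prefix_str = file_name.startswith("project")
has_prefix_re = re.search(r"^project", file_name) is not None

has_suffix_str = file_name.endswith(".pdf")
has_suffix_re = re.search(r"\.pdf$", file_name) is not None

# "Four digits in a row" is awkward with string methods but short as a regex.
year_match = re.search(r"\d{4}", file_name)
year = year_match.group() if year_match else None

print(has_prefix_str, has_prefix_re)  # True True
print(has_suffix_str, has_suffix_re)  # True True
print(year)                           # 2023
```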

Regular expressions can get very messy very fast. So, in the beginning, stick to the basics of regex, and use them only when it is clear that no string method is available or that using one would be too complex. The most common situation is using regex for path-related operations.
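One possible path-related use, sketched under assumptions: the path and its name_YYYY-MM naming scheme below are invented for illustration. The standard pathlib module handles the path itself (folders, extension), and a basic regex with three groups pulls the pieces out of the file name.

```python
import re
from pathlib import PurePosixPath

# Hypothetical path; the "name_YYYY-MM" naming scheme is an assumption.
path = PurePosixPath("data/reports/sales_2023-04.csv")

# path.stem is the file name without folders or extension: "sales_2023-04".
# Pattern: lowercase letters, an underscore, four digits, a hyphen, two digits.
match = re.fullmatch(r"([a-z]+)_(\d{4})-(\d{2})", path.stem)

if match:
    name, year, month = match.groups()
    print(name, int(year), int(month))  # sales 2023 4
else:
    print("file name does not follow the expected scheme")
```

Letting pathlib split the path first keeps the regex short, which is in line with the advice to stay with the basics.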