11 Data types

11.1 Introduction

A data type is an object type that stores information and provides operations on that information.

Compare a function and a number: both are objects, but a function primarily stores code while a number stores numeric data. That is why functions, classes etc. are not referred to as data types, unlike int or float, which are numeric data types.

Data types are the most critical part of any language. They are used to store, access and operate upon information within code.

Numbers and text are the most fundamental data types.

Some languages, like C, distinguish between characters and strings, where a string is treated as a sequence of characters. Python has only strings for text: a string is a sequence of characters, and a single character is simply a string of length one.

Then there are collections, which provide ways to combine objects to create more complex data. Data types like list and dictionary are provided in higher level languages. Every high level language has its own implementation and syntax, but the underlying design principles are the same.

The Data Structures and Algorithms part of the book gives a high level background on how these data designs have evolved.

11.1.1 Overview

The tree below provides an overview of the data types implemented in Python, grouped by category.

The nodes with red text are data types that are built into Python and are covered in the book
The nodes with blue text are data types available by importing them from the standard library

The standard library is discussed in the architecture part of the book: Section 18.2

A collection is a group of objects. Collections can be nested, i.e. a collection of collections of objects.

A sequence type is a collection of objects with preserved order. In base Python, strings, tuples and lists are sequence types.

Sequences can be mutable or immutable.

Iterable is any collection of objects from which objects can be retrieved one at a time and hence can be looped through. str, tuple, list, dict, set are all iterables.

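Sequences are a subset of collections. All sequences are iterables, but not every iterable is a sequence: dict and set are iterable but do not support indexing by position.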
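The relationship between iterables and sequences can be checked directly. The sketch below uses the abstract base classes from the standard library module collections.abc, which is not mentioned in the text above and is used here only as a convenient way to test the categories.

```python
from collections.abc import Iterable, Sequence

samples = ["abc", (1, 2), [1, 2], {"k": 1}, {1, 2}]

# Every built-in collection listed here can be looped over.
all_iterable = all(isinstance(obj, Iterable) for obj in samples)

# Only str, tuple and list are sequences (ordered and indexable by position).
sequence_names = [type(obj).__name__ for obj in samples if isinstance(obj, Sequence)]

print(all_iterable)    # True
print(sequence_names)  # ['str', 'tuple', 'list']
```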

Methods available in any sequence type can be categorized as below.

common methods for sequence types
- sequence level operations (e.g. length of a sequence)
- element level operations (e.g. indexing and slicing)
- methods available for iterables (can be used in loops)
if mutable then methods for mutable sequence types
methods specific to a sequence type

Iterable methods are related to use in looping and are discussed in respective sections.

Based on this categorization, methods are discussed for respective sequence types, string, list and tuple.

All this information is the end product of developments in the field of data structures. The DSA part of the book has more information on this; after reading it, the reasons why Python data types behave the way they do will be clearer.

11.1.2 Objectives

Focus on understanding the underlying concepts and where to look for information in a structured way.

Creation and syntax
Operations: understand how operations are structured across data types
Indexing and slicing for sequence types
Implications: understand the usage and implications using example use cases given
- Mutable vs Immutable data types
Choosing data types: understand when to use which data type

11.2 None

None signifies the absence of a value. A variable might need to be defined before a meaningful value is available for it; None is the placeholder object used to signify this state.

It is especially useful in conditionals, to avoid errors when checking whether a value has been assigned to a variable. This is covered in the chapter on conditionals (Section 11.10.1.2.3, Section 15.1.5).

11.2.1 Example

some_var = None
print(f'{some_var=}, {type(some_var)=}')

>>>  some_var=None, type(some_var)=<class 'NoneType'>
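A common place where None shows up is as the default for an optional function argument. The sketch below is only an illustration; the function name describe is made up for this example.

```python
def describe(value=None):
    # `is None` checks identity against the single None object
    if value is None:
        return "no value given"
    return f"got {value!r}"

print(describe())   # no value given
print(describe(0))  # got 0  (0 is falsy, but it is not None)
```

Testing with `is None` rather than relying on truthiness keeps valid values such as 0 or an empty string from being mistaken for a missing value.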

11.3 Numeric types

11.3.1 Summary

Boolean	Integers	Rationals	Real	Complex
`bool`	`int`	`fractions.Fraction`	`float` `decimal.Decimal`	`complex`

numeric data types are immutable
for basic calculations int and float are sufficient
- others are listed for completeness
boolean and comparison operations are discussed separately

Immutable implies that once an object is created, its value cannot be modified.

For variable assignment this implies that if a variable is storing some number and is assigned another number, a new object is created in background. This does not have any significant impact in case of numeric data types.

Mutability is discussed in more detail at the end of this chapter.

11.3.2 Specifications

Numbers, integers or floats, can be typed as in regular math. Some special syntax is available for code readability.

Underscores can be used as digit separators for better code readability. They are ignored during execution.

one_million_int = 1_000_000
one_million_float = 1_000_000.00
print(f'{one_million_int = }')

>>>  one_million_int = 1000000

print(f'{one_million_float = }')

>>>  one_million_float = 1000000.0

Scientific notation can be used to write floats. The e has to be preceded by a number, e.g. 1e-2 rather than e-2.

x = [1e-2, 3.314e+5]
print(x)

>>>  [0.01, 331400.0]

11.3.3 Operations

Regular math operations can be done using the operators listed below. Other commonly used functions are

round(x[, n]) is a builtin function
the standard library has more options for numeric operations
- math module
  - e.g. math.floor(x), math.ceil(x), math.trunc(x) etc.
- random module for pseudo random number generations

addition	subtraction	multiplication	division	exponents	floored division	modulo
`+`	`-`	`*`	`/`	`**`	`//`	`%`

division with / always returns a float, even when both operands are int
operations mixing int and float return float
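A short sketch of the operators and functions listed above; the numbers are arbitrary and chosen only to make the differences visible.

```python
import math

print(7 / 2)    # 3.5   division
print(8 / 2)    # 4.0   still a float, even though the result is whole
print(7 // 2)   # 3     floored division
print(-7 // 2)  # -4    floors toward negative infinity, not toward zero
print(7 % 2)    # 1     modulo
print(2 ** 10)  # 1024  exponent
print(2 + 1.5)  # 3.5   int combined with float gives float

print(round(3.14159, 2))                                    # 3.14
print(math.floor(-2.5), math.ceil(-2.5), math.trunc(-2.5))  # -3 -2 -2
```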

11.3.3.1 Increment/Decrement

Incrementing and decrementing a value is done through the operators += and -=.

x += n is same as x = x + n
x -= n is same as x = x - n

Python provides the same shorthand syntax for other operations, as shown below.

x *= n is same as x = x * n
x /= n is same as x = x / n
x **= n is same as x = x ** n

Caution for float

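There are some issues and limitations with floating point arithmetic using float. For example, many decimal fractions such as 0.1 cannot be represented exactly in binary.

It is recommended to read about them in the Python documentation on the limitations of floating point arithmetic.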
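The classic illustration of this limitation is sketched below, together with two common workarounds: comparing with a tolerance using math.isclose, or switching to the exact types decimal.Decimal and fractions.Fraction listed in the summary table.

```python
import math
from decimal import Decimal
from fractions import Fraction

total = 0.1 + 0.2
print(total == 0.3)              # False: 0.1 and 0.2 are stored as approximations
print(math.isclose(total, 0.3))  # True: compare with a tolerance instead

# Exact alternatives from the standard library
print(Decimal("0.1") + Decimal("0.2") == Decimal("0.3"))     # True
print(Fraction(1, 10) + Fraction(2, 10) == Fraction(3, 10))  # True
```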

11.3.4 Examples

11.3.4.1 Example 1

Below is a basic example of assigning int and float. Note that if a decimal point is present, the number is stored as a float even if its value is a whole number.

num1 = 10; num2 = 10.0

print(f'{num1 = }, {type(num1) = }')

>>>  num1 = 10, type(num1) = <class 'int'>

print(f'{num2 = }, {type(num2) = }')

>>>  num2 = 10.0, type(num2) = <class 'float'>

11.3.4.2 Example 2

Operations with int and float return float.

num1 = .25; num2 = 100
num3 = num2 * num1

print(f'{num1 = }, {type(num1) = }')

>>>  num1 = 0.25, type(num1) = <class 'float'>

print(f'{num2 = }, {type(num2) = }')

>>>  num2 = 100, type(num2) = <class 'int'>

print(f'{num3 = }, {type(num3) = }')

>>>  num3 = 25.0, type(num3) = <class 'float'>

11.3.4.3 Example 3

Objects of type int within a certain range (-5 to 256) are not duplicated, for performance reasons. This is an implementation detail of CPython.

some_int_1 = 10; some_int_2 = 10

some_int_1 is some_int_2

>>>  True

The basic idea is to reuse (intern) objects for memory optimization. The same idea is sometimes applied to strings (string interning). It can cause surprises such as the result in this example.
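Because this caching is an implementation detail, a safe habit is to compare numeric values with == and to keep is for identity checks such as comparisons against None. A minimal sketch, with arbitrary values:

```python
big_1 = int("1000")
big_2 = int("1000")

print(big_1 == big_2)  # True: the values are equal
# Whether `big_1 is big_2` is True depends on the interpreter,
# so identity should not be used to compare numbers.
```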

11.3.4.4 Example 4

Numeric data types are immutable. In the example below, when some_int is assigned a new value, a new object is created in memory and bound to some_int.

some_int = 10
print(hex(id(some_int)), f'{some_int=}')

>>>  0x75bcb47e0210 some_int=10

some_int += 1
print(hex(id(some_int)), f'{some_int=}')

>>>  0x75bcb47e0230 some_int=11

11.4 String

11.4.1 Overview

In Python, a string (str type object) is an immutable sequence of Unicode code points. More generally speaking, it is an immutable sequence of characters: letters, digits and symbols.

Strings are sequence type
- like mathematics, order has meaning in sequences
  - "string" is not the same as "trgins"
- this helps enable support for indexing and slicing
Strings are immutable
- a new object is created in memory on modification
  - referred to as copy-on-modify
- adding/deleting/changing elements in place is not supported
Python Tutorial: Gentle introduction to text
Python library reference: Detailed documentation on str

11.4.2 Specifications

11.4.2.1 Overview

Strings can be created using
- single quotes: 'some string'
- double quotes: "some string"
- multi-line strings
  - triple single quotes: '''some string'''
  - triple double quotes: """some string"""
- raw strings: prefix any of the above with the character r
  - r"string with \ inside", r'string with \ inside'
  - a raw string cannot end with a single backslash
Special string types
- multiline strings
- raw strings
- formatted string literals
print changes the way results are displayed

11.4.2.2 Basic strings

string_1 = 'using single quotes'

string_2 = "using double quotes"

string_3 = "including \"double quotes\" using double quotes"

string_4 = 'including "double quotes" using single quotes'

11.4.2.3 Multiline strings

Multiline strings can be created using triple quotes (single/double). A physical new line within the string is included in the string as a newline character. The spaces and tabs on a line are included as well, see string_2 below.

string_1 = """This is a multiline string
with no tabs using triple double quotes"""
string_2 = '''This is a multiline string
              with tabs using triple single quotes'''

print(string_1)

>>>  This is a multiline string
>>>  with no tabs using triple double quotes

print(string_2)

>>>  This is a multiline string
>>>                with tabs using triple single quotes

11.4.2.4 Using backslash (`\`)

Backslash can be used to write special characters in a string literal. Python converts each such escape sequence into the corresponding character when the string is created, and print and similar functions then display that character.

Examples:

newline: \n
tabs: \t
escape quote symbol: \' or \"

Note in the examples below that when a variable is echoed without the print function, the interactive prompt shows the representation (repr) of the string, in which special characters like newline and tab appear in their escaped form.

string_1 = "Line 1\nLine 2"
string_2 = "text 1\ttext 2"

string_1; print(string_1)

>>>  'Line 1\nLine 2'
>>>  Line 1
>>>  Line 2

string_2; print(string_2)

>>>  'text 1\ttext 2'
>>>  text 1 text 2
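The difference between the stored string and its displayed representation can be checked with len and repr. This is a small sketch; the variable name s is arbitrary.

```python
s = "Line 1\nLine 2"

print(len(s))   # 13: the escape \n is stored as one newline character
print(repr(s))  # 'Line 1\nLine 2'  (what the interactive prompt echoes)
print(s)        # displayed on two lines
```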

11.4.2.5 Raw strings

Raw strings treat a backslash (\) as a literal character rather than as the start of an escape sequence. To create a raw string, prefix the string with the character r or R.

One typical use case is storing Windows paths, which contain backslashes. In the example below, since \u has a special meaning (it starts a Unicode escape), creating a regular string containing this sequence of characters gives an error.

string_1 = "C:\user\name"

>>>  (unicode error) 'unicodeescape' codec can't decode bytes in position 2-3: truncated \uXXXX escape (<string>, line 1)

string_2 = r"C:\User\name"

print(string_2)

>>>  C:\User\name
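A quick way to see what the r prefix changes is to compare lengths. In the sketch below the path is hypothetical and used only for illustration.

```python
normal = "a\tb"   # \t is converted to one tab character
raw = r"a\tb"     # backslash and t are kept as two characters

print(len(normal), len(raw))  # 3 4
print(raw == "a\\tb")         # True: same string, written with an escaped backslash

windows_path = r"C:\Users\name"  # hypothetical path
print(windows_path)              # C:\Users\name
```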

11.4.2.6 Formatted strings

Formatted strings are used for mixing hard coded text and variable values with formatting.

old syntax
- "text with {0[:fs]} and {1}".format(var1, var2)
new f-string syntax (Python version >= 3.6)
- f'text with {var1[:fs]} and {var2[:fs]}'

where fs stands for format specifier

This is especially useful in controlling the format of output messages from the code. A message could be an error, a warning or a regular informative message.

There are a lot of options to play around with, which are described in the Python documentation on the format specification mini-language.

11.4.2.6.1 Examples

11.4.2.6.1.1 Ex 1

Using old format style.

user_name = "First Last"
user_age = 20
my_string = "Name: {0}\nAge: {1}".format(user_name, user_age)

print(my_string)

>>>  Name: First Last
>>>  Age: 20

11.4.2.6.1.2 Ex 2

Using old format style with format specifier.

user_name = "First Last"
user_age = 20
user_balance = 1000001
my_string = "Name: {0:^30}\nAge: {1:^30}\nBalance: {2:,.2f}".format(
    user_name, user_age, user_balance)

print(my_string)

>>>  Name:           First Last          
>>>  Age:               20              
>>>  Balance: 1,000,001.00

11.4.2.6.1.3 Ex 3

Using new f-string with format specifier.

user_name = "First Last"
user_age = 20
user_balance = 1000001
my_string = f"Name: {user_name:>15}\nAge: {user_age:>16}\
    \nBalance: {user_balance:>12.2f}"

print(my_string)

>>>  Name:      First Last
>>>  Age:               20    
>>>  Balance:   1000001.00
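A few more format specifiers that are often useful are sketched below. The variable names and values are made up for illustration.

```python
value = 1234567.891
ratio = 0.256
label = "abc"

print(f"{value:,.2f}")  # 1,234,567.89  thousands separator, 2 decimals
print(f"{value:.3e}")   # 1.235e+06     scientific notation
print(f"{ratio:.1%}")   # 25.6%         percentage
print(f"[{label:<7}]")  # [abc    ]     left aligned in width 7
print(f"[{label:^7}]")  # [  abc  ]     centered
print(f"[{label:>7}]")  # [    abc]     right aligned
print(f"{42:05d}")      # 00042         zero padded integer
```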

11.4.3 Operations

Below are the 2 major categories of operations a string supports.

common operations on sequence types
operations specific to strings

Common sequence operations like indexing and slicing are illustrated with examples below; they work the same way for all sequence types, such as tuple and list.

11.4.3.1 Sequence

operations on sequence itself
- length (len(s))
- concatenate (s1 + s2), for same type sequences
- repeat (s*n or n*s) where n is the number of repeats
- comparisons, for same type sequences
operations on items in sequence
- retrieve by position: index (s[i]) / slice (s[i:j], s[i:j:k] with k=1 by default)
- min/max
- checks on an element
  - existence: e in s, e not in s
  - index: s.index(e)
  - count: s.count(e)
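The operations listed above can be tried on a string as follows. This is a minimal sketch and the sample text is arbitrary.

```python
s = "banana"

print(len(s))              # 6
print(s + "!")             # banana!
print("ab" * 3)            # ababab
print("apple" < "banana")  # True: compared character by character
print(min(s), max(s))      # a n
print("nan" in s)          # True
print(s.index("n"))        # 2: position of the first match
print(s.count("a"))        # 3
```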

11.4.3.1.1 Index & Slice

Indexing refers to retrieving elements by position. Slicing refers to extracting a subset of the elements of a sequence.

indexing starts at 0 and ends at n - 1, where n is the length of the sequence
negative indices are allowed and count from the end (-1 is the last item)
usage
- s[i]: return item at index i
- s[i:j]: return items from index i to j-1
  - returns j - i items
- s[i:j:k]: return items from index i to j-1 with step k
  - k=1 by default

11.4.3.1.2 Examples

string_1 = "abcdefgh"

The enumerate function is used to get pairs of index and element for a sequence.

print([*enumerate(string_1)])

>>>  [(0, 'a'), (1, 'b'), (2, 'c'), (3, 'd'), (4, 'e'), (5, 'f'), (6, 'g'), (7, 'h')]

select c to f

string_1[2:6]

>>>  'cdef'

select second last - g

string_1[-2]

>>>  'g'

select last 3 elements

string_1[-3:]

>>>  'fgh'

select d onwards

string_1[3:]

>>>  'defgh'

select up till d

string_1[:4]

>>>  'abcd'

string_1[:3] + string_1[3:]

>>>  'abcdefgh'
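The examples above all use the default step. A short sketch of the step argument k, including a negative step, on the same string:

```python
string_1 = "abcdefgh"

print(string_1[::2])    # aceg      every second element
print(string_1[1:7:3])  # be        indices 1 and 4
print(string_1[::-1])   # hgfedcba  reversed copy
print(string_1[2:100])  # cdefgh    out-of-range slice bounds are clipped
```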

11.4.3.2 String specific

There are a lot of built-in string methods (see the Python library reference on str)
- case, find/replace, checks, strip, split, …
- usually easy to use, look at reference as needed
regular expressions
- searching and matching patterns in strings
- depends on module re in standard library
- advanced topic, avoid at this stage
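A few of the string methods mentioned above, as a quick sketch; the sample text is arbitrary.

```python
text = "  Hello, World  "

print(text.strip())                        # Hello, World
print(text.strip().lower())                # hello, world
print(text.find("World"))                  # 9 (index of first match, -1 if absent)
print(text.replace("World", "Python"))     # the same text with World replaced by Python
print("a,b,c".split(","))                  # ['a', 'b', 'c']
print("-".join(["a", "b", "c"]))           # a-b-c
print("abc".isalpha(), "abc1".isalpha())   # True False
```

Each of these methods returns a new object; the original string is left unchanged.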

11.4.3.3 Arithmetic Operators

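The + and * operators can be used with strings; other arithmetic operators such as - and / raise a TypeError (% is supported, but it performs old-style formatting rather than arithmetic).

Note that operators only work with compatible types of objects.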

+: concatenate strings
- works with str type objects, i.e. strings
*: repeat a string
- works with a str and an int

11.4.3.3.1 Example 1

some_str_1 = "some string 1"; some_str_2 = "some string 2"
concat_1_2 = some_str_1 + " " + some_str_2
print(concat_1_2)

>>>  some string 1 some string 2

11.4.3.3.2 Example 2

some_str = "xyz"
print(some_str*5)

>>>  xyzxyzxyzxyzxyz

11.4.3.3.3 Example 3

some_str_1 = "some string 1"; some_str_2 = "some string 2"
concat_1_2 = some_str_1 * some_str_2

>>>  Error: TypeError: can't multiply sequence by non-int of type 'str'

11.5 Tuple

11.5.1 Overview

A tuple is an immutable collection of ordered, heterogeneous objects with the features below.

sequence type (collection of ordered objects)
- this helps enable support for indexing and slicing
- the position of data has meaning
- useful in passing and operating on set of objects within code
heterogeneous: can contain any type of object
- more efficient if items are homogeneous
immutable
- modify in-place operations are not supported
  - operations that appear to modify a tuple create a new object in memory
  - mutable elements held in a tuple can still be modified in place (see Section 11.11.2.3)
- adding/deleting/changing elements is not provided by default
Python Tutorial: Gentle introduction to tuples
Python library reference: Detailed documentation on sequences

11.5.2 Specifications

11.5.2.1 Creation syntax

using commas
- the comma decides the tuple
- parentheses are mostly for code readability
using tuple constructor: tuple()
using unpacking (Python special syntax)
using comprehension (covered in Python special features)

11.5.2.2 Creation by use case

using elements
- 0 item: () or tuple()
- 1 item: i, or (i,)
  - (i) does not create a tuple, it is just i in parentheses; the comma is needed
- more than 1 item: i1, i2, i3 or (i1, i2, i3)
using elements from another iterable[s]:
- using tuple constructor: tuple(iterable[s])
- using unpacking
  - t = *l,, t = (*l,), t = (*s, *l)
  - t = (*l) will give error, comma is needed
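The creation options above can be sketched as follows. The names i, l and s follow the placeholders used in the list and hold arbitrary sample values.

```python
i = 5
l = [1, 2]
s = "ab"

empty = ()
single = (i,)        # the comma makes the tuple
not_a_tuple = (i)    # just the int 5 in parentheses
from_list = tuple(l)
unpacked = (*s, *l)

print(type(single).__name__, type(not_a_tuple).__name__)  # tuple int
print(empty, from_list, unpacked)                         # () (1, 2) ('a', 'b', 1, 2)
```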

11.5.3 Operations

Tuple has access to common operations on sequence types with no additional methods.

operations on sequence itself
- length (len(s))
- concatenate (s1 + s2), for same type sequences
- repeat (s*n or n*s) where n is the number of repeats
- comparisons, for same type sequences
operations on items in sequence
- retrieve by position: index (s[i]) / slice (s[i:j], s[i:j:k] with k=1 by default)
- min/max
- checks on an element
  - existence: e in s, e not in s
  - index: s.index(e)
  - count: s.count(e)

11.6 List

11.6.1 Overview

A list is a mutable collection of ordered, heterogeneous objects with the following features.

sequence type
- this helps enable support for indexing and slicing
- the position of data has meaning
- useful in passing and operating on set of objects within code
heterogeneous: can contain any type of object
- more efficient if items are homogeneous
mutable
- adding/deleting/changing elements is provided by default
- modify in-place operations are supported
Python Tutorial: Gentle introduction to list: part 1, part 2
Python library reference: Detailed documentation on sequences

11.6.2 Specifications

empty list: [] or list()
using elements: [i1, i2, ...], [i1]
using elements from iterable[s]:
- using list constructor: list(iterable)
- using unpacking
  - [*t], [*t,], [*s, *t, *l]
- using comprehension (covered in Python special features)

11.6.3 Operations

There are 2 sets of operations a list supports.

common operations on sequence types
operations on mutable sequence types

11.6.3.1 Sequence operations

operations on sequence itself
- length (len(s))
- concatenate (s1 + s2), for same type sequences
- repeat (s*n or n*s) where n is the number of repeats
- comparisons, for same type sequences
operations on items in sequence
- retrieve by position: index (s[i]) / slice (s[i:j], s[i:j:k] with k=1 by default)
- min/max
- checks on an element
  - existence: e in s, e not in s
  - index: s.index(e)
  - count: s.count(e)

11.6.3.2 Mutable sequence operations

most of the operations are in-place
- implies no new object creation on modification
operations on sequence itself
- copy (shallow), extend, repeat, reverse
operations on items
- delete, replace, append, clear all, remove, insert, pop
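The in-place operations named above can be sketched as below. The values are arbitrary, and id is used only to show that the list object stays the same throughout.

```python
nums = [3, 1, 2]
original_id = id(nums)

nums.append(4)       # [3, 1, 2, 4]
nums.extend([5, 6])  # [3, 1, 2, 4, 5, 6]
nums.insert(0, 0)    # [0, 3, 1, 2, 4, 5, 6]
nums.remove(3)       # removes the first item equal to 3 -> [0, 1, 2, 4, 5, 6]
last = nums.pop()    # removes and returns the last item, 6
del nums[0]          # [1, 2, 4, 5]
nums[0] = 10         # [10, 2, 4, 5]
nums.reverse()       # [5, 4, 2, 10]

print(nums, last)               # [5, 4, 2, 10] 6
print(id(nums) == original_id)  # True: still the same object

shallow = nums.copy()  # a new list holding the same items
nums.clear()           # nums is now empty, shallow is unaffected
print(nums, shallow)   # [] [5, 4, 2, 10]
```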

11.7 Range

Range is a special iterable that generates a sequence of integers, with the following characteristics.

immutable sequence type
elements are computed on demand, so printing a range does not show them
- to see all elements, unpack into a list or tuple
syntax for creation
- range(stop)
- range(start, stop[, step])
  - start is included
  - stop is excluded
primarily used in loops, which are discussed in Section 12.3.1

11.7.1 Examples

print(range(10))

>>>  range(0, 10)

print(list(range(10)))

>>>  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]

print([*range(11)])

>>>  [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

print([*range(1, 11)])

>>>  [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]

print([*range(-1, -11)])

>>>  []

print([*range(-1, -11, -1)])

>>>  [-1, -2, -3, -4, -5, -6, -7, -8, -9, -10]
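Since range is a sequence type, the common sequence operations work on it without first converting it to a list. A minimal sketch with arbitrary bounds:

```python
r = range(1, 11, 3)  # represents 1, 4, 7, 10

print(len(r))          # 4
print(r[1], r[-1])     # 4 10
print(7 in r, 8 in r)  # True False
print(list(r[1:3]))    # [4, 7]  (slicing a range gives another range)
print(list(r))         # [1, 4, 7, 10]
```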

11.8 Dictionary

11.8.1 Overview

A dictionary is a mutable mapping type: a collection of heterogeneous objects (values) mapped to keys, which must be hashable and unique.

since Python 3.7 the insertion order is guaranteed to be preserved
collection of {key: value} pairs where
- key can be any hashable object
  - strings and numeric data types can be used
  - immutable types that contain only immutable objects can be used
  - e.g. tuples with immutable objects
  - lists and dictionaries cannot be used
- value can be any Python object

A dictionary is useful when a collection of objects is needed with the option to do quick searches based on keys rather than index, unlike sequences.

Concept of hashable objects is introduced in DSA (Section 21.4.4).

11.8.2 Specifications

using key value pairs separated by commas
- d = {"key1": value1, "key2": value2, ...}
using type constructor
- d = dict([("key1", value1), ("key2", value2), ...])
- d = dict(key1=value1, key2=value2)
create empty dictionary
- d = {}
- d = dict()
using comprehensions (covered in Python special features)
if a key is passed multiple times, the last value is kept
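The creation options and the rules for keys can be sketched as below. Keys and values are arbitrary sample data, and list_key_failed is a helper flag introduced only for this illustration.

```python
d1 = {"key1": 1, "key2": 2}
d2 = dict([("key1", 1), ("key2", 2)])
d3 = dict(key1=1, key2=2)
print(d1 == d2 == d3)  # True

repeated = {"a": 1, "a": 2}  # same key twice: the last value is kept
print(repeated)              # {'a': 2}

by_point = {(0, 0): "origin"}  # a tuple of immutable items is hashable

list_key_failed = False
try:
    {[0, 0]: "origin"}         # a list is not hashable
except TypeError:
    list_key_failed = True
print(list_key_failed)         # True
```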

11.8.3 Operations

operations on dictionary itself
operations on keys and values

11.8.3.1 Operations on dictionary

length: len(d)
clear: d.clear()
shallow copy: d.copy()
update from another dictionary: d.update([other])

11.8.3.2 Operations on keys and values

check keys: key in d / key not in d
view all keys/values/pairs: d.keys() / d.values() / d.items()
get all key/value pairs as list of tuples: list(d.items())
get all keys as list: list(d)
get all keys as list reversed: list(reversed(d))
get value, error if key not present: d[key]
set value, inserts key if key not present: d[key] = value
del last inserted key/value and return it as a (key, value) pair: d.popitem()
if key is not present, return default if given, else error (d.get returns None when no default is given)
- get value: d.get(key[, default])
- del key/value and return deleted value: d.pop(key[, default])
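A short walk through the operations above, as a sketch with arbitrary keys and values:

```python
d = {"a": 1, "b": 2}

d["c"] = 3                   # insert a new key
d.update({"a": 10, "d": 4})  # overwrite "a", add "d"

print(list(d))                    # ['a', 'b', 'c', 'd']
print(list(d.items()))            # [('a', 10), ('b', 2), ('c', 3), ('d', 4)]
print(d.get("z"), d.get("z", 0))  # None 0
removed = d.pop("b")              # 2
fallback = d.pop("z", "missing")  # 'missing', no error because a default is given
last_pair = d.popitem()           # ('d', 4), the most recently inserted pair
print(removed, fallback, last_pair)
print(d)                          # {'a': 10, 'c': 3}
```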

11.9 Set

A set is a collection of unique objects with operations related to mathematical sets available, e.g. union, intersection.

In other words, a set is like a special dictionary with keys only.

In Python specifically, a set is an unordered collection of hashable objects. Unlike dictionaries, sets do not guarantee any particular order, in any Python version.

Sets are commonly used for

membership testing: search by value
removing duplicates from a collection

11.9.1 Specifications

A set can be created using curly braces with the exception of empty set.

create a set with valid keys
- curly braces
- set constructor

some_set = {key1, key2, ...}
some_set = set(iterable)

empty set can be created using set() constructor
- using {} creates an empty dictionary
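The creation rules and the typical uses of sets can be sketched as follows; the values are arbitrary.

```python
a = {1, 2, 3}
b = set([3, 4, 3])  # duplicates are dropped

print(b == {3, 4})               # True
print((a | b) == {1, 2, 3, 4})   # True: union
print((a & b) == {3})            # True: intersection
print((a - b) == {1, 2})         # True: difference
print(2 in a)                    # True: membership test

print(type({}).__name__, type(set()).__name__)  # dict set

# Removing duplicates: sorting afterwards gives a predictable order
print(sorted(set("banana")))     # ['a', 'b', 'n']
```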

11.10 Boolean data type

The Boolean data type, with values True and False, is the fundamental unit for implementing boolean conditional expressions.

Boolean comparison operators are used to create elementary conditions. Combination operators allow for building larger conditions by combining multiple conditions.

Conditional control flow blocks, if and match, use conditions.

The idea is based on boolean math. It is essential in controlling the flow of the program based on state of one or more objects in the program.

bool in Python, based on usage, can refer to
- data type
- function
bool type is used to represent boolean values
bool type inherits from int type
bool data type can take value from 2 built-in constants, True and False
- underlying int values are int(True) = 1 and int(False) = 0 respectively
can be stored in variables like other objects
- useful in conditional blocks

Below are some examples for familiarity.

basics

bool(True), int(True), type(True)

>>>  (True, 1, <class 'bool'>)

bool(False), int(False), type(False)

>>>  (False, 0, <class 'bool'>)

storing in variables
- it is useful to name boolean variables like is_<some check>

is_int = True
is_int, bool(is_int), int(is_int), type(is_int)

>>>  (True, True, 1, <class 'bool'>)

11.10.1 Boolean comparison operators

Boolean comparison operators are used for object comparisons and return True or False, if used with compatible object types.

They are mostly used to create boolean conditions which are used in conditional blocks, if and match..case.

When used with sequence types (string, tuple, list)

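equality operators (==, !=) compare content: == returns True only if both sequences have the same length and all corresponding elements are equal
ordering operators (<, >, <=, >=) compare element by element from the start (lexicographic order); the first unequal pair decides the result
membership testing checks for the existence of elements and is the most useful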
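A short sketch of how comparisons behave on sequences: items are compared pairwise from the left, and the first unequal pair decides the result. The values are arbitrary.

```python
print([1, 2, 3] == [1, 2, 3])  # True: same length, equal items
print([1, 2, 3] < [1, 2, 4])   # True: the first difference decides (3 < 4)
print([1, 2] < [1, 2, 0])      # True: a shorter prefix sorts first
print((2,) > (1, 99))          # True: decided at the first item
print("apple" < "banana")      # True
print("Zebra" < "apple")       # True: uppercase code points come before lowercase
print(3 in [1, 2, 3])          # True: membership test
```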

11.10.1.1 Options and syntax

Operation	Python Operator	Comments
basic value comparisons	`==`, `!=`, `<`, `>`, `<=`, `>=`	compares values; different types are OK, but must be compatible
membership testing	`in`, `not in`	used with sequence types and other collections (dict, set)
object id comparison	`is`, `is not`	compares object identity (memory address); works on all types

11.10.1.2 Examples

11.10.1.2.1 Numeric

num_1 = 10; num_2 = 15; num_3 = 10.0

num_1 == num_2, num_1 < num_2, num_1 <= num_3

>>>  (False, True, True)

can be stored in variables

cnd = num_1 > num_3

print(f"{cnd = }, {type(cnd) = }")

>>>  cnd = False, type(cnd) = <class 'bool'>

11.10.1.2.2 Sequence type

Membership testing is more useful for sequence types. Below examples illustrate the usage.

11.10.1.2.2.1 Strings

test character in a string

some_string = "abcd"
some_chr_1 = "a"
some_chr_2 = "e"

some_chr_1 in some_string

>>>  True

some_chr_2 in some_string

>>>  False

test a small string in a longer string

some_long_str = "A reasonably long string"
some_short_str = "long"

some_short_str in some_long_str

>>>  True

11.10.1.2.2.2 Tuples and lists

some_list = [1, 2, 3, 4, (1, 2, 3)]
some_tuple = 1, 2, 3
num_1 = 3; num_2 = 5

num_1 in some_list

>>>  True

some_tuple in some_list

>>>  True

num_2 in some_tuple

>>>  False

store a boolean operation in a variable

cnd = some_tuple in some_list
print(f"{cnd = }, {type(cnd) = }")

>>>  cnd = True, type(cnd) = <class 'bool'>

11.10.1.2.3 None type

Since there is a single None object in a Python session, object id comparison (is) is the usual way to test for it. None is used to signify that a variable is defined but has not been given a meaningful value yet.

Below example illustrates the object id comparison for None type in isolation.

some_var = None
some_var is None

>>>  True

11.10.2 Boolean combination operators

Boolean combination operators use boolean math to provide means of combining multiple comparison operations and conditions to form larger conditions.

Operation	Python Operator	Comments
not	`not x`	inverts the bool value: if `x` is false, then `True`, else `False`
and	`x and y`	test if both conditions are `True`
or	`x or y`	test if any condition is `True`
()	`(cnd1 and cnd2) or (cnd3)`	used to group conditions; conditions inside `()` are evaluated first

basic examples of combining conditions
- a == 10
- a >= 5 and a <= 10
- (a > 0 and a < 10) or (a >= 10 and a < 25)
order of precedence used for evaluation
1. ()
2. comparison operators have same priority (==, !=, <, >, <=, >=)
3. not, then and, then or (highest to lowest priority)
chained comparisons
- are automatically converted to paired and comparisons
- example: a < b < c is same as a < b and b < c
- many other languages do not support this
for conditions with too many nested combinations it is recommended to use ()
- best for code readability
- avoid errors due to precedence order
can be stored in variables like other objects
- and used later in control flow
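The precedence rules and chained comparisons described above can be checked with a small sketch; the value of a is arbitrary.

```python
a = 7

in_range = a >= 5 and a <= 10
chained = 5 <= a <= 10    # same result as the line above
print(in_range, chained)  # True True

# `and` binds tighter than `or`, so the first two lines are equivalent
print(True or False and False)    # True
print(True or (False and False))  # True
print((True or False) and False)  # False: parentheses change the result

# `not` binds tighter than `and`
print(not False and False)        # False, read as (not False) and False
```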

and and or operators are based on logic gates in boolean math. Truth tables, given below, summarize results for logic gates. 0 and 1 are used instead of False and True for better readability.

`x`	`y`	`x and y`	`x or y`
1	1	1	1
1	0	0	1
0	1	0	1
0	0	0	0

There are some additional features which Python provides related to the boolean data type; they are discussed in the Python special features chapter (Section 15.1). They are left out of this section to keep the complexity low at this stage.

11.11 Generic concepts

11.11.1 Iterable unpacking

Iterable unpacking is a special feature of Python. Some features were introduced in Python version 2; more were added in version 3 through PEP 3132: Extended Iterable Unpacking.

* collects the remaining items when used on the left side of an assignment
the starred variable always receives a list, whatever the type of the iterable
advantages
- better code readability
- easier than using indexing
- faster
gives error if there is a mismatch in number of items and variables
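A minimal sketch of the mismatch case: without a starred name, the number of variables must equal the number of items. The variable names are arbitrary, and the exact wording of the error message may differ between Python versions, so only the exception type is shown.

```python
first, second = (1, 2)  # counts match
head, *rest = [1]       # a starred name can receive an empty list
print(first, second, head, rest)  # 1 2 1 []

try:
    x, y = [1, 2, 3]    # three items, two names, no starred name
except ValueError as err:
    error_name = type(err).__name__
print(error_name)       # ValueError
```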

11.11.1.1 Examples

Unpack and assign elements of an iterable to variables
- get first and remaining items of an iterable

some_list = [1, 2, 3, 4]; some_tuple = (1, 2, 3, 4)

first_item = some_list[0]
end_items = some_list[1:]

print(f'{first_item = }, {end_items = }')

>>>  first_item = 1, end_items = [2, 3, 4]

first_item, *end_items = some_list

print(f'{first_item = }, {end_items = }')

>>>  first_item = 1, end_items = [2, 3, 4]

Unpack and assign elements of an iterable to variables
- get last and remaining items of an iterable

some_list = [1, 2, 3, 4]; some_tuple = (1, 2, 3, 4)

begin_items = some_tuple[0:-1]
last_item = some_tuple[-1]

print(f'{begin_items = }, {last_item = }')

>>>  begin_items = (1, 2, 3), last_item = 4

*begin_items, last_item = some_tuple

print(f'{begin_items = }, {last_item = }')

>>>  begin_items = [1, 2, 3], last_item = 4

Unpack and assign elements of an iterable to variables
- get first two, last and remaining middle items of an iterable

some_list = [1, 2, 3, 4, 5, 6, 7]; some_tuple = (1, 2, 3, 4, 5, 6, 7)

first_item, second_item, *remaining_items, last_item = some_tuple

print(f'{first_item = }, {second_item = }')

>>>  first_item = 1, second_item = 2

print(f'{remaining_items = }, {last_item = }')

>>>  remaining_items = [3, 4, 5, 6], last_item = 7

Combine iterables into another

some_list_1 = [1, 2, 3]; some_tuple_1 = (4, 5)
some_tuple_2 = (*some_list_1, *some_tuple_1)
print(f'{some_tuple_2=}')

>>>  some_tuple_2=(1, 2, 3, 4, 5)

** for mapping types - dictionary

some_dict_1 = {"key1": "value1", "key2": "value2.1"}
some_dict_2 = {"key2": "value2.2", "key3": "value3"}
some_dict_3 = {**some_dict_1, **some_dict_2}
print(f'{some_dict_3=}')

>>>  some_dict_3={'key1': 'value1', 'key2': 'value2.2', 'key3': 'value3'}

11.11.2 Implications of mutability

Modify in-place: any modification to the object does not lead to creation of a new object
Copy on modify: create a new copy of object if modified, opposite of modify in-place
Mutable \(\implies\) modify in-place
- Lists and dictionaries can be modified without creation of new object on RAM
Immutable \(\implies\) copy on modify
- Strings and tuples create new objects on RAM if modified

Strictly speaking, an immutable object is never modified: an operation that appears to modify a string or tuple returns a new object and leaves the original untouched, whereas lists and dictionaries are changed in place without creating a new object.

Implications
- flexibility
- efficiency (in terms of speed and memory)
Immutable objects (strings, tuples) are
- efficient for constant data
- less flexible
Mutable objects (lists, dictionaries) are
- less efficient for constant data
- flexible, support in-place modification
- can be more efficient if data keeps changing over time

11.11.2.1 Changing elements

Since strings and tuples are immutable, elements cannot be assigned new values through indexing. This is unlike mutable types, e.g. lists, where this is allowed.

some_string = "abcdee"
some_string[-1] = "f"

>>>  Error: TypeError: 'str' object does not support item assignment

some_tuple = (0, 1, 1)
some_tuple[2] = 2

>>>  Error: TypeError: 'tuple' object does not support item assignment

some_list = [0, 1, 5]
some_list[2] = 2
print(some_list)

>>>  [0, 1, 2]

Strings have methods that appear to change elements, but they follow copy-on-modify: they return a new string and leave the original unchanged.

some_string_1 = "abc"
some_string_2 = some_string_1.replace("a", "b")

print(f'{some_string_1=}, {some_string_2=}\n\
    {some_string_1 is some_string_2 = }')

>>>  some_string_1='abc', some_string_2='bbc'
>>>      some_string_1 is some_string_2 = False

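some_string = "abc"; some_string_orig_id = hex(id(some_string))
print(f'{some_string=}, {some_string_orig_id = }')

>>>  some_string='abc', some_string_orig_id = '0x75bcb4720df0'

print(f'{hex(id(some_string.replace("a", "b"))) = }')

>>>  hex(id(some_string.replace("a", "b"))) = '0x75bca8b839f0'

print(f'{some_string=}, {some_string_orig_id = }')

>>>  some_string='abc', some_string_orig_id = '0x75bcb4720df0'

some_string = some_string.replace("a", "b")
print(f'{some_string=}\n{hex(id(some_string)) = }\n{some_string_orig_id = }')

>>>  some_string='bbc'
>>>  hex(id(some_string)) = '0x75bca8b934b0'
>>>  some_string_orig_id = '0x75bcb4720df0'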

11.11.2.2 Propagation of changes

When mutable objects like lists or dictionaries are passed around through variable assignment, both names refer to the same object, so changes are propagated.

changes made through one name are visible through the other
- this is required in many cases
- but can also lead to a bug
- be aware of the concept
to avoid the default behavior when needed, create a copy using
- constructor list(iterable)
- unpacking [*iterable]
- loops

some_list_1 = [1, 2, "a", "b"]
some_list_2 = some_list_1

print(f'{some_list_1=}, {some_list_2=}\n{some_list_2 is some_list_1 = }')

>>>  some_list_1=[1, 2, 'a', 'b'], some_list_2=[1, 2, 'a', 'b']
>>>  some_list_2 is some_list_1 = True

some_list_1[2] = "abc"

print(f'{some_list_1=}, {some_list_2=}\n{some_list_2 is some_list_1 = }')

>>>  some_list_1=[1, 2, 'abc', 'b'], some_list_2=[1, 2, 'abc', 'b']
>>>  some_list_2 is some_list_1 = True

some_list_2[-1] = "xyz"

print(f'{some_list_1=}, {some_list_2=}\n{some_list_2 is some_list_1 = }')

>>>  some_list_1=[1, 2, 'abc', 'xyz'], some_list_2=[1, 2, 'abc', 'xyz']
>>>  some_list_2 is some_list_1 = True
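
The same behavior applies to dictionaries. The following is a minimal sketch with illustrative names: some_dict_2 is a second name for the same dictionary, while some_dict_3 is a new dictionary built with the dict constructor.

```python
some_dict_1 = {"a": 1, "b": 2}
some_dict_2 = some_dict_1          # second name for the same object
some_dict_3 = dict(some_dict_1)    # new dictionary with the same items

some_dict_1["c"] = 3               # change made through the first name

print(f'{some_dict_2=}\n{some_dict_3=}')
# some_dict_2={'a': 1, 'b': 2, 'c': 3}
# some_dict_3={'a': 1, 'b': 2}
print(f'{some_dict_2 is some_dict_1 = }, {some_dict_3 is some_dict_1 = }')
# some_dict_2 is some_dict_1 = True, some_dict_3 is some_dict_1 = False
```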

Using the constructor creates a new list object instead of passing the original object itself, so later changes to the original do not propagate.

some_list_1 = [1, 2, "a", "b"]
some_list_2 = list(some_list_1)

>>>  some_list_1=[1, 2, 'a', 'b'], some_list_2=[1, 2, 'a', 'b']
>>>  some_list_2 is some_list_1 = False

some_list_1[2] = "abc"

>>>  some_list_1=[1, 2, 'abc', 'b'], some_list_2=[1, 2, 'a', 'b']
>>>  some_list_2 is some_list_1 = False

Unpacking likewise creates a new list object instead of passing the original object itself.

some_list_1 = [1, 2, "a", "b"]
some_list_2 = [*some_list_1]

print(f'{some_list_1=}, {some_list_2=}\n{some_list_2 is some_list_1 = }')

>>>  some_list_1=[1, 2, 'a', 'b'], some_list_2=[1, 2, 'a', 'b']
>>>  some_list_2 is some_list_1 = False

some_list_1[2] = "abc"

print(f'{some_list_1=}, {some_list_2=}\n{some_list_2 is some_list_1 = }')

>>>  some_list_1=[1, 2, 'abc', 'b'], some_list_2=[1, 2, 'a', 'b']
>>>  some_list_2 is some_list_1 = False
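
The third option from the list above, loops, is not shown in the examples. A minimal sketch, assuming a plain element-by-element copy is wanted, could look like this (some_list_3 uses a list comprehension, which is a compact form of the same loop).

```python
some_list_1 = [1, 2, "a", "b"]

# Explicit loop: append each element to a new, empty list.
some_list_2 = []
for item in some_list_1:
    some_list_2.append(item)

# List comprehension: the same loop written in one expression.
some_list_3 = [item for item in some_list_1]

some_list_1[2] = "abc"   # change the original only

print(f'{some_list_1=}\n{some_list_2=}\n{some_list_3=}')
# some_list_1=[1, 2, 'abc', 'b']
# some_list_2=[1, 2, 'a', 'b']
# some_list_3=[1, 2, 'a', 'b']
```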

11.11.2.3 Mutable in immutable

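A tuple is immutable in terms of which objects it holds as elements: an element cannot be replaced with another object.
A contained object that is itself mutable, such as a list, can still be modified in place.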

some_list = [1, 2, 3, 4, 5]
some_tuple = (some_list, "some other object")

print(f'{some_list=}\n{some_tuple=}\n{some_list is some_tuple[0] = }')

>>>  some_list=[1, 2, 3, 4, 5]
>>>  some_tuple=([1, 2, 3, 4, 5], 'some other object')
>>>  some_list is some_tuple[0] = True

some_list.pop()

>>>  5

print(f'{some_list=}\n{some_tuple=}\n{some_list is some_tuple[0] = }')

>>>  some_list=[1, 2, 3, 4]
>>>  some_tuple=([1, 2, 3, 4], 'some other object')
>>>  some_list is some_tuple[0] = True
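
The distinction can also be seen directly on the tuple. In the sketch below (the flag name replaced is illustrative), replacing the first element raises TypeError, while modifying the list stored in that position succeeds.

```python
some_tuple = ([1, 2, 3], "some other object")

# Replacing an element of the tuple is not allowed.
try:
    some_tuple[0] = [9, 9, 9]
    replaced = True
except TypeError:
    replaced = False

# Modifying the list that the tuple holds is allowed.
some_tuple[0].append(4)

print(f'{replaced=}\n{some_tuple=}')
# replaced=False
# some_tuple=([1, 2, 3, 4], 'some other object')
```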

If only the contents are needed and propagation is to be avoided, use unpacking or the tuple constructor to copy the elements into a new tuple.

some_list = [1, 2, 3, 4, 5]
some_tuple_1 = *some_list,; some_tuple_2 = tuple(some_list)

print(f'{some_list=}\n{some_tuple_1=}\n{some_tuple_2=}')

>>>  some_list=[1, 2, 3, 4, 5]
>>>  some_tuple_1=(1, 2, 3, 4, 5)
>>>  some_tuple_2=(1, 2, 3, 4, 5)

print(f'{some_list is some_tuple_1 = }')

>>>  some_list is some_tuple_1 = False

print(f'{some_list is some_tuple_2 = }')

>>>  some_list is some_tuple_2 = False

some_list.pop()

>>>  5

print(f'{some_list=}\n{some_tuple_1=}\n{some_tuple_2=}')

>>>  some_list=[1, 2, 3, 4]
>>>  some_tuple_1=(1, 2, 3, 4, 5)
>>>  some_tuple_2=(1, 2, 3, 4, 5)

print(f'{some_list is some_tuple_1 = }')

>>>  some_list is some_tuple_1 = False

print(f'{some_list is some_tuple_2 = }')

>>>  some_list is some_tuple_2 = False

11.11.2.4 Shallow vs deep copy

A shallow copy creates a new object for the collection being copied, but does not create new objects for its items. If the items are themselves collections, the copy and the original share them, which can have side effects.

The regular copy method available on mutable collections (list, dictionary, set) makes a shallow copy. Strings and tuples do not have a copy method.

This works fine if the elements of the collection are immutable objects like numbers, strings or tuples, but if there are mutable elements like lists or dicts, then changes to them propagate to the copy.

This might not be desirable at times, so a deep copy is needed, which creates new objects for the elements of the original container by going through the nested structure of the collection recursively.

The standard library has a copy module, which provides the deepcopy function to achieve this.

The example below illustrates the point. Experimenting with it is recommended to understand the concept.

some_list_1 is a list containing a list, a tuple and a dictionary.

some_list_2 is a regular copy, so it is a different object from some_list_1, but its elements point to the same underlying some_list, some_tuple and some_dict.

some_list_3 is a regular (shallow) copy made with the copy module, so it behaves like some_list_2.

some_list_4 is a deep copy made with the copy module, so its mutable elements (the list and the dictionary) are new objects as well.

import copy
some_list = [1,2,3]; some_tuple = (4, 5); some_dict = {"six": 6, "seven": 7}
some_list_1 = [some_list, some_tuple, some_dict]
some_list_2 = some_list_1.copy()
some_list_3 = copy.copy(some_list_1)
some_list_4 = copy.deepcopy(some_list_1)

>>>  some_list_2 is some_list_1 = False

>>>  some_list_3 is some_list_1 = False

>>>  some_list_4 is some_list_1 = False

>>>  some_list_2[0] is some_list_1[0] = True

>>>  some_list_3[0] is some_list_1[0] = True

>>>  some_list_4[0] is some_list_1[0] = False
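
The identity checks above show which objects are shared. As a minimal follow-up sketch (with illustrative names inner, original, shallow and deep), the snippet below modifies the nested objects and shows that the change reaches the shallow copy but not the deep copy.

```python
import copy

inner = [1, 2, 3]
original = [inner, (4, 5), {"six": 6}]
shallow = original.copy()          # new outer list, shared inner objects
deep = copy.deepcopy(original)     # new outer list, new inner list and dict

inner.append(99)                   # modify the nested list
original[2]["seven"] = 7           # modify the nested dictionary

print(f'{shallow[0]=}, {deep[0]=}')
# shallow[0]=[1, 2, 3, 99], deep[0]=[1, 2, 3]
print(f'{shallow[2]=}, {deep[2]=}')
# shallow[2]={'six': 6, 'seven': 7}, deep[2]={'six': 6}
```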
