This is not a programming tutorial, but rather a quick introduction to Python 3 for people who already know how to program in another imperative language. Only basic features of the language are covered, and this excludes object-oriented programming, which will not be needed for classwork. If you struggle through this document, or want to know more about Python 3, please follow the Python tutorial online. In any case, please refer to the extensive Python documentation for more detailed instructions on how to install Python, reference documents on both the language and its libraries, information on how Python 3 differs from previous versions, and much more.
Python in all its versions is widespread in both industry and academia. The SciKit library comes with the Anaconda distribution of Python and is in any case straightforward to install if you have a different distribution. The module scikit-learn has quite a bit of software for machine learning. Other languages do too, but Python's software is more broadly supported and used, and is free.
We will learn the basics of Python 3 by studying some sample code. The math underlying the sample is introduced in the next Section, and the Section thereafter lists code that can be used to explore the math. The rest of this document introduces various aspects of Python 3 syntax and semantics using the sample code as a running example.
Consider the matrix
$$ A = \left[\begin{array}{cccc}3 & 4 & 1 & 2\\5 & 0 & 7 & 3\\7 & 2 & 6 & 9\\1 & 8 & 3 & 0 \end{array}\right]\;. $$
The entries appear in what seems random order, both in the rows (horizontal) and the columns (vertical). Let us now sort the values in each of the rows independently to obtain a new matrix
$$ B = \left[\begin{array}{cccc}1 & 2 & 3 & 4\\0 & 3 & 5 & 7\\2 & 6 & 7 & 9\\0 & 1 & 3 & 8 \end{array}\right] $$
(check that the entries in each row of $B$ are the same as those in the corresponding row of $A$, but in non-decreasing order). Let us now sort each of the columns of $B$. This yields
$$ C = \left[\begin{array}{cccc}0 & 1 & 3 & 4\\0 & 2 & 3 & 7\\1 & 3 & 5 & 8\\2 & 6 & 7 & 9 \end{array}\right]\;. $$
As expected, since we just sorted the columns of $B$ to obtain $C$, the columns of $C$ contain values in non-decreasing order. Perhaps unexpectedly, however, the rows of $C$ are also still sorted in non-decreasing order! (Check this.) Why did sorting the columns of $B$ not mess up the ordering within each of its rows? Is this a coincidence for a carefully contrived matrix $A$?
There is a theorem, sometimes referred to as the no messing-up theorem, that states that this is not a coincidence, and works for any matrix, square or rectangular.
The proof of the no messing-up theorem is simple. Here, however, we will merely experiment with this mathematical fact to introduce some Python 3 code. We need a function that sorts numbers or other values, some data structure and code that lets us work with matrices, and two functions that sort the rows and columns of a matrix, respectively. While the `numpy` Python library implements matrices for you, we will use very simple, vanilla Python 3, rather than this library, so we explore some of the basic constructs of the language.
What follows is minimal code. If you were to write your own matrix manipulation library, it would likely have to be fancier than this. The text below refers to this code by line number: lines are counted from 1 at the first comment line, and blank lines count as well.
# Merge sort is a very elegant, recursive sorting algorithm.
# It splits the list to be sorted in half, sorts each half recursively,
# and then merges the two sorted halves by comparing their heads iteratively.

def mergeSort(lst, before = lambda a, b: a < b):
    '''Sort the list lst in place by the comparison criterion before (default is "<")'''
    if len(lst) > 1:
        mid = len(lst) // 2
        left = lst[:mid]
        right = lst[mid:]

        mergeSort(left, before)
        mergeSort(right, before)

        (i, j, k) = (0, 0, 0)
        while i < len(left) and j < len(right):
            if before(left[i], right[j]):
                lst[k] = left[i]
                i += 1
            else:
                lst[k] = right[j]
                j += 1
            k += 1

        while i < len(left):
            lst[k] = left[i]
            i += 1
            k += 1

        while j < len(right):
            lst[k] = right[j]
            j += 1
            k += 1

def checkMatrix(a):
    '''Is the argument a list of equal-length lists of items of the same type?
    If so, return a pair with the matrix dimensions.
    Otherwise, raise a type error.'''
    if type(a) != list or type(a[0]) != list:
        raise TypeError('not a list of lists')
    n = len(a[0])
    if n > 0: t = type(a[0][0])
    for row in a:
        if len(row) != n: raise TypeError('rows of different lengths')
        if n > 0:
            for x in row:
                if type(x) != t: raise TypeError('items of different types')
    return (len(a), len(a[0]))

def printMatrix(a):
    '''Print a matrix (very basic: does not align columns)'''
    checkMatrix(a)
    for row in a: print(*row, end = '\n')

def sortRows(a):
    '''Sort each row of a matrix in non-descending order'''
    checkMatrix(a)
    for row in a: mergeSort(row)

def sortCols(a):
    '''Sort each column of a matrix in non-descending order'''
    (m, n) = checkMatrix(a)
    for j in range(n):
        col = [row[j] for row in a]
        mergeSort(col)
        for i in range(m): a[i][j] = col[i]
We can check the no messing-up theorem for a bigger matrix than the one in the example, to show that the matrix need not be square. The entries could be more than single digits, but the `printMatrix` function is very simple, and does not line up columns nicely unless all the numbers have the same length. (A useful exercise for you is to rewrite `printMatrix` and fix that.) Here is a sample run:
A = [[3, 4, 1, 2, 6], [5, 0, 7, 3, 4], [7, 2, 6, 9, 3], [1, 8, 3, 0, 8]]
printMatrix(A)
sortRows(A)
printMatrix(A)
sortCols(A)
printMatrix(A)
Rows and columns are both sorted in non-decreasing order. Of course, just trying some examples is not a proof. We will study how to write proofs later in this course.
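To go beyond a single example, the experiment can be repeated on many random matrices. The following is a minimal sketch and not part of the sample code: it uses Python's built-in `sorted` instead of `mergeSort` so that it is self-contained, and the helper names `sortRowsThenCols` and `rowsSorted` are made up for illustration. It also uses a few constructs (`zip`, `all`) that this document does not cover.

```python
import random

def sortRowsThenCols(a):
    '''Return a new matrix: the rows of a are sorted first, then the columns.'''
    b = [sorted(row) for row in a]              # sort each row
    cols = [sorted(col) for col in zip(*b)]     # sort each column of b
    return [list(row) for row in zip(*cols)]    # transpose back to a list of rows

def rowsSorted(a):
    '''True if every row of a is in non-decreasing order.'''
    return all(row[j] <= row[j + 1] for row in a for j in range(len(row) - 1))

random.seed(0)   # fixed seed, so that runs are repeatable
for trial in range(1000):
    m, n = random.randint(1, 6), random.randint(1, 6)
    a = [[random.randint(0, 9) for _ in range(n)] for _ in range(m)]
    assert rowsSorted(sortRowsThenCols(a))      # would fail on a counterexample
```

A thousand successful trials are still not a proof, but a single failing `assert` would be a counterexample.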
Let us now study the sample code. You may want to open this page in a separate browser window and place the two windows side by side as you read on. You will also want to have an interpreter handy to experiment in. The file mergeSort.py contains the sample code.
Everything following a hash character `#` to the end of a line is a comment (see lines 1-3).
An instruction goes on a single line by itself. There is no semicolon or anything else at the end of an instruction. However, if a line is too long, you can put a backslash character \ and continue on the next line. Often, if you have a very long line there is a way to rewrite your code more succinctly, and this solution is then preferable.
In many other languages, a block of instructions is enclosed by a pair of delimiters such as curly braces or `begin` and `end`. In Python, a block is opened by a colon on the line that precedes it, and is then indented one level deeper than the preceding line. There are many blocks in the sample code. Looking at lines 16-23, the colon at the end of line 16 starts a block that ends on line 23. This block contains two subblocks (introduced by `if` and `else`) and the lone instruction on line 23. Note that this instruction is inside the `while` construct, but outside the `else`, because of the way it is indented. Indentation matters in Python!
Everything in Python is an object. Names such as `a` or `var1` or `__init__` are used to denote objects. Functions are objects as well, and they are almost first-class citizens in Python. This means that they can be passed to and returned by functions, and referred to by name.
A difference from statically typed languages is that in Python object names are not declared as being of a certain type. Names are type-free. The objects they denote, on the other hand, have a type, and the Python interpreter checks that operations are performed on objects whose types are legal for that operator. For instance, if we define
a = 3
b = 2.4
c = 'word'
then we can check the types of `a`, `b`, `c` as follows:
type(a)
type(b)
type(c)
and if we subsequently assign `1.2` to `a`, then `type(a)` returns `float`.
However, the following is an error, because two objects of incompatible types are being compared:
a < c
Spend a little time understanding what this error message tells you. Parsing this information is an important first step to debugging your code.
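The text of an error message can also be examined from within a program. The sketch below is an illustration only, and it uses the `try` and `except` keywords, which this document does not otherwise cover. The exact wording of the message depends on the version of Python, so it is printed rather than assumed.

```python
a = 1.2
c = 'word'
try:
    a < c                       # comparing a float with a string
except TypeError as err:        # the comparison raises a TypeError
    message = str(err)          # the text of the error message
    print(type(err).__name__, ':', message)
```

The message names both the operator and the two types involved, which is usually enough to find the offending expression.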
Line 35 in the sample code above contains the header of a simple function definition. It starts with the keyword `def`, which is followed by the name of the function and the list of arguments in parentheses. If there is more than one argument, commas are used to separate them. The body of the function is a block that ends on line 48. It is good style to separate function definitions with blank lines for readability, but the interpreter ignores blank lines. Only indentation tells us where a block ends.
The function definition on line 5 is a mouthful, and illustrates several features of Python. First, it has two arguments named `lst` and `before`. The name `lst` is awkward, and `list` would be preferable. However, `list` is the name of a built-in Python type. It would be OK to use `list` as the first argument name, but then the built-in `list` would be inaccessible by that name inside the function. This does not matter in `mergeSort`. Also, if you do not know that `list` is a built-in object, you won't use the built-in object in your code, so nothing bad happens. However, others who do know of the built-in `list` may be confused when reading your code, so it is best to avoid redefining built-in objects.
Second, the argument `before` has a default value, introduced by the `=` sign. This means that you can call the function `mergeSort` without specifying the value of `before`, and the function will then use the default. You can override the default by passing an argument that is then bound to `before`. Of course, for this to work, arguments with default values must be after any arguments without default values. Python has other ways to refer to arguments, but we will ignore these.
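Default values are easiest to understand on a function smaller than `mergeSort`. The function `power` below is a made-up example, not part of the sample code.

```python
def power(base, exponent = 2):
    '''Raise base to exponent (hypothetical example; the default is squaring)'''
    return base ** exponent

power(3)        # 9: the default exponent 2 is used
power(3, 3)     # 27: the second argument overrides the default
```

Writing `def power(exponent = 2, base):` instead would be a syntax error, because an argument without a default cannot follow one that has a default.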
Third, the default value of `before` is an anonymous function, that is, a function without a name. Matlab programmers will be used to anonymous functions, but Java programmers may not be. A Python anonymous function starts with the keyword `lambda`, is followed by the (comma-separated) list of arguments without parentheses, then by a colon and a single expression, with no `return` statement. Nothing more complex is allowed, and this is one reason why Python functions are only almost first-class citizens.
For instance, you could sort a list in reverse order (large to small) by calling
a = [3, 1, 2]
mergeSort(a, lambda x, y: x > y)
a
This is exactly equivalent to the following code, but is more concise:
a = [3, 1, 2]
def gt(x, y): return x > y
mergeSort(a, gt)
a
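As an aside, note that a single statement (as opposed to a block with multiple statements) can be placed right after a colon, rather than on its own new line, as done in the definition of `gt` above. However, the statement placed there must be a simple one: a construct that opens a block of its own, such as `if` or `for`, cannot follow a colon on the same line.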
To sort in non-decreasing order, you can omit the second argument to `mergeSort`, and just say
a = [3, 1, 2]
mergeSort(a)
a
Many objects, including functions, can have fields, which need not be declared. (Objects of some built-in types, such as numbers and lists, do not accept new fields.) You could add a field to `mergeSort` like this:
mergeSort.author = 'Carlo Tomasi'
and then access it like this:
mergeSort.author
Lines 6 and 36-38 in the sample code show examples of doc strings, which document functions. They are optional, and by convention they are enclosed in triple quotes (single or double). These are not just comments, but are rather available to the code through the `__doc__` field of a function object. For instance,
mergeSort.__doc__
Fields like `__doc__` that start and end with two underscore characters are built in by convention.
Name aliasing occurs when there are several names for the same object. This may be the greatest source of danger, especially for Matlab programmers, so please read this section carefully and try all the examples yourself.
The function `mergeSort` works on the input list `lst` in place. That is, it does not return a sorted copy of `lst`, but rather rearranges the contents of the original `lst`. Once `mergeSort` returns control to the caller, whatever list was bound to the argument `lst` will be sorted:
a = [3, 1, 2]
mergeSort(a)
a
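This is name aliasing, since the same list `[3, 1, 2]` has name `a` at the command prompt and `lst` inside `mergeSort`. There is more: `mergeSort` calls itself recursively (on lines 12 and 13), so at any one time during execution of this function there may be many invocations of `mergeSort` on the call stack. The `lst` names in two different invocations are considered different names, and they also refer to different objects: the slices on lines 9 and 10 create new lists `left` and `right`, each holding a copy of half of the items, and these new lists are what the recursive invocations see as `lst`. Each recursive call sorts its own list in place, and the loops on lines 16-33 then merge the two sorted halves back into the caller's `lst`. Only in the outermost invocation is `lst` an alias for the list whose name at the command prompt is `a`, which is initially `[3, 1, 2]` and then changes during execution.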
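The `is` operator tests whether two names denote the very same object, which gives a direct way to detect aliasing. The sketch below is an illustration with a made-up helper named `sameObject`. It also shows that a slice such as `a[:]` builds a new list with the same items, rather than another name for the old list.

```python
a = [3, 1, 2]

def sameObject(lst):
    '''Hypothetical helper: is the argument the very list named a?'''
    return lst is a

sameObject(a)       # True: inside the call, lst and a are two names for one list
sameObject(a[:])    # False: a slice is a new list
a[:] == a           # True: the slice has equal contents, but is a different object
```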
Name aliasing can occur without any function calls, as shown in the following example:
a = [3, 1, 2]
b = a
b[0] = 0
b
So far there is nothing surprising: We changed entry number `0` in `b` (yes, indices start at `0` in Python), and the change is visible when we inspect the value of `b`. However, changing `b` results in `a` changing as well, because `a` and `b` are aliases for the same object:
a
Name aliasing in Python has visible consequences only for so-called mutable objects such as lists, that is, objects whose contents can be changed in place. The rules for distinguishing between mutable and immutable objects are not straightforward, and even the distinction between mutable and immutable is not unequivocal. For our purposes, numbers, strings and tuples are immutable, while dictionaries and lists are mutable. Here is an example with numbers:
n = 3.2
m = n
m = m + 1
n
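The value that name `n` refers to did not change after changing `m`: the assignment `m = m + 1` binds `m` to a new number rather than modifying the old one, so after it `m` is no longer an alias for `n`. Similarly for strings: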
s = "some string"
t = s
t = "awe" + t
t
We used the `+` symbol for string concatenation. Again `t` is not an alias for `s` because strings are immutable, and `s` still has its old value:
s
Python's rules for name aliasing get confusing when immutable objects contain mutable ones, and this is particularly the case with tuples. The examples that follow show the issue, and the conclusion for us is that tuples are best left unused.
Line 15 in `mergeSort` shows an example of tuples, which are immutable:
(i, j, k) = (0, 0, 0)
The triple (a triple is a t-uple with $t=3$) `(0, 0, 0)` contains numbers, which are immutable, but the triple `(i, j, k)` contains names, and these can be bound to different objects, possibly mutable. Let us try the following:
t = (2, [1, 3], 0)
t
which is an (immutable) triple whose second element (element number 1) is a (mutable) list. What happens if we now try to change the second element by appending the number 5 to the list?
t[1].append(5)
t
This change is allowed! However, if we try to assign a new value to `t[1]`, we get an error:
t[1] = [1, 3, 5, 7]
If we try to assign anything to `t[0]` we get an error as well:
t[0] = 4
The rule that these examples seem to imply is that elements of an immutable object such as a tuple cannot be reassigned, whether they are mutable or not. However, mutable elements of immutable objects can be changed by operations other than assignment. This is confusing, and is one place where Python really fails to shine. To avoid confusion, it is best for us to ignore tuples altogether, except perhaps as convenient shorthands in multiple assignments, as in line 15 of the sample code.
So our abridged rule for mutability is as follows: Numbers and strings are immutable, while dictionaries and lists are mutable. Forget about tuples. Even without tuples, think about aliasing carefully when you program, or else results can be unpredictable. Aliasing is the main reason why functional programming languages such as Scheme are simpler to implement and understand than imperative languages like Python.
Since `mergeSort` works in place, it need not return anything when it is done: its main effect is a side effect, that is, a change to an object that is not explicitly returned. We could have added a return statement
return lst
after line 33 (at the same level of indentation as the `if` in line 7), so you could then say
b = mergeSort(a)
but this would be very confusing: not only do you have a sorted version of the original `a` in `b`, but `a` itself would be sorted after this statement (in fact, `a` and `b` would be aliases for the same list). So let us stick with the original definition of `mergeSort`. If you wanted `mergeSort` to have no side effects, its body would make a copy of `lst` as the first order of business:
c = lst.copy()
(this is a standard way in Python to make a shallow copy of a list), then work on `c` instead of `lst`, and finally return `c`.
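The copy-then-return pattern can be sketched as follows. The function name `sortedCopy` is made up, and to keep the example self-contained it calls the built-in list method `sort` where the text would call `mergeSort`.

```python
def sortedCopy(lst):
    '''Return a sorted copy of lst and leave lst itself untouched.'''
    c = lst.copy()      # shallow copy: a new list with the same items
    c.sort()            # built-in in-place sort, standing in for mergeSort(c)
    return c

a = [3, 1, 2]
b = sortedCopy(a)
b                       # [1, 2, 3]
a                       # [3, 1, 2]: the original is unchanged
```

The price of having no side effects is the memory and time needed for the copy.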
The function `checkMatrix` (lines 35-48) is the only function in the code sample that has a `return` statement at the end. However, this function has side effects as well, albeit of a different nature: If the input `a` fails one of the checks made by the `if` statements, a `TypeError` exception is raised, and the program aborts.
Lines 7 and 17 are examples of conditional statements, and line 16 is an example of a `while` loop header with a condition. There is really nothing very different here, relative to other languages, except that logical connectives are the English words `and`, `or`, `not` rather than symbols such as bars or ampersands. The two Boolean values `True` and `False` are also provided.
The `and` and `or` operators only go as far as needed in their evaluations. For instance, when evaluating
a and b
the interpreter first evaluates `a`. If `a` is false, then `b` will not be evaluated, because its logical value does not matter, and the result of the `and` is false. Similarly, when evaluating
a or b
the clause `b` is evaluated if and only if `a` is false.
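This behavior is often used to guard an operation that would otherwise fail. A minimal sketch, with a made-up function name:

```python
def ratioAbove(num, den, bound):
    '''Hypothetical helper: is num / den > bound? A zero den counts as "no".'''
    return den != 0 and num / den > bound   # the division runs only if den != 0

ratioAbove(6, 2, 1)     # True
ratioAbove(6, 0, 1)     # False, and no ZeroDivisionError is raised
```

Swapping the two operands of `and` would defeat the guard, because the division would then be evaluated first.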
There is no `switch` statement in Python (versions 3.10 and later add a more general `match` statement, which we will not use), but there is an `elif`, which stands for "else if":
day = 'today'
if day == "Sunday": n = 0
elif day == "Monday": n = 1
elif day == "Tuesday": n = 2
elif day == "Wednesday": n = 3
elif day == "Thursday": n = 4
elif day == "Friday": n = 5
elif day == "Saturday": n = 6
else: print("What does '", day, "' mean?", sep='')
Incidentally, the example above shows that strings can be delimited by either single or double quotes. You can use double quotes when the string contains single quotes (as in the call to `print` above), and vice versa. Alternatively, quotes can be escaped by a backslash. Strings in triple quotes (`''' ... '''`) or equivalently triple double quotes (`""" ... """`) can span multiple lines, and
s = '''Abra
cadabra'''
is exactly the same as
s = 'Abra\ncadabra'
but perhaps more legible. We already encountered triply-quoted strings when talking about function doc strings.
Python also provides a convenient construct called the ternary conditional:
x = 3
parity = "even" if x % 2 == 0 else "odd"
parity
This is a conveniently concise expression equivalent to the following:
x = 3
if x % 2 == 0: parity = "even"
else: parity = "odd"
parity
In these expressions, `%` is the modulo operator, that is, `x % 2` is the remainder of the integer division of `x` by `2`. (Integer division itself is denoted by `//`, so that `x // 2` is `1` when `x` is `3`.) Since a nonzero number counts as true in a condition, we could also have written
x = 3
parity = "odd" if x % 2 else "even"
parity
The `range` function is a simple way to generate an interval of consecutive integers:
list(range(3, 7))
Note that the resulting list starts at 3 (the first argument to `range`), but ends at 6, just before the value of the second argument. The first argument can be omitted and then defaults to zero:
list(range(7))
The expression `range(3, 7)` does not generate the actual list of values in the interval, but rather an iterable object, that is, an object that returns the successive items of the desired sequence when needed. The `list` function then takes the iterable object and transforms it into an explicit list.
The reason why `range` itself does not generate the list is that the list may be very long, and not all of it may be needed. For instance, `range(1000000)` represents a list of one million entries, but the object returned by this call is very small. If the entries in the list are used one at a time, as in a `for` loop, it would be wasteful to store the entire list. A typical use scenario for `range` is shown in lines 63 and 66 of the sample code. Line 63 is
for j in range(n):
where `n` is the number of columns in a matrix.
The range function can take a third argument that represents the stride in the sequence. For instance,
list(range(-3, 6, 2))
lists every other integer starting from $-3$ and up to but not including $6$.
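The following sketch, which is not part of the sample code, shows that a `range` object can be queried and looped over without ever being turned into a list.

```python
r = range(1000000)
len(r)              # 1000000, although no list of that length is stored
r[10]               # 10: items are computed on demand
999999 in r         # True

total = 0
for i in range(0, 10, 3):   # i takes the values 0, 3, 6, 9
    total += i
total               # 18
```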
A subtler example of a `for` loop is line 58 in the function `sortRows` of the sample code:
for row in a: mergeSort(row)
To understand what this means, note that we represent matrices as lists of lists:
a = [[3, 4, 1, 2], [5, 0, 7, 3], [7, 2, 6, 9], [1, 8, 3, 0]]
printMatrix(a)
(the `printMatrix` function is on lines 50-53, and is used to make matrices more legible). The `checkMatrix` function on lines 35-48 checks that the list of lists has the correct format: all rows have the same length, and all entries have the same type.
Since name `a` in `sortRows` (lines 55-58) denotes a list of lists, its individual elements are lists: `a[0]` is `[3, 4, 1, 2]`, and so forth. A list is an iterable object, and the construct
for row in a
means the following: "Iterate over all the objects in the iterable object `a`. For each of them, bind the name `row` to it and execute the body of the `for` loop." So with the matrix `a` given above, `row` is `[3, 4, 1, 2]` in the first iteration, `[5, 0, 7, 3]` in the second, and so forth.
You can break out of a loop with a `break` statement.
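As an illustration, the sketch below looks for the first row of a matrix that contains a zero and stops as soon as it finds one. The name `found` is made up, and the test `0 in a[i]` uses the `in` operator, which checks whether a value occurs in a list.

```python
a = [[3, 4, 1, 2], [5, 0, 7, 3], [7, 2, 6, 9], [1, 8, 3, 0]]
found = -1                  # -1 means "no such row"
for i in range(len(a)):
    if 0 in a[i]:           # does row i contain a zero?
        found = i
        break               # leave the loop at the first hit
found                       # 1: row number 1 is the first one with a zero
```

Without the `break`, the loop would run to the end and `found` would end up holding the last matching row instead.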
The right-hand side of line 64 of the sample code is an example of a list comprehension:
col = [row[j] for row in a]
This line is equivalent to the following sequence of instructions:
col = []
for row in a: col.append(row[j])
For example:
a = [[3, 4, 1, 2], [5, 0, 7, 3], [7, 2, 6, 9], [1, 8, 3, 0]]
j = 2
col = [row[j] for row in a]
col
which is the third column (column number 2) of `a`. The Python manual has much more about comprehensions. However, comprehensions are just convenient constructs introduced for succinctness, similarly to the ternary conditional introduced earlier. You can program without them.
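Two further uses of comprehensions are sketched below, as illustrations beyond what the sample code needs: a comprehension nested inside another one, and a comprehension with a filtering condition. The names `transposed` and `evens` are made up.

```python
a = [[3, 4, 1, 2], [5, 0, 7, 3], [7, 2, 6, 9], [1, 8, 3, 0]]

# All the columns at once, that is, the transpose of a
transposed = [[row[j] for row in a] for j in range(len(a[0]))]
transposed[2]           # [1, 7, 6, 3], the same column as in the example above

# A comprehension can also filter: keep only the even entries of row number 2
evens = [x for x in a[2] if x % 2 == 0]
evens                   # [2, 6]
```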
Elements of a sequence such as a list or a string can be accessed by an index enclosed in square brackets, as we have already seen in the expression `row[j]` and a few other examples in this document. Indices start at 0 in Python.
Multiple elements can be extracted by slicing the sequence, that is, using a construct of the form
[start:stop]
or
[start:stop:stride]
If you are familiar with Matlab, you know what this means, but pay attention to the different ordering (in Matlab, the stride is between `start` and `stop`, not at the end).
This notation, when used for indexing, extracts the elements of a sequence starting at `start` and ending before `stop` with the given `stride`, if specified (just as `range` does with its two or three arguments). The `stride` cannot be zero.
If `start` is omitted (but not the first colon), the slice starts at the first element of the sequence (index 0) if `stride` is positive, or at the last element if `stride` is negative. If `stop` is omitted (but not the first colon), the slice runs through the last element of the sequence if `stride` is positive, or through the first element if `stride` is negative. If the stride is omitted (with or without the second colon) it defaults to 1.
Examples:
s = 'lacerated'
s[0:4]
and the last expression is equivalent to `s[:4]` and to `s[:4:]` and to `s[:4:1]` and to `s[0:4:]` and so forth. A few more examples, most of which have synonyms:
s[4:]
s[::3]
s[1::3]
s[2::3]
s[5:2:-1]
(pay attention to the last one: the character at position 2 is excluded).
s[::-1]
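which is a neat way to reverse a string or another sequence. It may seem that the last expression expands logically to `s[len(s)-1:-1:-1]`, but a negative index in a slice counts from the end of the sequence, so a `stop` of `-1` denotes the last position rather than the position before the first, and Python returns an empty string if you type just that: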
s[len(s)-1:-1:-1]
This may be surprising, since `range`, whose arguments are plain integers rather than indices, produces the expected positions:
list(range(len(s)-1, -1, -1))
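Negative indices count from the end of a sequence, so that `s[-1]` is the last character. The sketch below, which is not part of the original examples, shows how this affects slices with a negative stride, and how `None` can stand for an omitted bound.

```python
s = 'lacerated'
s[-1]                   # 'd': the last character
s[-3:]                  # 'ted': the last three characters
s[len(s)-1:-1:-1]       # '': the stop -1 means "the last position"
s[len(s)-1::-1]         # 'detarecal': the stop is omitted
s[len(s)-1:None:-1]     # 'detarecal': None spells out an omitted bound
```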
In the examples throughout this document, we displayed objects bound to names by just typing the names at the command prompt. This works in an interpreter, but not in a module, which does not print out anything unless explicitly asked to. The `print` function is used to that effect. Line 53 of the sample code has an example:
print(*row, end = '\n')
Let us analyze this statement. The star before `row` is an unpacking operator. It takes a list and unpacks it into a sequence of values for functions that require multiple arguments. So if `row` is `[3, 4, 1, 2]`, then
print(*row)
is the same as
print(3, 4, 1, 2)
So the star effectively "erases the square brackets" of the list. This operator can be used only where it makes sense, such as in an argument list and a few other places.
This example shows that the `print` function can take any number of arguments. It will convert them to strings and print them, separated by a single blank space by default. You can change the default by adding a named argument `sep` that specifies a different string. For instance
sep = ', '
will put a comma and a blank space between elements:
row = [3, 4, 1, 2]
print(*row, sep = ', ', end = '\n')
The named-parameter assignment `end = '\n'` in the example tells `print` to put a newline character when it is done printing. This is also the default, so the assignment only makes the behavior explicit.
Again, there is much more about `print` in the Python manual.
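A value of `end` other than the newline character is useful when several calls to `print` should write on the same line. The sketch below is an illustration only. It also builds the separated string explicitly with the string method `join`, which is handy when the text is needed as a value rather than as printed output.

```python
row = [3, 4, 1, 2]
for x in row:
    print(x, end = ' ')     # stay on the same line after each item
print()                     # no arguments: just end the line

# The string that print(*row, sep = ', ') would write, built explicitly
line = ', '.join(str(x) for x in row)
print(line)
```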