Building Blocks¶
Premature optimization is the root of all evil (D. Knuth).
AKA “Get it right first, then make it fast”.
This lecture draws on material already covered in the following tutorials.
We follow the concise Software Carpentry Testing Tutorial authored by Dr. Katy Huff.
Also this one by Dr. Katy Huff.
1. Motivation¶
Let’s start by looking at some of the reasons why continuously testing our code is a good practice that produces better and more reproducible code.
1.1. Numerical precision¶
As we saw in the notebook simple-numerical-chaos.ipynb, even simple arithmetic in computers can produce surprising numerical behavior. This means that, especially when we handle lots of data, we should always strive to validate that our code produces the answers we expect.
In brief, the basic issue is that even two algebraically equivalent forms of the same (simple!) expression, in a computer, may give different results:
def f1(x): return r*x*(1-x)
def f2(x): return r*x - r*x**2
r = 3.9
x = 0.8
print('f1:', f1(x))
print('f2:', f2(x))
print('difference:', (f1(x)-f2(x)))
f1: 0.6239999999999999
f2: 0.6239999999999997
difference: 2.220446049250313e-16
The decimal digits of the difference are just garbage: neither f1(x) nor f2(x) carries any information beyond its last digit. The apparent precision in the difference f1(x) - f2(x) is completely spurious.
This raises the question of what it means to get the right answer from our code, and what it means to be reproducible in scientific computing.
This short example helps us understand what is important in the context of computational science.
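One practical response is to compare floating point results relative to the machine epsilon instead of testing for exact equality. The sketch below reuses f1 and f2 from above. The helper name agree_within and the tolerance of four epsilons are illustrative assumptions, not a general rule.

```python
import sys

r = 3.9

def f1(x):
    return r * x * (1 - x)

def f2(x):
    return r * x - r * x**2

def agree_within(a, b, n_eps=4):
    """Hypothetical helper: True if a and b differ by at most n_eps machine
    epsilons, scaled by the larger magnitude (a relative comparison)."""
    eps = sys.float_info.epsilon  # gap between 1.0 and the next larger float
    return abs(a - b) <= n_eps * eps * max(abs(a), abs(b))

x = 0.8
print("exactly equal:       ", f1(x) == f2(x))
print("equal within 4 eps:  ", agree_within(f1(x), f2(x)))
```

The right tolerance depends on the computation: long chains of operations accumulate more rounding error than a single expression, so a few epsilons is only reasonable for short formulas like this one.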
1.2. Implementing or changing features¶
Testing also helps us when we want to make significant changes to our code and need to ensure that its functionality is not affected by those changes. These cases include
Adding a new function/feature that communicates with other existing pieces of code.
Making changes to the implementation of an existing function, for example by changing the data types or the algorithm we use for certain operations.
Changing the data we feed into our code.
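As a minimal sketch of the second case, suppose we replace a loop-based function with a closed-form formula. The function names below are hypothetical. The point is only that a test comparing the new implementation against the old one tells us whether the change preserved the behaviour.

```python
def sum_of_squares_loop(n):
    """Original implementation: add 1**2 + 2**2 + ... + n**2 term by term."""
    total = 0
    for k in range(1, n + 1):
        total += k**2
    return total

def sum_of_squares_formula(n):
    """New implementation using the closed form n(n+1)(2n+1)/6."""
    return n * (n + 1) * (2 * n + 1) // 6

def test_new_implementation_matches_old():
    # Integer arithmetic is exact, so we can compare with == here.
    for n in range(0, 50):
        assert sum_of_squares_formula(n) == sum_of_squares_loop(n)

test_new_implementation_matches_old()
```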
2. Types of tests¶
There are different classes of tests that evaluate the correctness of our code at different levels and scales. In this course, we are going to cover the following:
Assertion statements
Exception statements
Unit tests
Regression tests
Integration tests
2.1. Assertions¶
The assert statement in Python evaluates whether a given condition is true or false. If it is false, it raises an AssertionError and interrupts the execution of the code.
assert 1+1 == 2, "One plus one is not two."
As you can see from the previous example, you can also add a short text description of the error. In this way, assertion statements are very simple to write and evaluate.
As the discussion in the previous section suggests, we need to be careful when comparing objects in Python. For example, for float types we have
assert 0.1 + 0.2 == 0.3
---------------------------------------------------------------------------
AssertionError Traceback (most recent call last)
Cell In[3], line 1
----> 1 assert 0.1 + 0.2 == 0.3
AssertionError:
The problem here is caused by floating point arithmetic. To avoid raising an AssertionError here, we can use numpy.testing.assert_allclose():
from numpy.testing import assert_allclose
assert_allclose(0.1 + 0.2, 0.3)
Since an assertion fails whenever a given condition is not satisfied, we can also use any other functionality that returns True/False. Other examples are
import math
assert math.isclose(0.1 + 0.2, 0.3), "Numbers are not close."
import pytest
assert 0.1 + 0.2 == pytest.approx(0.3), "Numbers are not close."
Usually, assertion statements go inside functions and help us maintain the correctness of the code. In pair programming, it is the role of the observer to think of cases where the code may not work and to come up with simple assertion statements that help prevent those errors.
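As an illustration of assertions living inside a function, here is a minimal sketch that checks its inputs before computing and its output afterwards. The function normalize is a hypothetical example, not part of the course code.

```python
import math

def normalize(weights):
    """Rescale a list of non-negative weights so that they sum to one."""
    # Preconditions: catch bad input before it propagates.
    assert len(weights) > 0, "weights must not be empty."
    assert all(w >= 0 for w in weights), "weights must be non-negative."
    total = sum(weights)
    assert total > 0, "at least one weight must be positive."

    result = [w / total for w in weights]

    # Postcondition: compare floats with a tolerance, not with ==.
    assert math.isclose(sum(result), 1.0), "normalized weights do not sum to one."
    return result

print(normalize([1.0, 2.0, 7.0]))
```

Keep in mind that assert statements are skipped when Python runs with the -O flag, so they complement explicit error handling instead of replacing it.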
2.2. Exceptions¶
The errors that occur as we write code include syntax, runtime, and semantic errors. For runtime errors in particular, Python gives us a clue about what kind of error happened during the execution of our code. For example,
1 / 0
---------------------------------------------------------------------------
ZeroDivisionError Traceback (most recent call last)
Cell In[7], line 1
----> 1 1 / 0
ZeroDivisionError: division by zero
my_dict = {'a':1, 'b':2}
my_dict['c']
---------------------------------------------------------------------------
KeyError Traceback (most recent call last)
Cell In[8], line 2
      1 my_dict = {'a':1, 'b':2}
----> 2 my_dict['c']
KeyError: 'c'
my_dict + {'c':3}
---------------------------------------------------------------------------
TypeError Traceback (most recent call last)
Cell In[9], line 1
----> 1 my_dict + {'c':3}
TypeError: unsupported operand type(s) for +: 'dict' and 'dict'
There are many more kinds of built-in exceptions in Python. You can find more examples in this link. A general RuntimeError is raised when the detected error doesn’t fall into any of the other categories.
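Built-in exceptions are organised in a class hierarchy, which matters once we start catching them. The sketch below checks a few of these relationships and uses the try...except clause described next. The helper describe_failure is a hypothetical name introduced only for illustration.

```python
# An except clause for a parent class also catches its child classes.
print(issubclass(ZeroDivisionError, ArithmeticError))
print(issubclass(KeyError, LookupError))
print(issubclass(RuntimeError, Exception))

def describe_failure(operation):
    """Run a zero-argument callable and report which family of error it raised."""
    try:
        operation()
    except ArithmeticError as err:
        return f"arithmetic problem: {type(err).__name__}"
    except LookupError as err:
        return f"lookup problem: {type(err).__name__}"
    return "no error"

print(describe_failure(lambda: 1 / 0))
print(describe_failure(lambda: {"a": 1}["c"]))
```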
There are different ways of dealing with runtime errors in Python. These include the
try...except clause
raise statement
def division(numerator, denominator):
    try:
        return numerator / denominator
    except ZeroDivisionError:
        return 0
division(1,1)
1.0
division(1,0)
0
When raising an error, we would like to print a meaningful message. We can do this as follows:
def division(numerator, denominator):
    try:
        return numerator / denominator
    except ZeroDivisionError:
        raise ZeroDivisionError(f"You cannot divide by {denominator=}")
division(1,0)
---------------------------------------------------------------------------
ZeroDivisionError Traceback (most recent call last)
Cell In[13], line 3, in division(numerator, denominator)
      2     try:
----> 3         return numerator / denominator
      4     except ZeroDivisionError:
ZeroDivisionError: division by zero
During handling of the above exception, another exception occurred:
ZeroDivisionError Traceback (most recent call last)
Cell In[14], line 1
----> 1 division(1,0)
Cell In[13], line 5, in division(numerator, denominator)
      3         return numerator / denominator
      4     except ZeroDivisionError:
----> 5         raise ZeroDivisionError(f"You cannot divide by {denominator=}")
ZeroDivisionError: You cannot divide by denominator=0
If you already know what may cause an error in your code, you can avoid the try / except statement and directly raise an exception when a certain critical condition occurs:
def division(numerator, denominator):
    if denominator == pytest.approx(0.0):
        raise ZeroDivisionError(f"You cannot divide by {denominator=}")
    return numerator / denominator
division(1, 0)
---------------------------------------------------------------------------
ZeroDivisionError Traceback (most recent call last)
Cell In[16], line 1
----> 1 division(1, 0)
Cell In[15], line 3, in division(numerator, denominator)
      1 def division(numerator, denominator):
      2     if denominator == pytest.approx(0.0):
----> 3         raise ZeroDivisionError(f"You cannot divide by {denominator=}")
      4     return numerator / denominator
ZeroDivisionError: You cannot divide by denominator=0
Something cool about exceptions is that they are classes, and Python allows us to create new exception types of our own.
class LightSpeedBound(Exception):
    """
    Defines a new exception error of my preference.
    """
    pass

def lorentz_factor(v, c=299_792_458):
    if v > c:
        raise LightSpeedBound(f"The current velocity {v} cannot exceed the speed of light")
    return 1 / (1 - v**2/c**2) ** 0.5
lorentz_factor(300_000_000)
---------------------------------------------------------------------------
LightSpeedBound Traceback (most recent call last)
Cell In[18], line 1
----> 1 lorentz_factor(300_000_000)
Cell In[17], line 9, in lorentz_factor(v, c)
      7 def lorentz_factor(v, c=299_792_458):
      8     if v > c:
----> 9         raise LightSpeedBound(f"The current velocity {v} cannot exceed the speed of light")
     10     return 1 / (1 - v**2/c**2) ** 0.5
LightSpeedBound: The current velocity 300000000 cannot exceed the speed of light
2.3. Unit Tests¶
In the previous section we discussed the importance of writing clean and modular code. Having small functions that perform very specific tasks helps us design pipelines for testing those small units of code. That is the purpose of unit tests: to test the functions in our code individually.
Writing a unit test consists of defining a function that contains an assert statement checking whether the output matches the expected answer.
import numpy as np
import pytest

def division(numerator, denominator):
    if denominator == pytest.approx(0.0):
        raise ZeroDivisionError(f"You cannot divide by {denominator=}")
    return numerator / denominator

def test_float_division():
    assert np.isclose(division(2.0,0.5), 4.0)
test_float_division()
The next step is to scale this up! We can write more than one test per function to cover different cases (e.g., different types) and then extend this to all the functions in our code. For example, for the division function we probably want to add a test that pins down the expected behaviour when dividing by zero. Perhaps surprisingly, we can assert that calling a function raises an error:
import pytest

def test_division_by_zero():
    with pytest.raises(ZeroDivisionError):
        division(numerator=10.0, denominator=0.0)
test_division_by_zero()
2.4. Integration tests¶
As their name indicates, integration tests are responsible for evaluating how multiple units of code work together, instead of individually. For example, it is easy to see how a simple piece of code that uses the division function can fail, even when each unit has been tested independently.
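A minimal sketch of that situation: two units that are each correct on their own, combined in a third function that fails for an input neither unit rejects. The functions total and mean are hypothetical, and division is simplified here to compare against zero directly.

```python
def division(numerator, denominator):
    # Simplified version of the division function used in these notes.
    if denominator == 0:
        raise ZeroDivisionError(f"You cannot divide by {denominator=}")
    return numerator / denominator

def total(values):
    """Add up all the values in a list."""
    return sum(values)

def mean(values):
    """Combine total and division: this is where the units meet."""
    return division(total(values), len(values))

def test_mean_integration():
    # The usual case exercises both units together.
    assert mean([1.0, 2.0, 3.0]) == 2.0
    # An empty list is valid input for total, but makes mean fail.
    try:
        mean([])
    except ZeroDivisionError:
        pass
    else:
        raise AssertionError("mean([]) should raise ZeroDivisionError")

test_mean_integration()
```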
In general, any test that involves more than one function is called an integration test. Let’s look at the following example, which uses class inheritance in Python.
class Person:
    def __init__(self, name, age):
        self.name = name
        self.age = age

    def birthday(self):
        self.age += 1

    def append_lastname(self, lastname):
        self.name += " " + lastname

class Student(Person):
    def __init__(self, name, age, major):
        super().__init__(name, age)
        self.major = major
        self.grades = {}

    def add_grade(self, course, grade):
        self.grades[course] = grade
def test_student():
    subject = Student("Facu", 28, "Statistics")
    subject.birthday()
    subject.add_grade("Stat 159", "A+")
    assert subject.age == 29 and subject.grades["Stat 159"] == "A+"

test_student()
2.5. Regression tests¶
Regression tests try to fix in time the expected behaviour of a certain piece of code. This is particularly useful when we don’t know what the true output of a piece of code is, but we want to ensure the stability of the code. In a sense, we want to be sure that as we make changes we don’t break or change code that was, in principle, working before.
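One common way to do this is to store the output of a trusted run and compare later runs against it. The sketch below is one possible arrangement, under the assumption that the reference lives in a JSON file. The names logistic_orbit and check_against_reference are hypothetical, and a temporary directory stands in for wherever the reference file would normally be kept.

```python
import json
import math
import tempfile
from pathlib import Path

def logistic_orbit(r, x0, n_steps):
    """Iterate x -> r*x*(1-x) and return the visited values."""
    orbit = [x0]
    for _ in range(n_steps):
        orbit.append(r * orbit[-1] * (1 - orbit[-1]))
    return orbit

def check_against_reference(values, reference_file, rel_tol=1e-12):
    """Create the reference on first use, then compare against it on later runs."""
    if not reference_file.exists():
        reference_file.write_text(json.dumps(values))
        return "reference created"
    reference = json.loads(reference_file.read_text())
    assert len(values) == len(reference), "Output length changed."
    for new, old in zip(values, reference):
        assert math.isclose(new, old, rel_tol=rel_tol), f"{new} differs from stored {old}"
    return "matches reference"

with tempfile.TemporaryDirectory() as tmp:
    reference_file = Path(tmp) / "logistic_orbit.json"
    orbit = logistic_orbit(r=3.9, x0=0.8, n_steps=10)
    print(check_against_reference(orbit, reference_file))  # first run stores the output
    print(check_against_reference(orbit, reference_file))  # later runs compare against it
```

The test says nothing about whether the stored values are correct. It only reports when they change, which is exactly the guarantee a regression test is meant to give.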
Another example of a regression test arises after we find and fix a bug in our code. After detecting an error, we may want to include a test for it so that we can be sure the bug doesn’t reappear in the future.
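As a sketch of this second kind of regression test, imagine that an earlier version of a median function gave the wrong answer for lists of even length. Both the function and the bug are hypothetical. The test pins the input that used to fail.

```python
def median(values):
    """Return the median of a non-empty list of numbers."""
    ordered = sorted(values)
    middle = len(ordered) // 2
    if len(ordered) % 2 == 1:
        return ordered[middle]
    # Hypothetical bug fix: an earlier version returned ordered[middle]
    # here too, which is wrong for lists of even length.
    return (ordered[middle - 1] + ordered[middle]) / 2

def test_median_even_length_regression():
    # Pins the case that used to fail, so the bug cannot silently return.
    assert median([4, 1, 3, 2]) == 2.5

def test_median_odd_length():
    assert median([3, 1, 2]) == 2

test_median_even_length_regression()
test_median_odd_length()
```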