Test Driven Development in Python

TDD is a mindset

Test driven development is as much a way of designing as it is a way of developing. I have been trying to use it on and off for several years now without much success. I think I understand the process, but not necessarily the mindset. So I am taking on a challenge in hopes of learning the mindset as well as the process. The transition to TDD is going to take practice, and I intend to get that practice using the Project Euler problems that I started on a couple of years ago. One added benefit of the Euler problems is that they are usually simple enough (or I should say, "so far") to be solved with a single class or even a single method. This makes writing tests for the design more direct and therefore simpler.

I have decided to use Python, for now. Python is quick to get going with: it is interpreted, so there is no separate compile step between writing a test and running it. I am also taking this opportunity to learn Vim a little better. I have set up Vim as my Python IDE and am pretty happy with it so far. When I am away from my main machine I can use my Koding virtual machine to work on a problem, but that is another post. Plus, I have always wanted to know more Python, and so far I am really enjoying it. It was super easy to get up and running with TDD, and writing a unit test in Python is very easy, as you will see. So, despite the fact that I write C# all day, every day at work, I am using an easier setup to get some practice in with test driven development.

The Problem

I am currently on problem number 6 at Project Euler, which is titled Sum Square Difference. The statement of the problem is this:

The sum of the squares of the first ten natural numbers is,
1² + 2² + ... + 10² = 385
The square of the sum of the first ten natural numbers is,
(1 + 2 + ... + 10)² = 55² = 3025
Hence the difference between the sum of the squares of the first ten natural numbers and the square of the sum is 3025 - 385 = 2640.
Find the difference between the sum of the squares of the first one hundred natural numbers and the square of the sum.

This led me to do some research. I had tackled the past couple of problems with more of a brute-force approach, so this time I decided to look for formulas first. I knew there were some that could help me with this one. I basically needed a formula for each of the following:

The sum of the squares for 1..n
The square of the sum of 1..n

With formulas for these two tasks I could then simply take the difference of the two results for my answer. I have sample values from the problem statement to use in my unit tests of the formulas. Here are the formulas I will use in solving the problem; both were readily available through Google or Wikipedia searches.
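
For reference, the two sums can be written as below, first the plain sum and then the sum of squares. The closed forms on the right are the standard identities; whether the searches turned up the summation form or the closed form is an assumption here.

```latex
\sum_{k=1}^{n} k = \frac{n(n+1)}{2}
\qquad
\sum_{k=1}^{n} k^{2} = \frac{n(n+1)(2n+1)}{6}
```

For n = 10 these give 55 and 385, which matches the sample values in the problem statement (55² = 3025).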

The Test Class (and the first test)

With test driven development, rather than writing some code for the problem first, I need to write a test. So what test? Well, I look at the problem and see that I need to find the difference between the sum of the squares of a list of numbers and the square of the sum of those numbers. Therefore I will need to be able to get those values. My first test will be for a correct sum of squares value (the second formula above).

I know that I am supposed to dive right in and write a test for a method that does what I need. This is where you actually do some designing as well. I wrote the following test as the first code for this problem:

And here is the output for that test:

As you can see, the test failed with an 'ImportError: No module named Problem6'. That is because I need to add a module for Problem6, which is a new file named Problem6.py. In order to maintain the strict tenets of test driven development I need to do only the minimum required to get past this error. So I will add a new file/module named Problem6.py, and then re-run the test. Here is the output from the next run:

Now we failed with an AttributeError: 'module' object has no attribute 'Problem6Solution'. This is because the new module we just added is empty and has no class. So, I will add a class named 'Problem6Solution' to the module, with a method named sumOfSquares that accepts an input number. This should be enough to get through this error. Here is the first look at the solution class:
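
A sketch of what that first version of Problem6.py might contain. The class and method names are from the text; returning None as the placeholder is an assumption, and any value other than 385 would produce the same assertion failure.

```python
# Problem6.py, first version: just enough to get past the AttributeError.
class Problem6Solution:
    def sumOfSquares(self, n):
        # No calculation yet, so the test should now fail on its assertion
        # instead of on a missing class or method.
        return None
```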

Now we can run the test again and see what we get this time. It should be an assertion error because we are not yet performing the formula calculation. Here is the result:

Ah, there is my assertion error. Now I am finally ready to add some code to execute the formula and hopefully pass the test. The formula is simple, but as I am a bit of a newbie with Python I looked up the math.pow function to help me with the power calls. Here is the completed code for the sumOfSquares method:
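
One way the completed method could look, assuming a plain loop with += and math.pow as the text describes. The variable names are illustrative.

```python
import math


class Problem6Solution:
    def sumOfSquares(self, n):
        # Add up i^2 for every i from 1 through n.
        total = 0
        for i in range(1, n + 1):
            total += math.pow(i, 2)
        return total
```

Note that math.pow always returns a float, so the result for 10 is 385.0. assertEqual compares with ==, so it still matches the expected integer 385.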

Repeat for Remaining Tests

The rest of the process is a little smoother now that our files and classes are in place. We have a passing test and are under way! The next test will be for another piece of our solution's puzzle, the square of sums formula, where we sum 1..n and then square the result. We have the formula for summing 1..n so it should be fairly simple. Let us see what the test looks like:

Of course we will run the test without writing any new code to see it fail. Then we will write just enough code to make it pass. At this point I am seeing that, in writing a test first, I am letting the desired output dictate the design of the class. I am starting with what I need and only writing the minimum code required to get it, which results in a fairly concise bit of code. I am still only using this at a very simple level, but I can see the benefits nonetheless. What follows is the test result and the code written to make it pass:
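
A possible version of the code for this step, again assuming a loop with += and math.pow. The method name squareOfSums is hypothetical.

```python
import math


class Problem6Solution:
    # sumOfSquares from the previous step is left out to keep the sketch short.

    def squareOfSums(self, n):
        # Sum 1..n first, then square the total.
        total = 0
        for i in range(1, n + 1):
            total += i
        return math.pow(total, 2)
```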

And now our test 2 passes.

For the last test, where we test a solve method that takes the difference of our two sums, I will show the resulting classes in their entirety. You will notice there will now be a third test and a third method in the solution class, called 'solve'.
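
Put together, the finished solution and its three tests might look roughly like this. It is a sketch under a few assumptions: both classes are shown in a single file so that it runs on its own (the post keeps them in separate files), the names squareOfSums, main and the test names are illustrative, and the timing uses time.time().

```python
import math
import time
import unittest


class Problem6Solution:
    def sumOfSquares(self, n):
        # 1^2 + 2^2 + ... + n^2
        total = 0
        for i in range(1, n + 1):
            total += math.pow(i, 2)
        return total

    def squareOfSums(self, n):
        # (1 + 2 + ... + n)^2
        total = 0
        for i in range(1, n + 1):
            total += i
        return math.pow(total, 2)

    def solve(self, n):
        # Difference between the square of the sum and the sum of the squares.
        return self.squareOfSums(n) - self.sumOfSquares(n)


class Problem6Tests(unittest.TestCase):
    def setUp(self):
        self.solution = Problem6Solution()

    def test_sum_of_squares(self):
        self.assertEqual(385, self.solution.sumOfSquares(10))

    def test_square_of_sums(self):
        self.assertEqual(3025, self.solution.squareOfSums(10))

    def test_solve(self):
        # 3025 - 385 = 2640, from the problem statement.
        self.assertEqual(2640, self.solution.solve(10))


def main():
    # Print the answer for the first one hundred natural numbers and the
    # time it took to compute.
    start = time.time()
    answer = Problem6Solution().solve(100)
    elapsed = time.time() - start
    print("Answer: %d (%.6f seconds)" % (answer, elapsed))


if __name__ == "__main__":
    main()
```

Running the file directly prints the answer, while `python -m unittest <file name>` runs the three tests.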

Summary

I have added a main method at the end of the solution class that actually shows the answer and the elapsed time to calculate it. That, in my mind, makes this a complete solution that provides the desired output and, thanks to Test Driven Development, nothing extra. Of course I could further refactor the formula methods to shorten them, specifically by using the built-in sum() function rather than the += operator. However, I have left the code the way it is because I feel it is more readable this way. And in my book, readability counts for something.
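
For comparison, the sum()-based refactoring mentioned above might look like this. It is only a sketch: the class name Problem6SolutionCompact and the method name squareOfSums are made up for illustration, and it uses integer arithmetic instead of math.pow.

```python
class Problem6SolutionCompact:
    def sumOfSquares(self, n):
        # sum() over a generator replaces the explicit += loop.
        return sum(i * i for i in range(1, n + 1))

    def squareOfSums(self, n):
        return sum(range(1, n + 1)) ** 2

    def solve(self, n):
        return self.squareOfSums(n) - self.sumOfSquares(n)
```

The same three tests should pass against this class unchanged, which is the safety net that makes a refactoring like this low-risk.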

Lastly, here is the output: both the passing tests and the printed answer!

johng .info
