Test-driven development is not about testing. Test-driven development is about development (and design), specifically improving the quality and design of code. The resulting unit tests are just an extremely useful by-product.
That's all I'm going to tell you about test-driven development. The rest of this article will show you how it works. Come work on a project with me; we'll build a very simple tool together. I'll make mistakes, fix them, and change designs in response to what the tests tell me. Along the way, we'll throw in a few refactorings, design patterns, and object-oriented design principles.
To make this project fun, we'll do it in Python.
Python is an excellent language for test-driven development because it (usually) does exactly what you want it to without getting in your way. The standard library even comes with everything you need in order to start developing TDD-style.
I assume that you're familiar with Python but not necessarily familiar with
test-driven development or Python's unittest module. You
need to know only a little in order to start testing.
unittest ModuleSince version 2.1, Python's standard library has included a
unittest module, based on JUnit
(by Kent Beck and Erich Gamma), the de facto standard unit test framework for
Java developers. Formerly known as PyUnit, it also runs on Python
versions prior to 2.1 with a separate download.
Let's jump right in. Here's a "unit" and its tests--all in one file:
import unittest
# Here's our "unit".
def IsOdd(n):
return n % 2 == 1
# Here's our "unit tests".
class IsOddTests(unittest.TestCase):
def testOne(self):
self.failUnless(IsOdd(1))
def testTwo(self):
self.failIf(IsOdd(2))
def main():
unittest.main()
if __name__ == '__main__':
main()
![]()
LightsThroughout this article, I'll use a traffic light to show the state of the tests. Green indicates that the tests pass, and red warns that they fail. A shining yellow light indicates a problem that prevents us from completing a test. TDD practitioners often talk about receiving a "green light" or "green bar" from the graphical test runner that comes with JUnit. |
Methods whose names start with the string test with one
argument (self) in classes derived from
unittest.TestCase are test cases. In the above example,
testOne and testTwo are test cases.
Grouping related test cases together, test fixtures are classes that derive
from unittest.TestCase. In the above example,
IsOddTests is a test fixture. This is true even though
IsOddTests derives from a class called TestCase, not
TestFixture. Trust me on this.
Test fixtures can contain setUp and tearDown
methods, which the test runner will call before and after every test case,
respectively. Having a setUp method is the real justification for
fixtures, because it allows us to extract common setup code from multiple test
cases into the one setUp method.
In Python we typically don't need a tearDown method, because we
can usually rely on Python's garbage collection facilities to clean up our
objects for us. When testing against a database, however, tearDown
could be useful for closing connections, deleting tables, and so on.
Looking back at our example, the main function defined in the
unittest module makes it possible to execute the tests in the same
manner as executing any other script. This function examines
sys.argv, making it possible to supply command-line arguments to
customize the test output or to run only specific fixtures or cases (use
--help to see the arguments). The default behavior is to run all
test cases in all test fixtures found in the file containing the call to
unittest.main.
Executing the test script above should produce output that resembles:
..
----------------------------------------------------------------------
Ran 2 tests in 0.000s
OK
If the second test had failed, the output would have looked something like this:
.F
======================================================================
FAIL: testTwo (__main__.IsOddTests)
----------------------------------------------------------------------
Traceback (most recent call last):
File "C:\jason\projects\tdd-py\test.py", line 14, in testTwo
self.failIf(IsOdd(2))
File "C:\Python23\lib\unittest.py", line 274, in failIf
if expr: raise self.failureException, msg
AssertionError
----------------------------------------------------------------------
Ran 2 tests in 0.000s
FAILED (failures=1)
Typically, we wouldn't have the tests and the unit being tested in the same file, but it doesn't hurt to start out that way and then extract the code or the tests later.
Guess what I have trouble remembering to do:
0 0 * * * [ `date +\%m` -ne `date -d +4days +\%m` ] \
&& mail -s 'Pay the rent!' me-and-my-wife@example.org < /dev/null
|
Related Reading
Learning Python |
That little puzzle is a line out of my crontab that emails me a reminder to pay the rent on the last four days of each month. Pathetic? Probably. It works, though. I haven't been late paying rent since I started using it.
As clever as I thought I was for coming up with this, it wasn't practical for everything--especially for events that occur only once. Also, there's no way I could teach my wife enough bash scripting techniques in order to add a reminder to our calendar.
Most people use a good old-fashioned wall calendar for this type of thing. That's not techno-geeky enough for me.
I could use Outlook or Evolution or some productivity application, but that would open up a whole new can of worms. We don't use just one computer. We both use multiple computers and operating systems at home and at work. How could we easily synchronize all of those machines?
It was after realizing that our email is available to us no matter where we were that I hit upon the motivation for my project. The email reminding me to pay the rent was with me no matter what machine I'm on because I always check my email via IMAP, so my email is accessible from everywhere.
Why not email the upcoming events in my calendar to me just like my reminder to pay the rent? Brilliant, I thought. I know just the tools that can do this, too: the BSD calendar application and the new kid on the block, pal.
My wife and I have a private wiki that we use for keeping track of notes.
It's great. Despite the fact that my wife's an accountant and not a geek, she
has no trouble using it. I figured we could use the wiki to edit our calendar
file. I would write a little cron job to fetch the calendar file--probably
using wget--from the wiki and pipe that into whatever tool best
fit our needs.
Unfortunately, after looking at both calendar and
pal, I discovered that neither was what I was looking for.
The calendar file format requires a <tab> character between dates and
descriptions. Since I wanted to use our personal wiki to edit the calendar
file, inserting <tab> characters would be an issue (upon hitting
<tab>, focus jumps out of the text area to the next form control).
calendar also doesn't support any of the fancy output options that
pal does.
The pal format was much too geeky for even me to want to use,
and it didn't support the one really important use case I had so far: setting a
reminder for the last day of the month.
My wife and I sat down and came up with something both of us would want to use. Here are some examples:
30 Nov 2004: Dinner with the Darghams.
April 10: Happy Anniversary!
Wednesday: Piano lesson.
Last Thursday: Goody night at book study. Yum.
-1: Pay the rent!
Unlike the calendar format, a colon separates dates from descriptions. How Pythonic.
Like the calendar format, omitted certain fields are wildcards. The
April 10 event happens every year. The Last Thursday event happens on the
final Thursday of every month of every year.
The -1 event happens on the last day of every month of every year too. I
took this idea from Python's array subscript syntax, where foo[-1]
selects the last element in the foo array. I thought it was a little geeky, but
my wife understood it right away.
My goal is to write a small application that can run from cron to read a file in this format and email my wife and me the events we have scheduled for the next seven days. That shouldn't be too hard, should it?
From this point on, I'm writing this article in real time, having contrived nothing. I didn't write the code first and then write the article--I'm writing the article as I write the code. Yes, I expect to make mistakes. In fact, I'm counting on it. Making mistakes is the best way to learn.
Being test infected means that I must write this tool by writing all of my unit tests before writing the code I expect the tests to exercise.
The first thing I do when starting a new project is to create an empty fixture that fails:
import unittest
class FooTests(unittest.TestCase):
def testFoo(self):
self.failUnless(False)
def main():
unittest.main()
if __name__ == '__main__':
main()
![]()
I do this out of habit, just to make sure I have everything typed in correctly and to test that the test runner can find the fixture.
Notice the class named FooTests and its testFoo
method. At this point I have no idea what I'm going to test first. I
just want to make sure that I have everything ready once things get going.
Let's start out easy and test the first example from above with the full day, month, and year specified for the event. In order to create this test, I need know what to test. Am I testing a class? A function?
This is where we put on our designer hats for a brief moment and try to use our experience and intuition to come up with some piece to the puzzle that will help us reach our goal. It's OK if we make a mistake here; the tests will reveal that right away, before we invest too much in this design. We certainly don't want to draft any documents filled with diagrams. Save those for later, after we have a clue about what will actually work.
For this project, I should probably create objects that can say whether they "match" a given date. These objects will act as a "pattern" for dates. (I'm using regular expressions as a metaphor here.)
Eventually, I'll have to write a parser that will read in a file and create these pattern objects, but I'll do that later. These pattern objects are probably an easier place to start.
There might be multiple types of patterns--but I won't think about that now, because I could be wrong. Instead, I'll start coding so I can let it tell me what it wants to become:
def testMatches(self):
p = DatePattern(2004, 9, 28)
d = datetime.date(2004, 9, 28)
self.failUnless(p.matches(d))
![]()
Notice that I changed the name of the method from testFoo to
something more appropriate, because I now have an idea about what to test. I've
also invented a class name, DatePattern, and a method name,
matches. (The datetime module is part of Python 2.3
and up--I had to import it at the top of my file in order to use it.)
This test, of course, fails miserably--the DatePattern class
doesn't even exist yet! But I at least know now the name of the class I need to
implement. I also know the name and signature of one of its methods and the
signature for its __init__ method. Here's what I can do with this
knowledge:
class DatePattern:
def __init__(self, year, month, day):
pass
def matches(self, date):
return True
![]()
Now the test passes! It's time to move on to the next test.
You probably think I'm joking, don't you? I'm not.
Test-driven development is best when you move in the smallest possible increments. You should only be writing code that makes the current failing test case(s) pass. Once the tests pass, you're done writing code. Stop!
The above code is worthless, right? It basically says that every pattern matches every date. How can I justify spending the time to come up with a "real" implementation? By adding another test:
def testMatchesFalse(self):
p = DatePattern(2004, 9, 28)
d = datetime.date(2004, 9, 29)
self.failIf(p.matches(d))
![]()
We now have one passing test and one failing test.
I could change the matches method to return False
in order to make this new test case pass, but that would break the old one! I
now have no choice but to implement DatePattern correctly so that
both tests can pass. Here's what I came up with:
class DatePattern:
def __init__(self, year, month, day):
self.date = datetime.date(year, month, day)
def matches(self, date):
return self.date == date
![]()
Both tests now pass. Woo-hoo! I'm not happy with the
DatePattern class, though. So far, it's nothing more than a simple
wrapper around Python's date class. Why am I not just using
date instances for my "patterns"?
It might turn out that the DatePattern class is unnecessary, but
I'm not going to make that decision on my own. Instead, I'm going to write
another test--one that I think will confirm the necessity of the
DatePattern class:
def testMatchesYearAsWildCard(self):
p = DatePattern(0, 4, 10)
d = datetime.date(2005, 4, 10)
self.failUnless(p.matches(d))
![]()
Voilà! This test fails!
Why am I so happy about a failing test? My reasoning is simple: this
proves that the current implementation of DatePattern is
insufficient. It can't be just a simple wrapper around
date and therefore can't be just a date.
While typing this test, I had to make a decision about how to represent wildcards. What occurred to me first was to use 0. After all,
there's no year 0 (contrary to popular belief), month 0, or day 0. This may not
have been the best choice, but I'm going to roll with it for now.
It's time to make the new test pass (while making sure not to break the old ones):
class DatePattern:
def __init__(self, year, month, day):
self.year = year
self.month = month
self.day = day
def matches(self, date):
return ((self.year and self.year == date.year or True) and
self.month == date.month and
self.day == date.day)
![]()
To be honest, I'm already starting to feel like I'll need to do some refactoring as I add more wildcard functionality to the class, but I want to write a few more tests first.
Let's add a test where the month is a wildcard:
def testMatchesYearAndMonthAsWildCards(self):
p = DatePattern(0, 0, 1)
d = datetime.date(2004, 10, 1)
self.failUnless(p.matches(d))
![]()
Fixing matches so that the test passes results in this:
def matches(self, date):
return ((self.year and self.year == date.year or True) and
(self.month and self.month == date.month or True) and
self.day == date.day)
![]()
This method is getting uglier every time we touch it--I'm now positive that it will be my first refactoring victim.
I now have a test for using wildcards for both years and months. Will I need one for days? A pattern containing nothing but wildcards would match every day. When would that be useful?
At this point I can't think of a reason to support wildcard days, so I
won't bother writing a test for it. Because of that, I also won't bother
implementing any code to support it in the DatePattern class.
Remember, code gets written only when there's a failing test that needs the new
code in order to pass. This prevents us from writing code that should not exist
in our application, which should help keep it from becoming unnecessarily
complex.
Let's move on. We need to support events that occur on a specified day of every week:
def testMatchesWeekday(self):
p = DatePattern(
![]()
Uh, what now?
At this point, I realized that the DatePattern class might not
be what I want to use for this test. Its __init__ method doesn't
accept a weekday. Should I use a different class, or modify the existing
one?
I decided to modify the existing one for now, as that will require the least amount of work. If this turns out to be a bad idea, I can always refactor later.
def testMatchesWeekday(self):
p = DatePattern(0, 0, 0, 2) # 2 is Wednesday
d = datetime.date(2004, 9, 29)
self.failUnless(p.matches(d))
![]()
This doesn't pass because DatePattern.__init__ doesn't accept five arguments (counting self). I modified __init__ to
look like this:
def __init__(self, year, month, day, weekday=0):
self.year = year
self.month = month
self.day = day
self.weekday = weekday
![]()
I gave weekday a default value so that I wouldn't need to
update the other test cases. Everything compiles and runs, but the new test case
doesn't pass.
The astute reader has probably already realized that I'm now passing in 0
for the day argument. There's the wildcard I didn't think I would
need--now I need it!
Here's my new matches method:
def matches(self, date):
return ((self.year and self.year == date.year or True) and
(self.month and self.month == date.month or True) and
(self.day and self.day == date.day or True) and
(self.weekday and self.weekday == date.weekday() or True))
![]()
Now all of the components of a pattern allow for wildcards. How very interesting.
With this new method, testMatchesWeekday passes but
testMatchesFalse now fails! What gives?
I honestly can't tell why testMatchesFalse fails by looking at
the code. This is going to call for some simple debugging. Unfortunately, I
tried to cram all of the logic for the matches method into one
expression (spanning four lines!), so there's no place for me to insert any
print statements to help me see which part is failing. It's finally time to do
that refactoring I've been wanting to do.
The refactoring I want to apply is the Compose
Method from Joshua Kerievsky's excellent book, Refactoring to
Patterns. By extracting smaller methods from the current
matches method, I can not only make matches clearer
but also make it possible to debug whichever part is currently causing me
grief.
This is the result:
def matches(self, date):
return (self.yearMatches(date) and
self.monthMatches(date) and
self.dayMatches(date) and
self.weekdayMatches(date))
def yearMatches(self, date):
if not self.year: return True
return self.year == date.year
def monthMatches(self, date):
if not self.month: return True
return self.month == date.month
def dayMatches(self, date):
if not self.day: return True
return self.day == date.day
def weekdayMatches(self, date):
if not self.weekday: return True
return self.weekday == date.weekday()
![]()
Code PickinessI recently read a weblog post
by Ian Bicking about what he considers to be code smells in Python code. (Editors note: The link to the weblog post by Ian Bicking was not available at the time of publishing.) I
thought one, using " |
The matches method is now much clearer, don't you agree? It
might seem like a ridiculous thing to do, but writing intention-revealing code
is much more important than being clever. I was trying to be too clever before
and it caused a bug--one that I wouldn't have come across if I had done this
from the beginning.
After applying this refactoring and rerunning the tests, I expected to see
the testMatchesFalse test still failing, but it's now passing.
Somewhere in my original logic I made an error, and I have no idea where it
was--I'll leave finding it as an exercise for the reader. In the meantime, not only do I have simpler code now but it also actually works the way I expect it to.
Take that!
Would I have noticed this bug without tests? I have no doubt that I would, but how long would it have been before I realized that this was a problem? With my unit tests, I noticed it immediately, so I knew exactly what to fix.
Wildcards essentially work for all of the components I'm testing so far. This is good, but I think the next test will cause trouble. It starts out innocently enough:
def testMatchesLastWeekday(self):
p = DatePattern(0, 0, 0, 3
![]()
Er, I'm stuck again.
In case it's not obvious (and it's not--why didn't Python's
datetime module define constants for weekdays?), the 3
represents Thursday.
How do I indicate that I only want to match the last Thursday in a
month? Do I need to add yet another argument to
DatePattern.__init__?
This is where that sneaking suspicion in the back of my head is finally starting to warrant some closer attention. I might be trying to cram too much functionality into one class.
I haven't written much code yet, but that's a good thing, since it seems that
the code I have written might not have been sufficient for what I want to do
with it. Without the tests, I might not have discovered what a mess I was
writing until it was too late. At this point, I haven't invested too much time
into the DatePattern class, so I won't feel bad about throwing it
away if that's what I'll need to do.
I have some ideas about how to restructure the code so that it's as simple and yet as functional as I want it to be, but we're going to have to save those for Part 2 of this article, which will be published shortly.
Code and tests are available for download and inspection.
Jason Diamond is a consultant specializing in C++, C#, and XML, and is located in sunny Southern California.
Return to the Python DevCenter.
Copyright © 2009 O'Reilly Media, Inc.