This 30-day project explores the refactoring of a legacy system. The Everything Engine is an aging software project that powers Perl Monks, Everything 2, and a few other websites. It suffers from poor design and maintainiability. Learn what it’s like to look over the shoulder of an experienced developer as he refactors, redesigns, and updates the code.
Today’s task is continuing to port tests from the procedural form to the Test::Class form.
Day #07
Today’s task is to continue to port Everything::Node::node’s
tests to Test::Class.
While thinking about this, I realized that I could probably remove the
SUPER() and AUTOLOAD() methods from
Everything::Node. Then I wondered if the tests would pass. I don’t
want to break anything without knowing about it (and ideally having the tests
in place first to prove that it still works), but I don’t think there are any
tests that really exercise that.
There also aren’t any tests that show that ecore still works either, even though I know it probably doesn’t quite.
This all argues for something like customer tests, instead of developer tests, where this exercises as much of the system as a whole as possible. To do this, I have to be able to:
- give the engine a request
- check the results
- work against a database full of testing data
The last dependency is the most difficult — it probably means I need to
finish the Everything::NodeBase refactoring. (Okay, that’s not
necessarily true. I could hack up the system just enough to get this swapping
in place, perhaps by making NodeBase just subclassable enough…
but I have the feeling that’s a lot of work.)
On to fixing the node test. I want to ease into this testing
again, not just because it’s the start of the hour. I feel a little under the
weather, so I’m trying to avoid some physical discomfort. The final test in the
file, for restrictTitle() looks like the smallest and
simplest.
I missed a semicolon on one line editing these tests, but otherwise it all worked on the first try. Nice.
There are a lot of methods related to workspacing that don’t really belong
here. In a previous attempt to refactor ebase, I planned to take the database
code out of Everything::NodeBase and allow subclassing
that module, so that there could be an
Everything::NodeBase::Workspace class that contained all of the
workspacing code. It’s still a good idea. Alas, it’s not today’s good idea, so
I have to test this code eventually. Of course, getNodeKeepKeys()
looks shorter.
Seriously, once I start migrating the tests for any of these awkward methods, it’ll go pretty quickly. It’s like untangling a huge knot. You poke and pull at it for a while, but once you get a little slack the rest of it just falls apart. (I don’t like the frustration until you get to that point, though I really like the effect and even enjoy the way the last few bits fall into place.)
getNodeKeepKeys() took about three minutes to test. I like
that. There’s not much else short in the file though.
getNodeKeys() was quicker, though I forgot to change one
$mock to $node. The Test::MockObject::Extends
technique I discovered in day 6 worked perfectly here.
There aren’t many other short tests, so now it’s time for the hateful
ones, starting with fieldToXML(). Hey, there’s another
localized typeglob assignment in here, suitable for the TMOE
technique again. I knew this would be useful. I like how it makes
somewhat ucky code just go away.
Of course, looking at how the code works I realize that doesn’t work. It’s still a nice thought though. At least I can undo that typing with no loss. After running the test, I realize I didn’t update the subroutine attribute. I always make at least one little mistake like this — pair programming would catch a lot of these. All tests still pass though.
I found a couple of other smaller tests –
getIdentifyingFields() took about a minute to test. Hooray.
updateFromImport() might take five.
Why am I cherry-picking? I tend to do that when I write tests for untested code, and I just noticed the connection here too. Part of it is getting in a groove. Part of it is liking to make visible progress. I did choose the largest node test file to migrate first though — that’s because once I have this in place, I should be able to inherit a lot of tests and get rid of a lot of duplication while simultaneously improving the test coverage.
That doesn’t mean I want to jump into the nastiest method to test first though. I’m not that crazy.
After s/mock/node/g again in this test method, I wonder idly if
I even use the mock object (not the wrapped one, the actual mock
object) anywhere. Maybe not. Looking at the tests so far, not really. That’s
interesting.
Looking more closely at the updateFromImport() test, I also
wonder if I really do need to mock getNodeKeys() and
getNodeKeepKeys(). Obviously it doesn’t hurt the tests to do it,
in the sense that they pass, but mocking is a big enough hammer that I like to
use it only where absolutely necessary. (Sure, I did write Test::MockObject
based on what I learned in this process, but if I’d thought really really hard
and been a lot smarter, I might have started with
Test::MockObject::Extends. I didn’t, so learn from my
example.)
After undoing most of the mocks (and adding update()), the test
fails in Everything::Node::getNodeDatabaseHash(). That, at least,
needs mocking. Of course, the original code merely mocked
getNodeKeys() and getNodeKeepKeys(). I think it was
on the right track, so I’ll stick with that. It passes again. That’s 54 tests
out of 288, I believe — although I’ve deleted a few. That’s progress.
The next test is for xmlFinal(). That’s another 6
tests.
The next test is for conflictsWith(). I have about 20
minutes left. I’ll probably have to take on one of the larger tests very
soon. About a third of the test module is new code and the rest is
commented-out old, procedural code. Some of that will go away, but the new
file will probably end up somewhat larger.
That’s fine. Some of that code will replace code in other test files. Some of that code will go in other test files when the features (XML handling, node saving and loading) go in other modules. The important point is that it’s easier to read now, much easier to reuse, and much, much easier to maintain. I also have more confidence in it now, not the least because I’m getting rid of a few dirty tricks here and there.
I deleted a couple of tests from conflictsWith(). I used to
mock everything, testing that the methods tested internally called
other methods. I don’t care about that as much anymore, mostly because if
the internals change but do the same thing in a different way (or in a
different order), I want the tests to care only about getting the right
answers. I can still mock the methods as appropriate, but I don’t have to
check their calls.
The next test is for verifyFieldUpdate(). I’ve decided to
tackle the rest of the tests in order after this. I’m lazy, but I’m not
that lazy. (Finding the next test to rewrite was starting to take
too long and I started to feel guilty about it.) After adding a missing
closing parenthesis, the test passed the first time.
(I bet having a pair programming partner would have made me feel guilty for skipping around even sooner, too.)
It looks like insert() is the next test. I suspect that I
will end up with several test_insert*() methods when this is
over. The tests look like there are functional chunks, so I’ll take them
one at a time. That should break them up into more manageable pieces
anyway. (The method implementation is a frightening 85 lines of code
anyway. Yuck. It does too much! 20 is fairly long, in my mind.)
If I’m a good programmer now, and I hope I am, it’s because I’ve lived through all sorts of mistakes in design and implementation like this code currently has. Of course, the five hundred thousand plus posts on Perlmonks.org argue that some version, at least, of this code was useful – but it’s not entirely pretty.
Distractions aside, what’s next? The next section of insert()
testing is that the method returns false if there are duplicates and the
nodetype does not allow duplicates. Again, the test cared about how
the method checks for duplicates. I don’t. Those tests can go away. I don’t
think the method will pass from the start now, but it’s worth trying.
Nope. I didn’t mock something — the user access test. That’s easy enough to fix.
Because I broke out this set of tests into a separate method, I wonder if it’s worth testing what happens when the nodetype doesn’t have duplicate restrictions in here. It probably won’t work, but it’s a great experiment.
It didn’t work, but I think that may be because it’s not getting a new node
id. Maybe not. It’s still worth trying. Hmm, that didn’t work either. Looking
at the code again, I’m starting to think that I’m not even exercising the
correct code path. Is this code even getting to the point where it
tries to restrict duplicates? I bet it isn’t. I bet
restrictTitle() is failing. It is. That’s another one to mock.
After doing that, there’s a big blowup on
Everything::Node::AUTOLOAD(). Excellent. I’m making progress.
This is probably the place where I need an unwrapped mock object, as the
current code uses the $node as the mock database. That’ll fix
things.
Changing $node->{DB} to the mock database object reveals
that the latter doesn’t mock everything it should. Great! That’s another
step.
That didn’t fix it, and that’s because $node doesn’t mock
getId() either.
After fixing that, the remaining errors reveal unmocked methods on the mock database. This method is too long and it does too much — that’s why there’s so much to mock here! I can’t refactor it yet though, so the only solution is to add a lot of mocking code.
There’s another method to mock on the nodetype node (my reused
$node): getNTableArray(). Again, I stick with the
simplest code. This time, everything works, in the sense that all of the tests
run. One fails. That’s fine; I probably didn’t update the return value. Yep.
That fixed it.
That was a lot of work to get the second test to pass, but I’m glad I did it — I was exercising the wrong code path before.
This does show one potential limitation of the Test::Class
approach and that is that resetting the state of the fixture before every test
means that you might have to write more setup code in every test method. That’s
not necessarily a bad thing. If you do test-driven design, this could
lead you to write shorter, less-coupled methods. I think that’s the biggest
problem with this code.
Also, one of the failings of the original test code is that it relied on a
lot of spooky action at a distance. Compare the amount of mocking code in the
new version with that of the old version — compare also the number of times I
have to call clear() in the new version. Statefulness can help
sometimes, but it has grown less manageable here.
Now 69 out of 228 tests pass, and I bet the second number is closer to 200 now. That’s about a third, for two days of work. I don’t know if the next couple of days will show much faster progress, but I still feel a little disappointed, even though I did make a fair amount of progress today. I’m 40% of the way through the test file.
I’ll undo the change to t/Node/node.t so that the normal tests all pass and check in this version. Everything still passes and the diff looks sane, so this is checkin #839.

