This 30-day project explores the refactoring of a legacy system. The Everything Engine is an aging software project that powers Perl Monks, Everything 2, and a few other websites. It suffers from poor design and maintainability.
Here’s what I learned about programming, refactoring, and 30-day projects during this series.
Final Thoughts
That’s it: this is the end of my first published thirty-day project. I spent around thirty hours working on the Everything Engine, trying to refactor it into something a little nicer and more usable. I had some successes and some failures, and I learned a lot. I hope you did too.
Project Summary
Why a thirty-day project? First, because it’s a lot bigger than the usual “Hack for a couple of hours, then write a weblog about it.” I know 30 weblogs over a period of about six weeks is a lot to ask people to pay attention to, but I tried to throw in other publishing as well. (It’s a lot longer than my attention span, at least.)
I think it’s also nice to watch a project evolve over time. I tried to be very clear about my missteps and changes of direction. I like to think you’d have seen the same thing if you’d been pairing with me. That may or may not be valuable, but I do think it’s a noble goal.
I didn’t have an ultimate goal when I started the project, except to leave it in a cleaner state. On the whole, I think I did. I didn’t achieve as much as I wanted to achieve, but thirty hours of solid work represents about as much development work as I would do in an actual job during a week.
Thus if you look at the whole project as a single developer taking a week off to clean up the system (and if I were to do that in an ongoing, paying project, I would have more concrete goals), it looks pretty good.
Technical Thoughts
The most important part of the retrospective is the technical part. That was the goal, not just to clean up a system but to document how I did it.
Successes
The system is, on the whole, cleaner than it was when I started. Even the first day made a big improvement; fixing the directory layout and making it easier to manage files in Subversion allowed me to spend less time thinking about how the project worked and more time making other changes.
I spent the most time refactoring the tests for all of the node classes. Part of this enabled me to improve test coverage dramatically, but most of it was so that I could perform further refactorings.
This is didactically valuable in that it shows how refactorings build on each other. Granted, the previous system design was far from a good design, so a project built with more discipline from the start would have needed less invasive changes.
Making that work required me to fix the design in some other parts of the system as well, notably improving the node method dispatch. This did remove some of the dynamism from the system (so it’s technically a behavior change and not a refactoring), but as a temporary change, I feel comfortable making it. It would have been very difficult to support the old behavior alongside the new.
I did get to make a few other code changes, though I didn’t take them as far as I wanted to. The database access scheme still needs some attention, but it’s getting much cleaner than it was. Every time I made a change to the system, I noticed how much extra coupling there was between components. There’s a lot of work to go, but every change makes things a little cleaner and other changes possible.
Toward the end of the series, when I started to concentrate on the database code, I realized that my ultimate goal had to be running the node tests against a live database. The mocking code was clever (and there was a lot of it), but I’d never get any more confidence in the system without much more integration.
I don’t have much of that yet, but I do have a start, so I know it’s possible and even doable.
My biggest success overall was discovering a new technique that combines Test::MockObject::Extends with Test::Class. I find it so useful that I do it by default now. (It also helped me fix some bugs in Test::MockObject and SUPER.) I’ve since written about this in Mocks in Your Test Fixtures, on Perl.com.
Failures
Not everything went perfectly though. I’m pretty sure the system as a whole has some subtle bugs that weren’t there before I started making changes. (I know it had some even before that.) The test coverage still isn’t good enough. It’s better and it’s fixable, but I’m more aware than ever of how incomplete it is.
I spent a lot of time translating node tests to the new style by hand when I could have written a small program to do it for me. If I’d done this all in a week, I’d have automated the process. By spreading it out, I felt less pain at repeating myself.
I also feel like I made a bunch of little messes. I remember promising to clean up a few things later. Of course, that’s how refactoring goes. I turned a big tangle of code into several smaller tangles. Hopefully they’re more manageable, but I still feel a little bit guilty about not making perfectly clean code in only thirty hours. It’ll pass.
Lessons Learned
Mostly, I learned that refactoring a large system is more difficult than it seems — and that the ease of refactoring depends on the quality of the test suite.
We had a big test suite and it covered a lot of the system, but it didn’t cover it particularly well. I learned a lot from writing it (and have learned even more since then).
It’s also funny/ironic to see how some of the newer web application systems do what they do. Everything did a lot of that several years ago, but it never had quite the polish of the other systems. It would be nice to bring it up to date… but it needs some infrastructure work first.
Project Thoughts
This was my first thirty-day project, so it’s worth thinking about how that worked too. I’d like to do another, but I’m not sure what it might be yet.
Successes
I really like the format and the discipline of choosing a workweek-sized task and documenting that honestly. It’s a nice size for a project and it’s not an investment too heavy for me or for readers. I hope.
I didn’t receive many public comments, but I did receive a fair amount of private feedback that people found the series valuable. For a subject as esoteric as the software behind a handful of (admittedly useful) sites, as well as class-based testing with Perl, it was nice to know that other people followed along. I hope the series is generally useful.
I also learned a lot about how to set up this series. I tried to work a week in advance, knowing that I’d publish three times a week. For the most part, that worked. I fell behind toward the end due to a vacation, but I don’t really regret that. (I wonder how it affected the readers, though.)
Failures
Our weblog instance here makes it difficult to categorize these posts. I hope we can rectify that soon.
I was never sure whether to post patches (especially 1000+ line monsters) or refer people to the Subversion repository with the appropriate checkin numbers. I almost always did the latter, but removing the code from the weblog, even by one click, likely reduced the number of people who read the code.
The series bogged down in the middle when I spent two weeks of entries refactoring the test cases from the procedural form into the Test::Class form. I know that was as boring to read as it was to write. I promised that I’d be honest about the process though. If I do this again, I’ll try to pick something with less tedium in the middle.
Lessons Learned
The three-a-week publishing schedule seemed to work. It gave me a chance to fit in the hour of work per virtual day in my normal schedule. Having a week or two in advance helps work around deadlines, so I’ll do that next time.
Starting a new project (or at least a new feature of an existing project) seems more interesting than refining an old project. Yet I’m not sure how pure research will go. For example, if I write a game in Ruby, will I run into trouble when I find out I don’t know how to use Rake effectively? I don’t know.
I don’t know how to get more people to read the code as I make it, and I don’t know how to handle potential contributions during the process. That may never come up, though.
I do think that getting the right project (and sticking to the schedule) will be more interesting to readers; maybe measuring that by feedback isn’t the right approach. However, I do still like this idea of learning in public. It would be nice to see other people do it.
I’m not sure what my next project will be, but I hope to decide soon. Thanks for reading along.
All Thirty Days
Based on reader feedback, here’s a list of links to all 30 entries.

Will we ever get access to the svn repository with your changes?
Hm, I thought I'd published the link earlier. It's at SourceForge, under the name "everydevel": http://svn.sourceforge.net/viewcvs.cgi/everydevel/
As a closet fan of eCore it's really cool seeing the ol' lamp get some attention. I'm glad you enjoyed it too and hopefully this will lead some more contributions/authors to ecore.
I admit as a noncoder, I pretty much skimmed the Greek sections of your posts and was just content seeing that it's being worked on. A month ago I publicly wondered "is anyone really using this thing still"; good to see people are!
Thanks,
jeffm on everydevel
Thanks for publishing. :)
I was hoping there'd be more focus on the system itself and was a bit disappointed that the tests took up so much time. It's not just a matter of tedium and boredom, it's that any rearrangements in the main code were buried under a sandpile of test suite changes. Maybe that would have been OK if the rearrangements had been meatier; though if they could have been, then the system would have been in a better shape to begin with.
You're right, Aristotle. It would have been nice if I could have cleaned up the tests first, but I didn't remember how much work they needed. It was also difficult to justify working on the system without either needing to use it myself (I don't, at the moment) or writing about it... so it may not be a good example of what a 30 Day Project could be.
Just wanted to say thanks for an interesting set of posts. I've dipped in and out of the series over the last month or so and found it an intriguing exercise. I think you're right when you said it bogged down when documenting the refactoring of the test cases, although it had to be done to be true to your original goals. I've learned some interesting tips with regards to Test::MockObject, and as a result this should encourage the use of a more test-driven development approach in the future. I'm not sure what other large, well-known open source Perl systems there are out there, but I'd be interested to see a similar exercise on something which most people would recognise regardless of whether they were a Perl developer or not. Any ideas?
I remember stumbling across early installments in this series in April. But, notwithstanding seeing you and hearing you speak at YAPC::NA in Chicago, I forgot about the series until I stumbled across the Retrospective entry yesterday. Why? Because the blog format meant that you couldn't include hyperlinks to entries yet to be written and it didn't include links to entries already written. Even fiddling around the location bar didn't easily work because O'Reilly primarily organizes its blogs by year and month -- and only thereafter by entry number.
So, would it be possible to refactor the blog pages so that if I now go to entry #2, it includes a link back to entry #1 and a link forward to #3? This would make the series more accessible and useful going forward.
Notwithstanding these technical problems, I think the approach is a good one and was struck by its similarity to the approach I took in my own talk at YAPC on taking over maintenance of CPAN modules.
My biggest success overall was discovering a new technique that combines Test::MockObject::Extends with Test::Class. I find it so useful that I do it by default now.
Can you provide a link to read more about this?
Perrin, see Mocks in Your Test Fixtures.