Programming has much in common with other kinds of writing. In particular, one of the goals of programming, as with writing, is to write in a way that is both beautiful and interesting. Of course, as with other kinds of writing, one’s first drafts are typically neither beautiful nor interesting. Rather, one starts by hacking something out that achieves one’s ends, without worrying too much about the quality of the code.
For a certain kind of program, this approach is sufficient. One may write a quick throwaway program to accomplish a one-off task. However, for more ambitious programs, this kind of throwaway approach is no longer feasible; the more ambitious the goals, the higher the quality of the code you must produce. And the only way to get high quality code is to start with the first draft of your code, and then gradually rewrite – refactor – that code, improving its quality.
I’m a relative novice as a programmer, and I’m still learning the art of refactoring. I try to set aside a certain amount of time for improving the quality of my code. Unfortunately, in my refactoring attempts, I’m occasionally stymied, simply going blank. I may look at a section of code and think “gosh, that’s terrible”, but that doesn’t mean that I instantly have concrete, actionable ways of improving the quality of the code.
To address this problem, I produced the following refactoring checklist as a way of stimulating my thinking when I’m refactoring. The items on the checklist are very simple and mechanical to check, and so provide a nice starting point to improve things, and to get more deeply into the code.
- Start by picking a single method to refactor. Attempting to refactor on a grander scale is too intimidating.
- If a method is more than 40 lines long, look for ways of splitting out pieces of functionality into other methods.
- Are variables, methods, classes and files well named? This question can often be broken down into two subquestions. First, is there any way I can improve the readability of my code by changing the names? Second, is my naming consistent? So, for example, a file named “bunny” should define a class named “bunny”, not “rabbit”.
- When I see a problem, do something about it, even if I can’t see a fix right away. In particular, at the very least add a comment which attempts to describe the problem. This makes future refactoring easier, and sends a clear message to my subconscious that I mean to eliminate all the bad code from my program. As a bonus, more often than not it stimulates some ideas for how to improve the code.
- Partial solutions are okay. It’s fine to replace an ugly hack with a slightly less ugly hack.
- How readable is the code being refactored? In particular, is it obvious what this section of code does? If not, how could I make it obvious?
- Explain the code to someone else. I’ve never explained a piece of code to someone without having a bunch of ideas for how to improve it.
- Is there a way of abstracting away what is being done, in a way that would enable other uses?
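To make a few of the checklist items concrete, here is a minimal sketch in Python of a single refactoring pass over one function. Everything in it is hypothetical and invented for illustration: the function names, the "name,price" record format, and the report string are assumptions, not code from a real project. The first draft does everything in one place with terse names; the second version splits out pieces of functionality and renames things so that it is more obvious what each part does.

```python
# Hypothetical example: all names and the "name,price" format are
# invented for illustration only.

# First draft: one function that filters, parses, totals, and formats.
def report_draft(lines):
    t = 0
    n = 0
    for l in lines:
        l = l.strip()
        if not l or l.startswith("#"):
            continue
        t += float(l.split(",")[1])
        n += 1
    return "items=%d total=%.2f" % (n, t)


# After one refactoring pass: each piece of functionality lives in its
# own well-named function, and the top-level function reads like a summary.
def is_data_line(line):
    """Return True for lines holding a record (not blank, not a comment)."""
    stripped = line.strip()
    return bool(stripped) and not stripped.startswith("#")


def parse_price(line):
    """Extract the price from a 'name,price' record."""
    return float(line.split(",")[1])


def format_report(prices):
    """Format the item count and total price as a one-line summary."""
    return "items=%d total=%.2f" % (len(prices), sum(prices))


def report(lines):
    prices = [parse_price(line) for line in lines if is_data_line(line)]
    return format_report(prices)
```

The refactored version is a partial solution in the spirit of the checklist: it still assumes well-formed records, but each remaining problem now has an obvious home, such as `parse_price`, where a comment or a later fix can go.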
I highly recommend “Code Complete” by Steve McConnell. He talks about how to approach coding, and backs up everything he says with loads of data.
If you write a piece of code and truly think it’s terrible, it probably needs to be rewritten from scratch, not just refactored. Coding is a learning process, and sometimes you learn what not to do.
One problem I suffered from when I first started writing code was trying to make things _too_ abstract. Often the best thing to do is code only for the purpose at hand. That will keep things tidy and simple, which actually makes it easier to adapt for more purposes in the future. Knuth said premature optimization is the root of all evil, but early generalization could be just as bad.
If you want to write code that is useful for multiple purposes, you’ll need multiple use cases to validate your design.
Hi Jon,
I recently bought McConnell’s book, and have been working my way through it. Another nice read in a similar vein is “The Pragmatic Programmer”, by Hunt and Thomas. It’s a bit more lightweight.
Incidentally, McConnell cowrites a blog (Software Best Practices, link in my blogroll) that is pretty good.
Your comment about generalization really captures something nicely (as does the Knuth quote)! Paraphrasing very slightly, “Premature generalization is the root of much evil”. As you say, it seems as though the trick is to write something concrete, for a concrete purpose, and only then to extract and generalize. Skipping the concrete step just doesn’t work.