APIs and the art of building powerful programs

In a recent essay, Steve Yegge observed, wisely in my opinion, that “the worst thing that can happen to a code base is size”.

Nowadays, I spend a fair bit of my time programming, and, like any programmer, I’m interested in building powerful programs with minimal effort. I want to use Yegge’s remark as a jumping off point for a few thoughts about how to build more powerful programs with less effort.

Let’s start by asking what, exactly, is the problem with big programs?

For solo programmers, I think the main difficulty is cognitive overload. As a program gets larger it gets harder to hold the details of the entire program in one’s head. This means that it gradually becomes more difficult to understand the side effects of changes to the code, and so alterations at one location become more likely to cause unintended (usually negative) consequences elsewhere. I’d be surprised if there are many programmers who can hold more than a few hundred thousand lines of code in their head, and most probably can’t hold more than a few tens of thousands. It is suggestive that these numbers are comparable to the sizes involved in other major coherent works of individual human creativity – for example, a symphony has roughly one hundred thousand notes, and a book has tens of thousands of words or phrases.

An obvious way to attempt to overcome this cognitive limit is to employ teams of programmers working together in collaboration. Collaborative programming is a fascinating topic, but I’m not going to discuss it here. Instead, in this essay I focus on how individual programmers can build the most powerful possible programs, given their cognitive limitations.

The usual way individual programmers overcome the limits caused by program size is to reuse code other people have built; perhaps the most familiar examples of this are tools such as programming languages and operating systems. This reuse effectively allows us to incorporate the codebase of these tools into our own codebase; the more powerful the tools, the more powerful the programs we can, in principle, build, without exceeding our basic cognitive limits.

I’ve recently experienced the empowerment such tools produce in a particularly stark way. Between 1984 and 1990, I wrote on the order of a hundred thousand lines of code, mostly in BASIC and 6502 assembly language, with occasional experimentation using languages such as Forth, Smalltalk and C. At that point I stopped serious programming, and only wrote a few thousand lines of code between 1990 and 2007. Now, over the last year, I’ve begun programming again, and it’s striking to compare the power of the tools I was using (and paying for) circa 1990 with the power of today’s free and open source tools. Much time that I would formerly have spent writing code is now instead spent learning APIs (application programming interfaces) for libraries which let me accomplish in one or a few lines of code what would have formerly taken hundreds or thousands of lines of code.
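
To make this concrete, here is a small illustration (a hedged example of my own, not drawn from any particular project): fetching a web page, which circa 1990 would have meant hand-rolling sockets and a fair chunk of protocol code, takes a couple of lines with Ruby’s standard library.

    # Fetch a page over HTTP; the URL is just a placeholder.
    require 'net/http'
    require 'uri'

    page = Net::HTTP.get(URI.parse('http://example.com/'))
    puts page.length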

Let’s think in more detail about the mechanics of what is going on when we reuse someone else’s code in this way. In a well-designed tool, what happens is that the internals of the tool’s codebase are hidden behind an abstract external specification, the API. The programmer need only master the API, and can ignore the internal details of the codebase. If the API is sufficiently condensed then, in principle, it can hide a huge amount of functionality, and we need only learn a little in order to get the benefit of a huge codebase. A good API is like steroids for the programming mind, effectively expanding the size of the largest possible programs we can write.
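
As a minimal sketch of the idea (illustrative only, not any particular library’s design), here is a tiny Ruby class whose callers need to know just two methods; everything behind those methods could be rewritten without the callers ever noticing.

    class Cache
      def initialize
        @store = {}                 # internal detail, invisible to callers
      end

      # The entire "API" is these two methods.
      def put(key, value)
        @store[key] = value
      end

      def fetch(key)
        @store[key]
      end
    end

    cache = Cache.new
    cache.put(:greeting, 'hello')
    puts cache.fetch(:greeting)     # => hello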

All of this is common knowledge to any programmer. But it inspires many natural questions whose answers perhaps aren’t so obvious: How much of a gain does any particular API give you? What makes a particular API good or bad? Which APIs are worth learning? How should one go about designing an API? These are difficult questions, but I think it’s possible to make some simple and helpful observations.

Let’s start with the question of how much we gain for any given API. One candidate figure of merit for an API is the ratio of the size of the codebase implementing the API to the size of the abstract specification of the API. If this ratio is, say, 1:1 or 3:2, then not much has been gained – you might just as well have mastered the entire codebase. But if the ratio is 100:1 then you’ve got one hundred times fewer details you need to master in order to get the benefit of the codebase. This is a major improvement, and potentially greatly expands the range of what we can accomplish within our cognitive limits.
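
In rough numbers (the figures below are purely illustrative, not measurements of any real library):

    codebase_lines = 200_000            # assumed size of the implementation
    spec_lines     = 2_000              # assumed size of the API specification
    puts codebase_lines / spec_lines    # => 100, i.e. a 100:1 figure of merit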

One thing this figure of merit helps explain is when one starts to hit the point of diminishing returns in mastering a new API. For example, as we master the initial core of a new library, we’re often learning a relatively small number of things in the API that nonetheless enable us to wield an enormous codebase. Thus, the effective figure of merit is high. Later, when we begin to master more obscure features of the library, the figure of merit drops substantially.

Of course, this figure of merit shouldn’t be taken all that seriously. Among its many problems, the number of lines of code in the codebase implementing the API is only a very rough proxy for the sophistication of the underlying codebase. A better programmer could likely implement the same API in fewer lines of code, but obviously this does not make the API less powerful. An alternate and better figure of merit might be the ratio of the time required for you to produce a codebase implementing the API, versus the time required to master the API. The larger this ratio, the more effort the API saves you. Regardless of which measure you use, this ratio seems a useful way of thinking about the power of an API.

A quite different issue is the quality of the API’s design. Certain abstractions are more powerful and useful than others; for example, many programmers claim that programming in Lisp makes them think in better or more productive ways. This is not because the Lisp API is especially powerful according to the figure of merit I have described above, but rather because Lisp (so I am told) introduces and encourages programmers to use abstractions that offer particularly effective ways of programming. Understanding what makes such an abstraction “good” is not something I’ll attempt to do here, although obviously it’s a problem of the highest importance for a programmer!

Which APIs are worth learning? This is a complicated question, which deserves an essay in its own right. I will say this: one should learn many APIs, across a wide variety of areas, and make a point of studying multiple APIs that address the same problem space using greatly different approaches. This is not just for the obvious reason that learning APIs is often useful in practice. It’s because, just as writers need to read, and movie directors should watch other directors’ movies, programmers should study other people’s APIs. They should study other people’s code, as well, but it’s a bit like a writer studying the sentence structure in Lord of the Rings; you’ll learn a lot, but you may just possibly miss the point that Frodo is carrying a rather nasty ring. As a programmer, studying APIs will alert you to concepts and tricks of abstraction that may very well help in your own work, and which will certainly help improve your skill at rapidly judging and learning unfamiliar APIs.

How does one go about mastering an API? At its most basic level mastery means knowing all or most of the details of the specification of the API. At the moment I’m trying to master a few different APIs – for Ruby, Ruby on Rails, MySQL, Amazon EC2, Apache, bash, and emacs. I’ve been finding it tough going, not because it’s particularly difficult, but just because it takes quite a lot of time, and is often tedious. After some experimentation, the way I’ve been going about it is to prepare my own cheatsheets for each API. So, for example, I’ll take 15 minutes or half an hour and work through part of the Ruby standard library, writing up in my cheatsheet any library calls that seem particularly useful (interestingly, I find that when I do this, I also retain quite a bit about other, less useful, library calls). I try to do this at least once a day.
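
For what it’s worth, an entry in one of these cheatsheets looks something like the following (the selection is mine and purely illustrative):

    # Enumerable / Array odds and ends worth remembering
    [3, 1, 2].sort                                      # => [1, 2, 3]
    (1..10).select { |n| n % 2 == 0 }                   # => [2, 4, 6, 8, 10]
    %w[a b c].each_with_index { |x, i| puts "#{i}: #{x}" }

    # File and directory helpers
    File.readlines('notes.txt')    # whole file as an array of lines (assumes the file exists)
    Dir.glob('*.rb')               # all Ruby files in the current directory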

(Suggestions from readers for better ways to learn an API would be much appreciated.)

Of course, knowing the specification of an API is just the first level of mastery. The second level is to know how to use it to accomplish real tasks. Marvin Minsky likes to say that the only way you can really understand anything is if you understand it in at least two different ways, and I think a similar principle applies to learning an API – for each library call (say), you need to know at least two (and preferably more) quite different ways of applying that library call to a real problem. Ideally, this will involve integrating the API call in non-trivial ways with other tools, so that you begin to develop an understanding of this type of integration; this has the additional benefit that it will simultaneously deepen your understanding of the other tools as well.
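
For example (an illustration of my own), Ruby’s Enumerable#inject can be learned once for simple arithmetic and again, quite differently, for building up a data structure:

    # First way: folding numbers into a sum.
    sum = (1..5).inject(0) { |total, n| total + n }
    puts sum                                     # => 15

    # Second, quite different way: folding words into a hash of lengths.
    words = %w[api code tool]
    lengths = words.inject({}) { |h, w| h[w] = w.length; h }
    p lengths                                    # => {"api"=>3, "code"=>4, "tool"=>4}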

Achieving this second level of mastery takes a lot of time and discipline. While I feel as though I’ve made quite some progress with the first level of mastery, I must admit that this second level tries my patience. It certainly works best when I work hard at finding multiple imaginative ways of applying the API in my existing projects.

There’s a still higher level of mastery of an API, which is knowing the limits of the abstract specification, and understanding how to work around and within those limits. Consider the example of manipulating files on a computer file system. In principle, operations like finding and deleting files should be more or less instantaneous, and for many practical purposes they are. However, if you’ve ever tried storing very large numbers of files inside a single directory (just how large depends on your file system), you’ll start to realize that there actually is a cost to file manipulation, and it can get downright slow when the files number in the many thousands.
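
If you want to see this for yourself, a rough sketch along the following lines will do (the directory name and file counts are arbitrary, and the timings depend entirely on your file system):

    require 'benchmark'
    require 'fileutils'

    dir = 'many_files_test'                     # hypothetical scratch directory
    [1_000, 50_000].each do |n|
      FileUtils.mkdir_p(dir)
      seconds = Benchmark.realtime do
        (1..n).each { |i| FileUtils.touch(File.join(dir, "file#{i}")) }
        Dir.entries(dir).length                 # listing also slows as n grows
      end
      puts "#{n} files: #{seconds} seconds"
      FileUtils.rm_rf(dir)                      # clean up before the next run
    end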

In general, for any API the formal specification is never the entire story. Implicit alongside the formal specification is a meta-story about the limits to that specification. How does the API bend and break? What should you do when it bends or breaks? Knowing these things often means knowing a little about the innards of the underlying codebase. It’s a question of knowing the right things so you can get a lot of benefit, without needing to know a huge amount. Poorly designed APIs require a lot of this kind of meta-knowledge, which greatly reduces their utility, in accord with our earlier discussion of API figures of merit.

We’ve been talking about APIs as things to learn. Of course, they are also things you can design. I’m not going to talk about good API design practice here – I don’t yet have enough experience – but I do think it’s worth commenting on why one ought to spend some fraction of one’s time designing and implementing APIs, preferably across a wide variety of domains. My experience, at least, is that API design is a great way of improving my skills as a programmer.

Of course, API design has an immediate practical benefit – I get pieces of code that I can reuse at later times without having to worry about the internals of the code. But this is only a small part of the reason to design APIs. The greater benefit is to improve my understanding of the problems I am solving, of how APIs function, and what makes a good versus a bad API. This improved understanding makes it easier to learn other APIs, improves how I use them, and, perhaps most important of all, improves my judgement about which APIs to spend time learning, and which to avoid.

Fred Brooks famously claimed that there is “no silver bullet” for programming, no magical idea or technique that will make it much easier. But Brooks was wrong: there is a silver bullet for programming, and it’s this building of multiple layers of abstraction using ever more powerful tools. What’s really behind Brooks’ observation is a simple fact of human psychology: as more powerful tools become available, we start to take our new capabilities for granted and so, inevitably, set our programming sights higher, desiring ever more powerful programs. The result is that we have to work as hard as ever, but can build more powerful tools.

What’s the natural endpoint of this process? At the individual level, if, for example, you master the API for 20 programming tools, each containing approximately 50,000 lines of code, then you can wield the power of one million lines of code. That’s a lot of code, and may give you the ability to create higher-level tools that simply couldn’t have been created at the lower level. Those higher-level tools can be used to create still higher-level tools, and so on. Stuff that formerly would have been impossible first becomes possible, then becomes trivial, and finally becomes invisible, absorbed into higher-level primitives. If we move up half a dozen levels, buying a factor of 2-5 in power at each layer, the result is that we can get perhaps a thousand times more done. Collectively, the gain for programmers over the long run is even greater. As time goes on we will see more and more layers of abstraction, built one on top of the other, ever expanding the range of what is possible with our computing systems.
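
The arithmetic behind that guess is simple (the per-layer factors below are assumptions chosen only to illustrate the compounding):

    gains = [2, 3, 3, 3, 4, 5]    # hypothetical factor gained at each of six layers
    puts gains.inject(:*)         # => 1080, i.e. roughly a thousandfold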

6 comments

  1. “A good API is like steroids for the programming mind”…

    Of course, as is always the case with steroids, this may cause shrinkage in other areas — notably in this instance, the non-programming mind 🙂

  2. A great article on the importance and leverage of good API design. Just as important as a well-designed API, in my opinion, is the design of the programming tools themselves. That is, the programming environment. A while back I delved into .NET programming. While the .NET API is extremely well designed (I should also give credit to Java, since .NET is effectively a Java clone), it’s the programming tools that are the most powerful aspect of .NET programming. When I began writing C# programs I had next to no knowledge of the API in my head. However, the programming environment compensated for this by helping me out with the API. Suppose ‘i’ is an integer. As soon as I type ‘i.’ the environment recognizes that I’m about to call a method associated with ‘i’ and brings up a drop-down menu listing all the available methods, with descriptions. The method names are chosen in such a way that their meaning is obvious. For example, the meaning of methods like ‘ToString’ and ‘ToDouble’ is obvious when they are shown in the drop-down menu. This makes memorization of methods largely unnecessary. Credit should also be given to today’s visual development tools, which further reduce the amount of code that developers need to write.

  3. Hi Peter,

    Thanks for the comments, and kind words. I’ve never used a full-fledged IDE, but from your description it certainly sounds like they’re a useful way of quickly ramping up the API learning curve. I occasionally find myself opening the Ruby console, which does code completion, for exactly the reason you describe; it’d be much more convenient to have this done as I’m coding. Should be pretty easy to do in emacs, now that I think about it…

    Michael

  4. Good article, and a good read! It evoked some nostalgia in me (for ye olde days as a programmer), along with a sort of “Good lord, the programming world has left me behind while I was doing physics!” feeling.

    I can’t help musing on how this relates to fault tolerance, in the abstract. The story of human civilization, and its progress toward greater complexity, is mostly about fault tolerance. If we want complex things that work, then we arrange the pieces in a hierarchy. Done right, this approach builds a structure that continues to stand even when one or more of its pieces are faulty.

    We do this socially (etiquette on Tokyo’s subway system has to be robust to a very high fault rate), financially (so far, the sub-prime crisis is not developing into a 1929), in engineering (analysis of the I-35 bridge collapse in Minneapolis shows it probably couldn’t happen with newer, failure-resistant bridge designs), and of course in computer programming (where hierarchical design allows efficient debugging, and reuse of tested code). For that matter, we are ourselves explicitly fault-tolerant critters at the DNA level. I think it’s very cool that this concept goes so deep!

  5. Robin — Not to mention the ultimate fault-tolerant machine: the human brain. Luckily for all of us, the brain’s software does not crash as often as MS Office. We must be running Linux or Mac OSs 🙂
