Anonymous browsing now possible at the Academic Reader

Another update: as of a few weeks ago, it’s possible to browse anonymously for up to three days at the Academic Reader. So you can now easily try the service out and see if it’s for you.

After three days of anonymous use, you’ll be required to create an account. This isn’t imposed frivolously – it’s because the anonymous browsing is implemented using browser-based cookies, which aren’t particularly stable over periods of months. Creating an account ensures that your data (e.g., choice of journals, comments, personal feeds) won’t be lost if you’re a long-term user.

Published

Living Reviews in Relativity added!

I just added the entire back catalog of one of my favourite journals, “Living Reviews in Relativity”, to the Academic Reader. It’s not part of the default feed set, so you’ll need to click on “Find more external feeds”, near the top of the left-hand column, if you want to add it. The journal publishes only a few articles per year, so it’s not a particularly active feed. But the articles are usually very good, so if you’re interested in relativity, it’s worth adding to your feeds.

Published

Machine-readable Open Access scientific publishing

Over the last 50 years, scientific publishing has become remarkably profitable. The growth of commercial publishers has greatly outstripped that of not-for-profit society journals, and some of those commercial publishers have achieved striking success – for example, in 2006 industry titan Elsevier had revenues of approximately €1.5 billion from its science and medical journals.

Against this backdrop, an Open Access movement has emerged which has lobbied with considerable success for various types of open access to the scientific literature. Many funding bodies (including six of the seven UK Research Councils, the Australian Research Council, and the US’s NIH) are now considering mandatory Open Access provisions for all research they support. A sign of the success of the Open Access movement is that some journal publishers have started an aggressive counter-lobbying effort, going under the Orwellian moniker “Publishers for Research Integrity in Science and Medicine” (PRISM). (For much more background, see some of the excellent blogs related to Open Access – e.g., Peter Suber, Stevan Harnad, Coturnix, and John Wilbanks.)

What does Open Access actually mean? In fact, there are many different types of Open Access, depending on exactly what types of access to the papers are allowed, and when. However, most of the effort in the Open Access movement seems to have focused on providing access for human readers, in formats like html or pdf. While these formats are good for humans, they are rather difficult for machines to break down and extract meaning from – to pick a simple example, it is not easy for a machine to reliably extract a list of authors and institutions from the raw pdf of a paper.

I believe it is important to establish a principle of Machine-readable Open Access. This is the idea that papers should be published in such a way that both the paper and its metadata (such as citations, authors, and title) are made freely available in a format that is easily machine readable. Phrased another way, it means that publishers should provide Open APIs that allow other people and organizations access to their data.
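
To make this concrete, here is a minimal sketch in Ruby of the kind of access I have in mind, using the OAI-PMH protocol to pull Dublin Core metadata for papers. The endpoint and set name reflect my understanding of the arXiv’s public OAI interface, and are meant purely as an illustration, not as something any publisher currently promises to provide:

require 'net/http'
require 'rexml/document'

# Ask an OAI-PMH endpoint for Dublin Core records. The arXiv's public
# interface and the quant-ph set are used here only as an example.
url = URI("http://export.arxiv.org/oai2?verb=ListRecords&metadataPrefix=oai_dc&set=physics:quant-ph")
xml = Net::HTTP.get(url)

# Each record exposes fields such as title, creator and date as clean XML,
# exactly the metadata that is painful to scrape out of a raw pdf.
REXML::Document.new(xml).root.each_recursive do |element|
  if %w[title creator date].include?(element.name)
    puts "#{element.name}: #{element.text}"
  end
end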

The key point of Machine-Readable Open Access is that it will enable other organizations to build value-added services on top of the scientific literature. At first, these services will be very simple – a better way of viewing the preprint arXiv, or better search engines for the scientific literature. But as time goes on, Machine-Readable Open Access will enable more advanced services to be developed by anyone willing to spend the time to build them. Examples might include tools to analyse the research literature, to discover emerging trends (perhaps using data mining and artificial intelligence techniques applied to citation patterns), to recommend papers that might be of interest, to automatically produce analyses of grants or job applications, and to point out connections between papers that would otherwise be lost.

Indeed, provided such services themselves support open APIs, it will become possible to build still higher level services, and thus provide a greater return on our collective investment in the sciences. All this will be enabled by ensuring that at each level data is provided in a format that is not only human readable, but which is also designed to be accessed by machines. For this reason, I believe that the Open Access provisions now being considered by funding agencies would be greatly strengthened if they mandated Machine-Readable Open Access.

Published

The tension between information creators and information organizers

In 2006, a group of Belgian newspapers sued Google, ostensibly to get snippets of their news stories removed from Google News (full story). In fact, the newspapers were well aware that this could easily have been achieved by putting a suitable file (robots.txt) on their webservers, instructing Google’s web crawler to ignore their content. What, then, was the real purpose of the lawsuit? It’s difficult to know for sure, but it seems likely that it was part of a ploy to pressure Google into paying the newspapers for permission to reuse their content.

This story is an example of a growing tension between creators of information, whether it be blogs, books, movies, music, or whatever, and organizers of information, such as Google. The tension is sharpening as people develop more services for organizing information, and as profits increasingly flow toward the organizers rather than the creators.

As another example, in 2007, Google had advertising revenues of approximately 16 billion dollars(!), most of it from search. Yet, according to one study, approximately twenty-five percent of the number one search results on Google led to Wikipedia. Wikipedia, of course, does not directly benefit from Google’s advertising profits. I bet that at least some of Google’s best sources – e.g., Wikipedia, the New York Times, and some of the top blogs – are not happy that Google reaps what may seem a disproportionately large share of the advertising dollar.

Other examples of new niches in the organization of information include RSS readers (Bloglines, Netvibes); social news sites (Digg, Reddit); even my own Academic Reader. In each case, there is a natural tension between the creators of the underlying information, and the organizing service.

Now, of course, it’s greatly to the public benefit for such organizing services to thrive. However, for this to happen, a great deal of information must be made publicly available, preferably in a machine-readable format, like RSS or OAI. If the information is partially or completely locked up (think, e.g., Facebook’s friendship graph), then that enormously limits the web of value that can be built on top of the information. Yet Facebook is understandably very cautious about opening that information up, fearing that it would harm their business.

The situation is further complicated by the fact that the best people to organize and add value to information are often not the original creators of that information. This is for two reasons. The first is a lack of technical expertise – the New York Times does lots of good reporting, but that doesn’t mean they’ll do a good job of providing a search interface to their archive of old articles. The second is the problem of conflicts of interest – the New York Times would have a much harder time running something like Google News than Google does, since other news organizations would not co-operate with them.

Summing up the problem here in a single sentence, the question is this: to what extent should information be made freely accessible, in order to best serve the public interest?

There has, of course, been a lot of debate about this question, but much of that debate has centered around filesharing of music, movies and so on, where the additional value being added to the information is often minimal. The question becomes much more interesting when applied to services like Google News which add additional layers of meaning and organization to information.

At present, the legal situation is not clear. As an example, in the Belgian newspaper case one might ask whether Google’s usage was acceptable under the fair use doctrine of copyright. After all, Google News only excerpted a few lines from the Belgian newspapers. Obviously, the Belgian courts thought this was not fair use, but courts in other jurisdictions have yet to weigh in.

If the situation today has not yet been resolved, then what might we see in an ideal future? On the one hand, it is highly desirable for information to be freely available for other people to add value. This will often mean making use of a large fraction (or all) of the content, a type of reuse not currently recognized as fair use, yet which is clearly in the public’s interest.

On the other hand, it is also highly desirable for content producers to have incentives to produce content. What we’re seeing at present is a migration of value up the chain from content creators like the New York Times to content organizers, like Google. This, in turn, is causing the content creators to erect fences around their data. The net result is not in anybody’s best interest.

I don’t know what the resolution of this problem is. But it is a real problem, and it’s going to get worse, and it worries me that we’ll end up in a world where the balance is too much one way or the other.

Published

Basic papers on cluster-state quantum computation

Cluster-state quantum computing is an extremely interesting approach to quantum computing. Instead of doing lots of coherent “quantum gates”, as in the usual approach to quantum computing, cluster-state quantum computing provides a way of doing quantum computing with measurements alone. This is surprising from a fundamental point of view, and it also turns out to be very practical – in many physical systems, but especially in optics, it seems that cluster-state quantum computation might be a lot easier to do than regular quantum computing.

Anyways, here’s a collection of basic papers on cluster-state quantum computing, which pretty much mirrors the papers I used to give my students as a way of learning the basics. It’s in no way meant to be complete, or slight anyone, or whatever – it’s just a starter pack describing a lot of the basic ideas.

(As will be obvious if you click on the link, I’m sharing this collection of papers using a website I’ve been developing, the Academic Reader. Suggestions for how to improve the sharing of collections of papers are welcome.)

Published

APIs and the art of building powerful programs

In a recent essay, Steve Yegge observed, wisely in my opinion, that “the worst thing that can happen to a code base is size”.

Nowadays, I spend a fair bit of my time programming, and, like any programmer, I’m interested in building powerful programs with minimal effort. I want to use Yegge’s remark as a jumping off point for a few thoughts about how to build more powerful programs with less effort.

Let’s start by asking what, exactly, is the problem with big programs?

For solo programmers, I think the main difficulty is cognitive overload. As a program gets larger it gets harder to hold the details of the entire program in one’s head. This means that it gradually becomes more difficult to understand the side effects of changes to the code, and so alterations at one location become more likely to cause unintended (usually negative) consequences elsewhere. I’d be surprised if there are many programmers who can hold more than a few hundred thousand lines of code in their head, and most probably can’t hold more than a few tens of thousands. It is suggestive that these numbers are comparable to the sizes involved in other major coherent works of individual human creativity – for example, a symphony has roughly one hundred thousand notes, and a book has tens of thousands of words or phrases.

An obvious way to attempt to overcome this cognitive limit is to employ teams of programmers working together in collaboration. Collaborative programming is a fascinating topic, but I’m not going to discuss it here. Instead, in this essay I focus on how individual programmers can build the most powerful possible programs, given their cognitive limitations.

The usual way individual programmers overcome the limits caused by program size is to reuse code other people have built; perhaps the most familiar examples of this are tools such as programming languages and operating systems. This reuse effectively allows us to incorporate the codebase in these tools into our own codebase; the more powerful the tools, the more powerful the programs we can, in principle, build, without exceeding our basic cognitive limits.

I’ve recently experienced the empowerment such tools produce in a particularly stark way. Between 1984 and 1990, I wrote on the order of a hundred thousand lines of code, mostly in BASIC and 6502 assembly language, with occasional experimentation using languages such as Forth, Smalltalk and C. At that point I stopped serious programming, and only wrote a few thousand lines of code between 1990 and 2007. Now, over the last year, I’ve begun programming again, and it’s striking to compare the power of the tools I was using (and paying for) circa 1990 with the power of today’s free and open source tools. Much time that I would formerly have spent writing code is now instead spent learning APIs (application programming interfaces) for libraries which let me accomplish in one or a few lines of code what would have formerly taken hundreds or thousands of lines of code.
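
As a toy illustration of the kind of saving involved (the URL below is just a placeholder): fetching a web page, which once meant hand-rolling socket and HTTP handling, is a single library call with Ruby’s standard open-uri:

require 'open-uri'

# One library call standing in for what used to be pages of socket and
# HTTP code.
html = URI.open("http://example.com/").read
puts "fetched #{html.length} characters"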

Let’s think in more detail about the mechanics of what is going on when we reuse someone else’s code in this way. In a well-designed tool, what happens is that the internals of the tool’s codebase are hidden behind an abstract external specification, the API. The programmer need only master the API, and can ignore the internal details of the codebase. If the API is sufficiently condensed then, in principle, it can hide a huge amount of functionality, and we need only learn a little in order to get the benefit of a huge codebase. A good API is like steroids for the programming mind, effectively expanding the size of the largest possible programs we can write.

All of this is common knowledge to any programmer. But it inspires many natural questions whose answers perhaps aren’t so obvious: How much of a gain does any particular API give you? What makes a particular API good or bad? Which APIs are worth learning? How should one go about designing an API? These are difficult questions, but I think it’s possible to make some simple and helpful observations.

Let’s start with the question of how much we gain for any given API. One candidate figure of merit for an API is the ratio of the size of the codebase implementing the API to the size of the abstract specification of the API. If this ratio is, say, 1:1 or 3:2, then not much has been gained – you might just as well have mastered the entire codebase. But if the ratio is 100:1 then you’ve got one hundred times fewer details you need to master in order to get the benefit of the codebase. This is a major improvement, and potentially greatly expands the range of what we can accomplish within our cognitive limits.

One thing this figure of merit helps explain is when one starts to hit the point of diminishing returns in mastering a new API. For example, as we master the initial core of a new library, we’re often learning a relatively small number of things in the API that nonetheless enable us to wield an enormous codebase. Thus, the effective figure of merit is high. Later, when we begin to master more obscure features of the library, the figure of merit drops substantially.

Of course, this figure of merit shouldn’t be taken all that seriously. Among its many problems, the number of lines of code in the codebase implementing the API is only a very rough proxy for the sophistication of the underlying codebase. A better programmer could likely implement the same API in fewer lines of code, but obviously this does not make the API less powerful. An alternate and better figure of merit might be the ratio of the time required for you to produce a codebase implementing the API, versus the time required to master the API. The larger this ratio, the more effort the API saves you. Regardless of which measure you use, this ratio seems a useful way of thinking about the power of an API.

A quite different issue is the quality of the API’s design. Certain abstractions are more powerful and useful than others; for example, many programmers claim that programming in Lisp makes them think in better or more productive ways. This is not because the Lisp API is especially powerful according to the figure of merit I have described above, but rather because Lisp (so I am told) introduces and encourages programmers to use abstractions that offer particularly effective ways of programming. Understanding what makes such an abstraction “good” is not something I’ll attempt to do here, although obviously it’s a problem of the highest importance for a programmer!

Which APIs are worth learning? This is a complicated question, which deserves an essay in its own right. I will say this: one should learn many APIs, across a wide variety of areas, and make a point of studying multiple APIs that address the same problem space using greatly different approaches. This is not just for the obvious reason that learning APIs is often useful in practice. It’s because, just as writers need to read, and movie directors should watch movies from other directors, programmers should study other people’s APIs. They should study other people’s code, as well, but it’s a bit like a writer studying the sentence structure in Lord of the Rings; you’ll learn a lot, but you may just possibly miss the point that Frodo is carrying a rather nasty ring. As a programmer, studying APIs will alert you to concepts and tricks of abstraction that may very well help in your own work, and which will certainly help improve your skill at rapidly judging and learning unfamiliar APIs.

How does one go about mastering an API? At its most basic level mastery means knowing all or most of the details of the specification of the API. At the moment I’m trying to master a few different APIs – for Ruby, Ruby on Rails, MySQL, Amazon EC2, Apache, bash, and emacs. I’ve been finding it tough going, not because it’s particularly difficult, but just because it takes quite a lot of time, and is often tedious. After some experimentation, the way I’ve been going about it is to prepare my own cheatsheets for each API. So, for example, I’ll take 15 minutes or half an hour and work through part of the Ruby standard library, writing up in my cheatsheet any library calls that seem particularly useful (interestingly, I find that when I do this, I also retain quite a bit about other, less useful, library calls). I try to do this at least once a day.
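
For what it’s worth, here’s the sort of entry that ends up on one of my cheatsheets, in this case a few Ruby calls I find myself reaching for; the particular selection is arbitrary, and the point is the format, one call per line with a reminder of what it does:

# Enumerable
[1, 2, 3].map    { |x| x * x }            # => [1, 4, 9]  transform each element
[1, 2, 3].select { |x| x.odd? }           # => [1, 3]     keep elements passing a test
[1, 2, 3].inject(0) { |sum, x| sum + x }  # => 6          accumulate a running value

# Files and directories
File.read("notes.txt")                    # whole file as a single string
Dir.glob("*.rb")                          # array of filenames matching a pattern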

(Suggestions from readers for better ways to learn an API would be much appreciated.)

Of course, knowing the specification of an API is just the first level of mastery. The second level is to know how to use it to accomplish real tasks. Marvin Minsky likes to say that the only way you can really understand anything is if you understand it in at least two different ways, and I think a similar principle applies to learning an API – for each library call (say), you need to know at least two (and preferably more) quite different ways of applying that library call to a real problem. Ideally, this will involve integrating the API call in non-trivial ways with other tools, so that you begin to develop an understanding of this type of integration; this has the additional benefit that it will simultaneously deepen your understanding of the other tools as well.

Achieving this second level of mastery takes a lot of time and discipline. While I feel as though I’ve made quite some progress with the first level of mastery, I must admit that this second level tries my patience. It certainly works best when I work hard at finding multiple imaginative ways of applying the API in my existing projects.

There’s a still higher level of mastery of an API, which is knowing the limits of the abstract specification, and understanding how to work around and within those limits. Consider the example of manipulating files on a computer file system. In principle, operations like finding and deleting files should be more or less instantaneous, and for many practical purposes they are. However, if you’ve ever tried storing a very large number of files inside a single directory (just how large depends on your file system), you’ll start to realize that there actually is a cost to file manipulation, and it can start to get downright slow with large numbers of files.
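
If you want to see this limit for yourself, a crude experiment along the following lines will do it; the file counts here are arbitrary, and exactly where (or whether) the slowdown bites depends entirely on your file system:

require 'benchmark'
require 'tmpdir'

# Grow a directory and time a simple listing at each size.
Dir.mktmpdir do |dir|
  [1_000, 10_000, 100_000].each do |n|
    n.times { |i| File.write(File.join(dir, "file#{i}.txt"), "") }
    seconds = Benchmark.realtime { Dir.glob(File.join(dir, "*.txt")).size }
    puts "#{n} files: listing took #{seconds} seconds"
  end
end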

In general, for any API the formal specification is never the entire story. Implicit alongside the formal specification is a meta-story about the limits to that specification. How does the API bend and break? What should you do when it bends or breaks? Knowing these things often means knowing a little about the innards of the underlying codebase. It’s a question of knowing the right things so you can get a lot of benefit, without needing to know a huge amount. Poorly designed APIs require a lot of this kind of meta-knowledge, which greatly reduces their utility, in accord with our earlier discussion of API figures of merit.

We’ve been talking about APIs as things to learn. Of course, they are also things you can design. I’m not going to talk about good API design practice here – I don’t yet have enough experience – but I do think it’s worth commenting on why one ought to spend some fraction of one’s time designing and implementing APIs, preferably across a wide variety of domains. My experience, at least, is that API design is a great way of improving my skills as a programmer.

Of course, API design has an immediate practical benefit – I get pieces of code that I can reuse at later times without having to worry about the internals of the code. But this is only a small part of the reason to design APIs. The greater benefit is to improve my understanding of the problems I am solving, of how APIs function, and what makes a good versus a bad API. This improved understanding makes it easier to learn other APIs, improves how I use them, and, perhaps most important of all, improves my judgement about which APIs to spend time learning, and which to avoid.

Fred Brooks famously claimed that there is “no silver bullet” for programming, no magical idea or technique that will make it much easier. But Brooks was wrong: there is a silver bullet for programming, and it’s this building of multiple layers of abstraction using ever more powerful tools. What’s really behind Brooks’ observation is a simple fact of human psychology: as more powerful tools become available, we start to take our new capabilities for granted and so, inevitably, set our programming sights higher, desiring ever more powerful programs. The result is that we have to work as hard as ever, but can build more powerful tools.

What’s the natural endpoint of this process? At the individual level, if, for example, you master the API for 20 programming tools, each containing approximately 50,000 lines of code, then you can wield the power of one million lines of code. That’s a lot of code, and may give you the ability to create higher level tools that simply couldn’t have been created at the lower level. Those higher level tools can be used to create still higher level tools, and so on. Stuff that formerly would have been impossible first becomes possible, then becomes trivial, and finally becomes invisible, absorbed into higher-level primitives. If we move up half a dozen levels, buying a factor of 2-5 in power at each layer, the gains compound (even a factor of 3 per layer over six layers is roughly 700), and we can accomplish perhaps a factor of 1,000 or more. Collectively, the gain for programmers over the long run is even greater. As time goes on we will see more and more layers of abstraction, built one on top of the other, ever expanding the range of what is possible with our computing systems.

Published

Killer Bean Forever

The lead animator of The Matrix has spent the last 4 years working 14 hours a day to create a full-length animated feature called Killer Bean Forever (site and trailer).

Aside from general awesomeness (and that’s quite an aside), it’s amazing that it’s even possible for a single person to create something on the scale of Killer Bean Forever. With amazing open source 3D suites like Blender quickly catching up with top commercial products like Maya, and with people open sourcing more and more CG effects, it won’t be long before movies like Killer Bean Forever can be produced by less dedicated individuals.

Published

A simple Wiki with Ruby on Rails

I prepared the following simple demo for RailsNite Waterloo. It’s a very simple Wiki application, illustrating some basic ideas of Ruby on Rails development.

To get the demo running, we need a Ruby on Rails installation. I won’t explain here how to get such an installation going. See the Rails site to get things up and running. I’ll assume that you’re using an installation which includes Ruby on Rails version 1.2.* with MySQL, running on Windows, from the command line. Most of this should work with other installations as well, but I haven’t tested it.

We start from the command line, and move to the “rails_apps” directory, which typically sits somewhere within the Ruby on Rails installation. From the command line we run:

rails wiki
cd wiki

This creates a new directory called wiki, and installs some basic files into that directory. What are those files? To answer that question, you need to understand that Ruby on Rails really has two parts.

The first part is the Ruby programming language, a beautiful object-oriented language. Ruby is full-featured, and can be used to do all the things other programming languages can do. Like most programming languages, Ruby has certain strengths and weaknesses; Ruby sits somewhere in the continuum of programming languages near Python and Smalltalk.

The second part of the framework is Ruby on Rails proper, or “Rails” as we’ll refer to it from now on. Rails is essentially a suite of programs, written in Ruby, that make developing web applications in Ruby a lot easier. What happened when you ran rails wiki above is that Rails generated a basic Ruby web application for you. What all those files are that were generated is the skeleton of a Ruby web application.

So what Rails does is add an additional layer of functionality on top of Ruby. This sounds like it might be ugly, but in fact Ruby is designed to be easily extensible, and in practice Rails feels like a very natural extension of ordinary Ruby programming.

To get a Rails application going, we need to do one more piece of configuration. This is generating a database that will be used to store the data for our application. We do this using mysqladmin, which comes with MySQL:

mysqladmin -u root create wiki_development

If you’re not all that familiar with MySQL you may be wondering whether you’ll need to learn it as well as Ruby and Rails. The answer is that for basic Rails applications you only need to know the very basics of MySQL. For more advanced applications you’ll need to know more, but the learning curve is relatively gentle, and you can concentrate on first understanding Ruby and Rails. In this tutorial I’ll assume that you have a basic understanding of concepts such as tables and rows, but won’t use any complex features of relational databases.

With all our configuration set up, let’s start a local webserver. From the command line type:

ruby script/server

Now load up http://localhost:3000/ in your browser. You should see a basic welcome page. We’ll be changing this shortly.

Let’s get back to the database for a minute. You may wonder why we need a database at all, if Ruby is an object-oriented language. Why not just use Ruby’s internal object store?

This is a good question. One reason for using MySQL is that for typical web applications we may have thousands of users accessing a site simultaneously. Ruby wasn’t designed with this sort of concurrency in mind, and problems can occur if, for example, two users try to modify the same data near-simultaneously. However, databases like MySQL are designed to deal with this sort of problem in a transparent fashion. A second reason for using MySQL is that it can often perform operations on data sets much faster than Ruby could. Thus, MySQL offers a considerable performance advantage.

Using MySQL in this way does create a problem, however. Ruby is an object-oriented programming language, and it’s designed to work with objects. If all our data is being stored in a database, how can we use Ruby’s object-orientation? Rails offers a beautiful solution to this problem, known as Object Relational Mapping (ORM). One of the core pieces of Rails is a class known as ActiveRecord which provides a way of mapping between Ruby objects and rows in the database. The beauty of ActiveRecord is that from the programmer’s point of view it pretty much looks like the rows in the database are Ruby objects!

This is all a bit abstract. Let’s work through an example of ActiveRecord in action. The basic object type in our wiki is going to be a page. Let’s ask Rails to generate a model named Page:

ruby script/generate model Page

You should see the following:

      exists  app/models/
      exists  test/unit/
      exists  test/fixtures/
      create  app/models/page.rb
      create  test/unit/page_test.rb
      create  test/fixtures/pages.yml
      create  db/migrate
      create  db/migrate/001_create_pages.rb

For our purposes, the important files are app/models/page.rb, which contains the class definition for the Page model, and db/migrate/001_create_pages.rb, which is the file that will set up the corresponding table in the database.

(You’ll notice, by the way, that 001_create_pages.rb is pluralized and in lower case, when our original model is not. This is one of the more irritating design decisions in Rails – it automatically pluralizes model names to get the corresponding database table name, and the cases can vary a lot. It’s something to watch out for.)

The next step is to decide what data should be associated with the Page model. We’ll assume that every page has a title, and a body, both of which are strings. To generate this, edit the file db/migrate/001_create_pages.rb so that it looks like this:

class CreatePages < ActiveRecord::Migration
  def self.up
    create_table :pages do |t|
      t.column "title", :string
      t.column "body", :string
    end
  end

  def self.down
    drop_table :pages
  end
end

This is known as a migration. It’s a simple Ruby file that controls changes made to the database. The migration can also be reversed – that’s what the “def self.down” method definition does. By using a series of migrations, it is possible to both make and undo modifications to the database structure used by your Rails application.

Notice, incidentally, that Rails created most of the migration code for you when you asked it to generate the model. All you have to do is to fill in the details of the fields in the database table / object model.

The actual creation of the database is now done by invoking the rake command, which is the Ruby make utility:

rake db:migrate

Incidentally, when run in development mode (the default, which we’re using) the Rails webserver is really clever about reloading files as changes are made to them. This means that you can see the effect of changes as you make them. However, this doesn’t apply to changes to the structure of the database, and it’s usually a good idea to restart the webserver after using rake to run a migration. If you’re following along, do so now by hitting control C to interrupt the webserver, and then running ruby script/server again.
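
This is also a good point to see the object-relational mapping described earlier in action. Start an interactive console with ruby script/console and, assuming a freshly migrated (empty) database, you can create and fetch pages as though the database rows were ordinary Ruby objects; the title and body here are just examples:

>> page = Page.create(:title => "First page", :body => "Hello, wiki!")
>> Page.count
=> 1
>> Page.find(:first).title
=> "First page"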

Now that we have a Page class set up, the next step is to add a way of interacting with the model over the web. Our wiki is going to have three basic actions that it can perform: (1) creating a page; (2) displaying a page; (3) editing a page.

To make this happen, we ask Rails to generate what is known as a controller for the Page model:

ruby script/generate controller Page

Once again, this generates a whole bunch of Ruby code. The most important for us is the file app/controllers/page_controller.rb. When generated it looks like:

class PageController < ApplicationController
end

What we want is to add some Ruby methods that correspond to the three actions (displaying, creating, and editing a page) that we want to be able to do on a page. Edit the file to add the three method definitions:

class PageController < ApplicationController

def create_page
end

def display_page
end

def edit_page
end

end

(Incidentally, the names here are a bit cumbersome. I started with the simpler method names create, display and edit, and then wasted an hour or so, confused by various weird behaviour caused by the fact that the word display is used internally by Ruby on Rails. A definite gotcha!)

These methods don’t do anything yet. In your browser, load the URL http://localhost:3000/page/create_page. You’ll get an error message that says “Unknown action: No action responded to create_page”. In fact, what has happened is that Rails parses the URL, and determines from the first part (“page”) that it should load page_controller.rb, and from the second part that it should call the create_page action.

What is missing is one final file. Create the file app/views/page/create_page.rhtml, and add the following:

Hello world

Now reload the URL, and you should see “Hello world” in your browser. Let’s improve this so that it displays a form allowing us to create an instance of the Page model. Let’s re-edit the file so that it looks like this instead:

<% form_for :page, :url => {:action => :save_page} do |form| %>
  <p>Title: <%= form.text_field :title, :size => 30 %></p>
  <p>Body: <%= form.text_area :body, :rows => 15 %></p>
  <p><%= submit_tag "Create page" %></p>
<% end %>

There’s a lot going on in this code snippet. It’s not a raw html file, but rather a template which blends html and Ruby. In particular, if you want to execute Ruby code, you can do so using:

<% INSERT RUBY CODE HERE %>

All Ruby code is treated as an expression, and returns a value. If you want the value of that expression to be displayed by the template, you use a slight variant of the above, with an extra equals sign near the start:

<%= INSERT RUBY EXPRESSION TO BE EVALUATED HERE %>
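
A trivial pair of lines (invented just for illustration) shows the difference:

<% greeting = "Hello" %>           <%# runs the Ruby code, displays nothing %>
<%= greeting + ", world!" %>       <%# evaluates and displays Hello, world! %>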

The first line of the form template tells us that this is a form for objects of class Page, and that when the form is submitted, it should call the save_page action in the page controller, which we’ll add shortly. The result of the form is pretty straightforward – it does more or less what you’d expect it to do. Let’s add a save action (i.e., a method) to the page controller:

def save_page
  new_page = Page.create(params[:page])
  redirect_to :action => "display_page", :page_id => new_page.id
end

What happens is that when the user clicks the submit button, the details of the form fields are loaded into a Ruby hash called params[:page]. We then create a new Page model object using Page.create(params[:page]), which we call new_page. Finally, we redirect to the action display_page, passing it as a parameter a unique id associated to the new page we’ve just created.

Let’s now create a view for the display_page action. Start by editing the display_page action so that it looks like:

def display_page
  @page = Page.find_by_id(params[:page_id])
end

What is happening is that the :page_id we passed before arrives in the params hash as params[:page_id], and we are now asking Rails to find the corresponding model object and assign it to the instance variable @page. We now create a view template that will display the corresponding data, in app/views/page/display_page.rhtml:

<h1><%= @page.title %></h1>
<%= @page.body %>

Okay, time to test things out. Let’s try loading up the URL http://localhost:3000/page/create_page. Type in a title and some body text, and hit the “Create page” button. You should see a webpage with your title and body text.

Let’s modify the page slightly, adding a link so we can create more pages. Append the following to the above code for the display_page template:

<%= link_to "Create page", :action => "create_page" %>

This calls a Rails helper method that generates the required html. Of course, in this instance it would have been almost equally easy to insert the html ourselves. However, the syntax of the above helper method generalizes to much more complex tasks as well, and so it’s worth getting used to using the Rails helpers.

In our skeleton for the page controller we had an edit_page action. This could be done along very similar lines to the create_page action we’ve already described. In fact, there’s an interesting alternative, which is to use Rails’ built-in Ajax (Javascript) libraries to edit the fields in place. We’ll try this instead.

To do it, we need to make sure that the appropriate Javascript libraries are loaded whenever we load a page. There are many ways of achieving this, but one way is to generate a general html layout that will be applied application wide. Create a file named app/views/layouts/application.rhtml with the contents:

<html>
<head>
<%= javascript_include_tag :defaults %>
</head>
<body>
<%= yield %>
</body>
</html>

The javascript_include_tag helper ensures that the appropriate javascript libraries are loaded. Whenever any view from the application is displayed, the output from the view template will be inserted where the yield statement is.

The final steps required to get this to work are to first delete the edit_page method from the page controller. Then modify the controller by inserting two lines so that the top reads:

class PageController < ApplicationController

in_place_edit_for :page, :title
in_place_edit_for :page, :body

[...]

Modify app/views/page/display_page.rhtml so that it reads:

<h1><%= in_place_editor_field :page, :title %></h1>
<%= in_place_editor_field :page, :body %>

Once again, we’re using Rails helpers to make something very simple. Let’s modify it a bit further, adding a div structure, adding a link to make page creation easy, and adding a list of all pages in the database, with links to those pages.

<div id="main">
  <h1><%= in_place_editor_field :page, :title %></h1>
  <%= in_place_editor_field :page, :body, {}, {:rows => 10} %>
</div>

<div id="sidebar">
  <p><%= link_to "Create a page", :action => "create_page" %></p>
  <center><h3>Existing pages</h3></center>
  <% for page in Page.find(:all) %>
    <%= link_to page.title, :action => :display_page, :page_id => page.id %>
    <br>
  <% end %>
</div>

We’ll use a similar div structure for the create_page action:

<div id="main">
  <% form_for :page, :url => {:action => :save_page} do |form| %>
    <p>Title: <%= form.text_field :title, :size => 30 %></p>
    <p>Body: <%= form.text_area :body, :rows => 15 %></p>
    <p><%= submit_tag "Create page" %></p>
  <% end %>
</div>

Let’s modify the layout in app/views/layouts/application.rhtml in order to load a stylesheet, and add a shared header:

<html>
<head>
<%= stylesheet_link_tag 'application' %>
<%= javascript_include_tag :defaults %>
</head>
<body>

<div id="header">
  <center><h1>RailsNite Wiki</h1></center>
</div>

<%= yield %>
</body>
</html>

Finally, let’s drop a stylesheet in. Here’s a very simple one, that goes in public/stylesheets/application.css:

body 	{
	font-family: trebuchet ms, sans-serif;
	font-size: 16px;
	}

#header {
	position: absolute;
	top: 0em;
	left: 0em;
	right: 0em;
	height: 5em;
        background: #ddf;
	}

#main {
	position: absolute;
	top: 5em;
	left: 0em;
	right: 20em;
	padding: 1em;
	}

#sidebar {
	position: absolute;
	top: 5em;
	right: 0em;
	width: 20em;
	background: #efe;
	}

There you have it! A very simple Wiki in 42 lines of Rails code, with a few dozen extra lines of templates and stylesheets. Of course, it’s not much of a wiki. It really needs exception handling, version histories for pages, user authentication, and a general clean up. But it is nice to see so much added so quickly, and all those other features can be added with just a little extra effort.

Published

Refactoring checklist

Programming has much in common with other kinds of writing. In particular, one of the goals of programming, as with writing, is to write in a way that is both beautiful and interesting. Of course, as with other kinds of writing, one’s first drafts are typically neither beautiful nor interesting. Rather, one starts by hacking something out that achieves one’s ends, without worrying too much about the quality of the code.

For a certain kind of program, this approach is sufficient. One may write a quick throwaway program to accomplish a one-off task. However, for more ambitious programs, this kind of throwaway approach is no longer feasible; the more ambitious the goals, the higher the quality of the code you must produce. And the only way to get high quality code is to start with the first draft of your code, and then gradually rewrite – refactor – that code, improving its quality.

I’m a relative novice as a programmer, and I’m still learning the art of refactoring. I try to set aside a certain amount of time for improving the quality of my code. Unfortunately, in my refactoring attempts, I’m occasionally stymied, simply going blank. I may look at a section of code and think “gosh, that’s terrible”, but that doesn’t mean that I instantly have concrete, actionable ways of improving the quality of the code.

To address this problem, I produced the following refactoring checklist as a way of stimulating my thinking when I’m refactoring. The items on the checklist are simple and mechanical to check, and so provide a nice starting point to improve things, and to get more deeply into the code.

  • Start by picking a single method to refactor. Attempting to refactor on a grander scale is too intimidating.
  • If a method is more than 40 lines long, look for ways of splitting out pieces of functionality into other methods (see the sketch just after this list).
  • Are variables, methods, classes and files well named? This question can often be broken down into two subquestions. First, is there any way I can improve the readability of my code by changing the names? Second, is my naming consistent? So, for example, a file named “bunny” should define a class named “bunny”, not “rabbit”.
  • When I see a problem, do something about it, even if I can’t see a fix right away. In particular, at the very least add a comment which attempts to describe the problem. This makes future refactoring easier, and sends a clear message to my subconscious that I mean to eliminate all the bad code from my program. As a bonus, more often than not it stimulates some ideas for how to improve the code.
  • Partial solutions are okay. It’s fine to replace an ugly hack by a slightly less ugly hack.
  • How readable is the code being refactored? In particular, is it obvious what this section of code does? If not, how could I make it obvious?
  • Explain the code to someone else. I’ve never explained a piece of code to someone without having a bunch of ideas for how to improve it.
  • Is there a way of abstracting away what is being done, in a way that would enable other uses?
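
To make the “split out pieces of functionality” item above concrete, here’s the kind of small, mechanical extraction I have in mind, in Ruby; the method and the names in it are invented purely for the example:

# Before: one method doing two jobs, computing and reporting.
def report(orders)
  total = orders.inject(0) { |sum, order| sum + order.price }
  puts "#{orders.size} orders, total $#{total}"
end

# After: the computation is split out, named, and reusable elsewhere.
def total_price(orders)
  orders.inject(0) { |sum, order| sum + order.price }
end

def report(orders)
  puts "#{orders.size} orders, total $#{total_price(orders)}"
end
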
Published