Friday, November 26, 2010

The Science of Code

Found this blog post today via reddit. It has a really cool insight: the way people work with code is evolving into the same patterns that exist in the sciences today. At one end you have the "physicists" - people who work with code at the lowest levels (either machine code or algorithms, depending on your interpretation), and who can expect mathematical certainty. At the other end, you have code "biologists", who mostly work with whole organisms/programs - messy things, but ones that behave in mostly predictable ways.

There are a few neat consequences you can pull out of the analogy. First, while wizards slinging machine code and novices putting scripts together are both "programmers", we probably need new designations for them, in the same way that you can't always lump physicists, chemists, and biologists together as just "scientists". Second, even though scripting is perceived as easier than low-level programming today, that could be down to the relative immaturity of the field, not because it's inherently easier. See this comic, for example: physicists can look down on biologists, but biology is hard! Physics can be seen as the ultimate reductionism; the other sciences are simpler in terms of the physics they use, but harder precisely because they can't afford to reduce everything to that degree.

Higher-level programming languages, then, aren't just about simplification - they're also about specialization. (Maybe this is why domain-specific languages (DSLs) are a big deal today? By creating a new language, you're jumping ahead of existing languages in terms of specialization, which is akin to opening up a new field of study in our analogy.) By leaving some of the complexity of the lower levels behind, you're able to create new abstractions and concepts that are interesting in and of themselves.
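
To make the specialization point concrete, here's a toy sketch of an embedded DSL in Python (my example, not from the post I linked; the names every, days, and at are made up for illustration). A few lines of plumbing give you a little "language" for schedules, and once it exists you think in schedules, not in the host language:

    # A toy embedded DSL: the "specialized language" is just a thin
    # layer over Python, but the code reads in the domain's terms.

    class Schedule:
        def __init__(self, interval_days):
            self.interval_days = interval_days
            self.time = "00:00"

        def at(self, time):
            # Set the time of day, returning self so calls can chain.
            self.time = time
            return self

        def __repr__(self):
            return f"every {self.interval_days} day(s) at {self.time}"

    class Every:
        def days(self, n):
            return Schedule(n)

    every = Every()

    # Reads almost like a sentence, even though it's ordinary Python:
    backup = every.days(3).at("02:30")
    print(backup)  # every 3 day(s) at 02:30

The abstraction (schedules) is the interesting thing here; the lower-level machinery that makes it work has been left behind, exactly as in the analogy.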

I think the analogy actually outstrips modern programming practices by a bit. If you want to write "organic" code, for instance, you need a specialized language like Erlang, since as far as I know it's the only language designed to let parts of a program fail while the rest keeps running smoothly. Most current languages assume that any fault is reason to terminate the program, because the whole thing should be 100% correct. From a physicist's perspective, this is fine - if it's not 100% correct, you can't count on it doing anything right! The more I think about it, though, the more I come around to the "sloppy code" view.
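
Here's a rough sketch of the "keep running when a part fails" idea - in Python rather than Erlang, and with invented names (flaky_worker, supervise), so treat it as a crude imitation of Erlang's supervisor concept rather than how Erlang actually does it. A supervisor loop restarts a failing component instead of letting one fault kill the whole program:

    import random
    import time

    def flaky_worker(job):
        # Stand-in for a component that fails transiently, ~30% of the time.
        if random.random() < 0.3:
            raise RuntimeError(f"job {job} hit a transient fault")
        print(f"job {job} done")

    def supervise(worker, jobs, max_restarts=10):
        # Restart the worker on failure instead of crashing the whole
        # program - a restart budget keeps a truly broken worker from
        # looping forever.
        restarts = 0
        for job in jobs:
            while True:
                try:
                    worker(job)
                    break
                except RuntimeError as err:
                    restarts += 1
                    if restarts > max_restarts:
                        raise  # too many faults; give up for real
                    print(f"worker failed ({err}); restarting")
                    time.sleep(0.1)  # brief back-off before retrying

    supervise(flaky_worker, range(1, 6))

The program as a whole behaves like an organism: individual cells die, and it shrugs and carries on.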

The assumption that all code should be 100% correct is unreasonable in this day and age. It pains me to say it, because it goes against everything I've been taught (and quite a bit of what I've said in the past). All code is going to be a bit sloppy, simply because it's written by humans, and not by the faultless code-writing machines those humans fancy themselves to be. What the next generation of languages needs is more robust mechanisms for handling incorrect code; if we don't build those, we're not really designing languages for human beings.
