Friday, January 30, 2009

IPv6 update!

Just for posterity's sake, this is what I'm using now for 6to4 on my DD-WRT v24sp1 router. (They said it couldn't be done, proving once again that "they" aren't usually that bright.)

I'm using the following startup script:

sleep 5
insmod ipv6
# Detect the external IPv4 address on the WAN interface
WANIP=$(ip -4 addr show dev eth1 | awk '/inet/ {print $2}' | cut -d/ -f1)
if [ -n "$WANIP" ]; then
    # Pack the IPv4 address into a 2002::/16 6to4 prefix
    V6PREFIX=$(printf '2002:%02x%02x:%02x%02x' $(echo $WANIP | tr . ' '))
    ip tunnel add tun6to4 mode sit ttl 255 remote any local $WANIP
    ip link set tun6to4 mtu 1280
    ip link set tun6to4 up
    ip addr add $V6PREFIX:0::1/16 dev tun6to4
    ip addr add $V6PREFIX:1::1/64 dev br0
    ip -6 route add 2000::/3 via :: dev tun6to4
fi
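The printf line above is doing the interesting work: it packs the four octets of the IPv4 address into the second and third groups of a 2002::/16 6to4 prefix. Here's the same computation as a Python sketch, just to make the bit-packing explicit (the address used is a documentation example, not my real one):

```python
# Derive a 6to4 prefix from an IPv4 address, mirroring the
# printf '2002:%02x%02x:%02x%02x' line in the startup script.
def sixto4_prefix(ipv4):
    a, b, c, d = (int(octet) for octet in ipv4.split("."))
    return "2002:%02x%02x:%02x%02x" % (a, b, c, d)

print(sixto4_prefix("192.0.2.1"))  # each octet becomes two hex digits
```

So 192.0.2.1 maps to the prefix 2002:c000:0201, and every public IPv4 address gets its own /48 this way.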

Here is my Radvd configuration:

interface br0 {
    MinRtrAdvInterval 3;
    MaxRtrAdvInterval 10;
    AdvLinkMTU 1280;
    AdvSendAdvert on;
    prefix 0:0:0:1::/64 {
        AdvOnLink on;
        AdvAutonomous on;
        AdvValidLifetime 86400;
        AdvPreferredLifetime 86400;
        Base6to4Interface eth1;
        AdvRouterAddr on;
    };
};

This configuration autodetects the external IPv4 address and uses that to compute everything else, so it should work unmodified on your router. (If it doesn't work for you, the most likely problem will be the external interface - it changed from vlan1 to eth1 between v23 and v24, which really threw me off for a while.)

The only remaining problem with this configuration is that Radvd doesn't come up automatically when the router boots, which means that the router itself will have IPv6 connectivity, but none of the hosts on your network will. As a workaround, you can go into the Administration menu in the web interface and disable/enable Radvd, after which it should work. (Suggestions for a real solution are welcome!)

If you find yourself having to debug IPv6 issues, by the way, is an invaluable tool.

Tuesday, January 27, 2009


I totally meant to write a real post today, I really did. But around lunchtime, this strange, terrible urge came over me, the urge to post this video instead. I tried to fight it, but it was for naught; the video is simply too powerful.

Monday, January 26, 2009

To leave one's mark

So I was reading about RFCs today, because I was curious about how they actually get written and standardized. Somewhere along the line, I saw a reference to Jon Postel, and looked him up since he sounded familiar. It turns out that he was one of the original founders of the Internet, and did a lot of things, but the role I'm mainly talking about today is that of the designated RFC editor. He passed away a little over a decade ago, which leads into my main point here, but first some background.

The RFCs are a series of documents which, not to put too fine a point on it, make up the Internet. Every major standard and protocol on the Internet is defined by one. (For example, HTTP is officially defined by RFC 2616.) The name stands for "Request for Comments", but this is really a misnomer these days, since by the time something becomes an RFC it is well past the point of requesting comments. RFCs are never changed; instead, new RFCs are written to replace old ones, so in some sense they form an immutable history of the Internet, going back almost 40 years.

Which brings me back to Jon Postel. As the RFC editor, and the author of numerous RFCs, he had a tremendous influence on the Internet for much of its history. Even so, I was shocked when I found out that his obituary (which was written by Vint Cerf, no less!) was itself made into an RFC; he is remembered in RFC 2468.

There's just something about being immortalized forever in the fabric of the Internet that absolutely blows me away.

Friday, January 23, 2009


There is this annoyingly widespread perception among programmers that C is magically faster than anything else. In one sense this is true - if you have a program in some other language which is faster than the C equivalent, you can turn it into an identical C program, since C allows you to write assembly code inline. This isn't a very useful thing to actually do, however. In practice, C has to give up on a lot of performance, for a few reasons.

  1. Pernicious overallocation: When allocating memory, you either have to have complicated code to grow buffers on the fly, or allocate as much as you could ever possibly need up front. In C, the latter is almost always chosen, for simplicity's sake. This leads to programs taking up more memory than they really need to.
  2. Terrible string support: Or to be more precise, nonexistent. C strings have two major problems. First, null-terminated strings are slow - to get the length of a string, you have to read through the entire thing! They also make it unreasonably difficult to apply certain loop optimizations to strings. Second, having strings as plain arrays means that you can't make immutable strings, which in turn means that every time you pass a string to code you don't control, the safe thing to do is make a copy of it. (See 1.)
  3. Pointers: In a nutshell, the problem here is that it's difficult (maybe impossible?) for the compiler to get complete data about what pointers go with what memory, and that hides a lot of optimizations from the compiler. A lot of work has been put into pointer analysis, and they've accomplished some pretty amazing stuff, but at the end of the day, it's still not a solved problem.
Why is it such a problem when stuff gets hidden from the compiler? C is as fast as it is for two reasons. The first is that, being a lower-level language, it allows people to write more low-level code. The second is that, after a few decades of improvement, it has really, really good optimizing compilers. For a language that depends on the compiler as much as C does, it's essential that the language not get in the compiler's way.
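The length-scan cost from point 2 is easy to model. Here's a toy sketch in Python (c_strlen is a made-up name for illustration, not any real binding) contrasting the O(n) walk a null-terminated string forces with a stored-length lookup:

```python
# Toy model of null-terminated string length: scan every byte until
# the NUL terminator, the way C's strlen has to.
def c_strlen(buf):
    n = 0
    while buf[n] != 0:  # walk until the terminating NUL byte
        n += 1
    return n

# O(n) scan over the buffer:
assert c_strlen(b"hello\x00") == 5
# versus Python, which stores the length alongside the data, so
# len() is a constant-time field read:
assert len(b"hello") == 5
```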

There's more I want to write about this, but I also kind of want to sleep. >_>

Wednesday, January 21, 2009

Arithmetic failure

So unfortunately it turns out that two half-written posts don't make a whole one! Therefore, no update today. >_>

Sunday, January 18, 2009

How to get IPv6 working

As of today, I finally have working IPv6 connectivity. It's taken me a while, and there were a lot of false starts and dead ends, so I thought I'd share how I did it.

  1. Hear about this new IPv6 thing - it sounds neat!
  2. Look up how to get connected to it
  3. Give up immediately because of the wall of intimidating jargon
  4. Wait a few months
  5. Try again, learning the jargon this time
  6. Get a tunnel from Hurricane Electric
  7. Go through steps to set up tunnel, only to discover that it doesn't work
  8. Give up again
  9. Wait for a year or so
  10. Try again, this time with Teredo
  11. Get a tunnel which works well initially, but ends up being somewhat unstable and annoying to use
  12. Give up again
  13. Wait for a few more months
  14. Try again, this time getting a tunnel from SixXS
  15. Fill out tons of personal information because apparently it would be terrible news if the wrong people got their hands on IPv6
  16. Attempt to set up tunnel
  17. Run into SixXS's incredibly strict restrictions on common things you would do with a tunnel, such as setting it up in the first place
  18. Give up again
  19. Wait for a few more months
  20. Move into an apartment, with a roommate who uses DD-WRT. DD-WRT supports IPv6 in all forms, so this can't possibly go wrong, right?
  21. Discover that the latest version of DD-WRT has dodgy IPv6 support, because they happened to leave it out of some builds for no good reason. Whoops!
  22. Attempt to use Hurricane Electric again
  23. Set up tunnel - this time, believe it or not, it... fails completely, again.
  24. Contact HE tech support
  25. Get response from HE tech support - "it will never work give up now"
  26. Give up again
  27. Wait a month or two
  28. Try to set up 6to4 tunneling on the router
  29. Gaze in wonder as the tunnel actually works!
  30. Remove the old Hurricane Electric configuration stuff, breaking everything for some reason
  31. Wail; gnash teeth
  32. Google for 6to4 troubleshooting - turns out that this kind of information does not meet Google's stringent standards for what should be on the internet
  33. Go to #ipv6 on Freenode, on a hunch that they know what they're doing
  34. These guys are such badasses that they'll find and solve the last remaining problem in about 15 minutes
  35. Now you have IPv6!
  36. ???
  37. Profit
Now, that wasn't so hard, was it? When we run out of IP addresses in just over two years, I feel confident that everybody will be able to switch to IPv6 smoothly.

Thursday, January 15, 2009

If you're not one of us...

American society is splitting right down the middle, and I don't hear about how disturbing this is nearly often enough. Right now, we're slowly developing a split culture, with conservatives and liberals on opposite sides. There are historical parallels in older socialist movements, where members would live exclusively in the Party world, but I think that what's happening now is on a large enough scale that it should be regarded as something completely new.

Sarah Palin drove this home, in her own embarrassingly clumsy way, when she called on Real America - by which she meant the half of the country that thinks like her. You see examples of it occasionally, such as Conservapedia. People are trying to build an alternate, ideologically pure, parallel universe. I think conservatives are more guilty of this, for various reasons, but you see examples of it on the political left too.

The effects of the split are obvious. Wedge issues such as abortion become completely intractable, to the point that the two sides aren't even talking to each other any more, just yelling slogans. People acquire strong opinions on issues without even understanding them, based entirely on their party affiliation. Moderates and independents are marginalized, since everybody tries to classify them as siding more with one party over the other.

Who or what is driving this? I honestly don't have an answer to that one. What do you guys think?

Tuesday, January 13, 2009

Long-term energy prospects

When people bring up alternative energy, something deep within me compels me to bring up solar energy. I can't help it. I am That Guy. There is something about solar power that just speaks to me; it seems like a clean solution, both in terms of environmental impact and in terms of design.

But I should clarify. When I talk about solar energy, I'm kind of cheating, since I don't really mean photovoltaics. The fact is, a lot of the energy that's readily available on Earth is solar power. Wind? Driven largely by differences in air temperature, which the sun is responsible for. Hydroelectric? Driven by the water cycle, which is powered by the sun. Biofuels? Photosynthesis. (This last one is the most interesting to me - petroleum is a crutch, but if we can engineer plants to produce it (and we can), we can keep leaning on it basically forever, at which point it ceases to be a crutch.)

The point is, there are a lot of ways to utilize solar power. What's more, it's incredibly abundant, and we know the sun will be around for billions of years. If you're looking at sci-fi scale stuff, the sun is basically the only game in town - nothing else even comes close to it in terms of raw energy output.

Still, for our everyday mundane energy needs, there are a few alternatives to solar power. For example, there's a lot of energy stored in the Earth, in the forms of both radioactive ores (nuclear and geothermal power), and the Earth's magnetic field (it should be possible to get energy from it, but right now that's too exotic to have a name). Both of these are fixed quantities, but they're large fixed quantities, and radioactive metals are relatively easy to get at.

Another energy source is tidal energy, which is actually an interesting case. If you think about it, tides are generated by the interaction of the Earth and the Moon, so tidal energy is actually borrowing energy from the Earth/Moon system somehow. Unfortunately, I can't quite figure out where the energy is coming from. My gut tells me that it's coming from the Earth's rotation, but my understanding of how tides work is kind of tenuous, so I'm not sure. Anybody wanna back me up on this?

Monday, January 12, 2009

Intellectual Dishonesty

First, courtesy of Kiriska: Flying Car! So far this year is off to a pretty good start.

On one of the other blogs I read, they recently linked to the following video. Watch it if you want, and try to spot the mathematical error.

Edit: So apparently the video is protected now, d'oh. I'll just summarize it. The person basically claims that the Bible predicts the speed of light. Their evidence is that, if you take the bit that says "for God ten thousand years is like a day", and calculate what speed God would have to be traveling for that to be true (based on relativistic time dilation) you end up with something which is pretty close to the speed of light.
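For the curious, the claimed calculation works out like this (a sketch; the 365.25-day year is my assumption about how the video counted):

```python
import math

# "Ten thousand years is like a day" read as a time dilation factor:
DAYS_PER_YEAR = 365.25
gamma = 10_000 * DAYS_PER_YEAR  # 10,000 years elapse per 1 day of proper time

# Invert the Lorentz factor: gamma = 1/sqrt(1 - (v/c)^2)
v_over_c = math.sqrt(1 - 1 / gamma**2)
print(gamma, v_over_c)  # v/c is within about 4e-14 of 1
```

One thing the numbers make plain: any sufficiently large time ratio comes out within a whisker of c, since v/c approaches 1 very quickly as gamma grows.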

What I really want to know is this. To anybody who could actually work out the math in the video, the error they've made is trivial. So, who made this video, and why? Is it somebody who is well-intentioned, but incredibly dense? Are they actively trying to fool people? Is this just some kind of elaborate meta-troll?

I want to believe the third one, but sometimes I wonder.

Friday, January 9, 2009

No post today

I am basically feeling too lazy to do a real post. >_>

Wednesday, January 7, 2009

Generalizing pipes

The pipe is a fairly fundamental piece of UNIX. By using the command line to string programs together, you can create far more powerful programs out of basic building blocks. Pipes have a ton of other advantages, too: you can write different components in different programming languages, since every programming language worth talking about supports the standard input/output streams; you can get pipeline parallelization basically for free, since the OS handles multiple concurrent processes; testing is easy, since you can take each part individually and feed it input manually.

Pipes have an important restriction, however - you can only chain them one after another. There's no clean way to, say, split a stream to multiple other commands, or to merge streams. This has actually been an issue for me in the past, with certain video encoding scenarios, and the solutions are pretty dismal. If you want to split a stream, you can either write the output to a temporary file and run the next stage separately (untenable for video encoding, since the temporary file in question would probably be a few dozen gigabytes), or you can run all the preliminary stages multiple times (workable, but slower than it needs to be).

What would a generalized pipe utility look like? Instead of taking a string of commands to execute in series, it should support execution of a graph of commands, with the edges in the graph representing dataflow, and the nodes being commands. This is really too much information to express on the command line concisely, so it would have to load the graph from a file. Splitting streams is relatively easy, and can be accomplished with existing pipe semantics. Merging streams is somewhat more difficult, and won't be possible without specifying separate text and binary modes (since really, it's not even clear what it means to merge two or more binary streams).
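The stream-splitting half is simple enough to sketch today with ordinary pipe semantics. Here's a minimal Python version that feeds one input stream to two downstream commands without a temporary file (wc -l stands in for a real pipeline stage, and this assumes a Unix userland):

```python
import subprocess

# One upstream data source, "split" to two downstream commands by
# writing the same bytes into both of their stdin pipes.
data = b"hello\nworld\n"
procs = [
    subprocess.Popen(["wc", "-l"], stdin=subprocess.PIPE, stdout=subprocess.PIPE)
    for _ in range(2)
]
for p in procs:
    p.stdin.write(data)
    p.stdin.close()  # EOF lets each downstream command finish

counts = []
for p in procs:
    counts.append(int(p.stdout.read()))  # each command saw the full stream
    p.wait()
```

Merging is the genuinely hard direction, for exactly the reason above: interleaving two binary streams has no obvious meaning.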

To be honest, this post is basically just a justification for a utility I wish existed. (Of course I've written a prototype, do you even have to ask?)

Sunday, January 4, 2009

Threading is not that hard

So there I was the other day, chilling in #python on freenode, when the discussion turns to threading. There is a regrettable tendency in the Python community to treat threading as some kind of voodoo black magic that should never be used. When I chimed in that I didn't think threading was all that hard, I was immediately told by several people that I was probably doing it wrong. Well, as you can imagine, that pissed me off enough that I'm writing this post now, on how to do threading the easy way.

Disclaimer: Most of this post owes a lot to Prof. Mike Dahlin, and his undergrad OS class. I've made some changes and added some stuff, but he should still probably get most of the credit for this list.

Here is what you do:
  • Identify your shared state, and create monitors for each shared piece of data - In Java locks and condition variables are available in java.util.concurrent.locks; in Python they're in the threading module. Shared data, broadly speaking, is anything that's accessed from more than one thread.
  • Whenever you access the shared data, do the following steps:
      1. Acquire the lock
      2. Wait() on the condition variable for whatever conditions your code needs - spurious wakeups are generally allowed, so you should do this in a loop
      3. Optionally, assert your invariants
      4. Do your thing with the shared data
      5. Optionally, assert your invariants again
      6. Notify() the condition variable
      7. Release the lock
    This is sufficient to ensure that you won't get concurrent access to shared state, and you can avoid a whole lot of interesting and time-consuming bugs.

  • Acquire your locks in a consistent order - If two threads acquire the same locks in different orders, there's a chance of a deadlock.
  • Make sure that any given piece of code is only used from one thread - this isn't necessary for correctness, but it will save you a lot of headaches when you're debugging. Debug backtraces include line numbers, but they don't (as far as I know?) include which thread was running when the error happened. Don't make yourself guess! For larger projects, you should probably take this even further, and make sure that code that runs in separate threads goes in separate files.
  • Look at all your threads, figure out which other threads they could wait for, and put that information into a dependency graph. If that graph has any cycles, there's a chance your program could deadlock. A cycle doesn't guarantee a deadlock, of course, but an acyclic dependency graph will guarantee a deadlock-free program.
  • The while() { sleep() } pattern is just flat-out wrong, and painfully slow to boot. I've been guilty of it myself, so I know how tempting it is to use it when you don't know about condition variables, but now that you do, you have no excuse. There is no situation (that I can think of, anyway) where you would want to call sleep() in a loop. In fact, sleep() is usually wrong too, unless you're actually waiting for some fixed amount of real-world time to elapse.
  • People are occasionally tempted to skip locking in order to improve performance. It's true that locking imposes some overhead, but keep in mind: Moore's Law will magically make your program faster over time, but it will never make a buggy program less buggy.
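The monitor recipe above fits in a few lines of Python. Here's a minimal sketch - a bounded queue guarded by one lock and one condition variable (BoundedQueue and its names are made up for illustration):

```python
import threading

class BoundedQueue:
    """A monitor: one lock, one condition variable, shared data."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.items = []
        self.lock = threading.Lock()
        self.cond = threading.Condition(self.lock)

    def put(self, item):
        with self.cond:  # acquire the lock
            while len(self.items) >= self.capacity:
                self.cond.wait()  # wait in a loop (spurious wakeups!)
            assert len(self.items) < self.capacity  # invariant
            self.items.append(item)
            self.cond.notify_all()  # notify; lock released on exit

    def get(self):
        with self.cond:
            while not self.items:
                self.cond.wait()
            item = self.items.pop(0)
            self.cond.notify_all()
            return item
```

A producer blocks in put() when the queue is full and a consumer blocks in get() when it's empty, with no sleep() loops anywhere in sight.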
Python partisans will generally recommend using process-based instead of thread-based concurrency, which is understandable since CPython (the most commonly-used implementation of Python) doesn't support thread-based parallelism. (You can have multiple threads, but only one will execute at a time, so you don't gain any performance.) Process-based parallelism is nice, but it has a few obvious drawbacks: You have the added overhead of extra Python interpreters, along with the added overhead of marshaling and unmarshaling data to pass between processes. Furthermore, processes avoid a lot of the problems that threads have by not having implicitly shared state, but it's still possible to have concurrent access bugs if you're not careful.

Thursday, January 1, 2009

Different levels of storage (an analogy)

Computers have various layers of storage, since there's a well-understood tradeoff between speed and capacity. This post is an attempt to explain these levels based on "distance" - specifically, how far away they would be if information traveled at the speed of light. The speed of light doesn't have any special significance here, it's just a measure which puts things on a convenient scale. All measurements are made using my current CPU, an Athlon 64 at 2.4 GHz.

The first layer of storage is the register file. The register file is made up of a small number of individual registers, each of which holds one value. The access time is one cycle, or about 0.42 nanoseconds. In this amount of time, light travels about 12.5 centimeters, which is about the width of your hand.

The next layer is the L1 (level one) cache. L1 caches are generally pretty small (a few dozen kilobytes) and fast (a few cycles access time). They're also frequently split into separate caches for instructions and data, for reasons that I'm not entirely sure of. I was unable to find actual numbers for the L1 cache latency of my processor, but 3 cycles is a reasonable assumption, since it doesn't vary too much between processors. In the amount of time it takes to access the L1 cache, light travels about 37.5 cm.

The L2 (level two) cache is larger and slower - sizes are usually around a few hundred KB (512 KB in my case), and latency is higher as well. The only source I could find for this says that latency varies within this processor family, so I'm assuming 16 cycles latency, which is about at the middle of the range they give. At this speed, light travels about 2 meters in the time it takes to read from L2 cache. (Some processors also include an L3 cache, which is generally a few megabytes. I couldn't find any numbers on the access time, though, so I'm leaving it out.)

Everything before this point is built into the CPU itself. The next level is RAM, which is on separate chips. RAM capacities are generally in the range of a few hundred megabytes to a few gigabytes, and the access times are usually measured in hundreds of cycles. Assuming 500 cycles total latency for a random read, access time is about 0.2 microseconds, and in this amount of time light travels 62.5 meters, or about 200 feet. (NB: 500 cycles is a reasonable assumption, but I actually just made it up. >_> If anybody can find some more accurate numbers for random RAM access time, I'll update this post.)

Everything above this point is volatile memory - temporary space the computer uses for storage, which is lost when the computer is turned off. Flash-based SSDs are the first storage method in this list which is non-volatile, and can be used for persistent storage. Flash memory is relatively new, and despite being 10-100x more expensive than hard drives, is still a compelling alternative, for reasons you'll see in a second. The average SSD can store a few dozen to a few hundred gigabytes. A reasonable access time for a random read of flash memory is 10 microseconds. In this amount of time, light travels about 3 kilometers, or 1.86 miles.

Hard drives are radically different from everything discussed so far, since they have moving parts. Rather than having direct access to all the data at all times, you have a moving arm and a spinning disk, and the arm can only read the data which is directly underneath it. Hard drives are somewhat more difficult to calculate access time for, because the access time varies based on how far the disk has to spin, and how far the arm has to move, in order to read the data. I'm going to gloss over a lot of stuff about rotational speeds and seek times, and just tell you that 5ms is a reasonable access time for a hard drive. In this amount of time, light travels about 1500 kilometers, or 932 miles. (To put this in perspective, Texas is only 790 miles across.)

The Internet is often considered another level of storage, at least by some people. A quick test from my home computer, using ping, shows that latency to different websites is generally from 100-200 ms. Using this range, we can say that in terms of the speed of light, the Internet is about 30,000-60,000 km away, or about 2-5 times the diameter of the Earth.
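All of the arithmetic above is the same two-step conversion, so here it is as a short Python script (the cycle counts are the same assumptions stated in each paragraph):

```python
# Convert access time at 2.4 GHz into how far light travels in
# that time, reproducing the numbers in the post.
C = 299_792_458      # speed of light, m/s
CLOCK_HZ = 2.4e9     # Athlon 64 clock rate

def cycles_to_seconds(cycles):
    return cycles / CLOCK_HZ

def light_distance_m(seconds):
    return C * seconds

levels = [
    ("register file (1 cycle)", cycles_to_seconds(1)),
    ("L1 cache (3 cycles)", cycles_to_seconds(3)),
    ("L2 cache (16 cycles)", cycles_to_seconds(16)),
    ("RAM (~500 cycles)", cycles_to_seconds(500)),
    ("SSD (10 us)", 10e-6),
    ("hard drive (5 ms)", 5e-3),
]
for name, t in levels:
    print(f"{name}: {light_distance_m(t):.3g} m")
```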