CPU’s stopped getting faster about five years ago. Since then, Intel and AMD started introducing multi-core processors, and the trend for the foreseeable future seems to be more and more processors per computer. That will be great, so long as the software industry is able to take advantage of the increased number of processors to deliver bigger and faster applications. However, that means parallel programming is in our future, and parallel programming is hard.
What’s more, the current dominant paradigm to take advantage of multiprocessor systems is threads, and threads appear to have very serious problems. Berkeley professor Edward Lee argues in no uncertain terms that the inherent non-determinism of threads programming dooms any software testing approach to failure, and will lead to buggy unreliable programs:
“A folk definition of insanity is to do the same thing over and over again and to expect the results to be different. By this definition, we in fact require that programmers of multithreaded systems be insane. Were they sane, they could not understand their programs.”
“These same computer vendors are advocating more multi-threaded programming, so that there is concurrency that can exploit the parallelism they would like to sell us. Intel, for example, has embarked on an active campaign to get leading computer science academic programs to put more emphasis on multi-threaded programming. If they are successful, and the next generation of programmers makes more intensive use of multithreading, then the next generation of computers will become nearly unusable.”
Having read Lee’s paper, I was very interested when I learned of Joe Armstrong’s new book “Programming Erlang: Software for a Concurrent World.” Armstrong actually identifies a problem with threads that is related to but slightly different than non-determinism: the fact that different threads can access the same memory. Why is shared memory a big problem? Briefly, because a thread or process that needs to access shared memory must lock it, and if it crashes while the memory is locked, you’re in trouble. For more, see this post.
So what is Erlang, and what is it like? Well, it’s open-source, and has been developed and used in telecom companies for a long time, so there already exist extensive libraries. It is a general-purpose language designed for concurrent, distributed, and fault-tolerant applications. It adheres strongly to the functional programming paradigm. It is a dynamic language, comes with a shell, and uses pattern-matching extensively. In Erlang, it is easy to spawn very large numbers of very-lightweight processes. Processes communicate using messages, and do not share any memory. All in all, it’s a very funky language.
So far, I have only read a small portion of Armstrong’s book (through chapter 8, where concurrent programs are introduced), but it is already clear to me that this is a significant piece of work, well worth the time spent with it. Starting from scratch, Armstrong works his way up to explaining complex distributed and fault-tolerant applications, such as a streaming media server, that require surprisingly little code. I plan to say more in future blog posts, as I progress through the book. In the meantime, here’s a list of beginner Erlang links.
(By the way, in his paper quoted above, Edward Lee discusses Erlang briefly, and says that he believes that its unfamiliar syntax will continue to block its wide-spread adoption. I’m not so sure–perhaps Erlang has simply not broken through until now because there just wasn’t much need for a language suited for multi-processor programming.)