Archive for the ‘Technology’ Category

Version Control Using Subversion

August 5, 2007

I’ve worked with some twenty-odd interns over the last few years at MERL, all of them graduate students at top electrical engineering and computer science departments from around the world. I’ve also interviewed many more job candidates for our research lab. One fact that has continually amazed me is what a tiny fraction of them have had any previous experience using version control (also called “source code management”) systems.

Version control lets you save each version of a software project as you build it up. Because you know that you can always back up to a previous working version, you are much more confident in writing new code. All professional software developers use version control on every project they work on; working without it is sometimes compared to trying to write a novel with a word-processor that doesn’t have a delete key.

If you’re getting started with version control, I recommend that you use Subversion, (also known as “svn”) which has several fundamental improvements over the older CVS system. A free book explaining the system comes with it, but I also recommend “Pragmatic Version Control Using Subversion”, by Mike Mason.

If you’re a budding programmer or a graduate student in an EECS department, think of it this way–not only will using version control help you write better software, but it will help you get a job if you ever want to go into industry. I doubt we’re the only ones who routinely ask candidates whether they’ve used version control to help assess their software development skills. [And if it crossed your mind, you can’t fake your way through this; if you say that you’ve used it, you’ll get follow-ups.]


SimPy and Discrete-Event Simulation

August 3, 2007

I’ve been using the SimPy discrete-event simulation package lately, and I really like it.

As the SimPy home page says, “SimPy (= Simulation in Python) is an object-oriented, process-based discrete-event simulation language based on standard Python.” What does that mean? Well, first of all, it is a Python module, and you import and then use it like any other Python module.

(If you haven’t used Python, get yourself over to and download it now! Or if you use Mac OS X or Linux, you already have it. Python is a powerful, high-level general-purpose language, and comes with excellent documentation.)

“Discrete-event simulations” are something a good nerd will often need to write. They are simulations where things happen at discrete times. There are three levels of sophistication in handling such simulations. Level zero is to just step forward in time by small increments, checking every time step for each possible event. Not too bright, because you waste a lot of computation on time steps when nothing happens. Level one is the “event-oriented” approach, where you keep events in some “agenda,” and you process events in the agenda one at a time, with each event in the agenda possibly creating new events in the future that are then added to the agenda.

Level one is as far as most people get, but it’s not where you should stop, because writing an event-oriented simulation is painful and error-prone. The right thing to do is to go to the next level of sophistication, the “process-oriented” approach.

In the process-oriented approach, you create special objects, called processes, which are like “living” objects. Processes have a special event method which functions as an event loop. To program the events in your simulation, you need to write one or more process event methods which describes how each process object reacts to the possible events in the simulation. It turns out that these process event methods are very natural and easy to write, because they properly correspond to how we think about what is happening in the simulation.

So how does it work? Well, SimPy sets up and handles the event agenda underlying the system, so you don’t have to do it yourself. When you call the SimPy “simulate()” method, it begins stepping through the events in the event agenda, calling the appropriate process event methods defined in your processes in the correct order. The Python feature that makes the whole thing work is the Python “yield” statement, which is like a return, except that the next time a function with a “yield” is called, it picks up after the yield rather than at the beginning of the function. All the process event methods you define when using SimPy will use yields to give back control to the SimPy run-time system.

Anyways, Professor Norm Matloff from UC Davis has written some excellent tutorials, or you can use the documentation that comes with SimPy.

Zillions of Games

August 2, 2007

Visit Zillions of Games

If you like abstract games, especially chess variants, you should enjoy Zillions of Games. It’s quite an amazing piece of software; a piece of AI technology that I actually would have not thought possible.

Games are defined using a rules file (written in a Lisp-like language), and then a generalized alpha-beta search engine is unleashed. Basically, it works best for Chess and its many variants, but it plays a really surprising number of games very well. It comes with over 350 games and puzzles, and you can download thousands more.

It’s not that hard to modify existing rules files to define new games; my son was interested in having an opponent for a game he invented and we were able to modify an existing rules file to play his game in less than an hour–and it played very well.

Naturally, it doesn’t play all games as well as humans, but it plays so many games so well (with adjustable levels to make a fair fight if it’s too good) that one can’t complain. Unfortunately, it’s only available for Windows computers.

Multicellular Logic Circuits

August 2, 2007

I have recently begun working on biologically-inspired computing, and in particular am pursuing a new approach that I call “multi-cellular logic circuits.” These circuits are constructed from many identically-specified cells and undergo a developmental phase much like multi-cellular organisms. I plan to describe these ideas in much more detail here at Nerd Wisdom. Until I create those pages, you can visit this page to find a PDF and a PowerPoint version of a talk I gave on the subject at a conference in Santa Fe in May 2007.

EDIT: I have begun creating those pages. You can start here.


Fake Steve Jobs

August 2, 2007

If you don’t know about this blog, you should really check it out. It’s a very funny take on the software and computer industry, and you can even occasionally learn some interesting technical things while reading it. A good place to start is with this post. Apparently there’s some mystery about who actually writes this blog. (Edit: sadly, not anymore…)

The Elements of Computing Systems: Building a Modern Computer From First Principles

August 1, 2007


I already wrote a review of this wonderful book, written by Noam Nisan and Shimon Schocken, at, so I’ll just repeat it here:

I highly recommend this book if you are interested in learning about computer science. The book is organized around the idea of building a computer from the fundamental logic gates up–starting with the hardware (combinational logic gates, arithmetic logic units, sequential logic gates, the CPU and memory) and then through the software hierarchy (starting with the machine language, and working up through the assembler, a virtual machine, a compiler for a high-level language, and an operating system). As a “by-product,” one learns, by very relevant examples, many fundamental concepts of computer science.

You can just read the book, but the best idea is to follow the authors’ advice and do the projects where you implement every necessary piece of the computer system yourself. The projects are all very well organized. All the software modules necessary to emulate any part of the computer, plus half the chapters from the book, are available for free download from the authors’ web-site. It all works beautifully. If you want to skip any of the projects, you can, because the software is organized in such a way that it will use built-in modules instead of the ones you built if necessary.

The authors seem to have extensively tested the whole approach through the courses they have taught using this material. I also noticed that Harvard’s Computer Science 101 course is being taught based on this book. I have been using the book for self-study with absolutely no problems–in fact I have never had such a great experience with a self-study course. All you need is a Windows, Linux, or Mac OS X computer and access to the internet, and you can give yourself a wonderful education in computer science.

In terms of prerequisites, you only really need to have some experience with programming (e.g. with C, or ideally with Java or Python). I think that the book should work well for students or hobbyists who don’t have any more experience than that, but it is also great for much more experienced students, as a kind of integrative summary of the field. I also think the book is perfect for graduate students or researchers from other fields who want to learn how digital hardware and software systems are actually engineered.

Finally, I just want to compliment the authors on the extraordinary care that they have taken with the whole project. The computer design that you build up is wonderfully elegant–at every stage the design is just as simple as it can be while being sufficient. Every piece of emulation software works as advertised. Even the extra powerpoint or .pdf tutorials are nicely done. This is really quality work, and using it is just a real pleasure. Finally, the source code for all the software provided by the authors is available, so if you wanted to extend the provided emulators, you could do that.

In summary, I give this book my unqualified highest recommendation.

David MacKay’s “Information Theory, Inference, and Learning Algorithms”

August 1, 2007


This is an easy book for me to recommend. David J.C. MacKay is a professor in the physics department of Cambridge University, and he is a polymath who has made important contributions in a wide variety of fields. This textbook is an excellent introduction to modern error-correcting codes, compression, statistical physics, and neural networks. It is tied together by a recurring appeal to the power of Bayesian methods.

David wrote this book over the course of many years, publishing his drafts on the web. You can still view the entire book on the web here. But the book is very inexpensive; unless you’re very poor, you’ll really want to buy a copy.

As Bob McEliece (a professor at Caltech and Shannon medalist) wrote, “you’ll want two copies of this astonishing book, one for the office and one for the fireside at home.” I know this is true because I actually have two copies; I bought my own copy as soon as the book was published, and then found that David had kindly sent me a copy.