Archive for the ‘Technology’ Category

Testing in Python Using doctest

August 26, 2007

Along with using version control, another absolute key to developing reliable software is to systematically test your code as you write it. After all, source code needs to be bug-free to function properly, but all human beings generate bugs at a very high rate when writing code.

Fortunately, Python makes testing remarkably easy and convenient with its doctest module, which lets you put your tests right into your doc strings.

Here’s a tiny example of its use (for more, go to the documentation):


def cube(x):
    """
    >>> cube(10)
    1000
    """
    return x * x

def _test():
    import doctest
    doctest.testmod()

if __name__ == "__main__":
    _test()

I intentionally left a bug in this code. Here’s what happens when I run it:


[yedidia ~] python cube.py
**********************************************************************
File "cube.py", line 3, in __main__.cube
Failed example:
    cube(10)
Expected:
    1000
Got:
    100
**********************************************************************
1 items had failures:
   1 of   1 in __main__.cube
***Test Failed*** 1 failures.

An advantage of using doctest is that your doc strings will serve as examples, as well as tests, of the functions that they document. Examples are often the best kind of documentation for a function.

In fact, I find that if a function doc string explains the inputs to the function, the variable(s) returned by the function, and any side effects, along with the doctest examples, then there is rarely any need for other comments.

My favorite way to develop python code is actually within Emacs. I write a test for a function, then write the function itself, and then type Control-C Control-C in Emacs. Control-C Control-C will execute the python code. If your code is set up to run a _test() function like the code above, then Emacs will open up another buffer which will contain any doctest failures. When all the tests pass, I finish up the documentation of the inputs, outputs, and side effects. That way you can systematically build up your software, one reliable and documented function at a time, while never leaving Emacs.

Emin Martinian taught me this technique. Another approach is to use the IPython enhanced python shell.

Note: to have Control-C Control-C execute python code, you’ll need to add the following lines to your .emacs file (more details here):


(autoload 'python-mode "python-mode" "Python Mode." t)
(add-to-list 'auto-mode-alist '("\.py\'" . python-mode))
(add-to-list 'interpreter-mode-alist '("python" . python-mode))

Of course, you should always keep all your old tests, both because they serve as examples, and also because if any new code somehow breaks an old function, you’ll see it immediately.

Computer Chess Software

August 26, 2007

The best chess playing entity in the world today is almost certainly the computer program “Rybka,” developed primarily by International Master Vasik Rajlich.

Rybka currently tops all the computer chess rating lists. Since Rybka is pretty clearly stronger than the Deep Fritz program that defeated World Champion Vladimir Kramnik 4-2 last year, there’s not much doubt that it is better than any human chess player.

Rybka is available as a plug-in chess engine, which means that you also need to have a chess graphical user interface (GUI) to use it. There are a variety of GUI’s available, including the free Arena GUI.

Rybka plays wonderful chess of course. It is free of all the problems that used to plague computer programs–for example it freely sacrifices material for positional compensation. Nevertheless, it’s not the program that I prefer, for a couple reasons.

First, it is only available for Windows (and Linux using Wine), and I wanted a program for Mac OS X. The strongest program for Macs is clearly Hiarcs 11, developed by Mark Uniacke. Although Hiarcs 11 seems to be somewhat weaker than Rybka, it is still stronger than Deep Fritz, and any human being.

gamewin.jpg

Most importantly, on Macs, Hiarcs is paired with the very nice chess “Sigma Chess” GUI. This GUI/engine combination sports the absolutely crucial feature, missing from any version of Rybka, that you can adjust its strength down to any level from world champion to amateur levels as low as 1250 Elo. If you don’t know what 1250 Elo means, you’re probably an amateur who will lose to a 1250–it’s the level amateurs typically reach after playing in tournaments for about one year. (Note: to obtain this feature on the Mac, you’ll need to spend $20 to get the “Pro” version of Sigma Chess, plus $50 for Hiarcs 11. You should also be able to adjust the level of Hiarcs on Windows GUI’s like Arena, although I have no personal experience with this.)

Amazingly, at whatever level you choose, Hiarcs plays a very human-like game. Previous computer programs, when weakened, played poorly strategically, but still did not make tactical errors. A weakened Hiarcs will make tactical mistakes that you can take advantage of, and will also play at an appropriate level of strategic understanding, so there is no particular reason to adopt anti-computer approaches when playing against it.

Of course, Hiarcs comes with many other features, including the ability to use it to analyze your games. Most grandmasters use these programs to aid in their training, and Hiarcs is well-known as one of the best engines to use.

Personally, I also enjoy just watching my Hiarcs 11.2 engine play blitz chess against the 11.1 engine that I previously used. It’s like watching a couple world champions duke it out. I think it’s worth appreciating, that at least for the game of chess, we now have $50-$70 software running on commodity hardware, that essentially passes the Turing test.

If you are a beginner, you might be better off choosing the mass-market Chessmaster, which comes packed with a lot of nice tutorials, as well as some very weak opponents to start with. However, I don’t like Chessmaster as much as Hiarcs because the weakened opponents in Chessmaster have the strong-tactics weak-strategy characteristics of computer programs that become so tiresome after a while.

Programming Erlang

August 23, 2007

CPU’s stopped getting faster about five years ago. Since then, Intel and AMD started introducing multi-core processors, and the trend for the foreseeable future seems to be more and more processors per computer. That will be great, so long as the software industry is able to take advantage of the increased number of processors to deliver bigger and faster applications. However, that means parallel programming is in our future, and parallel programming is hard.

What’s more, the current dominant paradigm to take advantage of multiprocessor systems is threads, and threads appear to have very serious problems. Berkeley professor Edward Lee argues in no uncertain terms that the inherent non-determinism of threads programming dooms any software testing approach to failure, and will lead to buggy unreliable programs:

“A folk definition of insanity is to do the same thing over and over again and to expect the results to be different. By this definition, we in fact require that programmers of multithreaded systems be insane. Were they sane, they could not understand their programs.”

“These same computer vendors are advocating more multi-threaded programming, so that there is concurrency that can exploit the parallelism they would like to sell us. Intel, for example, has embarked on an active campaign to get leading computer science academic programs to put more emphasis on multi-threaded programming. If they are successful, and the next generation of programmers makes more intensive use of multithreading, then the next generation of computers will become nearly unusable.”

images-1.jpg

Having read Lee’s paper, I was very interested when I learned of Joe Armstrong’s new book “Programming Erlang: Software for a Concurrent World.” Armstrong actually identifies a problem with threads that is related to but slightly different than non-determinism: the fact that different threads can access the same memory. Why is shared memory a big problem? Briefly, because a thread or process that needs to access shared memory must lock it, and if it crashes while the memory is locked, you’re in trouble. For more, see this post.

So what is Erlang, and what is it like? Well, it’s open-source, and has been developed and used in telecom companies for a long time, so there already exist extensive libraries. It is a general-purpose language designed for concurrent, distributed, and fault-tolerant applications. It adheres strongly to the functional programming paradigm. It is a dynamic language, comes with a shell, and uses pattern-matching extensively. In Erlang, it is easy to spawn very large numbers of very-lightweight processes. Processes communicate using messages, and do not share any memory. All in all, it’s a very funky language.

So far, I have only read a small portion of Armstrong’s book (through chapter 8, where concurrent programs are introduced), but it is already clear to me that this is a significant piece of work, well worth the time spent with it. Starting from scratch, Armstrong works his way up to explaining complex distributed and fault-tolerant applications, such as a streaming media server, that require surprisingly little code. I plan to say more in future blog posts, as I progress through the book. In the meantime, here’s a list of beginner Erlang links.

(By the way, in his paper quoted above, Edward Lee discusses Erlang briefly, and says that he believes that its unfamiliar syntax will continue to block its wide-spread adoption. I’m not so sure–perhaps Erlang has simply not broken through until now because there just wasn’t much need for a language suited for multi-processor programming.)

“On Intelligence” and Numenta

August 21, 2007


book.png

“On Intelligence,” written by Jeff Hawkins with Sandra Blakeslee, is a great read, full of provocative ideas about how our brains work.

Hawkins founded Palm Computing and Handspring, but he says that his true lifelong passion has been trying to understand our brains. In 2002, he founded the Redwood Neuroscience Institute (now the Redwood Center for Theoretical Neuroscience at Berkeley).

At Redwood, he developed a theory of the brain, that he expounds in this book. The book is a popular science book; you will not find any equations or explicit algorithms. It reads very smoothly, undoubtably due to the fact that Hawkins was helped in writing the book by Sandra Blakeslee, a science writer for the New York Times.

Hawkins argues that the cortex consists of modules that are all performing the same algorithm. The purpose of that algorithm is to learn to complete and predict the spatio-temporal patterns coming into a module from “lower” modules in the brain’s hierarchy, using feedback from “higher” modules.

I find this hypothesis for how the brain works extremely attractive, and “On Intelligence” argues for the hypothesis almost too well, in that the difficulties of converting the hypothesis into a concrete and useful algorithm will slip by the reader, as if by sleight of hand. The problem, of course, is that it is easy to use words to talk about how a brain might work; the hard part is making a machine do the same thing.

Peter Dayan, a leading computational neuroscientist, has written a much more detailed review. And for a 2006 lecture by Hawkins about his theory, you can watch this video.

After reading the book, I actually spent a few weeks trying to make the hypothesis into a concrete algorithm, without much success. But Hawkins is certainly putting his money where his mouth is: he has founded a company, called Numenta, and Numenta has released software (the “Numenta Platform for Intelligent Computing” or “NuPIC”) that implements part of his theory. NuPIC appears to be based on software originally written by Numenta co-founder and Stanford graduate student Dileep George, but transformed by a team of software developers into a professional-quality product.

However, it is very disappointing that the NUPIC software does not include feedback in its hierarchies, nor does it let you learn and infer temporal patterns (only spatial ones). To be honest, I am not so surprised, because it was these elements (that are obviously central in Hawkins’ theory) that I found it so difficult to integrate into a real algorithm.

Numenta promises that future versions of NuPIC will fill these gaps. I sincerely wish them the best of luck! I should say that I think it is wonderful that an impatient guy like Hawkins pushes the field, using a different approach than the standard academic one.

So in summary, you should definitely read Hawkins’ book, and if you’re intrigued, you should by all means check out the free Numenta software. Whether this line of research will lead to anything significant though, I’m not really sure…

Structure and Interpretation of Computer Programs: Videos and Textbook

August 17, 2007


gjspicture.jpg

This set of videos, of Gerald Jay Sussman and Hal Abelson teaching their course on the “Structure and Interpretation of Computer Programs” in July 1986 for Hewlett-Packard employees, is one of the treasures of the internet. The clothes are out of style, but the material presented is still completely relevant.

The SICP web-site has lots of useful additional information, including the complete text of the 2nd edition of their classic textbook.

I am not going to try to write a better review of the Abelon’s and Sussman’s textbook than the Amazon review written by Peter Norvig. Norvig is the head of Research at Google, and a co-author of Artificial Intelligence: a Modern Approach, the leading AI textbook.

But I will add one comment: what I love most about these lectures is the point (see the picture above of Sussman in his wizard hat) that a computer programmer is like a wizard–he creates something real out of ideas. Computers let us be wizards; and I believe we have only scratched the surface of the possible “spells” that we can learn to cast.

If you want a nice implementation of Scheme, the beautiful dialect of Lisp that was invented by Sussman with his student Guy Steele, and used in this textbook, I highly recommend DrScheme.

Gateway High School, 1981

August 13, 2007


paulgraham_1960_3304843.gif

Paul Graham has a great web-site full of thought-provoking essays. Some of them were collected into his fascinating book “Hackers and Painters.”

One of those essays, “Gateway High School, 1981″ talks about our high school, which I in fact graduated from in 1981. The picture above is from that essay; it’s the Chess Club picture from the 1981 Gateway High School yearbook. I’m the one at the bottom left seated at the chess board; Paul is standing at the top left.

If you read Paul’s description, or read his famous essay about “Why Nerds are Unpopular,” my high school sounds very grim, but I really didn’t experience it the same way. I honestly had a pretty good time in high school; maybe that’s why I’m the one who’s smiling in the picture above. I even loved rooting for the football teams (both the Steelers and the Gators), and trace my football fanaticism to those years, when the Pittsburgh Steelers were a dynasty.

When I arrived at Harvard in 1981, I was at first a little intimidated by some of the other students who went to better high schools, but I soon realized that I was easily well-enough prepared. And I was hungry to learn.

Nowadays, I’m most concerned that high school students are so driven to perform in order to be accepted into the elite universities, and that many high school teachers drive the students so hard with homework, that by the time they get to college, the students can have had the natural desire to learn drummed out of them.

Mitsubishi Electric Research Labs is Hiring

August 8, 2007

Mitsubishi Electric Research Labs (MERL), the research lab where I work, is hiring. We have open research scientist positions in a number of different fields in computer science and electrical engineering.

Multicellular Logic Circuits, Part I: Evolution

August 7, 2007

If we want to construct artificial machines that rival the capabilities of biological organisms, we should try to understand the principles by which complex natural “machines” such as plants and animals are created.

It is generally agreed, at least by scientists, that all natural organisms have been “designed” by the completely blind and random process of evolution. Through evolution, a population of organisms tends to become progressively better adapted to its environment via the mutation of genomes of individuals in the population, and the selection and more rapid reproduction of the fittest organisms in that population.

51tk0pychxl_aa240_.jpg

Harvard professor Martin Nowak a has written a lovely and elegant book describing the mathematics of evolutionary dynamics, using the ideas of evolutionary game theory; here is a video of Nowak describing evolutionary game theory at Harvard in 2004.

My own interest is not so much in analyzing evolution, but in exploiting it. If we understand evolution so well, shouldn’t we be able to use it to design useful machines?

180px-img013biglittledogfx_wb.jpg

Of course, humans have already for many centuries exploited evolution, using artificial selection to breed domesticated animals or cultivate useful plants.

But I am looking for something else: the design of artificial machines through artificial selection. Although it has never been a mainstream idea, computer scientists have pursued such dreams since the 1950’s. When I was in graduate school in the 1980’s, I loved reading John Holland’s seminal 1975 book “Adaptation in Natural and Artificial Systems.”

adapt.jpg

Holland and his students were deeply influential in popularizing the whole field of genetic algorithms.

Another important figure in the field is John Koza, who has advocated for many years one of the most important variants of genetic algorithms, which he calls “genetic programming.” In genetic programming, computer programs, typically written in Lisp, are evolved through a process that involves mutating the programs by altering or swapping branches of the computation tree representing the program.

Genetic programming and genetic algorithms more generally, have had considerable success creating interesting and useful systems and programs. Nevertheless, I think it is fair to say that these ideas are still considered “fringe” ideas in the scientific and engineering community, and they have not widely replaced more conventional software and hardware design strategies.

So what might be missing? I will begin discussing that in Part II.


Marvel Trading Card Game / Magic the Gathering

August 7, 2007


61uwcobzlql_aa280_.jpg

Are you looking for a challenging strategy game in a portable format? If so, this is the game for you. This is an example of a remarkable piece of work that is being seriously under-appreciated.

A little background is in order. Collectible card games (now called “trading card games”) began in 1993 with “Magic the Gathering,” a game designed by Richard Garfield, who was just then earning his Ph.D. in mathematics at Penn.

The basic idea of Magic the Gathering is that you build your own deck of cards (which set of cards you can select from depends on the precise rules of the format you are playing) and then “duel” against an opponent who has built his own deck.

In a duel, you start with 20 “life” points, and the objective is to reduce your opponent to zero points before they do the same to you. You play some of your cards as “creatures” which attack your opponent or defend you against your opponent’s creatures. There are other cards that act as “sorceries,” “instants,” “artifacts,” or “enchantments,” which are “spells” that you cast and have all sorts of different effects, like killing creatures, reducing your opponent’s life, making your creatures stronger, and so on. (There are many other possibilities–there now exist roughly 10,000 different cards, so you can appreciate that they do many different things.) Each creature or other spell is paid for with “mana” that you usually obtain from your “land” cards.

When you play a “spell” it is put on a “stack,” and your opponent has a chance to respond to it before it “resolves.” Much as in a computer stack, the spells can pile up, and are resolved in last in, first out, order. Since many of the cards have effects which violate the normal rules, it can be very complicated to keep track of what is going on.

The key to winning in Magic is to design a deck where your cards act synergistically with each other, and then to be able to take advantage of all the subtle interactions and timing effects. It’s incredibly difficult and challenging to play well, no less challenging than Chess, and like Chess, it can be played either casually or in serious tournaments.

If you want to learn more about Magic, you should visit www.magicthegathering.com, which has a vast amount of information. You should probably start with the “Magic Academy” articles, and this article in particular.

250px-magic_players.jpg

I personally played seriously between 2004 and 2006, and my highlight was reaching 2nd place in a rather serious “Pro tour qualifier” tournament in 2005 (see the picture above for what one of these tournaments is like) with my Blue-White Ninja deck (although in that link, the Mike Flores gives me too much credit for the deck design; I copied an obscure but brilliant deck I found on the internet, a practice referred to as “net-decking”). For the last year, though, I haven’t played much, and my son Adam has seriously surpassed me in in both playing and deck-designing ability.

Magic the Gathering has been fantastically popular for a serious card game, and has spawned many imitations because of its ridiculously successful business model. Because the cards that are legal to build your decks with keep changing, the game keeps evolving (which is nice), but that means you have to keep spending money to buy more cards (which may not be so nice for you, but it is certainly nice for the publishers.) Actually, it’s a little more complicated than that–they’ve cleverly arranged it so that your old cards are still legal for some formats, and since they are no longer printed, those old cards tend to hold their value pretty well. And many Magic players actually quite enjoy collecting and trading their cards.

Anyways, most of the imitations, like Pokemon card game, Yu-gi-oh, or Duel Masters, are vastly simplified games targeting a younger set. (Most Magic players are males between the ages of 15 and 30.)

One imitative game, which at least roughly equals Magic in complexity and interest, is the “Marvel Trading Card Game” (also known as “Versus System.”) The theme is a little different; it has superheroes battling villains instead of monsters fighting and wizards casting spells and so on, but it works very similarly. Forgetting about the theme, and judging them strictly as abstract games, there’s little to make one prefer one game very much over the other–they are clearly closely related, and the competitive Marvel scene is dominated by Magic players.

Unfortunately, no versions of Magic have come out for portable game systems. There was a computer game back in 1998, but its AI was really pretty bad. There is a version of Magic the Gathering that can be played online, but you need to buy the cards for the same price as you pay for real cards. (You are able to play for free against other beginners with simple starter decks.) You can also play for free online using the excellent Magic Workstation program, but Magic Workstation doesn’t enforce the rules, so you need to be a relatively experienced player to use it.

Probably Wizards of the Coast (the publishers of Magic) haven’t wanted to cut into their card sales with a computer or portable game that effectively gives you a huge number of cards to play with for a relatively small price, or possibly they don’t believe that it’s possible to program an AI that plays Magic competently.

There have been numerous Yu-gi-oh and Duel Masters games for Nintendo DS’s and Gameboys, but I find those games to be distinctly inferior to Magic.

Anyways, we now have “Marvel Trading Card Game” for the DS, PSP, or PC. I have the DS version for my DS Lite. To play the game on a DS, you hold your DS sideways like a book. The game board is shown on the touchscreen, and you poke at your cards with the stylus to play them. On the other screen the currently selected card is shown in detail so that you can read all of its rules.

You can follow two “paths” through the game, one where you play with superhero cards, and one where you play with villain cards. When you win duels, you are rewarded with new cards, which you can use to make a more powerful deck, which you will definitely need to do, because your opponents’ decks will get more powerful. The game includes about 800 different cards in all. There are also tutorials and very challenging puzzles.

In general this is an extremely challenging game! The AI is very impressive (and it gets some advantage from having slightly better decks than you). You can win, but you will need think hard and play very carefully. After the first couple opponents, you start facing good decks, and I started losing duels more often than winning them. But since I can always identify my errors when I lose, it doesn’t feel unfair, just challenging. Each duel will take about 40 minutes (Versus duels tend to last longer than Magic duels, which typically only take 20 minutes). You can also play online with other players, but I haven’t tried that, because if I wanted to play online, I would just use Magic Workstation.

The game has been getting mediocre reviews (they mostly complain that the game is so hard; another problem is that the music is pretty bad–I agree that you should turn off the sound), so my viewpoint is obviously a minority one. But nevertheless, if you like Magic, or want to see what serious trading card games are about, or want something that will make the time fly during a long plane or train trip, I recommend you pick this up.


Follow

Get every new post delivered to your Inbox.

Join 25 other followers