Archive for the ‘AI’ Category

Multicellular Logic Circuits, Part III: A Model

September 26, 2007

In Part I and Part II of this series, I discussed genetic algorithms and why we might want to create artificial machines that begin life as a single cell, and develop into networks of identical communicating cells. In this post, I want to begin describing a model that works along these lines.

The model is a highly stylized and simplified cartoon of biological multicellular organisms. It my attempt to make the simplest model possible that captures the essence of what is happening in biology. So understand that biology is more complicated than this model; but the goal is a model stripped down to those essential elements that cannot be taken away if one wants something that looks like life. Thus, the model is proposed in the spirit of the Ising model of magnets in statistical physics; the simplest model that captures the general behavior we are looking for.

The first question is what do we want our machine (or “circuit” or “network” or “organism”; I will use these terms interchangeably) to do? As is quite conventional in hardware design, I will presume the organism receives some input signals from the world, and it is supposed to produce some desired output signal, which depends on the inputs it has received at the current and previous times. Thus, the circuit should in general be capable of creating memories, that lets it store something about previous inputs.

The organism begins its life as a single cell, and then has two phases in its life, a dynamic “embryonic” phase and a static “adult” phase. During the embryonic phase, the cells in the organism can undergo developmental events, primarily cell duplication, but also perhaps cell death or cell relocation, to sculpt out the final network of communicating cells. After the embryonic phase is complete (say after a fixed amount of time has passed, or some signal is generated by the circuit) the adult phase is entered. The network is static in structure during the adult phase. It is during the adult phase that the network can be tested to see whether it properly computes the desired input-output function. The figure above is a pictorial representation of the model that hopefully makes clear what I have in mind.

Each of the cells in the network will have an internal structure, defined primarily by “logic units” which send signals to each other. The computations performed by the organism will simply be the computations performed by the logic units inside of its cells. The details of what the logic units do, and how they are connected to each other, is specified by a “genome” or “program” for the organism.

Look at the figure below for a peek inside an individual cell in the model. Each cell will have an identical set of logic units, with identical connections between the logic units.

The logic units compute an output according to some fixed function of their inputs. They transmit that output after some delay, which is also part of their fixed function. The output of one logic unit will be the input of another; they send “signals” to each other.

These signals are of various types (see the above figure). The first type of signal, called a “factor signal,” will always go from a logic unit to another logic unit in the same cell. The second type of signal, called an “inter-cellular signal,” will always go from a logic unit to a logic unit in a different cell. The third type of signal, called a “developmental output signal,” will not actually go to another logic unit, but will be a signal to the cell development apparatus to perform some important development event, such as duplication or programmed cell death. Finally, the fourth type of signal, called a “developmental input signal,” will be used by the cell development apparatus to signal that some type of cell development event has occurred, and will serve as an input to logic units.

Remember that initial cell (the “fertilized egg”) will need to have a set of logic units that enable it to automatically create the adult network, so it must effectively contain the instructions for development as well as for the adult circuit. It might seem hard to imagine that this can work, but it can. In the next post in this series, I will discuss in more detail the process of development in this model, and then we will be in position to look at some interesting multicellular circuits that I have designed.

If you don’t want to wait, you can visit this page to find a PDF and a PowerPoint version of a talk I gave on the subject at a conference in Santa Fe in May 2007, although unfortunately, it might be hard to decipher without my explanation…

September 23, 2007

Sebastian Thrun is a professor of computer science and electrical engineering at Stanford, and director of the Stanford Artificial Intelligence Laboratory. He was the leader of Stanford’s team which won the \$2 million first prize in the 2005 DARPA Grand Challenge, which was a race of driver-less robotic cars across the desert, and also leads Stanford’s entry into the 2007 DARPA Urban Challenge.

One of the ingredients in the Stanford team’s win was their use of “probabilistic robotics,” which is an approach based on the recognition that all sensor readings and models of the world are inherently subject to uncertainty and noise. Thrun, together with Wolfram Burgard and Dieter Fox have written the definitive text on probabilistic robotics, which will be a standard for years to come. If you are seriously interested in robotics, you should read this book. (The introductory first chapter, which clearly explains the basic ideas of probabilistic robotics is available as a download here.)

The Laboratory of Intelligent Systems at the Swiss École Polytechnique Fédérale de Lausanne (EPFL) hosts the superb “Talking Robots” web-site, which consists of a series of podcast interviews with leading robotics researchers. I noticed that the latest interview is with Thrun, and liked it quite a bit; it is well worth downloading to your iPod or computer.

You can watch Thrun speaking about the DARPA Grand Challenge at this Google TechTalk.

Artificial Intelligence: A Modern Approach

September 20, 2007

“Artificial Intelligence: A Modern Approach,” by Stuart Russell (professor of computer science at UC Berkeley) and Peter Norvig (head of research at Google) is the best-known and most-used textbook about artificial intelligence, and for good reason; it’s a great book! The first edition of this book was my guide to the field when I was switching over from physics research to computer science.

I feel almost embarrassed to recommend it, because I suspect nearly everybody interested in AI already knows about it. So I’m going to tell you about a couple related resources that are maybe not as well-known.

First, there is the online code repository to the algorithms in the book, in Java, Python, and Lisp. Many of the algorithms are useful beyond AI, so you may find for example that the search or optimization algorithm that you are interested in has already been written for you. I personally have used the Python code, and it’s really model code from which you can learn good programming style.

Second, if you haven’t ever visited Peter Norvig’s web-site, you really should. I particularly recommend his essays “Teach Yourself Programming in Ten Years,” “Solving Every Sudoku Puzzle,” and “The Gettysburg Powerpoint Presentation.”

Multicellular Logic Circuits, Part II: Cells

September 18, 2007

In my post “Multicellular Logic Circuits, Part I: Evolution,” I discussed evolution and genetic algorithms; I want to continue that discussion here.

There are two salient facts of biology that are completely inescapable. The first is that all organisms are shaped by the process of evolution. The second is that all organisms are constructed from cells.

Furthermore, all complex multicellular organisms begin life as a single cell, and undergo a process of development through cell division to mature into an adult. And no matter how different any two organisms may be on the gross macroscopic level that we are used to, inside their cells the chemical processes of life are fundamentally very similar.

Thus it is no accident that the titles of the two leading textbooks in molecular biology are The Molecular Biology of the Gene by Watson, et. al. and The Molecular Biology of the Cell by Alberts et. al. [These are both great books. This link to the first chapter of MBOC is an excellent entry point into modern biology. And if you are serious about learning biology, I also strongly recommend the companion Molecular Biology of the Cell: A Problems Approach, by Wilson and Hunt, which will force you to think more actively about the material.]

It therefore seems reasonable that if we want to construct artificial systems that achieve the performance of natural ones, we should consider artificially evolving a system constructed from cells.

Although there are typically many different cell types in a mature multi-cellular organism, all the different cells of the organism, with the exception of sperm and egg cells, share an identical genetic specification in their DNA. The different behavior of cells with identical genetic specifications is the result of the cells having different histories and being subjected to different environments.

More specifically, the behavior of a biological cell is controlled by complex genetic regulatory mechanisms that determine which genes are transcribed into messenger RNA and then translated into proteins. One very important regulatory mechanism is provided by the proteins called “transcription factors” that bind to DNA regulatory regions upstream of the protein coding regions of genes, and participate in the promotion or inhibition of the transcription of DNA into RNA. The different histories of two cells might lead to one having a large concentration of a particular transcription factor, and the other having a low concentration, and thus the two cells would express different genes, even though they had identical DNA.

Another important mechanism that controls the differential development of different types of cells in a multi-cellular organism is the biochemical signaling sent between cells. Signals such as hormones have the effect of directing a cell down a particular developmental pathway.

In general, the transcription factors, hormones, and multitude of other control mechanisms used in biological cells are organized into a network which can be represented as a “circuit” where the state of the system is characterized by the concentrations of the different biochemical ingredients. In fact, biologists are now using wiring diagrams to help summarize biological circuits; see for example, the “Biotapestry editor” developed by Eric Davidson’s lab at Caltech.

[I strongly recommend Davidson’s recent book The Regulatory Genome: Gene Regulatory Networks in Development and Evolution for an exciting introduction to the burgeoning “evo-devo” field; if you don’t have any background in biology, you may prefer The Coiled Spring, by Ethan Bier for a somewhat more popular account.]

Turning to the problem of designing artifical systems, a natural question is what theoretical advantages exist, from the point of view of designing with evolution, to using an identical genetic specification for all the cells in a multi-cellular organism.

One potential advantage is that relatively small changes to the genetic specification of the organism can concurrently alter the behavior of many different kinds of cells at many different times during the development of the organism. Therefore, if there is the possibility of an advantageous change to the circuitry controlling a cell, then it can be found once and used many times instead of needing to find the same advantageous mutation repeatedly for each of the cells in the organism.

Another related potential advantage is that a highly complicated organism can be specified in a relatively compact way. If each of the trillions of cells in a complex organism like a human had to be separately specified, then the overall amount of information required to describe the human genome would be multiplied more than a trillion-fold. Clearly, it is much more efficient to re-use the identical circuitry in many different types of cells.

In other words, biology uses a strategy of specifying a complex multi-cellular organism by just specifying a single cell–all the other cells in the mature organism are grown organically out of the developmental process. This seems like a strategy worth imitating.

On the other hand, the constraint that each cell in an organism should share an identical genetic specification clearly causes complications from the point of view of design. For example, it is important that genes that are designed to function in one type of cell at one point in development not cause problems for different type of cell at a different point in development. Clearly, good design of the control logic that turns genes on and off is essential to the proper functioning of a multi-cellular organism.

In the next post in this series, I will turn to the construction of a concrete model for multi-cellular circuits that tries to capture, as simply as possible, the essence of what is happening in biology.

Computer Chess Software

August 26, 2007

The best chess playing entity in the world today is almost certainly the computer program “Rybka,” developed primarily by International Master Vasik Rajlich.

Rybka currently tops all the computer chess rating lists. Since Rybka is pretty clearly stronger than the Deep Fritz program that defeated World Champion Vladimir Kramnik 4-2 last year, there’s not much doubt that it is better than any human chess player.

Rybka is available as a plug-in chess engine, which means that you also need to have a chess graphical user interface (GUI) to use it. There are a variety of GUI’s available, including the free Arena GUI.

Rybka plays wonderful chess of course. It is free of all the problems that used to plague computer programs–for example it freely sacrifices material for positional compensation. Nevertheless, it’s not the program that I prefer, for a couple reasons.

First, it is only available for Windows (and Linux using Wine), and I wanted a program for Mac OS X. The strongest program for Macs is clearly Hiarcs 11, developed by Mark Uniacke. Although Hiarcs 11 seems to be somewhat weaker than Rybka, it is still stronger than Deep Fritz, and any human being.

Most importantly, on Macs, Hiarcs is paired with the very nice chess “Sigma Chess” GUI. This GUI/engine combination sports the absolutely crucial feature, missing from any version of Rybka, that you can adjust its strength down to any level from world champion to amateur levels as low as 1250 Elo. If you don’t know what 1250 Elo means, you’re probably an amateur who will lose to a 1250–it’s the level amateurs typically reach after playing in tournaments for about one year. (Note: to obtain this feature on the Mac, you’ll need to spend \$20 to get the “Pro” version of Sigma Chess, plus \$50 for Hiarcs 11. You should also be able to adjust the level of Hiarcs on Windows GUI’s like Arena, although I have no personal experience with this.)

Amazingly, at whatever level you choose, Hiarcs plays a very human-like game. Previous computer programs, when weakened, played poorly strategically, but still did not make tactical errors. A weakened Hiarcs will make tactical mistakes that you can take advantage of, and will also play at an appropriate level of strategic understanding, so there is no particular reason to adopt anti-computer approaches when playing against it.

Of course, Hiarcs comes with many other features, including the ability to use it to analyze your games. Most grandmasters use these programs to aid in their training, and Hiarcs is well-known as one of the best engines to use.

Personally, I also enjoy just watching my Hiarcs 11.2 engine play blitz chess against the 11.1 engine that I previously used. It’s like watching a couple world champions duke it out. I think it’s worth appreciating, that at least for the game of chess, we now have \$50-\$70 software running on commodity hardware, that essentially passes the Turing test.

If you are a beginner, you might be better off choosing the mass-market Chessmaster, which comes packed with a lot of nice tutorials, as well as some very weak opponents to start with. However, I don’t like Chessmaster as much as Hiarcs because the weakened opponents in Chessmaster have the strong-tactics weak-strategy characteristics of computer programs that become so tiresome after a while.

“On Intelligence” and Numenta

August 21, 2007

“On Intelligence,” written by Jeff Hawkins with Sandra Blakeslee, is a great read, full of provocative ideas about how our brains work.

Hawkins founded Palm Computing and Handspring, but he says that his true lifelong passion has been trying to understand our brains. In 2002, he founded the Redwood Neuroscience Institute (now the Redwood Center for Theoretical Neuroscience at Berkeley).

At Redwood, he developed a theory of the brain, that he expounds in this book. The book is a popular science book; you will not find any equations or explicit algorithms. It reads very smoothly, undoubtably due to the fact that Hawkins was helped in writing the book by Sandra Blakeslee, a science writer for the New York Times.

Hawkins argues that the cortex consists of modules that are all performing the same algorithm. The purpose of that algorithm is to learn to complete and predict the spatio-temporal patterns coming into a module from “lower” modules in the brain’s hierarchy, using feedback from “higher” modules.

I find this hypothesis for how the brain works extremely attractive, and “On Intelligence” argues for the hypothesis almost too well, in that the difficulties of converting the hypothesis into a concrete and useful algorithm will slip by the reader, as if by sleight of hand. The problem, of course, is that it is easy to use words to talk about how a brain might work; the hard part is making a machine do the same thing.

Peter Dayan, a leading computational neuroscientist, has written a much more detailed review. And for a 2006 lecture by Hawkins about his theory, you can watch this video.

After reading the book, I actually spent a few weeks trying to make the hypothesis into a concrete algorithm, without much success. But Hawkins is certainly putting his money where his mouth is: he has founded a company, called Numenta, and Numenta has released software (the “Numenta Platform for Intelligent Computing” or “NuPIC”) that implements part of his theory. NuPIC appears to be based on software originally written by Numenta co-founder and Stanford graduate student Dileep George, but transformed by a team of software developers into a professional-quality product.

However, it is very disappointing that the NUPIC software does not include feedback in its hierarchies, nor does it let you learn and infer temporal patterns (only spatial ones). To be honest, I am not so surprised, because it was these elements (that are obviously central in Hawkins’ theory) that I found it so difficult to integrate into a real algorithm.

Numenta promises that future versions of NuPIC will fill these gaps. I sincerely wish them the best of luck! I should say that I think it is wonderful that an impatient guy like Hawkins pushes the field, using a different approach than the standard academic one.

So in summary, you should definitely read Hawkins’ book, and if you’re intrigued, you should by all means check out the free Numenta software. Whether this line of research will lead to anything significant though, I’m not really sure…

Lectures on Disordered Systems

August 12, 2007

Many physicists study “disordered systems,” such as materials like glasses where the molecules making up the material are arranged randomly in space, in contrast to crystals, where all the particles are arranged in beautiful repeating patterns.

The symmetries of crystals make them much easier to analyze than glasses, and new theoretical methods had to be invented before physicists could make any headway in computing the properties of disordered systems. Those methods have turned out to be closely connected to approaches, such as the “belief propagation” algorithm, that are widely used in computer science, artificial intelligence, and communications theory, with the result that physicists and computer scientists today regularly exchange new ideas and results across their disciplines.

Returning to the physics of disordered systems, physicists began working on the problem in the 1970’s by considering the problem of disordered magnets (also called “spin glasses”). My Ph.D. thesis advisor, Philip W. Anderson summarized the history as follows:

“In 1975 S.F. (now Sir Sam) Edwards and I wrote down the “replica” theory of the phenomenon I had earlier named “spin glass”, followed up in ’77 by a paper of D.J. Thouless, my student Richard Palmer, and myself. A brilliant further breakthrough by G. Toulouse and G. Parisi led to a full solution of the problem, which turned out to entail a new form of statistical mechanics of wide applicability in fields as far apart as computer science, protein folding, neural networks, and evolutionary modelling, to all of which directions my students and/or I contributed.”

In 1992, I presented five lectures on “Quenched Disorder: Understanding Glasses Using a Variational Principle and the Replica Method” at a Santa Fe Institute summer school on complex systems. The lectures were published in a book edited by Lynn Nadel and Daniel Stein, but that book is very hard to find, and I think that these lectures are still relevant, so I’m posting them here. As I say in the introduction, “I will discuss technical subjects, but I will try my best to introduce all the technical material in as gentle and comprehensible a way as possible, assuming no previous exposure to the subject of these lectures at all.”

The first lecture is an introduction to the basics of statistical mechanics. It introduces magnetic systems and particle systems, and describes how to exactly solve non-interacting magnetic systems and particle systems where the particles are connected by springs.

The second lecture introduces the idea of variational approaches. Roughly speaking, the idea of a variational approach is to construct an approximate but exactly soluble system that is as close as possible to the system you are interested in. The grandly titled “Gaussian variational method” is the variational method that tries to find the set of particles and springs that best approximates an interacting particle system. I describe in this second lecture how the Gaussian variational method can be applied to heteropolymers like proteins.

The next three lectures cover the replica method, and combine it with the variational approach. The replica method is highly intricate mathematically. I learned it at the feet of the masters during my two years at the Ecole Normale Superieure (ENS) in Paris. In particular, I was lucky to work with Jean-Philippe Bouchaud, Antoine Georges, and Marc Mezard, who taught me what I knew. I thought it unfortunate that there wasn’t a written tutorial on the replica method, so the result were these lectures. Marc told me that for years afterwards they were given to new students of the replica method at the ENS.

Nowadays, the replica method is a little less popular than it used to be, mostly because it is all about computing averages of quantities over many samples of systems that are disordered according to some probability distribution. While those averages are very useful in physics, they are somewhat less important in computer science, where you usually just want an algorithm to deal with the one disordered system in front of you, rather than an average over all the possible disordered systems.

Santa Fe Institute Lectures

Multicellular Logic Circuits, Part I: Evolution

August 7, 2007

If we want to construct artificial machines that rival the capabilities of biological organisms, we should try to understand the principles by which complex natural “machines” such as plants and animals are created.

It is generally agreed, at least by scientists, that all natural organisms have been “designed” by the completely blind and random process of evolution. Through evolution, a population of organisms tends to become progressively better adapted to its environment via the mutation of genomes of individuals in the population, and the selection and more rapid reproduction of the fittest organisms in that population.

Harvard professor Martin Nowak a has written a lovely and elegant book describing the mathematics of evolutionary dynamics, using the ideas of evolutionary game theory; here is a video of Nowak describing evolutionary game theory at Harvard in 2004.

My own interest is not so much in analyzing evolution, but in exploiting it. If we understand evolution so well, shouldn’t we be able to use it to design useful machines?

Of course, humans have already for many centuries exploited evolution, using artificial selection to breed domesticated animals or cultivate useful plants.

But I am looking for something else: the design of artificial machines through artificial selection. Although it has never been a mainstream idea, computer scientists have pursued such dreams since the 1950’s. When I was in graduate school in the 1980’s, I loved reading John Holland’s seminal 1975 book “Adaptation in Natural and Artificial Systems.”

Holland and his students were deeply influential in popularizing the whole field of genetic algorithms.

Another important figure in the field is John Koza, who has advocated for many years one of the most important variants of genetic algorithms, which he calls “genetic programming.” In genetic programming, computer programs, typically written in Lisp, are evolved through a process that involves mutating the programs by altering or swapping branches of the computation tree representing the program.

Genetic programming and genetic algorithms more generally, have had considerable success creating interesting and useful systems and programs. Nevertheless, I think it is fair to say that these ideas are still considered “fringe” ideas in the scientific and engineering community, and they have not widely replaced more conventional software and hardware design strategies.

So what might be missing? I will begin discussing that in Part II.

Generalized Belief Propagation

August 5, 2007

In 2002, I gave a lecture at the Mathematical Sciences Research Institute on the work I did, together with Bill Freeman and Yair Weiss on Generalized Belief Propagation, and the correspondence between free energy approximations and message passing algorithms. The lecture is available as a streaming video, together with a pdf for the slides, here.

It’s worth mentioning that there are many other interesting research lectures available in MSRI’s video archive, and that the more recent ones are of higher production quality.

Here is our most recent and comprehensive paper on this subject, published in the July 2005 issue of IEEE Transactions on Information Theory, which gives many additional details compared to the lecture: MERL TR2004-040.

If that paper is too difficult, you should probably start with this earlier paper, which was more tutorial in nature: MERL TR2001-22.

If you’re looking for generalized belief propagation software, your best bet is this package written by Yair’s student Talya Meltzer.

P.S.: I realized I haven’t told those of you who don’t know anything about it what generalized belief propagation is. Well, one answer is to that is look at the above material! But here’s a little background text that I’ve copied from my research statement to explain why you might be interested:

Most of my current research involves the application of statistical methods to “inference” problems. Some important fields which are dominated by the issue of inference are computer vision, speech recognition, natural language processing, error-control coding and digital communications. Essentially, any time you are receiving a noisy signal, and need to infer what is really out there, you are dealing with an inference problem.

A productive way to deal with an inference problem is to formalize it as a problem of computing probabilities in a “graphical model.” Graphical models, which are referred to in various guises as “Markov random fields,” “Bayesian networks,” or “factor graphs,” provide a statistical framework to encapsulate our knowledge of a system and to infer from incomplete information.

Physicists who use the techniques of statistical mechanics to study the behavior of disordered magnetic spin systems are actually studying a mathematically equivalent problem to the inference problem studied by computer scientists or electrical engineers, but with different terminology, goals, and perspectives. My own research has focused on the surprising relationships between methods that are used in these communities, and on powerful new techniques and algorithms, such as Generalized Belief Propagation, that can be understood using those relationships.

I’ll tell you more in future posts; I promise.

Zillions of Games

August 2, 2007

If you like abstract games, especially chess variants, you should enjoy Zillions of Games. It’s quite an amazing piece of software; a piece of AI technology that I actually would have not thought possible.

Games are defined using a rules file (written in a Lisp-like language), and then a generalized alpha-beta search engine is unleashed. Basically, it works best for Chess and its many variants, but it plays a really surprising number of games very well. It comes with over 350 games and puzzles, and you can download thousands more.

It’s not that hard to modify existing rules files to define new games; my son was interested in having an opponent for a game he invented and we were able to modify an existing rules file to play his game in less than an hour–and it played very well.

Naturally, it doesn’t play all games as well as humans, but it plays so many games so well (with adjustable levels to make a fair fight if it’s too good) that one can’t complain. Unfortunately, it’s only available for Windows computers.