Archive for the ‘Computer Science’ Category

“On Intelligence” and Numenta

August 21, 2007


book.png

“On Intelligence,” written by Jeff Hawkins with Sandra Blakeslee, is a great read, full of provocative ideas about how our brains work.

Hawkins founded Palm Computing and Handspring, but he says that his true lifelong passion has been trying to understand our brains. In 2002, he founded the Redwood Neuroscience Institute (now the Redwood Center for Theoretical Neuroscience at Berkeley).

At Redwood, he developed a theory of the brain, that he expounds in this book. The book is a popular science book; you will not find any equations or explicit algorithms. It reads very smoothly, undoubtably due to the fact that Hawkins was helped in writing the book by Sandra Blakeslee, a science writer for the New York Times.

Hawkins argues that the cortex consists of modules that are all performing the same algorithm. The purpose of that algorithm is to learn to complete and predict the spatio-temporal patterns coming into a module from “lower” modules in the brain’s hierarchy, using feedback from “higher” modules.

I find this hypothesis for how the brain works extremely attractive, and “On Intelligence” argues for the hypothesis almost too well, in that the difficulties of converting the hypothesis into a concrete and useful algorithm will slip by the reader, as if by sleight of hand. The problem, of course, is that it is easy to use words to talk about how a brain might work; the hard part is making a machine do the same thing.

Peter Dayan, a leading computational neuroscientist, has written a much more detailed review. And for a 2006 lecture by Hawkins about his theory, you can watch this video.

After reading the book, I actually spent a few weeks trying to make the hypothesis into a concrete algorithm, without much success. But Hawkins is certainly putting his money where his mouth is: he has founded a company, called Numenta, and Numenta has released software (the “Numenta Platform for Intelligent Computing” or “NuPIC”) that implements part of his theory. NuPIC appears to be based on software originally written by Numenta co-founder and Stanford graduate student Dileep George, but transformed by a team of software developers into a professional-quality product.

However, it is very disappointing that the NUPIC software does not include feedback in its hierarchies, nor does it let you learn and infer temporal patterns (only spatial ones). To be honest, I am not so surprised, because it was these elements (that are obviously central in Hawkins’ theory) that I found it so difficult to integrate into a real algorithm.

Numenta promises that future versions of NuPIC will fill these gaps. I sincerely wish them the best of luck! I should say that I think it is wonderful that an impatient guy like Hawkins pushes the field, using a different approach than the standard academic one.

So in summary, you should definitely read Hawkins’ book, and if you’re intrigued, you should by all means check out the free Numenta software. Whether this line of research will lead to anything significant though, I’m not really sure…

Structure and Interpretation of Computer Programs: Videos and Textbook

August 17, 2007


gjspicture.jpg

This set of videos, of Gerald Jay Sussman and Hal Abelson teaching their course on the “Structure and Interpretation of Computer Programs” in July 1986 for Hewlett-Packard employees, is one of the treasures of the internet. The clothes are out of style, but the material presented is still completely relevant.

The SICP web-site has lots of useful additional information, including the complete text of the 2nd edition of their classic textbook.

I am not going to try to write a better review of the Abelon’s and Sussman’s textbook than the Amazon review written by Peter Norvig. Norvig is the head of Research at Google, and a co-author of Artificial Intelligence: a Modern Approach, the leading AI textbook.

But I will add one comment: what I love most about these lectures is the point (see the picture above of Sussman in his wizard hat) that a computer programmer is like a wizard–he creates something real out of ideas. Computers let us be wizards; and I believe we have only scratched the surface of the possible “spells” that we can learn to cast.

If you want a nice implementation of Scheme, the beautiful dialect of Lisp that was invented by Sussman with his student Guy Steele, and used in this textbook, I highly recommend DrScheme.

Lectures on Disordered Systems

August 12, 2007

Many physicists study “disordered systems,” such as materials like glasses where the molecules making up the material are arranged randomly in space, in contrast to crystals, where all the particles are arranged in beautiful repeating patterns.

The symmetries of crystals make them much easier to analyze than glasses, and new theoretical methods had to be invented before physicists could make any headway in computing the properties of disordered systems. Those methods have turned out to be closely connected to approaches, such as the “belief propagation” algorithm, that are widely used in computer science, artificial intelligence, and communications theory, with the result that physicists and computer scientists today regularly exchange new ideas and results across their disciplines.

Returning to the physics of disordered systems, physicists began working on the problem in the 1970’s by considering the problem of disordered magnets (also called “spin glasses”). My Ph.D. thesis advisor, Philip W. Anderson summarized the history as follows:

“In 1975 S.F. (now Sir Sam) Edwards and I wrote down the “replica” theory of the phenomenon I had earlier named “spin glass”, followed up in ’77 by a paper of D.J. Thouless, my student Richard Palmer, and myself. A brilliant further breakthrough by G. Toulouse and G. Parisi led to a full solution of the problem, which turned out to entail a new form of statistical mechanics of wide applicability in fields as far apart as computer science, protein folding, neural networks, and evolutionary modelling, to all of which directions my students and/or I contributed.”

csss_us.jpg

In 1992, I presented five lectures on “Quenched Disorder: Understanding Glasses Using a Variational Principle and the Replica Method” at a Santa Fe Institute summer school on complex systems. The lectures were published in a book edited by Lynn Nadel and Daniel Stein, but that book is very hard to find, and I think that these lectures are still relevant, so I’m posting them here. As I say in the introduction, “I will discuss technical subjects, but I will try my best to introduce all the technical material in as gentle and comprehensible a way as possible, assuming no previous exposure to the subject of these lectures at all.”

The first lecture is an introduction to the basics of statistical mechanics. It introduces magnetic systems and particle systems, and describes how to exactly solve non-interacting magnetic systems and particle systems where the particles are connected by springs.

The second lecture introduces the idea of variational approaches. Roughly speaking, the idea of a variational approach is to construct an approximate but exactly soluble system that is as close as possible to the system you are interested in. The grandly titled “Gaussian variational method” is the variational method that tries to find the set of particles and springs that best approximates an interacting particle system. I describe in this second lecture how the Gaussian variational method can be applied to heteropolymers like proteins.

The next three lectures cover the replica method, and combine it with the variational approach. The replica method is highly intricate mathematically. I learned it at the feet of the masters during my two years at the Ecole Normale Superieure (ENS) in Paris. In particular, I was lucky to work with Jean-Philippe Bouchaud, Antoine Georges, and Marc Mezard, who taught me what I knew. I thought it unfortunate that there wasn’t a written tutorial on the replica method, so the result were these lectures. Marc told me that for years afterwards they were given to new students of the replica method at the ENS.

Nowadays, the replica method is a little less popular than it used to be, mostly because it is all about computing averages of quantities over many samples of systems that are disordered according to some probability distribution. While those averages are very useful in physics, they are somewhat less important in computer science, where you usually just want an algorithm to deal with the one disordered system in front of you, rather than an average over all the possible disordered systems.

Santa Fe Institute Lectures

Mitsubishi Electric Research Labs is Hiring

August 8, 2007

Mitsubishi Electric Research Labs (MERL), the research lab where I work, is hiring. We have open research scientist positions in a number of different fields in computer science and electrical engineering.

Multicellular Logic Circuits, Part I: Evolution

August 7, 2007

If we want to construct artificial machines that rival the capabilities of biological organisms, we should try to understand the principles by which complex natural “machines” such as plants and animals are created.

It is generally agreed, at least by scientists, that all natural organisms have been “designed” by the completely blind and random process of evolution. Through evolution, a population of organisms tends to become progressively better adapted to its environment via the mutation of genomes of individuals in the population, and the selection and more rapid reproduction of the fittest organisms in that population.

51tk0pychxl_aa240_.jpg

Harvard professor Martin Nowak a has written a lovely and elegant book describing the mathematics of evolutionary dynamics, using the ideas of evolutionary game theory; here is a video of Nowak describing evolutionary game theory at Harvard in 2004.

My own interest is not so much in analyzing evolution, but in exploiting it. If we understand evolution so well, shouldn’t we be able to use it to design useful machines?

180px-img013biglittledogfx_wb.jpg

Of course, humans have already for many centuries exploited evolution, using artificial selection to breed domesticated animals or cultivate useful plants.

But I am looking for something else: the design of artificial machines through artificial selection. Although it has never been a mainstream idea, computer scientists have pursued such dreams since the 1950’s. When I was in graduate school in the 1980’s, I loved reading John Holland’s seminal 1975 book “Adaptation in Natural and Artificial Systems.”

adapt.jpg

Holland and his students were deeply influential in popularizing the whole field of genetic algorithms.

Another important figure in the field is John Koza, who has advocated for many years one of the most important variants of genetic algorithms, which he calls “genetic programming.” In genetic programming, computer programs, typically written in Lisp, are evolved through a process that involves mutating the programs by altering or swapping branches of the computation tree representing the program.

Genetic programming and genetic algorithms more generally, have had considerable success creating interesting and useful systems and programs. Nevertheless, I think it is fair to say that these ideas are still considered “fringe” ideas in the scientific and engineering community, and they have not widely replaced more conventional software and hardware design strategies.

So what might be missing? I will begin discussing that in Part II.


Generalized Belief Propagation

August 5, 2007

In 2002, I gave a lecture at the Mathematical Sciences Research Institute on the work I did, together with Bill Freeman and Yair Weiss on Generalized Belief Propagation, and the correspondence between free energy approximations and message passing algorithms. The lecture is available as a streaming video, together with a pdf for the slides, here.

It’s worth mentioning that there are many other interesting research lectures available in MSRI’s video archive, and that the more recent ones are of higher production quality.

Here is our most recent and comprehensive paper on this subject, published in the July 2005 issue of IEEE Transactions on Information Theory, which gives many additional details compared to the lecture: MERL TR2004-040.

If that paper is too difficult, you should probably start with this earlier paper, which was more tutorial in nature: MERL TR2001-22.

If you’re looking for generalized belief propagation software, your best bet is this package written by Yair’s student Talya Meltzer.

P.S.: I realized I haven’t told those of you who don’t know anything about it what generalized belief propagation is. Well, one answer is to that is look at the above material! But here’s a little background text that I’ve copied from my research statement to explain why you might be interested:

Most of my current research involves the application of statistical methods to “inference” problems. Some important fields which are dominated by the issue of inference are computer vision, speech recognition, natural language processing, error-control coding and digital communications. Essentially, any time you are receiving a noisy signal, and need to infer what is really out there, you are dealing with an inference problem.

A productive way to deal with an inference problem is to formalize it as a problem of computing probabilities in a “graphical model.” Graphical models, which are referred to in various guises as “Markov random fields,” “Bayesian networks,” or “factor graphs,” provide a statistical framework to encapsulate our knowledge of a system and to infer from incomplete information.

Physicists who use the techniques of statistical mechanics to study the behavior of disordered magnetic spin systems are actually studying a mathematically equivalent problem to the inference problem studied by computer scientists or electrical engineers, but with different terminology, goals, and perspectives. My own research has focused on the surprising relationships between methods that are used in these communities, and on powerful new techniques and algorithms, such as Generalized Belief Propagation, that can be understood using those relationships.

I’ll tell you more in future posts; I promise.

Multicellular Logic Circuits

August 2, 2007

I have recently begun working on biologically-inspired computing, and in particular am pursuing a new approach that I call “multi-cellular logic circuits.” These circuits are constructed from many identically-specified cells and undergo a developmental phase much like multi-cellular organisms. I plan to describe these ideas in much more detail here at Nerd Wisdom. Until I create those pages, you can visit this page to find a PDF and a PowerPoint version of a talk I gave on the subject at a conference in Santa Fe in May 2007.

EDIT: I have begun creating those pages. You can start here.

 

The Elements of Computing Systems: Building a Modern Computer From First Principles

August 1, 2007

51jfxm2x1yl_aa240_.jpg

I already wrote a review of this wonderful book, written by Noam Nisan and Shimon Schocken, at Amazon.com, so I’ll just repeat it here:

I highly recommend this book if you are interested in learning about computer science. The book is organized around the idea of building a computer from the fundamental logic gates up–starting with the hardware (combinational logic gates, arithmetic logic units, sequential logic gates, the CPU and memory) and then through the software hierarchy (starting with the machine language, and working up through the assembler, a virtual machine, a compiler for a high-level language, and an operating system). As a “by-product,” one learns, by very relevant examples, many fundamental concepts of computer science.

You can just read the book, but the best idea is to follow the authors’ advice and do the projects where you implement every necessary piece of the computer system yourself. The projects are all very well organized. All the software modules necessary to emulate any part of the computer, plus half the chapters from the book, are available for free download from the authors’ web-site. It all works beautifully. If you want to skip any of the projects, you can, because the software is organized in such a way that it will use built-in modules instead of the ones you built if necessary.

The authors seem to have extensively tested the whole approach through the courses they have taught using this material. I also noticed that Harvard’s Computer Science 101 course is being taught based on this book. I have been using the book for self-study with absolutely no problems–in fact I have never had such a great experience with a self-study course. All you need is a Windows, Linux, or Mac OS X computer and access to the internet, and you can give yourself a wonderful education in computer science.

In terms of prerequisites, you only really need to have some experience with programming (e.g. with C, or ideally with Java or Python). I think that the book should work well for students or hobbyists who don’t have any more experience than that, but it is also great for much more experienced students, as a kind of integrative summary of the field. I also think the book is perfect for graduate students or researchers from other fields who want to learn how digital hardware and software systems are actually engineered.

Finally, I just want to compliment the authors on the extraordinary care that they have taken with the whole project. The computer design that you build up is wonderfully elegant–at every stage the design is just as simple as it can be while being sufficient. Every piece of emulation software works as advertised. Even the extra powerpoint or .pdf tutorials are nicely done. This is really quality work, and using it is just a real pleasure. Finally, the source code for all the software provided by the authors is available, so if you wanted to extend the provided emulators, you could do that.

In summary, I give this book my unqualified highest recommendation.