Science | Nerd Wisdom

Archive for the ‘Science’ Category

Princeton Companion to Mathematics

October 5, 2007

From Terence Tao’s excellent blog, I learned about the upcoming Princeton Companion to Mathematics (PCM), a roughly 1000-page survey of and introduction to mathematics at the undergraduate level. It looks like the PCM editors have lined up a distinguished cast of mathematicians to write comprehensible articles covering all of mathematics. The editor-in-chief is Timothy Gowers, who like Tao is a Fields Medalist, and who has very recently started his own blog.

Gowers gives instructions for finding a large sample of articles that will appear in the book, but they are not completely transparent, so follow these:

Go here.
Use Username “Guest” and Password “PCM”.
Click on “Resources” in the sidebar.
Click on “Sample articles” in the sidebar.

You’ll find many interesting and highly readable articles in .pdf format.

Nerd Wisdom Home

Tags:Mathematics, Princeton Companion to Mathematics, Terence Tao, Timothy Gowers
Posted in Books, Mathematics, Science | Leave a Comment »

Computing Free Energies

October 3, 2007

In my last post, I discussed phase transitions, and how computing the free energy for a model would let you work out the phase diagram. Today, I want to discuss in more detail some methods for computing free energies.

The most popular tool physicists use for computing free energies is “mean-field theory.” There seems to be at least one “mean-field theory” for every model in physics. When I was a graduate student, I became very unhappy with the derivations for mean-field theory, not because there were not any, but because there were too many! Every different book or paper had a different derivation, but I didn’t particularly like any of them, because none of them told you how to correct mean-field theory. That seemed strange because mean-field theory is known to only give approximate answers. It seemed to me that a proper derivation of mean-field theory would let you systematically correct the errors.

One paper really made me think hard about the problem: the famous 1977 “TAP” spin glass paper by Thouless, Anderson, and Palmer. They presented a mean-field free energy for the Sherrington-Kirkpatrick (SK) model of spin glasses by “fait accompli,” which added a weird “Onsager reaction term” to the ordinary free energy. This shocked me; maybe they were smart enough to write down free energies by fait accompli, but I needed some reliable mechanical method.

Since the Onsager reaction term had an extra power of 1/T compared to the ordinary energy term in the mean field theory, and the ordinary energy term had an extra power of 1/T compared to the entropy term, it looked to me like perhaps the TAP free energy could be derived from a high-temperature expansion. It would have to be a strange high-temperature expansion though, because it would need to be valid in the low-temperature phase!
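
To make the power counting concrete, here is the TAP free energy written in a commonly used form. The notation is an assumption for illustration and is not quoted from the paper: spins take values ±1, the energy is minus the sum over pairs of J_ij s_i s_j, m_i is the magnetization of spin i, β = 1/T, and (ij) denotes a sum over pairs. The three lines are the entropy, ordinary energy, and Onsager reaction terms, carrying powers β⁰, β¹, and β² respectively.

```latex
\beta G =
  \sum_i \left[ \frac{1+m_i}{2}\ln\frac{1+m_i}{2}
              + \frac{1-m_i}{2}\ln\frac{1-m_i}{2} \right]
  \\ \quad - \beta \sum_{(ij)} J_{ij}\, m_i m_j
  \\ \quad - \frac{\beta^2}{2} \sum_{(ij)} J_{ij}^2\, (1-m_i^2)(1-m_j^2)
```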

Together with Antoine Georges, I worked out that the “high-temperature” expansion (it might better be thought of as a “weak interaction expansion”) could in fact be valid in a low-temperature phase, if one computed the free energy at fixed non-zero magnetization. This turned out to be the key idea; once we had it, it was just a matter of introducing Lagrange multipliers and doing some work to compute the details.

It turned out that ordinary mean-field theory is just the first couple of terms in a Taylor expansion. Computing more terms lets you systematically correct mean-field theory, and thus compute the critical temperature of the Ising model, or any other quantities of interest, to better and better precision. The picture above is a figure from the paper, representing the expansion in a diagrammatic way.
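
As a rough illustration of how an extra term shifts the critical temperature, the sketch below treats a ferromagnetic Ising model with uniform magnetization m on a lattice where each spin has z neighbors. It assumes the expansion, per spin and truncated at second order, reads β·g = -s(m) - β(z/2)J m² - (β²/2)(z/2)J²(1 - m²)², where s(m) is the entropy of an independent spin. The function names, and the choice to locate the transition where m = 0 first becomes unstable, are illustrative assumptions rather than the procedure used in the paper.

```python
import math

def curvature_at_zero(beta, z, J=1.0, order=2):
    """Second derivative of beta*g(m) at m = 0 for the assumed truncated expansion.

    The entropy term contributes 1, the mean-field energy term -beta*z*J,
    and the second-order (Onsager-type) term +beta**2 * z * J**2.
    """
    c = 1.0 - beta * z * J
    if order >= 2:
        c += beta**2 * z * J**2
    return c

def critical_temperature(z, J=1.0, order=1):
    """Temperature at which m = 0 first becomes unstable as T is lowered."""
    if order == 1:
        return z * J  # ordinary mean-field result
    # Smallest positive root of z*J^2*beta^2 - z*J*beta + 1 = 0.
    disc = z * z - 4.0 * z
    if disc < 0:
        return None  # no real root: no instability at this order
    beta_c = (z - math.sqrt(disc)) / (2.0 * z * J)
    return 1.0 / beta_c

if __name__ == "__main__":
    for order in (1, 2):
        print(order, critical_temperature(6, order=order))
```

For the simple cubic lattice (z = 6, J = 1) the second-order estimate lies between the ordinary mean-field value of 6 and the value of roughly 4.51 obtained from high-precision numerical studies, which is the direction of improvement one hopes for. Truncated series can behave badly, so this is only a sanity check on the idea.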

We found out, after doing our computations but before submitting the paper, that in 1982 Plefka had already derived the TAP free energy for the SK model from that Taylor expansion. For whatever reason, though, he had not gone beyond the Onsager correction term or noted that the technique was much more general than the SK model for spin glasses, so nobody else had followed up using this approach.

If you want to learn more about this method for computing free energies, please read my paper (with Antoine Georges) “How to Expand Around Mean-Field Theory Using High Temperature Expansions,” or my paper “An Idiosyncratic Journey Beyond Mean Field Theory.”

This approach has some advantages and disadvantages compared with the belief propagation approach (and related Bethe free energy) which is much more popular in the electrical engineering and computer science communities. One advantage is that the free energy in the high-temperature expansion approach is just a function of simple one-node “beliefs” (the magnetizations), so it is computationally simpler to deal with than the Bethe free energy and belief propagation. Another advantage is that you can make systematic corrections; belief propagation can also be corrected with generalized belief propagation, but the procedure is less automatic. Disadvantages include the fact that the free energy is only exact for tree-like graphs if you add up an infinite number of terms, and the theory has not yet been formulated in a nice way for “hard” (infinite energy) constraints.

If you’re interested in quantum systems such as the Hubbard model, the expansion approach has the advantage that it can also be applied to them; see my paper with Georges, or the lectures by Georges on his related “Dynamical Mean Field Theory,” or this recent paper by Plefka, who has returned to the subject more than 20 years after his original paper.

Also, if you’re interested in learning more about spin glasses or other disordered systems, or about other variational derivations for mean-field theory, please see this post.

Tags:free energy, high-temperature expansions, mean-field theory, Plefka, spin glasses, TAP
Posted in Inference, Physics, Science, Statistical Physics | 2 Comments »

Phase Transitions and Free Energies

October 2, 2007

Much of condensed matter and statistical physics is concerned with the explanation of phase transitions between different forms of matter. A familiar example is water, which has a transition from the solid phase of ice to the liquid phase, and then from the liquid phase to the gaseous phase of steam, as the temperature is increased.

In fact, many different materials have all sorts of exotic phases, as you vary the temperature, pressure, or composition of the material. Constructing “phase diagrams” which show what the different phases of a material should be as a function of the varying parameters is one of the main preoccupations of physicists.

How can these phase diagrams be constructed? Condensed matter physicists tend to use the following algorithm. First, from arguments about the microscopic physics, construct a simple model of the local interactions of the molecules making up the material. Second, choose some method to approximately compute the “free energy” for that simple model. Finally, find the minima of the approximate free energy as a function of the adjustable parameters like the temperature. The phase diagram can be constructed by determining which phase has the lower free energy at each point in parameter space.

If the results disagree with experiment, your model is too simple or your approximation for the free energy is not good enough, so you need to improve one or the other or both; otherwise write up your paper and submit it for publication.

To illustrate, let’s consider magnetism. Although it is less familiar than the phase transition undergone by water, magnets also have a phase transition from a magnetized “frozen” phase at low temperature to an unmagnetized “paramagnetic” phase at high temperature.

The simplest model of magnetism is the Ising model. I’ve discussed this model before; to remind you, I’ll repeat the definition of the ferromagnetic Ising model: “In this model, there are spins at each node of a lattice, that can point ‘up’ or ‘down.’ Spins like to have their neighbors point in the same direction. To compute the energy of a configuration of spins, we look at all pairs of neighboring spins, and add an energy of -1 if the two spins point in the same direction, and an energy of +1 if the two spins point in opposite directions. Boltzmann’s law tells us that each configuration should have a probability proportional to the exp(-Energy[configuration] / T), where T is the temperature.”

Of course, as defined, the Ising model is a mathematical object that can be, and has been, studied mathematically independent of any relationship to physical magnets. Alternatively, the Ising model can be simulated on a computer.
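
For readers who want to try this, here is a minimal sketch of one common way to simulate the model, the single-spin-flip Metropolis algorithm, on a small square lattice with periodic boundaries. The lattice size, number of sweeps, starting configuration, and function names are all illustrative assumptions, and this is not the code behind any particular applet.

```python
import math
import random

def neighbors(i, j, n):
    """Four nearest neighbors on an n x n lattice with periodic boundaries."""
    return [((i + 1) % n, j), ((i - 1) % n, j), (i, (j + 1) % n), (i, (j - 1) % n)]

def metropolis_sweep(spins, T, rng):
    """Attempt n*n single-spin flips using the Metropolis acceptance rule."""
    n = len(spins)
    for _ in range(n * n):
        i, j = rng.randrange(n), rng.randrange(n)
        # Energy is -1 per aligned neighbor pair and +1 per opposed pair, so
        # flipping spin (i, j) changes the energy by 2 * spin * (sum of neighbors).
        local = sum(spins[a][b] for a, b in neighbors(i, j, n))
        delta_e = 2 * spins[i][j] * local
        if delta_e <= 0 or rng.random() < math.exp(-delta_e / T):
            spins[i][j] = -spins[i][j]

def magnetization(spins):
    """Average spin value, between -1 (all down) and +1 (all up)."""
    n = len(spins)
    return sum(map(sum, spins)) / (n * n)

def simulate(n=8, T=1.0, sweeps=50, seed=0):
    """Start from all spins up, run some sweeps, return the final magnetization."""
    rng = random.Random(seed)
    spins = [[1] * n for _ in range(n)]
    for _ in range(sweeps):
        metropolis_sweep(spins, T, rng)
    return magnetization(spins)

if __name__ == "__main__":
    for T in (1.0, 5.0):
        print(T, simulate(T=T))
```

Running `simulate` at a low and a high temperature gives a quick feel for the two regimes, although a lattice this small only hints at the sharp transition of an infinite system.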

Simulations (see this applet to experiment for yourself) show that at low temperatures (and if the dimensionality of the lattice is at least 2), the spins will over time tend to align, so that either most of them point up or most of them point down. The natural symmetry between up and down is “broken.” At low temperatures one will still find “domains” of spins pointing in the “wrong” direction, but these domains last only temporarily.

At high temperatures, on the other hand, each spin will typically fluctuate between pointing up and down, although again domains of like-pointing spins will form. The typical time for a spin to switch from pointing up to pointing down will increase as the temperature decreases, until it diverges towards infinity (as the size of the lattice approaches infinity) as one approaches the critical temperature from above.

Intuitively, the reason for this behavior is that at low temperature, the configurations where all the spins point in the same direction have a much lower energy, and thus a much higher probability, than other configurations. At high temperatures, all the configurations start having similar probabilities, and there are many more configurations that have equal numbers of up and down spins compared to the number of aligned configurations, so one typically sees the more numerous configurations.
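
This counting argument can be checked directly on a lattice small enough to enumerate. The sketch below, which assumes a 3 x 3 lattice with free boundaries (my choice, to keep the enumeration at 512 configurations), computes the total Boltzmann probability of the two fully aligned configurations at a given temperature.

```python
import itertools
import math

def energy(spins, n):
    """Energy of an n x n configuration (row-major tuple of +1/-1), free boundaries."""
    e = 0
    for i in range(n):
        for j in range(n):
            s = spins[i * n + j]
            if i + 1 < n:
                e -= s * spins[(i + 1) * n + j]  # bond to the spin below
            if j + 1 < n:
                e -= s * spins[i * n + j + 1]    # bond to the spin to the right
    return e

def aligned_probability(n, T):
    """Total probability of the all-up and all-down configurations."""
    configs = itertools.product((-1, 1), repeat=n * n)
    partition = sum(math.exp(-energy(c, n) / T) for c in configs)
    all_up = (1,) * (n * n)
    # All-down has the same energy as all-up, hence the factor of 2.
    return 2 * math.exp(-energy(all_up, n) / T) / partition

if __name__ == "__main__":
    for T in (0.5, 2.0, 100.0):
        print(T, aligned_probability(3, T))
```

At low temperature nearly all of the probability sits on the two aligned configurations, while at very high temperature their share approaches 2 out of 512, simply because every configuration becomes about equally likely.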

This balance between energetic considerations (which make the spins align) and entropic considerations (which make the spins favor the more numerous unaligned configurations) is captured by the “free energy” F, which is given by the equation F=U-TS, where U is the average energy, S is the entropy, and T is the temperature. At low temperatures, the energy dominates the free energy, while at high temperatures, the entropy dominates.
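
As one concrete instance of F=U-TS, the sketch below uses the ordinary textbook mean-field approximation, in which spins are treated as independent with average magnetization m. Under that assumption the energy per spin is -(z/2)m² for a lattice with z neighbors per spin (z = 4 for the square lattice), and the entropy per spin is that of a biased coin. The grid search and all names are illustrative choices; this is the crude approximation, not a systematically corrected one.

```python
import math

def entropy(m):
    """Entropy per spin for independent spins with average magnetization m."""
    s = 0.0
    for p in ((1 + m) / 2, (1 - m) / 2):
        if p > 0:
            s -= p * math.log(p)
    return s

def free_energy(m, T, z=4):
    """Mean-field free energy per spin, F = U - T*S with U = -(z/2) * m**2."""
    return -(z / 2) * m * m - T * entropy(m)

def equilibrium_magnetization(T, z=4, grid=2001):
    """Non-negative m that minimizes the mean-field free energy (grid search)."""
    candidates = [k / (grid - 1) for k in range(grid)]
    return min(candidates, key=lambda m: free_energy(m, T, z))

if __name__ == "__main__":
    for T in (1.0, 2.0, 3.0, 4.0, 5.0):
        print(T, equilibrium_magnetization(T))
```

In this approximation the magnetization vanishes for T above z, so the predicted critical temperature for the square lattice is 4, well above the exactly known value of about 2.27. That gap is the kind of error a better free-energy approximation is meant to reduce.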

All this intuition may be helpful, but as I said, the Ising model is a mathematical object, and we should be able to find approximation methods which let us precisely calculate the critical temperature. It would also be nice to be able to precisely calculate other interesting quantities, like the magnetization as a function of the temperature, or the susceptibility (which is the derivative of the magnetization with respect to an applied field) as a function of the temperature, or the specific heat (which is the derivative of the average energy with respect to the temperature) as a function of the temperature. All these quantities can be measured for real magnets, so if we compute them mathematically, we can judge how well the Ising model explains the magnets.

This post is getting a bit long, so I’ll wait until my next post to discuss in more detail some useful methods I have worked on for systematically and precisely computing free energies, and the other related quantities which can be derived from free energies.

Tags:free energy, Ising model, phase transitions
Posted in Physics, Science, Statistical Physics | 1 Comment »

Gallager’s LDPC error-correcting codes

September 28, 2007

Error-correcting codes are a technology that most people don’t think much about, if they even know they exist, but these codes work quietly in the background to enable such things as cell phones and other wireless technology, cable and satellite TV, and also the internet, including DSL, fiber-optic communications, and good old-fashioned dial-up modems. The modern communications revolution could not have begun without these codes.

So what’s the idea behind these codes? There’s a lot to say, and many textbooks have been written on the subject, so I’ll only give the briefest of introductions. [Some excellent textbooks I recommend include MacKay’s textbook which I’ve already reviewed, McEliece’s “Theory of Information and Coding”, and Lin and Costello’s “Error Control Coding.” See also this post for two forthcoming books available online.] EDIT: I’ve added some more information about LDPC decoders, with pointers to available software, in this post.

The basic idea is that we want to transmit some bits which represent some piece of text or picture or something. Unfortunately, when we transmit those bits, they need to travel through some channel (say a wire or through the air) and when they are received, the receiver only gets a noisy version of each bit. For example, each bit might be flipped independently from a 0 to a 1 or vice versa with some small probability (this is called the binary symmetric channel; many other channel models exist too).

To combat this noise, we send extra bits; so instead of sending say the 1000 bits that represent our message, we might send 2000, where the extra 1000 “check” bits have some known relationship to the original 1000. Both the transmitter and receiver agree on that relationship ahead of time; that is the “code.” Of course, all 2000 bits are now subject to the noise, so some of those extra bits could be flipped. Nevertheless, if the noise is small enough, the receiver can try to “decode” the original 1000 bits by finding the configuration of the 2000 bits which obeys all the constraints and is most probable.
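
Here is a scaled-down sketch of the same idea, using 4 message bits and 3 check bits instead of 1000 and 1000. The (7,4) Hamming code is my choice for illustration and is not a code discussed in this post. The decoder simply tries every codeword and picks the one that differs from the received bits in the fewest positions, which is the most probable codeword on a binary symmetric channel whose flip probability is below 1/2.

```python
import itertools

# Parity-check matrix of the (7,4) Hamming code: every row must have even parity.
H = [
    [1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0],
    [0, 1, 1, 1, 0, 0, 1],
]

def encode(message):
    """Append three check bits to four message bits so that all checks in H hold."""
    d1, d2, d3, d4 = message
    return [d1, d2, d3, d4,
            (d1 + d2 + d4) % 2,
            (d1 + d3 + d4) % 2,
            (d2 + d3 + d4) % 2]

def satisfies_checks(word):
    """True if the 7-bit word obeys every parity constraint of the code."""
    return all(sum(h * b for h, b in zip(row, word)) % 2 == 0 for row in H)

CODEWORDS = [encode(m) for m in itertools.product((0, 1), repeat=4)]

def decode(received):
    """Return the message bits of the codeword closest to the received word."""
    best = min(CODEWORDS, key=lambda c: sum(a != b for a, b in zip(c, received)))
    return best[:4]

if __name__ == "__main__":
    sent = encode([1, 0, 1, 1])
    noisy = list(sent)
    noisy[2] ^= 1  # the channel flips one bit
    print(sent, noisy, decode(noisy))
```

Brute-force search over all codewords is only feasible for toy codes like this one; with 1000 message bits there are far too many codewords to enumerate, which is why efficient decoders matter.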

In 1948, Claude Shannon proved a theorem that essentially said that if the noise in the channel is small enough, and if the number of extra bits that you are willing to send per original bit is large enough, then one can design very long codes that, if optimally decoded, will always remove all the noise and recover the transmitted message.

(By the way, it is this amazing property that codes can remove 100% of the noise that means that we can watch crystal-clear high-definition TV coming in over the airwaves, something I very much appreciate when I watch a football game these days. When advertisers talk about “digital this” or “digital that,” they really mean “error-corrected digital”.)

As an example of Shannon’s theorem, if one is willing to use one extra bit for every original bit, and the fraction of bits flipped by the binary symmetric channel is less than the Shannon limit of about 11%, his theorem says that codes exist that will reliably remove all the noise. However, Shannon’s proof was non-constructive; he didn’t tell us what these wonderful codes were. Shannon also proved that if the noise is higher than the “Shannon limit,” no codes exist that can reliably correct the noise.
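
The 11% figure can be reproduced from the standard capacity formula for the binary symmetric channel, C = 1 - H(p), where H is the binary entropy function in bits and p is the flip probability. Sending one extra bit per original bit corresponds to a rate of 1/2, so the limit is the value of p at which the capacity drops to 1/2. The sketch below finds it by bisection; the function names are illustrative.

```python
import math

def binary_entropy(p):
    """Entropy in bits of a coin that comes up heads with probability p."""
    if p in (0.0, 1.0):
        return 0.0
    return -p * math.log2(p) - (1 - p) * math.log2(1 - p)

def bsc_capacity(p):
    """Capacity in bits per transmitted bit of a binary symmetric channel."""
    return 1.0 - binary_entropy(p)

def shannon_limit(rate, tol=1e-12):
    """Flip probability in [0, 1/2] at which the capacity equals the rate."""
    lo, hi = 0.0, 0.5
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if bsc_capacity(mid) > rate:
            lo = mid  # capacity still exceeds the rate, so more noise is tolerable
        else:
            hi = mid
    return (lo + hi) / 2

if __name__ == "__main__":
    print(shannon_limit(0.5))
```

Bisection works here because the capacity decreases steadily as p goes from 0 to 1/2.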

So error-correcting coding theory deals with the problems of designing codes, and efficient encoders and decoders for those codes, that work as close to the Shannon limit as possible. Many theorists invented many interesting and useful codes and encoders and decoders over the years, but until the 1990’s, it still seemed a distant dream to most coding theorists that we would be able to find practical codes that performed near the Shannon limit.

What is very strange is that the best codes and decoders that were discovered in the 1990’s were actually a rediscovery of codes and decoders invented by Robert Gallager in the early 1960’s, for his Ph.D. thesis. Gallager’s thesis introduced “low density parity check” (LDPC) codes, and their decoding algorithm, the famous “belief propagation” decoding algorithm. His thesis also introduced many other important ideas in coding theory, including “density evolution,” simplified “bit-flipping decoders,” and analysis methods for optimal LDPC decoders. It is a truly remarkable work that every aspiring coding theorist should read. Fortunately, it’s available online.

How is it possible that this work was forgotten? Well, in 1968, Robert Gallager wrote a magnificent textbook on information theory, called “Information Theory and Reliable Communication,” where he explained most of the information and coding theory known at the time, but neglected to talk about LDPC codes! I’m not sure why. Possibly he thought that he’d already covered the material in the 1963 monograph that was made from his thesis, or perhaps he didn’t think LDPC codes were practical at the time. In fact, the decoder was too complex for 1960’s technology, but Moore’s Law took care of that, although only now are LDPC codes widely replacing previously-developed codes in communications systems.

So the moral of the story is: if you write a ground-breaking Ph.D. thesis about a remarkable technology that is 30 years ahead of its time, please don’t forget to mention it in your classic textbook a few years later. Not a moral too many of us have to worry about, but still…

Tags:belief propagation, channel coding, digital communications, Error-correcting Codes, LDPC codes, Robert Gallager, Shannon limit
Posted in Algorithms, Books, Error-correcting Codes, Reviews, Science, Technology | 1 Comment »

Multicellular Logic Circuits, Part III: A Model

September 26, 2007

In Part I and Part II of this series, I discussed genetic algorithms and why we might want to create artificial machines that begin life as a single cell, and develop into networks of identical communicating cells. In this post, I want to begin describing a model that works along these lines.

The model is a highly stylized and simplified cartoon of biological multicellular organisms. It is my attempt to make the simplest model possible that captures the essence of what is happening in biology. So understand that biology is more complicated than this model; but the goal is a model stripped down to those essential elements that cannot be taken away if one wants something that looks like life. Thus, the model is proposed in the spirit of the Ising model of magnets in statistical physics; the simplest model that captures the general behavior we are looking for.

The first question is what do we want our machine (or “circuit” or “network” or “organism”; I will use these terms interchangeably) to do? As is quite conventional in hardware design, I will presume the organism receives some input signals from the world, and it is supposed to produce some desired output signal, which depends on the inputs it has received at the current and previous times. Thus, the circuit should in general be capable of creating memories that let it store something about previous inputs.

The organism begins its life as a single cell, and then has two phases in its life, a dynamic “embryonic” phase and a static “adult” phase. During the embryonic phase, the cells in the organism can undergo developmental events, primarily cell duplication, but also perhaps cell death or cell relocation, to sculpt out the final network of communicating cells. After the embryonic phase is complete (say after a fixed amount of time has passed, or some signal is generated by the circuit) the adult phase is entered. The network is static in structure during the adult phase. It is during the adult phase that the network can be tested to see whether it properly computes the desired input-output function. The figure above is a pictorial representation of the model that hopefully makes clear what I have in mind.

Each of the cells in the network will have an internal structure, defined primarily by “logic units” which send signals to each other. The computations performed by the organism will simply be the computations performed by the logic units inside of its cells. The details of what the logic units do, and how they are connected to each other, are specified by a “genome” or “program” for the organism.

Look at the figure below for a peek inside an individual cell in the model. Each cell will have an identical set of logic units, with identical connections between the logic units.

The logic units compute an output according to some fixed function of their inputs. They transmit that output after some delay, which is also part of their fixed function. The output of one logic unit will be the input of another; they send “signals” to each other.

These signals are of various types (see the above figure). The first type of signal, called a “factor signal,” will always go from a logic unit to another logic unit in the same cell. The second type of signal, called an “inter-cellular signal,” will always go from a logic unit to a logic unit in a different cell. The third type of signal, called a “developmental output signal,” will not actually go to another logic unit, but will be a signal to the cell development apparatus to perform some important development event, such as duplication or programmed cell death. Finally, the fourth type of signal, called a “developmental input signal,” will be used by the cell development apparatus to signal that some type of cell development event has occurred, and will serve as an input to logic units.
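
To make the description of logic units and signal types a little more tangible, here is a purely hypothetical sketch of how they might be represented in code. Every name below (`SignalType`, `LogicUnit`, `run_cell`) and the discrete-time update rule are assumptions made for illustration; they are not taken from the actual model or its implementation, and only factor signals within a single cell are simulated.

```python
from dataclasses import dataclass
from enum import Enum, auto
from typing import Callable, Dict, List, Tuple

class SignalType(Enum):
    FACTOR = auto()                # to a logic unit in the same cell
    INTER_CELLULAR = auto()        # to a logic unit in a different cell
    DEVELOPMENTAL_OUTPUT = auto()  # to the cell development apparatus
    DEVELOPMENTAL_INPUT = auto()   # from the cell development apparatus

@dataclass(frozen=True)
class LogicUnit:
    name: str
    inputs: Tuple[str, ...]       # names of the units whose outputs it reads
    function: Callable[..., int]  # fixed function of the inputs
    delay: int                    # time steps before the output is transmitted (>= 1)
    output_type: SignalType = SignalType.FACTOR

def run_cell(units: List[LogicUnit], initial: Dict[str, int], steps: int) -> Dict[str, List[int]]:
    """Simulate one cell: each unit's output at time t is its function applied
    to its inputs as they were `delay` steps earlier (hypothetical rule)."""
    history = {u.name: [initial[u.name]] for u in units}
    for t in range(1, steps + 1):
        for u in units:
            source_time = max(0, t - u.delay)
            args = [history[name][source_time] for name in u.inputs]
            history[u.name].append(u.function(*args))
    return history

if __name__ == "__main__":
    # A unit that reads its own delayed output and inverts it acts as a clock.
    toggle = LogicUnit("toggle", ("toggle",), lambda x: 1 - x, delay=1)
    print(run_cell([toggle], {"toggle": 0}, 4))
```

Even this toy version shows how a fixed function plus a delay can hold state over time, which is the kind of memory the circuit needs in order to respond to previous inputs.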

Remember that the initial cell (the “fertilized egg”) will need to have a set of logic units that enable it to automatically create the adult network, so it must effectively contain the instructions for development as well as for the adult circuit. It might seem hard to imagine that this can work, but it can. In the next post in this series, I will discuss in more detail the process of development in this model, and then we will be in position to look at some interesting multicellular circuits that I have designed.

If you don’t want to wait, you can visit this page to find a PDF and a PowerPoint version of a talk I gave on the subject at a conference in Santa Fe in May 2007, although unfortunately, it might be hard to decipher without my explanation…

Tags:bio-inspired models, development, multicellular logic circuits
Posted in AI, Biology, Computer Science, Science | Leave a Comment »

Talking about Probabilistic Robotics

September 23, 2007

Sebastian Thrun is a professor of computer science and electrical engineering at Stanford, and director of the Stanford Artificial Intelligence Laboratory. He was the leader of Stanford’s team which won the $2 million first prize in the 2005 DARPA Grand Challenge, which was a race of driver-less robotic cars across the desert, and also leads Stanford’s entry into the 2007 DARPA Urban Challenge.

One of the ingredients in the Stanford team’s win was their use of “probabilistic robotics,” which is an approach based on the recognition that all sensor readings and models of the world are inherently subject to uncertainty and noise. Thrun, together with Wolfram Burgard and Dieter Fox, has written the definitive text on probabilistic robotics, which will be a standard for years to come. If you are seriously interested in robotics, you should read this book. (The introductory first chapter, which clearly explains the basic ideas of probabilistic robotics, is available as a download here.)

The Laboratory of Intelligent Systems at the Swiss École Polytechnique Fédérale de Lausanne (EPFL) hosts the superb “Talking Robots” web-site, which consists of a series of podcast interviews with leading robotics researchers. I noticed that the latest interview is with Thrun, and liked it quite a bit; it is well worth downloading to your iPod or computer.

You can watch Thrun speaking about the DARPA Grand Challenge at this Google TechTalk.

Tags:artificial intelligence, DARPA Grand Challenge, probabilistic robotics, Robotics, Sebastian Thrun
Posted in AI, Algorithms, Books, Computer Science, Inference, Probability, Science, Technology | Leave a Comment »

Artificial Intelligence: A Modern Approach

September 20, 2007

“Artificial Intelligence: A Modern Approach,” by Stuart Russell (professor of computer science at UC Berkeley) and Peter Norvig (head of research at Google) is the best-known and most-used textbook about artificial intelligence, and for good reason; it’s a great book! The first edition of this book was my guide to the field when I was switching over from physics research to computer science.

I feel almost embarrassed to recommend it, because I suspect nearly everybody interested in AI already knows about it. So I’m going to tell you about a couple related resources that are maybe not as well-known.

First, there is the online code repository for the algorithms in the book, in Java, Python, and Lisp. Many of the algorithms are useful beyond AI, so you may find for example that the search or optimization algorithm that you are interested in has already been written for you. I personally have used the Python code, and it’s really model code from which you can learn good programming style.

Second, if you haven’t ever visited Peter Norvig’s web-site, you really should. I particularly recommend his essays “Teach Yourself Programming in Ten Years,” “Solving Every Sudoku Puzzle,” and “The Gettysburg Powerpoint Presentation.”

Tags:artificial intelligence, Peter Norvig, python code
Posted in AI, Algorithms, Computer Science, Inference, Programming, Reviews, Science, Software | 2 Comments »

Cynthia Kenyon’s Long-lived Worms

September 19, 2007

Professor Cynthia Kenyon is a pioneering researcher in the biology of aging. A couple years ago, she presented a Harvey Lecture at Rockefeller University on her work; that lecture was similar to the one I heard her give at last year’s Woods Hole summer school course on aging. I think that it’s worth highlighting some of the things she has to say.

“We began our studies in the early 1990s. At that time, and for years before, many people assumed that aging was a haphazard process, not subject to regulation. Our tissues just break down, and we die. But the more I thought about it, the more I started to question this view. A mouse lives two years, whereas a bat can live 30 years or more. A rat lives three years; a squirrel, 25. These animals differ by their genes, so there must be genes that affect aging. Also, nothing in biology seems to “just happen”; everything seems to be regulated, often in quite an extraordinary way.

My experience as a developmental biologist sharpened my thoughts about aging. People were once very skeptical about looking for developmental genes. Treating frog embryos with acid can produce a second head, and inhibiting pyrimidine synthesis in flies produces small wings, so many people thought that genes affecting development would also affect things like the Krebs cycle, or pH. They were wrong. There is a dedicated regulatory circuitry for pattern formation. In addition, many people thought that developmental mechanisms would differ completely in different kinds of animals, but again they were wrong. In fact, the degree of evolutionary conservation is striking. So it seemed to me that something as fundamental as aging might also be subject to regulation. Maybe there would be a molecular longevity “dial,” like a thermostat, that is universal but set to run at different rates in different kinds of animals. The dial would be turned up in mice (which age quickly) and down in bats (which age slowly). I wrote extensively about this in the 1990’s (Kenyon, 1996, 1997), suggesting, for example, that aging might be regulated by something like the heterochronic genes of C. elegans, which control the timing of developmental events.”

…

“Since we obtained such a long lifespan when we killed the gonads of daf-2 mutants, we wondered what would happen if we reduced daf-2 activity even more in these animals. Using a stronger daf-2 allele would run the risk of triggering dauer formation, but we found that we could dodge dauer formation if we subjected long-lived daf-2(e1368) mutants to daf-2 RNAi soon after hatching. When we did this, and killed the gonads as well, the animals lived six times as long as normal (Fig. 2.16). Incredibly, the animals remained healthy and vigorous for a very long time. In fact, when Nuno Arantes-Oliveira, the graduate student doing this work, showed two 144-day-old animals, still moving around, to other lab members and asked them to guess the age of the animals, they reckoned five days! [For a movie of these two spunky animals, see Arantes-Oliveira et al. (2003).] It is remarkable that with just a few minor changes, it is possible to produce such an enormous lifespan extension (the equivalent of 500 years in humans) with no obvious effect on the vitality of the animals.”

…

“If we really could live longer, remaining youthful and disease-free, why haven’t scientists been working on this already? First, as I said, they didn’t think it was possible, since aging was thought to be unruly and random. Second, and even more important, we haven’t had any role models to emulate, primates that shoot rockets to the moon, go to the opera, and live for 300 years. If we did, we might already know how to stay young and live much longer than we do. We invented airplanes because we could see birds could fly. Now that we know that animals can live longer than they do, perhaps soon we will learn how to extend our own youthfulness and lifespan. It may not be that difficult. Since there are short-lived and long-lived insects, birds, and mammals, longevity must have evolved not just once but many times. Maybe the path to increased longevity is in us already, in the form of a network of genes and proteins, waiting to be nudged in just the right way.”

I recommend you read the whole thing–it’s quite readable, and the scientific results are breathtaking.

And if you’re interested, here is a video from earlier this year with Charlie Rose interviewing a panel of biologists about the remarkable progress that has been made in aging research recently. Members of the panel include Kenyon and Lenny Guarente, another leader in the field whose book I previously reviewed.

Tags:Aging, C. Elegans, Cynthia Kenyon
Posted in Aging, Biology, Health, Science | 1 Comment »

Multicellular Logic Circuits, Part II: Cells

September 18, 2007

In my post “Multicellular Logic Circuits, Part I: Evolution,” I discussed evolution and genetic algorithms; I want to continue that discussion here.

There are two salient facts of biology that are completely inescapable. The first is that all organisms are shaped by the process of evolution. The second is that all organisms are constructed from cells.

Furthermore, all complex multicellular organisms begin life as a single cell, and undergo a process of development through cell division to mature into an adult. And no matter how different any two organisms may be on the gross macroscopic level that we are used to, inside their cells the chemical processes of life are fundamentally very similar.

Thus it is no accident that the titles of the two leading textbooks in molecular biology are The Molecular Biology of the Gene by Watson et al. and The Molecular Biology of the Cell by Alberts et al. [These are both great books. This link to the first chapter of MBOC is an excellent entry point into modern biology. And if you are serious about learning biology, I also strongly recommend the companion Molecular Biology of the Cell: A Problems Approach, by Wilson and Hunt, which will force you to think more actively about the material.]

It therefore seems reasonable that if we want to construct artificial systems that achieve the performance of natural ones, we should consider artificially evolving a system constructed from cells.

Although there are typically many different cell types in a mature multi-cellular organism, all the different cells of the organism, with the exception of sperm and egg cells, share an identical genetic specification in their DNA. The different behavior of cells with identical genetic specifications is the result of the cells having different histories and being subjected to different environments.

More specifically, the behavior of a biological cell is controlled by complex genetic regulatory mechanisms that determine which genes are transcribed into messenger RNA and then translated into proteins. One very important regulatory mechanism is provided by the proteins called “transcription factors” that bind to DNA regulatory regions upstream of the protein coding regions of genes, and participate in the promotion or inhibition of the transcription of DNA into RNA. The different histories of two cells might lead to one having a large concentration of a particular transcription factor, and the other having a low concentration, and thus the two cells would express different genes, even though they had identical DNA.
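To make the last point concrete, here is a minimal sketch in Python of two cells that share one genetic specification but express different genes because their transcription factor concentrations differ. Everything in it is a hypothetical illustration: the gene and factor names, the single-factor-per-gene rule, and the thresholds are assumptions chosen for brevity, not a model of any real regulatory system.

```python
from dataclasses import dataclass

# Shared "genome" (hypothetical): each gene names the one transcription factor
# that regulates it, whether that factor activates or represses the gene,
# and the concentration at which the factor is considered bound.
GENOME = {
    "gene_a": ("tf_x", "activate", 0.5),
    "gene_b": ("tf_x", "repress", 0.5),
    "gene_c": ("tf_y", "activate", 0.2),
}


@dataclass
class Cell:
    # Transcription factor name -> concentration; this is the cell's "history".
    tf_levels: dict

    def expressed_genes(self):
        """Return the set of genes this cell expresses under the shared GENOME."""
        expressed = set()
        for gene, (tf, mode, threshold) in GENOME.items():
            bound = self.tf_levels.get(tf, 0.0) >= threshold
            # An activator turns a gene on when bound;
            # a repressor turns it off when bound.
            if (mode == "activate" and bound) or (mode == "repress" and not bound):
                expressed.add(gene)
        return expressed


# Identical genome, different transcription factor concentrations.
cell_1 = Cell({"tf_x": 0.9, "tf_y": 0.1})
cell_2 = Cell({"tf_x": 0.1, "tf_y": 0.1})

print(cell_1.expressed_genes())
print(cell_2.expressed_genes())
```

Both cells read the same GENOME table; only the concentrations they carry differ, and that alone is enough to make them express different genes.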

Another important mechanism that controls the differential development of different types of cells in a multi-cellular organism is the biochemical signaling sent between cells. Signals such as hormones have the effect of directing a cell down a particular developmental pathway.

In general, the transcription factors, hormones, and multitude of other control mechanisms used in biological cells are organized into a network which can be represented as a “circuit” where the state of the system is characterized by the concentrations of the different biochemical ingredients. In fact, biologists are now using wiring diagrams to help summarize biological circuits; see for example, the “Biotapestry editor” developed by Eric Davidson’s lab at Caltech.

[I strongly recommend Davidson’s recent book The Regulatory Genome: Gene Regulatory Networks in Development and Evolution for an exciting introduction to the burgeoning “evo-devo” field; if you don’t have any background in biology, you may prefer The Coiled Spring, by Ethan Bier for a somewhat more popular account.]

Turning to the problem of designing artificial systems, a natural question is what theoretical advantages exist, from the point of view of designing with evolution, to using an identical genetic specification for all the cells in a multi-cellular organism.

One potential advantage is that relatively small changes to the genetic specification of the organism can concurrently alter the behavior of many different kinds of cells at many different times during the development of the organism. Therefore, if there is the possibility of an advantageous change to the circuitry controlling a cell, then it can be found once and used many times instead of needing to find the same advantageous mutation repeatedly for each of the cells in the organism.

Another related potential advantage is that a highly complicated organism can be specified in a relatively compact way. If each of the trillions of cells in a complex organism like a human had to be separately specified, then the overall amount of information required to describe the human genome would be multiplied more than a trillion-fold. Clearly, it is much more efficient to re-use the identical circuitry in many different types of cells.

In other words, biology uses a strategy of specifying a complex multi-cellular organism by just specifying a single cell: all the other cells in the mature organism are grown organically out of the developmental process. This seems like a strategy worth imitating.
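The strategy can be sketched with a toy example in Python. The "genome" here is just two small functions that every cell runs; the organism starts as one cell and grows by repeated division. The halving rule, the thresholds, and the cell type names are all assumptions made up for illustration, and this is not the model developed later in this series.

```python
# Hypothetical toy model of development from a single cell.
# The two functions below play the role of the one shared genetic specification.

def divide(signal):
    """Shared division rule: one daughter keeps the signal level, the other gets half."""
    return [signal, signal / 2]


def cell_type(signal):
    """Shared differentiation rule, read out against each cell's own internal state."""
    if signal >= 0.5:
        return "stem"
    if signal >= 0.2:
        return "muscle"
    return "nerve"


def develop(initial_signal=1.0, generations=3):
    """Grow an organism from one cell; return the type of every mature cell."""
    cells = [initial_signal]  # the organism begins as a single cell
    for _ in range(generations):
        # Every cell divides using the same rule.
        cells = [daughter for s in cells for daughter in divide(s)]
    return [cell_type(s) for s in cells]


organism = develop()
print(len(organism), sorted(set(organism)))
```

Changing a single threshold in cell_type alters the fate of many cells at once, which is the "found once and used many times" property described above; nothing has to be specified cell by cell.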

On the other hand, the constraint that each cell in an organism should share an identical genetic specification clearly causes complications from the point of view of design. For example, it is important that genes that are designed to function in one type of cell at one point in development not cause problems for a different type of cell at a different point in development. Clearly, good design of the control logic that turns genes on and off is essential to the proper functioning of a multi-cellular organism.

In the next post in this series, I will turn to the construction of a concrete model for multi-cellular circuits that tries to capture, as simply as possible, the essence of what is happening in biology.

Tags:development, Eric Davidson, evolution, genetic algorithms, multicellular organisms
Posted in AI, Algorithms, Biology, Books, Computer Science, Reviews, Science | 2 Comments »

Programming in NetLogo

September 14, 2007

In my previous post about simulating the Ising model with the Metropolis algorithm in NetLogo, I said that I would return and give a walk-through of the amazingly succinct NetLogo code. Actually, I’m not going to do that; NetLogo code is sufficiently readable, and the documentation is sufficiently comprehensive, that there’s no real point.

Instead I want to discuss to what extent NetLogo can be considered a “real” programming language, suitable for work beyond its roots in education. The short answer is that it looks to me like quite a competitive language, which will make a particularly excellent choice for many scientific applications.

NetLogo is optimized for simulations of agents moving in a two-dimensional space. The moving agents are called “turtles” but you can think of them as objects endowed with a lot of built-in methods. It’s quite possible to use the turtles in the same way as objects in other object-based languages (although only a limited form of inheritance is available). For example, a turtle can contain other turtles as variables, and you can create new classes (called “breeds”) of turtles. The other basic objects, with many built-in methods, are the “patches” which tile the 2-d space, the “links” that you can set up between turtles, and the “observer.”

There are an impressive number of built-in primitive procedures, especially for anything that relates to simulations. You can also do all the basic things that you would expect a language to do: open and write to files, process strings, work with lists, etc.

The ability to quickly and easily build a GUI that will work on all platforms is very attractive. I have some experience building GUIs for Mac OS X, using Cocoa and/or the Python bridge PyObjC (which is another worthwhile approach and something that I’ll post about at some point), and I can say that building essentially the same simulation with the same GUI in NetLogo easily takes less than half as much work and code. It is also nice that it is so easy to construct applets and movies.

The syntax is similar to Lisp, but without parentheses, and with a great deal of syntactic sugar to make it look as close to English as possible. It is absolutely an optimal first language for the beginning programmer. My son was amazed that “everything worked, and when it didn’t I could understand the error messages.” He’s not really used to that from his experiences with other languages.

I looked for things that are missing. At first I thought hash tables weren’t there, but those are actually available through an extension. One thing that really is missing is the ability to treat a function as a value. You also can’t define your own special forms or macros, so I suppose that it’s not really Lisp. NetLogo also does not have, aside from in its core area of simulation, much in the way of libraries, but there is the ability to extend the language by writing functions using Java.
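For readers unsure what "treating a function as a value" means, here is a minimal sketch in Python rather than NetLogo (the function and variable names are made up for illustration). A function is a value when it can be passed as an argument, stored in a data structure, and returned like any other piece of data.

```python
def apply_twice(f, x):
    """Take a function as an argument and call it twice."""
    return f(f(x))


def increment(n):
    return n + 1


# Functions stored in a dictionary, just like numbers or strings would be.
rules = {
    "grow": increment,
    "double": lambda n: n * 2,
}

# Look a function up by name and hand it to another function.
result = apply_twice(rules["double"], 3)
print(result)
```

This style is what makes it easy to write generic helpers, such as "apply this update rule to every agent," without hard-coding the rule into the helper.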

It’s easy to learn; but don’t neglect to look at the code examples section of the models library (you’ll need to download NetLogo first). You’ll see how easy it is to do things that take a lot more work in most languages.

Tags:GUI, Java, NetLogo, programming languages
Posted in Algorithms, Computer Science, Programming, Science, Software, Technology | 2 Comments »
