On Blogs and Books

October 12, 2007

Jeff Atwood has a great blog, called “Coding Horror,” about software development. He’s been blogging continuously and very actively since 2004, which I find impressive; check out his archives.

One of his most recent posts is about the book that he’s just completed, on ASP.NET. I can’t say that I’m interested in the subject of his book, but I was interested in what he had to say about writing blogs versus writing books. Basically, he comes down heavily in favor of writing blogs:

“As I see it, for the kind of technical content we’re talking about, the online world of bits completely trumps the offline world of atoms:

  • it’s forever searchable
  • you, not your publisher, will own it
  • it’s instantly available to anyone, anywhere in the world
  • it can be cut and pasted; it can be downloaded; it can even be interactive
  • it can potentially generate ad revenue for you in perpetuity

And here’s the best part: you can always opt to create a print version of your online content, and instantly get the best of both worlds. But it only makes sense in that order. Writing a book may seem like a worthy goal, but your time will be better spent channeling the massive effort of a book into creating content online.”

He also points out that writing books is a lot harder than writing blogs:

“Writing a book is hard work. For me, writing blog entries feels completely organic, like a natural byproduct of what I already do. It’s not effortless by any means, but it’s enjoyable. I can put a little effort in, and get immediate results out after I publish the entry. The book writing process is far more restrictive. Instead of researching and writing about whatever you find interesting at any given time, you’re artificially limited to a series of chapters that fit the theme of the book. You slave away for your publisher, writing for weeks on end, and you’ll have nothing to show for it until the book appears (optimistically) six months down the road. Writing a book felt a lot like old fashioned hard work– of the indentured servitude kind.”

Charles Petzold, an experienced author of programming texts, chimes in here with more details on the declining economics of technical book-writing. Apparently a lot fewer people are buying programming books nowadays, so the financial situation for the authors of these books is getting worse. So Petzold agrees that it makes a lot more sense to write in blog format, but notes that blogs don’t usually pay very well.

Let me just give a different perspective, from someone who is more interested in academic writing than technical writing. Most academics don’t write books in order to make money, and that’s certainly true of the journal articles they write. So for academics, moving from books to blogs has more of the upside and less of the downside than it does for other authors.

I personally find writing a blog a lot more appealing than writing a book. As Atwood points out, you can write about whatever happens to appeal to you on the day you’re writing, and you get much faster feedback. Sure, it’s some work, but all in all, it’s great!

Petzold has another criticism of blogs:

“On the Internet, everything is in tiny pieces. The typical online article or blog entry is 500, 1000, maybe 1500 words long. Sometimes somebody will write an extended “tutorial” on a topic, possibly 3,000 words in length, maybe even 5,000.

It’s easy to convince oneself that these bite-sized chunks of prose represent the optimum level of information granularity. It is part of the utopian vision of the web that this plethora of loosely-linked pages synergistically becomes all the information we need.

This illusion is affecting the way we learn, and I fear that we’re not getting the broader, more comprehensive overview that only a book can provide. A good author will encounter an unwieldy jungle of information and cut a coherent path through it, primarily by imposing a kind of narrative over the material. This is certainly true of works of history, biography, science, mathematics, philosophy, and so forth, and it is true of programming tutorials as well.

Sometimes you see somebody attempting to construct a tutorial narrative by providing a series of successive links to different web pages, but it never really works well because it lacks an author who has spent many months (or a year or more) primarily structuring the material into a narrative form.

For example, suppose you wanted to learn about the American Civil War. You certainly have plenty of online access to Wikipedia articles, blog entries, even scholarly articles. But I suggest that assembling all the pieces into a coherent whole is something best handled by a trained professional, and that’s why reading a book such as James McPherson’s Battle Cry of Freedom will give you a much better grasp of the American Civil War than hundreds of disparate articles.

If I sound elitist, it’s only because the time and difficulty required for wrapping a complex topic into a coherent narrative is often underestimated by those who have never done it. A book is not 150 successive blog entries, just like a novel isn’t 150 character sketches, descriptions, and scraps of dialog.”

As somebody who writes blog posts that typically come in at 500-1500 words, but loves to read books, I want to respond to that.

The web lets us write material in whatever form we want. I’m comfortable with the 500-1500 word post. Other people with popular blogs write 2-sentence posts linking to an article. Paul Graham writes long essays. David MacKay puts drafts of his books online.

The fact is that you can write about whatever you want, whenever you want, in whatever form you want. The most important point is that you no longer need permission to publish. You don’t need a publisher; you are the publisher.

Which brings me to a related point. I find a lot of the current scientific publication process completely bizarre. Scientists write the articles, typeset the articles, review the articles, and edit the articles, and then find that their own articles are not freely available online? And we write articles that are hard to understand because of space limitations? Online there are no space limitations! The entire system lumbers on mostly as if we were still living in the early 20th century, when only a few specialized groups of people had the capability to publish, and delivering journal articles to people was necessarily expensive.

Most of the remaining justification for the current system, compared to a free system where everybody simply published their work online and that was the end of it, is credentialing: peer review credentials articles, and the count of peer-reviewed articles credentials people. Look, I know peer review is important, but given the huge time sink of the current system, and the morale problems it contributes to, I think we should take a closer look at the costs and benefits, and maybe be more open to people who simply publish their work online (see, for example, Perelman’s papers proving the Poincaré conjecture, which were never published in a peer-reviewed journal).

So if you feel the urge, publish yourself online in whatever format suits you. Just try to make the content worthwhile for somebody else in the world, and don’t worry about the rest. [End of rant.]


iTunes U

October 11, 2007


I bought one of the new iPod Nanos last week. One of the main reasons that I wanted one was to be able to listen to and watch the video lectures available at iTunes U while walking or traveling. There’s quite a bit of interesting material available for free download. For example, I’ve been watching lectures from the weekly colloquium of the Stanford Computer Systems Laboratory.

There are also MIT courses, all in video: graduate-level Digital Communications taught by Edison and Shannon medalist David Forney, introductory Biology taught by Eric Lander and Robert Weinberg, and Mathematical Methods for Engineers (look under Mathematics) taught by Gilbert Strang. The podcast section of the iTunes Store also has many videos of entertaining 20-minute talks from the TED conferences held over the last few years.

Of course there’s all sorts of other material available; I’m just pointing out some more academic videos.

It’s somewhat annoying how difficult it is to find video material specifically; much more material is available only as audio, and the video material isn’t singled out in any way. You can also watch all the material directly on your computer; if you want to do that, you should also visit this MIT OpenCourseWare page or this Berkeley webcast page. Both of these pages have a lot of additional video courses available in formats other than the MP4 format used by the iPod.

The iPod nano is very small and light. It’s my first iPod, so I was pleasantly surprised to learn that the earbuds are actually pretty comfortable. The screen is really sharp, but it’s also really tiny, so while it’s OK to glance at occasionally to see what’s going on, I wouldn’t want to watch a long movie on it. The iPod Touch will definitely be a better option for that.

Video Conferencing Using iChat

October 10, 2007

Scientific collaboration is really difficult when you cannot talk directly with the person you’re working with. This has long made collaboration with distant colleagues cumbersome, to say the least. Fortunately, we live in the 21st century, and video conferencing technology now exists that is simple to set up, and works extremely well.

For the last few months, I have been using Apple’s video iChat software, and I can recommend it highly. It seems to have two major applications: connecting family members, and enabling business video meetings.

I have been video-conferencing with my colleague Stark Draper since he went to the University of Wisconsin earlier this summer. Video conferencing reliably works much better than phone conversations, and nearly as well as face-to-face meetings. It might seem surprising that it should be that big an improvement over phone conversations, but in fact human beings are very visual creatures, and a lot of information is conveyed by expressions and gestures. When you talk to somebody via video conference, they really seem like they’re with you in the same room.

If both you and your colleague have a recent Apple notebook or iMac with an iSight camera built in, setting up iChat is very easy; it takes just a couple of minutes in all. You’ll need to sign up for a free trial .Mac account, which lets you use iChat (if you already have some other instant messaging account, like a Jabber account, you can also use that). The free trial lasts for 60 days, after which you need to pay $100 if you want the full benefits of a .Mac account, but even if you don’t pay anything, you will still be able to continue using your account for iChat. Since I wasn’t very interested in the other benefits of .Mac, that’s precisely what I did.

iChat uses the H.264 codec, which gives a very nice sharp image, although occasional video compression artifacts are noticeable (they’re interesting if, like me, you’ve worked on video compression). There is also a very slight delay, which is noticeable, but not too bad. Actually setting up the video connection is trivial; it’s just a matter of knowing your partner’s .Mac account.

Other software like Skype exists for those of you with Windows or Linux. I don’t have any experience with video-conferencing with these services. If you do have such experience, and especially if you can compare with iChat, I’d be interested in your comments. Perhaps the main advantage for the Macs is just the fact that the camera is already built in.

Here’s a video that shows Steve Jobs demoing the new version of iChat that will be released with Leopard, the new version of Mac OS X coming out this month. It will give you a good idea of what a video chat is like; you might suspect that the reality is not as nice as the demo, but that’s essentially the quality I get in my video conferences with Stark.

The upgrades for Leopard actually seem pretty minor to me; it will be nice to be able to share .pdf documents or presentations, but you can already basically do that by emailing the files. And I’m not too excited about the ability to distort my image or use weird backdrops. Originally, there was supposed to be a useful new feature in iChat for Leopard that would let you share your desktop with your video partner, but it’s not clear whether that feature has been moved to another part of Leopard, delayed, or dropped altogether. We’ll soon find out. UPDATE: It looks like the desktop-sharing feature exists after all, which is excellent news, especially for people wanting to help out their computer-challenged friends and relations. However, both sides of the chat will need to have Leopard for this to work.

Enhanced Chess

October 8, 2007

For a long time, Chess was considered the “ultimate test of cerebral fitness,” as the 1980s song “One Night in Bangkok” proclaimed. Its popularity probably peaked in 1972, during the Fischer-Spassky world championship match, which seemed to capture the interest of the entire world. Lately, however, its popularity has seriously declined, for a number of reasons:

1. Humans are no longer the best Chess players in the world. It’s somewhat depressing to study and play a game when you know that a program you can buy for $50 can out-play you, or any other human.

2. Chess opening theory has become extraordinarily detailed, and onerous to study. Very often, grandmaster games begin with 20 or more known theoretical moves. Moreover, nowadays most new theoretical novelties are suggested and analyzed by computer programs used by top players before their games. If a player does not keep up with opening theory, he will be at a disadvantage, but many people find the prospect of studying openings in this way increasingly dehumanizing.

3. Most games between top-flight Chess grandmasters end in a draw, simply because Chess is almost certainly a draw with best play. To take one example, Kramnik won his match against Kasparov in 2000 with 2 wins, no losses, and 14 draws.

I want to propose here a Chess variant, developed with my son Adam, that addresses all of these problems. We call this variant “Enhanced Chess.” In Enhanced Chess, there are no draws, there is no opening theory (or endgame theory) to study, there is no advantage for the first player, and humans should out-class computers for a long time to come. The game is extremely skill-intensive; the more imaginative and “cerebrally fit” competitor will definitely win the game.

The game combines a couple existing Chess variants. The first of these variants is “Bughouse Chess.” Ordinary Bughouse Chess is played by two teams of two players each. On each team, one player plays White and the other plays Black. When you capture one of your opponent’s pieces, you hand it to your partner, who can drop it on his own board rather than making an ordinary move. Pawns may not be dropped on the first or eighth rank, but otherwise there are no restrictions on drops; you can even checkmate with a dropped piece. When a pawn reaches the eighth rank, it is promoted normally, but if the promoted piece is captured, it turns into a pawn again. The game ends when one of the kings is checkmated.

In ordinary Bughouse you may communicate freely with your partner. It is normally played with fast time controls (e.g. 5 minutes for each player for the game) as a very amusing social game with lots of trash-talk. Here’s a video of a typical Bughouse game, where the only unusual thing is that one of the players (at the back left) is Levon Aronian, one of the strongest Chess grandmasters in the world, and a participant in the 2007 World Chess Championship:

In Enhanced Chess, on the other hand, a single player controls both of the boards on his side, so the game is again a two-player game. Moreover, the moves are made in a strict turn-based order. First, Player 1 makes a single White move; then Player 2 makes a Black move followed by a White move, and the players continue to alternate, always making a Black move followed by a White move. Notice that because Player 1 makes only one move on his first turn, and then Player 2 makes two moves, there is no obvious advantage to going first or second.
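
Here is a minimal sketch, in Python, of the turn order just described (nothing beyond a restatement of the rule):

    def turn_schedule(turn):
        """Return (player, colors moved) for a given turn of Enhanced Chess."""
        player = 1 if turn % 2 == 1 else 2      # Player 1 moves on odd turns
        if turn == 1:
            return player, ["White"]            # Player 1 opens with a single White move
        return player, ["Black", "White"]       # afterwards: a Black move, then a White move

    for t in range(1, 5):
        print(t, turn_schedule(t))
    # 1 (1, ['White'])
    # 2 (2, ['Black', 'White'])
    # 3 (1, ['Black', 'White'])
    # 4 (2, ['Black', 'White'])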

Time controls can be as slow as you like; Enhanced Chess is meant as a serious game.

Turn-based Bughouse is a good game, but if it became standard, it would quickly suffer from the problem of excessive opening theory. But that problem can be remedied by using the idea, proposed and popularized by Bobby Fischer for ordinary Chess, of randomizing the initial position. In Enhanced Chess, the initial random position must be identical on the two boards, as in the following diagram of an initial position:

[Diagram: an example starting position, identical on both boards]

Chess with randomized initial positions (sometimes called “Fischer Random Chess” or “Chess960” because of the 960 legal starting positions) is already a somewhat popular variant. Levon Aronian is the current “world champion” of this variant, having beaten Vishy Anand (the current world chess champion in ordinary Chess) in a short match in 2007.

One difficult point in Chess960 is how to handle castling. In Enhanced Chess, we propose to handle this problem by simply forbidding castling. Castling is usually of dubious utility in any Bughouse Chess variant anyway, and there’s no reason to carry along all the complicated rules baggage. There would also be more than 960 legal starting positions, because while Chess960 requires the kings to start between the rooks, in Enhanced Chess that is not necessary. (We still require that the two bishops on each side start on opposite-colored squares.)
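
If you’re curious exactly how many, here’s a small brute-force count in Python (just a sketch to check the arithmetic): keeping only the opposite-colored-bishops constraint gives 2880 distinct starting rows, versus 960 when the king must also stand between the rooks.

    from itertools import permutations

    def count_positions(require_king_between_rooks):
        """Count distinct back-rank setups of K, Q, 2 R, 2 B, 2 N on 8 squares."""
        count = 0
        for arrangement in set(permutations("KQRRBBNN")):
            b1, b2 = [i for i, p in enumerate(arrangement) if p == "B"]
            if b1 % 2 == b2 % 2:            # bishops must sit on opposite-colored squares
                continue
            if require_king_between_rooks:
                r1, r2 = [i for i, p in enumerate(arrangement) if p == "R"]
                k = arrangement.index("K")
                if not (min(r1, r2) < k < max(r1, r2)):
                    continue
            count += 1
        return count

    print(count_positions(True))    # 960  (the Chess960 rule set)
    print(count_positions(False))   # 2880 (Enhanced Chess: only the bishop constraint)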

It would in fact probably be sufficient if, at the beginning of the game, the players alternated in deciding the positions of the pieces, one piece at a time. For example, my first move would decide that the Kings go on b1 and b8. You would then decide that one of the Knights should be placed on f1 and f8, and so on (one needs to forbid placements that put the two bishops on same-colored squares, or that leave no option but for that to happen later). This version would remove any element of luck from the game, and I doubt that studying openings would pay off with even this amount of “pre-processing.”

Making a pair of moves which repeats a position (on the combined boards) is forbidden in Enhanced Chess, and stalemate is a loss. These rules remove any theoretical possibility of a draw.

It is interesting to speculate on what the “correct” result would be in Enhanced Chess if God played against God. There is no obvious end to the game until positions start repeating. It’s clearly just idle speculation, but how many moves do you think would need to occur in the perfect game tree before one side won?

In practice, I don’t think that Enhanced Chess games would typically last any longer than ordinary Chess games, although a big difference is that players would definitely need to rely much more on intuition and much less on knowledge and calculation. One other big difference, which would definitely be a drawback for some players, is that reduced-material end-games would cease to exist. Enhanced Chess would be much like Shogi (Japanese Chess) on steroids, but maintaining all the familiar pieces (and removing the advantage of moving first).

Finally, I should address why I think computer programs will play Enhanced Chess poorly, at least in comparison to ordinary Chess. The main point is that programs basically rely on search algorithms, which would be hobbled by the much larger number of possible moves in each position (remember that on each turn you get one move on each board, and the drops add many more possibilities as well, so there might be roughly 100 options on each board, or 10,000 combinations per turn for the computer to consider, instead of 30 or so). Evaluating a position will also be much more difficult for a computer in Enhanced Chess than in ordinary Chess, because material and mobility are less important. Based on these points, I would anticipate that Enhanced Chess programs would perform significantly worse than Shogi (Japanese chess) programs (where there are drops, but only one move per turn), and even in Shogi, computer programs still fall somewhat short of the top level.

In the 15th century, a series of major rules modifications were accepted into Chess (the modern Queen, Bishop, and Pawn moves, Castling, and en passant) that succeeded in reviving what had become a rather dull game. It seems to me that something similar will need to occur again before Chess can reverse its decline and regain its status as the King of Games.

 

Rock-Paper-Scissors Without the Luck

October 6, 2007

As I’ve mentioned before (click here for his “Duke and Pawn” game, or click here for a more game-theoretical type of game), my son Adam enjoys designing games. Here’s a game he designed, with very simple rules but complex play, on the theme of Rock-Paper-Scissors.

1. The two players are White and Black, and they each have three pieces, a Rock, a Paper, and a Scissors, that they set up on a 6 by 6 board as shown below.

[Diagram: the initial setup of the six pieces on the 6 by 6 board]

2. White moves first, and then the players alternate.

3. Normally, a player moves a single one of his pieces one square orthogonally (up or down or right or left) on each move. You may not place a second piece on the same square except to capture. (So for example, on the first move, White’s legal moves are to move his Paper up or to the left, or his Scissors up or to the left.)

4. Rocks can capture Scissors, Scissors can capture Papers, and Papers can capture Rocks (of the opposite color in each case), but no other captures are possible. A piece can be moved a second time on the same move if and only if doing so enables it to make a capture on that move (e.g. if a Paper is diagonally below and to the right of an opposing Rock, it could move up and then move left in order to capture the Rock).

5. The object of the game is to move one of your pieces to the square where the opposing Rock is placed at the beginning of the game. If you do so, you instantly win the game.

6. You may never repeat a position.

7. You must move each turn. If you have no moves, you lose.

The game has no draws. One important and non-obvious point of strategy: if you exchange your Rock for your opponent’s Scissors, you will have an important material advantage, because your Paper can safely rampage without worrying about any of his pieces, while his Rock still has to worry about your Paper, and his Paper still has to worry about your Scissors.

It might appear that the first player has a significant advantage, but our games haven’t worked out that way at all. By the way, we use Chess pieces for the pieces; rooks for rocks, pawns for papers, and bishops for scissors.

I like the irony that ordinary Rock-Paper-Scissors is a game of pure chance, while Adam’s version involves no chance at all. Have fun!

(EDIT: This game now has a name; it’s called “Cyclotron.”)

 

Princeton Companion to Mathematics

October 5, 2007

From Terence Tao’s excellent blog, I learned about the upcoming Princeton Companion to Mathematics (PCM), a roughly 1000-page survey of and introduction to mathematics at the undergraduate level. It looks like the PCM editors have lined up a distinguished cast of mathematicians to write comprehensible articles covering all of mathematics. The editor-in-chief is Timothy Gowers, who, like Tao, is a Fields Medalist, and who has very recently started his own blog.

Gowers gives instructions for finding a large sample of articles that will appear in the book, but they are not completely transparent, so follow these:

  1. Go here.
  2. Use Username “Guest” and Password “PCM”.
  3. Click on “Resources” in the sidebar.
  4. Click on “Sample articles” in the sidebar.

You’ll find many interesting and highly readable articles in .pdf format.

Computing Free Energies

October 3, 2007

In my last post, I discussed phase transitions, and how computing the free energy for a model would let you work out the phase diagram. Today, I want to discuss in more detail some methods for computing free energies.

The most popular tool physicists use for computing free energies is “mean-field theory.” There seems to be at least one “mean-field theory” for every model in physics. When I was a graduate student, I became very unhappy with the derivations for mean-field theory, not because there were not any, but because there were too many! Every different book or paper had a different derivation, but I didn’t particularly like any of them, because none of them told you how to correct mean-field theory. That seemed strange because mean-field theory is known to only give approximate answers. It seemed to me that a proper derivation of mean-field theory would let you systematically correct the errors.

One paper really made me think hard about the problem: the famous 1977 “TAP” spin glass paper by Thouless, Anderson, and Palmer. They presented a mean-field free energy for the Sherrington-Kirkpatrick (SK) model of spin glasses by “fait accompli,” which added a weird “Onsager reaction term” to the ordinary free energy. This shocked me; maybe they were smart enough to write down free energies by fait accompli, but I needed some reliable mechanical method.

Since the Onsager reaction term had an extra power of 1/T compared to the ordinary energy term in the mean field theory, and the ordinary energy term had an extra power of 1/T compared to the entropy term, it looked to me like perhaps the TAP free energy could be derived from a high-temperature expansion. It would have to be a strange high-temperature expansion though, because it would need to be valid in the low-temperature phase!
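
To make the powers of 1/T concrete, the TAP free energy for the SK model has roughly the following structure (written here in my own schematic notation, with magnetizations m_i, couplings J_ij, β = 1/T, and sums running over spins or pairs of spins):

    -\beta F_{\mathrm{TAP}}(\{m_i\}) =
        -\sum_i \left[ \frac{1+m_i}{2}\ln\frac{1+m_i}{2}
                     + \frac{1-m_i}{2}\ln\frac{1-m_i}{2} \right]        % entropy term, order beta^0
        + \beta \sum_{(ij)} J_{ij}\, m_i m_j                            % mean-field energy, order beta^1
        + \frac{\beta^2}{2} \sum_{(ij)} J_{ij}^2\, (1-m_i^2)(1-m_j^2)   % Onsager reaction term, order beta^2

Each successive term carries an extra power of β, which is what suggested that a small-β (high-temperature) expansion might generate them all.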

Together with Antoine Georges, I worked out that the “high-temperature” expansion (it might better be thought of as a “weak interaction expansion”) could in fact be valid in a low-temperature phase, if one computed the free energy at fixed non-zero magnetization. This turned out to be the key idea; once we had it, it was just a matter of introducing Lagrange multipliers and doing some work to compute the details.

[Figure: a diagrammatic representation of the expansion, taken from the paper]

It turned out that ordinary mean-field theory is just the first couple of terms in a Taylor expansion. Computing more terms lets you systematically correct mean-field theory, and thus compute the critical temperature of the Ising model, or any other quantity of interest, to better and better precision. The picture above is a figure from the paper, representing the expansion in a diagrammatic way.

We found out, after doing our computations but before submitting the paper, that in 1982 Plefka had already derived the TAP free energy for the SK model from that Taylor expansion, but for whatever reason, he had not gone beyond the Onsager correction term or noted that this was a technique that was much more general than the SK model for spin glasses, so nobody else had followed up using this approach.

If you want to learn more about this method for computing free energies, please read my paper (with Antoine Georges) “How to Expand Around Mean-Field Theory Using High Temperature Expansions,” or my paper “An Idiosyncratic Journey Beyond Mean Field Theory.”

This approach has some advantages and disadvantages compared with the belief propagation approach (and the related Bethe free energy), which is much more popular in the electrical engineering and computer science communities. One advantage is that the free energy in the high-temperature expansion approach is just a function of simple one-node “beliefs” (the magnetizations), so it is computationally simpler to deal with than the Bethe free energy and belief propagation. Another advantage is that you can make systematic corrections; belief propagation can also be corrected with generalized belief propagation, but the procedure is less automatic. Disadvantages include the fact that the free energy is exact for tree-like graphs only if you add up an infinite number of terms, and that the theory has not yet been formulated in a nice way for “hard” (infinite energy) constraints.

If you’re interested in quantum systems such as the Hubbard model, the expansion approach has the advantage that it can also be applied to them; see my paper with Georges, or the lectures by Georges on his related “Dynamical Mean Field Theory,” or this recent paper by Plefka, who has returned to the subject more than 20 years after his original paper.

Also, if you’re interested in learning more about spin glasses or other disordered systems, or about other variational derivations for mean-field theory, please see this post.

Phase Transitions and Free Energies

October 2, 2007

Much of condensed matter and statistical physics is concerned with the explanation of phase transitions between different forms of matter. A familiar example is water, which has a transition from the solid phase of ice to the liquid phase, and then from the liquid phase to the gaseous phase of steam, as the temperature is increased.

In fact, many different materials have all sorts of exotic phases, as you vary the temperature, pressure, or composition of the material. Constructing “phase diagrams” which show what the different phases of a material should be as a function of the varying parameters is one of the main preoccupations of physicists.

How can these phase diagrams be constructed? Condensed matter physicists tend to follow this algorithm. First, from arguments about the microscopic physics, construct a simple model of the local interactions of the molecules making up the material. Second, choose some method to approximately compute the “free energy” for that simple model. Finally, find the minima of the approximate free energy as a function of the adjustable parameters, like the temperature. The phase diagram can be constructed by determining which phase has the lower free energy at each point in parameter space.

If the results disagree with experiment, your model is too simple or your approximation for the free energy is not good enough, so you need to improve one or the other or both; otherwise write up your paper and submit it for publication.

To illustrate, let’s consider magnetism. Although it is less familiar than the phase transition undergone by water, magnets also have a phase transition from a magnetized “frozen” phase at low temperature to an unmagnetized “paramagnetic” phase at high temperature.

The simplest model of magnetism is the Ising model. I’ve discussed this model before; to remind you, I’ll repeat the definition of the ferromagnetic Ising model: “In this model, there are spins at each node of a lattice, that can point ‘up’ or ‘down.’ Spins like to have their neighbors point in the same direction. To compute the energy of a configuration of spins, we look at all pairs of neighboring spins, and add an energy of -1 if the two spins point in the same direction, and an energy of +1 if the two spins point in opposite directions. Boltzmann’s law tells us that each configuration should have a probability proportional to the exp(-Energy[configuration] / T), where T is the temperature.”
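
In equation form, with spins s_i = ±1 and the sum running over neighboring pairs on the lattice, that definition reads:

    E[\{s_i\}] = -\sum_{\langle ij \rangle} s_i s_j, \qquad
    P[\{s_i\}] \propto \exp\!\big( -E[\{s_i\}] / T \big)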

Of course, as defined, the Ising model is a mathematical object that can be, and has been, studied mathematically independently of any relationship to physical magnets. Alternatively, the Ising model can be simulated on a computer.
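
Here is a minimal sketch of such a simulation (a bare-bones Metropolis Monte Carlo in Python; the lattice size, temperature, and number of sweeps are just illustrative choices):

    import math, random

    def metropolis_ising(L=32, T=2.0, sweeps=1000):
        """Minimal Metropolis simulation of the 2D ferromagnetic Ising model."""
        # Start from a random configuration of +1/-1 spins on an L x L lattice.
        spin = [[random.choice([-1, 1]) for _ in range(L)] for _ in range(L)]
        for _ in range(sweeps * L * L):
            i, j = random.randrange(L), random.randrange(L)
            # Sum of the four neighboring spins (periodic boundary conditions).
            neighbors = (spin[(i + 1) % L][j] + spin[(i - 1) % L][j] +
                         spin[i][(j + 1) % L] + spin[i][(j - 1) % L])
            dE = 2 * spin[i][j] * neighbors      # energy change if we flip spin (i, j)
            # Accept the flip with the Metropolis probability min(1, exp(-dE/T)).
            if dE <= 0 or random.random() < math.exp(-dE / T):
                spin[i][j] *= -1
        return sum(map(sum, spin)) / (L * L)     # magnetization per spin

    print(metropolis_ising(T=1.5))   # well below Tc (about 2.27): |magnetization| near 1
    print(metropolis_ising(T=3.5))   # well above Tc: magnetization near 0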

Simulations (see this applet to experiment for yourself) show that at low temperatures (and if the dimensionality of the lattice is at least 2), the lattice of spins will tend over time to align, so that more spins point up than down, or more point down than up. The natural symmetry between up and down is “broken.” At low temperatures one will find “domains” of spins pointing in the “wrong” direction, but these domains are only temporary.

At high temperatures, on the other hand, each spin will typically fluctuate between pointing up and down, although again domains of like-pointing spins will form. The typical time for a spin to switch from pointing up to pointing down will increase as the temperature decreases, until it diverges towards infinity (as the size of the lattice approaches infinity) as one approaches the critical temperature from above.

Intuitively, the reason for this behavior is that at low temperature, the configurations where all the spins point in the same direction have a much lower energy, and thus a much higher probability, than other configurations. At high temperatures, all the configurations start having similar probabilities, and there are many more configurations that have equal numbers of up and down spins compared to the number of aligned configurations, so one typically sees the more numerous configurations.

This balance between energetic considerations (which make the spins align) and entropic considerations (which make the spins favor the more numerous unaligned configurations) is captured by the “free energy” F, which is given by the equation F=U-TS, where U is the average energy, S is the entropy, and T is the temperature. At low temperatures, the energy dominates the free energy, while at high temperatures, the entropy dominates.

All this intuition may be helpful, but like I said, the Ising model is a mathematical object, and we should be able to find approximation methods which let us precisely calculate the critical temperature. It would also be nice to be able to precisely calculate other interesting quantities, like the magnetization as a function of the temperature, or the susceptibility (which is the derivative of the magnetization with respect to an applied field) as a function of the temperature, or the specific heat (which is the derivative of the average energy with respect to the temperature) as a function of the temperature. All these quantities can be measured for real magnets, so if we compute them mathematically, we can judge how well the Ising model explains the magnets.

This post is getting a bit long, so I’ll wait until my next post to discuss in more detail some useful methods I have worked on for systematically and precisely computing free energies, and the other related quantities which can be derived from free energies.

 

Beginning Go

October 1, 2007

If you’re interested in learning more about the game of Go, whether you’re an absolute beginner or already know something about the game, a good starting point is “Sensei’s Library”, a huge wiki (it currently contains 15,722 pages) all about Go. There are many ways to navigate the wiki, and tons of interesting topics to explore.

There are many nice English-language books on Go. If you are a beginner, I will just recommend the excellent five-volume “Learn to Play Go Series” by Janice Kim and Jeong Soo-hyun. (If you already know the rules and have played a couple of games, you probably want to begin with volume 2.)

Go-playing programs, unlike Chess programs, are no match for even moderately strong human amateurs, but for beginners or weak players like myself, they can provide an interesting challenge. I very much like the free Goban Go GUI for Mac OS X, which includes GNU Go, one of the best Go-playing programs. Goban also serves as a client for online Go servers and lets you replay Go games in the .sgf format.

Gallager’s LDPC error-correcting codes

September 28, 2007

Error-correcting codes are a technology that most people don’t think much about, if they even know they exist, but these codes work quietly in the background to enable such things as cell phones and other wireless technology, cable and satellite TV, and also the internet, including DSL, fiber-optic communications, and good old-fashioned dial-up modems. The modern communications revolution could not have begun without these codes.

So what’s the idea behind these codes? There’s a lot to say, and many textbooks have been written on the subject, so I’ll only give the briefest of introductions. [Some excellent textbooks I recommend include MacKay’s textbook, which I’ve already reviewed, McEliece’s “Theory of Information and Coding”, and Lin and Costello’s “Error Control Coding.” See also this post for two forthcoming books available online.] EDIT: I’ve added some more information about LDPC decoders, with pointers to available software, in this post.

The basic idea is that we want to transmit some bits which represent some piece of text or picture or something. Unfortunately, when we transmit those bits, they need to travel through some channel (say a wire or through the air) and when they are received, the receiver only gets a noisy version of each bit. For example, each bit might be flipped independently from a 0 to a 1 or vice versa with some small probability (this is called the binary symmetric channel; many other channel models exist too).

To combat this noise, we send extra bits; so instead of sending, say, the 1000 bits that represent our message, we might send 2000, where the extra 1000 “check” bits have some known relationship to the original 1000. Both the transmitter and receiver agree on that relationship ahead of time; that is the “code.” Of course, all 2000 bits are now subject to the noise, so some of those extra bits could be flipped. Nevertheless, if the noise is small enough, the receiver can try to “decode” the original 1000 bits by finding the configuration of the 2000 bits which obeys all the constraints and is most probable.

In 1948, Claude Shannon proved a theorem that essentially said that if the noise in the channel was small enough, and if the number of extra bits you were willing to send per original bit was large enough, then one could design very long codes that, if optimally decoded, would always remove all the noise and recover the transmitted message.

(By the way, it is this amazing property, that codes can remove 100% of the noise, that means we can watch crystal-clear high-definition TV coming in over the airwaves, something I very much appreciate when I watch a football game these days. When advertisers talk about “digital this” or “digital that,” they really mean “error-corrected digital.”)

As an example of Shannon’s theorem, if you are willing to use one extra bit for every original bit, and the percentage of flipped bits in your binary symmetric channel is less than the Shannon limit of about 11%, his theorem tells you that codes exist that will reliably remove all the noise. However, Shannon’s proof was non-constructive; he didn’t tell us what these wonderful codes were. Shannon also proved a theorem that if the noise was higher than the “Shannon limit,” no codes exist that can reliably correct the noise.
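
To spell out where that 11% figure comes from: the capacity of the binary symmetric channel with crossover probability p is 1 minus the binary entropy of p, and a rate-1/2 code (one extra bit per original bit) can work reliably only when its rate is below capacity:

    C_{\mathrm{BSC}}(p) = 1 - H_2(p), \qquad
    H_2(p) = -p \log_2 p - (1-p) \log_2 (1-p)

    \tfrac{1}{2} \le 1 - H_2(p) \iff H_2(p) \le \tfrac{1}{2} \iff p \lesssim 0.11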

So error-correcting coding theory deals with the problems of designing codes, and efficient encoders and decoders for those codes, that work as close to the Shannon limit as possible. Many theorists invented many interesting and useful codes and encoders and decoders over the years, but until the 1990’s, it still seemed a distant dream to most coding theorists that we would be able to find practical codes that performed near the Shannon limit.

What is very strange is that the best codes and decoders that were discovered in the 1990s were actually a rediscovery of codes and decoders invented by Robert Gallager in the early 1960s, for his Ph.D. thesis. Gallager’s thesis introduced “low density parity check” (LDPC) codes, and their decoding algorithm, the famous “belief propagation” decoding algorithm. His thesis also introduced many other important ideas in coding theory, including “density evolution,” simplified “bit-flipping decoders,” and analysis methods for optimal LDPC decoders. It is a truly remarkable work that every aspiring coding theorist should read. Fortunately, it’s available online.
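
To give a flavor of how simple the bit-flipping idea is, here is a toy sketch in Python (my own illustration, using the tiny (7,4) Hamming code rather than a genuinely low-density code, just so the parity-check matrix fits on a few lines): repeatedly flip whichever bit participates in the most unsatisfied parity checks.

    import numpy as np

    # Parity-check matrix of the (7,4) Hamming code; each row is one parity check.
    H = np.array([[1, 0, 1, 0, 1, 0, 1],
                  [0, 1, 1, 0, 0, 1, 1],
                  [0, 0, 0, 1, 1, 1, 1]])

    def bit_flip_decode(received, H, max_iters=10):
        """Bit-flipping decoding: flip the bit involved in the most failed checks."""
        word = received.copy()
        for _ in range(max_iters):
            syndrome = H.dot(word) % 2          # a 1 marks an unsatisfied check
            if not syndrome.any():
                return word                     # every check satisfied: done
            unsatisfied = H.T.dot(syndrome)     # per-bit count of failed checks
            word[np.argmax(unsatisfied)] ^= 1   # flip the worst offender
        return word

    transmitted = np.zeros(7, dtype=int)        # the all-zeros codeword
    received = transmitted.copy()
    received[4] ^= 1                            # the channel flips one bit
    print(bit_flip_decode(received, H))         # prints [0 0 0 0 0 0 0]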


How is it possible that this work was forgotten? Well, in 1968, Robert Gallager wrote a magnificent textbook on information theory, called “Information Theory and Reliable Communication,” where he explained most of the information and coding theory known at the time, but neglected to talk about LDPC codes! I’m not sure why. Possibly he thought that he’d already covered the material in the 1963 monograph made from his thesis, or perhaps he didn’t think LDPC codes were practical at the time. In fact, the decoder was too complex for 1960s technology, but Moore’s Law took care of that, although only now are LDPC codes widely replacing previously developed codes in communications systems.

So the moral of the story is: if you write a ground-breaking Ph.D. thesis about a remarkable technology that is 30 years ahead of its time, please don’t forget to mention it in your classic textbook a few years later. Not a moral too many of us have to worry about, but still…