The Mathematics of Volleyball

Recently I was at a multi-day volleyball tournament, which gave me plenty of time to ponder the mathematics of the game. At different points in the game, I'd wonder what the odds were of each team winning. And when a team gained or lost a point, I'd wonder how important that point was. Clearly, if the score was 24-24, gaining a point made a huge difference. But how much difference did getting one point at the beginning of the game matter? It seemed like it didn't matter much, but did it?

I decided to analyze the game mathematically. I made the simplifying assumption that each team had 50-50 odds of winning each point. I found the analysis interesting, and it turns out to have close ties to Pascal's Triangle, so I'm posting it here in case anyone else is interested.

Volleyball games are scored using the rally point system, which means that one team gets a point on every serve. (Back in the olden days, volleyball used side-out scoring, which meant that only the serving team could get a point. Fortunately, rally point scoring is more mathematically tractable. Rally point scoring also keeps the game advancing faster.) The winner of each match is the best out of three sets (a set is the same as a game). In the league I was watching, the winner of a game is the first team to get 25 points and be ahead by at least 2. (Except if a third tiebreaker game is needed, it only goes to 15 points instead of 25.)

A few cases are easy to analyze mathematically. If we assume each team has a 50-50 chance of scoring each point and the score is tied, each team obviously has a 50% chance of winning the game. (With side-out scoring, it makes a difference which team is serving, but for rally point scoring we avoid that complication.) The second obvious case is if a team has 25 points and the other team has 23 or fewer points, the first team has 100% chance of winning (since they already won).

I will use the notation P(m,n) for the chance of the first team wining if the score is m to n. From above, P(n, n) = 50%, P(25, n) = 100% for n <= 23, and P(m, 25) = 0% for m <= 23.

The chance of winning in other cases can be calculated from the assumption that a team has a 50% chance of winning the point, and a 50% chance of losing: the chance of winning is the average of these two circumstances. Mathematically, we get the simple recurrence:

For instance, if the score is 25-24, if the first team scores, they win. If the second team scores, then the score is tied. In the first (winning) case, the first team wins 100%, and in the second (tied) case, the first team wins 50%. Thus, on average they will win 75% of the time from a 25-24 lead. That is, P(25, 24) = 75%, and by symmetry P(24, 25) = 25%. (Surprisingly, these are the only scores where the requirement to win by 2 points changes the odds.)

Likewise, if the score is 24-23, half the time the first team will score a point and win, and half the time the second team will score a point and tie. So the first team has 1/2 * 100% + 1/2 * 50% = 75% chance of winning, and P(24, 23) = 75%.

More interesting is if the score is 24-22, half the time the first team will score a point and win, and half the time the second team will score, making the score 24-23. We know from above that the first team has a 75% chance of winning from 24-23, so P(24, 22) = 1/2 * 100% + 1/2 * 75% = 87.5%.

We can use the recurrence to work backwards and find the probability of winning from any score. The following table shows the probability of winning for each score. The first team has the score on the left, and the second team has the score on the top.

Table with odds of winning when the score is m to n

012345678910111213141516171819202122232425
050%44%39%33%28%23%18%14%11%8%5%4%2%1%1%0%0%0%0%0%0%0%0%0%0%0%
156%50%44%38%33%27%22%17%13%10%7%5%3%2%1%1%0%0%0%0%0%0%0%0%0%0%
261%56%50%44%38%32%27%21%17%13%9%7%4%3%2%1%1%0%0%0%0%0%0%0%0%0%
367%62%56%50%44%38%32%26%21%16%12%9%6%4%3%1%1%0%0%0%0%0%0%0%0%0%
472%67%62%56%50%44%37%31%26%20%16%11%8%6%4%2%1%1%0%0%0%0%0%0%0%0%
577%73%68%62%56%50%44%37%31%25%20%15%11%7%5%3%2%1%0%0%0%0%0%0%0%0%
682%78%73%68%63%56%50%43%37%30%24%19%14%10%7%4%3%1%1%0%0%0%0%0%0%0%
786%83%79%74%69%63%57%50%43%36%30%24%18%13%9%6%4%2%1%1%0%0%0%0%0%0%
889%87%83%79%74%69%63%57%50%43%36%29%23%17%12%8%5%3%2%1%0%0%0%0%0%0%
992%90%87%84%80%75%70%64%57%50%43%36%29%22%16%11%8%5%3%1%1%0%0%0%0%0%
1095%93%91%88%84%80%76%70%64%57%50%43%35%28%21%15%11%7%4%2%1%0%0%0%0%0%
1196%95%93%91%89%85%81%76%71%64%57%50%42%35%27%20%14%9%6%3%2%1%0%0%0%0%
1298%97%96%94%92%89%86%82%77%71%65%58%50%42%34%26%19%13%8%5%2%1%0%0%0%0%
1399%98%97%96%94%93%90%87%83%78%72%65%58%50%42%33%25%18%12%7%4%2%1%0%0%0%
1499%99%98%97%96%95%93%91%88%84%79%73%66%58%50%41%32%24%17%11%6%3%1%0%0%0%
15100%99%99%99%98%97%96%94%92%89%85%80%74%67%59%50%41%31%23%15%9%5%2%1%0%0%
16100%100%99%99%99%98%97%96%95%92%89%86%81%75%68%59%50%40%30%21%13%7%3%1%0%0%
17100%100%100%100%99%99%99%98%97%95%93%91%87%82%76%69%60%50%40%29%19%11%5%2%0%0%
18100%100%100%100%100%100%99%99%98%97%96%94%92%88%83%77%70%60%50%39%27%17%9%4%1%0%
19100%100%100%100%100%100%100%99%99%99%98%97%95%93%89%85%79%71%61%50%38%25%14%6%2%0%
20100%100%100%100%100%100%100%100%100%99%99%98%98%96%94%91%87%81%73%62%50%36%23%11%3%0%
21100%100%100%100%100%100%100%100%100%100%100%99%99%98%97%95%93%89%83%75%64%50%34%19%6%0%
22100%100%100%100%100%100%100%100%100%100%100%100%100%99%99%98%97%95%91%86%77%66%50%31%12%0%
23100%100%100%100%100%100%100%100%100%100%100%100%100%100%100%99%99%98%96%94%89%81%69%50%25%0%
24100%100%100%100%100%100%100%100%100%100%100%100%100%100%100%100%100%100%99%98%97%94%88%75%50%25%
25100%100%100%100%100%100%100%100%100%100%100%100%100%100%100%100%100%100%100%100%100%100%100%100%75%50%

Any particular chance of winning can be easily read from the table. For instance, if the score is 15-7, look where row 15 and column 7 meet, and you'll find that the first team has a 94% chance of winning. (This is P(15, 7) in my notation.)

The table illustrates several interesting characteristics of scores. The odds fall away from 50% pretty rapidly as you move away from the diagonal (i.e. away from a tied score). Points matter a lot more near the end of the game, though: you've only got a 1% chance of winning from an 18-24 position, while being six points behind at the beginning (0-6) still gives you an 18% chance of winning. However, a big deficit is almost insurmountable - if you're behind 0-15, you have less than a 1% chance of catching up and winning. (Note that 0% and 100% in the table are not exactly 0% and 100%, because there's always some chance to win or lose.)

Note that each score is the average of the score below and the score to the right - these are the cases where the first team gets the point and the second team gets the point. This corresponds directly to the equation above.

The table could be extended arbitrarily far if neither team gets a two point lead, but those cases are not particularly interesting.

Generating the score table with dynamic programming

To generate the table, I wrote a simple Arc program to solve the recurrence equation using dynamic programming:
(def scorePercent (s1 s2 max)
  (if (and (>= s1 max) (>= s1 (+ s2 2))) 100.
      (and (>= s2 max) (>= s2 (+ s1 2))) 0
      (is s1 s2) 50.
      (/ (+ (scorePercent s1 (+ s2 1) max)
            (scorePercent (+ s1 1) s2 max)) 2)))
The first two arguments are the current score, and the last argument is the amount to win (25 in this case). For instance:
arc> (scorePercent 24 22 25)
87.5
arc> (scorePercent 20 22 25)
22.65625
Unfortunately, the straightforward way of solving the problem has a severe performance problem. For instance, computing (scorePercent 5 7 25) takes hours and hours. The problem is that evaluating P(5, 7) requires calculating two cases: P(6, 7) and P(5, 8). Each of those requires two cases, each of which requires two cases, and so on. The result is an exponential number of evaluations, which takes a very very long time as the scores get lower. Most of these evaluations calculate the same values over and over, which is just wasted work. For instance, P(6, 8) is computed in order to compute P(6, 7) and P(6, 8) is computed again in order to compute P(5, 8).

There are a couple ways to improve performance. The hard way of solving the dynamic programming problem without this exponential blowup is to carefully determine an order in which each value can be calculated exactly once by working backwards, until you end up with the desired value. For instance, if the values are calculated going up the columns from right to left, each value can be computed immediately from two values that have already been computed, until we end up efficiently computing the whole table in approximately 25*25 steps. This requires careful coding to step through the table in the right order and to save each result as it is calculated. It's not too hard, but there's a much easier way.

The easy way of solving the problem is with memoization - when an intermediate value is calculated, remember its value in case you need it again, instead of calculating it over and over. With memoization, we can compute the results in any order we want, and automatically each result will only be computed once.

In Arc, memoization can be implemented simply by defining a function with defmemo, which will automatically memoize the results of the function evaluation:

(defmemo scorePercent (s1 s2 max)
  (if (and (>= s1 max) (>= s1 (+ s2 2))) 100.
      (and (>= s2 max) (>= s2 (+ s1 2))) 0
      (is s1 s2) 50.
      (/ (+ (scorePercent s1 (+ s2 1) max)
            (scorePercent (+ s1 1) s2 max)) 2)))
With this simple change, results are nearly instantaneous, rather than taking hours.

The above function generates a single entry in the table. To generate the full table in HTML with colored cells, I used a simple loop and Arc's HTML generating operations. If you're interested in Arc programming, the full code can be downloaded here.

Mathematical analysis

Instead of computing the probabilities through dynamic programming, it is possible to come up with a mathematical solution. After studying the values for a while, I realized rather surprisingly that the probabilities are closely tied to Pascal's Triangle. You may be familiar with Pascal's Triangle, where each element is the sum of the two elements above it (with 1's along the edges), forming a table of binomial coefficients:

Pascal's Triangle

Pascal's triangle

The game probabilities come from the triangle of partial sums of binomial coefficients, which is a lesser-known sequence that is easily derived from Pascal's Triangle. This sequence, T(n, k) is formed by summing the first k elements in the corresponding row of Pascal's Triangle. That is, the first element is the first element in the same row of Pascal's triangle, the second is the sum of the first two elements in Pascal's triangle, the third is the sum of the first three, etc.

T - the partial row sums of Pascal's Triangle

Partial row sums in Pascal's triangle
Mathematically, this triangle T(n, k) is defined by:


As with Pascal's Triangle, each element is the sum of the two above it, but now the right-hand border is powers of 2. This triangle is discussed in detail in the Online Encyclopedia of Integer Sequences. Surprisingly, this triangle is closely connected with distances in a hypercube, error-correcting codes, and how many pieces an n-dimensional cake can be cut into.

With function T defined above, the volleyball winning probabilities are given simply by:

For example, P(23,20) = T(6, 4)/2^6 = 89.0625%, which matches the table.

Intuitively, it makes sense that the probabilities are related to Pascal's Triangle, because each entry in Pascal's Triangle is the sum of the two values above, while each probability entry is the average of the value above and the value to the right in the table. Because taking the average divides by 2 in each step, an exponent of 2 appears in the denominator. The equation can be proved straightfowardly by induction.

The importance of a point

Suppose the score is m to n. How important is the next point? I'll consider the importance of the point to be how much more likely the team is to win the game if they win the point versus losing the point. For instance, suppose the score is 18-12, so the first team has a 92% chance of winning (from the previous table). If they win the next point, their chance goes up to 95%, while if they lose the point, their chance drops to 88%. Thus, we'll consider the importance to be 7%. Mathematically, if the score is m to n, I define the importance as P(m+1, n) - P(m, n+1).

Table with importance of the next point when the score is m to n

012345678910111213141516171819202122232425
011%11%11%11%10%9%8%7%6%5%4%3%2%1%1%0%0%0%0%0%0%0%0%0%0%0%
111%12%12%11%11%10%9%8%7%6%4%3%2%2%1%1%0%0%0%0%0%0%0%0%0%0%
211%12%12%12%12%11%10%9%8%7%6%4%3%2%2%1%1%0%0%0%0%0%0%0%0%0%
311%11%12%12%12%12%11%10%9%8%7%5%4%3%2%1%1%0%0%0%0%0%0%0%0%0%
410%11%12%12%13%13%12%12%11%9%8%7%5%4%3%2%1%1%0%0%0%0%0%0%0%0%
59%10%11%12%13%13%13%13%12%11%10%8%7%5%4%3%2%1%1%0%0%0%0%0%0%0%
68%9%10%11%12%13%13%13%13%12%11%10%8%6%5%3%2%1%1%0%0%0%0%0%0%0%
77%8%9%10%12%13%13%14%14%13%12%11%10%8%6%5%3%2%1%1%0%0%0%0%0%0%
86%7%8%9%11%12%13%14%14%14%14%13%11%10%8%6%4%3%2%1%0%0%0%0%0%0%
95%6%7%8%9%11%12%13%14%14%14%14%13%12%10%8%6%4%3%1%1%0%0%0%0%0%
104%4%6%7%8%10%11%12%14%14%15%15%14%13%12%10%8%6%4%2%1%1%0%0%0%0%
113%3%4%5%7%8%10%11%13%14%15%15%15%15%14%12%10%7%5%3%2%1%0%0%0%0%
122%2%3%4%5%7%8%10%11%13%14%15%16%16%15%14%12%10%7%5%3%1%1%0%0%0%
131%2%2%3%4%5%6%8%10%12%13%15%16%17%17%16%14%12%9%7%4%2%1%0%0%0%
141%1%2%2%3%4%5%6%8%10%12%14%15%17%18%18%17%15%12%9%6%3%2%1%0%0%
150%1%1%1%2%3%3%5%6%8%10%12%14%16%18%19%19%17%15%12%9%5%3%1%0%0%
160%0%1%1%1%2%2%3%4%6%8%10%12%14%17%19%20%20%18%16%12%8%4%2%0%0%
170%0%0%0%1%1%1%2%3%4%6%7%10%12%15%17%20%21%21%19%16%12%7%3%1%0%
180%0%0%0%0%1%1%1%2%3%4%5%7%9%12%15%18%21%23%23%21%16%11%5%2%0%
190%0%0%0%0%0%0%1%1%1%2%3%5%7%9%12%16%19%23%25%25%22%16%9%3%0%
200%0%0%0%0%0%0%0%0%1%1%2%3%4%6%9%12%16%21%25%27%27%23%16%6%0%
210%0%0%0%0%0%0%0%0%0%1%1%1%2%3%5%8%12%16%22%27%31%31%25%12%0%
220%0%0%0%0%0%0%0%0%0%0%0%1%1%2%3%4%7%11%16%23%31%38%38%25%0%
230%0%0%0%0%0%0%0%0%0%0%0%0%0%1%1%2%3%5%9%16%25%38%50%50%25%
240%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%1%2%3%6%12%25%50%50%50%
250%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%0%25%50%50%

The values in the table make intuitive sense. If one team is winning by a lot, one more point doesn't make much difference. But if the scores are close, then each point counts. Each point counts a lot more near the end of the game than at the beginning. The first point only makes an 11% difference in the odds of winning, while the if the score is 23-23, the point makes a 50% difference (75% chance of winning if you get the point vs 25% if you miss the point). This table is sort of a derivative of the first table, showing where the values are changing most rapidly.

The importance of a point as defined above closely matches the behavior of the spectators. If the score is very close at the end of the game, the audience becomes much more animated compared to earlier in the game.

The "importance" is mathematically simpler than the probability of winning derived earlier. If the current score is 25-a, 25-b, then the importance is given by the simple equation:

This can proved straighforwardly from the equation for P(x, y). For example, if the score is 18-12, the importance is C(7+13-2, 6) / 2^(7+13-2) = 18564 / 262144 = 7.08%.

Conclusions

How useful is this model? Well, it depends on the assumption that each team has an equal chance of winning each point. Of course, most teams are not evenly matched. Even more important is the fact that if a team has a good server, they can quickly rack up 10 points in a row, which throws the model out the window.

However, I think the model is still useful, since it provides some quantitative answers to the original questions, and confirms some intuitions. In addition, the mathematics turned out to be more interesting than I was expecting, with the surprising connection to Pascal's Triangle.

Python version

P.S. The code above is in Arc, an obscure language. Here's a version of the code in Python that will be more useful:
solved = {} # Remember values that have been solved

# Compute probability of team 1 wining when score is s1 to s2.
# Max is the points needed to win (typically 25)
# This routine is just a wrapper around scorePercentInt to
# remember values that have been computed.
def scorePercent(s1, s2, max):
  if (s1, s2, max) not in solved:
    solved[s1, s2, max] = scorePercentInt(s1, s2, max)
  return solved[s1, s2, max]

# This routine does the actual calculation
def scorePercentInt(s1, s2, max):
  if s1 >= max and s1 >= s2 + 2: return 100
  if s2 >= max and s2 >= s1 + 2: return 0
  if s1 == s2: return 50
  return (scorePercent(s1, s2+1, max) + scorePercent(s1+1, s2, max)) / 2.

for i in range(0, 26):
  for j in range(0, 26):
    print '%.3f' % scorePercent(i, j, 25),
  print

My 0.015 minutes of fame on CNN

I recently wound up on CNN for a couple seconds doing some Arduino hacking as part of a segment on Google's workshops. Click the image for the full video. If you don't want to watch the whole thing, I appear at 1:00 and 1:39.

Ken Shirriff on CNN

For those who want technical details, I hacked together the following quick sketch to generate the interesting patterns you can see on the oscilloscope:

void setup()
{
  pinMode(4, OUTPUT);
  pinMode(5, OUTPUT);
}

int state = 1;
int count1 = 0;
int state2 = 1;
int count12 = 0;
int max1 = 20;
int max2 = 200;
int t = 0;

void loop() {

  if (count1-- <= 0) {
    state = 1-state;
    digitalWrite(5, state);
    count1 = max1;
  }
  if (count12-- <= 0) {
    state2 = 1-state2;
    digitalWrite(4, state2);
    count12 = max2;

    if (t++ > 20000) {
      max2 -= 1;
      if (max2 < 1) {
        max2 = 500;
      }   
      t = 0;
    }
  }
}
This sketch manually generates two square wave output with periods determined by max1 and max2. The frequency of the second varies occasionally (controlled by the loop with t). I used simple R-C filters on the outputs to turn the square waves into roughly triangular waves, and then fed this into the X and Y inputs of the oscilloscope. The result was constantly-varying Lissajou-like patterns.

I should point out the outputs generated this way are rather unstable because many thing can interrupt the timing loop. The above code is provided just in case your are curious. I don't recommend using this approach for anything real; using the PWM timers would yield much cleaner results.

Anyone have other ideas for easy ways to generate cool oscilloscope patterns with an Arduino?

My Knuth reward check

I attended a very interesting talk "All Questions Answered" by the famous computer science professor Don Knuth in March, where he talked about many things, including his new book.

The talk inspired me to read The Art of Computer Programming, Volume 4A: Combinatorial Algorithms. As he described in the talk, he gives a reward check (for 1 hexadecimal dollar) to anyone who finds an error in one of his books, so I set myself the goal of finding an error in the book while on vacation.

After a lot of study, I thought I'd found an error in a diagram, but then I realized I was confused. Next, I found a blatant error in an appendix, but that error had already been discovered and was listed in the errata. Finally I found an error on page 574 that I hoped hadn't been found yet and sent it off to Professor Knuth.

I was delighted to receive my reward check today, which Wikipedia calls "among computerdom's most prized trophies". That's a big exaggeration but nonetheless, I'm happy to get one. Note that the check is from the fictional Bank of San Seriffe to avoid check fraud problems from all the images of his checks on the web.

My Knuth TAOCP reward check

As for the book itself, it's interesting even if you're not after a reward, although very challenging to read. Volume 4a describes combinatorial algorithms, including more ways of computing permutations and combinations than you can shake a stick at. It also has an extensive discussion of BDDs and ZDDs, which are newish data structures for representing sets. The section on bitwise tricks and techniques is interesting if you like HAKMEM-style tricks such as reversing the bits in an integer through a short sequence of incomprehensible operations.

I have to admit that trying to find an error in a book is a strange vacation goal, but I'm glad I succeeded and Knuth's book taught me some interesting algorithms in the process.

P.S. I was very surprised to see this article on the front page of Hacker News. To answer the questions there and below, the error I found was in volume 4a page 574 (as the memo line on the check shows). The solution to exercise 67 on that page says a particular circuit uses 6 ANDN gates, which I thought should be NAND gates. It gets a bit more complicated because Knuth was referring to the ANDN op code for MMIX, but there was still a mistake with and-not versus not-and. (The other error I noticed was n choose k on page 824, but I checked the errata and it had already been found.)

The Endeavour delay: Complexity, the APU, and the Load Control Assembly

The last launch of the Endeavour space shuttle has been delayed 48 hours (update: indefinitely) due to a problem with the APU heater and the Load Control Assembly. I wanted to find out what exactly these troublesome components are, so I did some investigation. There's a lot of extremely detailed information on the Space Shuttle available online, but it is very hard to find. I've summarized the information here in case anyone else wants to know the specifics.

Space Shuttle APU locations

The Space Shuttle has three independent hydraulic systems to operate engine valves, actuators, landing gear, and so forth during launch and landing. The hydraulic pumps are powered by three Auxiliary Power Units (or APUs), which are hydrazine-powered turbines. Each APU is 88 pounds and produces 135 horsepower (which is about the same horsepower as a Honda Accord).

Hydrazine is a highly-toxic rocket fuel; when exposed to a catalyst, it energetically decomposes into hot gases at 1700°F. It is convenient for applications such as the APU, since it doesn't need oxygen, and the decomposition can be easily started and stopped.

Space Shuttle Auxiliary Power Unit
(Click on the image for tons of detailed information.)

Apparently the fuel heaters in APU 1 are not working. Since the hydrazine fuel will freeze at 34°F, each APU has redundant heaters to keep the system above 45°F. Since the heaters are redundant, the Space Shuttle would still be able to operate with the current problem, but would not be able to handle another failure. If the second fuel heater failed, then the fuel would freeze and the APU would not be able to work. Since there are three APUs, even this failure would not be a major problem. But still, you wouldn't want to take off with the heater not working, because losing hydraulic pressure would be a very bad thing.

According to articles, the fuel heater problem is due to a lack of power from the Aft Cabin Load Control Assembly, a switchbox that powers a heater circuit for Auxiliary Power Unit 1. There are three Aft Load Controller Assemblies, as well as many other Load Controller Assemblies. (Sources are inconsistent about whether it is called a Control vs Controller.)

A complex Electrical Power System provides power to all parts of the Shuttle. Three fuel cells (10 kW each) generate 28-volt direct current. The fuel cells feed three main DC power buses, as well as powering AC inverters to feed three AC buses with three-phase, 117-volt, 400-hertz AC power. From the fuel cell, power goes to a Distribution Assembly (DA), then to the aft Power Controller Assemblies, and then to the Aft Load Controller Assemblies.

The Load Control Assemblies contain solid-state switching devices for loads up to 20 amps, and relays for loads up to 135 amps. These switching devices are internally fused.

Reportedly there is a short or other electrical fault in the Aft Load Controller Assembly 2, which is causing the APU heater to fail to operate. The fuel is being drained from Endeavour so technicians can access the assembly and resolve the problem. If I'm interpreting everything correctly, it seems like they'll need to replace one of the internal fuses in the Load Control Assembly.

Space Shuttle Power Distribution

Complexity and the Space Shuttle

One amazing thing about the Space Shuttle is the layers and layers of complexity. The APU system is just one example of this. For instance, each APU has as lube oil system to keep it lubricated. This requires a lube oil pump, which requires a nitrogen pressurization system to start the pump in zero gravity. The oil also requires a 181-pound water spray boiler system, which sprays cooling water onto the oil pipes; the water boils into steam and is vented into space. The boiler requires controllers, panel switches, and status displays, as well as yet another nitrogen pressurization tank, and yet another system of heaters to keep the water from freezing.

Space Shuttle Water Spray Boiler

The water spray boiler doesn't have anything to do with the current launch delay, other than being part of the APU, but it provides an interesting example of the complexity of the systems involved. To summarize the complexity along just this one path, the engines require hydraulic pressure, which requires APUs to power the hydraulic pumps, which require a lubricating oil system, which requires a complex boiler system, which requires a own control and monitoring system. And this is just one small sub-path! I'm ignoring equally complex systems such as the APU injector cooling system (more water and pressurized nitrogen), or the APU fuel pump (which for instance, has a catch bottle in case its seals leak, a drain port if the catch bottle overflows, and associated monitoring system).

Conclusion

Given the level of complexity of the Space Shuttle, I'm not surprised by the launch delays, and wish NASA the best of luck in resolving the problem promptly. My opinion is while the Space Shuttle is a marvel of engineering, simpler rocket systems such as the SpaceX Falcon will turn out to be more practical in the long run.

The images and much of the information above is from the 1988 Shuttle Reference Manual at shuttlepresskit.com. This manual goes into extreme detail and is very interesting (if you find this sort of thing interesting). I should probably make it clear that this posting is based on what I've read; I have no connection with the space program.

P.S. I've found extensive details on the LCA and launch issues are available at nasaspaceflight.com, e.g. Endeavour receives her new LCA – Blown driver examined.

Solving a math problem of Schrödinger (Part II)

What's the better way to solve a math problem? A few lines of code, or a bunch of mathematical thought? I wanted to solve the following math problem: how many 12-digit numbers can you make from the digits 1 through 5 if each digit must appear at least once. In Part I, I described how to use Python or Arc to solve this problem. Now, in the eagerly-awaited Part II, I'll describe how to solve it using Exponential Generating Functions. This turned out to involve more mathematics than I expected, although it's mostly just algebra.

The exponential generating function solution

My old combinatorics textbook describes how to solve all sorts of combinatorial problems. Section 3.2.13 has a somewhat similar sequence problem: how many sequences of 1,2, or 3 of length l, in which no symbol occurs exactly p times. This approach can be generalized to the problem we're solving.

For our specific problem, we want to exclude solutions where a digit appears 0 times, but we'll stick with the generalized approach of excluding solutions where a digit appears exactly p times, and then we'll set p to 0 later. We'll let d be the range of each digit (e.g. if the digits are 1 through 5, then d=1). And n will be the total number of digits in the sequence (e.g. n=12 in our original problem).

Generating functions are way too complex to explain here, but I'll try to give a quick overview. The idea is that if you have c things of size n, you represent this by cx^n. For instance, if you roll a die, you can get 1 through 6, so the generating function would be x+x^2+x^3+x^4+x^5+x^6. If you roll a die twice, you would square that. If you want to figure out how many ways to roll a 9, you'd extract the coefficient of x^9 (which is represented as [x^9]). If you multiply out the polynomial, you'll find that the coefficient of x^9 is 4, corresponding to the 4 ways to roll a 9 (6+3, 5+4, 4+5, 3+6).

In these polynomials, x is just a placeholder, not a genuine variable; you never actually evaluate the polynomial. It's almost like magic the way the answer pops out of the formulas.

One source of more information on generating functions is Cut the knot: generating functions.

First, we create the generating function for a single digit, say 1. If the digit appears zero times, we represent it by term x^0. If it appears 1 time, we have x^1. If it appears twice, we have x^2/2!. (We divide by 2! because there are 2! equivalent ways the digit can appear twice.) Likewise, if it appears three times, we represent it by x^3/3!. We end up with x^0 + x^1 + x^2/2! + x^3/3! + x^4/4! + ... Now for the crazy part. This is the series expansion of e^x, so we replace this with e^x. Finally we subtract x^p/p!, which is the "forbidden" case. We end up with the generating function e^x - x^p/p!.

If we have d choices for each digit, we multiply together d copies of the generating function, which is of course raising it to the power of d:

To get our solution, if we want to find how many solutions are n digits long, we simply extract the coefficient of x^n/n! from the above generating function, and that gives us the value we want:

Next step: how do we evaluate the above? First, let's expand the right half using the standard binomial formula:

Now we'll take a quick detour through some generating function facts, which I will present without proof.

Important generating function facts

Extracting the coefficient of x^n in an exponential gives you:

Unless n=0. Then you're looking for the coefficient of x^n in 1. This is simply 1 if n=0, and 0 otherwise.

Or unless n<0. Then the result is 0.

Second important generating function fact: if you're looking for the coefficient of, say, x^9 in an expression multiplied by x^3, then you can look instead for the coefficient of x^6 in the expression, if you use the appropriate scaling:

Solving one term of the summation

Now let's extract our desired coefficient from one term of the summation.

If n<kp, then the coefficient is 0.

If d != k, then from the second important fact, the value is:

If k = d, then the exponential drops out and the value is 0 if n != kp (i.e. n != dp), and otherwise:

Putting the summation together

Let's first assume n != dp. Then we can skip the k=d term, and we get:

where limit = min(d-1, n/p) (or d-1 if p=0). This ensures that n-kp >=0.

On the other hand, if n = dp, then we need to handle the k=d term.

Note that if n=dp, then n-kp>0, so we don't need to add an extra upper limit on k.

Solving the original problem

In the original problem, no value is allowed to appear zero times, so the forbidden count p=0. Then the solution simplifies to

Thus, the number of sequences of length 12 consisting of the digits 1 through 5, with each digit appearing at least once, is given by n=12 and d=5:

Happily, this is the same result we got in Part I.

Implementing this in Python

Implementing the above formula is straightforward:
from math import factorial

def solve(d, n, p):
  total = 0
  for k in range(0, d):
    if n-k*p < 0: break
    total += (pow(-1, k) * choose(d, k) *
              factorial(n) / factorial(n-k*p) *
              pow(d-k, n-k*p) / pow(factorial(p), k))
  if n == d*p:
    total += pow(-1, d) * factorial(n) / pow(factorial(p), d)
  return total
Since I didn't entirely trust the mathematics above, I also implemented a totally brute-force solution that generates and tests all the sequences of the given length:
import itertools

def bruteforce(d, n, p):
  total = 0
  for seq in itertools.product(range(0, d), repeat=n):
    ok = 1
    for i in range(0, d):
      if len([x for x in seq if x==i]) == p:
        ok = 0
        break
    total += ok
  return total
The mathematical solution and the brute force solution agree on the cases I tested, which increased my confidence that I didn't mess up the math. The brute force solution is, of course, too slow to use except for fairly small values.

Conclusion

The mathematical solution involved more mathematics than I was expecting. The resulting solution isn't quite as clean as I'd hoped, due to various corner cases that need to be handled. But it can be implemented fairly easily, and also lets us solve a more general problem than the original problem.

IPv6 web serving with Arc or Python: adventures in IPv6

Lego Minifig on a Sheevaplug running IPv6 I've been experimenting with IPv6 on my home network. In part 1, I described how I set up an IPv6 tunnel on Windows 7 and how IPv6 killed my computer. Now, I will describe how I set up a simple (i.e. trivial) IPv6 web server using Arc or Python, on Windows or Linux, which can be accessed at http://ipv6.nlanguages.com.

My IPv6 explorations are roughly driven by Hurricane Electric's IPv6 Certification levels. To get from "Newbie" to "Enthusiast" you have to have an IPv6 web server. I expected I could just use the web server you're viewing right now (provided through Pair), but Pair (like most web providers) doesn't support IPv6. So I figured I'd just set up my own simple web server.

To make things more interesting, I decided to set up the server on my Sheevaplug, a very small plug-based computer running Linux. (See my previous articles about Arc on the Sheevaplug, and Arduino with the Sheevaplug.) The things I describe will work just as well on a standard Linux or Windows box, though.

Setting up the SixXS IPv6 tunnel on the Sheevaplug was much easier than setting it up on Windows; I just followed the directions. One gotcha: if your clock is wrong, aiccu will silently fail - check /var/log/syslog.

IPv6 with the Python web server

Python comes with a simple web server. It is easy to configure the web server for IPv6 once you know how, but it took some effort to figure out how to do it.
import socket
from BaseHTTPServer import HTTPServer
from SimpleHTTPServer import SimpleHTTPRequestHandler

class MyHandler(SimpleHTTPRequestHandler):
  def do_GET(self):
    if self.path == '/ip':
      self.send_response(200)
      self.send_header('Content-type', 'text/html')
      self.end_headers()
      self.wfile.write('Your IP address is %s' % self.client_address[0])
      return
    else:
      return SimpleHTTPRequestHandler.do_GET(self)

class HTTPServerV6(HTTPServer):
  address_family = socket.AF_INET6

def main():
  server = HTTPServerV6(('::', 80), MyHandler)
  server.serve_forever()

if __name__ == '__main__':
  main()
Most of this code is standard Python SimpleHTTPServer code. It implements a handler for the path /ip, which returns the client's IP address. For other paths, SimpleHTTPRequestHandler returns a file from the current directory; this is not particulary secure, but works for a demonstration.

The key change to support IPv6 is subclassing HTTPServer and setting the address_family to IPv6. The other key change is starting the server with the name '::', which is the IPv6 equivalent of '', and binds to no specific address. Similar changes work with related Python classes such as SocketServer.

IPv6 with Arc's web server

My next adventure was to run Arc's web server (which I've documented here) on IPv6. Much to my surprise, Arc's web server worked on Windows with IPv6 without any trouble. You don't have to do anything different to serve on IPv6 and the code and logs handle IPv6 addresses as you'd expect. (Most of the credit for this should go to MzScheme/Racket, which provides the underlying socket implementation.)

I simply start a server thread on port 80, and then define a simple web operation on the home page (represented as || for obscure reasons).

arc> (thread (serve 80))
arc> (defop || req (pr "Welcome to Arc!  Your IP is " (req 'ip)))
Accessing this home page displays a message and the IP address of the client (IPv4 or IPv6 as appropriate).

Unfortunately, when I tried running the same Arc code on the Sheevaplug or another Linux box, it refused to work at all with IPv6. The problem turned out to be misconfiguration in the compilation of Racket (the new name for mzscheme). To get IPv6 working, you can recompile Racket following the instructions here. After recompiling, I was able to get Arc working on my Sheevaplug with IPv6. Eventually this fix will get into the official builds.

Note that implementing web pages in Arc requires much less boilerplate than in Python. This isn't too surprising, given that the primary application of Arc is web serving.

Putting the server's IPv6 address into DNS

Once you have an IPv6 web server, how do you access it? You can access a server with a raw IPv6 address: http://[2001:1938:81:1f8::2]/mypage. (Note that the IPv6 address is enclosed in square brackets, unlike a regular IP address.) However, you'll almost certainly want to access your IPv6 server through a domain name in DNS. To do this, you need to create an AAAA record in your domain zone file.

I had the domain name nlanguages.com available, and wanted to point it at my IPv6 web server. To do this, I went to the DNS Zone File Editor for my nameserver (godaddy.com in my case). I created one AAAA entry for host @ to point to my IPv6 address, and a second AAAA entry for host ipv6 to point to my IPv6 address. The @ provides an entry for my top-level domain (nlanguages.com), and the second provides an entry for ipv6.nlanguages.com. I then tested the DNS entries with a nslookup query, asking for the AAAA record: nslookup -q=aaaa nlanguages.com

The result is that http://ipv6.nlanguages.com will access my IPv6 server (if you have IPv6 access.)

Conclusion

Running a simple IPv6 web server in Python or Arc is straightforward, at least if you don't run into a problem with the underlying language implementation. Python requires just a couple additional lines to support IPv6, while Arc supports IPv6 automatically. Thus, you can easily set up an IPv6 web server, just in time for World IPv6 Day.

Solving a math problem of Schrödinger with Arc and Python

I recently received an interesting math problem: how many 12-digit numbers can you make from the digits 1 through 5 if each digit must appear at least once. The problem seemed trivial at first, but it turned out to be more interesting than I expected, so I'm writing it up here. I was told the problem is in a book of Schrodinger's, but I don't know any more details of its origin.

(The problem came to me via Aardvark (vark.com), which is a service where you can send questions to random people, and random people send you questions. Aardvark tries to match up questions with people who may know the answer. I find it interesting to see the questions I receive.)

The brute force solution

I didn't see any obvious solution to the problem, so I figured I'd first use brute force. I didn't use the totally brute force solution - enumerating all 12 digit sequences and counting the valid ones - since it would be pretty slow, so I took a slightly less brute force approach.

If 1 appears a times, 2 appears b times, and 3, 4, and 5 appear c, d, and e times, then the total number of possibilities is 12!/(a!b!c!d!e!) - i.e. the multinomial coeffient.

Then we just need to sum across all the different combinations of a through e. a has to be at least 1, and can be at most 8 (in order to leave room for the other 4 digits). Then b has to be at least 1 and at most 9-a, and so on. Finally, e is whatever is left over.

The Python code is straightforward, noting that range(1, 9) counts from 1 to 8:

from math import factorial
total = 0
for a in range(1, 8+1):
  for b in range(1, 9-a+1):
    for c in range(1, 10-a-b+1):
      for d in range(1, 11-a-b-c+1):
        e = 12-a-b-c-d
        total += (factorial(12) / factorial(a) / factorial(b)
          / factorial(c) / factorial(d) / factorial(e))
print total
Or, in Arc:
(def factorial (n)
  (if (<= n 1) 1
     (* n (factorial (- n 1)))))

(let total 0
  (for a 1 8
    (for b 1 (- 9 a)
      (for c 1 (- 10 a b)
        (for d 1 (- 11 a b c)
          (let e (- 12 a b c d)
            (++ total (/ (factorial 12) (factorial a) (factorial b)
              (factorial c) (factorial d) (factorial e))))))))
  total)
Either way, we quickly get the answer 165528000.

The generalized brute force solution

The above solution is somewhat unsatisfying, since if we want to know how many 11 digits numbers can be made from at least one of 1 through 6, for instance, there's no easy way to generalize the code.

We can improve the solution by writing code to generate the vectors of arbitrary digits and length. In Python, we can use yield, which magically gives us the vectors one at a time. The vecs function recursively generates the vectors that sum to <= t. The solve routine adds the last entry in the vector to make the sum exactly equal the length. My multinomial function is perhaps overly fancy, using reduce to compute the factorials of all the denominator terms and multiply them together in one go.

from math import factorial

# Generate vectors of digits of length n with sum <= t, and minimum digit 1.
def vecs(n, t):
  if n <= 0:
    yield []
  else:
    for vec in vecs(n-1, t-1):
      for last in range(1, t - sum(vec) + 1):
        yield vec + [last]

def multinomial(n, ks):
  return factorial(n) / reduce(lambda prod, k: prod * factorial(k), ks, 1)
  
# Find number of sequences of digits 1 to n, of length len,
# with each digit appearing at least once
def solve(n, len):
  total = 0
  for vec0 in vecs(n-1, len-1):
    # Make the vector of length n summing exactly to len
    vec = vec0 + [len - sum(vec0)]
    total += multinomial(len, vec)
  return total
Now, solve(5, 12) solves the original problem, and we can easily solve related problems.

In Arc, doing yield would need some crazy continuation code, so I just accumulate the list of vectors with accum and then process it. The code is otherwise similar to the Python code:

(def vecs (n t)
  (if (<= n 0) '(())
    (accum accfn
      (each vec (vecs (- n 1) (- t 1))
       (for last 1 (- t (reduce + vec))
          (accfn (cons last vec)))))))

(def multinomial (n ks)
  (/ (factorial n) (reduce * (map factorial ks))))

(def solve (n len)
  (let total 0
    (each vec0 (vecs (- n 1) (- len 1))
        (++ total (multinomial len
                    (cons (- len (reduce + vec0)) vec0))))
    total))
This solution will solve our original problem quickly, but in general it's exponential, so solving something like n = 10 and length = 30 gets pretty slow.

The dynamic programming solution

By thinking about the problem a bit more, we can come up with a simpler solution. Let's take all the sequences of 1 through 5 of length 12 with each number appearing at least once, and call this value S(5, 12). How can we recursively find this value? Well, take a sequence and look at the last digit, say 5. Either 5 appears in the first 11 digits, or it doesn't. In the first case, the first 11 digits form a sequence of 1 through 5 of length 11 with each number appearing at least once. In the second case, the first 11 digits form a sequence of 1 through 4 of length 11 with each number appearing at least once. So we have S(5, 11) sequences of the first case, and S(4, 11) of the second case. We assumed the last digit was a 5, but everything works the same if it was a 1, 2, 3, or 4; there are 5 possibilities for the last digit.

The above shows that S(5, 12) = 5 * (S(5, 11) + S(4, 11)). In general, we have the nice recurrence S(n, len) = n * (S(n, len-1) + S(n-1, len-1)). Also, if you only have a single digit, there's only one solution, so S(1, len) = 1. And if you have more digits than length to put them in, there are clearly no solutions, so S(n, len) = 0 if n > len. We can code this up in a nice recursive solution with memoization:

memo = {}
def solve1(digits, length):
  if memo.has_key((digits, length)):
    return memo[(digits, length)]
  if digits > length:
    result = 0
  elif digits == 1:
    result = 1
  else:
    result = digits * (solve1(digits-1, length-1) + solve1(digits, length-1))
  memo[(digits, length)] = result
  return result
An interesting thing about this solution is it breaks down each function call into two simpler function calls. Each of these turns into two simpler function calls, and so forth, until it reaches the base case. This results in an exponential number of function calls. This isn't too bad for solving the original problem of length 12, but the run time rapidly gets way too slow.

Note that the actual number of different function evaluations is pretty small, but the same ones keep getting evaluated over and over exponentially often. The traditional way of fixing this is to use dynamic programming. In this approach, the structure of the algorithm is examined and each value is calculated just one, using previously-calculated values. In our case, we can evaluate S(1, 0), S(2, 0), S(3, 0), and so on. Then evaluate S(2, 1), S(2, 2), S(2, 3) using those values. Then evaluate S(3, 2), S(3, 3), and so on. Note that each evaluation only uses values that have already been calculated. For an introduction to dynamic programming, see Dynamic Programming Zoo Tour.

Rather than carefully looping over the sub-problems in the right order to implement a dynamic programming solution, it is much easier to just use memoization. The trick here is once a subproblem is solved, store the answer. If we need the answer again, just use the stored answer rather than recomputing it. Memoization is a huge performance win. Without memoization solve1(10, 30) takes about 15 seconds, and with it, the solution is almost immediate. (solve1 10 35) takes about 80 seconds.

The Arc implementation is nice, because defmemo creates a function that is automatically memoized.

(defmemo solve1 (digits length)
  (if (> digits length) 0
      (is digits 1) 1
      (* digits (+ (solve1 (- digits 1) (- length 1))
                   (solve1 digits (- length 1))))))

A faster dynamic programming solution

When trying to count the number of things satisfying a condition, it's often easier to figure out how many don't satisfy the condition. We can apply that strategy to this problem. The total number of 12-digit sequences made from 1 through 5 is obviously 5^12. We can just subtract the number of sequences that are missing one or more digits, and we get the answer we want.

How many sequences are missing 1 digit? S(4, 12) is the number of sequences containing at least one of 1 through 4, and by definition missing the digit 5. Of course, we need to consider sequences missing 4, 3, 2, or 1 as well. Likewise, there are S(4, 12) of each of these, so 5 * S(4, 12) sequences missing exactly one digit.

Next, sequences missing exactly 2 of the digits 1 through 5. S(3, 12) is the number of sequences containing at least one of 1 through 3, so they are missing 4 and 5. But we must also consider sequences missing 1 and 2, 1 and 3, and so on. There are 5 choose 2 pairs of digits that can be missing. Thus in total, (5 choose 2) * S(3, 12) sequences are missing exactly 2 of the digits.

Likewise, the number of sequences missing exactly 3 digits is (5 choose 3) * S(2, 12). The number of sequences missing exactly 4 digits is (5 choose 4) * S(1, 12). And obviously there are no sequences missing 5 digits.

Thus, we get the recurrence S(5, 12) = 5^12 - 5*S(4, 12) - 10*S(3, 12) - 10*S(2, 12) - 5*S(1, 12)

In general, we get the recurrence:

The same dynamic programming techniques and memoization can be applied to this. Note that the subproblems all have the same length, so there are many fewer subproblems than in the previous case. That is, instead of a 2-dimensional grid of subproblems, the subproblems are all in a single row.

In Python, the solution is:

memo = {}
def solve2(digits, length):
  if memo.has_key((digits, length)):
    return memo[(digits, length)]
  if digits > length:
    result = 0
  elif digits == 1:
    result = 1
  else:
    result = digits**length
    for i in range(1, digits):
      result -= sol2(i, length)*choose(digits, i)
  memo[(digits, length)] = result
  return result
And in Arc, the solution is similar. Instead of looping, I'm using a map and reduce just for variety. That is, the square brackets define a function to compute the choose and solve term. This is mapped over the range 1 to digits-1. Then the results are summed with reduce. As before, defmemo is used to memoize the results.
(def choose (m n)
  (/ (factorial m) (factorial n) (factorial (- m n))))

(defmemo solve2 (digits length)
  (if (> digits length) 0
      (is digits 1) 1
      (- (expt digits length)
         (reduce +
           (map [* (choose digits _) (solve2 _ length)]
             (range 1 (- digits 1)))))))

The mathematical exponential generating function solution

At this point, I dug out my old combinatorics textbook to figure out how to solve this problem. By using exponential generating functions and a bunch of algebra, we obtain a fairly tidy answer to the original problem:

Since the mathematics required to obtain this answer is somewhat complicated, I've written it up separately in Part II.

Discussion

This problem turned out to be a lot more involved than I thought. It gave me the opportunity to try out some new things in Python and Arc, and make use of my fading knowledge of combinatorics and generating functions.

The Python and Arc code was pretty similar. Python's yield construct was convenient. Arc's automatic memoization was also handy. On the whole, both languages were about equally powerful in solving these problems.

Acknowledgement: equations created through CodeCogs' equation editor