Ken Shirriff's blog

Solving a math problem of Schrödinger with Arc and Python

Lego minifig lecturing on Exponential Generating Functions

I recently received an interesting math problem: how many 12-digit numbers can you make from the digits 1 through 5 if each digit must appear at least once. The problem seemed trivial at first, but it turned out to be more interesting than I expected, so I'm writing it up here. I was told the problem is in a book of Schrodinger's, but I don't know any more details of its origin.

(The problem came to me via Aardvark (vark.com), which is a service where you can send questions to random people, and random people send you questions. Aardvark tries to match up questions with people who may know the answer. I find it interesting to see the questions I receive.)

The brute force solution

I didn't see any obvious solution to the problem, so I figured I'd first use brute force. I didn't use the totally brute force solution - enumerating all 12 digit sequences and counting the valid ones - since it would be pretty slow, so I took a slightly less brute force approach.

If 1 appears a times, 2 appears b times, and 3, 4, and 5 appear c, d, and e times, then the total number of possibilities is 12!/(a!b!c!d!e!) - i.e. the multinomial coeffient.

Then we just need to sum across all the different combinations of a through e. a has to be at least 1, and can be at most 8 (in order to leave room for the other 4 digits). Then b has to be at least 1 and at most 9-a, and so on. Finally, e is whatever is left over.

The Python code is straightforward, noting that range(1, 9) counts from 1 to 8:

from math import factorial
total = 0
for a in range(1, 8+1):
  for b in range(1, 9-a+1):
    for c in range(1, 10-a-b+1):
      for d in range(1, 11-a-b-c+1):
        e = 12-a-b-c-d
        total += (factorial(12) / factorial(a) / factorial(b)
          / factorial(c) / factorial(d) / factorial(e))
print total

Or, in Arc:

(def factorial (n)
  (if (<= n 1) 1
     (* n (factorial (- n 1)))))

(let total 0
  (for a 1 8
    (for b 1 (- 9 a)
      (for c 1 (- 10 a b)
        (for d 1 (- 11 a b c)
          (let e (- 12 a b c d)
            (++ total (/ (factorial 12) (factorial a) (factorial b)
              (factorial c) (factorial d) (factorial e))))))))
  total)

Either way, we quickly get the answer 165528000.

The generalized brute force solution

The above solution is somewhat unsatisfying, since if we want to know how many 11 digits numbers can be made from at least one of 1 through 6, for instance, there's no easy way to generalize the code.

We can improve the solution by writing code to generate the vectors of arbitrary digits and length. In Python, we can use yield, which magically gives us the vectors one at a time. The vecs function recursively generates the vectors that sum to <= t. The solve routine adds the last entry in the vector to make the sum exactly equal the length. My multinomial function is perhaps overly fancy, using reduce to compute the factorials of all the denominator terms and multiply them together in one go.

from math import factorial

# Generate vectors of digits of length n with sum <= t, and minimum digit 1.
def vecs(n, t):
  if n <= 0:
    yield []
  else:
    for vec in vecs(n-1, t-1):
      for last in range(1, t - sum(vec) + 1):
        yield vec + [last]

def multinomial(n, ks):
  return factorial(n) / reduce(lambda prod, k: prod * factorial(k), ks, 1)
  
# Find number of sequences of digits 1 to n, of length len,
# with each digit appearing at least once
def solve(n, len):
  total = 0
  for vec0 in vecs(n-1, len-1):
    # Make the vector of length n summing exactly to len
    vec = vec0 + [len - sum(vec0)]
    total += multinomial(len, vec)
  return total

Now, solve(5, 12) solves the original problem, and we can easily solve related problems.

In Arc, doing yield would need some crazy continuation code, so I just accumulate the list of vectors with accum and then process it. The code is otherwise similar to the Python code:

(def vecs (n t)
  (if (<= n 0) '(())
    (accum accfn
      (each vec (vecs (- n 1) (- t 1))
       (for last 1 (- t (reduce + vec))
          (accfn (cons last vec)))))))

(def multinomial (n ks)
  (/ (factorial n) (reduce * (map factorial ks))))

(def solve (n len)
  (let total 0
    (each vec0 (vecs (- n 1) (- len 1))
        (++ total (multinomial len
                    (cons (- len (reduce + vec0)) vec0))))
    total))

This solution will solve our original problem quickly, but in general it's exponential, so solving something like n = 10 and length = 30 gets pretty slow.

The dynamic programming solution

By thinking about the problem a bit more, we can come up with a simpler solution. Let's take all the sequences of 1 through 5 of length 12 with each number appearing at least once, and call this value S(5, 12). How can we recursively find this value? Well, take a sequence and look at the last digit, say 5. Either 5 appears in the first 11 digits, or it doesn't. In the first case, the first 11 digits form a sequence of 1 through 5 of length 11 with each number appearing at least once. In the second case, the first 11 digits form a sequence of 1 through 4 of length 11 with each number appearing at least once. So we have S(5, 11) sequences of the first case, and S(4, 11) of the second case. We assumed the last digit was a 5, but everything works the same if it was a 1, 2, 3, or 4; there are 5 possibilities for the last digit.

The above shows that S(5, 12) = 5 * (S(5, 11) + S(4, 11)). In general, we have the nice recurrence S(n, len) = n * (S(n, len-1) + S(n-1, len-1)). Also, if you only have a single digit, there's only one solution, so S(1, len) = 1. And if you have more digits than length to put them in, there are clearly no solutions, so S(n, len) = 0 if n > len. We can code this up in a nice recursive solution with memoization:

memo = {}
def solve1(digits, length):
  if memo.has_key((digits, length)):
    return memo[(digits, length)]
  if digits > length:
    result = 0
  elif digits == 1:
    result = 1
  else:
    result = digits * (solve1(digits-1, length-1) + solve1(digits, length-1))
  memo[(digits, length)] = result
  return result

An interesting thing about this solution is it breaks down each function call into two simpler function calls. Each of these turns into two simpler function calls, and so forth, until it reaches the base case. This results in an exponential number of function calls. This isn't too bad for solving the original problem of length 12, but the run time rapidly gets way too slow.

Note that the actual number of different function evaluations is pretty small, but the same ones keep getting evaluated over and over exponentially often. The traditional way of fixing this is to use dynamic programming. In this approach, the structure of the algorithm is examined and each value is calculated just one, using previously-calculated values. In our case, we can evaluate S(1, 0), S(2, 0), S(3, 0), and so on. Then evaluate S(2, 1), S(2, 2), S(2, 3) using those values. Then evaluate S(3, 2), S(3, 3), and so on. Note that each evaluation only uses values that have already been calculated. For an introduction to dynamic programming, see Dynamic Programming Zoo Tour.

Rather than carefully looping over the sub-problems in the right order to implement a dynamic programming solution, it is much easier to just use memoization. The trick here is once a subproblem is solved, store the answer. If we need the answer again, just use the stored answer rather than recomputing it. Memoization is a huge performance win. Without memoization solve1(10, 30) takes about 15 seconds, and with it, the solution is almost immediate. (solve1 10 35) takes about 80 seconds.

The Arc implementation is nice, because defmemo creates a function that is automatically memoized.

(defmemo solve1 (digits length)
  (if (> digits length) 0
      (is digits 1) 1
      (* digits (+ (solve1 (- digits 1) (- length 1))
                   (solve1 digits (- length 1))))))

A faster dynamic programming solution

When trying to count the number of things satisfying a condition, it's often easier to figure out how many don't satisfy the condition. We can apply that strategy to this problem. The total number of 12-digit sequences made from 1 through 5 is obviously 5^12. We can just subtract the number of sequences that are missing one or more digits, and we get the answer we want.

How many sequences are missing 1 digit? S(4, 12) is the number of sequences containing at least one of 1 through 4, and by definition missing the digit 5. Of course, we need to consider sequences missing 4, 3, 2, or 1 as well. Likewise, there are S(4, 12) of each of these, so 5 * S(4, 12) sequences missing exactly one digit.

Next, sequences missing exactly 2 of the digits 1 through 5. S(3, 12) is the number of sequences containing at least one of 1 through 3, so they are missing 4 and 5. But we must also consider sequences missing 1 and 2, 1 and 3, and so on. There are 5 choose 2 pairs of digits that can be missing. Thus in total, (5 choose 2) * S(3, 12) sequences are missing exactly 2 of the digits.

Likewise, the number of sequences missing exactly 3 digits is (5 choose 3) * S(2, 12). The number of sequences missing exactly 4 digits is (5 choose 4) * S(1, 12). And obviously there are no sequences missing 5 digits.

Thus, we get the recurrence S(5, 12) = 5^12 - 5*S(4, 12) - 10*S(3, 12) - 10*S(2, 12) - 5*S(1, 12)

In general, we get the recurrence:
$S(d, l) = d^l - \sum_{i=1}^{d-1} \binom{n}{i} S(i,l)$

The same dynamic programming techniques and memoization can be applied to this. Note that the subproblems all have the same length, so there are many fewer subproblems than in the previous case. That is, instead of a 2-dimensional grid of subproblems, the subproblems are all in a single row.

In Python, the solution is:

memo = {}
def solve2(digits, length):
  if memo.has_key((digits, length)):
    return memo[(digits, length)]
  if digits > length:
    result = 0
  elif digits == 1:
    result = 1
  else:
    result = digits**length
    for i in range(1, digits):
      result -= sol2(i, length)*choose(digits, i)
  memo[(digits, length)] = result
  return result

And in Arc, the solution is similar. Instead of looping, I'm using a map and reduce just for variety. That is, the square brackets define a function to compute the choose and solve term. This is mapped over the range 1 to digits-1. Then the results are summed with reduce. As before, defmemo is used to memoize the results.

(def choose (m n)
  (/ (factorial m) (factorial n) (factorial (- m n))))

(defmemo solve2 (digits length)
  (if (> digits length) 0
      (is digits 1) 1
      (- (expt digits length)
         (reduce +
           (map [* (choose digits _) (solve2 _ length)]
             (range 1 (- digits 1)))))))

The mathematical exponential generating function solution

At this point, I dug out my old combinatorics textbook to figure out how to solve this problem. By using exponential generating functions and a bunch of algebra, we obtain a fairly tidy answer to the original problem:
$5^{12} - 5\times 4^{12} + 10\times 3^{12} - 10\times 2^{12} + 5$
Since the mathematics required to obtain this answer is somewhat complicated, I've written it up separately in Part II.

Discussion

This problem turned out to be a lot more involved than I thought. It gave me the opportunity to try out some new things in Python and Arc, and make use of my fading knowledge of combinatorics and generating functions.

The Python and Arc code was pretty similar. Python's yield construct was convenient. Arc's automatic memoization was also handy. On the whole, both languages were about equally powerful in solving these problems.

Acknowledgement: equations created through CodeCogs' equation editor

IPv6 killed my computer: Adventures in IPv6

Lego Minifig examining IPv6 configuration

You may have heard that the Internet is running out of IP addresses and disaster will strike unless everyone switches over to IPv6. This may be overhyped, since civilization continues even though the last block of addresses was given out on February 3, 2011.

However, with World IPv6 Day fast approaching (June 8), I figured this was a good time to try out IPv6 on my home network. I was also motivated by Hurricane Electric's IPv6 Certification (which is a "fun and educational" certification, not a "study thick books of trivia" certification).

This article describes my efforts to run IPv6 on my Windows 7 64-bit machine, and future articles will describe other IPv6 adventures.

IPv6 tunnels

If your ISP provides IPv6, then using IPv6 is easy. Comcast, for instance, is rolling out some IPv6 support. Unfortunately, AT&T (which I have) doesn't provide IPv6 and is lagging other providers in their future IPv6 plans.

The alternative is an IPv6 tunnel, which lets your computer connect to IPv6 sites over a normal IP connection. The number of different IPv6 tunnel mechanisms is very confusing: 6to4, Teredo, AYIYA, and 6in4, to name a few.

A Hacker News thread recommended Hurricane Electric's free tunnel, so I tried that. Unfortunately that tunnel requires IP protocol 41, and I quickly found that my 2Wire AT&T router inconveniently blocks protocol 41.

SixXS IPv6 tunnel on Windows 7 64-bit

Next, I tried SixXS's free IPv6 tunnel service, which doesn't require protocol 41 and promises IPv6 in 10 easy steps. This went well until Step 5, and then things got complicated.

This section gets into the gory details of setting up the tunnel; it's only to help people who are actually setting up a tunnel, and can be ignored by most readers :-)

Eventually I figured that for Windows 7 64-bit, you should ignore most of the 10 easy steps. Instead, get the Tap driver by installing OpenVPN according to the Vista64 instructions. Then go to the Vista instructions and follow all the mysterious steps including the obscure Vista parameter setup commands. Finally, use the AICCU command line, not the GUI.

If you encounter these errors, you're probably using the wrong Tap driver:

tapinstall.exe failed.
[warning] Error opening registry key: SYSTEM\CurrentControlSet\Control\Class\{4D
36E972-E325-11CE-BFC1-08002BE10318}\Properties (t1)
[warning] Found = 0, Count = 0
[error] [tun-start] TAP-Win32 Adapter not configured properly...

Finally, I was able to get my tunnel running with aiccu-2008-03-15-windows-console.exe start.

Once I had the tunnel running, my computer had an IPv6 address, it could talk to other IPv6 addresses, and my computer could be accessed via IPv6.

So now that I have IPv6 running on my computer, what can I do with it? Well, that's one of the problems. There isn't a whole lot you can do with IPv6 that you couldn't do before. You can connect to ipv6.google.com. You can see the animated kame turtle instead of the static turtle. You can see your IPv6 address at whatismyipv6address.com. But there's no IPv6 killer app.

Overall, the immediate benefit of running IPv6 is pretty minimal, which I think is a big barrier to adoption. As I discussed in a surprisingly popular blog post, the chance of adoption of a new technology depends on the ratio of perceived crisis to perceived pain of adoption. Given a high level of pain to run IPv6 and a very low benefit - any potential crisis is really someone else's crisis - I expect the rate of adoption to be very slow until people are forced to switch.

Getting IPv6 running reminds me a lot of TCP/IP circa 1992, when using TCP/IP required a lot of configuration, tunneling (over a serial line) with SLIP, and using immature software stacks such as Winsock. And most of what you wanted to do back then didn't even need TCP/IP. It will be interesting to see how long it takes IPv6 to get to the point where it just works and nobody needs to think about it. (None of this is meant as a criticism of SixXS or Hurricane Electric, by the way, who are providing useful free services. AT&T on the other hand...)

Running IPv6 did let me gain another level in the Hurricane Electric certification, though.

IPv6 kills my computer

The day after setting up IPv6, I turned on my computer. The power light came on, and that was all, no Windows, no BIOS screen, and not even a startup beep. I tried a different monitor with no luck. I opened up the computer, re-seated the memory modules, and poked at anything else that might be loose. I powered up my system again, and everything worked, fortunately enough.

The pedant might insist that IPv6 didn't kill my computer since any IPv6 issues wouldn't show up until after the OS boots, and IPv6 couldn't possibly have stopped my computer from even booting to the BIOS. However, I remain suspicious. Just a coincidence that I set up IPv6 and my computer dies? I don't think so. What about the rest of you? Anyone's computer get killed by IPv6?

Sheevaplug as IPv6 router

A few weeks later, I decided to try using my Sheevaplug as an IPv6 router for my home network, so all my machines could access IPv6 through my Sixxs tunnel. Although I used a Sheevaplug, these steps should work for other Linux systems too. The following are the gory details in the hope that they may be helpful to someone.

I started with the page Sixxs Sheevaplug, which describes how to set up the Sheevaplug with IPv6. Unfortunately, it didn't quite work. It suggests using a program called ufw, which stands for Uncomplicated FireWall. Unfortunately, I found it to be anything but uncomplicated. The first problem was that the default ufw version 0.29 didn't work for me - it failed to create the necessary chains with the inconvenient side-effect of blocking all ports; version 0.30 avoids this problem. I ended up using ip6tables directly, which was much easier - see instructions here.

The steps I took to setting up the Sheevaplug:

Request a subnet from Sixxs. In the examples below, I use: 2001:1938:2a9::/48. Important: use your subnet, not mine.
aiccu start to start the tunnel
sysctl -w net.ipv6.conf.all.forwarding=1 to enable IPv6 routing.

Configure /etc/radvd.conf as follows:

interface eth0
{
 AdvSendAdvert on;
 AdvLinkMTU 1280;
 prefix 2001:1938:2a9::/64
 {
               AdvOnLink on;
               AdvAutonomous on;
 };
};

Note that your subnet is /48, but you need to use /64 in the file.

ip -6 ro add 2001:1938:2a9::/48 dev lo
ip -6 add add 2001:1938:2a9::1/64 dev eth0
I'm not sure why adding that address is necessary, but routing just doesn't work without it.
service radvd start
radvd is the Router Advertisement Daemon, which lets other computers on your network know about your IPv6 tunnel.

At this point, you should be able to start up a Windows box, and it will automatically get an IPv6 address on the subnet. You should then be able to access ipv6.google.com or ip6.me.

Debugging: if the IPv6 subnet doesn't work, many things can be wrong. First, make sure your host machine can access IPv6. Assuming your client machine runs Windows, you can do "ping -6 ipv6.google.com" or "tracert -6 ipv6.google.com". With "ipconfig /all" you should see a "Temporary IPv6 Address" on the subnet, and a "Default Gateway" pointing at your Sheevaplug. If tracert gets one step and then hangs, try disabling your firewall on the Sheevaplug. If these tips don't work, then I guess you'll end up spending hours with tcpdump like I did :-(

Also read Part 2: IPv6 Web Serving with Arc or Python.

Solving edge-match puzzles with Arc and backtracking

I recently saw a tricky nine piece puzzle and wondered if I could use Arc to solve it. It turned out to be straightforward to solve using standard backtracking methods. This article describes how to solve the puzzle using the Arc language.

The puzzle consists of nine square titles. The tiles must be placed so the pictures match up where two tiles meet. As the picture says, "Easy to play, but hard to solve." It turns out that this type of puzzle is called an edge-matching puzzle, and is NP-complete in general. For dozens of examples of these puzzles, see Rob's Puzzle Page.

Backtracking is a standard AI technique to find a solution to a problem step by step. If you reach a point where a solution is impossible, you backtrack a step and try the next possibility. Eventually, you will find all possible solutions. Backtracking is more efficient than brute-force testing of all possible solutions because you abandon unfruitful paths quickly.

To solve the puzzle by backtracking, we put down the first tile in the upper left. We then try a second tile in the upper middle. If it matches, we put it there. Then we try a third tile in the upper right. If it matches, we put it there. The process continues until a tile doesn't match, and then the algorithm backtracks. For instance, if the third tile doesn't match, we try a different third tile and continue. Eventually, after trying all possible third tiles, we backtrack and try a different second tile. And after trying all possible second tiles, we'll backtrack and try a new first tile. Thus, the algorithm will reach all possible solutions, but avoids investigating arrangements that can't possibly work.

The implementation

I represent each tile with four numbers indicating the pictures on each side. I give a praying mantis the number 1, a beetle 2, a dragonfly 3, and an ant 4. For the tail of the insect, I give it a negative value. Each tile is then a list of (top left right bottom). For instance, the upper-left tile is (2 1 -3 3). With this representation, tiles match if the value on one edge is the negative of the value on the other edge. I can then implement the list of tiles:

(with (mantis 1 beetle 2 dragonfly 3 ant 4)
  (= tiles (list
    (list beetle mantis (- dragonfly) dragonfly)
    (list (- beetle) dragonfly mantis (- ant))
    (list ant (- mantis) beetle (- beetle))
    (list (- dragonfly) (- ant) ant mantis)
    (list ant (- beetle) (- dragonfly) mantis)
    (list beetle (- mantis) (- ant) dragonfly)
    (list (- ant) (- dragonfly) beetle (- mantis))
    (list (- beetle) ant mantis (- dragonfly))
    (list mantis beetle (- dragonfly) ant))))

Next, I create some helper functions to access the edges of a tile, convert the integer to a string, and to prettyprint the tiles.

;; Return top/left/right/bottom entries of a tile
(def top (l) (l 0))
(def left (l) (l 1))
(def right (l) (l 2))
(def bottom (l) (l 3))

;; Convert an integer tile value to a displayable value
(def label (val)
  ((list "-ant" "-dgn" "-btl" "-man" "" " man" " btl" " dgn" " ant")
   (+ val 4)))

;; Print the tiles nicely
(def prettyprint (tiles (o w 3) (o h 3))
  (for y 0 (- h 1)
    (for part 0 4
      (for x 0 (- w 1)
        (withs (n (+ x (* y w)) tile (tiles n))
          (if
     (is part 0)
       (pr " ------------- ")
     (is part 1)
       (pr "|    " (label (top tile)) "     |")
     (is part 2)
       (pr "|" (label (left tile)) "    " (label (right tile)) " |")
     (is part 3)
       (pr "|    " (label (bottom tile)) "     |")
     (is part 4)
       (pr " ------------- "))))
      (prn))))

The prettyprint function uses optional arguments for width and height: (o w 3). This sets the width and height to 3 by default but allows it to be modified if desired. The part loop prints each tile row is printed as five lines. Now we can print out the starting tile set and verify that it matches the picture. I'll admit it's not extremely pretty, but it gets the job done:

arc> (prettyprint tiles)
 -------------  -------------  ------------- 
|     btl     ||    -btl     ||     ant     |
| man    -dgn || dgn     man ||-man     btl |
|     dgn     ||    -ant     ||    -btl     |
 -------------  -------------  ------------- 
 -------------  -------------  ------------- 
|    -dgn     ||     ant     ||     btl     |
|-ant     ant ||-btl    -dgn ||-man    -ant |
|     man     ||     man     ||     dgn     |
 -------------  -------------  ------------- 
 -------------  -------------  ------------- 
|    -ant     ||    -btl     ||     man     |
|-dgn     btl || ant     man || btl    -dgn |
|    -man     ||    -dgn     ||     ant     |
 -------------  -------------  -------------

Next is the meat of the solver. The first function is matches, which takes a grid of already-positioned tiles and a new tile, and tests if a particular edge of the new tile matches the existing tiles. (The grid is represented simply as a list of the tiles that have been put down so far.) This function is where all the annoying special cases get handled. First, the new tile may be along an edge, so there is nothing to match against. Second, the grid may not be filled in far enough for there to be anything to match against. Finally, if there is a grid tile to match against, and the value there is the negative of the new tile's value, then it matches. One interesting aspect of this function is that functions are passed in to it to select which edges (top/bottom/left/right) to match.

;; Test if one edge of a tile will fit into a grid of placed tiles successfully
;;  grid is the grid of placed tiles as a list of tiles e.g. ((1 3 4 2) nil (-1 2 -1 1) ...)
;;  gridedge is the edge of the grid cell to match (top/bottom/left/right)
;;  gridx is the x coordinate of the grid tile
;;  gridy is the y coordinate of the grid tile
;;  newedge is the edge of the new tile to match (top/bottom/left/right)
;;  newtile is the new tile e.g. (1 2 -1 -3)
;;  w is the width of the grid
;;  h is the height of the grid
(def matches (grid gridedge gridx gridy newedge newtile w h)
  (let n (+ gridx (* gridy w))
    (or
      (< gridx 0) ; nothing to left of tile to match, so matches by default
      (< gridy 0) ; tile is at top of grid, so matches
      (>= gridx w) ; tile is at right of grid, so matches
      (>= gridy h) ; tile is at bottom of grid, so matches
      (>= n (len grid)) ; beyond grid of tiles, so matches
      (no (grid n)) ; no tile placed in the grid at that position
      ; Finally, compare the two edges which should be opposite values
      (is (- (gridedge (grid n))) (newedge newtile)))))

With that method implemented, it's easy to test if a new tile will fit into the grid. We simply test that all four edges match against the existing grid. We don't need to worry about the edges of the puzzle, because the previous method handles them:

;; Test if a tile will fit into a grid of placed tiles successfully
;;   grid is the grid of tiles
;;   newtile is the tile to place into the grid
;;   (x, y) is the position to place the tile
(def is_okay (grid newtile x y w h)
    (and 
      (matches grid right (- x 1) y left newtile w h)
      (matches grid left (+ x 1) y right newtile w h)
      (matches grid top x (+ y 1) bottom newtile w h)
      (matches grid bottom x (- y 1) top newtile w h)))

Now we can implement the actual solver. The functions solve1 and try recursively call each other. solve1 calls try with each candidate tile in each possible orientation. If the tile fits, try updates the grid of placed tiles and calls solve1 to continue solving. Otherwise, the algorithm backtracks and solve1 tries the next possible tile. My main problem was accumulating all the solutions properly; on my first try, the solutions were wrapped in 9 layers of parentheses! One other thing to note is the conversion from a linear (0-8) position to an x/y grid position.

A couple refactorings are left as an exercise to the reader. The code to try all four rotations of the tile is a bit repetitive and could probably be cleaned up. More interesting would be to turn the backtracking solver into a general solver, with the puzzle just one instance of a problem.

;; grid is a list of tiles already placed
;; candidates is a list of tiles yet to be placed
;; nextpos is the next position to place a tile (0 to 8)
;; w and h are the dimensions of the puzzle
(def solve1 (grid candidates nextpos w h)
  (if
    (no candidates)
      (list grid) ; Success!
    (mappend idfn (accum addfn ; Collect results and flatten
      (each candidate candidates
        (addfn (try grid candidate (rem candidate candidates) nextpos w h))
        (addfn (try grid (rotate candidate) (rem candidate candidates) nextpos w h))
        (addfn (try grid (rotate (rotate candidate)) (rem candidate candidates) nextpos w h))
        (addfn (try grid (rotate (rotate (rotate candidate))) (rem candidate candidates) nextpos w h)))))))

; Helper to append elt to list
(def append (lst elt) (join lst (list elt)))

;; Try adding a candidate tile to the grid, and recurse if successful.
;; grid is a list of tiles already placed
;; candidate is the tile we are trying
;; candidates is a list of tiles yet to be placed (excluding candidate)
;; nextpos is the next position to place a tile (0 to 8)
;; w and h are the dimensions of the puzzle
(def try (grid candidate candidates nextpos w h)
  (if (is_okay grid candidate (mod nextpos w) (trunc (/ nextpos w)) w h)
    (solve1 (append grid candidate) candidates (+ nextpos 1) w h)))

The final step is a wrapper function to initialize the grid:

(def solve (tiles (o w 3) (o h 3))
  (solve1 nil tiles 0 w h))

With all these pieces, we can finally solve the problem, and obtain four solutions (just rotations of one solution):

arc> (solve tiles)
(((1 -2 -4 3) (2 4 1 -3) (4 -1 2 -2) (-3 1 4 -2) (3 -4 -1 2) (2 1 -3 3) (2 -4 -1 -3) (-2 1 4 -3) (-3 -4 4 1))
((2 4 -2 -1) (-3 2 3 1) (4 -3 1 -4) (1 2 -3 4) (-1 3 2 -4) (4 -2 -3 1) (-4 1 3 -2) (4 -3 -2 1) (-1 2 -3 -4))
((1 4 -4 -3) (-3 4 1 -2) (-3 -1 -4 2) (3 -3 1 2) (2 -1 -4 3) (-2 4 1 -3) (-2 2 -1 4) (-3 1 4 2) (3 -4 -2 1))
((-4 -3 2 -1) (1 -2 -3 4) (-2 3 1 -4) (1 -3 -2 4) (-4 2 3 -1) (4 -3 2 1) (-4 1 -3 4) (1 3 2 -3) (-1 -2 4 2)))
arc> (prettyprint (that 0))
 -------------  -------------  ------------- 
|     man     ||     btl     ||     ant     |
|-btl    -ant || ant     man ||-man     btl |
|     dgn     ||    -dgn     ||    -btl     |
 -------------  -------------  ------------- 
 -------------  -------------  ------------- 
|    -dgn     ||     dgn     ||     btl     |
| man     ant ||-ant    -man || man    -dgn |
|    -btl     ||     btl     ||     dgn     |
 -------------  -------------  ------------- 
 -------------  -------------  ------------- 
|     btl     ||    -btl     ||    -dgn     |
|-ant    -man || man     ant ||-ant     ant |
|    -dgn     ||    -dgn     ||     man     |
 -------------  -------------  -------------

I've used gimp on the original image to display the solution. I've labeled the original tiles A-I so you can see how the solution relates to the original image. Using Arc to display a solution as an image is left as an exercise to the reader :-) But seriously, this is where using a language with extensive libraries would be beneficial, such as Python's PIL imaging library.

solution

Theoretical analysis

I'll take a quick look at the theory of the puzzle. The tiles can be placed in 9! (9 factorial) different locations, and each tile can be oriented 4 ways, for a total of 9! * 4^9 possible arrangements of the tiles, which is about 95 billion combinations. Clearly this puzzle is hard to solve by randomly trying tiles.

arc> (* (apply * (range 1 9)) (expt 4 9))
95126814720

We can do a back of the envelope calculation to see how many solutions we can expect. If you put the tiles down randomly, there are 12 edge constraints that must be satisfied. Since only one of the 8 possibilities matches, the chance of all the edges matching randomly is 1 in 8^12 or 68719476736. Dividing this into the 95 billion possible arrangements yields 1.38 solutions for an arbitrary random puzzle. (If this number were very large, then it would be hard to create a puzzle with only one solution.)

We can test this calculation experimentally by seeing how many solutions there are are for a random puzzle. First we make a function to create a random puzzle, consisting of 9 tiles, each with 4 random values. Then we solve 100 of these:

arc> (def randtiles ()
  (n-of 9 (n-of 4 (rand-elt '(1 2 3 4 -1 -2 -3 -4)))))
arc> (n-of 100 (len (solve (randtiles))))
(0 0 0 8 0 0 0 8 0 0 0 0 0 0 0 0 0 0 0 4 0 0 0 0 0 0 0 0 0 0 8 0 0 4 0
0 0 4 8 0 8 0 0 0 0 0 0 0 32 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 0 16 0
0 0 32 8 0 0 0 0 0 0 0 0 12 0 0 0 0 0 0 0 0 0 0 0 8 0 0 4 0 0 32)
arc> (apply + that)
196

Out of 100 random puzzles, there are 196 solutions, which is close to the 1.38 solutions per puzzle estimate above. (that is an obscure Arc variable that refers to the previous result.) Note that only 16% of the puzzles have solutions, though. Part of the explanation is that solutions always come in groups of 4, since the entire puzzle can be rotated 90 degrees into four different orientations. Solving 100 puzzles took 146 seconds, by the way.

Another interesting experiment is to add a counter to try to see how many tile combinations the solver actually tries. The result is 66384, which is much smaller than the 95 billion potential possibilities. This suggests that the puzzle is solvable manually by trial-and-error with backtracking; at a second per tile, it would probably take you 4.6 hours to get the first solution.

Conclusions

Solving the puzzle with Arc was easier than I expected, and I ran into only minor problems along the way. The code is available as puzzle.arc.

64-bit RC6 codes, Arduino, and Xbox remotes

I've extended my Arduino IRremote library to support RC6 codes up to 64 bits long. Now your Arduino can control your Xbox by acting as an IR remote control. (The previous version of my library only supported 32 bit codes, so it didn't work with the 36-bit Xbox codes.) Details of the IRremote library are here.

Note: I don't actually have an Xbox to test this code with, so please let me know if the code works for you.

Download

This code is in my "experimental" branch since I'd like to get some testing before releasing it to the world. To use it:

Download the IRremote library zip file from GitHub from the Arduino-IRremote experimental branch.
Unzip the download
Move/rename the shirriff-Arduino-IRremote-nnnn directory to arduino-000nn/libraries/IRremote.

Sending an Xbox code

The following program simply turns the Xbox on, waits 10 seconds, turns it off, and repeats.

#include <IRremote.h>
IRsend irsend;
unsigned long long OnOff = 0xc800f740cLL;
int toggle = 0;

void sendOnOff() {
  if (toggle == 0) {
    irsend.sendRC6(OnOff, 36);
  } else {
    irsend.sendRC6(OnOff ^ 0x8000, 36);
  }
  toggle = 1 - toggle;
}
   
void setup() {}

void loop() {
  delay(10000);  // Wait 10 seconds
  sendOnOff();  // Send power signal
}

The first important thing to note is that the On/Off code is "unsigned long long", and the associated constant ends in "LL" to indicate that it is a long long. Because a regular long is only 32 bits, a 64-bit long long must be used to hold the 64 bit code. If you try using a regular long, you'll lose the top 4 bits ("C") from the code.

The second thing to notice is the toggle bit (which is explained in more detail below). Every time you send a RC5 or RC6 code, you must flip the toggle bit. Otherwise the receiver will ignore the code. If you want to send the code multiple times for reliability, don't flip the toggle bit until you're done.

Receiving an Xbox code

The following code will print a message on the serial console if it receives an Xbox power code. It will also dump the received value.

#include <IRremote.h>

int RECV_PIN = 11;
IRrecv irrecv(RECV_PIN);
decode_results results;

void setup()
{
  Serial.begin(9600);
  irrecv.enableIRIn(); // Start the receiver
}

void loop() {
  if (irrecv.decode(&results)) {
    if ((results.value & ~0x8000LL) == 0xc800f740cLL) {
      Serial.println("Got an OnOff code");
    }
    Serial.println(results.value >> 32, HEX);
    Serial.println(results.value, HEX);
    irrecv.resume(); // Receive the next value
  }
}

Two things to note in the above code. First, the received value is anded with ~0x8000LL; this clears out the toggle bit. Second, Serial.println is called twice, first to print the top 32 bits, and second to print the bottom 32 bits.

The output when sent power codes multiple times is:

Got an OnOff code
C
800FF40C
Got an OnOff code
C
800F740C
Got an OnOff code
C
...

Note that the received value is split between the two Serial.println calls. Also note that the output oscillates between ...F740C and ...FF40C as the sender flips the toggle bit. This is why the toggle bit must be cleared when looking for a particular value.

The IRremote/IRrecvDump example code has also been extended to display values up to 64 bits, and can be used for testing.

A quick description of RC6 codes

RC6 codes are somewhat more complicated than the usual IR code. They come in 20 or 36 bit varieties (that I know of). The code consists of a leader pulse, a start bit, and then the 20 (or 36) data bits. A 0 bit is sent as a space (off) followed by a mark (on), while a 1 bit is sent as a mark (on) followed by a space (off).

The first complication is the fourth data bit sent (called a trailer bit for some reason), is twice as long as the rest of the bits. The IRremote library handles this transparently.

The second complication is RC5 and RC6 codes use a "toggle bit" to distinguish between a button that is held down and a button that is pressed twice. While a button is held down, a code is sent. When the button is released and pressed a second time, a new code with one bit flipped is sent. On the third press, the original code is sent. When sending with the IRremote library, you must keep track of the toggle bit and flip it each time you send. When receiving with the IRremote library, you will receive one of two different codes, so you must clear out the toggle bit. (I would like the library to handle this transparently, but I haven't figured out a good way to do it.)

For details of the RC6 encoding with diagrams and timings, see SB projects.

The LIRC database

The LIRC project collects IR codes for many different remotes. If you want to use RC6 codes from LIRC, there a few things to know. A typical code is the Xbox360 file, excerpted below:

begin remote
  name  Microsoft_Xbox360
  bits           16
  flags RC6|CONST_LENGTH
  pre_data_bits   21
  pre_data       0x37FF0
  toggle_bit_mask 0x8000
  rc6_mask    0x100000000

      begin codes
          OpenClose                0x8BD7
          XboxFancyButton          0x0B9B
          OnOff                    0x8BF3
...

To use these RC6 code with the Arduino takes a bit of work. First, the codes have been split into 21 bits of pre-data, followed by 16 bits of data. So if you want the OnOff code, you need to concatenate the bits together, to get 37 bits: 0x37ff08bf3. The second factor is the Arduino library provides the first start bit automatically, so you only need to use 36 bits. The third issue is the LIRC files inconveniently have 0 and 1 bits swapped, so you'll need to invert the code. The result is the 36-bit code 0xc800f740c that can be used with the IRremote library.

The rc6_mask specifies which bit is the double-wide "trailer" bit, which is the fourth bit sent (both in 20 and 36 bit RC6 codes). The library handles this bit automatically, so you can ignore it.

The toggle_bit_mask is important, as it indicates which position needs to be toggled every other transmission. For example, to transmit OnOff you will send 0xc800f740c the first time, but the next time you will need to transmit 0xc800ff40c.

More details of the LIRC config file format are at WinLIRC.

Xbox codes

Based on the LIRC file, the following Xbox codes should work with my library:

OpenClose	0xc800f7428	XboxFancyButton	0xc800ff464	OnOff	0xc800f740c
Stop	0xc800ff419	Pause	0xc800f7418	Rewind	0xc800ff415
FastForward	0xc800f7414	Prev	0xc800ff41b	Next	0xc800f741a
Play	0xc800ff416	Display	0xc800f744f	Title	0xc800ff451
DVD_Menu	0xc800f7424	Back	0xc800ff423	Info	0xc800f740f
UpArrow	0xc800ff41e	LeftArrow	0xc800f7420	RightArrow	0xc800ff421
DownArrow	0xc800f741f	OK	0xc800ff422	Y	0xc800f7426
X	0xc800ff468	A	0xc800f7466	B	0xc800ff425
ChUp	0xc800f7412	ChDown	0xc800ff413	VolDown	0xc800ff411
VolUp	0xc800ff410	Mute	0xc800ff40e	Start	0xc800ff40d
Play	0xc800f7416	Enter	0xc800ff40b	Record	0xc800f7417
Clear	0xc800ff40a	0	0xc800f7400	1	0xc800f7401
2	0xc800ff402	3	0xc800f7403	4	0xc800ff404
5	0xc800f7405	6	0xc800ff406	7	0xc800f7407
8	0xc800ff408	9	0xc800f7409	100	0xc800ff41d
Reload	0xc800f741c

Key points for debugging

The following are the main things to remember when dealing with 36-bit RC6 codes:

Any 36-bit hex values must start with "0x" and end with "LL". If you get the error "integer constant is too large for 'long' type", you probably forgot the LL.
You must use "long long" or "unsigned long long" to hold 36-bit values.
Serial.print will not properly print 36-bit values; it will only print the lower 32 bits.
You must flip the toggle bit every other time when you send a RC6 code, or the receiver will ignore your code.
You will receive two different values for each button depending if the sender has flipped the toggle bit or not.

Control your mouse with an IR remote

You can use an IR remote to control your computer's keyboard and mouse by using my Arduino IR remote library and a small microcontroller board called the Teensy. By pushing the buttons on your IR remote, you can steer the cursor around the screen, click on things, and enter digits. The key is the Teensy can simulate a USB keyboard and mouse, and provide input to your computer.

How to do this

I built a simple sketch that uses the remote from my DVD player to move and click the mouse, enter digits, or page up or down. Follow these steps:

The hardware is nearly trivial. Connect a IR detector to the Teensy: detector pin 1 to Teensy input 11, detector pin 2 to ground, and detector pin 3 to +5 volts.
Install the latest version of my IR remote library, which has support for the Teensy.
Download the IRusb sketch.
In the Arduino IDE, select Tools > Board: Teensy 2.0 and select Tools > USB Type: Keyboard and Mouse.
Modify the sketch to match the codes from your remote. Look at the serial console as you push the buttons on your remote, and modify the sketch accordingly. (This is explained in more detail below.)

To reiterate, this sketch won't work on a standard Arduino; you need to use a Teensy.

How the sketch works

The software is straightforward because the Teensy has USB support built in. First, the IR library is initialized to receive IR codes on pin 11:

#include <IRremote.h>

int RECV_PIN = 11;
IRrecv irrecv(RECV_PIN);
decode_results results;

void setup()
{
  irrecv.enableIRIn(); // Start the receiver
  Serial.begin(9600);
}

Next, the decode method is called to receive IR codes. If the hex value for the received code corresponds to a desired button, a USB mouse or keyboard command is sent. If a code is not recognized, it is printed on the serial port. Finally, after receiving a code, the resume method is called to resume receiving IR codes.

int step = 1;
void loop() {
  if (irrecv.decode(&results)) {
    switch (results.value) {
    case 0x9eb92: // Sony up
      Mouse.move(0, -step); // up
      break;
    case 0x5eb92:  // Sony down
      Mouse.move(0, step); // down
      break;
...
    case 0x90b92:  // Sony 0
      Keyboard.print("0");
      break;
...
    default:
      Serial.print("Received 0x");
      Serial.println(results.value, HEX);
      break;
    }
    irrecv.resume(); // Resume decoding (necessary!)
  }
}

You may wonder where the codes such as 0x9eb92 come from. These values are for my Sony DVD remote, so chances are they won't work for your remote. To get the values for your remote, look at the serial console as you press the desired buttons. As long as you have a supported remote type (NEC, Sony, RC5/6), you'll get the hex values to put into the sketch. Simply copy the hex values into the sketch, and perform the desired action.

There are a few details to note. If your remote uses the RC5 or RC6 format, there are actually two different codes assigned to each button, and the remote alternates between them. Push the button twice to see if you'll need to use two different codes. If you want to send a non-ASCII keyboard code, such as Page Down, you'll need to use a slightly more complex set of commands (documentation). For example, the following code sends a Page UP if it receives a RC5 Volume Up from the remote. Note that there are two codes for volume up, and note that KEY_PAGE_UP is sent, followed by 0 (no key).

    case 0x10: // RC5 vol up
    case 0x810:
      Keyboard.set_key1(KEY_PAGE_UP);
      Keyboard.send_now();
      Keyboard.set_key1(0);
      Keyboard.send_now();
      break;

Improvements

My first implementation of the sketch as described above was very easy, but there were a couple displeasing things. The first problem was the mouse movement was either very slow (with a small step size) or too jerky (with a large step size). Second, if you press a number on the remote, the keyboard input rapidly repeats because the IR remote repeatedly sends the IR code, so you end up with "111111" when you just wanted "1".

The solution to the mouse movement is to implement acceleration - as you hold the button down, the mouse moves faster. This was straightforward to implement. The sketch checks if the button code was received within 200ms of the previous code. If so, the sketch speeds up the mouse movement by increasing the step size. Otherwise it resets the step size to 1. The result is that tapping the button gives you fine control by moving the mouse a little bit, while holding the button down lets you zip across the screen:

    if (millis() - lastTime > GAP) {
      step = 1;
    } 
    else if (step > 20) {
      step += 1;
    }

Similarly, to prevent the keyboard action from repeating, we only output keypresses if the press is more then 200ms after the previous. This results in a single keyboard action no matter how long a button is pressed down. The same thing is done to prevent multiple mouse clicks.

 if (millis() - lastTime > GAP) {
        switch (results.value) {
        case 0xd0b92:
          Mouse.click();
          break;
        case 0x90b92:
          Keyboard.print("0");
          break;
...

Now you can control your PC from across the room by using your remote. Thanks to Paul Stoffregen of PJRC for porting my IR remote library to the Teensy and sending me a Teensy for testing.

IRremote library now runs on the Teensy, Arduino Mega, and Sanguino

Thanks to Paul Stoffregen of PJRC, my Arduino IR remote library now runs on a bunch of different platforms, including the Teensy, Arduino Mega, and Sanguino. Paul has details here, along with documentation on the library that I admit is better than mine.

I used my new IRremote test setup to verify that the library works fine on the Teensy. I haven't tested my library on the other platforms because I don't have the hardware so let me know if you have success or run into problems.

Download

The latest version of the IRremote library with the multi-platform improvements is on GitHub. To download and install the library:

Download the IRremote library zip file from GitHub.
Unzip the download
Move/rename the shirriff-Arduino-IRremote-nnnn directory to arduino-000nn/libraries/IRremote.

Thanks again to Paul for adding this major improvement to my library and sending me a Teensy to test it out. You can see from the picture that the Teensy provides functionality similar to the Arduino in a much smaller package that is also breadboard-compatible; I give it a thumbs-up.

Testing the Arduino IR remote library

I wrote an IR remote library for the Arduino (details) that has turned out to be popular. I want to make sure I don't break things as I improve the library, so I've created a test suite that uses a pair of Arduinos: one sends data and the other receives data. The receiver verifies the data, providing an end-to-end test that the library is working properly.

Overview of the test

The first Arudino repeatedly sends a bunch of IR codes to the second Arduino. The second Arduino verifies that the received code is what is expected. If all is well, the second Arduino flashes the LED for each successful code. If there is an error, the second Arudino's LED illuminates for 5 seconds. The test cycle repeats forever. Debugging information is output to the second Arduino's serial port, which is helpful for tracking down the cause of errors.

Hardware setup

The test hardware is pretty simple: one Arduino transmits, and one Arduino receives. An IR LED is connected to pin 3 of the first Arduino to send the IR code. An IR detector is connected to pin 11 of the second Arduino to receive the IR code. A LED is connected to pin 3 of the second Arduino to provide the test status.
schematic of test setup

Details of the test software

One interesting feature of this test is the same sketch runs on the sending Arduino and the receiving Arduino. The test looks for an input on pin 11 to decide if it is the receiver:

void setup()
{
  Serial.begin(9600);
  // Check RECV_PIN to decide if we're RECEIVER or SENDER
  if (digitalRead(RECV_PIN) == HIGH) {
    mode = RECEIVER;
    irrecv.enableIRIn();
    pinMode(LED_PIN, OUTPUT);
    digitalWrite(LED_PIN, LOW);
    Serial.println("Receiver mode");
  } 
  else {
    mode = SENDER;
    Serial.println("Sender mode");
  }
}

Another interesting feature is the test suite is expressed very simply:

  test("SONY4", SONY, 0x12345, 20);
  test("SONY5", SONY, 0x00000, 20);
  test("SONY6", SONY, 0xfffff, 20);
  test("NEC1", NEC, 0x12345678, 32);
  test("NEC2", NEC, 0x00000000, 32);
  test("NEC3", NEC, 0xffffffff, 32);
...

Each test call has a debugging string, the type of code to send/receive, the value to send/receive, and the number of bits.

On the sender, the testmethod sends the code, while on the receiver, the method verifies that the proper code is received. The SENDER code calls the appropriate send method based on the type, and then delays before the next test. The RECEIVER code waits for a code. If it's correct, it flashes the LED. Otherwise, it sets the state to ERROR.

void test(char *label, int type, unsigned long value, int bits) {
  if (mode == SENDER) {
    Serial.println(label);
    if (type == NEC) {
      irsend.sendNEC(value, bits);
    } 
    else if (type == SONY) {
...
    }
    delay(200);
  } 
  else if (mode == RECEIVER) {
    irrecv.resume(); // Receive the next value
    unsigned long max_time = millis() + 30000;
    // Wait for decode or timeout
    while (!irrecv.decode(&results)) {
      if (millis() > max_time) {
        mode = ERROR; // timeout
        return;
      }
    }
    if (type == results.decode_type && value == results.value && bits == results.bits) {
      // flash LED
    } 
    else {
      mode = ERROR;
    }
  }
}

The trickiest part of the code is synchronizing the sender and the receiver. This happens in loop(). The receiver waits for 1 second without any transmission, while the sender pauses for 2 seconds after each time through the tests. Thus, the receiver will wait while the sender is running through tests, and then will start listening just before the sender starts the next cycle of tests. One other thing to point out is if there is an error, the receiver will skip through all the remaining tests, light the LED to indicate the error, and then will wait to sync up again. This avoids the problem of one bad test getting the receiver permanently out of sync; the receiver is able to re-sync and continue successfully after a failed test.

void loop() {
  if (mode == SENDER) {
    delay(2000);  // Delay for more than gap to give receiver a better chance to sync.
  } 
  else if (mode == RECEIVER) {
    waitForGap(1000);
  } 
  else if (mode == ERROR) {
    // Light up for 5 seconds for error
    mode = RECEIVER;  // Try again
    return;
  }

The test also includes some raw mode tests. These are a bit more complicated, since I want to test the various combinations of sending and receiving in raw mode.

Download and running

I'm gradually moving my development to GitHub at https://github.com/shirriff/Arduino-IRremote.

The code fragments above have been slightly abbreviated; the full code for the test sketch is here.

To download the library and try out the two-Arduino test:

Download the IRremote library zip file.
Unzip the download
Move/rename the shirriff-Arduino-IRremote-nnnn directory to arduino-000nn/libraries/IRremote. The test sketch is in examples/IRtest2.

To run the test, install the sketch on two Arduinos. The test should automatically start running. Note that it is a bit tricky to use two Arduinos at once. They will probably get assigned different serial ports, and you can switch ports using the Tools menu. If you get confused, you can plug one Arduino in at a time, and then you can be sure about which one is getting installed.

My plan is to do more development on the library, now that I have a reasonably solid test suite and I can be more confident that I don't break things. Let me know if there are specific features you'd like.

Thanks go to SparkFun for giving me the second Arduino that made this test possible.