The exponential generating function solution
My old combinatorics textbook describes how to solve all sorts of combinatorial problems. Section 3.2.13 has a somewhat similar sequence problem: how many sequences of 1,2, or 3 of length l, in which no symbol occurs exactly p times. This approach can be generalized to the problem we're solving.For our specific problem, we want to exclude solutions where a digit appears 0 times, but we'll stick with the generalized approach of excluding solutions where a digit appears exactly p times, and then we'll set p to 0 later. We'll let d be the range of each digit (e.g. if the digits are 1 through 5, then d=1). And n will be the total number of digits in the sequence (e.g. n=12 in our original problem).
In these polynomials, x is just a placeholder, not a genuine variable; you never actually evaluate the polynomial. It's almost like magic the way the answer pops out of the formulas.
One source of more information on generating functions is Cut the knot: generating functions.
First, we create the generating function for a single digit, say 1. If the digit appears zero times, we represent it by term x^0. If it appears 1 time, we have x^1. If it appears twice, we have x^2/2!. (We divide by 2! because there are 2! equivalent ways the digit can appear twice.) Likewise, if it appears three times, we represent it by x^3/3!. We end up with x^0 + x^1 + x^2/2! + x^3/3! + x^4/4! + ... Now for the crazy part. This is the series expansion of e^x, so we replace this with e^x. Finally we subtract x^p/p!, which is the "forbidden" case. We end up with the generating function e^x - x^p/p!.
If we have d choices for each digit, we multiply together d copies of the generating function, which is of course raising it to the power of d:
To get our solution, if we want to find how many solutions are n digits long, we simply extract the coefficient of x^n/n! from the above generating function, and that gives us the value we want:
Next step: how do we evaluate the above? First, let's expand the right half using the standard binomial formula:
Now we'll take a quick detour through some generating function facts, which I will present without proof.
Important generating function facts
Extracting the coefficient of x^n in an exponential gives you:
Unless n=0. Then you're looking for the coefficient of x^n in 1. This is simply 1 if n=0, and 0 otherwise.
Or unless n<0. Then the result is 0.
Second important generating function fact: if you're looking for the coefficient of, say, x^9 in an expression multiplied by x^3, then you can look instead for the coefficient of x^6 in the expression, if you use the appropriate scaling:
Solving one term of the summation
Now let's extract our desired coefficient from one term of the summation.
If n<kp, then the coefficient is 0.
If d != k, then from the second important fact, the value is:
If k = d, then the exponential drops out and the value is 0 if n != kp (i.e. n != dp), and otherwise:
Putting the summation together
Let's first assume n != dp. Then we can skip the k=d term, and we get:
where limit = min(d-1, n/p) (or d-1 if p=0). This ensures that n-kp >=0.
On the other hand, if n = dp, then we need to handle the k=d term.
Note that if n=dp, then n-kp>0, so we don't need to add an extra upper limit on k.
Solving the original problem
In the original problem, no value is allowed to appear zero times, so the forbidden count p=0. Then the solution simplifies to
Thus, the number of sequences of length 12 consisting of the digits 1 through 5, with each digit appearing at least once, is given by n=12 and d=5:
Happily, this is the same result we got in Part I.
Implementing this in Python
Implementing the above formula is straightforward:from math import factorial def solve(d, n, p): total = 0 for k in range(0, d): if n-k*p < 0: break total += (pow(-1, k) * choose(d, k) * factorial(n) / factorial(n-k*p) * pow(d-k, n-k*p) / pow(factorial(p), k)) if n == d*p: total += pow(-1, d) * factorial(n) / pow(factorial(p), d) return totalSince I didn't entirely trust the mathematics above, I also implemented a totally brute-force solution that generates and tests all the sequences of the given length:
import itertools def bruteforce(d, n, p): total = 0 for seq in itertools.product(range(0, d), repeat=n): ok = 1 for i in range(0, d): if len([x for x in seq if x==i]) == p: ok = 0 break total += ok return totalThe mathematical solution and the brute force solution agree on the cases I tested, which increased my confidence that I didn't mess up the math. The brute force solution is, of course, too slow to use except for fairly small values.