Thursday, March 22, 2012

Fun with bugs

I encountered a fairly interesting bug a few days ago. I will not post the minimal code snippet that reproduces the bug, but rather something similar to the actual code that I had the bug in so that you might try to find the bug for fun (preferably just by looking at the code :). Note however, that the bug I'm going to write about is not really a logic error, but more of a "language issue" (I don't want to be too specific yet because a hint will quickly ruin the fun :).

First, a bit of introduction about what the code does, or is supposed to do. I was implementing an interval tree that would efficiently find the maximal value in an interval of keys. When creating an empty tree (with no values yet inserted), you'd want to initialize all the values to some very small value that I'll call -INF. Basically, the intent of initialization is the set up the M vector that represents the intervals that the tree will be storing directly. So here is the initialization code.

class IntervalTree {
  public:
  IntervalTree(int maxval): n(maxval), M(2*maxval+1) {
    init(0, 0, n);
  }

  private:
  int n;
  std::vector M;
  
  int init(int node, int l, int r) {
    while (node >= (int)M.size()) {
      M.push_back(0);
    }
    if (l == r) {
      return (M[node] = -INF);
    } else {
      int mid = l + (r-l)/2;
      return (M[node] = max(init(2*node + 1, l, mid),
                            init(2*node + 2, mid+1, r)));
    }
  }
};

The constructor is pretty minimal. It stores the rightmost interval boundary (the leftmost one is implicitly zero) and tries to initialize the vector that holds the tree nodes. Now, unfortunately, the maximum index of a node in the tree might be bigger than 2n if the tree is not complete, i.e. the number of keys is not a power of two. An example of this case is n=5 when the maximum index of a node is 12 (the node representing the interval [4,4]).

One way to fix this problem would be to use the first larger power of two, but that can potentially waste space. So what I did here was in the recursive init phase add on additional nodes as necessary. The remaining code is fairly straightforward: if the interval contains only one key then initialize that leaf node (here you might use A[l] if you were initializing the tree with an initial array of values); otherwise, recursively initialize the left and the right halved subintervals and assign this node the maximum of those. Again, in this case it would be OK to just assign M[node] to be -1, but if we were using an initial value array, this is pretty much how you do it.

So, where's the bug? :) This code actually has undefined behavior and is fairly likely to fail spectacularly (with a segfault). If you don't want to try to find the bug, then read on for the answer. Before that, here's a hint - the bug is related to std::vector. Can you spot it now? :)

Here's the explanation. The bug is actually in the line that recursively calls init and assigns the max of the two calls to the tree node. As you might have guessed by now, there are two issues in interplay here. First of all, when you push_back to a std::vector, you will get a reallocation if the previous size of the vector equaled the capacity (i.e., all the preallocated memory was exhausted). When that happens, all pointers and references to the elements of the vector are invalidated, for obvious reasons (meaning you must not use them). This is why you should always be wary of taking pointers (or references) to vector elements, i.e. carefully consider if it is logically possible for the vector to grow which might cause it to get reallocated.

However, one might wonder, where are the pointers or references in this code? The answer is, fairly obviously, vector's operator[]. The thing is, the compiler is free to evaluate M[node] before, between or after both init calls are made, i.e. it is free to evaluate the left side of the assignment before the right. Since init uses push back on the same vector it's assigning to, it might happen that the reference returned by operator[] in M[node] gets invalidated and you assign to it, breaking the program.

The manifestation of this bug was actually quite entertaining. I got a segfault for many executions, but sometimes I got a std::bad_alloc (for minute trees of a dozen or so nodes :). Quite spectacularly, it actually caused my AV to flag the executable as a trojan on one run (though that is, I suspect, not really closely related to this particular bug).

It is fairly easy to fix this problem in this case... just use a temporary to store the max of the recursive calls and then assign it to the appropriate location in the vector. Another solution would be to determine how many nodes there will be in the constructor and avoid having to extend the vector. Both of these solutions are either uglier or more complex than this one, but they are correct while this one is not :). However, I think this bug is generally fairly subtle and I wouldn't be surprised if it "leaked out" (in the abstraction sense) of some libraries that use std::vector internally.

Thursday, March 15, 2012

The order of perfect K-shuffles in a deck of N cards

The problem I'll write about in this post is the following: Given a deck of N cards (with N potentially as big as say 10^12) and an integer K that is a divisor of N, find the minimum positive number of perfect K-shuffles that takes the deck back to its original order. I'll define what I actually mean by a perfect K-shuffle in a second, but I'd just like to stress that this problem is very easy if N is small and K is two and you can find the answer for "normal" decks of cards (say 52 card decks) by googling it.

Two kinds of "perfect K-shuffles" are mentioned in the literature: the in-shuffle and the out-shuffle. This post will deal with the in-shuffle, but the same approach can be used to solve the out-shuffle version. So, in a perfect K-in-shuffle, you take the deck of N cards and split it "perfectly" into K piles with N/K cards each. The first pile contains the topmost N/K cards of the deck, the second pile the next N/K topmost cards and so on.

Here's an example with K=3, N=15 with the cards numbered from 1 to 15 and originally in sorted order from the top of the deck downward. After splitting the deck as described, you'd get the following configuration:

 1  6 11
 2  7 12
 3  8 13
 4  9 14
 5 10 15

To complete the K-in-shuffle, you start compiling the deck from top to bottom, taking one card from each pile starting from the rightmost pile and moving leftwards. That is, card 11 would end up on the top of the deck, followed by 6 and 1, then followed by 12, 7, and 2 etc. What you get, from top to bottom, is:

11 6 1 12 7 2 13 8 3 14 9 4 15 10 5
By the way, the out-shuffle differs only in that we start taking the cards from the leftmost pile, therefore preserving the position of the "outer cards" (the first and the last card always stay in their original position).

This shuffle (as any other shuffle) is a permutation of N elements. In fact, the numbers written above are the inverse permutation defined by the perfect 3-in-shuffle of a deck of 15 cards. If we invert it, we will get the following permutation:

3 6 9 12 15 2 5 8 11 14 1 4 7 10 13

The problem is basically asking us to find out the order of this permutation, i.e. how many times it has to be applied to get back to the sorted sequence. When a permutation is fairly small, it is easy to find its order by examining its cycle structure - the order is just the LCM of all the cycle lengths. This will in general take O(N) time since we basically want to walk through all the cycles. However, with N large, this is certainly way too slow.

So the idea is to analyze the permutation and try to come up with something better. It turns out (which you might guess by playing around with small examples) that this permutation takes the card from position c to position Kc mod N+1. It is fairly easy to prove this. Consider the general case. Let Q = N/K. Then the general case after the first part of the shuffle looks like:

  1 Q+1 ... (K-1)Q+1
  2 Q+2 ... (K-1)Q+2
  .  .   .      .
  .  .   .      .
  .  .   .      .
  Q Q+Q  .  (K-1)Q+Q = N

So the card in the ith row and the jth column (1-based) is c=(j-1)Q + i. When completing the second part of the shuffle, all the (i-1)K cards in the preceding rows will be before c, as well as the K-j+1 cards in the ith row that are to the right of c, and are therefore collected before c. That means that any card c=(j-1)Q + i moves to position (i-1)K + K-j+1 = iK - j + 1 (*).

Now consider the number cK = (j-1)QK + iK. We find that iK = cK - (j-1)QK. Plugging that into (*), we get that the position of card c after one shuffle is cK - (j-1)QK - j + 1 = cK - (j-1)QK - (j-1) = cK - (j-1)(QK + 1) = cK (mod QK+1). This completes the proof since QK = N.

One useful sanity check to make here is that since we're numbering the cards from 1 to N (inclusive), we better not get the result that a card moves to position zero. To see that this is indeed impossible, consider the GCD of K and the modulus N+1. Since K is a divisor of N, it follows that N+1 is not divisible by any prime factor of K. This is because every second number is divisible by 2, every third number by 3, every fifth number by 5 etc. Therefore, if K has a prime factor p, that is also a prime factor of N and therefore can't be a factor of N+1. It follows that K is relatively prime to the modulus N+1. Using this fact, the only way that cK would be divisible by N+1 (and therefore we would get a zero as the position of the card) is if c would be a multiple of N+1, which is impossible since c < N+1.

Now consider the first card. After one shuffle, it will be in position K, after two shuffles 2*K mod N+1 etc. Eventually it will get back to the top of the deck. The lowest positive number of shuffles that achieves this will be a number e such that K^e = 1 (mod N+1). Since K is relatively prime with N+1, it is a member of the multiplicative group modulo N+1, and e is exactly the order of K in this group. In fact, this number will be the sough order of the shuffle. That is fairly obvious - any card c will after e shuffles end up at position cK^e = c (mod N+1). Note that some cards might cycle faster, but then they will have to cycle several times before the topmost card cycles.

I wrote about efficiently computing the order of an element in a multiplicative group in a previous post, and we can do that in O(sqrt(N)) which will be sufficient for N up to around 10^14 in under one second.