I have a good friend, Matt, who shares my interest in interesting problems. We were talking about the Knight's Tour, and he immediately explained why I was running into the problems I had been having with it.
In my previous solution, I had been explicitly saving each encountered state in a stack. Therefore, this stack was simply a linearization of the entire search tree, something with on the order of b^d nodes for a branching factor b and a depth d.
Now, the first time I had programmed this algorithm, I had used a breadth-first search, where an explicit queue is necessary. I had quickly concluded that to port the algorithm to a depth-first search, I could simply swap the queue for a stack.
Matt quickly made the comment that using an explicit stack to save each and every state is completely wasteful. In fact, it is possible (and simple!) to use the system stack with a depth-first search, and use recursion to push and pop states on and off the stack. With this approach, only one path through the search tree is ever in memory at a time, so instead of holding on the order of b^d nodes (branching factor b, depth d), you hold at most d nodes, one per level of recursion. Clearly this is a better solution.
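Here is a minimal sketch of what Matt described, written in C. It is not my actual code: the names State, dfs_copy and tour_with_copies are made up for illustration, and the board is assumed to be square and at most 8x8. The state is passed by value, so each level of recursion keeps its own copy in its stack frame, and returning from a call pops that copy automatically.

```c
#include <assert.h>
#include <string.h>

#define MAX_N 8  /* assumed upper bound on the board size */

/* One search state: order[x][y] is the step at which the square was
 * visited, or 0 if it has not been visited yet. */
typedef struct {
    int n;
    int order[MAX_N][MAX_N];
} State;

/* The eight knight moves as (dx, dy) offsets. */
static const int DX[8] = { 1, 2, 2, 1, -1, -2, -2, -1 };
static const int DY[8] = { 2, 1, -1, -2, -2, -1, 1, 2 };

/* Depth-first search using the system stack. `s` is passed by value, so
 * at most one State per level of recursion exists at any time.
 * Returns 1 and copies the finished tour into *out if a tour is found. */
static int dfs_copy(State s, int x, int y, int step, State *out)
{
    s.order[x][y] = step;              /* changes only this level's copy */
    if (step == s.n * s.n) {
        *out = s;
        return 1;
    }
    for (int k = 0; k < 8; k++) {
        int nx = x + DX[k], ny = y + DY[k];
        if (nx < 0 || ny < 0 || nx >= s.n || ny >= s.n) continue;
        if (s.order[nx][ny] != 0) continue;      /* already visited */
        if (dfs_copy(s, nx, ny, step + 1, out)) return 1;
    }
    return 0;                          /* returning pops this level's copy */
}

/* Hypothetical entry point: search an n x n board starting at (0, 0). */
int tour_with_copies(int n, State *out)
{
    assert(n >= 1 && n <= MAX_N);
    State s;
    memset(&s, 0, sizeof s);
    s.n = n;
    return dfs_copy(s, 0, 0, 1, out);
}
```

The deepest the recursion can go is one call per square, so the number of State copies alive at once is bounded by the depth of the tree rather than by its total size.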
Last week I applied for a job at Google, and the interviewer actually started talking to me about this problem. He said that, at least for this specific problem, it is possible to only store a single state. The motivation for this is clear, I assume: Some games have extremely complicated, large states, and holding multiple states in memory may not be feasible. After some thought about what he described, I realized that it is simply an extension of what Matt had told me about.
The idea is this: all levels of recursion share the same representation of the board. Because only one level of recursion is literally executing at a time (the deepest level), the board only needs to have one value at any given point in time. When you want to make a move and recurse to search that subtree, you can make the move on the shared board. Then, if that recursion fails to find a solution - un-make that move on the shared board. That way the precondition will still hold for the next move to make.
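A rough sketch of the shared-board idea in C follows. Again, this is an illustration under my own assumptions rather than the code I ended up with: board, board_size, search and tour_shared are hypothetical names, and the board is assumed to be square and at most 8x8.

```c
#include <assert.h>
#include <string.h>

#define MAX_N 8  /* assumed upper bound on the board size */

/* The single board shared by every level of recursion.
 * board[x][y] is the step at which the square was visited, 0 if unvisited. */
static int board[MAX_N][MAX_N];
static int board_size;

/* The eight knight moves as (dx, dy) offsets. */
static const int DX[8] = { 1, 2, 2, 1, -1, -2, -2, -1 };
static const int DY[8] = { 2, 1, -1, -2, -2, -1, 1, 2 };

/* Make the move onto (x, y), search the subtree below it, and un-make
 * the move if that subtree contains no solution. */
static int search(int x, int y, int step)
{
    board[x][y] = step;                          /* make the move */
    if (step == board_size * board_size)
        return 1;                                /* tour stays on the board */
    for (int k = 0; k < 8; k++) {
        int nx = x + DX[k], ny = y + DY[k];
        if (nx < 0 || ny < 0 || nx >= board_size || ny >= board_size) continue;
        if (board[nx][ny] != 0) continue;        /* already visited */
        if (search(nx, ny, step + 1)) return 1;
    }
    board[x][y] = 0;                             /* un-make the move */
    return 0;
}

/* Hypothetical entry point: search an n x n board starting at (0, 0). */
int tour_shared(int n)
{
    assert(n >= 1 && n <= MAX_N);
    memset(board, 0, sizeof board);
    board_size = n;
    return search(0, 0, 1);
}
```

When the search fails, every move has been un-made by the time the outermost call returns, so the board is back to all zeros; tour_shared(4) returns 0 and leaves the board empty. On success the finished tour is simply left on the shared board.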
Now, clearly this solution only works if it's possible to un-make a move. In a game where a move cannot be cheaply reversed (for example, because making it destroys information about the previous state), this solution will not work as is.
Also, my first solution was written in Haskell. As Haskell is purely functional, its ordinary data structures are immutable. This means that a new copy of the state is made every time a move is made; pure code cannot change the single board in memory. Therefore, this solution would not carry over directly to pure Haskell code (it would need something like a mutable array in the ST or IO monad).
Also, it should be noted that this algorithm assumes that only one search path is executing at any time, so it cannot be parallelized in any obvious way. Different subtrees cannot be searched in parallel, because they would need to operate on different boards, and there is only one board in memory. So, as written, this algorithm does not scale to multiple CPUs. It might be possible to split the search tree into a small number of subtrees one level deep, and run the algorithm in parallel on each of those subtrees. In that case, however, each thread would need its own board to search on, so there would be a few boards in memory instead of just one.
I have posted previously about my lack of thinking on the topic of complexities in programming. So, in the spirit of self-improvement, I thought about this algorithm before I coded it up. Clearly, this algorithm would solve the memory problem. However, the tree for an 8x8 chessboard is still 64 levels deep. The breadth of the tree is at most 8 at each node, and it diminishes as more squares have already been visited (it has to, or else any path that makes it all 64 levels deep would be a successful solution and there would be a gigantic number of them). Therefore, the tree is fairly deep and starts out fairly broad. As I am a rather visual person, I imagine it as being something vaguely circular: broad at the top, but then tapering off as the number of unmade moves is limited toward the bottom. Therefore I expected that my solution to the problem would still be slow, even if it isn't large.
I then thought about my state representation. The solution to this problem is a sequence of (X, Y) pairs. I will call a particular entry in the solution (Xp, Yp) and the next entry (Xp+1, Yp+1). The solution can be crafted entirely as a sequence of these pairs where the first pair is (0, 0) and each step is a knight's move: abs(Xp+1 - Xp) + abs(Yp+1 - Yp) = 3 with both differences nonzero, so one of them is 1 and the other is 2. Not only that, but no pair in the solution may be reused, so the solution is also a permutation of the squares ([0..width-1], [0..height-1]). The word permutation immediately reminded me of the Hamiltonian path problem, which is NP-complete on general graphs; a knight's tour is a Hamiltonian path on the graph of knight moves. That does not by itself make this special case NP-complete, but it sealed the deal for me that a naive search was going to be expensive.
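To make those conditions concrete, here is a small checker sketched in C. The names Square, is_knight_move and is_valid_tour are hypothetical and not taken from my program. Note that a coordinate sum of 3 on its own would also accept a step like (3, 0), so the check below requires one difference to be exactly 1 and the other exactly 2.

```c
#include <assert.h>
#include <stdlib.h>

typedef struct { int x, y; } Square;

/* Returns 1 if (x1, y1) is exactly one knight's move from (x0, y0). */
int is_knight_move(int x0, int y0, int x1, int y1)
{
    int dx = abs(x1 - x0);
    int dy = abs(y1 - y0);
    return (dx == 1 && dy == 2) || (dx == 2 && dy == 1);
}

/* Returns 1 if path[0 .. width*height-1] starts at (0, 0), stays on the
 * board, never repeats a square, and moves by knight steps only.
 * Returns 0 otherwise (including when the scratch allocation fails). */
int is_valid_tour(const Square *path, int width, int height)
{
    assert(width > 0 && height > 0);
    int len = width * height;
    char *seen = calloc((size_t)len, 1);   /* one flag per square */
    if (seen == NULL) return 0;

    int ok = (path[0].x == 0 && path[0].y == 0);
    for (int p = 0; ok && p < len; p++) {
        int x = path[p].x, y = path[p].y;
        if (x < 0 || y < 0 || x >= width || y >= height)
            ok = 0;                        /* off the board */
        else if (seen[y * width + x])
            ok = 0;                        /* square reused: not a permutation */
        else if (p > 0 && !is_knight_move(path[p - 1].x, path[p - 1].y, x, y))
            ok = 0;                        /* not a knight's move */
        else
            seen[y * width + x] = 1;
    }
    free(seen);
    return ok;
}
```

Because the path has exactly width*height entries and none may repeat, a path that passes this check visits every square exactly once.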
The search tree successor function can simply find all the (Xp+1, Yp+1)s that fit the bill. This means that the solution can be crafted entirely in terms of itself, and a representation of the board is not strictly necessary. This approach does not lend itself to using a permutation generator, however, as generating only the permutations that have this quality is non-trivial, so the permutation property has to be enforced by the search itself. That being said, storing the board allows for an O(1) lookup to see if a particular square is already in the solution, which is exactly the check that makes sure the solution is, in fact, a permutation.
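The successor function with the O(1) board lookup might look roughly like this in C. It is a sketch under assumptions of my own: successors, Square and the visited array are illustrative names, and the board is assumed to be square and at most 8x8.

```c
#include <assert.h>

#define MAX_N 8  /* assumed upper bound on the board size */

typedef struct { int x, y; } Square;

/* Collects the squares a knight on `from` could move to next on an
 * n x n board. visited[x][y] is nonzero if the square is already in the
 * solution, which makes the "is it a permutation?" check a single array
 * lookup. Writes up to 8 squares into `out` and returns how many. */
int successors(int n, int visited[MAX_N][MAX_N], Square from, Square out[8])
{
    static const int dx[8] = { 1, 2, 2, 1, -1, -2, -2, -1 };
    static const int dy[8] = { 2, 1, -1, -2, -2, -1, 1, 2 };
    int count = 0;

    assert(n >= 1 && n <= MAX_N);
    for (int k = 0; k < 8; k++) {
        int nx = from.x + dx[k];
        int ny = from.y + dy[k];
        if (nx < 0 || ny < 0 || nx >= n || ny >= n) continue; /* off board */
        if (visited[nx][ny]) continue;                        /* O(1) lookup */
        out[count].x = nx;
        out[count].y = ny;
        count++;
    }
    return count;
}
```

Without the visited array, the same check would mean scanning the partial solution for each candidate square, which costs time proportional to the length of the path so far.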
So anyway, I coded this up in C. My program finds that there is no solution to a knight's tour on a 4x4 board in 0.009 seconds. It finds that there is no solution to a knight's tour on a 5x5 board in 0.168 seconds. It finds that there is a solution to a knight's tour on a 6x6 board in 52.473 seconds. I started running the program on a 7x7 board, but killed the process after it ran for around 24 hours. Note that I am running these on my MacBook Pro laptop.
So here I am, trying to think of a way to make this faster. As soon as I figure it out, I'll make another post.
My code can be found here.