Exceptional Code: Solving the Boggle Game - Recursion, Prefix Tree, and Dynamic Programming

Friday, February 10, 2012

Solving the Boggle Game - Recursion, Prefix Tree, and Dynamic Programming

I spent this past weekend designing the game of Boggle. Although it looks like a simple game at a high level, implementing it in a programming language was a great experience. I had to use recursion, sorting, searching, prefix trees (also knows as trie's), and dynamic programming as I was improving the run time of the program. So here is a summary of the work I did in trying to write a very optimal solution to the Boggle game:

*Image courtesy: http://stackoverflow.com/questions/746082/how-to-find-list-of-possible-words-from-a-letter-matrix-boggle-solver

The game of Boggle is played on a N x N board (usually made of cubes that has letters engraved on it). Given a dictionary, you will have to construct words on the board following these rules:

i) The letters in the word must be "adjacent" to each other
ii) Two letters on the board are "adjacent" if they are located on the board next to each other horizontally, vertically, or diagonally.
iii) A word must be 3 or more letters long to be valid

The more words you can construct, the more points you get. Longer words get more points.

In computerse: given an N x N board, and a dictionary containing hundreds of thousands of words, construct those words from the board that are found in the dictionary.

There are a number of approaches that one can take to solve this problem. I designed three different algorithms to solve it. Here are they:

First solution - Recursion + Binary Search:

In this approach, we recurse (using backtracking) through the board and generate all possible words. For each word that are three or more letters long we check to see if it's in the dictionary. If it is, we have a match!

Here is the algorithm steps:

1. Read in the dictionary in to a container in memory.
2. Sort the dictionary.
3. Using backtracking, search the board for words.
4. If a word is found and it contains 3 or more letters, do a binary search on the dictionary to see if the word is there.
5. If searching was successful in the previous step, print the letter out.
6. Continue step 3-5 as long as there are more words to construct on the board.

Complexity of this approach:

In this solution, we do a good job on the dictionary search. Using binary search, we are quickly finding out whether a word is in dictionary or not. But the real bottleneck is in searching the board for words. For an N x N board the search space is O((N*N)!). I am not exactly sure about this number, but you can find some discussions of it here: http://www.danvk.org/wp/2007-08-02/how-many-boggle-boards-are-there/.
(N*N)! is a HUGE number even for N = 5. So this approach is is impractical and out of question for any useful implementation.

Second Solution - Pruned Recursion + Prefix Tree (also known as a Trie):

From the previous approach, our major concern was the enormous search space on the board. Fortunately, using a a prefix tree or trie data structure we could significantly cut down on this search space. The reasoning behind this improvement is that, if a word "abc" does not occur as a prefix to any word in the dictionary there is no reason to keep searching for words after we encounter "abc" on the board. This actually cut down the run time a lot.

Here is the algorithm steps:

1. Read a word from the dictionary file.
2. Insert it into a prefix tree data structure in memory.
3. Repeat steps 1-2 until all words in the dictionary have been inserted into the prefix tree.
4. Using backtracking, search the board for words.
5. If a word is found and it contains 3 or more letters, search the prefix tree for the word.
6. If searching was *not* successful in the previous step, return from this branch of the backtracking stage. (There is no point to continue searching in this branch, nothing in the dictionary as the prefix tree says).
7. If searching was successful in step 5, continue searching by constructing more words along this branch of backtracking and stop when the leaf node has been reached in the prefix tree. (at that point there is nothing more to search).
8. Repeat steps 4-7 as long as there are more words to search in the backtracking.

Complexity of this approach:

This approach significantly improves on the first one. Building a prefix tree our of the dictionary words is O(W * L), where W is the number of words in the dictionary and L is the maximum length of a word in the dictionary.
Searching the board will be of the same order as the dictionary since we are not really searching words that are not in the dictionary. But in reality it will be more work than that as we still need to backtrack along the board to construct new words until we can consult the dictionary prefix tree to know whether it exists or not.

Third and Final Solution - No search space + Dynamic Programming:

The 2nd approach mentioned above was good enough until the board size was 5. Unfortunately with a board size of 6, that too was taking forever to complete!

It got me into thinking - "Dang, this search space is still too big to search! Can I just get rid of it entirely?" And then this idea popped into my mind: instead of random constructing word after word in this infinite ocean of words why don't I take a word from the dictionary and somehow magically check whether that's available on the board or not?

It turns out, we can use a nifty dynamic programming technique to quickly check whether a word (from the dictionary in this case) can be constructed from the board or not!

Here is core point of the dynamic programming idea:

For a word of length k to be found (end location) at the [i, j]-th location of the board, the k-1'th letter of that word must be located in one of the adjacent cells of [i, j].

The base case is k = 1.

A letter of length 1 will be found (end location) in the [i, j]-th cell of the board of the only letter in the word matches the letter in the [i, j]-th location of the board.

Once our dynamic programming table is populated with the base case, we can build on top of that for any word of length k, k > 1.

Here is a sample code for this:

for (k = 2; k < MAX_WORD_LENGTH; ++k)
    for (i = 0; i < N; ++i)
        for (j = 0; j < N; ++j)
            if (board[i][j] == word[k])
            {
                 for all the "adjacent" cells of board[i, j]
                     if we table[k-1][adjacent_i][adjacent_j] is true
                         then table[k][i][j] = true;
             }

*I have the C++ code, if you are interested to take a look leave a comment with your email address.

Run Time Complexity:

The run time complexity of this approach is obvious is pretty obvious. It's O (W * N * N * MAX_WORD_LENGTH). Here N is dimension of the board which is usually between 4 to 6. So essentially the algorithm runs in the order of the size of the dictionary!

I solved a 6 X 6 boggle game with a dictionary with 600,000 words in 0.002 seconds! With I/O it as about 0.57 seconds. But with the trie and binary search approaches, it took much longer!

So, here is the summary of my a weekend's worth of adventure in to the world of algorithms and data structures. Feel free to ask me any questions, or any bug in these algorithms:)

Credit where it is due: My friend Satej who is a ninja problem solver and a PhD student at the UCF explained the dynamic programming solution to me first before I refined and finalized it. Thanks Satej!

Update (5/10/2012):

So many people have asked for the source code that I feel I better add the C++ code here:) The final, polished code for this is not in my home computer here, so if there is a bug please let me know!

#include <cstdio>
#include <iostream>

using namespace std;

const int N = 6; // max length of a word in the board

char in[N * N + 1]; // max length of a word
char board[N+1][N+2]; // keep room for a newline and null char at the end
char prev[N * N + 1];
bool dp[N * N + 1][N][N];

// direction X-Y delta pairs for adjacent cells
int dx[] = {0, 1, 1, 1, 0, -1, -1, -1};
int dy[] = {1, 1, 0, -1, -1, -1, 0, 1};
bool visited[N][N];

bool checkBoard(char* word, int curIndex, int r, int c, int wordLen)
{
    if (curIndex == wordLen - 1)
    {
        //cout << "Returned TRUE!!" << endl;
        return true;
    }

    int ret = false;

    for (int i = 0; i < 8; ++i)
    {
        int newR = r + dx[i];
        int newC = c + dy[i];

        if (newR >= 0 && newR < N && newC >= 0 && newC < N && !visited[newR][newC] && word[curIndex+1] == board[newR][newC])
        {
            ++curIndex;
            visited[newR][newC] = true;

            ret = checkBoard(word, curIndex, newR, newC, wordLen);
            if (ret)
                break;

            --curIndex;
            visited[newR][newC] = false;
        }
    }

    return ret;
}

int main(int argc, char* argv[])
{

    int i, j, k, l;

    FILE* fDict = fopen("dict.txt","r");
    FILE* fBoard = fopen("tmp2.txt","r");

    for(i = 0; i < N; ++i)
        fgets(board[i], N+2, fBoard);

    strcpy(prev,"");
    int pLen = 0;

    while(fgets(in, N*N + 1, fDict))
    {
        int len = strlen(in);
        if (in[len-1] == '\n')
        {
            in[len-1] = '\0'; // remove the trailing newline
            --len;
        }

        if (len < 3)
            continue; // we only want words longer than 3 or more letter

        for(i = 0; i < len && i < pLen; ++i)
        {
            if(prev[i] != in[i])
                break;
        }

        int firstMismatch = i; // little optimization: building on previous word (will benefit if the word list is sorted)

        if(firstMismatch==0)
        {
            for(i = 0; i < 6; ++i)
            {
                for(j = 0; j < 6; ++j)
                {
                    if(board[i][j] == in[0])
                        dp[0][i][j] = true;
                    else
                        dp[0][i][j] = false;
                }
            }
            firstMismatch = 1;
        }

        for(k = firstMismatch; k < len; ++k)
        {
            for(i=0;i<6;++i)
            {
                for(j=0;j<6;++j)
                {
                    dp[k][i][j] = false;

                    if(board[i][j] != in[k])
                        continue;

                    for(l= 0; l < 8 && !dp[k][i][j]; ++l)
                    {
                        int ti = i + dx[l];
                        int tj = j + dy[l];

                        if(ti < 0 || ti >= 6 || tj < 0 || tj >= 6)
                            continue;

                        if (dp[k-1][ti][tj])
                            dp[k][i][j] = true;
                    }
                }
            }
        }

        // check if the word is tagged as found in the dp table
        bool flag = false;
        for(i = 0; i < 6 && !flag; ++i)
        {
            for(j = 0; j < 6 && !flag; ++j)
            {
                if(dp[len-1][i][j])
                    flag =true;
            }
        }

        // dp table says its there, but make sure its in the board and it does not repeat a location in the board
        if(flag)
        {
            //cout << "Checking word: " << in << endl;
            bool verified = false;

            for (i = 0; i < N && !verified; ++i)
            {
                for (j = 0; j < N && !verified; ++j)
                {
                    if (in[0] != board[i][j])
                        continue;

                    memset(visited, false, sizeof(visited));
                    visited[i][j] = true;

                    if (checkBoard(in, 0, i, j, len))
                    {
                        cout << in << endl;
                        break;
                    }
                }
            }
        }

        strcpy(prev,in);
        pLen=len;

    }

    return 0;
}

Enjoy!

Update 3/15/2015: I have added a Java version of the boggle game solution on my Github page here: https://github.com/bilash/boggle-solver

The Java version builds a Trie for the dictionary and then uses the dynamic programming approach mentioned above.

99 comments:

Doğan Cem GürelFebruary 22, 2012 at 10:49 AM
i am interested and need the C++ code
could you provide me them codes
cemgurel@laben.com.tr
ReplyDelete
Replies
Golam KawsarFebruary 22, 2012 at 8:14 PM
Thanks for reading my blog!

Please check your inbox for the C++ code.
ReplyDelete
Replies
yasha02February 26, 2012 at 4:53 AM
Please send me a c++ code on yasha.khandelwal02@gmail.com.

Thanks
ReplyDelete
Replies
Golam KawsarFebruary 26, 2012 at 8:12 AM
Hi,

Please check your inbox. Thanks.
ReplyDelete
Replies
naveenMarch 4, 2012 at 10:29 PM
hi,

please could you send me the java code for this at dspds.naveen@gmail.com

thanks
ReplyDelete
Replies
Golam KawsarMarch 5, 2012 at 6:46 AM
Hi Naveen, I didn't do it in Java. I have C++ code. Let me know if want C++ code for this.
ReplyDelete
Replies
radzMarch 22, 2012 at 1:11 AM
Hi,
Can u please send me the c++ code to radha.addala@yahoo.co.in

Thanks in advance
ReplyDelete
Replies
arjMarch 25, 2012 at 12:45 AM
your blog is nice.... now i am currently working on a data structures project...if u send this c++ code it would be helpful.... thankyou
ReplyDelete
Replies
arjMarch 25, 2012 at 12:50 AM
sry my email is dsprojectmaze@gmail.com
ReplyDelete
Replies
juzhaxMarch 25, 2012 at 4:02 AM
my email is juzhax@gmail.com
do you mind to share the code too ?
ReplyDelete
Replies
Golam KawsarMarch 25, 2012 at 3:38 PM
Thanks everyone for showing interest in the code. I have sent it to all of you. Please check your inbox.
ReplyDelete
Replies
alycannApril 9, 2012 at 10:50 AM
please send me c++ or java source code
thanks btw
alican@hotmail.co.jp
ReplyDelete
Replies
Jonas MaesApril 18, 2012 at 7:31 PM
Very interesting. I'm curious for the source code.
Can you send it to me?
eydenjones@gmail.com
Thx
ReplyDelete
Replies
Golam KawsarApril 21, 2012 at 5:45 AM
Sorry for the late reply. Please check your inbox. I sent the code.
ReplyDelete
Replies
BobApril 26, 2012 at 11:06 PM
Could you send me the code?

bobsyouruncle@yarr.ca
ReplyDelete
Replies
YannApril 29, 2012 at 2:28 PM
Hi, that looks very nice.
Could you send me the C++ code to :

coffeehouse@free.fr

Thanks
ReplyDelete
Replies
punjab1836April 29, 2012 at 8:03 PM
can you send me the code
punjab1836@gmail.com
ReplyDelete
Replies
SmoggieApril 30, 2012 at 5:47 AM
Hi, I'd be interested in seeing the code for this if you could email it to info[at]daveallanson.com
ReplyDelete
Replies
LF4May 4, 2012 at 5:19 PM
That's really cool, whats the largest board you've tired it on? I'm interested in your code as well.

lifeforce0 at gmail
ReplyDelete
Replies
ashish anandMay 7, 2012 at 7:43 PM
Plz mail me c++ code at ashiniit@gmail.com.......
Thanks a lot
ReplyDelete
Replies
jmarkMay 9, 2012 at 2:54 AM
please send me C++ code.
jamesmark1007@gmail.com
Thank you
ReplyDelete
Replies
Golam KawsarMay 10, 2012 at 5:48 PM
Hi all,

I updated the post with the source code. Please let me know if you find any bugs, etc. Thanks.
ReplyDelete
Replies
tzeJuly 6, 2012 at 1:21 AM
please send me C++ code.
tzewei91@gmail.com
Thank you
ReplyDelete
Replies
blackdanteJuly 25, 2012 at 9:59 AM
This is a really awesome read, could you send me the code, jcp1217@gmail.com
ReplyDelete
Replies
Golam KawsarJuly 25, 2012 at 10:30 AM
To those still asking for the code, I updated the post with the dynamic programming code. It's at the end of the post.

Thanks for your interest in the code!
ReplyDelete
Replies
azeeOctober 27, 2012 at 8:52 AM
can you please send me the java source code its educational purpose i just wana know how it works please send it to
p4l951@hotmail.com
ReplyDelete
Replies
PrinceNovember 14, 2012 at 3:47 AM
Can you send me the c++ code @ m.garimella1@gmail.com
ReplyDelete
Replies
lbnashblFebruary 12, 2013 at 11:10 AM
If still available, I'd like to a copy of the c++ code
Thanks larry@televox.cc
ReplyDelete
Replies
UnknownFebruary 26, 2013 at 9:20 AM
can u please mail me the java and c++ codes for the problem.
thanks my id is abhinavsareen1@gmail.com
ReplyDelete
Replies
Vidur MayorFebruary 27, 2013 at 9:24 AM
Can you forward me the c++ code as well? vidurmayor88@gmail.com.

Thanks :)
ReplyDelete
Replies
UnknownFebruary 28, 2013 at 2:18 AM
can u please mail me the c++ code too.
swati_sj00@yahoo.com
ReplyDelete
Replies
UnknownMarch 2, 2013 at 5:12 AM
Can you please forward me the C++ code?
akanksha.237@gmail.com

Thanks :)
ReplyDelete
Replies
UnknownMarch 2, 2013 at 6:53 AM
plz forward me the c++ code
tarunnarang1992@gmail.com
ReplyDelete
Replies
UnknownMarch 4, 2013 at 7:37 PM
This comment has been removed by the author.
ReplyDelete
Replies
Golam KawsarMarch 5, 2013 at 7:14 AM
To all those that wanted the C++ code, please check the end of the blog post. I added the C++ code in the post. Thanks.
ReplyDelete
Replies
UnknownApril 7, 2013 at 8:37 PM
Have you tried this with say a 100x100 boggle? I'm trying to compare your final solution with a c++ solver I implemented. My solver can do 100x100 in <7 seconds with a 178,000 word dictionary on a single 1.4ghz pentium. However using my solution, the time taken for larger puzzles does not scale linearly with the number of cubes.
ReplyDelete
Replies
Golam KawsarApril 8, 2013 at 6:58 AM
Hi, no I haven't tied my solution with a large matrix (like 100x100). I guess it will take much longer than 7 seconds. Did you use DP or something else?
ReplyDelete
Replies
UnknownApril 8, 2013 at 9:12 AM
This comment has been removed by the author.
ReplyDelete
Replies
UnknownApril 30, 2013 at 3:54 AM
can u give the codes in netbeans java.please...
ReplyDelete
Replies
UnknownJune 1, 2013 at 3:25 AM
can u please mail me the c++ code too.
ralg9098@gmail.com
ReplyDelete
Replies
WJJuly 9, 2013 at 7:06 AM
can you please email me the c++ code?
joshua_chua90@yahoo.com
ReplyDelete
Replies
UnknownJuly 10, 2013 at 11:08 PM
Great post. Enjoyed reading and perusing your code. You stored the board in a char array which is fine for all the tiles except for Qu. While the general approach would be the same, it seems that you would need to change your container and perhaps some of your search algorithms. Just a thought!
ReplyDelete
Replies
AngelicdevilAugust 24, 2013 at 8:50 PM
Hi, can you send me the C++ code to cronix_psycho@yahoo.com
Thanks
ReplyDelete
Replies
cha0sNovember 2, 2013 at 3:08 PM
Hi, can I use your code in a commercial project ?

Thanks!
ReplyDelete
Replies
AnonymousDecember 6, 2013 at 9:32 PM
hey do you think i can take a look at your program? thanks! pnchming@yahoo.com
ReplyDelete
Replies
Piyush KapoorFebruary 13, 2014 at 1:18 AM
Unfortunately, the dynamic programming solution is wrong over here as it ignores the fact that each cell can be visited atmost once. Your dp[i][j][k] can go through a cell many times as the state doesn't keep track of the previously visited cells.
ReplyDelete
Replies
AlbertApril 3, 2014 at 8:20 PM
Hey.. Ur explanation is too good.. Thanks.. Can you forward the full c++ code to albertiitm @gmail.com. ... thanks...
ReplyDelete
Replies
goldyApril 30, 2014 at 4:22 AM
the code above doesmt giv any output..
can u mail me d code?? its quite urgent ;)
anurag27k@gmail.com
thanks
ReplyDelete
Replies
UnknownJuly 4, 2014 at 10:09 PM
This comment has been removed by the author.
ReplyDelete
Replies
UnknownAugust 3, 2014 at 11:00 PM
Why can't you do the solution 3 but with trie ?
instead of iterate through all dictionary, you can use trie to match valid word.

from the board you can check the word with NxNx(max words)
ReplyDelete
Replies
UnknownMarch 16, 2015 at 8:27 AM
Interested in looking at the c++ code. Could you send it to ryanlh7job@gmail.com?
ReplyDelete
Replies
qaiosJuly 7, 2015 at 9:11 AM
Can you please send the final C++ code to tanya_thatzme@yahoo.com?
ReplyDelete
Replies
UnknownSeptember 19, 2015 at 4:16 PM
This comment has been removed by the author.
ReplyDelete
Replies
UnknownNovember 13, 2015 at 1:01 AM
Could I have the C++ code of this solution?
my gmail: kittysit1220@gmail.com

Thank you so much
ReplyDelete
Replies
UnknownDecember 5, 2015 at 7:37 AM
Can you please send me the C++ or java code at profcdd@gmail.com ?
ReplyDelete
Replies
Sagiv OfekDecember 18, 2015 at 8:58 AM
great article. thanks
ReplyDelete
Replies
UnknownJanuary 24, 2016 at 8:50 AM
First of all good article. Do you have an c# cod for that? here is my email vasquezjohncullen@yahoo.com I'm very happy if you let me have some
ReplyDelete
Replies
UnknownFebruary 24, 2016 at 7:49 AM
plz give me the code of dp approach thank you
mail id is abhidreams662@gmail.com
ReplyDelete
Replies
HitApril 1, 2016 at 1:39 PM
plz mail me c++ code for second and third algorithm.....thank you mial id: hiteshkumar848@gmail.com
ReplyDelete
Replies
UnknownApril 17, 2016 at 3:58 AM
This comment has been removed by the author.
ReplyDelete
Replies
nothignApril 17, 2016 at 12:08 PM
I'm not sure why you said solution #2 was taking forever. Using your concept, the implementation using the trie solves a 10x10 in 210 milliseconds.
ReplyDelete
Replies
Akshay JagirdarMay 2, 2016 at 7:11 PM
Hi,
On your github page, I saw that you mentioned that you have implemented the dynamic approach in C++. Could you please send that to me at akshay.jagirdar.delhi@gmail.com
Thank you
ReplyDelete
Replies
UnknownNovember 14, 2016 at 4:41 PM
the 3rd method works if it allows using one character repeatedly as long as not continuously. For example, 'dad' was found if 'da' is on the board. 'lee' was not found if 'le' in on the board.
the 3rd method is actually BFS+branch-and-bound.
ReplyDelete
Replies
UnknownFebruary 9, 2017 at 5:17 PM
I'd love to understand your solution using dynamic programming, but I cannot follow the description and the code is very cryptic with all those single letter variables.
ReplyDelete
Replies
UnknownMarch 21, 2017 at 12:10 AM
you mind sending me a copy of your C++ code?
ReplyDelete
Replies
UnknownApril 1, 2017 at 12:01 PM
Hey what is the board.txt?
ReplyDelete
Replies
c387November 8, 2017 at 4:35 PM
would you be able to send me a copy of the C++ code please? c837@protonmail.com
ReplyDelete
Replies
UnknownDecember 16, 2017 at 12:46 AM
From where can I get dictionary and board file ?
ReplyDelete
Replies
UnknownJuly 19, 2018 at 6:41 AM
hi, can you please share the c++ code - keerthivanjari@gmail.com
Thanks for your time on this blog
ReplyDelete
Replies
RudreshJanuary 9, 2019 at 12:17 PM
Hi , Can you pleae share the C++ code - rudreshaa@gmail.com
Its very well explained. It would great help if you can share the code.
ReplyDelete
Replies
RockVlknJuly 11, 2019 at 6:21 AM
Hi, Can you please share the c++ code- rakeshvlkn@gmail.com.

It is explained very well.
ReplyDelete
Replies
Michael HamarnehSeptember 3, 2019 at 3:45 PM
I am interested in the source code. My email address is michaelhamarneh@gmail.com.
ReplyDelete
Replies
UnknownNovember 24, 2019 at 1:11 PM
I am also interested in the C++ source code. My email address is ssidari2018@my.fit.edu
ReplyDelete
Replies
svrtechnologiesFebruary 8, 2020 at 3:52 AM
Whatever we gathered information from the blogs, we should implement that in practically then only we can understand that exact thing clearly, but it’s no

need to do it, because you have explained the concepts very well. It was crystal clear, keep sharing artificial intelligence course online.
ReplyDelete
Replies
North Shore Tree ServicesMay 26, 2020 at 5:44 AM
Very interesting discussion!! I think that you should write more on this topic, it might not be a taboo subject but generally people are not enough to speak on such topics. To the next.
Tree Service
ReplyDelete
Replies
dadsprintingJune 4, 2020 at 7:27 AM
Great article, Thanks for sharing such wonderful content.
Custom Face Mask
ReplyDelete
Replies
IK SoftechSeptember 14, 2021 at 9:57 PM

Pretty article! I found some useful information in your blog, it was awesome to read, thanks for sharing this great content to my vision, keep sharing.sap fico training institute in Bangalore
ReplyDelete
Replies
MY PET CLINIC VIZAGSeptember 26, 2021 at 11:53 PM

That is nice article from you , this is informative stuff . Hope more articles from you . I also want to share some information about Pet Vaccination in vizag
ReplyDelete
Replies
AnonymousMay 26, 2022 at 6:14 PM
Smm panel
Smm panel
HTTPS://İSİLANLARİBLOG.COM
İNSTAGRAM TAKİPÇİ SATIN AL
hirdavatciburada.com
Https://www.beyazesyateknikservisi.com.tr/
Servis
jeton hile indir
ReplyDelete
Replies

Add comment

Friday, February 10, 2012

Solving the Boggle Game - Recursion, Prefix Tree, and Dynamic Programming

99 comments:

Pageviews

Subscribe