Wednesday, January 21, 2009

Generating String Permutations Recursively in C#

This article will demonstrate a simple routine that uses recursion in C# to generate
a list of string permutations. I originally wrote this routine because I was
interested in a little bit larger project in which I thought a routine like this
would be useful. I had become interested in an online game where, given a
string of X number of characters, you see how many way you can combine the letters
in that string to make words. The program would go something like this.

  1. Generate the permutations.
  2. Break each permutation down into 3, 4, 5 ... X letter strings.
  3. Check each break down against a dictionary, if in the dictionary, keep the word

So the first order of business was to generate every possible permutation of an
X character string. After working with a recursive routine, I managed to get
it down to a very simple few lines of code.

First, before going over the code, let's understand the concept. Let's
look at the permutations for a 2 character string. Let's say our string
is 'AB'. The possible permutations are

  1. 'AB'
  2. 'BA'

Very easy, we just switch the last two characters. Now if we want to get the
permutations for a 3 character string, Say 'ABC', that process is very simple
too.

  1. Start with 'ABC'.
  2. Swap the last two characters to get ACB
  3. Now take the original string and move the A into the second to last position giving
    you BAC
  4. Again Swap the last two characters giving you BCA
  5. Now come back out and swap the first two characters giving you CBA
  6. And finally swap the last two characters giving you CAB

If you want to do four characters, the process is the same. Permutate the
right most 2 characters, then swap in character 3 and do the permutations, then
swap in character 4 and do the permutations again. The beauty of the recursion
is that upon leaving one swapping operation, you can get back to an original string
(stored on the stack) and keep the permutations going. Since the calls use
the string lenght to determine how to proceed, you should be able to pass a string
of any length in and get the permutations. The only caveat is that the number
of permutations generated for any given string is exponential as the string get's
larger. The number of permutation is X ! given that X is the number of characters
in the original string. As you can see, it can get quite large pretty quick.

String SizeNumber of PermutationsTime to Process
11 (1!)Negligable
2
2 (2!)Negligable
36 (3!)Negligable
424 (4!)Negligable
..........
103,628,800 (10!)7.27 Seconds
12479,001,600 (12!)Out of Memory error thrown.


The time to process above was done on a dual processor test machine running at 3
GHz with 2 Gig of internal memory. As you can see, with a short string, you
can generate many permutations. Get much more over 10 and you will start to
see some serious performance degradation. If you really need to generate permutations
on strings that large, consider disk storage.

Now that we have discussed the concept, let's look at the code. I have
encapsulated all of the code into one class. Here is the listing for that
class.

using System;
using System.Collections;

namespace Permutations
{
// To save space, I have not used proper techniques of
// hiding attributes behind properties. There are
// probably other things that haven't been done as well
// but I am just trying to show the concept.
class PermutateString
{
// Holds the final array of permutations
public ArrayList AL = new ArrayList();

public PermutateString(string Source)
{
Perm("", Source);
}

private void Perm(string Beginning, string Shuffle)
{
// By the time the Shuffle length get's down to 1,
// we have another permutation to store.
if (Shuffle.Length <= 1)
AL.Add(Beginning + Shuffle);
else
{
for (int Ndx = 0; Ndx < Shuffle.Length; Ndx++)
{
// here, we move the character at the ndx position
// to the front of the line, then copy in the first
// ndx characters of the string and then the characters
// after that. so if NDX is at 3 in the following string,

// then the string ABCDEF would become DABCEF. This loop
// pulls each postion out to the front of the line and then
// calls the Perm operation again.
Shuffle = Shuffle.Substring(Ndx, 1) +
Shuffle.Substring(0, Ndx) +
Shuffle.Substring(Ndx + 1);

// Now that we have moved the shuffled characters around a bit,
// we call the perm function again moving one of the characters back

// to the beginning string.
Perm(Beginning + Shuffle.Substring(0, 1),
Shuffle.Substring(1));
}
}
}
}
}

It may be a little difficult to follow so the best thing to do is to copy the
code into your project and run it through the debugger. As you trace the
values of Beginning and Shuffle, you should see the logic unfold. Hope you
enjoyed this.

No comments: