Monday, March 29, 2010

Reversing A String - The worse way, The better way

A common programming interview question is to reverse a string. A common first approach is to copy the contents of the string into a temporary string in reverse order, and then copy the temporary string over the original string. Do this in an interview, and you'll be informed that while your solution is correct, it can be much better. Doing it in this fashion is very inefficient, but it is still nonetheless a good place to start. As always, solve the problem first, then optimize. The naive solution would look something like this:



At this point, the interviewer is going to ask you to perform the reversing "in-place" meaning "do it all in one pass of the string, no double-pass copying shenanigans!"

So, how do we accomplish this? A basic swap should immediately come to mind. You need to swap pairs of characters in the string as you iterate over it until no more pairs exist. But, how to know when you've reached that point? Sure, you could keep track of indices of the string that have been swapped, but at that point you're diverging toward the inefficiency of the first solution because you'd have to do a lookup in the index table each time you attempted a swap, and that gets slower as your string becomes larger. So, storing indices is out. What else can we do? Well, let's see if there is an algorithm behind this.

Given the string "abc", what can we deduct about which pairs should be swapped?

'a' and 'c' get swapped, leaving us with 'b'. Clearly, we see there is nothing else to swap with, but how would we know this internally, from a source code level? The key lies in the positions of the characters in relation to the end of the string. We never want to swap 2 characters unless the location of the second character is > the location of the first character. Using our string "abc", 'a' is located at position 0 and 'c' at position 2. 'b' is located at position 1, but the character at position 2 has already been swapped.

To compute this mathematically, you do this:

firstIndex = 0
secondIndex = length of the string - 1
while secondIndex > firstIndex:
swap
increment firstIndex
decrement secondIndex
loop

Putting this into C, we have the following, efficient reverse algorithm:



And here is the output, using both versions consecutively on the same string (reverses it, then reverses it again):

$ ./reverse
original: abc
reverse: cba
reverse2: abc

Hopefully this has been helpful to you. Remember, the most obvious solution is most likely not the most efficient.

1 comment:

Unknown said...

Good post.

You could even further reduce the overhead of the stack for the function calls to swap(). Consider the following implementation:

char z[] = "sesreveR";

printf( "String: %s", z );

size_t length = (strlen(z) - 1);
int first = 0;

while( first < length )
{
int tmp = z[first];
z[first++] = z[length];
z[length--] = tmp;
}

printf( "\nString: %s", z );