Wednesday, July 25, 2012

1.3 removeDuplicates



Design an algorithm and write code to remove the duplicate characters in a string without using any additional buffer  NOTE: One or two additional variables are fine. An extra copy of the array is no

It took me a while to understand the solution to this problem, not sure why.

If we ran the word "badboy" through this code, we would expect "badoy0" to return, where 0 is the end of unique characters. "ababab" would turn into "ab0bab." This is because we cannot create another array, so we must stick with the original string size.

First, we need to change the string input into a character array.
Next, we introduce a variable "tail" which keeps track of how far we have moved in the array. Tail position will start at the beginning of the array and move forward everytime we confirm that the current letter is not a duplicate. It DOES NOT move forward if we discover that the letter is a duplicate, that way we can overwrite the duplicate in the next iteration.

Code:

void removeDuplicates(char[] str)
   int tail = 1;
   for (int i=1; i< str.length ; i++) //we start at 1 because we know the first character has to be unique
           int j;//we need j outside of the for loop, so we must declare outside
           for (j=0; j<tail; j++ )
                   if (str[i] == str[j])
                          break;
            if ( j == tail ) //this means we never break-ed
                   str[tail] = str[i]
                   tail++ //tails is only incremented if there wasn't a duplicate
   str[tail] = 0; //add 0 to end of correct letters

this runtime is actually pretty bad, O(n^2). Not sure if there is a better way to do it if you aren't allow to create another string.

You could add error checking that if the string length is less than 3, just instantly stop.

No comments:

Post a Comment