soundex algorithm in C++

July 19, 2009

I couldn’t find a real C++ version of the Soundex algorithm (every one I found was C code with a .cpp extension), so I threw one together.
Here it is, in case anyone is interested.



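#include <cctype>
#include <string>

//Soundex digit for each letter A-Z; '0' means the letter carries no code
static const char lookup[] = {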
'0',    /* A */
'1',    /* B */
'2',    /* C */
'3',    /* D */
'0',    /* E */
'1',    /* F */
'2',    /* G */
'0',    /* H */
'0',    /* I */
'2',    /* J */
'2',    /* K */
'4',    /* L */
'5',    /* M */
'5',    /* N */
'0',    /* O */
'1',    /* P */
'2',    /* Q */
'6',    /* R */
'2',    /* S */
'3',    /* T */
'0',    /* U */
'1',    /* V */
'0',    /* W */
'2',    /* X */
'0',    /* Y */
'2',    /* Z */
};

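//Assumes ASCII input in the default "C" locale, so isalpha() only accepts A-Z and a-z
std::string computeSoundex(const std::string &input, const int resultLength){

   //find the first letter; leading non-alpha characters are skipped
   std::string::size_type first = 0;
   while(first < input.length() && !isalpha(static_cast<unsigned char>(input[first]))){
      first++;
   }

   //In cases of empty strings (or strings with no encodable
   //characters), return Z000
   if(first == input.length() || resultLength < 1){
      return "Z000";
   }

   //keep the first letter (uppercased) and remember its code
   const char firstLetter = static_cast<char>(toupper(static_cast<unsigned char>(input[first])));
   std::string result(1, firstLetter);
   char lastVal = lookup[firstLetter-'A'];
   const std::string::size_type wanted = static_cast<std::string::size_type>(resultLength);

   //compute value for each character thereafter
   for(std::string::size_type i=first+1; i<input.length() && result.length()<wanted; i++){
      //skip non-alpha characters
      if(!isalpha(static_cast<unsigned char>(input[i]))){
         continue;
      }

      //uppercase the input value
      const char lookupInput = static_cast<char>(toupper(static_cast<unsigned char>(input[i])));

      //H and W are ignored entirely, so they do not separate duplicate codes
      if(lookupInput == 'H' || lookupInput == 'W'){
         continue;
      }

      //look up its value (a single char, not a pointer into the table)
      const char lookupVal = lookup[lookupInput-'A'];

      //append the code unless it is uncoded ('0') or repeats the previous code
      if(lookupVal != '0' && lookupVal != lastVal){
         result.push_back(lookupVal);
      }
      lastVal = lookupVal;
   }

   //pad short codes with zeros so the result is always resultLength long
   result.append(wanted - result.length(), '0');
   return result;
}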
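
As a cross-check on the rules the function above is meant to follow, here is a minimal self-contained sketch of classic four-character Soundex. The names soundexDigit, soundexSketch and soundexSelfCheck are made up for this example and are not part of the code above. It assumes plain ASCII names and uses the common convention that H and W do not break a run of equal digits, while vowels do.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper: maps an uppercase ASCII letter to its Soundex digit,
// or '0' for letters that carry no code.
static char soundexDigit(char upper) {
    switch (upper) {
        case 'B': case 'F': case 'P': case 'V': return '1';
        case 'C': case 'G': case 'J': case 'K':
        case 'Q': case 'S': case 'X': case 'Z': return '2';
        case 'D': case 'T': return '3';
        case 'L': return '4';
        case 'M': case 'N': return '5';
        case 'R': return '6';
        default: return '0';  // vowels, H, W, Y
    }
}

// Hypothetical name: a fixed four-character Soundex sketch for ASCII input.
std::string soundexSketch(const std::string& name) {
    std::string code;
    char previous = '0';
    for (char raw : name) {
        const unsigned char uc = static_cast<unsigned char>(raw);
        if (uc > 127 || !std::isalpha(uc)) continue;  // ignore non-ASCII and non-letters
        const char upper = static_cast<char>(std::toupper(uc));
        const char digit = soundexDigit(upper);
        if (code.empty()) {
            code.push_back(upper);        // first letter is kept (uppercased)
        } else if (upper == 'H' || upper == 'W') {
            continue;                     // H and W do not break a run of equal digits
        } else if (digit != '0' && digit != previous) {
            code.push_back(digit);        // new digit that differs from the previous one
        }
        previous = digit;
        if (code.size() == 4) break;      // the code is complete
    }
    if (code.empty()) return "Z000";      // same fallback the post uses
    code.append(4 - code.size(), '0');    // pad short codes with zeros
    return code;
}

// Spot checks against commonly cited Soundex codes.
void soundexSelfCheck() {
    assert(soundexSketch("Robert") == "R163");
    assert(soundexSketch("Rupert") == "R163");
    assert(soundexSketch("Ashcraft") == "A261");
    assert(soundexSketch("Tymczak") == "T522");
    assert(soundexSketch("Pfister") == "P236");
    assert(soundexSketch("Lee") == "L000");
    assert(soundexSketch("") == "Z000");
}
```

Calling soundexSelfCheck() from a test or from main() exercises the usual tricky cases: a doubled code at the start (Pfister), an H sitting between two equal codes (Ashcraft), and a name too short to fill four characters (Lee).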

