camelCase function in C, unable to remove duplicate chars after converting to uppercase

Time:10-05

void camelCase(char* word) 
{
    /*Convert to camelCase*/
    int sLength = stringLength(word);
    int i,j;
 
    for (int i = 0; i < sLength; i++){
        if (word[i] == 32)
            word[i] = '_';
    }

    //remove leading chars '_', *, numbers, $
    for (i = 0; i < sLength; i++){
        if (word[i] == '_'){
            word[i] = toUpperCase(word[i + 1]);
        }
        else
            word[i] = toLowerCase(word[i]);
    }

    word[0] = toLowerCase(word[0]);

    //remove any special chars if any in the string
    for(i = 0; word[i] != '\0'; ++i)
    {
        while (!((word[i] >= 'a' && word[i] <= 'z') || (word[i] >= 'A' && word[i] <= 'Z') || word[i] == '\0') )
        {
            for(j = i; word[j] != '\0'; ++j)
            {
                word[j] = word[j + 1];
            }
            word[j] = '\0'; 
        }
    }
}

int main()
{
    char *wordArray;
    wordArray = (char*)malloc(sizeof(char)*100);

    // Read the string from the keyboard
    printf("Enter word: ");
    scanf("%s", wordArray);
    
    // Call camelCase
    camelCase(wordArray);
    
    // Print the new string
    printf("%s\n", wordArray);
    
    return 0;
}

I am writing a function that takes input such as _random__word_provided. It should remove any extra underscores or special characters, capitalize the first letter after each underscore, and print the word without any underscores. The example above should come out as randomWordProvided.

When I run my code, though, this is what I get: rrandomWwordPprovided. I am unsure where my loop goes wrong. Any guidance would be appreciated. Thank you!

CodePudding user response:

You are WAY over-processing the string...

First it measures the length. Why? You will find the '\0' eventually anyway.
Then it converts ' 's to underscores (don't use magic numbers such as 32 in code; write ' ').
Then it forces almost everything to lowercase, overwriting each '_' with an uppercase copy of the next character.
Then it tries to "strip out" non-alphas.
(By that point the non-alpha '_' has already been replaced with an uppercase version of the next character, so there is nothing left to strip there and the original lowercase letter stays behind. This is what leaves the "Ww" duplication of rrandomWwordPprovided in the string. There's no indication of '$' being addressed as per your comments.)

It seems the code is traversing the string 4 times, and the state of the string is in flux, leading to hard-to-understand intermediate states.
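The duplication can be reproduced by running the question's second loop on its own. This is a minimal sketch, not the asker's exact code: underscorePass is a hypothetical name, and the standard toupper/tolower stand in for the asker's toUpperCase/toLowerCase helpers, whose definitions are not shown.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: the question's second loop in isolation.
 * Assumption: toUpperCase/toLowerCase behave like toupper/tolower. */
void underscorePass(char *word)
{
    size_t len = strlen(word);

    for (size_t i = 0; i < len; i++) {
        if (word[i] == '_')
            /* Overwrites the '_' with an uppercase COPY of the next
             * character; the next character itself stays in place. */
            word[i] = (char)toupper((unsigned char)word[i + 1]);
        else
            word[i] = (char)tolower((unsigned char)word[i]);
    }
}
```

Applied to a writable copy of _random__word_provided, this pass leaves Rrandom_WwordPprovided in the buffer. Each underscore that was followed by a letter is now an uppercase duplicate of that letter, so the later stripping loop only finds the one remaining '_' to remove.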


Process from beginning to end in one pass, doing the right thing all the way along.

#include <ctype.h>
#include <stdio.h>

char *camelCase( char word[] ) { // return something usable by the caller
    int s = 0, d = 0; // 's'ource index, 'd'estination index

    // one sweep along the entire length
    while( ( word[d] = word[s] ) != '\0' ) {

        if( isalpha( word[d] ) ) {  // make ordinary letters lowercase
            word[ d ] = tolower( word[ d ] );
            d++, s++;
            continue;
        }
        // special handling for non-alpha. may be more than one!
        while( word[s] && !isalpha( word[s] ) ) s++;

        // end of non-alpha? copy alpha as UPPERCASE
        if( word[s] )
            word[d++] = toupper( word[s++] );
    }
    // make first character lowercase
    word[ 0 ] = tolower( word[ 0 ] );

    return word; // return modified string
}

int main() {
    // multiple test cases. Add "user input" after algorithm developed and tested.
    // writable arrays, not pointers to string literals: camelCase() modifies in place
    char wordArray[][32] = {
        "_random__word_provided",
        " the quick brown fox  ",
        "stuff  happens all the time",
    };

    for( int i = 0; i < 3; i++ )
        puts( camelCase( wordArray[i] ) );

    return 0;
}

Output:

randomWordProvided
theQuickBrownFox
stuffHappensAllTheTime

Comments may point out that the ctype.h functions take an int whose value must be representable as unsigned char (or be EOF). Casting the argument to unsigned char is an elaboration that you can and should add to the code if you ever expect to encounter something other than 7-bit ASCII characters.
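The cast could look like the sketch below. It is the same single pass with every ctype.h argument converted to unsigned char first; camelCaseCast is a hypothetical name chosen here so it does not clash with the version above.

```c
#include <ctype.h>

/* Same single-pass idea, with ctype.h arguments cast to unsigned char.
 * camelCaseCast is a hypothetical name used only for this sketch. */
char *camelCaseCast(char word[])
{
    int s = 0, d = 0; /* source index, destination index */

    while ((word[d] = word[s]) != '\0') {
        unsigned char c = (unsigned char)word[d];

        if (isalpha(c)) {            /* ordinary letter: store lowercase */
            word[d] = (char)tolower(c);
            d++, s++;
            continue;
        }
        /* skip a run of non-letters */
        while (word[s] && !isalpha((unsigned char)word[s]))
            s++;

        /* first letter after the run is stored as uppercase */
        if (word[s])
            word[d++] = (char)toupper((unsigned char)word[s++]);
    }
    word[0] = (char)tolower((unsigned char)word[0]);

    return word;
}
```

Without the cast, a plain char holding a negative value (possible for bytes above 0x7F where char is signed) would be passed to isalpha as a negative int, which the C standard leaves undefined.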

CodePudding user response:

In my opinion, there's a very simple algorithm that only requires you to remember the last character parsed:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>

void camelCase(char* source) 
{
    /*Convert to camelCase*/
    char last  = '_',
         *dest = source;

    /* while we are not at the string end copy the char values */
    while ((*dest = *source++) != '\0') {
        /* if the char is a lower case letter and the previous was a '_' char. */
        if (islower(*dest) && last == '_')
            *dest = toupper(*dest);
        /* update the last character */
        last = *dest;
        /* to skip on the underscores */
        if (*dest != '_') dest++;
    }
} /* camelCase */

int main()
{
    char wordArray[100]; /* better use a simple array */

    // Read the string from the keyboard
    printf("Enter identifiers separated by spaces/newlines: ");
    /* for each line of input */
    while (fgets(wordArray, sizeof wordArray, stdin)) {
        for (   char *word = strtok(wordArray, " \t\n");
                word;
                word = strtok(NULL, " \t\n"))
        {
    
            printf("%s -> ", word);

            // Call camelCase
            camelCase(word);
    
            // Print the new string
            printf("%s\n", word);
        }
    }
    
    return 0;
}

If you want the first character left as it is rather than converted to uppercase, you can initialize last with a different char (e.g. '\0').
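A minimal sketch of that variant follows. The name camelCaseKeepFirst is hypothetical, and the unsigned char casts are an addition not present in the code above.

```c
#include <ctype.h>

/* Variant of the function above with last initialized to '\0', so a
 * leading letter is not treated as "following an underscore".
 * camelCaseKeepFirst is a hypothetical name used only for this sketch. */
void camelCaseKeepFirst(char *source)
{
    char last  = '\0',
         *dest = source;

    /* copy characters until the terminating '\0' has been copied */
    while ((*dest = *source++) != '\0') {
        /* lowercase letter directly after a '_' becomes uppercase */
        if (islower((unsigned char)*dest) && last == '_')
            *dest = (char)toupper((unsigned char)*dest);
        /* remember the character just processed */
        last = *dest;
        /* underscores are overwritten by the next character */
        if (*dest != '_') dest++;
    }
}
```

With this change random_word_provided becomes randomWordProvided. An input that itself starts with an underscore, such as _random__word_provided, still yields RandomWordProvided, because the first letter really does follow a '_'. Like the original, this sketch only removes underscores; it does not lowercase other letters or strip other special characters.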
