command-line argument comma delimited list in C

Time:11-27

I am trying to put command-line arguments supplied by the user into an array, but I am unsure how to approach it.

For example say I ran my program like this.

./program 1,2,3,4,5

How would I store 1 2 3 4 5 without the commas, so that the values can be passed to other functions? I'm sure this has to do with argv.

PS: NO space-separated, I want the numbers to parse into integers. I have an array of 200 elements, and I want these numbers stored in it as arr[0] = 1, arr[1] = 2, and so on.

CodePudding user response:

PS: NO space-separated, I want the numbers to parse into integers

Space or comma-separated doesn't matter. Arguments always come in as strings. You will have to do the work to turn them into integers using atoi (Ascii-TO-Integer).

Using spaces between arguments is the normal convention: ./program 1 2 3 4 5. They come in already separated in argv. Loop through argv (skipping argv[0], the program name) and run them through atoi.

#include <stdio.h>
#include <stdlib.h>

int main(int argc, char *argv[]) {
    for(int i = 1; i < argc; i++) {
        int num = atoi(argv[i]);
        printf("%d: %d\n", i, num);
    }
}

Using commas is going to make that harder. You first have to split the string using the kind of weird strtok (STRing TOKenizer). Then again call atoi on the resulting values.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

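int main(int argc, char *argv[]) {
    if (argc < 2) {     /* argv[1] is NULL when no list is given */
        fprintf(stderr, "usage: %s n1,n2,...\n", argv[0]);
        return 1;
    }
    /* strtok modifies argv[1] in place, replacing each comma with '\0' */
    char *token = strtok(argv[1], ",");
    while(token) {
        int num = atoi(token);
        printf("%d\n", num);
        token = strtok(NULL, ",");
    }
}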

This approach is also more fragile than taking them as individual arguments. If the user types ./program 1, 2, 3, 4, 5, the shell splits on the spaces, argv[1] is just "1,", and only 1 will be read.
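
To cover the part of the question about storing the values and passing them to other functions, here is a minimal sketch built on the same strtok/atoi approach. The names parse_csv_ints, sum_ints and MAX_VALUES are made up for illustration (200 is the array size given in the question), and atoi's lack of error checking still applies.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_VALUES 200  /* array size mentioned in the question */

/* Split list on commas (in place) and store each value in out[].
 * Returns the number of values stored, never more than max. */
size_t parse_csv_ints(char *list, int *out, size_t max)
{
    size_t count = 0;
    for (char *token = strtok(list, ","); token != NULL && count < max;
         token = strtok(NULL, ",")) {
        out[count++] = atoi(token);  /* atoi yields 0 for non-numeric text */
    }
    return count;
}

/* Example of another function that receives the array and its length. */
long sum_ints(const int *values, size_t count)
{
    long total = 0;
    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}
```

From main, after checking that argc is at least 2, you would declare int arr[MAX_VALUES], call parse_csv_ints(argv[1], arr, MAX_VALUES), and pass arr together with the returned count to whatever needs it. An array does not carry its own length in C, so the count has to travel with it.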

CodePudding user response:

One of the main disadvantages of atoi() is that it performs no check on the string it is processing: it will happily accept atoi ("my-cow") and silently return 0 without any indication of a problem. While a bit more involved, using strtol() allows you to determine what failed, and then recover. The recovery can be as simple or as in-depth as your design calls for.

As mentioned in the comment, strtol() was designed to work through a string, converting sets of digits found in the string to a numeric value. On each call it will update the endptr parameter to point to the next character in the string after the last digit converted (to each ',' in your case -- or the nul-terminating character at the end). man 3 strtol provides the details.

Since strtol() updates endptr to the character after the last digit converted, you check if nptr == endptr to catch the error when no digits were converted. You check errno for a numeric conversion error such as overflow. Lastly, since the return type is long you need to check if the value returned is within the range of an int before assigning to your int array.

Putting it all together with a very minimal bit of error handling, you could do something like:

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <limits.h>
#include <errno.h>

#define NELEM 200   /* if you need a constant, #define one (or more) */

int main (int argc, char **argv) {
  
  int arr[NELEM] = {0}, ndx = 0;            /* array and index */
  char *nptr = argv[1], *endptr = nptr;     /* nptr and endptr */
  
  if (argc < 2) {       /* if no argument, handle error */
    fputs ("error: no argument provided.\n", stderr);
    return 1;
  }
  else if (argc > 2) {  /* warn on more than 2 arguments */
    fputs ("warning: more than one argument provided.\n", stdout);
  }
  
  while (ndx < NELEM) {     /* loop until all ints processed or arr full */
    int error = 0;          /* flag indicating error occurred */
    long tmp = 0;           /* temp var to hold strtol return */
    char *onerr = NULL;     /* pointer to next comma after error */
    errno = 0;              /* reset errno */
    
    tmp = strtol (nptr, &endptr, 0);      /* attempt conversion to long */
    
    if (nptr == endptr) {   /* no digits converted */
      fputs ("error: no digits converted.\n", stderr);
      error = 1;
      onerr = strchr (endptr, ',');
    }
    else if (errno) {       /* overflow in conversion */
      perror ("strtol conversion error");
      error = 1;
      onerr = strchr (endptr, ',');
    }
    else if (tmp < INT_MIN || INT_MAX < tmp) {  /* check in range of int */
      fputs ("error: value outside range of int.\n", stderr);
      error = 1;
      onerr = strchr (endptr, ',');
    }
    
    if (!error) {           /* error flag not set */
      arr[ndx++] = tmp;     /* assign integer to arr, advance index */
    }
    else if (onerr) {       /* found next ',' update endptr to next ',' */
      endptr = onerr;
    }
    else {                  /* no next ',' after error, break */
      break;
    }
    
    /* if at end of string - done, break loop */
    if (!*endptr) {
      break;
    }
    
    nptr = endptr + 1;      /* update nptr to 1-past ',' */
  }
  
  for (int i = 0; i < ndx; i++) {   /* output array content */
    printf (" %d", arr[i]);
  }
  putchar ('\n');           /* tidy up with newline */
}

Example Use/Output

This will handle your normal case, e.g.

$ ./bin/argv1csvints 1,2,3,4,5
 1 2 3 4 5

It will warn on bad arguments in list while saving all good arguments in your array:

$ ./bin/argv1csvints 1,my-cow,3,my-cat,5
error: no digits converted.
error: no digits converted.
 1 3 5

As well as handling completely bad input:

$ ./bin/argv1csvints my-cow
error: no digits converted.

Or no argument at all:

$ ./bin/argv1csvints
error: no argument provided.

Or more than the expected 1 argument:

$ ./bin/argv1csvints 1,2,3,4,5 6,7,8
warning: more than one argument provided.
 1 2 3 4 5

The point is that with a little extra code, you can make your argument parsing routine as robust as need be. While your use of a single argument with comma-separated values is unusual, it is doable. You can manually tokenize (split) the string on the commas with strtok() (or strchr(), or a combination of strspn() and strcspn()); loop with sscanf() using something similar to the "%d%n" format string to get a minimal succeed / fail indication along with the offset of the next number from the last; or use strtol() and take advantage of its error reporting. It's up to you.
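
For comparison, the sscanf() option could look roughly like the sketch below. The function name parse_with_sscanf and its return convention are invented for illustration. Note that "%d" gives no defined behavior on overflow, which is why this only provides the minimal succeed / fail indication.

```c
#include <stdio.h>

/* Parse comma-separated ints from list into out[] using sscanf's %n.
 * Returns the count stored, or -1 at the first malformed field. */
int parse_with_sscanf(const char *list, int *out, int max)
{
    int count = 0;
    const char *p = list;

    while (count < max) {
        int value = 0, used = 0;
        if (sscanf(p, "%d%n", &value, &used) != 1)
            return -1;              /* no integer at this position */
        out[count++] = value;
        p += used;                  /* %n stored how many chars were consumed */
        if (*p == '\0')
            break;                  /* reached the end of the list */
        if (*p != ',')
            return -1;              /* unexpected character after the number */
        p++;                        /* step over the comma */
    }
    return count;
}
```

Unlike the strtol() program above, this sketch gives up at the first bad field instead of skipping ahead to the next comma, and it treats a trailing comma as an error.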

Look things over and let me know if you have questions.

CodePudding user response:

This is how I'd deal with your requirement using strtol(). This does not damage the input string, unlike solutions using strtok(). It also handles overflows and underflows correctly, unlike solutions using atoi() or its relatives. The code assumes you want to store an array of type long; if you want to use int, you can add testing to see if the value converted is larger than INT_MAX or less than INT_MIN and report an appropriate error if it is not a valid int value.

Note that handling errors from strtol() is a tricky business, not least because every return value (from LONG_MIN up to LONG_MAX) is also a valid result. See also Correct usage of strtol(). This code does not allow spaces before a comma; it permits them after the comma (so you could run ./csa43 '1, 2, -3, 4, 5' and it would work). It allows leading spaces, but not trailing spaces. These issues could be fixed with more work, probably mostly in the read_val() function. It may be that the validation work in the main loop should be delegated to read_val(), which would give a better separation of duties. On the other hand, what's here works within limits. It would be feasible to allow trailing spaces, or spaces before commas, if that's what you choose. It would be equally feasible to prohibit leading spaces and spaces after commas, if that's what you choose.

#include <errno.h>
#include <limits.h>
#include <stdio.h>
#include <stdlib.h>

static int read_val(const char *str, char **eov, long *value)
{
    errno = 0;
    char *eon;
    if (*str == '\0')
        return -1;
    long val = strtol(str, &eon, 0);

    if (eon == str || (*eon != '\0' && *eon != ',') ||
        ((val == LONG_MIN || val == LONG_MAX) && errno == ERANGE))
        {
        fprintf(stderr, "Could not convert '%s' to an integer "
                "(the leftover string is '%s')\n", str, eon);
        return -1;
        }
    *value = val;
    *eov = eon;
    return 0;
}

int main(int argc, char **argv)
{
    if (argc != 2)
    {
        fprintf(stderr, "Usage: %s n1,n2,n3,...\n", argv[0]);
        exit(EXIT_FAILURE);
    }
    enum { NUM_ARRAY = 200 };
    long array[NUM_ARRAY];
    size_t nvals = 0;

    char *str = argv[1];
    char *eon;
    long  val;
    while (read_val(str, &eon, &val) == 0 && nvals < NUM_ARRAY)
    {
        array[nvals++] = val;
        str = eon;
        if (str[0] == ',' && str[1] == '\0')
        {
            fprintf(stderr, "%s: trailing comma in number string\n", argv[1]);
            exit(EXIT_FAILURE);
        }
        else if (str[0] == ',')
            str++;
    }

    for (size_t i = 0; i < nvals; i++)
        printf("[%zu] = %ld\n", i, array[i]);

    return 0;
}

Output (program csa43 compiled from csa43.c):

$ csa43 1,2,3,4,5
[0] = 1
[1] = 2
[2] = 3
[3] = 4
[4] = 5
$
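
As noted before the listing, trailing spaces and spaces before commas could be permitted with a little more work. One possible shape for that change is sketched below as a hypothetical variant called read_val_ws(); the name is invented here, the error message is omitted for brevity, and I have not wired it into the main loop above.

```c
#include <ctype.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical variant of read_val(): converts one number, then skips any
 * whitespace that follows it before requiring ',' or the end of the string.
 * Returns 0 on success, -1 on failure. */
int read_val_ws(const char *str, char **eov, long *value)
{
    char *eon;
    errno = 0;
    long val = strtol(str, &eon, 0);

    if (eon == str || errno == ERANGE)
        return -1;                      /* no digits, or out of range */
    while (isspace((unsigned char)*eon))
        eon++;                          /* allow spaces after the number */
    if (*eon != '\0' && *eon != ',')
        return -1;                      /* junk after the number */
    *value = val;
    *eov = eon;
    return 0;
}
```

Since strtol() already skips leading white space, this accepts input such as '1 , 2 ,3 ' while still rejecting something like '12x'.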