How do I collect chars into a string in C?

Time:10-22

I need to collect some chars into a buffer for my lexer, but I don't know how. I've read some answers on Stack Overflow, but those cover different cases. I have a while loop that reads the next char, and I want to add logic to it so that it appends each new char to a buffer in memory.

// init buffer with the first char 'h'
char *buffer = malloc(sizeof(char));
buffer[0] = 'h';
buffer[1] = '\0';

// go through input char by char
while(...)
{
   char c = read_next_char(); 
   buffer.append(c) // what I would do in JavaScript, but not in C :(
}


CodePudding user response:

In your case you allocate a single byte with char *buffer = malloc(sizeof(char)); at the beginning, so accessing buffer[1] or any higher index is undefined behavior (UB).

You can allocate a known number of bytes at the beginning and use that buffer until you reach the point where you need more space.

Something like this:

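int buffersize = 100;
int index = 0;
char *buffer = malloc(sizeof(char) * buffersize); // 100 bytes are allocated

if (!buffer)
    return;

buffer[index++] = 'h';
buffer[index] = '\0';             // terminator is overwritten by the next char

// go through input char by char
while(...)
{
   char c = read_next_char();
   if (index + 1 == buffersize) { // no room left for c plus the terminator
      buffersize += 100;          // grow the buffer by 100 bytes
      char *tmp = realloc(buffer, buffersize);
      if (!tmp) {
          free(buffer);           // on failure the old block is still allocated
          return;
      }
      buffer = tmp;
   }

   buffer[index++] = c;
   buffer[index] = '\0';          // keep the string terminated
}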

Note: You must free the buffer once you are done with it; otherwise the memory is leaked.

CodePudding user response:

You simply need to overwrite the null terminator with the new character and write a new terminator after it.

char *append(char *buff, int ch)
{
    size_t len = strlen(buff);
    buff[len] = ch;
    buff[len + 1] = 0;
    return buff;
}

The code assumes that buff is a valid pointer to a memory block large enough to hold the new char and the null terminator. The block must already contain a valid C string.
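The precondition above can be made explicit by passing the buffer's size, so the function refuses to write when there is no room. This is only a minimal sketch: the name append_checked and its cap parameter are hypothetical and not part of the answer.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical variant of append(): cap is the total size of buff in bytes.
 * buff must already hold a valid C string.
 * Returns buff on success, or NULL if ch plus the terminator won't fit. */
char *append_checked(char *buff, size_t cap, int ch)
{
    assert(buff != NULL);
    size_t len = strlen(buff);
    if (len + 2 > cap)          /* need room for ch and the '\0' */
        return NULL;
    buff[len] = (char)ch;
    buff[len + 1] = '\0';
    return buff;
}
```

With a declaration such as char b[4] = "h";, two calls with sizeof b succeed and a third returns NULL, leaving the string unchanged.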

CodePudding user response:

Unlike Java or JavaScript, C has no built-in string type; you need to write your own.

This is a very simple example of how you could build strings in a reasonably efficient way.

It's pretty self-explanatory.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

struct DynamicString
{
  char* string;    // pointer to string
  int length;      // string length
  int capacity;    // capacity of the string buffer (= allocated size)
};

#define DS_CHUNKSIZE 100   // we increase the buffer size by DS_CHUNKSIZE
                           // change this as needed

// Initialize the structure
void InitDynamicString(struct DynamicString* ds)
{
  ds->capacity = DS_CHUNKSIZE + 1;
  ds->string = malloc(ds->capacity);
  ds->string[0] = 0;   // null terminator
  ds->length = 0;      // initial string length 0
}

// Increase the string buffer size if necessary
// (internal function)
void IncreaseSize(struct DynamicString* ds, int newsize)
{
  if (ds->length + newsize + 1 > ds->capacity)
  {
    ds->capacity = ds->length + newsize + DS_CHUNKSIZE + 1;
    ds->string = realloc(ds->string, ds->capacity); // reallocate a new larger buffer
  }
}

// append a single character
void AppendChar(struct DynamicString* ds, char ch)
{
  IncreaseSize(ds, sizeof(char)); // increase size by 1 if necessary
  ds->string[ds->length++] = ch;  // append char
  ds->string[ds->length] = 0;     // null terminator
}

// append a string
void AppendString(struct DynamicString* ds, const char *str)
{
  IncreaseSize(ds, strlen(str));  // increase by length of string if necessary
  strcat(ds->string, str);        // concatenate
  ds->length += strlen(str);      // update string length
}


int main(int argc, char* argv[])
{
  struct DynamicString ds;

  InitDynamicString(&ds);   // initialize ds

  AppendChar(&ds, 'a');     // append chars
  AppendChar(&ds, 'b');
  AppendChar(&ds, 'c');

  AppendString(&ds, "DE");      // append strings
  AppendString(&ds, "xyz1234");

  printf("string = \"%s\"", ds.string);  // show result
}

Your code could use it like this:

struct DynamicString buffer;
InitDynamicString(&buffer);

AppendChar(&buffer, 'h');

while(...)
{
   char c = read_next_char(); 
   AppendChar(&buffer, c); // quite similar to  buffer.append(c)
}

Disclaimer:

  • The code hasn't been thoroughly tested and there may be bugs.
  • There is no error checking whatsoever. malloc and realloc may fail.
  • Other useful functions such as SetString(struct DynamicString *ds, const char *string) need to be written.
  • There is room for optimisation, especially the strcat could be handled differently, read this article for more information. I leave this as a (very simple) exercise to the reader.
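As a rough illustration of two points from the disclaimer, here is one way the error checking and the strcat issue might be handled. It is a sketch under assumptions, not the answer's code: the names DynStr, ds_init, ds_append and DS_CHUNK are hypothetical, sizes use size_t instead of int, and memcpy replaces strcat because the current length is already known, so the string need not be rescanned from the start.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define DS_CHUNK 100          /* hypothetical growth increment */

struct DynStr {
    char  *data;              /* pointer to the string */
    size_t length;            /* string length, excluding the '\0' */
    size_t capacity;          /* allocated size in bytes */
};

/* Returns 0 on success, -1 if malloc fails. */
int ds_init(struct DynStr *ds)
{
    assert(ds != NULL);
    ds->length = 0;
    ds->capacity = DS_CHUNK + 1;
    ds->data = malloc(ds->capacity);
    if (ds->data == NULL)
        return -1;
    ds->data[0] = '\0';
    return 0;
}

/* Appends n bytes from src. Returns 0 on success, -1 if realloc fails,
 * in which case the existing string is left untouched. */
int ds_append(struct DynStr *ds, const char *src, size_t n)
{
    assert(ds != NULL && src != NULL);
    size_t needed = ds->length + n + 1;          /* +1 for the '\0' */
    if (needed > ds->capacity) {
        size_t newcap = needed + DS_CHUNK;
        char *tmp = realloc(ds->data, newcap);   /* keep old pointer on failure */
        if (tmp == NULL)
            return -1;
        ds->data = tmp;
        ds->capacity = newcap;
    }
    memcpy(ds->data + ds->length, src, n);       /* copy straight to the end */
    ds->length += n;
    ds->data[ds->length] = '\0';
    return 0;
}
```

A single character can be appended with ds_append(&ds, &c, 1), and a string with ds_append(&ds, str, strlen(str)). The caller is expected to check the return value and to free(ds.data) when done.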

CodePudding user response:

There is no standard function in C that can append a char to a string. You need to write the code from scratch.

Let's start here:

char *buffer = malloc(sizeof(char));  // This allocates memory for ONE char
buffer[0] = 'h';                      // So this is fine
buffer[1] = '\0';                     // but this is bad. It writes outside the allocated memory

Fix it by allocating memory for two chars

char *buffer = malloc(2);  // sizeof(char) is always 1 so no need for it
buffer[0] = 'h';
buffer[1] = '\0';

When you want to append a new character to the string, you also need to allocate memory for it. In other words, you need to increase the size of the memory that buffer is pointing to. For that you can use the function realloc.

size_t buffer_size = 2;
char *buffer = malloc(buffer_size );
buffer[0] = 'h';
buffer[1] = '\0';

while(...)
{
    char c = read_next_char(); 

    char* tmp = realloc(buffer, buffer_size + 1);
    if (tmp == NULL)
    {
        // realloc failed ! Add error handling here
        ... error handling ...
    }
    buffer = tmp;
    buffer[buffer_size - 1] = c;  // Add the new char
    buffer[buffer_size] = '\0';     // Add the string termination
    ++buffer_size;                  // Update buffer size
}
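The loop above calls realloc once per character. A common alternative, which is not what this answer does, is to track the capacity separately from the length and double it only when the buffer is full. The helper below is a sketch; push_char and its parameters are hypothetical names, and the initial capacity of 16 is an arbitrary choice.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: *buf holds a string of *len chars in *cap bytes.
 * *buf may be NULL as long as *len and *cap are both 0.
 * Returns 0 on success, -1 if realloc fails (*buf stays valid). */
int push_char(char **buf, size_t *len, size_t *cap, char c)
{
    assert(buf != NULL && len != NULL && cap != NULL);
    if (*len + 2 > *cap) {                       /* room for c and '\0'? */
        size_t newcap = (*cap == 0) ? 16 : *cap * 2;
        char *tmp = realloc(*buf, newcap);
        if (tmp == NULL)
            return -1;
        *buf = tmp;
        *cap = newcap;
    }
    (*buf)[(*len)++] = c;                        /* add the new char */
    (*buf)[*len] = '\0';                         /* add the string termination */
    return 0;
}
```

Inside the while loop this would be called as push_char(&buffer, &len, &cap, c), with len and cap starting at 0 and buffer at NULL, or at the values matching an existing allocation.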

CodePudding user response:

The other answers work, but they are complex. I suggest a simpler solution. A string is an array of chars in which the last char of the string is a '\0' byte. The array can contain more chars after it, but they are not part of the string.

The simpler solution is to create an array which is large enough for 98% of cases, use it to store the string and when the string gets too long you can exit with an error. Changing the buffer size when needed is a nice feature but when you are new to C you shouldn't start there.

#define BUFFER_SIZE 1024
// init buffer with the first char 'h'
char buffer[BUFFER_SIZE];
buffer[0] = 'h';
buffer[1] = '\0';

// go through input char by char; replace the ... with your loop condition
for(size_t i=1;...;i++) //start at 1 so the 'h' is not overwritten
{
   if(i==BUFFER_SIZE-1) //-1 for the '\0'-Byte
   {
     fputs("Input too long, exit\n",stderr);
     exit(1);
   }
   //Are you sure you don't need error handling for read_next_char()?
   buffer[i] = read_next_char();
   buffer[i + 1]='\0'; //End the string with a '\0'-Byte
}
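To make the error-handling question in the comment concrete, here is a sketch of the same fixed-buffer idea wrapped in a function that reads from a FILE *. It assumes the input comes from a stream and that a token ends at a newline or end of input; read_line_fixed is a hypothetical name, and instead of calling exit it reports an overlong input to the caller.

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical helper: reads chars from in until '\n' or EOF into buf,
 * which has room for cap bytes. Returns the string length, or -1 if the
 * input did not fit (buf then holds the truncated, terminated prefix). */
long read_line_fixed(FILE *in, char *buf, size_t cap)
{
    assert(in != NULL && buf != NULL && cap > 0);
    size_t i = 0;
    int c;                           /* int, not char, so EOF is distinguishable */
    while ((c = fgetc(in)) != EOF && c != '\n') {
        if (i == cap - 1) {          /* -1 keeps room for the '\0' byte */
            buf[i] = '\0';
            return -1;
        }
        buf[i++] = (char)c;
    }
    buf[i] = '\0';                   /* end the string with a '\0' byte */
    return (long)i;
}
```

A caller could declare char buffer[1024]; and call read_line_fixed(stdin, buffer, sizeof buffer), then print an error and exit when the result is -1.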