String splicing in C: a strange phenomenon

Time:04-13

Recently, I saw some C code like the following:

printf("%s\n", "1234" "qwer");
// output: 1234qwer

snprintf(buffer, sizeof(buffer), "bvcx" "mju");
// buffer data: bvcxmju

To be honest, this amazed me. I didn't know that strings could be pasted together in the "1234" "qwer" form. Why does it work? I then tried 'char a[] = "1234" "qwer"', and gcc returned an error! Can someone explain this phenomenon and the theory behind it?

CodePudding user response:

What you saw has been part of the C language syntax for a long time. A string literal can be split into multiple parts separated only by white space, after preprocessing and comment removal. This syntax enables, for example:

  • writing a long string literal on multiple lines:

    char message[] = "This is a long message that can be split on "
                     "multiple lines for readability";
    
  • combining string fragments defined as macros:

    printf("The value of i32 is %" PRId32 "\n", i32);
    
  • separating string contents that have a different meaning if juxtaposed:

    char s1[] = "This is ESC 4: \x1B" "4";
    char s2[] = "so is this: \0334 and this: \33""4";
    char s3[] = "but not this: \334";
    char s4[] = "nor this: \x1B4";
    
  • combining stringified macro arguments

CodePudding user response:

Adjacent string literals are always concatenated into a single one as part of the translation phases. See C17 6.4.5/5:

In translation phase 6, the multibyte character sequences specified by any sequence of adjacent character and identically-prefixed string literal tokens are concatenated into a single multibyte character sequence.

Formally, translation phase 6 happens after macro expansion but before preprocessing tokens are converted to tokens. This means, for example, that
sizeof "hello " "world" yields 12 (11 characters plus the null terminator), equivalent to:
sizeof "hello world"

Practically, this is convenient when writing various "stringification" macros, for example:

#include <stdio.h>

#define STRINGIFY(x) #x
#define STRINGIFY_CONCAT(a,b) STRINGIFY(a) " " STRINGIFY(b)

int main (void)
{
  puts(STRINGIFY_CONCAT(hello,world));
}

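It's also a useful feature whenever you use hex escape sequences and need to terminate them, since C places no limit on the number of hex digits in an escape. In puts("\xABBA") all four digits belong to a single escape sequence, whose value does not fit in a char, so compilers typically diagnose it. In puts("\xAB" "BA") the escape ends at the closing quote, so the output is the byte 0xAB followed by the characters BA.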
