What exactly does this format string do when paired with fscanf?

Time:12-02

I am looking at some code and came across this line:

fscanf(file, "%*[: ]%16s", dest);

What does the %*[: ]%16s format string specifier do?

CodePudding user response:

This format string

"%*[: ]%16s"

means that a run of ':' and ' ' characters (the characters listed inside the square brackets) is skipped in the input stream, and then at most 16 non-white-space characters are read into a character array. At least one ':' or ' ' must be present; otherwise the conversion fails and nothing is stored.

In the format string, the symbol * is the assignment-suppression character.

Here is a demonstration program. For visibility I am using sscanf instead of fscanf.

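#include <stdio.h>

int main( void ) 
{
    const char *stream = "::: : : : :::Hello";
    char s[17];   /* 16 characters plus the terminating '\0' */
    
    /* %*[: ] skips the run of ':' and ' '; %16s stores at most 16 characters */
    if ( sscanf( stream, "%*[: ]%16s", s ) == 1 )
    {
        printf( "\"%s\"\n", s );
    }

    return 0;
}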

The program output is

"Hello"

CodePudding user response:

It reads a run of spaces and : (colon) characters, discards them, and then reads up to 16 non-whitespace characters into dest (17 including the null terminator \0).

The * after the % is the "assignment-suppression character." The number between % and s is the "maximum field width." The square brackets define a set of characters: the conversion matches either the characters listed inside them or, if the set starts with a caret (^), every character except those listed. The hyphen and the caret are handled specially inside the brackets.
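As a rough illustration of the suppression character and the field width, here is a minimal sketch. The helper name parse_field and the sample inputs are made up for this example; only the format string comes from the question.

```c
#include <stdio.h>

/* Hypothetical helper, not part of the question's code: applies the
   format to `input` and returns whatever sscanf() returns. */
int parse_field(const char *input, char dest[17])
{
    dest[0] = '\0';  /* leave dest as an empty string if nothing is stored */

    /* %*[: ] consumes the delimiter run without storing or counting it;
       %16s stores at most 16 characters and then appends '\0'. */
    return sscanf(input, "%*[: ]%16s", dest);
}
```

With the input ": : abc" the call returns 1 (the suppressed conversion is not counted) and dest holds "abc". With "::0123456789abcdefXYZ" it also returns 1, but dest holds only the first 16 characters, "0123456789abcdef", because of the field width.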

From the Linux manpage for scanf:

Each conversion specification in format begins with either the character '%' or the character sequence "%n$" (see below for the distinction) followed by:

· An optional '*' assignment-suppression character: scanf() reads input as directed by the conversion specification, but discards the input. No corresponding pointer argument is required, and this specification is not included in the count of successful assignments returned by scanf(). [snip]

· An optional decimal integer which specifies the maximum field width. Reading of characters stops either when this maximum is reached or when a nonmatching character is found, whichever happens first. Most conversions discard initial white space characters (the exceptions are noted below), and these discarded characters don't count toward the maximum field width. String input conversions store a terminating null byte ('\0') to mark the end of the input; the maximum field width does not include this terminator.

The following conversion specifiers are available:

[snip]

s Matches a sequence of non-white-space characters; the next pointer must be a pointer to the initial element of a character array that is long enough to hold the input sequence and the terminating null byte ('\0'), which is added automatically. The input string stops at white space or at the maximum field width, whichever occurs first.

[snip]

[ Matches a nonempty sequence of characters from the specified set of accepted characters; the next pointer must be a pointer to char, and there must be enough room for all the characters in the string, plus a terminating null byte. The usual skip of leading white space is suppressed. The string is to be made up of characters in (or not in) a particular set; the set is defined by the characters between the open bracket [ character and a close bracket ] character. The set excludes those characters if the first character after the open bracket is a circumflex (^). To include a close bracket in the set, make it the first character after the open bracket or the circumflex; any other position will end the set. The hyphen character - is also special; when placed between two other characters, it adds all intervening characters to the set. To include a hyphen, make it the last character before the final close bracket. For instance, [^]0-9-] means the set "everything except close bracket, zero through nine, and hyphen". The string ends with the appearance of a character not in the (or, with a circumflex, in) set or when the field width runs out.

– Linux manpage for scanf(3)
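To make the scan-set rules quoted above more concrete, here is a small sketch that uses a negated set and a range with a trailing hyphen. The helper split_pair and the "key=value" input shape are assumptions for illustration only. Note that the range behavior of the hyphen is what the manpage describes; ISO C leaves a hyphen in that position implementation-defined.

```c
#include <stdio.h>

/* Hypothetical helper: splits "key=number" into its two parts.
   Returns the number of fields stored (2 on full success). */
int split_pair(const char *line, char key[16], char val[16])
{
    key[0] = val[0] = '\0';

    /* %15[^=]   : up to 15 characters that are not '='
       =         : a literal '=' that must come next
       %15[0-9-] : up to 15 digits or hyphens (the trailing '-' is literal) */
    return sscanf(line, "%15[^=]=%15[0-9-]", key, val);
}
```

For the line "offset=-42;" this returns 2, with key holding "offset" and val holding "-42"; the ';' is not in the second set, so scanning stops there. For "=5" it returns 0, because a scan set must match at least one character.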

CodePudding user response:

What does the "%*[: ]%16s" format string specifier do?

  1. "%*[: ]": read and discard (due to the "*") one or more input characters that belong to the scan set ':', ' '. If the next input character is neither ':' nor ' ', the scan stops.

  2. "%16s" has 3 steps: 1) Read and discard any (0 or more) leading white-space characters, e.g. ' ', '\n', '\t'. 2) Read and save to dest at least one but no more than 16 non-white-space characters; if there are none, the scan stops. 3) Append a null character to dest. Thus dest should hold at least 17 characters: char dest[16 + 1];
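The steps above can be observed with %n, which stores the number of characters consumed so far and is not counted in the return value. This is only a sketch: the helper trace_steps and its out-parameters are invented for the example.

```c
#include <stdio.h>

/* Hypothetical helper: runs the format and reports how far the scan got.
   *after_set : characters consumed once "%*[: ]" has finished
   *after_str : characters consumed once "%16s" has finished
   Both stay at -1 if the scan stops before reaching the matching %n. */
int trace_steps(const char *input, int *after_set, int *after_str,
                char dest[17])
{
    *after_set = -1;
    *after_str = -1;
    dest[0] = '\0';

    return sscanf(input, "%*[: ]%n%16s%n", after_set, dest, after_str);
}
```

For the input ": \t hi there", the scan set consumes ':' and ' ' (so *after_set is 2), then %16s skips the tab and the space, stores "hi", and stops at the next space (so *after_str is 6); the return value is 1. For the input "hi", the first step finds no ':' or ' ', the scan stops, the return value is 0, and both counters stay at -1.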


Advanced

A curious difference between fscanf() and sscanf() is that when sscanf() reads a null character, scanning stops. With fscanf(), scanning continues.

With fscanf(file, "%s", dest) and 8 characters of file data, '\t', '1', '2', '3', '\0', 'x', 'y', 'z', dest[] will get '1', '2', '3', '\0', 'x', 'y', 'z', '\0'. It is unusual to have a null character in a text file.
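A sketch of how this difference could be checked, assuming tmpfile() is usable on the system. The helper names are made up, and a field width of 7 is added so that the 8-byte buffer cannot overflow.

```c
#include <stdio.h>
#include <string.h>

/* The 8 bytes from the example above, plus a final '\0' so the array is
   also a valid C string for sscanf(). */
static const char data[] = { '\t', '1', '2', '3', '\0', 'x', 'y', 'z', '\0' };

/* Hypothetical helper: writes the first 8 bytes to a temporary file and
   scans them back with fscanf(). Returns fscanf()'s result, or EOF if the
   temporary file could not be set up. */
int scan_from_file(char dest[8])
{
    FILE *f = tmpfile();
    int n;

    memset(dest, '#', 8);          /* marker bytes show what gets written */
    if (f == NULL)
        return EOF;
    if (fwrite(data, 1, 8, f) != 8) {
        fclose(f);
        return EOF;
    }
    rewind(f);
    n = fscanf(f, "%7s", dest);    /* '\0' is not white space, so it is kept */
    fclose(f);
    return n;
}

/* Hypothetical helper: scans the same bytes as a string. sscanf() treats
   the embedded '\0' as the end of its input. */
int scan_from_string(char dest[8])
{
    memset(dest, '#', 8);
    return sscanf(data, "%7s", dest);
}
```

On an implementation that behaves as described above, scan_from_file() should leave the bytes '1', '2', '3', '\0', 'x', 'y', 'z', '\0' in dest, while scan_from_string() leaves "123" followed by untouched marker bytes.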
