The format that I want to match the string to is "from:<%s>" or "FROM:<%s>". The %s can be any length of characters representing an email address.
I have been using sscanf(input, "%*[fromFROM:<]%[@:-,.A-Za-z0-9]>", output), but it doesn't catch the case where the last ">" is missing. Is there a clean way to check if the input string is correctly formatted?
CodePudding user response:
You can't directly tell whether trailing literal characters in a format string are matched; there's no direct way for sscanf() to report their absence. However, there are a couple of tricks that'll do the job:
Option 1:
int n = 0;
if (sscanf(input, "%*[fromFROM:<]%[@:-,.A-Za-z0-9]>%n", email, &n) != 1)
    …error…
else if (n == 0)
    …missing >…
Option 2:
char c = '\0';
if (sscanf(input, "%*[fromFROM:<]%[@:-,.A-Za-z0-9]%c", email, &c) != 2)
    …error — malformed prefix or > missing…
else if (c != '>')
    …error — something other than > after email address…
Note that the 'from' scan-set will match ROFF or MorfROM or <FROM:morf as a prefix to the email address. That's probably too generous. Indeed, it would match: from:<foofoomoo of from:<[email protected]>, which is a much more serious problem, especially as you throw the whole of the matched material away. You should probably capture the value and be more specific:
char c = '\0';
char from[5];
if (sscanf(input, "%4[fromFROM]:<%[@:-,.A-Za-z0-9]%c", from, email, &c) != 3)
    …error…
else if (strcasecmp(from, "FROM") != 0)
    …not from…
else if (c != '>')
    …missing >…
or you can compare using strcmp() with from and FROM if that's what you want. The options here are legion. Be aware that strcasecmp() is a POSIX-specific function; Microsoft provides the equivalent stricmp().
CodePudding user response:
Regarding the first part of the string, if you want to accept only FROM:< or from:<, then you can simply use the function strncmp with both possibilities. Note, however, that this means that for example From:< will not be accepted. In your question, you implied that this is how you want your program to behave, but I'm not sure if this really is the case.
Generally, I wouldn't recommend using the function sscanf for such a complex task, because that function is not very flexible. Also, in ISO C, it is not guaranteed that character ranges are supported when using the %[] format specifier. Therefore, I would recommend checking the individual parts of the string "manually":
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <ctype.h>
#include <stdbool.h>
bool is_valid_string( const char *line )
{
    const char *p;
    //verify that string starts with "from:<" or "FROM:<"
    if (
        strncmp( line, "from:<", 6 ) != 0
        &&
        strncmp( line, "FROM:<", 6 ) != 0
    )
    {
        return false;
    }
    //verify that there are no invalid characters before the `>`
    for ( p = line + 6; *p != '>'; p++ )
    {
        if ( *p == '\0' )
            return false;
        if ( isalpha( (unsigned char)*p ) )
            continue;
        if ( isdigit( (unsigned char)*p ) )
            continue;
        if ( strchr( "@:-,.", *p ) != NULL )
            continue;
        return false;
    }
    //jump past the '>' character
    p++;
    //verify that we are now at the end of the string
    if ( *p != '\0' )
        return false;
    return true;
}
int main( void )
{
    char line[200];
    //read one line of input
    if ( fgets( line, sizeof line, stdin ) == NULL )
    {
        printf( "Input failure!\n" );
        exit( EXIT_FAILURE );
    }
    //remove newline character
    line[strcspn(line,"\n")] = '\0';
    //call function and print result
    if ( is_valid_string ( line ) )
        printf( "VALID\n" );
    else
        printf( "INVALID\n" );
}
This program has the following output:
This is an invalid string.
INVALID
from:<[email protected]
INVALID
from:<[email protected]>
VALID
FROM:<[email protected]
INVALID
FROM:<[email protected]>
VALID
FROM:<john.doe@example!!!!.com>
INVALID
FROM:<[email protected]>invalid
INVALID
CodePudding user response:
Use "%n". It records the offset of the scan of input[], if scanning got that far.
Use it to:
Detect scan success that includes the >.
Detect extra junk.
A check of the return value of sscanf() is not needed.
Also use a width limit.
char output[100];
int n = 0;
// sscanf(input, "%*[fromFROM:<]%[@:-,.A-Za-z0-9]>", output);
sscanf(input, "%*[fromFROM]:<