Use scanf to split a string on a non-whitespace separator

Time:07-02

I want to scan a string that uses a colon as a separator and return both parts as a tuple.
For example:

input: "a:b"
output: - : string * string = ("a", "b")

My approach so far keeps getting the error message:
"scanf: bad input at char number 9: looking for ':', found '\n'".

 Scanf.bscanf Scanf.Scanning.stdin "%s:%s" (fun x y -> (x,y));;

The same approach works with integers, so I'm confused about why it does not work with strings.

Scanf.bscanf Scanf.Scanning.stdin "%d:%d" (fun x y -> (x,y));;
4:3
- : int * int = (4, 3)

CodePudding user response:

The reason for the issue you're seeing is that the first %s keeps consuming input until one of the following conditions holds:

  • a whitespace has been found,
  • a scanning indication has been encountered,
  • the end-of-input has been reached.

Note that seeing a colon doesn't satisfy any of these (unless you use a scanning indication). This means that the first %s consumes everything up to, in your case, the newline character in the input buffer, and the literal : in the format then fails to match.

You don't have this same issue for %d:%d because %d isn't going to consume the colon as part of matching an integer.
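The difference between %s and %d can be reproduced without stdin by running Scanf.sscanf on a literal string. This is a minimal sketch: the helper name try_scan_strings is hypothetical, and both exceptions are caught because the way the scan fails depends on whether the input ends before the colon is looked for.

```ocaml
(* Hypothetical helper: try the original "%s:%s" format on a string and
   report failure as None instead of raising. *)
let try_scan_strings (str : string) : (string * string) option =
  try Some (Scanf.sscanf str "%s:%s" (fun x y -> (x, y)))
  with Scanf.Scan_failure _ | End_of_file -> None

(* The first %s swallows the colon, so the literal ':' can never match. *)
let strings_result = try_scan_strings "a:b\n"

(* %d stops at the first non-digit, so the ':' is still there to match. *)
let ints_result = Scanf.sscanf "4:3" "%d:%d" (fun x y -> (x, y))
```

Here strings_result is None, while ints_result is (4, 3).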

You can fix this by instead using a format string which will not consume the colon, e.g., %[^:]:%s. You could also use a scanning indication, like so: %s@:%s.
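Both suggested formats can be tried side by side on an in-memory string. A small sketch, assuming Scanf.sscanf in place of stdin; the function names split_range and split_indication are made up for illustration.

```ocaml
(* Variant 1: a character range that accepts anything except ':',
   followed by a literal ':' and the second field. *)
let split_range (str : string) : string * string =
  Scanf.sscanf str "%[^:]:%s" (fun x y -> (x, y))

(* Variant 2: %s with the scanning indication @: which stops the first
   field at ':' and skips that character. *)
let split_indication (str : string) : string * string =
  Scanf.sscanf str "%s@:%s" (fun x y -> (x, y))

let r1 = split_range "a:b"
let r2 = split_indication "a:b"
```

For the input "a:b", both r1 and r2 are ("a", "b").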

Additionally, your current format doesn't consume the trailing newline left in the buffer, so on a subsequent call that newline might end up at the start of the first element. You might therefore prefer %s@:%s\n, which consumes the newline.

So, in all,

Scanf.bscanf Scanf.Scanning.stdin "%s@:%s\n" (fun x y -> (x,y));;
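To see the effect of the trailing \n on repeated reads without typing into stdin, the same format can be applied to an input buffer built from a string. This is only a sketch: the buffer contents and the name read_pair are assumptions for illustration, and Scanf.Scanning.from_string stands in for Scanf.Scanning.stdin.

```ocaml
(* A scanning buffer holding two lines, as if typed on stdin. *)
let ib = Scanf.Scanning.from_string "a:b\nc:d\n"

(* Read one "left:right" line and consume its terminating newline. *)
let read_pair (ib : Scanf.Scanning.in_channel) : string * string =
  Scanf.bscanf ib "%s@:%s\n" (fun x y -> (x, y))

(* Two consecutive reads from the same buffer. *)
let first = read_pair ib
let second = read_pair ib
```

Because each call consumes the newline, first is ("a", "b") and second is ("c", "d"), with no stray newline carried over into the second pair.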

CodePudding user response:

The %s specifier is greedy: it reads the string up to the next whitespace or scanning indication. An indication is specified by writing @<indicator> just after the %s specifier, where <indicator> is a single character, e.g.,

let split str = 
  Scanf.sscanf str "%s@:%s" (fun x y -> x,y)

This instructs scanf to read everything up to : into the first string, drop the :, and then read the rest (up to the next whitespace) into the second string.

CodePudding user response:

The string specifier %s is eager by default and will swallow all your content until the next space. You need to add a scanning indication (https://ocaml.org/api/Scanf.html#indication) to tell Scanf.sscanf that you expect the first string to end at the first ":".

For instance,

Scanf.sscanf "a:b"
  "%s@:%s"
  (fun x y -> x,y)

returns ("a", "b"). Here the scanning indication is the @: just after the first %s specifier. In general, scanning indications are written @c for a character c.
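The @c form is not specific to the colon. As a rough illustration, with made-up inputs and binding names, the same pattern works for another separator character and for more than two fields:

```ocaml
(* '=' as the separator: the indication @= ends the first field. *)
let key_value =
  Scanf.sscanf "key=value" "%s@=%s" (fun k v -> (k, v))

(* Three colon-separated fields: each of the first two %s gets its own @:. *)
let triple =
  Scanf.sscanf "a:b:c" "%s@:%s@:%s" (fun x y z -> (x, y, z))
```

Here key_value is ("key", "value") and triple is ("a", "b", "c").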
