Prolog newbie here. In SWI-Prolog, I'm trying to figure out how to parse a simple line of CSV reversibly, but I'm stuck. Here's what I've got:
csvstring1(S, L) :-
    split_string(S, ',', ',', T),
    maplist(atom_number, T, L).
csvstring2(S, L) :-
    atomic_list_concat(T, ',', S),
    maplist(atom_number, T, L).
% This one is the same except that maplist comes first.
csvstring3(S, L) :-
    maplist(atom_number, T, L),
    atomic_list_concat(T, ',', S).
Now csvstring1 and csvstring2 work in a "forward" manner:
?- csvstring1('1,2,3,4', L).
L = [1, 2, 3, 4].
?- csvstring2('1,2,3,4', L).
L = [1, 2, 3, 4].
But not csvstring3:
?- csvstring3('1,2,3,4', L).
ERROR: Arguments are not sufficiently instantiated
Moreover, csvstring3 works in reverse, but the other two predicates do not:
?- csvstring3(L, [1,2,3,4]).
L = '1,2,3,4'.
?- csvstring1(L, [1,2,3,4]).
ERROR: Arguments are not sufficiently instantiated
?- csvstring2(L, [1,2,3,4]).
ERROR: Arguments are not sufficiently instantiated
How can I combine these into a single predicate?
CodePudding user response:
There is no particularly newbie-friendly way to do it that doesn't compromise somewhere. This is the easiest:
csvString_list(String, List) :-
    ground(String),
    atomic_list_concat(Temp, ',', String),
    maplist(atom_number, Temp, List).
csvString_list(String, List) :-
    ground(List),
    maplist(atom_number, Temp, List),
    atomic_list_concat(Temp, ',', String).
but it makes and leaves spurious choicepoints.
This cuts the choicepoints, which is nice when using it, but using cuts is a poor practice to get into without being aware of what that means:
csvString_list(String, List) :-
    ground(String),
    atomic_list_concat(Temp, ',', String),
    maplist(atom_number, Temp, List),
    !.
csvString_list(String, List) :-
    ground(List),
    maplist(atom_number, Temp, List),
    atomic_list_concat(Temp, ',', String).
This uses if-then-else, which is less code:
csvString_list(String, List) :-
    (   ground(String)
    ->  atomic_list_concat(Temp, ',', String),
        maplist(atom_number, Temp, List)
    ;   maplist(atom_number, Temp, List),
        atomic_list_concat(Temp, ',', String)
    ).
but it is logically impure, because the branch taken depends on how instantiated String is at call time. Ideally you would reify the branching with if_/3 from library(reif), which isn't built into SWI-Prolog and is less simple to use.
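One thing the if-then-else version does not address is the call where neither argument is usable: the else branch is taken whenever String is not ground, whatever List looks like. A minimal sketch of an explicit mode check follows. The name csv_numbers/2 is hypothetical, and raising an instantiation error in the third case is an assumption about the desired behaviour, not something the versions above do.

```prolog
% csv_numbers(?Atom, ?Numbers)
% Hypothetical name; a sketch of explicit mode dispatch, not a pure relation.
csv_numbers(Atom, Numbers) :-
    (   atomic(Atom)                        % text is known: parse it
    ->  atomic_list_concat(Parts, ',', Atom),
        maplist(atom_number, Parts, Numbers)
    ;   is_list(Numbers), ground(Numbers)   % numbers are known: generate text
    ->  maplist(atom_number, Parts, Numbers),
        atomic_list_concat(Parts, ',', Atom)
    ;   instantiation_error(Atom-Numbers)   % neither side is usable
    ).
```

With this, csv_numbers('1,2,3,4', L) and csv_numbers(A, [1,2,3,4]) run in their respective directions, while csv_numbers(_, _) throws an error instead of quietly taking one of the branches.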
Or you could write a grammar with a DCG, which is not newbie territory:
:- set_prolog_flag(double_quotes, chars).
:- use_module(library(dcg/basics)).
csvTail([N|Ns]) --> [','], number(N), csvTail(Ns).
csvTail([]) --> [].
csv([N|Ns]) --> number(N), csvTail(Ns).
e.g.
?- phrase(csv(Ns), "11,22,33,44,55").
Ns = [11, 22, 33, 44, 55]
?- phrase(csv([11, 22, 33, 44, 55]), String).
String = [49, 49, ',', 50, 50, ',', 51, 51, ',', 52, 52, ',', 53, 53]
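but now you're back to it leaving spurious choicepoints while parsing, and you have to deal with the historic split of strings/atoms/character codes in SWI-Prolog. The generated list denotes the text 11,22,33,44,55 but doesn't look like it: number//1 emits character codes (49 is the code for 1) while [','] emits a char, so the mixed list will not unify directly with "11,22,33,44,55" even with the double_quotes flag set to chars.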
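One way to sidestep the codes/chars mix is to keep the grammar entirely on character codes and convert to and from an atom with atom_codes/2. The following is a minimal sketch under that assumption. The wrapper name csv_atom_numbers/2 is hypothetical, and it assumes that at least one argument is instantiated (an atom, or a list of numbers).

```prolog
:- use_module(library(dcg/basics)).

% Same shape as the grammar above, but working on character codes,
% so no double_quotes flag needs to be changed.
csv([N|Ns]) --> number(N), csv_tail(Ns).

csv_tail([N|Ns]) --> ",", number(N), csv_tail(Ns).
csv_tail([]) --> [].

% csv_atom_numbers(?Atom, ?Numbers): hypothetical wrapper name.
% Assumes Atom is an atom, or Numbers is a list of numbers.
csv_atom_numbers(Atom, Numbers) :-
    (   atom(Atom)
    ->  atom_codes(Atom, Codes),      % atom -> codes, then parse
        phrase(csv(Numbers), Codes)
    ;   phrase(csv(Numbers), Codes),  % generate codes, then build the atom
        atom_codes(Atom, Codes)
    ).
```

Calling csv_atom_numbers('11,22,33', Ns) parses the atom, and csv_atom_numbers(A, [7,6,42]) builds one. The grammar still leaves a choicepoint while parsing, so wrap the call in once/1 if that matters.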
CodePudding user response:
split_string is not reversible. You can use a DCG instead; here is a simple multi-line DCG parser for CSV:
% Nicer formatting
% https://www.swi-prolog.org/pldoc/man?section=flags
:- set_prolog_flag(answer_write_options, [quoted(true), portray(true), spacing(next_argument), max_depth(100), attributes(portray)]).
% Show lists of codes as text (if 3 chars or longer)
:- portray_text(true).
csv_lines([]) --> [].
% Newline after every line
csv_lines([H|T]) --> csv_fields(H), [10], csv_lines(T).
csv_fields([H|T]) --> csv_field(H), csv_field_end(T).
csv_field_end([]) --> [].
% Comma between fields
csv_field_end(T) --> [44], csv_fields(T).
csv_field([]) --> [].
csv_field([H|T]) -->
    [H],
    % Fields cannot contain comma, newline or carriage return
    { maplist(dif(H), [44, 10, 13]) },
    csv_field(T).
To demonstrate reversibility:
% Note: z is char 122
?- phrase(csv_lines([[`def`, `cool`], [`abc`, [122]]]), Lines).
Lines = `def,cool\nabc,z\n` ;
false.
?- phrase(csv_lines(Fields), `def,cool\nabc,z\n`).
Fields = [[`def`, `cool`], [`abc`, [122]]] ;
false.
To parse the field contents and maintain reversibility, you can use e.g. atom_codes/2.
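As a rough sketch of that idea, the rows of code lists produced by csv_lines//1 can be mapped to rows of atoms and back with nested maplist/3 calls. The helper name rows_atoms/2 is hypothetical and not part of the grammar above. It assumes one of the two arguments is fully instantiated.

```prolog
% rows_atoms(?CodeRows, ?AtomRows): hypothetical helper name.
% Each field in CodeRows is a list of character codes; atom_codes/2
% converts in either direction as long as one side is instantiated.
rows_atoms(CodeRows, AtomRows) :-
    maplist(maplist(atom_codes), AtomRows, CodeRows).
```

For example, rows_atoms([[`def`, `cool`]], Rows) binds Rows to [[def, cool]], and calling it with only the second argument bound yields the code lists that csv_lines//1 expects for generation.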
CodePudding user response:
Others have given some advice and a lot of code. With SWI-Prolog, you can parse comma-separated integers trivially using library(dcg/basics) and library(dcg/high_order):
?- use_module(library(dcg/basics)), use_module(library(dcg/high_order)).
true.
?- phrase(sequence(integer, ",", Ns), `1,2,3,4`).
Ns = [1, 2, 3, 4].
?- phrase(sequence(integer, ",", [7,6,42]), S), format("~s", [S]).
7,6,42
S = [55, 44, 54, 44, 52, 50].