I'm trying to figure out how to parse structs that have multiple (overloaded) constructors. For example, a range struct that holds either a full range or a singleton, where the start and end of the range are equal.
Case 1 looks like:
"start-stop"
Case 2 looks like:
"start"
For the range case
auto range_constraint = x3::rule<struct test_struct, MyRange>{} = (x3::int_ >> x3::lit("-") >> x3::int_);
works but
auto range_constraint = x3::rule<struct test_struct, MyRange>{} = x3::int_ | (x3::int_ >> x3::lit("-") >> x3::int_);
unsurprisingly doesn't match the signature and fails to compile.
I'm not sure what the fix is.
#include <boost/fusion/adapted/struct.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iostream>
namespace x3 = boost::spirit::x3;
struct MyRange
{
size_t start;
size_t end;
// little bit weird because should be end 1, but w/e
explicit MyRange(size_t start, size_t end = 0) : start(start), end(end == 0 ? start : end)
{
}
};
BOOST_FUSION_ADAPT_STRUCT(MyRange, start, end)
// BOOST_FUSION_ADAPT_STRUCT(MyRange, start)
//
int main()
{
auto range_constraint = x3::rule<struct test_struct, MyRange>{} = (x3::int_ >> x3::lit("-") >> x3::int_);
// auto range_constraint = x3::rule<struct test_struct, MyRange>{} = x3::int_ | (x3::int_ >> x3::lit("-") >> x3::int_);
for (std::string input :
{"1-2", "1","1-" ,"garbage"})
{
auto success = x3::phrase_parse(input.begin(), input.end(),
// Begin grammar
range_constraint,
// End grammar
x3::ascii::space);
std::cout << "`" << input << "`"
<< "-> " << success<<std::endl;
}
return 0;
}
Answer:
It's important to realize that sequence adaptation by definition uses default construction with subsequent sequence element assignment.
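To make that concrete, here is a minimal sketch in plain C++ (no Spirit or Fusion involved) of what "default-construct, then assign each member" amounts to. `AdaptableRange` and `fill_memberwise` are hypothetical names used only for illustration; the point is that the constructor's end-defaulting logic never runs on that path.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// The struct from the question: its only constructor requires an argument.
struct MyRange {
    std::size_t start;
    std::size_t end;
    explicit MyRange(std::size_t s, std::size_t e = 0)
        : start(s), end(e == 0 ? s : e) {}
};

// Without a default constructor, "default-construct then assign" cannot even begin.
static_assert(!std::is_default_constructible<MyRange>::value,
              "MyRange has no default constructor");

// Hypothetical variant that adds a default constructor (assumption for this sketch).
struct AdaptableRange {
    std::size_t start = 0;
    std::size_t end = 0;
    AdaptableRange() = default;
    explicit AdaptableRange(std::size_t s, std::size_t e = 0)
        : start(s), end(e == 0 ? s : e) {}
};

// Mimics memberwise propagation when only `start` was parsed.
inline AdaptableRange fill_memberwise(std::size_t parsed_start) {
    AdaptableRange r;        // default construction
    r.start = parsed_start;  // plain member assignment
    assert(r.end == 0);      // the "end = start" normalisation never ran
    return r;
}
```

So `AdaptableRange(5)` ends up with `end == 5`, while `fill_memberwise(5)` leaves `end` at its default of 0.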
Another issue is branch ordering in PEG grammars: int_ will always succeed wherever int_ >> '-' >> int_ would, so with int_ as the first alternative you would never match the range version.
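The ordering problem can be reproduced without Spirit at all. The following is a rough sketch using hand-rolled helpers (`parse_uint`, `parse_range`, and the two `consumed_*` functions are hypothetical, not part of any library) that mimic PEG ordered choice: the first alternative that matches wins, and later alternatives are never tried.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <utility>

using Range = std::pair<std::size_t, std::size_t>;

// Parse one or more digits at `pos`; advance `pos` only on success.
inline bool parse_uint(std::string const& in, std::size_t& pos, std::size_t& out) {
    assert(pos <= in.size());
    std::size_t p = pos, value = 0;
    while (p < in.size() && std::isdigit(static_cast<unsigned char>(in[p])))
        value = value * 10 + static_cast<std::size_t>(in[p++] - '0');
    if (p == pos) return false;
    pos = p;
    out = value;
    return true;
}

// uint >> '-' >> uint, all or nothing: `pos` is untouched on failure.
inline bool parse_range(std::string const& in, std::size_t& pos, Range& out) {
    std::size_t p = pos, a = 0, b = 0;
    if (!parse_uint(in, p, a) || p >= in.size() || in[p] != '-') return false;
    ++p;
    if (!parse_uint(in, p, b)) return false;
    pos = p;
    out = {a, b};
    return true;
}

// singleton | range: returns the number of characters consumed (0 on failure).
inline std::size_t consumed_singleton_first(std::string const& in) {
    std::size_t pos = 0, n = 0;
    Range r;
    if (parse_uint(in, pos, n)) return pos;   // wins whenever the input starts with a digit
    if (parse_range(in, pos, r)) return pos;  // only tried if the singleton failed
    return 0;
}

// range | singleton: the longer alternative gets the first chance.
inline std::size_t consumed_range_first(std::string const& in) {
    std::size_t pos = 0, n = 0;
    Range r;
    if (parse_range(in, pos, r)) return pos;
    if (parse_uint(in, pos, n)) return pos;
    return 0;
}
```

With "1-2" as input, the singleton-first version stops after one character, whereas the range-first version consumes all three. Note that "1-" still matches the singleton branch and leaves the dash unconsumed.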
Finally, to parse size_t you would usually prefer uint_ or uint_parser<size_t> :)
Things That Don't Work
There are several ways to skin this cat. For one, there's BOOST_FUSION_ADAPT_STRUCT_NAMED, which would allow you to do
BOOST_FUSION_ADAPT_STRUCT_NAMED(MyRange, Range, start, end)
BOOST_FUSION_ADAPT_STRUCT_NAMED(MyRange, SingletonRange, start)
So one pretty elaborate approach would seem to be to spell it out:
auto range = x3::rule<struct _, Range>{} = uint_ >> '-' >> uint_;
auto singleton = x3::rule<struct _, SingletonRange>{} = uint_;
auto rule = x3::rule<struct _, MyRange>{} = range | singleton;
TIL that this doesn't even compile in X3; apparently Qi behaved differently: Live On Coliru
X3 requires the attribute to be default-constructible whereas Qi would attempt to bind to the passed-in attribute reference first.
Even in the Qi version you can see that the fact that Fusion sequences are default-constructed and then memberwise-assigned leads to results you didn't expect or want:
`1-2` -> true
-- [1,NIL)
`1` -> true
-- [1,NIL)
`1-` -> true
-- [1,NIL)
`garbage` -> false
What Works
Instead of doing the complicated things, do the simple thing. Anytime you see an optional value you can usually provide a default value. Alternatively, you can skip sequence adaptation altogether and go straight to semantic actions.
Semantic Actions
The simplest way would be to have specific branches:
auto assign1 = [](auto& ctx) { _val(ctx) = MyRange(_attr(ctx)); };
auto assign2 = [](auto& ctx) { _val(ctx) = MyRange(at_c<0>(_attr(ctx)), at_c<1>(_attr(ctx))); };
auto rule = x3::rule<void, MyRange>{} =
(uint_ >> '-' >> uint_)[assign2] | uint_[assign1];
Slightly more advanced, but more efficient:
auto assign1 = [](auto& ctx) { _val(ctx) = MyRange(_attr(ctx)); };
auto assign2 = [](auto& ctx) { _val(ctx) = MyRange(_val(ctx).start, _attr(ctx)); };
auto rule = x3::rule<void, MyRange>{} = uint_[assign1] >> -('-' >> uint_[assign2]);
Lastly, we can move towards defaulting the optional end:
auto rule = x3::rule<void, MyRange>{} =
(uint_ >> ('-' >> uint_ | x3::attr(MyRange::unspecified))) //
[assign];
Now the semantic action will have to deal with the variant end type:
auto assign = [](auto& ctx) {
auto start = at_c<0>(_attr(ctx));
_val(ctx) = apply_visitor( //
[=](auto end) { return MyRange(start, end); }, //
at_c<1>(_attr(ctx)));
};
Also Live On Coliru
Simplify?
I'd consider modeling the range explicitly as having an optional end:
struct MyRange {
MyRange() = default;
MyRange(size_t s, boost::optional<size_t> e = {}) : start(s), end(e) {
assert(!e || *e >= s);
}
size_t size() const { return end? *end - start : 1; }
bool empty() const { return size() == 0; }
size_t start = 0;
boost::optional<size_t> end = 0;
};
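As a quick sanity check of what this model implies, here is a stand-alone sketch that mirrors the struct above but swaps boost::optional for std::optional (an assumption made only to keep the example dependency-free). `RangeSketch` is a hypothetical name chosen to avoid confusion with the real type.

```cpp
#include <cassert>
#include <cstddef>
#include <optional>

struct RangeSketch {
    RangeSketch() = default;
    RangeSketch(std::size_t s, std::optional<std::size_t> e = {}) : start(s), end(e) {
        assert(!e || *e >= s);  // an explicit end may not precede the start
    }

    // A missing end is treated as a singleton, i.e. size 1.
    std::size_t size() const { return end ? *end - start : 1; }
    bool empty() const { return size() == 0; }

    std::size_t start = 0;
    std::optional<std::size_t> end = 0;  // default-constructed range is [0,0)
};
```

Under these definitions `RangeSketch(3)` has size 1, `RangeSketch(3, std::size_t{7})` has size 4, and a default-constructed `RangeSketch` is empty.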
Now you can directly use the optional to construct:
auto assign = [](auto& ctx) {
_val(ctx) = MyRange(at_c<0>(_attr(ctx)), at_c<1>(_attr(ctx)));
};
auto rule = x3::rule<void, MyRange>{} = (uint_ >> -('-' >> uint_))[assign];
Actually, here we can go back to using adapted sequences, although with different semantics:
#include <boost/fusion/adapted.hpp>
#include <boost/spirit/home/x3.hpp>
#include <iomanip>
#include <iostream>
namespace x3 = boost::spirit::x3;
struct MyRange {
size_t start = 0;
boost::optional<size_t> end = 0;
};
static inline std::ostream& operator<<(std::ostream& os, MyRange const& mr) {
if (mr.end)
return os << "[" << mr.start << "," << *mr.end << ")";
else
return os << "[" << mr.start << ",)";
}
BOOST_FUSION_ADAPT_STRUCT(MyRange, start, end)
int main() {
x3::uint_parser<size_t> uint_;
auto rule = x3::rule<void, MyRange>{} = uint_ >> -('-' >> uint_);
for (std::string const input : {"1-2", "1", "1-", "garbage"}) {
MyRange into;
auto success = phrase_parse(input.begin(), input.end(), rule, x3::space, into);
std::cout << quoted(input, '`') << " -> " << std::boolalpha << success
<< std::endl;
if (success) {
std::cout << " -- " << into << "\n";
}
}
}
Summarizing
I hope these strategies give you everything you need. Pay close attention to the semantics of your range. Specifically, I never paid any attention to the difference between "1" and "1-". You might want one to be [1,2) and the other to be [1,inf), or both to be equivalent, or the second one might even be considered invalid.
Stepping back even further, I'd suggest that maybe you just needed
using Bound = std::optional<size_t>;
using MyRange = std::pair<Bound, Bound>;
Which you could parse directly with:
auto boundary = -x3::uint_parser<size_t>{};
auto rule = x3::rule<void, MyRange>{} = boundary >> '-' >> boundary;
It would allow for more inputs:
for (std::string const input : {"-2", "1-2", "1", "1-", "garbage"}) {
MyRange into;
auto success = phrase_parse(input.begin(), input.end(), rule, x3::space, into);
std::cout << quoted(input, '`') << " -> " << std::boolalpha << success
<< std::endl;
if (success) {
std::cout << " -- " << into << "\n";
}
}
Prints: Live On Coliru
`-2` -> true
-- [,2)
`1-2` -> true
-- [1,2)
`1` -> false
`1-` -> true
-- [1,)
`garbage` -> false
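To see why "1" is rejected while "-2" and "1-" are accepted, it can help to spell out boundary >> '-' >> boundary by hand. The sketch below is a plain C++ approximation, not Spirit code: `parse_bound` and `parse_range` are hypothetical helpers, no whitespace skipping is done, and trailing input is not checked.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <optional>
#include <string>
#include <utility>

using Bound = std::optional<std::size_t>;
using MyRange = std::pair<Bound, Bound>;

// Optional unsigned integer: never fails, yields nullopt when no digit is present.
inline Bound parse_bound(std::string const& in, std::size_t& pos) {
    assert(pos <= in.size());
    std::size_t p = pos, value = 0;
    while (p < in.size() && std::isdigit(static_cast<unsigned char>(in[p])))
        value = value * 10 + static_cast<std::size_t>(in[p++] - '0');
    if (p == pos) return std::nullopt;
    pos = p;
    return value;
}

// boundary >> '-' >> boundary: only the dash is mandatory.
inline std::optional<MyRange> parse_range(std::string const& in) {
    std::size_t pos = 0;
    Bound lo = parse_bound(in, pos);
    if (pos >= in.size() || in[pos] != '-') return std::nullopt;  // "1" and "garbage" fail here
    ++pos;
    Bound hi = parse_bound(in, pos);
    return MyRange{lo, hi};
}
```

Because the dash is the only required element, a bare "1" fails, while either bound may be left out.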