How to skip (not output) tokens in Boost Spirit?

Time:05-29

I'm new to Boost Spirit, and I haven't been able to find examples for some simple things. For example, suppose I have an even number of space-delimited integers. (That matches *(qi::int_ >> qi::int_). So far so good.) I'd like to save only every other one (the first of each pair) to a std::vector<int>. I've tried a variety of things like *(qi::int_ >> qi::skip[qi::int_]) (https://godbolt.org/z/KPToo3xh6), but that still records every int rather than just the first of each pair.

#include <string>
#include <utility>
#include <vector>

#include <fmt/format.h>
#include <fmt/ranges.h>

#include <boost/spirit/include/qi.hpp>

namespace qi = boost::spirit::qi;

// Example based off https://raw.githubusercontent.com/bingmann/2018-cpp-spirit-parsing/master/spirit1_simple.cpp:
// Helper to run a parser, check for errors, and capture the results.
template <typename Parser, typename Skipper, typename ... Args>
void PhraseParseOrDie(
    const std::string& input, const Parser& p, const Skipper& s,
    Args&& ... args)
{
    std::string::const_iterator begin = input.begin(), end = input.end();
    boost::spirit::qi::phrase_parse(begin, end, p, s, std::forward<Args>(args) ...);
    if (begin != end) {
        fmt::print("Unparseable: \"{}\"\n", std::string(begin, end));
    }
}

void test(std::string input)
{
    std::vector<int> out_int_list;

    PhraseParseOrDie(
        // input string
        input,
        // parser grammar
        *(qi::int_ >> qi::skip[qi::int_]),
        // skip parser
        qi::space,
        // output list
        out_int_list);

    fmt::print("test() parse result: {}\n", out_int_list);
}


int main(int argc, char* argv[])
{
    test("12345 42 5 2");

    return 0;
}

Prints

test() parse result: [12345, 42, 5, 2]

Answer:

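You're looking for qi::omit[]. It still parses its subject but discards the attribute, whereas qi::skip[] only changes which skipper is applied: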

*(qi::int_ >> qi::omit[qi::int_])

Note that you can also omit things implicitly by declaring a rule without an attribute type (which makes it bind to qi::unused_type for silent compatibility).

Also note that if you're writing an ad hoc, sloppy grammar to scan for certain "landmarks" in a larger body of text, consider spirit::repository::qi::seek, which can be significantly faster and more expressive.

Finally, note that Spirit X3 comes with a similar seek[] directive out of the box.

Simplified Demo

Much simplified: https://godbolt.org/z/EY4KdxYv9

#include <fmt/ranges.h>
#include <boost/spirit/include/qi.hpp>

// Parse the input, keep the first integer of each pair, and print the result.
void test(std::string const& input)
{
    std::vector<int> out_int_list;

    namespace qi = boost::spirit::qi;

    qi::parse(input.begin(), input.end(),                            //
            qi::expect[                                            //
                qi::skip(qi::space)[                               //
                    *(qi::int_ >> qi::omit[qi::int_]) > qi::eoi]], //
            out_int_list);

    fmt::print("test() parse result: {}\n", out_int_list);
}

int main() { test("12345 42 5 2"); }

Prints

test() parse result: [12345, 5]

But Wait

Seeing your comment

// Parse a bracketed list of integers with spaces between symbols

Did you really mean that? Because that sounds a ton more like:

'[' > qi::auto_ % qi::graph > ']'

See it live: https://godbolt.org/z/eK6Thzqea

//#define BOOST_SPIRIT_DEBUG
#include <fmt/ranges.h>
#include <boost/spirit/include/qi.hpp>
#include <boost/spirit/include/qi_auto.hpp>
//#include <boost/fusion/adapted.hpp>

// Parse a bracketed list, keeping whatever parses as T and discarding the other characters.
template <typename T> auto test(std::string const& input) {
    std::vector<T> out;

    using namespace boost::spirit::qi;

    rule<std::string::const_iterator, T()> v = auto_;
    BOOST_SPIRIT_DEBUG_NODE(v);

    phrase_parse(                                //
        input.begin(), input.end(),              //
        '[' > -v % lexeme[ (graph - ']')] > ']', //
        space, out);

    return out;
}

int main() {
    fmt::print("ints: {}\n", test<int>("[12345 USD     5 PUT]"));
    fmt::print("doubles: {}\n", test<double>("[ 1.2345 42 -inf 'hello' 3.1415 ]"));
}

Prints

ints: [12345, 5]
doubles: [1.2345, -inf, 3.1415]