Stuck writing a recursive Rust declarative macro with esoteric punctuation

Time:06-30

I have written a Rust interpreter for a small esolang called Deque. For the purposes of this post I don't think we need to go into detail of how the language works, but we need its syntax: A program consists of space-separated instructions, each instruction consists of a command and an exclamation mark, in some order, and commands are integer literals to be pushed, or written-out functions to apply. For instance, 3! !5 !2 sub! !add.

My interpreter works in such a way that you call let s = Script::new("3! !5 !2 sub! !add") to construct a program (this splits the string into instructions, parses them one by one, and creates a Vec<Instruction> out of them). Calling s.run() consumes s and returns the result of running said program. This works just fine.
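The splitting and per-instruction parsing described above could look roughly like the sketch below. The real interpreter's types are not shown in the post, so `Bang`, `Command`, `Instruction`, `parse_instruction` and `parse_program` are all hypothetical names, and the use of `i64` for integer literals is an assumption.

```rust
/// Which side of the command the `!` sits on (hypothetical name).
#[derive(Debug, PartialEq)]
enum Bang {
    Left,
    Right,
}

/// A command is either an integer to push or a named function to apply.
#[derive(Debug, PartialEq)]
enum Command {
    Push(i64),
    Apply(String),
}

#[derive(Debug, PartialEq)]
struct Instruction {
    bang: Bang,
    command: Command,
}

/// Parses one instruction such as `3!` or `!add`.
/// Returns `None` when the `!` is missing or the command is empty.
fn parse_instruction(src: &str) -> Option<Instruction> {
    let (bang, body) = if let Some(body) = src.strip_prefix('!') {
        (Bang::Left, body)
    } else if let Some(body) = src.strip_suffix('!') {
        (Bang::Right, body)
    } else {
        return None;
    };
    if body.is_empty() || body.contains('!') {
        return None;
    }
    // Anything that parses as an integer is a push; the rest is a function name.
    let command = match body.parse::<i64>() {
        Ok(n) => Command::Push(n),
        Err(_) => Command::Apply(body.to_string()),
    };
    Some(Instruction { bang, command })
}

/// Splits a program on whitespace and parses every instruction in order.
fn parse_program(src: &str) -> Option<Vec<Instruction>> {
    src.split_whitespace().map(parse_instruction).collect()
}

fn main() {
    let program = parse_program("3! !5 !2 sub! !add").expect("valid program");
    println!("{:?}", program);
}
```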

However, I wanted to make a macro to shorten the above by, like, 5 characters. And also because I haven't written many macros and this seemed like a good exercise. What I want to be able to do is to call let mut s = script!(3! !5 !2 sub! !add), and then this should recursively construct the program instance. (Yes, I know I could just stringify! everything inside the brackets and send it to Script::new(), but that wouldn't be fun).

So, since instructions can contain either integer literals or typed-out commands, and I haven't been able to really figure out how macro_rules descriptors work, this is what I have so far (I also wrote a Script::prepend() method to prepend an instruction to the instruction queue, to get the recursion going; trying to append instead immediately got me into ambiguous token parsing because of the ordering):

macro_rules! script {
    ($val:literal !) => {
        Script::new(format!("{}!", $val))
    };
    ($cmd:ident !) => {
        Script::new(format!("{}!", stringify!($cmd)))
    };
    (!$val:literal) => {
        Script::new(format!("!{}", $val))
    };
    (!$cmd:ident) => {
        Script::new(format!("!{}", stringify!($cmd)))
    };
    (!$val:literal $(!$vals:tt)*) => {
        script!($(!$vals)*).prepend(&format!("!{}", $val)[..])
    };
    () => {
        Script::new("")
    };
}

So this works great for single-instruction programs. However, I am at a loss for what to do with multiple-instruction programs. I have written one arm (of what seems at the moment to be 4, just like the single-instruction case). I have tried to do a lot of things to the exclamation mark in $(!$vals:tt)*, and none of them work.

This is the only one I've been able to write that actually works, albeit on the small subset of cases where all instructions have the exclamation mark on the left side. For instance, trying to make exclamation marks optional on either side with $($(!)?$vals:tt$(!)?)* apparently runs into ambiguous parsing issues. Matching on (!$val:literal $($vals:tt)*) for some reason can't parse script!(!5 !6) and gets hung up on the exclamation mark next to the 6, and using just (!$val:literal $vals:tt) without the repetition marker gets hung up on the 6 instead. I don't know that I can make the tt more restrictive and still catch both the integer literals and the function names, and I need the repeated part to catch both of those cases at once, because otherwise I would have to write a lot of arms.
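One detail that explains the single-tt failure: a `tt` fragment matches exactly one token tree, and an instruction like `!6` is two of them (`!` and `6`). A small counting macro makes this visible. `count_tts` is a hypothetical helper written only for this illustration.

```rust
// Counts the token trees it is given, one `tt` at a time.
macro_rules! count_tts {
    () => { 0usize };
    ($head:tt $($tail:tt)*) => { 1usize + count_tts!($($tail)*) };
}

fn main() {
    // `!`, `5`, `!`, `6` are four separate token trees.
    println!("{}", count_tts!(!5 !6)); // prints 4
    // A parenthesized group counts as a single token tree.
    println!("{}", count_tts!((!5) (!6))); // prints 2
}
```

So after `!$val:literal` has consumed `!5`, a lone `$vals:tt` can only take the next `!`, which leaves the `6` unmatched.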

How can I change the $(!$vals:tt)* part to be agnostic to which side the exclamation mark is on? And is there a way to avoid having to write four arms for the base case and four arms for the recursion case? (Exclamation mark on left versus right and literal versus ident. Maybe the empty case could actually be the base case, saving me there. But I can't really know until I can make the recursion work.) Finally, and perhaps much more of a long shot: Is there a way to reverse the order of the recursion, so that I can append rather than prepend?

CodePudding user response:

In many cases such parsing requires push-down accumulation, but you are lucky: there is a very simple solution here. Instead of trying to capture every instruction in one pattern, parse just the first instruction and forward the rest as $($vals:tt)*:

macro_rules! script {
    ($val:literal ! $($vals:tt)*) => {
        script!($($vals)*).prepend(format!("{}!", $val))
    };
    ($cmd:ident ! $($vals:tt)*) => {
        script!($($vals)*).prepend(format!("{}!", stringify!($cmd)))
    };
    (!$val:literal $($vals:tt)*) => {
        script!($($vals)*).prepend(format!("!{}", $val))
    };
    (!$cmd:ident $($vals:tt)*) => {
        script!($($vals)*).prepend(format!("!{}", stringify!($cmd)))
    };
    () => {
        Script::new(String::new())
    };
}

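To try the recursive macro outside the interpreter, it can be paired with a stand-in `Script`. The struct below is an assumption made for illustration (it only stores instruction strings, and `new` and `prepend` take `String` to match the calls the macro generates); the real type parses each instruction.

```rust
// Stand-in for the interpreter's type: it just records instruction strings.
#[derive(Debug)]
struct Script {
    instructions: Vec<String>,
}

impl Script {
    fn new(src: String) -> Self {
        Script {
            instructions: src.split_whitespace().map(str::to_owned).collect(),
        }
    }

    // Puts one instruction at the front of the queue.
    fn prepend(mut self, instruction: String) -> Self {
        self.instructions.insert(0, instruction);
        self
    }
}

// Each arm peels off the first instruction and recurses on the remaining tokens.
macro_rules! script {
    ($val:literal ! $($vals:tt)*) => {
        script!($($vals)*).prepend(format!("{}!", $val))
    };
    ($cmd:ident ! $($vals:tt)*) => {
        script!($($vals)*).prepend(format!("{}!", stringify!($cmd)))
    };
    (!$val:literal $($vals:tt)*) => {
        script!($($vals)*).prepend(format!("!{}", $val))
    };
    (!$cmd:ident $($vals:tt)*) => {
        script!($($vals)*).prepend(format!("!{}", stringify!($cmd)))
    };
    () => {
        Script::new(String::new())
    };
}

fn main() {
    let s = script!(3! !5 !2 sub! !add);
    // The innermost call builds the empty script; every level then prepends
    // its own instruction, so the source order is restored.
    println!("{:?}", s.instructions);
}
```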

As for reversing the order, I'd recommend not bothering. The prepend approach can be made efficient as it is: append each instruction to the end of the vector, then reverse() it once. However, for the sake of fun, it is possible, but it requires push-down accumulation. Here's an example:

macro_rules! script_impl {
    {
        [ $($parsed:tt)* ]
        $val:literal !
        $($rest:tt)*
    } => {
        script_impl! {
            [
                // Keep the already-parsed entries first and push the new
                // one after them, so source order is preserved.
                $($parsed)*
                // The parentheses are important: they make each entry a
                // single `tt`. Parentheses are used because the result is
                // still a valid expression producing `&str` (braces would
                // also work, but brackets would not).
                ( concat!(stringify!($val), "!") )
            ]
            $($rest)*
        }
    };
    {
        [ $($parsed:tt)* ]
        ! $val:literal
        $($rest:tt)*
    } => {
        script_impl! {
            [
                $($parsed)*
                ( concat!("!", stringify!($val)) )
            ]
            $($rest)*
        }
    };
    {
        [ $($parsed:tt)* ]
        $cmd:ident !
        $($rest:tt)*
    } => {
        script_impl! {
            [
                $($parsed)*
                ( concat!(stringify!($cmd), "!") )
            ]
            $($rest)*
        }
    };
    {
        [ $($parsed:tt)* ]
        ! $cmd:ident
        $($rest:tt)*
    } => {
        script_impl! {
            [
                $($parsed)*
                ( concat!("!", stringify!($cmd)) )
            ]
            $($rest)*
        }
    };
    {
        [ $first_parsed:tt $($rest_parsed:tt)* ]
        // Finished parsing
    } => {
        Script::new($first_parsed)
            $( .append($rest_parsed) )*
    };
}
macro_rules! script {
    ( $($t:tt)* ) => {
        script_impl! {
            [ ]
            $($t)*
        }
    }
}
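For experimenting, here is a compact, self-contained variant of the same push-down accumulation idea. It is a sketch under a few assumptions: it uses internal `@acc` rules instead of a separate helper macro, it always adds the newly parsed entry at the end of the accumulator so the instructions come out in source order, and the `Script` type with `new(&str)` and `append(&str)` is a stand-in that only records strings.

```rust
// Stand-in type: records instructions in the order they are appended.
#[derive(Debug)]
struct Script {
    instructions: Vec<String>,
}

impl Script {
    fn new(first: &str) -> Self {
        Script { instructions: vec![first.to_owned()] }
    }

    fn append(mut self, instruction: &str) -> Self {
        self.instructions.push(instruction.to_owned());
        self
    }
}

macro_rules! script {
    // Internal rules have the shape `@acc [parsed entries] remaining tokens`.
    // Every parsed entry is wrapped in parentheses so it is a single `tt`.
    (@acc [$($done:tt)*] $val:literal ! $($rest:tt)*) => {
        script!(@acc [$($done)* (concat!(stringify!($val), "!"))] $($rest)*)
    };
    (@acc [$($done:tt)*] ! $val:literal $($rest:tt)*) => {
        script!(@acc [$($done)* (concat!("!", stringify!($val)))] $($rest)*)
    };
    (@acc [$($done:tt)*] $cmd:ident ! $($rest:tt)*) => {
        script!(@acc [$($done)* (concat!(stringify!($cmd), "!"))] $($rest)*)
    };
    (@acc [$($done:tt)*] ! $cmd:ident $($rest:tt)*) => {
        script!(@acc [$($done)* (concat!("!", stringify!($cmd)))] $($rest)*)
    };
    // Nothing left to parse: emit one `new` call followed by the appends.
    (@acc [$first:tt $($others:tt)*]) => {
        Script::new($first) $( .append($others) )*
    };
    // Public entry point: start with an empty accumulator.
    ($($input:tt)*) => {
        script!(@acc [] $($input)*)
    };
}

fn main() {
    let s = script!(3! !5 !2 sub! !add);
    println!("{:?}", s.instructions);
}
```

This sketch does not handle an empty program, since `Script::new` here needs a first instruction; a dedicated `(@acc [])` arm would be needed for that.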

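The simpler route mentioned earlier (push to the end of the vector, then reverse once) can be sketched as follows. `ScriptBuilder`, its `reversed` field and `finish` are hypothetical names; the idea is only that each `prepend` becomes a cheap push, and a single `reverse()` at the end restores program order.

```rust
// Stores instructions back to front so that "prepending" is a cheap push.
#[derive(Debug, Default)]
struct ScriptBuilder {
    reversed: Vec<String>,
}

impl ScriptBuilder {
    fn new() -> Self {
        Self::default()
    }

    // Logically puts `instruction` at the front of the program.
    fn prepend(mut self, instruction: &str) -> Self {
        self.reversed.push(instruction.to_owned());
        self
    }

    // Reverses once to obtain the instructions in program order.
    fn finish(mut self) -> Vec<String> {
        self.reversed.reverse();
        self.reversed
    }
}

fn main() {
    // Same call order the recursive macro produces: last instruction first.
    let program = ScriptBuilder::new()
        .prepend("!add")
        .prepend("sub!")
        .prepend("3!")
        .finish();
    println!("{:?}", program);
}
```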
