Home > Software design >  Avoiding stack overflow from recursion in Clojure
Avoiding stack overflow from recursion in Clojure

Time:09-17

I'm new to Clojure and am having trouble figuring out how to avoid stack overflows in certain situations. One such situation came up while trying to port a parsing project to Clojure with a parser combinator library I found called kern.

Kern defines a recursive implementation for a "many-till" parser: source

This works fine for small inputs:

(def input-text "The blue {cat} and the red {dog} became best friends with the white {wolf} END {not included}")

(def  between-brackets "parser that grabs all text between brackets"
  (between (sym* \{) (sym* \}) (< > (many (none-of* "}")))))

(def parse-enclosed-words "parser that attempts to grab text between brackets, 
skips a character when it can't, and halts when it hits the string END"
  (many-till (<|> between-brackets (skip any-char)) (token* "END")))

(filter #(some? %) (value parse-enclosed-words input-text)) ;; => ("cat" "dog" "wolf")

Unfortunately, the parser encounters stack overflows as input strings grow:

(def file-input-text (slurp (io/resource "[input-text-20x-repeated.txt][2]") ))
(filter #(some? %) (value parse-enclosed-words file-input-text)) ;; => Unhandled java.lang.StackOverflowError

From what I can tell by reading online, this is probably due to the function using stack-consuming recursion. I played around with some attempts to re-write the function using the "recur" keyword, but since the recursive call is not in tail position this didn't seem to work.

How can many-till be modified to avoid stack overflow?

CodePudding user response:

I don't think you can do it with the abstractions the library provides. This is a very natural definition of many-till, and would work fine in Parsec, the library that Kern is obviously inspired by. But Clojure doesn't have Haskell's lazy evaluation and automatic trampolining, so the nested lambdas that many-till builds inevitably consume unbounded stack. You would need an implementation more like its implementation of many, which builds a parser by hand out of a function. I would include its source below, but I don't own it and so I don't think I am authorized to give Stack Overflow a CC BY-SA 4.0 license to it, as posting it would do. Instead, here is a link to its source.

  • Related