Home > Blockchain >  Alpha Beta Pruning with Recursion Schemes
Alpha Beta Pruning with Recursion Schemes

Time:06-08

I'm trying to get more proficient with recursion schemes as they have so far been really helpful for turning gnarly explicit recursion code into something less spike-y. One of the other tools I tend to reach for when implementing algorithms that can get really confusing with explicit recursion is monad transformers / mutability. Ideally I'd like to get comfortable enough with recursion schemes such that I can ditch statefulness altogether. An example of an algorithm I'd still reach for the transformers for is minimax with alpha beta pruning. I did normal minimax with a catamorphism and minimax f-algebra (data MinimaxF a f = MMResult a | MMState [f] Bool), but I wasn't sure how I could extend this to do alpha beta pruning. I thought maybe I could use histomorphism, or maybe there was some custom solution with comonads, but I didn't know how to approach trying a solution using either technique.

In addition to a version of alpha beta pruning with recursion schemes any general advice you have about tackling similar problems would be much appreciated. For example I've had trouble applying recursion schemes to algorithms like Dijkstra that usually are implemented in an imperative fashion.

CodePudding user response:

Alpha-beta can be seen as an instance of minimax, where min and max are instantiated using a well-chosen lattice. Full gist.

We represent games as a tree, where each internal node is a position in the game, waiting for a designated player to pick a move to a child node, and each leaf is a final position with its score, or value.

-- | At every step, either the game ended with a value/score,
-- or one of the players is to play.
data GameF a r = Value a | Play Player (NonEmpty r)
 deriving Functor
type Game a = Fix (GameF a)

-- | One player wants to maximize the score,
-- the other wants to minimize the score.
data Player = Mini | Maxi

minimax will work on any lattice, defined by the following class:

class Lattice l where
  inf, sup :: l -> l -> l

The Lattice class is more general than Ord: and Ord instance is a Lattice with decidable equality (Eq). If we could redefine Ord, then it would be appropriate to add Lattice as a superclass. But here a newtype will have to do:

-- The Lattice induced by an Ord
newtype Order a = Order { unOrder :: a }
  deriving (Eq, Ord)

instance Ord a => Lattice (Order a) where
  inf = min
  sup = max

Here's minimax. It is parameterized by an embedding leaf :: a -> l of final values to the chosen lattice. One player maximizes the embedded value, the other player minimizes it.

-- | Generalized minimax
gminimax :: Lattice l => (a -> l) -> Game a -> l
gminimax leaf = cata minimaxF where
  minimaxF (Value x) = leaf x
  minimaxF (Play p xs) = foldr1 (lopti p) xs

lopti :: Lattice l => Player -> l -> l -> l
lopti Mini = inf
lopti Maxi = sup

The "regular" minimax uses the scores of the game directly as the lattice:

minimax :: Ord a => Game a -> a
minimax = unOrder . gminimax Order

For alpha-beta pruning, the idea is that we can keep track of some bounds on the optimal score, and this allows us to short-circuit the search. So the search is to be parameterized by that interval (alpha, beta). This leads us to a lattice of functions Interval a -> a:

newtype Pruning a = Pruning { unPruning :: Interval a -> a }

An interval can be represented by (Maybe a, Maybe a) to allow either side to be unbounded. But we shall use better named types for clarity, and also to leverage a different Ord instance on each side:

type Interval a = (WithBot a, WithTop a)
data WithBot a = Bot | NoBot a deriving (Eq, Ord)
data WithTop a = NoTop a | Top deriving (Eq, Ord)

We will require that we can only construct Pruning f if f satisfies clamp i (f i) = clamp i (f (Bot, Top)), where clamp is defined below. That way, f is a search algorithm which may shortcircuit if it learns that its result lies outside of the interval, without having to find the exact result.

clamp :: Ord a => Interval a -> a -> a
clamp (l, r) = clampBot l . clampTop r

clampBot :: Ord a => WithBot a -> a -> a
clampBot Bot x = x
clampBot (NoBot y) x = max y x

clampTop :: Ord a => WithTop a -> a -> a
clampTop Top x = x
clampTop (NoTop y) x = min y x

Functions form a lattice by pointwise lifting. And when we consider only functions satisfying clamp i (f i) = clamp i (f (Bot, Top)) and equate them modulo a suitable equivalence relation (Pruning f = Pruning g if clamp <*> f = clamp <*> g), a short-circuiting definition of the lattice becomes possible.

The inf of two functions l and r, given an interval i = (alpha, beta), first runs l (alpha, beta) to obtain a value vl. If vl <= alpha, then it must be clamp i vl == alpha == clamp i (min vl (r i)) so we can stop and return vl without looking at r. Otherwise, we run r, knowing that the final result is not going to be more than vl so we can also update the upper bound passed to r. sup is defined symmetrically.

instance Ord a => Lattice (Pruning a) where
  inf l r = Pruning \(alpha, beta) ->
    let vl = unPruning l (alpha, beta) in
    if NoBot vl <= alpha then vl else min vl (unPruning r (alpha, min (NoTop vl) beta))

  sup l r = Pruning \(alpha, beta) ->
    let vl = unPruning l (alpha, beta) in
    if beta <= NoTop vl then vl else max vl (unPruning r (max (NoBot vl) alpha, beta))

Thus we obtain alpha-beta as an instance of minimax. Once the lattice above is defined, we only need some simple wrapping and unwrapping.

alphabeta :: Ord a => Game a -> a
alphabeta = runPruning . gminimax constPruning

constPruning :: Ord a => a -> Pruning a
constPruning x = Pruning \i -> clamp i x

runPruning :: Pruning a -> a
runPruning f = unPruning f (Bot, Top)

If all goes well, alphabeta and minimax should have the same result:

main :: IO ()
main = quickCheck \g -> minimax g === alphabeta (g :: Game Int)
  • Related