Minimax algorithm incorrectly acting upon recognizing win/loss scenarios-CodePudding

The following minimax algorithm in python is meant to find the values of possible moves in a given connect four board. It is called by another function, computer_move, which is also shown below.

Expected behavior: minimax will return evaluations of a position based on a board state and a player.

If the player is one, it returns the evaluation of the highest value move.
If the player is two, it returns the evaluation of the lowest value move.

minimax is called by computer_move, which gets the returned values and chooses the best from them.

Actual behavior: computer_move seems to successfully choose the best move from minimax's returned values, but minimax does not properly evaluate different moves. Specifically, winning cases or cases where a win can be prevented are not properly evaluated. Intermediate situations have unknown behavior because the static evaluation function has not been implemented yet. However, in cases where a win is one-off by either player, the algorithm fails to react correctly.

I've tried switching various signs and swapping mins/maxes in both functions, but this did not seem to resolve the issue. Careful use of print statements also showed me that computer_move is correctly processing returned values, but minimax is returning incorrect values, suggesting that there's some error with the algorithm. However, it seems to be a textbook minimax algorithm, at least as far as I can see.

Does anyone have some suggestions for what the issue might be? Thanks!

Minimax function:

def minimax(board, depth, alpha, beta, player, move):
    # Base cases
    if check_winning(board, 1) or check_winning(board, 2):
        player = player*2-3     #Convert to  /-1
        return player*math.inf
    elif depth==0:
        return static_eval(board)

    # Player 1 finds highest value move
    if player==1:
        max_eval = -math.inf
        for move in range(COLUMNS):
            if valid_move(board, move):
                make_move(board, move, player)
                eval = minimax(board, depth-1, alpha, beta, 2, move)
                unmake_move(board, move)
                max_eval = max(max_eval, eval)
                alpha = max(alpha, eval)
                if beta <= alpha:
                    break
        return max_eval

    # Player 2 finds lowest (for player one) value move
    else:
        min_eval = math.inf
        for move in range(COLUMNS):
            if valid_move(board, move):
                make_move(board, move, player)
                #print_board(board)
                eval = minimax(board, depth-1, alpha, beta, 1, move)
                unmake_move(board, move)
                min_eval = min(min_eval, eval)
                beta = min(beta, eval)
                if beta <= alpha:
                    break
        return min_eval

Calling the minimax function and implementing the best move:

# Get list of move values, and choose highest/lowest depending on player

def computer_move(board, player, difficulty=3):
    move_vals=[]
    for move in range(COLUMNS):
        if(valid_move(board, move)):
            move_vals.append(minimax(board,
                                     difficulty, 
                                     -math.inf, 
                                     math.inf, 
                                     player, 
                                     move))
        else:
            move_vals.append(-math.inf)

    min_val = min(move_vals)
    max_val = max(move_vals)
    if(player==1):
        move = move_vals.index(max_val)
    else:
        move = move_vals.index(min_val)

    make_move(board, move, player)

    print('Here is the computer\'s move:')
    print_board(board)

CodePudding user response：

One problem is here:

 if check_winning(board, 1) or check_winning(board, 2):
        player = player*2-3     #Convert to  /-1
        return player*math.inf

In your case you are sending back a score of -infininty if player 1 is winning and it is his turn. Split the cases up into different statements instead which will make it easier to debug. Also you don't need depth == 0, you can check for when the board is full instead and then it is a draw if that is the case.

if check_winning(board, current_player):
    return math.inf
if check_winning(board, the_other_player):
    return -math.inf
if is_board_full(board):
    return 0

You probably also want to include the depth to always go for the shortest possible win/longest possible lose, and you also don't need to have such large values as inf:

if check_winning(board, current_player):
    return (10 depth)
if check_winning(board, the_other_player):
    return -(10 depth)
if is_board_full(board):
    return 0

CodePudding user response：

It turns out there were two issues:

Because the minimax function is hardcoded to have player one maximize and player two minimize, player one winning must return a positive value, and player two winning must always return a negative value
The computer_move function needs to add the potential moves to the board before calling the minimax function, and unmake them after

Here is the updated code:

def minimax(board, depth, alpha, beta, player, previous_move):
    # Change 1
    if check_winning(board, 1):
        return 100 depth
    if check_winning(board, 2):
        return -(100 depth)
    elif depth==0:
        return static_eval(board)

    # Player 1 finds highest value move
    if player==1:
        max_move_eval = -math.inf
        for move in range(COLUMNS):
            if valid_move(board, move):
                make_move(board, move, player)
                move_eval = minimax(board, depth-1, alpha, beta, 2, move)
                unmake_move(board, move)
                max_move_eval = max(max_move_eval, move_eval)
                alpha = max(alpha, move_eval)
                if beta <= alpha:
                    break
        return max_move_eval

    # Player 2 finds lowest (for player one) value move
    else:
        min_move_eval = math.inf
        for move in range(COLUMNS):
            if valid_move(board, move):
                #print_board(board)
                make_move(board, move, player)
                move_eval = minimax(board, depth-1, alpha, beta, 1, move)
                unmake_move(board, move)
                min_move_eval = min(min_move_eval, move_eval)
                beta = min(beta, move_eval)
                if beta <= alpha:
                    break
        return min_move_eval

def computer_move(board, difficulty, player):
    other_player = int(not bool(player-1)) 1
    move_vals=[]
    for move in range(COLUMNS):
        if(valid_move(board, move)):
            # Change 2
            make_move(board, move, player)
            move_vals.append(minimax(board,
                                     difficulty, 
                                     -math.inf, 
                                     math.inf, 
                                     other_player, 
                                     move))
            unmake_move(board, move)
        else:
            if(player==2):
                move_vals.append(-math.inf)
            if(player==1):
                move_vals.append(math.inf)

    min_val = min(move_vals)
    max_val = max(move_vals)
    if(player==1):
        move = move_vals.index(max_val)
    else:
        move = move_vals.index(min_val)

    make_move(board, move, player)

    print('Here is the computer\'s move:')
    print_board(board)