The following minimax algorithm in python is meant to find the values of possible moves in a given connect four board. It is called by another function, computer_move
, which is also shown below.
Expected behavior: minimax
will return evaluations of a position based on a board state and a player.
If the player is one, it returns the evaluation of the highest value move.
If the player is two, it returns the evaluation of the lowest value move.
minimax
is called by computer_move
, which gets the returned values and chooses the best from them.
Actual behavior: computer_move
seems to successfully choose the best move from minimax
's returned values, but minimax does not properly evaluate different moves. Specifically, winning cases or cases where a win can be prevented are not properly evaluated. Intermediate situations have unknown behavior because the static evaluation function has not been implemented yet. However, in cases where a win is one-off by either player, the algorithm fails to react correctly.
I've tried switching various signs and swapping mins/maxes in both functions, but this did not seem to resolve the issue. Careful use of print statements also showed me that computer_move
is correctly processing returned values, but minimax
is returning incorrect values, suggesting that there's some error with the algorithm. However, it seems to be a textbook minimax algorithm, at least as far as I can see.
Does anyone have some suggestions for what the issue might be? Thanks!
Minimax function:
def minimax(board, depth, alpha, beta, player, move):
# Base cases
if check_winning(board, 1) or check_winning(board, 2):
player = player*2-3 #Convert to /-1
return player*math.inf
elif depth==0:
return static_eval(board)
# Player 1 finds highest value move
if player==1:
max_eval = -math.inf
for move in range(COLUMNS):
if valid_move(board, move):
make_move(board, move, player)
eval = minimax(board, depth-1, alpha, beta, 2, move)
unmake_move(board, move)
max_eval = max(max_eval, eval)
alpha = max(alpha, eval)
if beta <= alpha:
break
return max_eval
# Player 2 finds lowest (for player one) value move
else:
min_eval = math.inf
for move in range(COLUMNS):
if valid_move(board, move):
make_move(board, move, player)
#print_board(board)
eval = minimax(board, depth-1, alpha, beta, 1, move)
unmake_move(board, move)
min_eval = min(min_eval, eval)
beta = min(beta, eval)
if beta <= alpha:
break
return min_eval
Calling the minimax function and implementing the best move:
# Get list of move values, and choose highest/lowest depending on player
def computer_move(board, player, difficulty=3):
move_vals=[]
for move in range(COLUMNS):
if(valid_move(board, move)):
move_vals.append(minimax(board,
difficulty,
-math.inf,
math.inf,
player,
move))
else:
move_vals.append(-math.inf)
min_val = min(move_vals)
max_val = max(move_vals)
if(player==1):
move = move_vals.index(max_val)
else:
move = move_vals.index(min_val)
make_move(board, move, player)
print('Here is the computer\'s move:')
print_board(board)
CodePudding user response:
One problem is here:
if check_winning(board, 1) or check_winning(board, 2):
player = player*2-3 #Convert to /-1
return player*math.inf
In your case you are sending back a score of -infininty if player 1 is winning and it is his turn. Split the cases up into different statements instead which will make it easier to debug. Also you don't need depth == 0, you can check for when the board is full instead and then it is a draw if that is the case.
if check_winning(board, current_player):
return math.inf
if check_winning(board, the_other_player):
return -math.inf
if is_board_full(board):
return 0
You probably also want to include the depth to always go for the shortest possible win/longest possible lose, and you also don't need to have such large values as inf:
if check_winning(board, current_player):
return (10 depth)
if check_winning(board, the_other_player):
return -(10 depth)
if is_board_full(board):
return 0
CodePudding user response:
It turns out there were two issues:
- Because the
minimax
function is hardcoded to have player one maximize and player two minimize, player one winning must return a positive value, and player two winning must always return a negative value - The
computer_move
function needs to add the potential moves to the board before calling theminimax
function, and unmake them after
Here is the updated code:
def minimax(board, depth, alpha, beta, player, previous_move):
# Change 1
if check_winning(board, 1):
return 100 depth
if check_winning(board, 2):
return -(100 depth)
elif depth==0:
return static_eval(board)
# Player 1 finds highest value move
if player==1:
max_move_eval = -math.inf
for move in range(COLUMNS):
if valid_move(board, move):
make_move(board, move, player)
move_eval = minimax(board, depth-1, alpha, beta, 2, move)
unmake_move(board, move)
max_move_eval = max(max_move_eval, move_eval)
alpha = max(alpha, move_eval)
if beta <= alpha:
break
return max_move_eval
# Player 2 finds lowest (for player one) value move
else:
min_move_eval = math.inf
for move in range(COLUMNS):
if valid_move(board, move):
#print_board(board)
make_move(board, move, player)
move_eval = minimax(board, depth-1, alpha, beta, 1, move)
unmake_move(board, move)
min_move_eval = min(min_move_eval, move_eval)
beta = min(beta, move_eval)
if beta <= alpha:
break
return min_move_eval
def computer_move(board, difficulty, player):
other_player = int(not bool(player-1)) 1
move_vals=[]
for move in range(COLUMNS):
if(valid_move(board, move)):
# Change 2
make_move(board, move, player)
move_vals.append(minimax(board,
difficulty,
-math.inf,
math.inf,
other_player,
move))
unmake_move(board, move)
else:
if(player==2):
move_vals.append(-math.inf)
if(player==1):
move_vals.append(math.inf)
min_val = min(move_vals)
max_val = max(move_vals)
if(player==1):
move = move_vals.index(max_val)
else:
move = move_vals.index(min_val)
make_move(board, move, player)
print('Here is the computer\'s move:')
print_board(board)