How can I improve the performance of a Sudoku solver?

I can't improve the performance of the following Sudoku Solver code. I know there are 3 loops here and they probably cause slow performance but I can't find a better/more efficient way. "board" is mutated with every iteration of recursion - if there are no zeros left, I just need to exit the recursion.

I tried to isolate "board" from mutation but it hasn't changed the performance. I also tried to use list comprehension for the top 2 "for" loops (i.e. only loop through rows and columns with zeros), tried to find coordinates of all zeros, and then use a single loop to go through them - hasn't helped.

I think I'm doing something fundamentally wrong here with recursion - any advice or recommendation on how to make the solution faster?

def box(board,row,column):
    start_col = column - (column % 3)
    finish_col = start_col + 3
    start_row = row - (row % 3)
    finish_row = start_row + 3
    return [y for x in board[start_row:finish_row] for y in x[start_col:finish_col]]

def possible_values(board,row,column):
    values = {1,2,3,4,5,6,7,8,9}
    col_values = [v[column] for v in board]
    row_values = board[row]
    box_values = box(board, row, column)
    return (values - set(row_values + col_values + box_values))

def solve(board, i_row = 0, i_col = 0):
    for rn in range(i_row,len(board)):
        if rn != i_row: i_col = 0
        for cn in range(i_col,len(board)):
            if board[rn][cn] == 0:
                options = possible_values(board, rn, cn)
                for board[rn][cn] in options:
                    if solve(board, rn, cn): 
                        return board
                    board[rn][cn] = 0
                #if no options left for the cell, go to previous cell and try next option
                return False
    #if no zeros left on the board, problem is solved
    return True

problem = [
            [9, 0, 0, 0, 8, 0, 0, 0, 1],
            [0, 0, 0, 4, 0, 6, 0, 0, 0],
            [0, 0, 5, 0, 7, 0, 3, 0, 0],
            [0, 6, 0, 0, 0, 0, 0, 4, 0],
            [4, 0, 1, 0, 6, 0, 5, 0, 8],
            [0, 9, 0, 0, 0, 0, 0, 2, 0],
            [0, 0, 7, 0, 3, 0, 2, 0, 0],
            [0, 0, 0, 7, 0, 5, 0, 0, 0],
            [1, 0, 0, 0, 4, 0, 0, 0, 7]
        ]

solve(problem)

CodePudding user response：

Three things you can do to speed this up:

1. Maintain additional state using arrays of integers to keep track of row, col, and box candidates (or equivalently, values already used) so that finding possible values is just possible_values = row_candidates[row] & col_candidates[col] & box_candidates[box]. This is a constant-factor speedup that changes very little about your approach.
2. As kosciej16 suggested, use the min-remaining-values heuristic for selecting which cell to fill next. This will turn your algorithm into crypto-DPLL, giving you early conflict detection (cells with 0 candidates), constraint propagation (cells with 1 candidate), and a lower effective branching factor for the rest.
3. Add logic to detect hidden singles (like the Norvig solver does). This will make your solver a little slower for the simplest puzzles, but it will make a huge difference for puzzles where hidden singles are important (like 17-clue puzzles).
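
To make the first two suggestions concrete, here is a minimal sketch, not the answerer's own code. It assumes a 9x9 board with 0 for empty cells, as in the question. The name solve_bitmask and the lists rows, cols and boxes are hypothetical: each holds one integer per row, column or box, where bit d-1 is set while digit d is still unused there. Hidden-single detection (the third suggestion) is not included.

```python
def solve_bitmask(board):
    """Solve a 9x9 Sudoku in place (0 = empty). Return True if solved."""
    full = 0b111111111  # bit d-1 set means digit d is still available
    # Hypothetical helper state: digits still unused in each row, column, box.
    rows = [full] * 9
    cols = [full] * 9
    boxes = [full] * 9
    empties = []
    for r in range(9):
        for c in range(9):
            d = board[r][c]
            if d:
                bit = 1 << (d - 1)
                rows[r] &= ~bit
                cols[c] &= ~bit
                boxes[r // 3 * 3 + c // 3] &= ~bit
            else:
                empties.append((r, c))

    def search():
        # Min-remaining-values: pick the empty cell with the fewest candidates.
        best, best_mask, best_count = None, 0, 10
        for r, c in empties:
            if board[r][c]:
                continue  # already filled on this branch
            mask = rows[r] & cols[c] & boxes[r // 3 * 3 + c // 3]
            count = bin(mask).count("1")
            if count == 0:
                return False  # dead end: an empty cell has no candidates
            if count < best_count:
                best, best_mask, best_count = (r, c), mask, count
                if count == 1:
                    break  # a forced cell; no better choice exists
        if best is None:
            return True  # no empty cells left, so the board is solved
        r, c = best
        b = r // 3 * 3 + c // 3
        for d in range(1, 10):
            bit = 1 << (d - 1)
            if best_mask & bit:
                # Place d and mark it as used in its row, column and box.
                board[r][c] = d
                rows[r] &= ~bit
                cols[c] &= ~bit
                boxes[b] &= ~bit
                if search():
                    return True
                # Undo the placement before trying the next digit.
                rows[r] |= bit
                cols[c] |= bit
                boxes[b] |= bit
                board[r][c] = 0
        return False

    return search()
```

Calling solve_bitmask(problem) on the grid from the question fills it in place and returns True if a solution exists. The sketch does not check that the given clues are consistent with each other, so it assumes a valid puzzle.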

CodePudding user response：

Here is what worked in the end, thanks to 53x15 and kosciej16. It is not ideal or optimal, but it passes the required performance test:

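def solve(board, i_row = 0, i_col = 0):
    # i_row and i_col are no longer used; they are kept so existing calls still work
    # collect every empty cell together with its set of candidate values
    cells_to_solve = [((rn, cn), possible_values(board,rn,cn)) for rn in range(len(board)) for cn in range(len(board)) if board[rn][cn] == 0]
    if not cells_to_solve: return True

    min_n_of_values = min(len(x[1]) for x in cells_to_solve)
    # an empty cell with no candidates means this branch is a dead end
    if min_n_of_values == 0: return False

    best_cells_to_try = [((rn,cn),options) for ((rn,cn),options) in cells_to_solve if len(options) == min_n_of_values]

    for ((rn,cn),options) in best_cells_to_try:
        for board[rn][cn] in options:
            if solve(board, rn, cn): 
                return board
            board[rn][cn] = 0
        # only the first minimum-candidate cell is tried: if none of its options work, the branch fails
        return False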