To solve problem 83 of project euler I tried to use the A* algorithm. The algorithm works fine for the given problem and I get the correct result. But when I visualized the algorithm I realized that it seems as if the algorithm checked way to many possible nodes. Is it because I didn't implement the algorithm properly or am I missing something else? I tried using two different heuristic functions which you can see in the code below, but the output didn't change much.
If there are any tips to make the code more efficient I would also be thankful for that as I am quite a beginner.
import heapq
import matplotlib.pyplot as plt
from matplotlib.colors import ListedColormap
from matplotlib import animation
import numpy as np
class PriorityQueue:
def __init__(self):
self.elements = []
def empty(self):
return not self.elements
def put(self, item, priority):
heapq.heappush(self.elements, (priority, item))
def get(self):
return heapq.heappop(self.elements)[1]
class A_star:
def __init__(self, data, start, end):
self.data = data
self.start = start
self.end = end
self.a = len(self.data)
self.b = len(self.data[0])
def h_matrix(self):
elements = sorted([self.data[i][j] for j in range(self.b) for i in range(self.a)])
n = self.a self.b - 1
minimum = elements[:n]
h = []
for i in range(self.a):
h_i = []
for j in range(self.b):
h_i.append(sum(minimum[:(n-i-j-1)]))
h.append(h_i)
return h
def astar(self):
h = self.h_matrix()
open_list = PriorityQueue()
open_list.put(self.start, 0)
came_from = {}
cost_so_far = {}
came_from[self.start] = None
cost_so_far[self.start] = self.data[0][0]
checked = []
while not open_list.empty():
current = open_list.get()
checked.append(current)
if current == self.end:
break
neighbors = [(current[0] x, current[1] y) for x, y in {(-1,0), (0,-1), (1,0), (0,1)}
if 0 <= current[0] x < self.a and 0 <= current[1] y < self.b]
for next in neighbors:
new_cost = cost_so_far[current] self.data[next[0]][next[1]]
if next not in cost_so_far or new_cost < cost_so_far[next]:
cost_so_far[next] = new_cost
priority = new_cost h[next[0]][next[1]]
open_list.put(next, priority)
came_from[next] = current
return came_from, checked, cost_so_far[self.end]
def reconstruct_path(self):
paths = self.astar()[0]
best_path = [self.end]
while best_path[0] is not None:
new = paths[best_path[0]]
best_path.insert(0, new)
return best_path[1:]
def minimum(self):
return self.astar()[2]
if __name__ == "__main__":
liste = [[131, 673, 234, 103, 18], [201, 96, 342, 965, 150], [630, 803, 746, 422, 111], [537, 699, 497, 121, 956], [805, 732, 524, 37, 331]]
path = A_star(liste, (0,0), (4,4))
print(path.astar())
#print(path.reconstruct_path())
path.plot_path(speed=200)
Here you can see my visualization for the 80x80 matrix given in the problem. Blue are all the points in checked and red is the optimal path. From my understanding it shouldn't be the case that every point in the matrix is in checked i.e. blue. https://i.stack.imgur.com/LKkdh.png
My initial guess would be that my heuristic function is not good enough. If I choose h=0, which would mean Dijkstra Algorithm the length of my checked list is 6400. Contrary if I use my custom h the length is 6455. But how can I improve the heuristic function for an arbitrary matrix? Thanks for any help.
CodePudding user response:
Let me start at the end of your post:
Marking cells as checked
if I use my custom h the length is 6455.
You should not have a size of checked
that exceeds the number of cells. So let me first suggest an improvement for that: instead of using a list, use set, and skip anything popped from the priority queue that is already in the set. The relevant code will then look like this:
checked = set()
while not open_list.empty():
current = open_list.get()
if current in checked: # don't repeat the same search
continue
checked.add(current)
And if in the end you need the list version of checked
, just return it that way:
return came_from, list(checked), cost_so_far[self.end]
Now to the main question:
Improving heuristic function
From my understanding it shouldn't be the case that every point in the matrix is in checked i.e. blue. My initial guess would be that my heuristic function is not good enough.
That is the right explanation. Combine that with the fact that the given matrix has paths which have a total cost which come quite close, so there is a competition field that involves much of the matrix.
how can I improve the heuristic function for an arbitrary matrix?
One idea is to consider the following. A path must include at least one element from each "forward" diagonal (/
). So if we work with the minimum value on each such diagonal and create running sums (backwards -- starting from the target), we'll have a workable value for h
.
Here is the code for that idea:
def h_matrix(self):
min_diagonals = [float("inf")] * (self.a self.b - 1)
# For each diagonal get the minimum cost on that diagonal
for i, row in enumerate(self.data):
for j, cost in enumerate(row):
min_diagonals[i j] = min(min_diagonals[i j], cost)
# Create a running sum in backward direction
for i in range(len(min_diagonals) - 2, -1, -1):
min_diagonals[i] = min_diagonals[i 1]
min_diagonals.append(0) # Add an entry for allowing the next logic to work
# These sums are a lower bound for the remaining cost towards the target
h = [
[min_diagonals[i j 1] for j in range(self.b)]
for i in range(self.a)
]
return h
With these improvements, we get these counts:
len(cost_so_far) == 6374
len(checked) == 6339
This represents still a large portion of the matrix, but at least a few cells were left out.