I'm working on a tutorial showing how Breadth First Search, Dijkstra's Algorithm, Greedy Best-First Search, and A* are related. I'm focusing on keeping the code simple and easy to understand. {Note: this blog post is a shorter version of a page on my site}.

While tracking down a bug in my Dijkstra's Algorithm code, I realized I had forgotten to implement the "reprioritize" case. That led me to wonder why my code seemed to work without handling that case. It turns out that case is never triggered by the graphs I'm searching over. Let me explore this …

Algorithm textbooks describe the algorithm as starting with ∞ everywhere and then checking the new cost against the old cost. The old cost is always defined so it's a single test:

if cost_so_far[current] + cost(current, next) < cost_so_far[next]:
    cost_so_far[next] = cost_so_far[current] + cost(current, next)
    came_from[next] = current
    frontier.reprioritize(next, cost_so_far[next])

In practice I don't initialize the priority queue to have all vertices with distance set to ∞. Instead, I start the priority queue with only the start element, and only insert elements when their distance is finite. Keeping the priority queue small makes it significantly faster, especially with early exit.

However, this complicates the code. I now have two cases instead of one:

  1. This location wasn't already in the priority queue. The old cost is implicitly ∞, so the new cost is always lower. We need to insert the node into the priority queue. This is a simpler operation, and it's more common, so it's worth separating out.
  2. This location was already in the priority queue. When the new cost is lower than the old cost, we need to reprioritize the existing node already in the priority queue. This is rare and more complicated than an insert.

This logic can go either in the algorithm code or in the priority queue code. I had put it in the algorithm (but in the tutorial I am going to instead put it in the priority queue, because it keeps the algorithm easier to understand).

if next not in cost_so_far:
    cost_so_far[next] = cost_so_far[current] + cost(current, next)
    came_from[next] = current
    frontier.insert(next, cost_so_far[next])
elif cost_so_far[current] + cost(current, next) < cost_so_far[next]:
    cost_so_far[next] = cost_so_far[current] + cost(current, next)
    came_from[next] = current
    frontier.reprioritize(next, cost_so_far[next])

I wanted to better understand which situations lead to the reprioritize case. Let's say we're looking at edge b → c and we previously found c via edge a → c. (Note: algorithm textbooks call this g but I call it cost_so_far)

  1. To trigger the reprioritize case, I need the old cost to be worse than the new cost: g(a) + cost(a,c) > g(b) + cost(b,c) .
  2. Dijkstra's Algorithm explores nodes in increasing order. Since a was explored before b, g(a) ≤ g(b) , or equivalently, g(b) - g(a) ≥ 0

Those two together can be rewritten as cost(a,c) - cost(b,c) > g(b) - g(a) ≥ 0

That means when cost(a,c) - cost(b,c) > 0 we need to reprioritize.

In several of my game projects, cost(x,y) depends only on y. That means cost(a,c) = cost(b,c), and we never need to reprioritize.

That means any bugs in my reprioritize code never mattered!

It also means the abstraction boundaries between the graph data structure, the search algorithm, and the priority queue data structure potentially make the code more complicated and hurting optimization opportunities. I think this rule works, but I'm not sure: if we have a monotonic ordering (always with Dijkstra's algorithm, and A* with a consistent heuristic), and the movement cost only depends on the target node, then we never need to reprioritize. And that means we can use a simpler and faster regular priority queue instead of a reprioritizable priority queue.

Labels: ,

4 comments:

CaptainKraft wrote at July 10, 2014 12:24 PM

I have been meaning to implement some path-finding just as a learning exercise and because I'm sure I'll need it soon. Can't wait to read what you come up with.

CC. Kao wrote at November 09, 2018 12:20 AM


I think you can apply a "lazy deletion" approach: Simply mark the visited vertex and insert the vertex multiple times when encountered. In this case the vertex with "better" cost will always be extracted from the priority queue before all the other "bad" ones of the same vertex. And they will eventually be discarded since they are visited before and the min cost has been found.

Unknown wrote at May 04, 2019 11:58 PM

The Fedex algorithm says if you drive on the right, 3 right turns are cheaper than one left turn. But then you cross your path, and reprioritising the movement cost for that location is not the right answer. I don't know how to solve that.

Amit wrote at May 11, 2019 8:35 AM

@Unknown: The graph nodes should contain all the relevant information for making decisions. For FedEx, the direction is relevant, so the direction should be part of the graph location. You will have a separate movement cost for the second time you cross that path, because the two crossings are different *graph* locations. A* will work exactly the same in this case, as it does not "know" what a map location is. It only looks at graph locations. I have a little bit written at http://www-cs-students.stanford.edu/~amitp/GameProgramming/MapRepresentations.html#road-maps but haven't explored this in depth.