# I walk the (train) line – part quatre – Tell me why

**Embracing the Random | R**, and kindly contributed to R-bloggers]. (You can report issue about the content on this page here)

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.

*(TL;DR: Dijkstra’s algorithm is correct. Mathematical proof inside
<3<3<3)*

**LaTeX and MathJax warning for those viewing my feed: please view
directly on website!**

This is part four of our epic. Find the other parts here:

Last time, we learnt about Dijkstra’s algorithm and implemented it in

R.

We found the shortest path from Newtown Station to Small Bar.

But this is not enough for me! I’m obsessed with learning the details of

how things work. Merely implementing Dijkstra’s algorithm would leave me

feeling empty.

Let’s **prove** that Dijkstra’s algorithm works!

# Proving the “correctness” of Dijkstra’s algorithm

I’ll be analysing the proof found on **page 597** of *Cormen et al,
Introduction to Algorithms, Second Edition*. Excellent book – please get

it for yourself!

## What are we proving?

To quote directly from page 597 of the book:

Dijkstra’s algorithm, run on a weighted, directed graph

with non- negative weight function and source , terminates

with for all vertices

What in the above quote do we already understand from our knowledge gained from

the earlier posts?

- We know what weighted, directed graphs are from my previous posts.
- We also know that a graph is made up of a set of vertices

and a set of edges . - We know what a source node is.

Let’s define the stuff that we have not yet encountered:

- We don’t know what a
**weight function**is. It’s simply some

mathematical function that**takes in the set of edges and maps them**. More formally,

to real numbers (the weights)

. In our case, we want**non-negative**only.

weights - is the distance that Dijkstra’s algorithm has calculated

from our source node to some vertex in our set of

vertices . - is the
**actual**shortest path distance between

our source node and some vertex in our graph.

Now let’s translate the initial quote!

As we mentioned last time, Dijkstra solves the **single source shortest
path** problem. Let’s assume that we know the

**true**shortest

distances from our source node to all other nodes in our graph

(including the source node itself). Then, to prove that Dijkstra’s

algorithm is “correct”, we need to show that when the algorithm

terminates, the distances found by Dijkstra from our source node to all

other nodes in our graph (including the source node itself) are equal to

the “true” shortest distances.

Good!

## How do we prove this? We focus on the ‘while’ loop!

To prove the correctness of Dijkstra, the authors focus on the big

**while loop** in Dijkstra. Here are some key lines from our

implementation in

GitHub

from last time:

```
<span class="k">while</span><span class="w"> </span><span class="p">(</span><span class="n">nodes_left_to_process</span><span class="p">(</span><span class="n">vec_of_nodes</span><span class="p">))</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="n">...</span><span class="w">
</span><span class="n">vec_of_nodes</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">remove_min_dist_node</span><span class="p">(</span><span class="n">vec_of_nodes</span><span class="p">,</span><span class="w"> </span><span class="n">min_dist_node_id</span><span class="p">)</span><span class="w">
</span><span class="p">}</span><span class="w">
</span>
```

In the book, the authors explicitly maintain a set of nodes that

have been processed by Dijkstra. The `while`

loop terminates when this

set (that is, we have processed all nodes in our graph). In

our implementation, we instead maintained a set of **nodes that haven’t
been processed yet** (

`vec_of_nodes`

). Our loop continues until this setis empty (i.e. when we have processed all nodes in our graph). I will

state everything the authors’ write

**in terms of our implementation**.

If you’re following along with the book, keep this in mind!

The authors use what is called a **loop invariant** to prove Dijkstra’s

correctness. These proofs are broken down into three phases. From page

18 of the book, those three phases are these:

Initialisation:It is true prior to the first iteration of the

loop.

Maintenance:If it is true before an iteration of the loop,

it remains true before the next iteration.

Termination:When the loop terminates, the invariant gives us a

useful property that helps show that the algorithm is correct.

The initialisation phase is like the **base case** in an induction proof. The maintenance

phase is like the **inductive step**. The termination phase is something different.

Unlike in an induction proof, we don’t continue indefinitely. Our algorithm terminates!

The authors use this specific loop invariant:

At the start of each iteration of the while loop …

for each vertex … (which we have finished processing.)

Time to ATAAAAACK!

## Structure of the proof

Let’s tackle each part!

### Initialisation

Initially, we have not processed any vertices. So our invariant is

trivially true! We are done with this step.

### Maintenance

This is where the bulk of the proof is.

Let’s show that when we are done processing our vertex ,

. We don’t go about this directly. We proceed to

show this indirectly via a **proof by contradiction**. We assume something

is true. We start to reason and arrive at some contradiction. This

contradiction means that our original assumption cannot be true.

Therefore, the opposite must be true!

If this is the first time you’ve seen a proof by contradiction, this

might be very confusing. If so, see this nice

article.

Let’s do this!

### Let’s start with an assumption

This is the assumption that we will make which will later give rise to a

contradiction:

There is some vertex that we have just finished processing which

is thefirstvertex for which .

If we cast our minds back to last time, Dijkstra begins by first

processing our source node, . We set the distance from our source

node to our source node to **zero** in this little function:

```
<span class="n">initialise_distances_vector</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">distances_vector</span><span class="p">,</span><span class="w"> </span><span class="n">source_node_id</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="n">distances_vector</span><span class="p">[[</span><span class="n">source_node_id</span><span class="p">]]</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">0</span><span class="w">
</span><span class="nf">return</span><span class="p">(</span><span class="n">distances_vector</span><span class="p">)</span><span class="w">
</span><span class="p">}</span><span class="w">
</span>
```

Therefore, (the distance found by Dijkstra from our source node

to our source node) is equal to (the true distance from

our source node to our source node). They are the same node, so the

distance is clearly zero!

So we have our **first observation** – this mysterious vertex

cannot be our source node. That is, .

Next, we reason that there must be some path from our source node

to our node . Remember the `LARGE_NUMBER`

constant we defined in

our implementation of Dijkstra?

```
<span class="n">LARGE_NUMBER</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="m">999999</span><span class="w">
</span>
```

We set all distances in our vector of distances to this `LARGE_NUMBER`

in this function:

```
<span class="n">create_distances_vec</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="k">function</span><span class="p">(</span><span class="n">osm_graph</span><span class="p">,</span><span class="w"> </span><span class="n">source_node_id</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="n">num_nodes</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">vcount</span><span class="p">(</span><span class="n">osm_graph</span><span class="p">)</span><span class="w">
</span><span class="n">distances_vec</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="nf">rep</span><span class="p">(</span><span class="n">LARGE_NUMBER</span><span class="p">,</span><span class="w"> </span><span class="n">num_nodes</span><span class="p">)</span><span class="w">
</span><span class="n">...</span><span class="w">
</span><span class="p">}</span><span class="w">
</span>
```

In the book, the authors initialise all distances by setting them to

. They are writing a proof in the abstract world of maths. Our

job is to implement the algorithm in an R program. Our choice to use a

`LARGE_NUMBER`

constant of 999,999 metres for our little map of a part

of Sydney has the same practical effect. However, as we are

mathematically proving the properties of our algorithm, let’s talk in

terms of .

The authors reason that there must be some path from to ,

because if there was no path at all, .

That is, the distance found by Dijkstra would remain as the

value we initialised it with, and therefore, our algorithm would have

found the **“correct”** distance from to (which happens to

be because there is no path in our graph from to

). If our algorithm found the correct distance between and

, then this would have violated our initial assumption that

.

We are safe to proceed! We have our **second observation** – there is

some path between and . And because there is “some path”,

**there must exist a shortest path!**

To explain the next part, let’s refer to this image which can be found

on p597 of the book:

This image shows us some path , which connects our source node

to our vertex . The dark blob named is the set of nodes

that have already been processed.

We have a vertex which is the first vertex along the path

that has not been processed yet (notice how it is outside of the dark

blob ). is the predecessor on this path (i.e. the vertex

directly before ), and has already been processed by our algorithm.

The authors reason that because is the **first vertex** for which our

algorithm fails to find the **actual** shortest path, when was

processed, Dijkstra must have found the correct shortest path. That is,

. When we processed , all edges out of

were **relaxed**. These are the relevant lines in our implementation:

```
<span class="k">for</span><span class="w"> </span><span class="p">(</span><span class="n">i</span><span class="w"> </span><span class="k">in</span><span class="w"> </span><span class="nf">seq_along</span><span class="p">(</span><span class="n">adjacent_to_min_dist_node</span><span class="p">))</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="n">...</span><span class="w">
</span><span class="k">if</span><span class="w"> </span><span class="p">(</span><span class="n">current_distance_to</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">distance_to_adj_node</span><span class="w"> </span><span class="o"><</span><span class="w"> </span><span class="n">curr_known_dist_to_adj_node</span><span class="p">)</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="n">vec_of_distances</span><span class="p">[[</span><span class="n">adjacent_node_id</span><span class="p">]]</span><span class="w"> </span><span class="o"><-</span><span class="w"> </span><span class="n">current_distance_to</span><span class="w"> </span><span class="o">+</span><span class="w"> </span><span class="n">distance_to_adj_node</span><span class="w">
</span><span class="n">vec_of_predecessors</span><span class="p">[[</span><span class="n">adjacent_node_id</span><span class="p">]]</span><span class="w"> </span><span class="o">=</span><span class="w"> </span><span class="n">min_dist_node_id</span><span class="w">
</span><span class="p">}</span><span class="w">
</span><span class="n">...</span><span class="w">
</span><span class="p">}</span><span class="w">
</span>
```

This means that the algorithm found the correct shortest path distance

to ! That is, we have .

### Let’s contradict our assumption!

We started with an initial assumption that our algorithm found the wrong

shortest path distance for our vertex . That is,

. We assumed that was the first vertex

where this had happened. We will now contradict our assumption and show

that in fact !

The vertex occurs prior to on some shortest path. Therefore,

the shortest path distance between our source and must be

less than or equal to our distance from our source node to . Why

**less than or equal to**? Because and may be the same node.

So . This means that:

The last bit is relevant because Dijkstra relaxes edge

weights downward from our initial `LARGE_NUMBER`

value and continues to

decrease the edge weights until their reach their lower bounds. If

is the lower bound, then the smallest edge weight

that the algorithm can find is .

The authors then continue – both and were unprocessed when

our algorithm chose to process in its **while** loop. So that

means, and that:

What??? Did you see that???

**CONTRADICTION!!!**

This means that when our algorithm is done processing , we have in

fact found the **actual** shortest path between our source and

and that this shortest distance is maintained through to when the

algorithm terminates.

Hooray! Our maintenance step is complete.

### Termination

This step is simple! Once our algorithm terminates, we have finished

processing all vertices. That is, our vector of nodes left to process is

empty and our loop stops:

```
<span class="k">while</span><span class="w"> </span><span class="p">(</span><span class="n">nodes_left_to_process</span><span class="p">(</span><span class="n">vec_of_nodes</span><span class="p">))</span><span class="w"> </span><span class="p">{</span><span class="w">
</span><span class="n">...</span><span class="w">
</span><span class="p">}</span><span class="w">
</span>
```

This termination phase along with the maintenance phase above show that

Dijkstra’s algorithm finds the shortest path distance for all vertices

(where is the set of vertices in our graph).

# Next time

Mathematics can be difficult…but I see so much beauty in it! The way the

steps in mathematical proofs flow is breathtaking.

Next time, **I will finally write about how I came up with different
routes to walk to work!** It involves a little bit of geometry 😉

Justin

**leave a comment**for the author, please follow the link and comment on their blog:

**Embracing the Random | R**.

R-bloggers.com offers

**daily e-mail updates**about R news and tutorials about learning R and many other topics. Click here if you're looking to post or find an R/data-science job.

Want to share your content on R-bloggers? click here if you have a blog, or here if you don't.