Hi there! This post is a collaboration with Alejandro, taken directly from our upcoming book, The Hitchhiker's Guide to Graphs. This project grew out of our series in graph theory here, aiming to focus on the algorithmic aspects as well. We'll even implement our custom graph library from scratch.
The book is already available in early access, with instant updates to the content as we write it.
Alejandro is one of my favorite writers and one of my original inspirations for starting to talk about mathematics online. We complement each other in knowledge and style: He is a master of computer science and algorithms, and I make the necessary theoretical background as clear as possible.
If you would like to see what's the book like, check out our introductory series on graph theory:
Episode 00: What do your brain, your friends, and the city you live in have in common
Episode 01: Running into the woods: trees, forests, and spanning subgraphs
You have been hired by the city council to measure the total length of all the streets in your hometown. To do so, you have a car with a bunch of expensive sensors that can accurately measure how far it has moved in any direction. So you hop in and start driving, making sure that every time you reach a corner, you turn into a new street.
For this method to work, you must be able to drive through all the streets in a single run without repeating any street. Is this always possible? And if it isn’t, what is the minimum number of different circuits you would need to complete?
This is an example of an Eulerian tour. A graph tour, in the general sense, is a path that visits the “entire” graph, whatever “entire” means. There are two fundamental types of tours: Eulerian (that visits all edges) and Hamiltonian (that visits all nodes).
In this lecture, we will explore tours and answer two deeply related questions: - When does a given graph contain an Eulerian or Hamiltonian tour? - If it exists, can we find such a tour efficiently?
Although finding Eulerian and Hamiltonian tours seem very similar, we will discover that they are of fundamentally different difficulty levels. Eulerian tours are pretty easy to analyze and compute while finding Hamiltonian tours is one of the hardest problems in Computer Science.
Eulerian tours
Eulerian tours are, intuitively, just paths that contain all edges exactly once. They are named after Leonard Euler, the most influential mathematician ever to have lived. The following is a free artistic retelling of this story.
Legend tells that Euler once visited an old town in Prussia named Königsberg, famous for an unsolvable riddle that had taken over everyone’s sanity. The town was built around a river, featuring seven bridges that crossed different sections of it. The villagers had tried for ages to find a path through the town that crossed each bridge exactly once and returned to the point of origin, to no avail. While many thought it wasn’t possible, some claimed it was, to the point that frequent fights would burst in the middle of the street over this argument.
When Euler came to the town, the Major of Königsberg was eager to offer him all sorts of incentives if he could solve the seven bridges riddle and end all discussions once and for all. Euler, being a world-renowned mathematician but also a pious and respectful man, relinquished all the gifts and promised to solve the riddle for the glory of rational thinking alone.
Thus, he set to frame this problem in a way that was amenable to mathematical analysis. He quickly understood the following:
It doesn’t matter where you start, as a cycle can be traversed starting from any point.
It doesn’t matter what the exact location of each bridge is, only their relative structure, that is, from which bridge you can move to another.
It doesn’t matter which path you choose between buildings in the town; all that matters is which bridge you cross at each time.
With these intuitions in mind, Euler transformed the unsolvable riddle into something much simpler to analyze. By identifying the main islands and the bridges between them, Euler constructed what would later be called a multigraph: a graph with possibly more than one edge between pairs of vertices.
With this simplification, the problem becomes a matter of finding a cycle that contains all edges exactly once. The theory that Euler developed — and we will see shortly — provides a concise and definite answer for all multigraphs (and regular graphs) which, in the case of the Königsberg bridges, settled the dispute once and for all.
By doing so, Euler not only prevented countless bar fights in Königsberg but also single-handedly invented the field of graph theory, laid out the foundations of mathematical topology, and proved the first-ever graph theorem. How’s that for a productive weekend?
Finding Eulerian tours
Let’s kick off with some definitions.
Definition 1. In a multigraph G, an Eulerian path (cycle) is a path (cycle) that contains all edges exactly once.
We say that a graph is Eulerian if it contains an Eulerian cycle. The question we want to answer, as in all our previous lectures, is whether we can find a necessary and sufficient condition to prove that a graph is Eulerian. Ideally, checking this condition should also be computationally efficient. As before, we will start by developing a necessary condition, and then see if we can extend it to sufficiency.
The first to notice is that an Eulerian cycle will contain all vertices with degree greater than zero, and many of them probably more than once. Since all edges must be there exactly once, every vertex with at least one edge must appear. Furthermore, since it is a cycle, every vertex that appears in the cycle must have at least two neighbors, since you have to enter and leave the vertex from different edges.
This quickly shows that if a vertex has an odd degree, there is at least one edge we can never fit into our cycle. Why? Because every time you enter and leave a vertex, you mark two different edges. Thus, if the vertex appears k times in the cycle, you covered exactly 2k distinct edges. Can we formalize this claim?
Lemma 1. If a multigraph G is Eulerian, then every vertex has even degree.
Note: in a multigraph, two vertices u and v can be connected by more than one edge, so instead of defining deg(v) as the number of vertices connected to v, we define it as the number of edges incident on v, so the same adjacent vertex u counts as many times as edges it shares with v.
Proof. Let C be an Eulerian cycle of graph G, and v ∈ V(G) an arbitrary vertex, with deg(v) > 0. (If deg(v) = 0, then it’s trivially even.)
Let’s assume that v appears k times in C, and let (u₁, v, w₁), …, (uₖ, v, wₖ), be the k respective sequences of vertices that appear immediately before and after v in the cycle C. Notice that some vertex wᵢ can coincide with uᵢ₊₁, but the edges (uᵢ, v) and (v, wⱼ) are all distinct because C is an Eulerian cycle.
Thus, there are a total of 2k different edges incident to v. Furthermore, all edges incident at v appear in C, which means deg(v) = 2k is even. ◼
Can we turn this lemma into a sufficient condition? Incredibly, yes. All we need is even degrees to ensure the existence of an Eulerian cycle. To show why this is true, we must show how to construct such a cycle.
Theorem 1. (Euler) A connected multigraph G is Eulerian if and only if every vertex has an even degree.
We will show how to build an Eulerian cycle for any multigraph with all even degrees. An intuitive idea may be to walk around the graph, picking a new edge every time. However, this may not always work: if we are not careful about the order of edges, we might close a loop too soon.
Now, deciding exactly how to follow the edges to build an Eulerian cycle in a single iteration is considerably more tricky than just finding a cycle in general. So, we won’t attempt that — at least until our upcoming series on graph algorithms.
Instead, we will show how you can build a cycle via induction by chaining together smaller sub-cycles. Not the most efficient solution, but it does give us a beautiful constructive proof. Here we go.
Proof. One side of the theorem (the necessary part) is already proven in Lemma 1. We will prove the sufficient part of the theorem via strong induction on the size m (number of edges) of G.
The first base case we need is for m = 2, and the only connected multigraph with even degrees is this one, which is trivially Eulerian. The second base case is when the n = 2 (the number of vertices) and m ≥ 3, in which case the graph must have an even number of edges, and is also trivially Eulerian.
Now, let’s assume that m ≥ 3 and n ≥ 3. Then, we can find three vertices u, v, and w connected via uv and vw. We will remove these two edges and add uw, reducing E(G) by exactly one edge and leaving all vertices with even degrees. (Note that it doesn’t matter if u and w were already connected because G is a multigraph).
The resulting graph G’ has exactly one less edge than G, and all degrees are still even (we took two from deg(v) and left deg(u) and deg(w) the same). Thus, if G’ is connected, it contains an Eulerian cycle C′ by the inductive hypothesis. Replacing uw with uv and vw in C’ we obtain an Eulerian cycle C for our original graph.
Now, if G’ is not connected, we have a little more work to do, but we can still solve it. First, note we could have created at most two connected components, such that u and w must be in one of them and v in the other (perhaps alone). Each component has its own Eulerian cycle (check for yourself that the inductive conditions hold). Thus, we can rewire both cycles as shown in the picture, going from u to v, then across the whole cycle that contains v, and then back from v to w, completing the first cycle, thus building a large Eulerian cycle for G. Note that if v is alone in a connected component, it still works, we would just enter v and exit immediately, as there would be no upper cycle.
With these two cases, we have shown the existence of the Eulerian cycle. ◼
And that’s it! We have shown how to build an Eulerian cycle in any connected multigraph with even degrees. Although our theory and proof involved the notion of multigraph, it also works for regular graphs, which are just a special type of multigraphs.
With this result, we can easily prove the following low-hanging corollary. (Corollary is a theorem that is very easy to prove from another.)
Corollary 1. (Euler) A connected multigraph G has an Eulerian path (not cycle) if and only if there are exactly two vertices x and y with an odd degree. Furthermore, x and y are the path extremes.
In a typical professor move, we will leave this proof to you! But it should be easy; here’s a hint: consider what happens if you magically add an edge between x and y.
Back to Königsberg
We’re now ready to answer the infamous Königsberg riddle! Is there an Eulerian path? Well, let’s count the degrees!
All vertices have an odd degree, so there’s no hope. There is not even the possibility of an Eulerian path. But, rest assured, major of Könisberg, we have one solution for you. By demolishing just one bridge, you can leave the city connected and ensure the existence of an Eulerian path! Can you figure out which one?
Hamiltonian Tours
If Eulerian tours are about visiting all the edges, then Hamiltonian tours are about the vertices. In a Hamiltonian cycle, you want all vertices to appear exactly once, regardless of their degree. Think something like the census folks visiting all houses in a city. Ideally, you’ll want to do it in a single drive-by, not needing to go back unnecessarily.
This means in a Hamiltonian cycle, you will leave a bunch of edges out. (In fact, you’ll keep as many edges as vertices.) This fact alone makes finding Hamiltonian cycles exceedingly more difficult because there are many ways in which you could select those edges.
So, in contrast to Eulerian tours, we unfortunately won’t be able to find any nice sufficient and necessary conditions to check if a graph is Hamiltonian. The best we can do is enumerate some necessary conditions on one side and some sufficient conditions on another, but nothing that answers the question satisfactorily.
Since the theory behind Hamiltonian cycles is far more complex than anything we’ve covered so far, we will focus on proving just one fundamental result, giving a glimpse of the kind of reasoning involved. But first, some definitions. Buckle up!
Definition 2. In a graph G, a cycle C that contains every vertex exactly once is called a Hamiltonian cycle. The graph is said to be Hamiltonian if it contains at least one such cycle.
With that out of the way, let’s focus first on some pretty obvious necessary conditions.
Theorem 2. If a graph G is Hamiltonian, then the following conditions hold:
G is connected.
G has more than three vertices.
G has no cut vertices.
Proof. The first two are obvious from the definition of a Hamiltonian cycle. G must be connected because all vertices appear in a cycle, and there must be three or more because that’s the minimum cycle length. (We’re back on regular graphs, not multigraphs.)
The third condition is also easy to verify. A cut vertex x is, by definition, one that splits the graph into more than one connected component. For that to happen, there must be at least two vertices such that all paths among them contain x. Removing x necessarily leaves them disconnected. However, since there is a cycle that contains all vertices in G, then for every pair of vertices u, v, we have at least two disjoint paths connecting them. Thus, no vertex appears in all paths among any other pair. ◼
The existence of a Hamiltonian cycle
Now, let’s tackle the big theorem. The main intuition behind all results regarding Hamiltonian cycles is that the more edges you have, the more likely you’ll find one. This is of course not a necessary condition, because any graph that is exactly one big cycle — say, a loop with a hundred vertices — is trivially Hamiltonian while containing as few edges as possible. But for arbitrary graphs, it seems intuitive that a Hamiltonian cycle becomes more and more likely as you keep adding edges.
Is there a minimal number of edges (or more precisely, a minimal degree) that guarantees the existence of a Hamiltonian cycle? Yes, there is!
This is precisely the claim of the following theorem.
Theorem 3. (Dirac) Given a graph G with n ≥ 3 vertices and minimum degree δ ≥ n/2 (that is, deg(v) ≥ n/2 for all vertices v), then there is a Hamiltonian cycle in G.
The proof of this theorem uses one of the most beautiful strategies in graph theory. We take the longest path in G and prove that this path can be converted into a cycle that contains every edge. For that, we will heavily use the fact that the longest path is one that, by definition, cannot be extended any further.
Proof. Let P = v₁, …, vₖ be the longest path in G.
Since P is maximal, all vertices adjacent to both v₁ and vₖ must be contained in it, because otherwise we could extend it. This means the length of P is at least (n/2) + 1, since deg(v) ≥ n/2. That is, P contains more than half of G’s vertices. This will be important later on.
Now, let’s prove that there is indeed a cycle lurking around P. For a cycle to appear, we would need to find the following pattern: say v₁ is connected to some vertex vᵢ, and vₖ happens to be connected precisely to vᵢ₋₁. If that were the case, we could easily turn P into a cycle, as the following figure shows.
That is, start at v₁, jump to vᵢ, continue on the original path until vₖ, return with a jump to vᵢ₋₁, and trace the original path back towards v₁.
The question is, how can we be sure that such a pattern must exist in P? Well, imagine that this is not the case, meaning that for every vertex vᵢ connected to v₁, the corresponding vᵢ₋₁ cannot be connected to vₖ. But since there are at least n/2 vertices connected to v₁, we need n/2 vertices in P not connected to vₖ. On the other hand, vₖ is also connected to at least n/2 vertices in P, so we have n/2 + n/2 = n vertices in P different from vₖ, which is impossible because there are only n vertices in the graph!
(This is a pigeonhole type of argument.)
Thus, P can be extended into a cycle. Fair enough. But just how large this cycle is? Well, let’s assume there is some vertex u outside P. Since deg(u) ≥ n/2 and P contains more than half of the vertices, then u is adjacent to at least some vⱼ inside P. But if that is the case, now knowing that P gives us a cycle, we could unwind it as the following figure shows (start at u, jump to vⱼ, trace the cycle all the way back until vⱼ₊₁) and build a path P′ larger than P (with one extra vertex u), contradicting the claim that P is the longest path.
Thus, no vertex u can be outside P, which means we actually have a Hamiltonian cycle! ◼
Beyond Dirac
Dirac’s theorem is one of the most important and beautiful results in the theory of Hamiltonian cycles, but it is only the beginning. We can improve Dirac in at least two directions. First, we can make it more precise for specific types of graphs. For example, in bipartite graphs, the n/2 bound is too strong because that would mean the bipartite graph is complete! Instead, we can prove that we only need δ >|X|/2, that is, half the vertices in each partition. (Assuming that both partitions are the same size; otherwise, we can’t complete the cycle.)
The second direction is generalizing to more flexible conditions than all vertices having deg(v) ≥ n/2 exactly. Actually, using a similar idea to Dirac, we can prove that if deg(u) + deg(v) ≥ n holds for all pairs of vertices, we can also find a Hamiltonian cycle. The uniform condition deg(u) ≥ n/2 is just one special case.
Final remarks
Eulerian and Hamiltonian tours are graph problems that seem very similar on the surface, but are surprisingly different when you dig deeper. For the first we can find very easy and straightforward conditions that are both necessary and sufficient, giving us very efficient algorithms (which we haven’t mentioned, but we will eventually). However, the second is a well-known NP-complete problem, which means we are unlikely to ever find something as nice as Euler’s theorem.
This is one of the most interesting characteristics of computer science in general. The frontier between easy and nearly unsolvable problems is very jagged. By changing a small condition in a simple problem, you can easily turn it, unknowingly, into a godly beast that no one knows how to handle.
All is not lost, though. Even for NP-complete and NP-hard problems, there are some strategies we can try. We call them “heuristic algorithms” because they will not always work, or work efficiently, but we can find strategies that work pretty well most of the time. But that’s a story for another day.