Categories
data structures and algorithms

Out of Place Sort a Singly Linked List in Python

While there are many sorting algorithms in computer science, there are just two ways to classify them. In place, and not in place (or out of place). In place sorting algorithms are ones that are run without an auxiliary data structure such as bubble sort. Out of place (or not in place) sorting algorithms are ones that need extra data structures like merge sort. In this post, we learn how to out of place sort a singly linked list.

We take an unordered singly linked list and create a sorted list out of it. I haven’t seen this algorithm anywhere else on the internet, this is an intuitive sorting algorithm I made up in my head. The time complexity of this algorithm is O(n^2). The space complexity of this algorithm is O(n) – the auxiliary data structure we create becomes the sorted list.

We cover:

  • What Functions Do You Need to Out of Place Sort on a Singly Linked List?
    • Finding the Minimum
    • Removing the Minimum
    • Full Code for a Custom Singly Linked List Class
  • Not In Place Sort Function for a Singly Linked List
    • Visualization of How to Sort a Singly Linked List
    • Code for How to Sort a Singly Linked List in Python
  • Sorting an Example Linked List
  • Summary of Out of Place Sort on a Singly Linked List in Python

What Functions Do You Need to Out of Place Sort on a Singly Linked List?

We cover linked lists and binary trees and sorting algorithms as parts of our introduction to data structures and algorithms series here on PythonAlgos. Now, we are combining the two to learn how to sort a singly linked list. Our original linked list implementation had helper functions to do things like get the size, run a search, delete a node, add a node, or traverse the list. We also had a linked list class separate from the Node class. 

Most of these functions are useless to us right now. For sorting a singly linked list out of place, let’s assume that we won’t bother individually deleting nodes, traversing the list, searching for specific values, or care about the size. We focus solely on the functions we need to actually get a sorted list. The functions that we need are a function to add nodes, a way to print the nodes (this is actually also a nice to have), a way to find the minimum, and a way to remove the minimum.

To start off the Node class, we create a simple __init__ method which takes the required self parameter and a value parameter. This __init__ method creates an initial Node out of the passed in value. We also set the next attribute to None – this is the attribute that makes the singly linked list linked.

Next, we create the append function. This function takes a value and appends it to the end of the linked list. Starting at the root Node, we iterate through until the next attribute is None, meaning we’ve reached the end of the list. Once we’ve reached the end of the list, we set the next value of the last Node to a Node with the passed in value.

The print_nodes function prints all the values of the nodes in one line separated by a space. Similar to the append function, we loop through all of the Nodes until we reach one where the next value is None. For each of the nodes, we print out the value, using an end parameter of a space so that each one does not print a newline before moving onto the next Node. Finally, we print out the value of the last node.

class Node(object):
   def __init__(self, value):
       self.value = value
       self.next = None
  
   def append(self, value):
       while self.next is not None:
           self = self.next
       self.next = Node(value)
 
   def print_nodes(self):
       while self.next is not None:
           print(self.value, end=" ")
           self = self.next
       print(self.value)

Finding the Minimum

The out of place sort algorithm for our singly linked list relies on repeatedly finding the minimum of the linked list. I actually refactored this code multiple times because there are a couple edge cases to consider here. The possible cases you could come across while looking for the minimum of a singly linked list include:

  • There is only one node and it is automatically the minimum
  • The first node is the minimum, and there are multiple nodes
  • The minimum is somewhere in the linked list, and it is not the first one

Handling all of these cases presents some nuance. In addition, we also return the parent when we find the minimum node. Remember that linked lists are only kept in memory by a reference to the root node. If we want to remove a node, we need to get a pointer to the parent of the node so we don’t lose access to the whole list.

The first case that we handle is the case that there is only one node. In this case, we don’t need to do anything but return the node. However, since we are also returning the parent for the last case, we need to return two objects. In this case, we will return a None type object as the parent, and the Node itself as the minimum. The other two cases can be handled with the same strategy.

To start off, we set the minimum node to the self, or the root node, and the parent node to None. Next, we loop through all the nodes until we reach a node that does not point to any other node (the second the last node). For each of the nodes, we check if the value of the next node is less than the value of the current minimum node. If it is, we set the parent node to the node we are on and the minimum node to the next node. Either way, we set the self to the next node to ensure we continue iterating. After we’ve looped through all the nodes, we return the parent and the minimum node.

   # keep track of min and parent
   def find_min(self):
       if self.next is None:
           return None, self
       min = self
       parent = None
       while self.next is not None:
           if self.next.value < min.value:
               parent = self
               min = self.next
           self = self.next
       return parent, min

Removing the Minimum

Now that we have a way to find the minimum node, we need a way to remove the minimum node. This function also has multiple cases to consider. The three cases that we need to consider when removing the minimum of a singly linked list, given that we have found the parent and the minimum already are:

  • There is no parent node to account for
  • The minimum node is the last node and does not have a next node
  • We’re removing a node in the middle and need to account for the next node

The first case happens when the minimum node is the first node or the only node. We handle this case first. When the parent node is None, we set the self object to the next node and the minimum node’s next pointer to None to remove it. Then we return the self object and the minimum node. Note that if the minimum node does not have a next object, we have emptied our original linked list.

Once again, we handle the other two cases together using if statements. Instead of branching on the parent node though, we use a condition on the minimum node. If the minimum node’s next object is not None, we set the parent node’s next pointer to the minimum node’s original next object. Then, we set the minimum node’s next pointer to None to remove the node out of the linked list.

Otherwise, the minimum node’s next object is None, and we are removing the last node. We can simply remove the last node by setting the parent node’s next object to None. Once again, we set the minimum node’s next object to None. We can actually make this code cleaner/less repetitive by setting the min.next = None line outside of the if/else conditional. I’m not sure that makes it more clear, but it does make the file smaller.

   # find minimum and parent
   # point parent.next to min.next if it exists
   # point min.next to None to officially remove
   def remove_min(self):
       parent, min = self.find_min()
       if parent is None:
           self = min.next
           min.next = None
           return self, min
 
       if min.next is not None:
           parent.next = min.next
           min.next = None
       else:
           parent.next = None
           min.next = None
       return self, min

Full Code for a Custom Singly Linked List Class

We’ve covered the logic for each part of the Node class for a singly linked list separately. Here is the full code we use for this implementation.

class Node(object):
   def __init__(self, value):
       self.value = value
       self.next = None
  
   def append(self, value):
       while self.next is not None:
           self = self.next
       self.next = Node(value)
 
   def print_nodes(self):
       while self.next is not None:
           print(self.value, end=" ")
           self = self.next
       print(self.value)
  
   # keep track of min and parent
   # while parent has next
   # # if next value is
   def find_min(self):
       if self.next is None:
           return self, self
       min = self
       parent = None
       while self.next is not None:
           if self.next.value < min.value:
               parent = self
               min = self.next
           self = self.next
       return parent, min
  
   # find minimum and parent
   # point parent.next to min.next if it exists
   # point min.next to original root
   def remove_min(self):
       parent, min = self.find_min()
       if parent is None:
           self = min.next
           min.next = None
           return self, min
 
       if min.next is not None:
           parent.next = min.next
           min.next = None
       else:
           parent.next = None
           min.next = None
       return self, min

Not In Place Sort Function for a Singly Linked List

Now that we have a Node class to create a singly linked list, let’s write our sorting algorithm. As mentioned above, this sorting algorithm is not an in-place sorting algorithm. We actually build an entirely new linked list to sort our original one. I made this sorting algorithm on the fly as an intuitive way to sort a singly linked list. 

This algorithm passes through the linked list once for each element. Each pass through of the linked list consists of removing the current minimum element and appending that element to a new, sorted linked list. In the next two subsections we see a visualization and the code with an explanation of how this sorting algorithm works.

Visualization of How to Sort a Singly Linked List

Let’s visualize how this out of place sorting algorithm works on a simple three element linked list. Assuming we start with a totally out of order list of 8, 4, 2, our algorithm will take three passes. Round one finds the minimum of 2, removes it from the original list, and starts a new singly linked list from it. The second pass looks at the new list of just 8, 4, finds the minimum of 4, and appends it to the new list we made in round one. Finally, we remove the last element, 8, and append that to the new list. Now we are left with a sorted singly linked list: 2, 4, 8.

Code for How to Sort a Singly Linked List in Python

I debated with myself over whether or not this sorting function should be a standalone function that acts on the class, or whether it should be a function in the class. I decided that it should be a standalone function because we are creating a totally new data structure. This sort function takes one parameter, the starting root node of the original list. It returns the new root of the sorted linked list.

As I break down in the comments above the function, this function does two things. First, remove the minimum and put it into a new linked list. Second, repeat step one until the original linked list is empty. Start by getting the parent and current minimum from the passed in root. Then, set the new root of the sorted singly linked list to the returned minimum. While the modified original linked list has a next node, repeatedly remove the minimum and append the new root to the new list. Finally, append the last remaining value in the linked list and return the root of the newly created linked list.

# 1. remove minimum and put it into a new linked list
# 2. repeat 1 until original linked list no longer exists
def sort_nodes(root: Node):
   new_list, min = root.remove_min()
   new_root = min
   while new_list.next is not None:
       new_list, min = new_list.remove_min()
       new_root.append(min.value)
   new_root.append(new_list.value)
   return new_root

Sorting an Example Linked List

Let’s test our function with three examples. The first example is the three element example shown in the visualization: 8, 4, 2. Next, we also append 1 and 5 and sort, and then 7 and sort. The resulting sorted lists should be: 2, 4, 8, then 1, 2, 4, 5, 8, and 1, 2, 4, 5, 7, 8. The example code to test the sort function is shown below.

root = Node(8)
root.append(4)
root.append(2)
root = sort_nodes(root)
root.print_nodes()
 
root.append(1)
root.append(5)
root = sort_nodes(root)
root.print_nodes()
 
root.append(7)
root = sort_nodes(root)
root.print_nodes()

When we run the code above, we get an output that matches the expected sorted lists as shown below.

Summary of Out of Place Sort on a Singly Linked List in Python

In this post, we created a custom linked list implementation and an out of place sorting algorithm. The sorting algorithm we created runs in O(n^2) time and requires O(n) space. It works by finding and removing the minimum element from the original linked list, and then appending it to a new, sorted linked list.

More by the Author

I run this site to help you and others like you find cool projects and practice software skills. If this is helpful for you and you enjoy your ad free site, please help fund this site by donating below! If you can’t donate right now, please think of us next time.

One-Time
Monthly
Yearly

Make a one-time donation

Make a monthly donation

Make a yearly donation

Choose an amount

$5.00
$15.00
$100.00
$5.00
$15.00
$100.00
$5.00
$15.00
$100.00

Or enter a custom amount

$

Your contribution is appreciated.

Your contribution is appreciated.

Your contribution is appreciated.

DonateDonate monthlyDonate yearly
Categories
data structures and algorithms General Python

Python Counting Sort Guide and Implementation

Sorting algorithms are one of the foundational algorithm areas in computer science. Here on PythonAlgos, we’ve covered five of the basic sorting algorithms: bubble sort, heap sort, insertion sort, merge sort, and quick sort. These sorting algorithms are all generalizable algorithms. They can be used to sort integers, floats, strings, etc. In this post, we’re going to cover counting sort.

Unlike the five basic sorting algorithms we covered on the resources page, counting sort is not generalizable. What is it and when can we use it? We’ll cover:

  • What is Counting Sort (Pseudocode)
  • How to Implement Python Counting Sort Algorithm
    • Python Counting Sort Function Steps
    • Full Code Implementation for Counting Sort Algorithm in Python
    • Testing Our Python Counting Sort Algorithm
  • When Should I Use Counting Sort?
  • Counting Sort Time Complexity
  • Prove that Counting Sort is Stable
  • Summary of Python Counting Sort Guide and Implementation

What is Counting Sort (Pseudocode)

Counting sort is a sorting algorithm that only works on integers. In its most basic form, counting sort only works on whole numbers (integers greater than or equal to 0). Counting sort can be modified to work with negative numbers too. Let’s look at the steps involved in implementing counting sort.

  1. Find n, the maximum integer in the list
  2. Create a list of zeroes of size n+1 where each index corresponds to an integer from 0 to n
  3. Iterate through the original list, incrementing the index corresponding to the value in the counting list
  4. Iterate through the counting list and create a cumulative list, where each index contains the correct index that number should be in (ie index 0 contains 0 and index 1 contains 2, that means 1 goes in the 0th and 1st indices)

How to Implement Python Counting Sort Algorithm

Let’s take a look at how to implement a Python Counting Sort Algorithm. In this section, we’re going to create a function that will implement the counting sort algorithm on an unsorted list. Our counting sort function will only take one parameter. The unsorted list we need to sort.

The first thing we do in our function is get the size of the unsorted list. Next, we initialize a list of the same size consisting of 0s. We use this list to store the sorted list. Third, we get the max value in the unsorted list. This corresponds to n in the counting sort pseudocode above.

def counting_sort(unsorted: list):
   size = len(unsorted)
   sorted_list = [0]*size
   _max = max(unsorted)

Python Counting Sort Function Steps

Luckily for us, the three steps involved in counting sort are pretty straight forward. First, we’re going to get the counts of each integer in the unsorted list. We do this by creating an empty list of 0s of whatever the max number is plus 1 (to include the 0 index). Then, we loop through each entry and increment the index in the count list by one for each time the number appears in our unsorted list.

Second, we want to get the accumulated counts. We do this by looping through each index from one to the max value. For each of these values, we add the value of the last index in the list to the value of the current index in the counting sort count list we created above.The last step is to create the sorted list.

We start with an index that is one less than the size (to account for the 0 index). While this index, i, is greater than or equal to 0, we fill in the sorted list. We use the value of the unsorted list’s i index as the index for the accumulated count and subtract 1 from that to get the index of the sorted list to fill in. Then, we decrement the accumulated list’s value at the index equal to the value of the i index in the unsorted list and the value of i.

Finally, we return the sorted list.

   # get counts of each integer in unsorted
   count = [0]*(_max+1)
   for i in range(size):
       count[unsorted[i]] += 1
  
   # get cumulative count
   for i in range(1, _max+1):
       count[i] += count[i-1]
  
   # create sorted list
   i = size - 1
   while i >= 0:
       sorted_list[count[unsorted[i]] - 1] = unsorted[i]
       count[unsorted[i]] -= 1
       i -= 1
  
   return sorted_list

Full Code Implementation for Counting Sort Algorithm in Python

The above section including the steps of counting sort are part of the function we defined in the section above that. Here is the full code for a Python Counting Sort implementation:

def counting_sort(unsorted: list):
   size = len(unsorted)
   sorted_list = [0]*size
   _max = max(unsorted)
  
   # get counts of each integer in unsorted
   count = [0]*(_max+1)
   for i in range(size):
       count[unsorted[i]] += 1
  
   # get cumulative count
   for i in range(1, _max+1):
       count[i] += count[i-1]
  
   # create sorted list
   i = size - 1
   while i >= 0:
       sorted_list[count[unsorted[i]] - 1] = unsorted[i]
       count[unsorted[i]] -= 1
       i -= 1
  
   return sorted_list

Testing Our Counting Sort Python Implementation

Now that we’ve written a function to implement counting sort in Python, let’s see how our counting sort function works. Below, I’ve provided three different unsorted lists. Then, we call counting sort on all three of these lists.

unsorted1 = [5, 2, 1, 7, 3, 3, 5, 2, 7, 4, 11]
unsorted2 = [4, 1, 2, 2, 3, 9, 8, 9, 2]
unsorted3 = [6, 1, 1, 8, 3, 2, 4, 4, 2, 1]
print(counting_sort(unsorted1))
print(counting_sort(unsorted2))
print(counting_sort(unsorted3))

The sorted lists should look like the sorted lists in the image below.

When Should I Use Counting Sort?

Unlike the basic sorting algorithms we’ve covered on this page, counting sort is not applicable in all circumstances. Counting sort should only be used when you have to sort a list of integers. It is almost always explicitly stated or implicitly implied that counting sort is meant for positive integers.

However, we can actually extend it to also work when we have negative integers. How could we do that? By making a counting list of size m where m is the difference between the max and min number in the list. Perhaps we’ll cover a Python implementation of counting sort with negative numbers in the future.

Counting Sort Time Complexity

What’s the time complexity of counting sort? How long it takes for an algorithm to sort a list is an important factor in picking your sorting algorithm. Counting sort has a time complexity of O(n + m) where n is the size of the list and m is the max number. In other words, the time that it takes to sort your list with counting sort goes up at the same rate as the size of your list.

How is counting sort able to have a linear time complexity while every other sorting algorithm we’ve covered so far is O(n log n) at best? Counting sort buys its time efficiency by requiring extra memory and being limited in its scope. As you can see from counting sort Python implementation, we need two extra lists. These lists are usually different sizes. One corresponds to the number of entries in the list, and the other corresponds to the largest entry.

As we said above, counting sort only works on integers. If the largest entry in our list is much larger than the size of our list, counting sort becomes less efficient. This only becomes a problem when working with large numbers.

Prove that Counting Sort is Stable

Another thing to consider when evaluating sorting algorithms is stability. Counting sort is stable. In this section, we’ll prove that counting sort is stable. It breaks ties between equal numbers by making sure the number that appears first in the unsorted array also appears first in the sorted array.

We’re going to do one better here. We’re going to change the code slightly to prove that counting sort is stable via demonstration. We can use a priority queue and apply counting sort to that to prove that counting sort is stable. 

We only have to make a few changes to the code to make our counting sort Python implementation work on priority queues. I’ve commented next to the line in the code below. The main differences are using a lambda function to get the max value and using the [0] suffix whenever we work with the unsorted list to grab the number.

def counting_sort(unsorted: list):
   size = len(unsorted)
   sorted_list = [0]*size
   _max = max(unsorted, key=lambda item:item[0])[0] # change here
  
   # get counts of each integer in unsorted
   count = [0]*(_max+1)
   for i in range(size):
       count[unsorted[i][0]] += 1 # change here
  
   # get cumulative count
   for i in range(1, _max+1):
       count[i] += count[i-1]
  
   # create sorted list
   i = size - 1
   while i >= 0:
       sorted_list[count[unsorted[i][0]] - 1] = unsorted[i] # change here
       count[unsorted[i][0]] -= 1 # change here
       i -= 1
  
   return sorted_list
 
unsorted = [(3,'a'), (1,'a'), (1,'b'), (2,'a'), (2,'b')]
print(counting_sort(unsorted))

When we call our counting sort Python implementation on the above priority queue, we get the output shown below.

Summary of Python Counting Sort Guide and Implementation

In this guide we covered how to do a counting sort Python implementation. Counting sort is only implemented when we have data of integers. More traditionally, positive integers. We covered how counting sort is implemented via pseudocode.

Counting sort makes use of an extra two lists and iterates through each of them once. It literally counts the number of times each value appears, hence the name counting sort. We also covered that counting sort has linear time complexity. It scales directly with the max size and the size of the list.

We also adjusted the code to prove that counting sort is stable. We made it so that our counting sort Python implementation worked for priority queues. Then, we sort a priority queue with the adjustment and show that the entries are placed in order. Counting sort always puts numbers in the sorted list in the same order they appear in the unsorted list.

Further Reading

I run this site to help you and others like you find cool projects and practice software skills. If this is helpful for you and you enjoy your ad free site, please help fund this site by donating below! If you can’t donate right now, please think of us next time.

One-Time
Monthly
Yearly

Make a one-time donation

Make a monthly donation

Make a yearly donation

Choose an amount

$5.00
$15.00
$100.00
$5.00
$15.00
$100.00
$5.00
$15.00
$100.00

Or enter a custom amount

$

Your contribution is appreciated.

Your contribution is appreciated.

Your contribution is appreciated.

DonateDonate monthlyDonate yearly
Categories
data structures and algorithms General Python

Prim’s Algorithm in Python for MST

Minimum Spanning Tree (MST) algorithms find the shortest path that connects all the points in a graph. Tree algorithms that find minimum spanning trees are useful in network design, taxonomies, or cluster analysis. On PythonAlgos, we’ve already covered one MST algorithm, Kruskal’s algorithm. This time we’re going to cover Prim’s algorithm for an MST.

Prim’s algorithm is similar to Kruskal’s algorithm. Whereas Kruskal’s adds to the MST by looping through edges, Prim’s adds to the MST by looping through vertices. In this post, we’ll cover how to implement Prim’s algorithm in Python through the following:

Psuedocode for Prim’s Algorithm for MST

Prim’s Algorithm is a graph algorithm that finds the minimum spanning tree of a graph. Our implementation will assume that the graph is connected, and therefore made up of one minimum spanning tree. Perhaps we will cover a more advanced implementation in the future.

Here’s the pseudocode for implementing Prim’s Algorithm:

  1. Pick a random point to start the graph at, in our example, we’ll start from point 0 (the first point)
  2. Go through the tree and find the point that is the shortest distance away from the current points in the MST
  3. Add the new point and update the current list of known minimum distances to the MST
  4. Repeat steps 2 and 3 until all the vertices have been added to the minimum spanning tree

Prim’s Algorithm in Python

We’re going to be using Python 3, and we won’t need any external libraries for this implementation. We will create a Graph object which will hold three properties and three functions. The properties our graph will have will be a large number representing the max possible value for an edge distance, the number of vertices in the graph, and an adjacency matrix representation of the graph.

Creating a Graph Object

The first thing we’re going to do is create a Graph object. Every other code block in this tutorial belongs inside this object. First, we create a class property called INF which represents the max value that an edge in the graph can have (or infinity). Then we’ll create the init function. 

The init function of the Graph object will take one parameter other than self, the number of vertices in the graph. It sets the instance property V to the number of vertices and creates an empty adjacency list representation of a graph with V vertices.

class Graph():
   INF = 999999
   def __init__(self, num_vertices):
       self.V = num_vertices
       self.graph = [[0 for column in range(num_vertices)] for row in range(num_vertices)]

Print Function for MST Created by Prim’s Algorithm

Now that we’ve finished the initial creation of our graph, let’s write the functions we need for Prim’s algorithm. In this section, we’ll create a function that prints the edges and weights of the MST that we find using Prim’s algorithm.

This function takes one parameter, parent. It expects parent to be a list of indices corresponding to the parent node of each index. Then it prints out a line that shows that we’re going to first print the edge, a tab, then the weight. Finally, we’ll loop through all the vertices from 1 (the root vertex, 0, has no parent node), until the end and print out the edge, a tab, then the weight.

   # pretty print of the minimum spanning tree
   # prints the MST stored in the list var `parent`
   def printMST(self, parent):
       print("Edge     Weight")
       for i in range(1, self.V):
           print(f"{parent[i]} - {i}       {self.graph[i][parent[i]]}")

Helper Function for Finding the Minimum Vertex in Prim’s

Now let’s create the helper function to find the next minimum distance vertex to add in Prim’s algorithm. This function takes two parameters, a list of distances called key, and a list representing whether a vertex, represented by the truth value at an index, is in the MST already.

The first thing we’ll do in this function is set a min value which represents the minimum edge distance to a vertex that we’ve found so far. We will initially set this value to the INF property we declared earlier. 

Next, we’ll loop through the list of vertices and check if the distance to a vertex in the key list is less than the current minimum and the vertex is not in the minimum spanning tree. If it is, then we’ll set the new minimum value to be the distance and the current minimum index to be the vertex being iterated on. After looping through each point, we’ll return the minimum index.

   # finds the vertex with the minimum distance value
   # takes a key and the current set of nodes in the MST
   def minKey(self, key, mstSet):
       min = self.INF
       for v in range(self.V):
           if key[v] < min and mstSet[v] == False:
               min = key[v]
               min_index = v
       return min_index

Prim’s Algorithm Implementation

Alright, let’s get to the fun part, the actual implementation of Prim’s algorithm. This Python implementation doesn’t take any parameters other than the object itself. The first things we’re going to do are initialize the list of distances, parent nodes, MST nodes, and their values.

The initial list of distances to existing nodes should be set to the max value, INF, for each node in the range of vertices. We’ll set the distance to the initial node, 0, to 0 to start the MST. The list of parent nodes should be set to None for each of the nodes, then we’ll set the parent node for the initial node, 0, to -1 to show it does not exist. Finally, we’ll create a truth value list where the truth value of each index represents whether that vertex is in the minimum spanning tree yet.

Now we’ll create the outer loop of vertices. In each iteration of this loop we’ll be adding a vertex to the MST. In this loop, we first find the vertex with the minimum distance using the minKey helper function we wrote earlier. Next, we’ll set the index of that vertex in the MST list to True.

Once we’ve selected the next vertex to add to the MST, we’ll sort out the keys which show the distances of the connectable vertices and the parent list. We’ll create another loop that goes through each vertex. This loop will check for all the new connectable vertices and new minimum distances based on the newly added vertex and update everything in the existing lists.

   # prim's algo, graph is represented as an v by v adjacency list
   def prims(self):
       # used to pick minimum weight edge
       key = [self.INF for _ in range(self.V)]
       # used to store MST
       parent = [None for _ in range(self.V)]
       # pick a random vertex, ie 0
       key[0] = 0
       # create list for t/f if a node is connected to the MST
       mstSet = [False for _ in range(self.V)]
        # set the first node to the root (ie have a parent of -1)
       parent[0] = -1
 
       for _ in range(self.V):
           # 1) pick the minimum distance vertex from the current key
           # from the set of points not yet in the MST
           u = self.minKey(key, mstSet)
           # 2) add the new vertex to the MST
           mstSet[u] = True
 
           # loop through the vertices to update the ones that are still
           # not in the MST
           for v in range(self.V):
               # if the edge from the newly added vertex (v) exists,
               # the vertex hasn't been added to the MST, and
               # the new vertex's distance to the graph is greater than the distance
               # stored in the initial graph, update the "key" value to the
               # distance initially given and update the parent of
               # of the vertex (v) to the newly added vertex (u)
               if self.graph[u][v] > 0 and mstSet[v] == False and key[v] > self.graph[u][v]:
                   key[v] = self.graph[u][v]
                   parent[v] = u
 
       self.printMST(parent)

Testing Our Python Implementation of Prim’s Algorithm

Let’s test out our Python implementation of Prim’s algorithm via a graph object on a graph of size five. Before we get into testing the code, let’s take a look at a visual representation of our graph and the expected MST.

Visual representation of the graph that we will be applying our Python implementation of Prim’s algorithm on:

Example Graph to Build an MST from Prims Algorithm

Expected Minimum Spanning Tree Yielded by Prims Algorithm:

Expected MST from Prim’s Algorithm

Now let’s take a look at the code and the output from our implementation.

g = Graph(5)
g.graph = [ [0, 2, 0, 6, 0],
           [2, 0, 3, 8, 5],
           [0, 3, 0, 0, 7],
           [6, 8, 0, 0, 9],
           [0, 5, 7, 9, 0]]
 
g.prims()

The expected print output for our Python implementation of Prim’s Algorithm:

Prim’s Algorithm’s MST

Full Code for Prim’s Algorithm in Python

Here’s the full code for Prim’s Algorithm in Python.

class Graph():
   INF = 999999
   def __init__(self, num_vertices):
       self.V = num_vertices
       self.graph = [[0 for column in range(num_vertices)] for row in range(num_vertices)]
      
   # pretty print of the minimum spanning tree
   # prints the MST stored in the list var `parent`
   def printMST(self, parent):
       print("Edge     Weight")
       for i in range(1, self.V):
           print(f"{parent[i]} - {i}       {self.graph[i][parent[i]]}")
  
   # finds the vertex with the minimum distance value
   # takes a key and the current set of nodes in the MST
   def minKey(self, key, mstSet):
       min = self.INF
       for v in range(self.V):
           if key[v] < min and mstSet[v] == False:
               min = key[v]
               min_index = v
       return min_index
  
   # prim's algo, graph is represented as an v by v adjacency list
   def prims(self):
       # used to pick minimum weight edge
       key = [self.INF for _ in range(self.V)]
       # used to store MST
       parent = [None for _ in range(self.V)]
       # pick a random vertex, ie 0
       key[0] = 0
       # create list for t/f if a node is connected to the MST
       mstSet = [False for _ in range(self.V)]
        # set the first node to the root (ie have a parent of -1)
       parent[0] = -1
 
       for _ in range(self.V):
           # 1) pick the minimum distance vertex from the current key
           # from the set of points not yet in the MST
           u = self.minKey(key, mstSet)
           # 2) add the new vertex to the MST
           mstSet[u] = True
 
           # loop through the vertices to update the ones that are still
           # not in the MST
           for v in range(self.V):
               # if the edge from the newly added vertex (v) exists,
               # the vertex hasn't been added to the MST, and
               # the new vertex's distance to the graph is greater than the distance
               # stored in the initial graph, update the "key" value to the
               # distance initially given and update the parent of
               # of the vertex (v) to the newly added vertex (u)
               if self.graph[u][v] > 0 and mstSet[v] == False and key[v] > self.graph[u][v]:
                   key[v] = self.graph[u][v]
                   parent[v] = u
 
       self.printMST(parent)
 
g = Graph(5)
g.graph = [ [0, 2, 0, 6, 0],
           [2, 0, 3, 8, 5],
           [0, 3, 0, 0, 7],
           [6, 8, 0, 0, 9],
           [0, 5, 7, 9, 0]]
 
g.prims()

Prims Algorithm Using Priority Queue in Python

Now that we’ve walked through one solution to Prims Algorithm, let’s look at a second. This solution uses a priority queue implemented with Python heapq. One of the main differences between these two solutions is that this one uses an adjacency list to represent the graph. From the code above you can see that we used a matrix to represent the graph before.

The way that we do Prim’s algorithm with a priority queue requires us to use tuples to represent edges and vertices. The adjacency list representation below is the same graph as above. Each entry in each row of the adjacency list represents a vertex-edge combination. For example, (1, 2) in the first row means that vertex 0 (the first vertex) connects to vertex 1 with edge length/width 2.

import heapq

# adjacency list for Prims Algorithm with Priority Queue
adj_list_graph=[[(1, 2), (3, 6)],
                [(0, 2), (2, 3), (3, 8), (4, 5)],
                [(1, 3), (4, 7)],
                [(0, 6), (1, 8), (4, 9)],
                [(1, 5), (2, 7), (3, 9)]]

Prims Algorithm Implementation with heapq Priority Queue

Now that we’ve set up the graph that our function can read, let’s implement the logic. This implementation of Prim’s algorithm takes two parameters. The input graph as an adjacency matrix, and the starting vertex.

The first thing we do in our function is establish the data structures we need. We will keep track of the edges, the weights, and the visited vertices. The edges and weights are initially empty and the visited_vertices list is populated with the starting vertex.

Then, while the length of the visited vertices is less than the length of the graph, we use the priority queue to figure out which vertex to add next. In our while loop, we create a heap out of the possible moves from the existing MST. Note that we need to weigh the priority queue by the weight of the edge, not the vertex number.

Once we’ve created that priority queue out of the available moves, we pop the first move available to us with heapq.heappop. We add that vertex and the edge and keep going until we have added all possible vertices to our Prims algorithm MST.

"""Prims Algorithm with a Priority Queue 
Implemented with HeapQ
@parameter graph: Graph (adjacency list rep)
@parameter start: integer within graph's # of vertices
@return list type representing the MST as (vertex, weight)"""
def prims_priority_q(graph, start):
    # establish the necessary data structures 
    edges = []
    weights = []
    visited_vertices = [start]

    while len(visited_vertices) < len(graph):
        moves = []
        for x in visited_vertices:
            for node in graph[x]:
                # push weight, cur vertex, next vertex
                # could be prettier if we used objects instead
                # of tuples
                if node[0] not in visited_vertices:
                    heapq.heappush(moves, (node[1], x, node[0]))
        
        # get the next move based on the weight
        next_move = heapq.heappop(moves)
        print(f"next move: {next_move}")
        # add the next vertex, total weight, and append the edge
        visited_vertices.append(next_move[2])
        weights.append(next_move[0])
        edges.append((next_move[1], next_move[2]))

    return edges, weights

Testing the Priority Queue Version of Prim’s Algorithm

Finally, we can test the priority queue implementation of Prim’s algorithm. We run it on the adjacency list graph we created above and expect to see the same MST as the one we created before. The code below executes these tests and pretty prints the edges and weights.

edges, weights = prims_priority_q(adj_list_graph, 0)
print("edges    weights")
for edge, weight in zip(edges, weights):
    print(f"{edge}      {weight}")

Running the program made of all the code compiled in these sections should result in the output like the one below. In this implementation, we’re even showing the next move available in Prim’s algorithm as we run it.

Prims Algorithm Priority Queue Implementation Test Results

Summary of Prims Algorithm

In this post on Prim’s algorithm in Python, we learned that Prim’s is a minimum spanning tree graph algorithm. Prim’s algorithm creates a MST by adding the nearest vertex one after another. We looked at the psuedocode for Prim’s algorithm, and then created a Python graph class that implements Prim’s. Finally, we tested our implementation on a five vertex graph.

Further Reading

I run this site to help you and others like you find cool projects and practice software skills. If this is helpful for you and you enjoy your ad free site, please help fund this site by donating below! If you can’t donate right now, please think of us next time.

One-Time
Monthly
Yearly

Make a one-time donation

Make a monthly donation

Make a yearly donation

Choose an amount

$5.00
$15.00
$100.00
$5.00
$15.00
$100.00
$5.00
$15.00
$100.00

Or enter a custom amount

$

Your contribution is appreciated.

Your contribution is appreciated.

Your contribution is appreciated.

DonateDonate monthlyDonate yearly
Categories
data structures and algorithms level 1 python

Kruskal’s Algorithm in Python (with Pseudocode)

Data structures and algorithms are a cornerstone of computer science. In our journey so far, we’ve looked at basic data structures like stacks, queues, and dequeues, linked lists and binary trees, and algorithms like sorting algorithms, tree algorithms, and Dijsktra’s algorithm. Now, let’s take a look at another important graph algorithm – Kruskal’s.

Kruskal’s algorithm is an algorithm that finds the minimum spanning tree (MST) of a graph. In this post we’ll cover:

  • Kruskal’s Algorithm Pseudocode
  • Kruskal’s Algorithm in Python
    • Creating a Graph Object
    • Python Union Find Algorithm
    • Implementing Kruskal’s Algorithm in Python
    • Full Python Code for Kruskal’s Graph Algorithm
    • Testing Kruskal’s Algorithm in Python
  • Summary of Kruskal’s Algorithm in Python

Kruskal’s Algorithm Pseudocode

Kruskal’s algorithm uses a greedy approach to build a minimum spanning tree. Let’s take a look at the pseudocode:

  1. Initialize a graph using the shortest (lowest weight) edge
  2. Find the shortest connected edge and add it to the shortest edges so far as long as adding the edge doesn’t create a cycle in the graph
  3. Repeat step 2 until all vertices have been included in the final MST

Kruskal’s Algorithm in Python

Let’s go over how to implement Kruskal’s Algorithm in Python. This implementation of Kruskal’s Algorithm is going to be as a function in a Graph object. We’ll create a Graph object that will hold the number of vertices in the graph as well as an adjacency list that represents the graph. 

Each entry in the adjacency list will have three entries, the two vertices and the weight of the edge between them. We also need to create functions to perform the find and union pieces of the union find algorithm.

Creating a Graph Object in Python

The first thing we’ll need to do is create our Graph object. All the functions we define later in the tutorial are a part of this object. We need an __init__ function and a function to add an edge. The __init__ function takes one parameter, the number of vertices. It sets one of the object properties to the number of vertices and creates an empty adjacency list to represent the edges in the graph.

class Graph(object):
   def __init__(self, num_vertices):
       self.V = num_vertices
       self.graph = []
  
   def add_edge(self, u, v, w):
       self.graph.append([u, v, w])

Python Union Find Algorithm

Union Find is an algorithm to find the sets of connected graphs. This algorithm consists of two separate functions, the find and union functions. The find function needs two parameters, and the union function needs four parameters.

The find function takes a list that keeps track of the minimum spanning tree and a vertex number. If the index in the MST list is equivalent to the passed in vertex, then we return the vertex number. Otherwise, we recursively look for the vertex by calling the find function with the same list, but with the new index corresponding to the value of the index of the vertex in the MST list.

The union function’s four parameters are the MST list, another list that marks the number of disjoint sets and the origin of those sets, and two vertices. First, we look for where the vertices are in terms of the MST. Then, we compare the values of the indices of the set tracking list and change the corresponding value in the MST list depending on the value comparison. If the values in the MST list are different, then the index of the smaller value is set to the higher value and vice versa. If they are the same, then we create a new disjoint set.

   # union find
   def find(self, root, i):
       if root[i] == i:
           return i
       return self.find(root, root[i])
  
   def union(self, root, rank, x, y):
       xroot = self.find(root, x)
       yroot = self.find(root, y)
       if rank[xroot] < rank[yroot]:
           root[xroot] = yroot
       elif rank[xroot] > rank[yroot]:
           root[yroot] = xroot
       else:
           root[yroot] = xroot
           rank[xroot] += 1

Implementing Kruskal’s Algorithm in Python

Now we have our helper functions, let’s take a look at how we can implement Kruskal’s algorithm in Python. We already have the two objects we need to perform Kruskal’s algorithm so this function does not need any parameters.

The first thing we’ll need to do in our Kruskal’s function is initialize the results list, which will contain the MST and will be structured the same as the list of the vertices and edge weights. We also need to initialize the iterations and number of edges added to the MST. Then, we’ll sort the adjacency list representing the graph by the weights of the algorithm.

Now, we’ll initialize the lists that will represent the root nodes of the sets we’ve found (root) and the list of the number of connected sets in the graph represented by the root node (rank). We’ll then populate the two lists.

Next, we’ll run a while loop that runs as long as we haven’t added the expected number of edges to the MST. In the while loop, we’ll start by extracting the vertices and edge weight between them corresponding to the graph index of the iteration we’re in. Now, we’ll use the find function to see if either or both vertices are connected to the graph.

If only one vertex is connected to the graph (x is not equal to y), then we increment the number of edges we’ve found in the MST, add the set of vertices and edge weights to the results list, and run the union algorithm to keep track of the MST and number of sets. When the MST is created, we’ll print it out.

   # applying kruskal's
   def kruskals(self):
       # initialize an empty MST list
       result = []
       # initialize i, the iteration and e, the edges added
       i, e  = 0, 0
       # sort the graph based on edge weights
       self.graph = sorted(self.graph, key = lambda item: item[2])
       # initialize root, which keeps track of the MST
       # and the rank, which keeps track of where each node belongs
       root = []
       rank = []
       for node in range(self.V):
           root.append(node)
           rank.append(0)
 
       # while we haven't yet added each edge
       # increment iterator and run the union find algorithm
       while e < self.V - 1:
           u, v, w = self.graph[i]
           i = i + 1
           x = self.find(root, u)
           y = self.find(root, v)
           print(f"x, y: {x}, {y}")
           if x != y:
               e = e + 1
               result.append([u, v, w])
               self.union(root, rank, x, y)
 
       for u, v, w in result:
           print(f'{u} - {v}: {w}')

Full Kruskal’s Graph Algorithm Python Code

Here’s the full code for Kruskal’s Algorithm in Python implemented with a Graph class:

# kruskal's in Python
class Graph(object):
   def __init__(self, num_vertices):
       self.V = num_vertices
       self.graph = []
  
   def add_edge(self, u, v, w):
       self.graph.append([u, v, w])
  
   # union find
   def find(self, root, i):
       if root[i] == i:
           return i
       print(i, root[i])
       return self.find(root, root[i])
  
   def union(self, root, rank, x, y):
       print(f"root: {root}, rank: {rank}")
       xroot = self.find(root, x)
       yroot = self.find(root, y)
       if rank[xroot] < rank[yroot]:
           root[xroot] = yroot
       elif rank[xroot] > rank[yroot]:
           root[yroot] = xroot
       else:
           root[yroot] = xroot
           rank[xroot] += 1
       print(f"root: {root}, rank: {rank}")
  
   # applying kruskal's
   def kruskals(self):
       # initialize an empty MST list
       result = []
       # initialize i, the iteration and e, the edges added
       i, e  = 0, 0
       # sort the graph based on edge weights
       self.graph = sorted(self.graph, key = lambda item: item[2])
       # initialize root, which keeps track of the MST
       # and the rank, which keeps track of where each node belongs
       root = []
       rank = []
       for node in range(self.V):
           root.append(node)
           rank.append(0)
 
       # while we haven't yet added each edge
       # increment iterator and run the union find algorithm
       while e < self.V - 1:
           u, v, w = self.graph[i]
           i = i + 1
           x = self.find(root, u)
           y = self.find(root, v)
           print(f"x, y: {x}, {y}")
           if x != y:
               e = e + 1
               result.append([u, v, w])
               self.union(root, rank, x, y)
 
       for u, v, w in result:
           print(f'{u} - {v}: {w}')

Testing Kruskal’s Algorithm in Python

Now let’s test Kruskal’s Algorithm. We’ll create the following graph:

Kruskal’s Algorithm Python Test, Initial Graph
g = Graph(6)
g.add_edge(0, 1, 4)
g.add_edge(0, 2, 4)
g.add_edge(1, 2, 2)
g.add_edge(1, 0, 4)
g.add_edge(2, 0, 4)
g.add_edge(2, 1, 2)
g.add_edge(2, 3, 3)
g.add_edge(2, 5, 2)
g.add_edge(2, 4, 4)
g.add_edge(3, 2, 3)
g.add_edge(3, 4, 3)
g.add_edge(4, 2, 4)
g.add_edge(4, 3, 3)
g.add_edge(5, 2, 2)
g.add_edge(5, 4, 3)
g.kruskals()

This will print out something like the image below.

Kruskal’s Minimum Spanning Tree Representation

The MST should correspond to this graph:

Kruskal’s Algorithm Python Generated Minimum Spanning Tree Example

Summary of Kruskal’s Algorithm in Python

Kruskal’s algorithm builds a minimum spanning tree of a graph through adding the minimum weighted edges in a greedy manner. We implemented Kruskal’s algorithm in Python by creating a Graph object and adding functions to it to use the union find algorithm to run Kruskal’s algorithm.

Further Reading

I run this site to help you and others like you find cool projects and practice software skills. If this is helpful for you and you enjoy your ad free site, please help fund this site by donating below! If you can’t donate right now, please think of us next time.

One-Time
Monthly
Yearly

Make a one-time donation

Make a monthly donation

Make a yearly donation

Choose an amount

$5.00
$15.00
$100.00
$5.00
$15.00
$100.00
$5.00
$15.00
$100.00

Or enter a custom amount

$

Your contribution is appreciated.

Your contribution is appreciated.

Your contribution is appreciated.

DonateDonate monthlyDonate yearly
Categories
data structures and algorithms

Floyd Warshall in Python (with Pseudocode)

Data structures and algorithms are a cornerstone of computer science. In our journey so far, we’ve looked at basic data structures like stacks, queues, and dequeues, linked lists and binary trees, and algorithms like sorting algorithms, tree algorithms, and Dijsktra’s algorithm. Now, let’s take a look at another important graph algorithm – Floyd Warshall.

In this post, we’ll take a look at:

  • Floyd Warshall Pseudocode
  • Floyd Warshall Algorithm in Python
    • Define Constants and Create Examples
    • Implementing Floyd Warshall in Python
    • Testing Floyd Warshall in Python
  • Summary of Floyd Warshall Algorithm in Python

Floyd Warshall Pseudocode

Floyd Warshall is a simple graph algorithm that maps out the shortest path from each vertex to another using an adjacency graph. It takes a brute force approach by looping through each possible vertex that a path between two vertices can go through. 

Let’s take a look at the pseudocode:

  1. Pick a vertex – v
  2. Calculate distance for each set of vertices
  3. If the distance through vertex v is less than the currently recorded distance between two vertices, replace that distance
  4. Repeat steps 1 through 3 for each vertex

Note that Floyd Warshall will not work for any graph with negative edge weights. It’s also useless for unweighted and directed graphs.

Floyd Warshall Algorithm in Python

We’re going to implement the Floyd Warshall Algorithm in Python. First we’re going to define any constants we need and create some example adjacency lists, then we’ll define and implement the Floyd Warshall algorithm, finally we’ll test it.

Define Constants and Create Examples

The only constant we need to define for Floyd Warshall is “infinity”. Technically, we could also import the infinity constant from the math module, but the behavior will be the same as long as we define an infinity that is greater than any individual distance between vertices. Most of the time, we won’t be working with graphs with distances of greater than 99999 between vertices so we’ll use that as our infinity value. 

Then we’ll create two examples. These graphs will have 0 in the diagonal since the distance from a point to itself is 0, and we’ll also throw in some infinities into our adjacency graph to represent unconnected vertices.

# define "infinity"
_inf = 99999
 
# example 1
A = [[0, 5, _inf, _inf],
    [50, 0, 15, 5],
    [30, _inf, 0, 15],
    [15, _inf, 5, 0]]
 
# example 2
B = [[0, 3, _inf, 5],
    [2, 0, _inf, 4],
    [_inf, 1, 0, _inf],
    [_inf, _inf, 2, 0]]

These examples should give us:

[0, 5, 15, 10]
[20, 0, 10, 5]
[30, 35, 0, 15]
[15, 20, 5, 0]

and

[0, 3, 7, 5]
[2, 0, 6, 4]
[3, 1, 0, 5]
[5, 3, 2, 0]

Implementing Floyd Warshall in Python

Now that we’ve defined our constant and created our example, we’ll implement the Floyd Warshall algorithm in Python. First, we’ll get the number of vertices that we’re dealing with by getting the length of the graph. Next, we’ll create our loops. 

We need three layers of for loops that will loop through each vertex three times and replace the entry in the graph by the minimum distance between the direct distance between two vertices and the distance between those vertices that goes through the initially defined vertex (in the outermost loop). Finally, we’ll print out the list of lengths for each row in the graph.

# floyd warshall function
def floyd_warshall(graph):
​​   n = len(graph)
   # stepping loop
   for k in range(n):
       # outer loop
       for i in range(n):
           # inner loop
           for j in range(n):
               # replace direct path with path through k if direct path is longer
               graph[i][j] = min(graph[i][j], graph[i][k] + graph[k][j])
   # print complete graph in a pretty manner
   for row in graph:
       print(row)

Testing Floyd Warshall in Python

Now that we’ve implemented the Floyd Warshall algorithm, let’s take a look at how our algorithm does against the expected solutions we saw above. All we do is call Floyd Warshall on each of the graphs we made and check what the solution prints.

print("Shortest Paths Graph for A")
floyd_warshall(A)
print("Shortest Paths Graph for B")
floyd_warshall(B)

We should see an output like the image below.

Floyd Warshall Algorithm Results

Summary of Floyd Warshall Algorithm in Python

In this post we went over how to implement Floyd Warshall in Python. We walked through the pseudocode and then the actual implementation. The Floyd Warshall algorithm loops through each vertex three times. First it loops through a chosen vertex, then it loops through sets of two vertices that define the distance between them. While looping, it replaces the distance between two vertices with the minimum between the existing distance and the distance between the chosen vertex.

Further Reading

I run this site to help you and others like you find cool projects and practice software skills. If this is helpful for you and you enjoy your ad free site, please help fund this site by donating below! If you can’t donate right now, please think of us next time.

One-Time
Monthly
Yearly

Make a one-time donation

Make a monthly donation

Make a yearly donation

Choose an amount

$5.00
$15.00
$100.00
$5.00
$15.00
$100.00
$5.00
$15.00
$100.00

Or enter a custom amount

$

Your contribution is appreciated.

Your contribution is appreciated.

Your contribution is appreciated.

DonateDonate monthlyDonate yearly
Categories
data structures and algorithms General Python level 1 python

Create Beautiful Sorting Animations in Python

Sorting Algorithms are an important part of computer science. It’s great to be able to understand these. Being able to see the animations of these algorithms is a great way to understand them. I was recently asked to create a tutorial on how to create the animations on my Sorting Algorithms page. 

To follow this guide, we’ll need to install the matplotlib library. We can do so with the terminal command below.

pip install matplotlib

Overview of How to Create Sorting Animations in Python

In this post we’ll create a file that handles our animation function called animate_generator.py. To actually create the sorting animations, we’ll create a different file for each sorting algorithm.

In this post we’ll go over:

  • Matplotlib Modules for Animation
    • Introduction to matplotlib.animation.FuncAnimation
    • Introduction to matplotlib.animation.PillowWriter
  • Creating the Sorting Animation
    • Setting Up the Figure to Animate
    • Creating the Frame Animation Function
    • Using FuncAnimation to Animate Function
    • Save the GIF with the PillowWriter Class
    • Full Code for Creating an Animation Function in Python
  • Animating Different Sorting Algorithms
    • Bubble Sort
    • Heap Sort
    • Insertion Sort
    • Merge Sort
    • Quick Sort

Matplotlib Modules for Animation

We’re going to use matplotlib, the de facto plotting library for Python. We’ve used matplotlib.pyplot in so many posts already such as how to plot a random dataset, an example of bifurcation, and how to create a recurrent neural network from scratch. This is the basic plotting library to plot functions and scatter plots in Python. For creating and saving an animation though, we’re going to use two classes from the matplotlib.animation library, FuncAnimation, and PillowWriter.

import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation, PillowWriter

Introduction to matplotlib.animation.FuncAnimation

class matplotlib.animation.FuncAnimation(fig, func, frames=None, init_func=None, fargs=None, save_count=None, *, cache_frame_data=True, **kwargs)

FuncAnimation creates an animation with a passed in function. As we can see from the code signature above, FuncAnimation requires two parameters and takes up to seven optional parameters. We’ll use this to create our sorting animations. To learn more, read about how to create animations with FuncAnimation.

Introduction to matplotlib.animation.PillowWriter

class matplotlib.animation.PillowWriter(fps=5, metadata=None, codec=None, bitrate=None)

PillowWriter is the Python Image Library fork writer to save animations to a file. PillowWriter takes four parameters. None of the parameters are required, but we’ll be using the fps parameter. The other three are not commonly used. To learn more, read about how to save animations with PillowWriter in Python.

Creating the Sorting Animation Function

To create beautiful sorting animations in Python, the most important step is to actually create the function that will create our sorting animations. Our animate functions will take four mandatory parameters and one optional parameter. The four required parameters are _list, the list that we want to sort, generator, the sorting function in the form of a generator, title, the title of the file, save, whether or not we want to save the file, and save_frames, the number of frames to save.

We’ll begin our function by saving the length of the list to a variable, n, which we’ll use later to determine the axis size. Then we’ll get our figure and axis from subplots(), and set the title of the axis to the passed in title variable.

def animate(_list, generator, title, save, save_frames=1000):
    n = len(_list)
    fig, ax = plt.subplots()
    ax.set_title(title)

Setting Up the Figure to Animate

Now let’s set up the figure that we’re going to run our animations on. First thing we’ll need to do is set up the bars. We’re going to use a bar plot to create our sorting animation because we want to visualize the numbers moving around. Next, we’ll set the x and y limits. We set the xlim to the length of the list and the ylim to a little bit larger than the length of the list for a cleaner looking animation. 

Then we’ll set some text to be displayed at the top corner of the image. Finally, we’ll set the iteration to a one element list with entry 0. Note that we’re not going to add elements to this list, just increment the element in it. Why do we use a list then? Because we don’t want to actually change the “value” of the variable itself, just the value that it holds. This is due to some under-the-hood implementation details of animation. Feel free to change it to an int from a list and play around with it yourself to see why.

    bar_rects = ax.bar(range(len(_list)), _list, align="edge")  
    ax.set_xlim(0, n)
    ax.set_ylim(0, int(1.1*n))
    text = ax.text(0.01, 0.95, "", transform = ax.transAxes)
    iteration = [0]

Creating the Frame Animation Function

Now let’s create the function that we’ll pass into FuncAnimation to create the individual frames of our animation. We’ll define a function inside of our animation function called _animation (the underscore in front just denotes a private function/variable). This function will take three parameters, the array of numbers, the rects, representing the bars on the axis, and the iteration telling us which iteration it is.

In this function, we’ll set the values of all the bars in a frame, increment the first (and only) element of the iteration by 1, and then set the text to display which iteration we’re on.

    def _animate(array, rects, iteration):
        for rect, val in zip(rects, array):
            rect.set_height(val)
        iteration[0] += 1
        text.set_text(f"iterations: {iteration[0]}")

Using FuncAnimation to Animate Function

With everything set up, we can get to creating the actual animation. The matplotlib.animation.FuncAnimation class provides an easy way to stitch a bunch of frames together for an animation. All we need to do to create an animation is use FuncAnimation and pass it our figure, the _animate function we created earlier and the arguments we need for it through fargs, and the generator function for each frame. We’ll also pass an interval between frames, whether or not we want to repeat the animation (I set this to False, but this is actually trivial), and the number of frames we want to save in the case that we save this animation.

    anim = FuncAnimation(fig, func=_animate,
        fargs=(bar_rects, iteration), frames=generator, interval=10,
        repeat=False, save_count=save_frames)

Save the GIF with the PillowWriter Function

Finally, we’ll use the matplotlib.animation.PillowWriter class to create a writer to save our animation with. Remember that we had save as a parameter for our animate function to dictate whether or not we’ll actually save the outputted animation as a file. If we choose to save the file, we’ll use PillowWriter to create a writer with an fps of 30. Then we’ll call the animation object to save it to a .gif file using the PillowWriter object. 

Finally, regardless of whether or not we save the .gif file, we’ll show the figure so we can see the animated sorting algorithm.

    if save:
        writer = PillowWriter(fps=30)
        anim.save(title + ".gif", writer=writer)
    plt.show()

Full Code for Creating an Animation Function in Python

Here’s the full code for creating an animation function we can use to animate our sorting algorithms and save them as .gif files.

import matplotlib.pyplot as plt
from matplotlib.animation import FuncAnimation, PillowWriter
 
def animate(_list, generator, title, save, save_frames=1000):
    n = len(_list)
    fig, ax = plt.subplots()
    ax.set_title(title)
 
    bar_rects = ax.bar(range(len(_list)), _list, align="edge")  
    ax.set_xlim(0, n)
    ax.set_ylim(0, int(1.1*n))
    text = ax.text(0.01, 0.95, "", transform = ax.transAxes)
    iteration = [0]
    def _animate(array, rects, iteration):
        for rect, val in zip(rects, array):
            rect.set_height(val)
        iteration[0] += 1
        text.set_text(f"iterations: {iteration[0]}")
   
    anim = FuncAnimation(fig, func=_animate,
        fargs=(bar_rects, iteration), frames=generator, interval=10,
        repeat=False, save_count=save_frames)
    if save:
        writer = PillowWriter(fps=30)
        anim.save(title + ".gif", writer=writer)
    plt.show()	

Animating Different Sorting Functions

Now that we’ve created the sort algorithm animation function, we’re ready to apply it. Let’s apply it to five common sorting algorithms we covered in the Sorting Algorithms You Need to Know post. We’ll animate Bubble Sort, Heap Sort, Insertion Sort, Merge Sort, and Quick Sort.

Bubble Sort Animation with Python

The first sorting algorithm we’ll animate with our animate function is bubble sort. Bubble sort is one of the first sorting algorithms taught in a beginner data structures and algorithms course on computer science. 

The steps for bubble sort can simply be expressed like so:

  1. Use a for loop to loop through the list
    1. For each element in the list, loop through each element before it
      1. If the element before the element we’re checking is less than the element after it, swap the elements

This results in a “bubbling” of the largest elements to the top (end) of the list. 

In the example code we begin by importing the animate function from animate_generator.py and the random module. Our bubblesort function will take one parameter, an array. The first thing we’ll do is assign the length of the array to a variable for simplicity. Then we’ll initialize a double for loop. The outer loop will loop through each element, the inner loop will loop through each element less than the current index of the outer loop.

Inside the inner loop, we’ll compare each element to the one after it. If the element is greater than the element after it, we’ll swap them. Then we’ll yield the changed array. At the end of the double for loop, our function will give us a generator function that will contain a lazy iterator “containing” each swap in the bubble sort algorithm.

Finally, we’ll create an array of length 50, shuffle it, and call the animate function on it to create our animation.

from animate_generator import animate
import random
 
def bubblesort(arr):
    n = len(arr)
    for i in range(n-1):
        for j in range(n-i-1):
            if arr[j] > arr[j+1]:
                arr[j], arr[j+1] = arr[j+1], arr[j]
                yield arr
 
array = [i for i in range(50)]
random.shuffle(array)
animate(array, bubblesort(array), "bubble sort", True)

Heap Sort Animation with Python

This is my favorite sorting animation. I think heap sort is the most beautiful sort. The next sorting algorithm we’ll animate is Heap Sort. Heap sort is more complex than bubble sort. It’s a recursive sorting animation that “heaps” the list over and over again and then sorts the heap.

from animate_generator import animate
import random

Heap Function for Heap Sort

We need to create a heap function to turn the list into heaps for sorting. Our heap function will take three parameters, arr, the array, n, the length of the array, and i, the root element we’re heaping from. Inside the function we assign the left and right children of the root node as the first and second indices after doubling the root index. 

Then we compare to see if the left child’s index is in the array and if the array entry at that index is greater than the ith index. If it is, we assign the left child’s index as the new root index. If the right child’s index is less than the size of the array and if the entry at the current root (which could be the left child or the original i) is less than the value of the right child, we assign the right child’s index to the root.

If we change the value of the root index during either of those checks, we’ll swap the values at the new root and the original i. Then we’ll recursively call the heap function again with the new root value as the new i.

def heap(arr, n, i):
    root = i
    lc = 2*i + 1
    rc = 2*i + 2
    if lc < n  and arr[i] < arr[lc]:
        root = lc
    if rc < n and arr[root] < arr[rc]:
        root = rc
    if root != i:
        arr[i], arr[root] = arr[root], arr[i]
        heap(arr, n, root)

Heap Sort Function

Now that we’ve created the heap function for heap sort, let’s create the heap_sort function that will execute the heap function on the whole array. The heap_sort function takes one parameter, an unsorted list. The first thing we do in our heap_sort function is store the length of the array as a variable.

For each of the elements in the first half of the array, starting from the middle and working backwards, we call the heap function with that element as the root. We also yield each of the resulting arrays for each heap call.

Then we’ll start from the end and work our way backwards again to the starting index. For each one, we’ll swap the focus element, i, and the first element, and yield the array. Then we call the heap function on the array with the first element as the root element and yield that array.

Finally, we’ll shuffle an array of size 100 and call the animate function on it to create the animation. We can sort 100 here instead of just 50 because this sort is WAY faster than bubble sort. Read the post about Sorting Functions to learn more about the time complexity comparison.

def heap_sort(arr):
    n = len(arr)
    for i in range(n//2, -1, -1):
        heap(arr, n, i)
        yield arr
    for i in range(n-1, 0, -1):
        arr[i], arr[0] = arr[0], arr[i]
        yield arr
        heap(arr, i, 0)
        yield arr
 
array = [i for i in range(100)]
random.shuffle(array)
animate(array, heap_sort(array), "heap sort", False)

Full Code for Heap Sort Algorithm with Animation in Python

from animate_generator import animate
import random
 
def heap(arr, n, i):
    root = i
    lc = 2*i + 1
    rc = 2*i + 2
    if lc < n  and arr[i] < arr[lc]:
        root = lc
    if rc < n and arr[root] < arr[rc]:
        root = rc
    if root != i:
        arr[i], arr[root] = arr[root], arr[i]
        heap(arr, n, root)
 
def heap_sort(arr):
    n = len(arr)
    for i in range(n//2, -1, -1):
        heap(arr, n, i)
        yield arr
    for i in range(n-1, 0, -1):
        arr[i], arr[0] = arr[0], arr[i]
        yield arr
        heap(arr, i, 0)
        yield arr
 
array = [i for i in range(100)]
random.shuffle(array)
animate(array, heap_sort(array), "heap sort", False)

Insertion Sort Animation with Python

Insertion sort is called such because of the way it creates the sorted list. The insertion sort algorithm inserts sorted elements into a list as it grows. Each time it encounters an element, it inserts it into the already sorted list of elements before it.

Our insertion_sort algorithm will take one parameter, the unsorted list. We’ll loop through each element in the list and check the next element. While the next element is less than the focus element and we’re not at the 0 index, we’ll swap the element with the one before it and yield the changed array. Then we’ll set the i+1th element to the next element that we initially got at the beginning of our loop and yield this updated array.

Finally, to make the animation, we’ll create a randomized list of 50 elements and pass it to the animate function from animation_generator.py.

from animate_generator import animate
import random
 
def insertion_sort(arr):
    for i in range(len(arr)-1):
        next = arr[i+1]
        while arr[i] > next and i >= 0:
            arr[i+1] = arr[i]
            i -= 1
            yield arr
        arr[i+1] = next
        yield arr
 
array = [i for i in range(50)]
random.shuffle(array)
animate(array, insertion_sort(array), "insertion sort", False)

Merge Sort Animation with Python

The merge sort algorithm is a sorting algorithm that continuously breaks down an unsorted list in half until it gets to just 2 elements. Once it gets that list, it sorts those elements and merges them until we get a wholly sorted list. The only two imports we’ll have to import are the animate function from animate_generator and the random library.

from animate_generator import animate
import random

Merge Function for Merge Sort

First let’s create a merge function, this function will take four parameters, the list we want to merge sort, the start index, the middle index, and the end index. The first thing we’ll do is create an empty list of results and then set the left and right indices. The left index will be the start index and the right index will be one past the middle index. These indicate where we’ll start organizing the left and right side of the list.

The next thing we’ll do is append to our result array. We will go through the left and right side of the array we passed in. If the element in the left index is less than the element in the right index, we append the value from the left index into the result array and increment the left index. Otherwise we’ll append the value from the right index into the result array and increment the right index.

After we reach the end of either the left or right side, we go through the rest of the left array and right array (only one of these will still have terms actually) and append the rest of that list to the result list. After each of these, we’ll loop through the result list and set the values in the original to the right values and yield the updated list.

def merge(_arr, start, mid, end):
    result = []
    left_index = start
    right_index = mid + 1
    while left_index <= mid and right_index <= end:
        if _arr[left_index] < _arr[right_index]:
            result.append(_arr[left_index])
            left_index+=1
        else:
            result.append(_arr[right_index])
            right_index+=1
   
    while left_index <= mid:
        result.append(_arr[left_index])
        left_index+=1
   
    while right_index <= end:
        result.append(_arr[right_index])
        right_index+=1
   
    for index, value in enumerate(result):
        _arr[start+index] = value
        yield _arr

Sort Function for Merge Sort

As I said earlier, the merge sort function takes two functions to do, a merge function and a sort function. The sort function will take three parameters, the unsorted list, the start index, and the end index. The first thing we’ll do in the sort function is check for validity of the start and end indices and then create a middle index.

From there, all we need to do is recursively sort the left and right sides and then merge the two. We’ll yield from each of those calls to get the intermediate steps and finally we’ll yield the sorted array. To see our animation in action, we simply create a randomized list of 100 integers and pass it to the animate function.

def sort(_arr, start, end):
    if end <= start:
        return
    mid = start + ((end-start+1)//2)
    yield from sort(_arr, start, mid-1)
    yield from sort(_arr, mid, end)
    yield from merge(_arr, start, mid-1, end)
    yield _arr
 
N=100
array = [i for i in range(N)]
random.shuffle(array)
animate(array, sort(array, 0, N-1), "merge sort", True)

Full Code for Merge Sort Algorithm with Animation in Python

from animate_generator import animate
import random
 
def merge(_arr, start, mid, end):
    result = []
    left_index = start
    right_index = mid + 1
    while left_index <= mid and right_index <= end:
        if _arr[left_index] < _arr[right_index]:
            result.append(_arr[left_index])
            left_index+=1
        else:
            result.append(_arr[right_index])
            right_index+=1
   
    while left_index <= mid:
        result.append(_arr[left_index])
        left_index+=1
   
    while right_index <= end:
        result.append(_arr[right_index])
        right_index+=1
   
    for index, value in enumerate(result):
        _arr[start+index] = value
        yield _arr
 
def sort(_arr, start, end):
    if end <= start:
        return
    mid = start + ((end-start+1)//2)
    yield from sort(_arr, start, mid-1)
    yield from sort(_arr, mid, end)
    yield from merge(_arr, start, mid-1, end)
    yield _arr
 
N=100
array = [i for i in range(N)]
random.shuffle(array)
animate(array, sort(array, 0, N-1), "merge sort", True)

Quicksort Animation with Python

The last sorting algorithm we’ll go over is the quicksort algorithm. The quicksort algorithm is based around choosing a “pivot” and sorting from start to end around that pivot by sorting the numbers less than it, then the numbers greater than it.

As always, we need to import the animate function and the random library. From there we’ll make a quicksort function that takes three parameters, the unsorted list, the start index, and the end index. Our function will start off by validating the indices and then defining a pivot and a swap index (as the start index for this example, but it’s actually up to you how you want to define it).

For each of the elements after the start index and before the end index, we’ll compare the element against the pivot. If the element is less than or equal to the pivot (it will never be equal in our example), we’ll increment the swap index and swap the ith element and the element in the swap index. We’ll yield the resulting array for each of these loops.

After we’re done traversing from the start index up, we swap the elements in the swap and start indices and yield the updated list. Then we yield from a quicksort of the new start index to one under the current swap index and a quicksort of one over the current swap index to the end index. Note that we don’t need to include the swap index from either of the recursive quicksort calls because it’s already in place.

from animate_generator import animate
import random
 
def quicksort(array, start, end):
    if start >= end:
        return
    pivot = array[start]
    swap = start
    for i in range(start+1, end+1):
        if array[i] <= pivot:
            swap += 1
            array[swap], array[i] = array[i], array[swap]
        yield array
    array[start], array[swap],  = array[swap], array[start]
    yield array
 
    yield from quicksort(array, start, swap-1)
    yield from quicksort(array, swap+1, end)
 
array = [i for i in range(100)]
random.shuffle(array)
animate(array, quicksort(array, 0, 99), "quicksort", False)

I run this site to help you and others like you find cool projects and practice software skills. If this is helpful for you and you enjoy your ad free site, please help fund this site by donating below! If you can’t donate right now, please think of us next time.

One-Time
Monthly
Yearly

Make a one-time donation

Make a monthly donation

Make a yearly donation

Choose an amount

$5.00
$15.00
$100.00
$5.00
$15.00
$100.00
$5.00
$15.00
$100.00

Or enter a custom amount

$

Your contribution is appreciated.

Your contribution is appreciated.

Your contribution is appreciated.

DonateDonate monthlyDonate yearly