An Overview of Multiprocessing in Python

Python is one of the most popular programming languages today. However, one big weakness is that Python threads cannot run code truly in parallel. Multiprocessing is a way to implement virtual parallelization with Python. In this post we’ll cover:

  • What is Multiprocessing?
  • The Python Multiprocessing Library
    • Running Two Functions in Parallel
    • Multiprocessing Functions with Arguments
    • Multiprocessing for Functions that Depend on Other Functions
  • A Summary of Python Multiprocessing

What is Multiprocessing?

Multiprocessing is the action of running multiple processes at once. A process is simply a running instance of executable code, and a program may contain multiple processes. Processes may be single threaded or multithreaded. Because of its Global Interpreter Lock (GIL), Python threads within a single process cannot execute Python code truly in parallel. However, multiprocessing allows us to achieve a level of virtual parallelization by running multiple processes, each with its own interpreter.

The Python Multiprocessing Library

The Python multiprocessing library is part of the standard library, so it comes with every Python installation. It has an almost identical interface to the Python threading library. The multiprocessing library allows us to virtually parallelize operations by spinning up multiple processes and letting them run at the same time.
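
Because the two interfaces mirror each other, it’s easy to see the GIL’s effect directly. The sketch below is illustrative only (cpu_bound and time_two are made-up names): it times the same CPU-bound function run by two threads versus two processes. On a multi-core machine, the thread version takes roughly as long as running the function twice in a row, while the process version takes roughly half that.

import time
from multiprocessing import Process
from threading import Thread

def cpu_bound():
    # Busy loop that keeps the interpreter executing Python bytecode.
    total = 0
    for _ in range(10_000_000):
        total += 1

def time_two(worker_cls):
    # Start two workers of the given type and measure how long both take.
    workers = [worker_cls(target=cpu_bound) for _ in range(2)]
    start = time.perf_counter()
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    return time.perf_counter() - start

if __name__ == "__main__":
    print("two threads:  ", time_two(Thread))
    print("two processes:", time_two(Process))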

The core of running multiple functions in parallel with the multiprocessing library is the Process object. A Process object takes a target function to run, and it also provides a way to pass arguments to that function. We run Process objects by calling their .start() and .join() methods.
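
As a minimal sketch of that lifecycle (the greet function here is purely illustrative): construct a Process with a target and its arguments, launch it with start(), and wait for it to finish with join().

from multiprocessing import Process

def greet(name):
    print("hello", name)

if __name__ == "__main__":
    # Construct a Process, launch the child with start(), wait with join().
    p = Process(target=greet, args=("world",))
    p.start()
    p.join()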

Running Two Functions in Parallel

The simplest version of multiprocessing is running two functions with no parameters in parallel. In the example below, we create two functions with no dependencies and no arguments. Each one simply initializes a counter variable and increments it by one 100 times, printing the count as it goes.

To run these two example functions in parallel, we start by making two Process objects, passing each function as the target. Then we start and join each Process object. We should see the two counters print from 1 to 100 independently of each other, with their output interleaved.

Read Run Multiple Functions in Parallel in Python 3 for a more detailed description.

from multiprocessing import Process
 
def func1():
    counter = 0
    print("start func 1")We 
    while counter < 100:
        counter += 1
        print("func 1", counter)
    print("end func 1")
 
def func2():
    counter = 0
    print("start func 2")
    while counter < 100:
        counter += 1
        print("func 2", counter)
    print("end func 2")
 
if __name__ == "__main__":
    p1 = Process(target=func1)
    p2 = Process(target=func2)
    p1.start()
    p2.start()
    p1.join()
    p2.join()

Multiprocessing Functions with Arguments

Now we know how to run the simplest of functions in parallel through the multiprocessing library. Let’s see how we can run functions that require arguments. The setup is almost the same. This time, the functions will each take a parameter and increment that parameter.

In the main part of the script, we’ll set up two counter variables, one for each Process. There are two ways to pass arguments to the target function. The first is passing a tuple of positional arguments through the args keyword. The second is passing a dictionary of keyword arguments through the kwargs keyword.

Read Python Multiprocessing with Arguments for a more detailed description.

from multiprocessing import Process
 
def func1(counter: int):
    print("start func 1")
    for i in range(100):
        counter += 1
        print("func 1", counter)
    print("end func 1")
 
def func2(counter: int):
    print("start func 2")
    for i in range(100):
        counter += 1
        print("func 2", counter)
    print("end func 2")
 
if __name__ == "__main__":
    counter1 = 0
    counter2 = 0
    p1 = Process(target=func1, args=(counter1,))
    p2 = Process(target=func2, kwargs={"counter": counter2})
    p1.start()
    p2.start()
    p1.join()
    p2.join()
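
One caveat before moving on: each child process receives its own copy of the integer it was passed, so the increments inside func1 and func2 are never visible back in the parent. A quick sketch (not part of the original example) that demonstrates this:

from multiprocessing import Process

def add_one(counter: int):
    counter += 1  # modifies the child process's copy only

if __name__ == "__main__":
    counter1 = 0
    p = Process(target=add_one, args=(counter1,))
    p.start()
    p.join()
    print(counter1)  # still 0; the parent's variable is unchanged

The next section covers how to share a single counter across processes.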

Multiprocessing for Functions that Depend on Other Functions

Once we learn how to pass arguments to a function running in a Process, we can use that to build functions on top of each other. In this example, we’re going to use almost the same setup, with example functions that count from 1 to 100. This time, we’re also going to create two wrapper functions that each run both of the counting functions in parallel.

Then, we’re going to run those two wrapper functions in parallel, which means four counting functions run in total. This time, we’re going to pass the same variable to each of the processes. Since processes do not share memory by default, we can’t just pass a regular variable. We need to pass a special type of variable to the functions so that it gets incremented by all of the functions.

The Value class from the multiprocessing library allows us to create a value in shared memory that can be accessed from multiple Process objects. A Value is created with an associated lock, so it can be shared across processes, and all four counting functions can increment the same counter. We should see the counter end up at (or very near) 400; counter.value += 1 is a separate read and write that does not hold the lock for the whole operation, so a locked variant is sketched after the example below.

Read Python Multiprocessing Functions with Dependencies for a more detailed description.

from multiprocessing import Process, Value
 
def func1(counter):
    print("start func 1")
    for i in range(100):
        counter.value += 1
        print("func 1", counter.value)
    print("end func 1")
 
def func2(counter):
    print("start func 2")
    for i in range(100):
        counter.value += 1
        print("func 2", counter.value)
    print("end func 2")
 
def func3(counter):
    # Run func1 and func2 in parallel, both incrementing the same shared counter.
    p1 = Process(target=func1, args=(counter,))
    p2 = Process(target=func2, args=(counter,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print(f"func3, {counter.value}")
 
def func4(counter):
    p1 = Process(target=func1, args=(counter,))
    p2 = Process(target=func2, args=(counter,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print(f"func4, {counter.value}")
 
if __name__ == "__main__":
    # 'd' creates a double-precision float in shared memory, starting at 0.
    counter = Value('d', 0)
    p1 = Process(target=func3, args=(counter,))
    p2 = Process(target=func4, args=(counter,))
    p1.start()
    p2.start()
    p1.join()
    p2.join()
    print(f"main, {counter.value}")

A Summary of Python Multiprocessing

In this post we covered multiprocessing and how it can be used to achieve virtual parallelization in Python. Then, we briefly covered the Python multiprocessing library and went over three things we can do with it.

First, we covered how to run simple example functions with no dependencies or arguments. Second, we learned how to run example functions that require arguments, which we can pass through the args or kwargs parameters. Third, we learned how to run multiple functions in parallel that depend on other functions and modify a shared variable.


Yujian Tang
