Intro to Python Multithreading

📅 February 23, 2024
Multithreading allows your Python script to do more than one thing at a time. This is called concurrency, and it can make your programs more efficient and speed up their apparent execution…somewhat.

Python provides the threading module, so you can run multiple functions simultaneously. It is pretty cool stuff, but the world of concurrency opens an entirely new set of issues to deal with.

Let’s write a simple Python script that runs the same function multiple times — simultaneously — to show how to perform multithreading in Python.

Non-threaded Code

Below is a Python script that passes a string to a function, converts the string to uppercase, and prints Unicode stars between the letters. When finished, the script displays “All operations complete” and exits.

#!/usr/bin/env python3

import time

def showMessage(msg):
    for c in msg:
        print(c.upper() + '\u2605', end = '', flush = True)
        time.sleep(0.1)
    print()


showMessage('MySecretMessage') 
showMessage('BringLegos') 
print('All operations complete.')

Output. You will need a terminal capable of displaying Unicode characters; gnome-terminal was used here.

Okay, that shows us what happens when we run a single-threaded application. With one thread, there is a single “thread” of execution. A call to the showMessage() function must complete before the next call can begin. The final message only appears at the end, after all preceding statements have finished executing. This is the plain sequential execution that we are familiar with. So far, so good.

Multi-threaded Code

But what happens if we revise our script in a way that turns the call to showMessage() into a thread?

#!/usr/bin/env python3

import threading, time

def showMessage(msg):
    for c in msg:
        print(c.upper() + '\u2605', end = '', flush = True)
        time.sleep(0.1)
    print()

threading.Thread(target = showMessage, args = ('MySecretMessage',)).start()

print('All operations complete.')

What? This makes no sense, but the output is correct.

Import the threading module, and then create a thread object using Thread(). The parameters specify how the thread should behave.

target = showMessage

target is a function to call. Make sure to use only the function name because we need to pass the function object to threading. Do not use parentheses like this:

target = showMessage() # No, no, no!
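To see why the parentheses matter, here is a minimal sketch. The helper whoRuns() and the list ranIn are hypothetical names used only for illustration; they record which thread actually executed the call.

```python
import threading

ranIn = []  # names of the threads that executed whoRuns() (illustration only)

def whoRuns():
    # Record the name of the thread that is executing this call
    ranIn.append(threading.current_thread().name)

# Correct: pass the function object; the new thread calls it after start()
worker = threading.Thread(target=whoRuns, name='Worker')
worker.start()
worker.join()

# Wrong: whoRuns() is called right here, by the calling thread,
# and its return value (None) becomes the target
broken = threading.Thread(target=whoRuns(), name='Broken')
broken.start()  # the thread starts, finds no target, and ends
broken.join()

# When run as a script, the second entry is the main thread, not 'Broken'
print(ranIn)
```

The second call never runs inside the 'Broken' thread at all, which is why the parentheses silently defeat the purpose of the thread.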

args = ('MySecretMessage', )

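Since the function showMessage() requires an argument, we specify the argument that we need to pass to the function here. Notice the trailing comma? Arguments are passed as a tuple, and the comma is what makes a one-element tuple. Without it, ('MySecretMessage') is just a string in parentheses. That is not a syntax error, but the thread will unpack the string into 15 separate one-character arguments, and showMessage() will fail with a TypeError when the thread runs.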
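A quick way to see what the comma does is to inspect both spellings. This is a small sketch; collect() and received are hypothetical names, and collect() accepts any number of arguments so that we can count what the thread actually passes in.

```python
import threading

withComma = ('MySecretMessage',)    # a tuple holding one string
withoutComma = ('MySecretMessage')  # just a string; the parentheses do nothing

print(type(withComma).__name__)     # tuple
print(type(withoutComma).__name__)  # str

received = []  # one entry per call, holding the arguments of that call

def collect(*args):
    # Accept any number of positional arguments and remember them
    received.append(args)

# With the comma: the function receives one argument
t1 = threading.Thread(target=collect, args=withComma)
t1.start()
t1.join()

# Without the comma: the string is unpacked character by character
t2 = threading.Thread(target=collect, args=withoutComma)
t2.start()
t2.join()

print(len(received[0]))  # 1
print(len(received[1]))  # 15
```

A function that expects exactly one argument, like showMessage(), would raise a TypeError inside the thread in the second case.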

.start()

This method on the thread object starts the thread. A thread does not run until start() is called, so we can delay or control when it begins executing. Calling start() on the same line that creates the thread runs it immediately.
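As a sketch of controlling when a thread begins, the thread object can be created first and started later. The variable names here (worker, beforeStart, afterJoin, secondStartFailed) are assumptions made for the example.

```python
import threading, time

def showMessage(msg):
    print(msg.upper(), flush=True)

# Create the thread object, but do not start it yet
worker = threading.Thread(target=showMessage, args=('BringLegos',))
beforeStart = worker.is_alive()  # False: nothing is running yet

time.sleep(0.1)  # the main flow does something else first

worker.start()   # now the thread begins executing showMessage()
worker.join()    # wait for it to finish
afterJoin = worker.is_alive()    # False: the thread has finished

# A thread object can be started only once
try:
    worker.start()
    secondStartFailed = False
except RuntimeError:
    secondStartFailed = True

print(beforeStart, afterJoin, secondStartFailed)
```

To run the same function again, create a new Thread object rather than restarting the old one.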

“Okay, so why do we see All operations complete near the beginning of the output? That should appear after the function is finished, right?”

Yes, but we have taken it out of the flow of normal execution. As soon as the thread starts executing, Python begins executing the next statement it finds, which prints “All operations complete.” Python no longer waits for showMessage() to finish executing before moving on. This is because showMessage() is now a thread that is executing independently of the main flow of code.

Single threaded vs. Multithreaded. With multithreading, showMessage() immediately executes when .start() is called on its thread. The script immediately processes the next statement while the thread continues to execute showMessage().

When the thread completes execution, it stops, and the program exits once no threads remain. In other words, the main flow of execution waits for all (non-daemon) threads to complete before quitting the program.
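This waiting applies to ordinary threads, which are non-daemon by default. A minimal sketch, where slowTask() is a hypothetical stand-in for real work:

```python
import threading, time

def slowTask():
    time.sleep(0.2)  # pretend to work for a moment
    print('slowTask finished', flush=True)

t = threading.Thread(target=slowTask)
t.start()

isDaemon = t.daemon         # False unless we ask for a daemon thread
stillAlive = t.is_alive()   # True: the thread is still sleeping

print(f'daemon={isDaemon}, alive={stillAlive}')
print('Main flow is done; Python waits for slowTask before exiting.')
```

Run as a script, the last line of the main flow prints first, and the program exits only after slowTask() has finished.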

Running Two Threads

What happens when we run two threads at once? To achieve this, just create another thread that calls the same function. Each thread makes its own, separate call to the function.

#!/usr/bin/env python3

import threading, time

def showMessage(msg):
    for c in msg:
        print(c.upper() + '\u2605', end = '', flush = True)
        time.sleep(0.1)
    print()

threading.Thread(target = showMessage, args = ('MySecretMessage',)).start()
threading.Thread(target = showMessage, args = ('BringLegos',)).start()

print('All operations complete.')

Messy, but this is correct operation.

We have three parts of our code competing for the single terminal from which the script is running, so we see scrambled text. This is normal. “All operations complete” appears as before and for the same reason, but now that we have two threads running, both threads print to the terminal whenever they are able to. This causes the output text to be interleaved. Again, this is normal operation for threads. If a thread does not need to output text to a terminal or file like this, then there is nothing to be alarmed about. Usually, threads are written to avoid competing output.
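One common way to avoid competing output is to build each line in full and print it while holding a lock. This is only a sketch under that assumption. The names decorate() and printLock are hypothetical, and the short sleep stands in for real work.

```python
import threading, time

printLock = threading.Lock()  # shared by every thread that prints

def decorate(msg):
    # Build the complete line: uppercase letters separated by stars
    return ''.join(c.upper() + '\u2605' for c in msg)

def showMessage(msg):
    time.sleep(0.05)  # simulated work done outside the lock
    line = decorate(msg)
    # Only one thread at a time may print, so lines cannot interleave
    with printLock:
        print(line, flush=True)

threads = [
    threading.Thread(target=showMessage, args=('MySecretMessage',)),
    threading.Thread(target=showMessage, args=('BringLegos',)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The order of the two lines may still vary from run to run, but each line arrives intact.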

Naming Threads

To help illustrate what is happening, we can assign names to threads and display their names to show which output comes from which thread.

#!/usr/bin/env python3

import threading, time

def showMessage(msg):
    for c in msg:
        print(f'Thread: {threading.current_thread().name:>8}: {c.upper()}\u2605', flush = True)
        time.sleep(0.1)
    print()

# Instantiate threads
firstThread = threading.Thread(target = showMessage, args = ('MySecretMessage',))
secondThread = threading.Thread(target = showMessage, args = ('BringLegos',))

# Assign names to each thread object
firstThread.name = 'Secret'
secondThread.name = 'Legos'

# Start the threads
firstThread.start()
secondThread.start()

print('All operations complete.')

By naming the threads, we can see which thread is printing what text.

We have altered the script to use an f-string for readability and easier formatting.

Each thread object has a name property. We assign it from the main script and read it back with

threading.current_thread().name

within the function to get the name of the thread running that function. Be sure to use the name property instead of the getName() method because getName() has been deprecated (since Python 3.10), and you will receive a warning each time you run the script. Avoid _name (with the underscore) too; it is a private attribute that the Thread class uses internally.

firstThread.name = 'Secret'
secondThread.name = 'Legos'

Assigns a name to each thread.

firstThread.start()
secondThread.start()

Starts the threads.
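A name can also be supplied when the thread is created, using the name keyword argument of Thread(). A small sketch, where report() and seen are hypothetical names for the example:

```python
import threading

seen = []  # names observed from inside the running threads

def report():
    # Read the public name property of the thread running this call
    seen.append(threading.current_thread().name)

# Name the thread at creation time instead of assigning it afterwards
secret = threading.Thread(target=report, name='Secret')
secret.start()
secret.join()

print(seen)         # ['Secret']
print(secret.name)  # Secret
```

Either approach works. Passing name to Thread() simply keeps creation and naming on one line.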

Waiting for Thread Completion

“But the text All operations complete is still out of order. How do we fix that?”

No matter how many threads are started, Python immediately executes the next line of code that prints “All operations complete” and then waits for both threads to finish their executions.

If you notice, one thread will complete before the other because the message is shorter. The script still waits for both threads to complete execution. We can tell Python when to wait for thread completion using the join() method on both thread objects.

#!/usr/bin/env python3

import threading, time

def showMessage(msg):
    for c in msg:
        print(f'Thread: {threading.current_thread().name:>8}: {c.upper()}\u2605', flush = True)
        time.sleep(0.1)
    print()

firstThread = threading.Thread(target = showMessage, args = ('MySecretMessage',))
secondThread = threading.Thread(target = showMessage, args = ('BringLegos',))

firstThread.name = 'Secret'
secondThread.name = 'Legos'

firstThread.start()
secondThread.start()

# Wait for both threads to complete execution
firstThread.join()
secondThread.join()

print('All operations complete.')

Now, Python waits for both threads to finish before continuing with the next statement.

Our diagram now looks like this:

The main thread of execution waits for both threads to complete before executing the remainder of the program.
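With more than a couple of threads, it is common to keep the thread objects in a list and join them in a loop. A sketch of that pattern follows. The list names and the shortened sleep are choices made for the example.

```python
import threading, time

def showMessage(msg):
    for c in msg:
        time.sleep(0.01)  # shortened delay; printing omitted for brevity

messages = ['MySecretMessage', 'BringLegos']

# One thread per message, named after the message it handles
threads = [threading.Thread(target=showMessage, args=(m,), name=m) for m in messages]

for t in threads:
    t.start()

# join() blocks until that thread has finished, so after this loop
# every thread is done
for t in threads:
    t.join()

stillRunning = [t.name for t in threads if t.is_alive()]
print(stillRunning)  # []
print('All operations complete.')
```

join() also accepts an optional timeout in seconds. It always returns None, so check is_alive() afterwards to tell whether the thread finished or the timeout expired.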

Global Interpreter Lock

“Is Python multithreading true multithreading?”

No, not really. Python threads are real operating system threads, but the way the standard interpreter (CPython) runs keeps them from executing Python code in parallel the way C or C++ threads can. Why? Because of this itsy-bitsy little feature called the Global Interpreter Lock (GIL). Also, Python is much, much slower than a true compiled language, such as C, so multithreading will be slower anyway…but that is a different topic.

When we enter the realm of multithreading, we create a plethora of issues that we never need to consider when writing single threaded programs. Issues such as race conditions, locking, thread-safe variables, deadlocks, and semaphores are just a few of the complications that can arise.
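As a taste of one of these issues, here is a sketch of guarding a shared counter with a lock. An update such as counter += 1 is a read-modify-write sequence, so unsynchronized threads can in principle overwrite each other's updates. The names counter, counterLock, and addMany() are hypothetical.

```python
import threading

counter = 0
counterLock = threading.Lock()  # protects every update to counter

def addMany(times):
    global counter
    for _ in range(times):
        # Only one thread at a time may read, add, and write back
        with counterLock:
            counter += 1

threads = [threading.Thread(target=addMany, args=(10000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```

With the lock in place, the total is always 4 x 10000. Without it, the result is not guaranteed.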

To avoid some of these complications inside the interpreter itself, CPython uses the GIL to protect its internal state, such as object reference counts, from being corrupted by threads running at the same time. This keeps an interpreted language like Python simple and safe, but it is not good for speed or true parallel computing optimization. For CPU-bound work, Python multithreading is more of an “illusion”: the threads take turns, and the real gains come from taking advantage of downtime, such as a thread waiting on I/O or in time.sleep(), during which the GIL is released.

The GIL is a mutex (fancy word for a lock) that locks the interpreter into executing a single thread at a time no matter how many cores are available on a physical CPU. Because of this, the GIL is often the culprit behind processor bottlenecks despite our best efforts to write streamlined, multithreaded Python code.
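Threads still help when most of the time is spent waiting, because a waiting thread releases the GIL. The sketch below uses time.sleep() as a stand-in for waiting on I/O. The function name waitABit() and the durations are assumptions for the example.

```python
import threading, time

def waitABit():
    time.sleep(0.3)  # stands in for waiting on a disk, socket, or user

start = time.perf_counter()

threads = [threading.Thread(target=waitABit) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

elapsed = time.perf_counter() - start

# Four sequential calls would need about 1.2 seconds; the four threads
# wait at the same time, so the total stays close to one wait
print(f'{elapsed:.2f} seconds')
```

Replace the sleep with a pure Python calculation and the picture changes: the threads take turns holding the GIL, so the total time is roughly what it would be without threads.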

If you want 100% true multithreading that can take advantage of multi-core CPUs while running as fast as possible, then there is no way around it: you will need to use C, C++, or another language that assumes you are a programming guru who can crunch code in your sleep.

This does not mean that Python multithreading is a pointless endeavor. There will be times when you want Python to do something in the background, maybe as a daemon thread, while waiting for something else to finish or for user input. Rather than bringing the program to a screeching halt with a blinking cursor, for example, multithreading allows Python to do something else behind the scenes. This could allow precaching data while the user completes a form, for example. It could even be a background polling task checking for temperatures. The possibilities are endless, and Python multithreading allows us to write programs where concurrency would be beneficial.
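A background polling task might be sketched as below. Everything here is illustrative: pollTemperature() returns a made-up constant instead of reading a real sensor, and readings and stopPolling are hypothetical names. The thread is marked as a daemon so it would not keep the program alive, and an Event tells it when to stop.

```python
import threading, time

readings = []                    # values gathered in the background
stopPolling = threading.Event()  # set by the main flow to end the loop

def pollTemperature():
    while not stopPolling.is_set():
        readings.append(21.5)    # placeholder value, not a real sensor read
        # Sleep up to 0.05 s, but wake immediately if asked to stop
        stopPolling.wait(0.05)

poller = threading.Thread(target=pollTemperature, name='Poller', daemon=True)
poller.start()

time.sleep(0.2)     # the main flow does its own work here

stopPolling.set()   # ask the background task to finish
poller.join()       # and wait until it has

print(f'Collected {len(readings)} readings in the background.')
```

Using an Event rather than a plain sleep lets the poller stop promptly instead of finishing its full delay first.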

Conclusion

This is meant to be a very brief introduction to the world of multithreading with Python and to show how to write simple threads. The best way to learn is to practice running basic functions as threads, and then look for ways to divide your program into nonessential chunks that can run asynchronously with the main thread as an accomplice.

After a while, you will see issues arise that are not readily apparent during single threaded programming and begin to ask yourself, “Hmm. I never thought that would happen. Why? It makes no sense when my code is correct. How can I solve this?”

Have fun!

This entry was posted on February 23, 2024, 9:27 PM and is filed under Programming.