📅 February 23, 2024
Multithreading allows your Python script to do more than one thing at the same time. This is called concurrency, and it can be used to help make your programs more efficient and speed up their apparent execution…somewhat.
Python provides the threading module, so you can run multiple functions simultaneously. It is pretty cool stuff, but the world of concurrency opens an entirely new set of issues to deal with.
Let’s write a simple Python script that runs the same function multiple times — simultaneously — to show how to perform multithreading in Python.
Non-threaded Code
Below is a Python script that passes a string to a function, converts the string to uppercase, and prints Unicode stars between the letters. When finished, the script displays “All operations complete” and exits.
#!/usr/bin/env python3
import time

def showMessage(msg):
    for c in msg:
        print(c.upper() + '\u2605', end = '', flush = True)
        time.sleep(0.1)
    print()

showMessage('MySecretMessage')
showMessage('BringLegos')
print('All operations complete.')
Okay, that shows us what happens when we run a single-threaded application. With one thread, there is a single “thread” of execution. A call to the showMessage() function must complete before the next call can begin. The final message only appears at the end of the output, after all preceding statements have finished executing. This is the plain sequential execution that we are familiar with. So far, so good.
Multi-threaded Code
But what happens if we revise our script in a way that turns the call to showMessage() into a thread?
#!/usr/bin/env python3
import threading, time

def showMessage(msg):
    for c in msg:
        print(c.upper() + '\u2605', end = '', flush = True)
        time.sleep(0.1)
    print()

threading.Thread(target = showMessage, args = ('MySecretMessage',)).start()
print('All operations complete.')
Import the threading module, and then create a thread object using Thread(). The parameters specify how the thread should behave.
target = showMessage
target is a function to call. Make sure to use only the function name because we need to pass the function object to threading. Do not use parentheses like this:
target = showMessage() # No, no, no!
args = ('MySecretMessage', )
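Since the function showMessage() requires an argument, we specify the argument that we need to pass to the function here. Notice the trailing comma? Arguments are passed as a tuple, and the trailing comma is what makes this a one-element tuple. Omitting it is not a syntax error: ('MySecretMessage') is just a string in parentheses, so the thread would fail at runtime with a TypeError because each character of the string is passed as a separate argument.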
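To see what the trailing comma changes, here is a minimal sketch comparing the two spellings. The variable names without_comma and with_comma are only illustrative and are not part of the script above.

```python
# Parentheses alone do not create a tuple; the trailing comma does.
without_comma = ('MySecretMessage')
with_comma = ('MySecretMessage',)

print(type(without_comma).__name__)  # str
print(type(with_comma).__name__)     # tuple
print(len(with_comma))               # 1, a single argument for the target function
```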
.start()
This method on the thread object starts the thread. A thread does not run until start() is called, so we can control when it begins executing. Chaining start() onto Thread(), as we do here, runs the thread as soon as it is created.
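As a rough illustration of creating a thread first and starting it later, the sketch below checks is_alive() at three points. The worker() function is a hypothetical stand-in for real work, and join() (covered later in this article) is used only to wait for the thread to finish.

```python
import threading
import time

def worker():
    # Hypothetical stand-in for real work
    time.sleep(0.1)

t = threading.Thread(target=worker)

alive_before_start = t.is_alive()  # False: the object exists but nothing is running yet
t.start()                          # worker() begins executing in its own thread here
alive_after_start = t.is_alive()   # True while worker() is still sleeping
t.join()                           # block until worker() returns
alive_after_join = t.is_alive()    # False: the thread has finished

print(alive_before_start, alive_after_start, alive_after_join)
```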
“Okay, so why do we see All operations complete near the beginning of the output? That should appear after the function is finished, right?”
Yes, but we have taken it out of the flow of normal execution. As soon as the thread starts executing, Python begins executing the next statement it finds, which prints “All operations complete.” Python no longer waits for showMessage() to finish executing before moving on. This is because showMessage() is now a thread that is executing independently of the main flow of code.
When the thread's function returns, the thread ends. The program itself does not exit until the main flow of execution and every (non-daemon) thread have finished, which means that Python waits for all of the threads to complete before quitting the program.
Running Two Threads
What happens when we run two threads at once? To achieve this, just create another thread that calls the same function. Python treats the two as separate function calls, each running in its own thread.
#!/usr/bin/env python3
import threading, time

def showMessage(msg):
    for c in msg:
        print(c.upper() + '\u2605', end = '', flush = True)
        time.sleep(0.1)
    print()

threading.Thread(target = showMessage, args = ('MySecretMessage',)).start()
threading.Thread(target = showMessage, args = ('BringLegos',)).start()
print('All operations complete.')
We have three parts of our code competing for the lone terminal output from which the script is running, so we see scrambled text. This is normal. “All operations complete” appears as before and for the same reason, but now that we have two threads running, both threads are printing to the terminal when they are able to. This causes the outputted text to be interleaved. Again, this is normal operation for threads. If a thread does not need to output text to a terminal or file like this, then there is nothing to be alarmed about. Usually, threads are written to avoid competing output.
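One way to avoid competing output, sketched here under the assumption that each message may be printed as a whole line rather than one character at a time, is to build the line inside the thread and print it while holding a lock. The names print_lock and finished are hypothetical additions for this sketch.

```python
import threading
import time

print_lock = threading.Lock()  # hypothetical name; shared by every thread
finished = []                  # records each completed line for inspection

def showMessage(msg):
    parts = []
    for c in msg:
        parts.append(c.upper() + '\u2605')
        time.sleep(0.01)
    line = ''.join(parts)
    # Only one thread can hold the lock at a time, so lines never mix.
    with print_lock:
        print(line)
        finished.append(line)

threads = [
    threading.Thread(target=showMessage, args=('MySecretMessage',)),
    threading.Thread(target=showMessage, args=('BringLegos',)),
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

The trade-off is that the letters no longer appear gradually; each thread's text shows up all at once when that thread finishes.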
Naming Threads
To help illustrate what is happening, we can assign names to threads and display their names to show which output comes from which thread.
#!/usr/bin/env python3
import threading, time

def showMessage(msg):
    for c in msg:
        print(f'Thread: {threading.current_thread().name:>8}: {c.upper()}\u2605', flush = True)
        time.sleep(0.1)
    print()

# Instantiate threads
firstThread = threading.Thread(target = showMessage, args = ('MySecretMessage',))
secondThread = threading.Thread(target = showMessage, args = ('BringLegos',))

# Assign names to each thread object
firstThread.name = 'Secret'
secondThread.name = 'Legos'

# Start the threads
firstThread.start()
secondThread.start()
print('All operations complete.')

We have altered the script to use an f-string for readability and easier formatting.
Each thread has a name property that we can both assign and read. We use
threading.current_thread().name
within the function to get the name of the thread running that function. Be sure to use the name property instead of the getName() method because getName() has been deprecated since Python 3.10, and you will receive a warning each time you run the script. (The underscored _name attribute holds the same value, but it is a private implementation detail, so prefer name.)
firstThread.name = 'Secret'
secondThread.name = 'Legos'
Assigns a name to each thread.
firstThread.start()
secondThread.start()
Starts the threads.
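As an alternative sketch, a name can also be supplied when the thread is created by using the name parameter of Thread(). The report() function and the seen_names list are hypothetical helpers used only to capture the result.

```python
import threading

seen_names = []

def report():
    # current_thread() returns the Thread object that is running this function
    seen_names.append(threading.current_thread().name)

t = threading.Thread(target=report, name='Secret')
t.start()
t.join()

print(seen_names)  # ['Secret']
```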
Waiting for Thread Completion
“But the text All operations complete is still out of order. How do we fix that?”
Notice that one thread completes before the other because its message is shorter. The script still waits for both threads to complete execution before it exits. We can tell Python where to wait for thread completion by calling the join() method on both thread objects.
#!/usr/bin/env python3
import threading, time

def showMessage(msg):
    for c in msg:
        print(f'Thread: {threading.current_thread().name:>8}: {c.upper()}\u2605', flush = True)
        time.sleep(0.1)
    print()

firstThread = threading.Thread(target = showMessage, args = ('MySecretMessage',))
secondThread = threading.Thread(target = showMessage, args = ('BringLegos',))

firstThread.name = 'Secret'
secondThread.name = 'Legos'

firstThread.start()
secondThread.start()

# Wait for both threads to complete execution
firstThread.join()
secondThread.join()

print('All operations complete.')
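Our diagram now looks like this: the main thread starts both threads, blocks at each join() call until that thread has finished, and only then prints “All operations complete.”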
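The same start-then-join pattern extends to any number of threads by keeping them in a list. This is a minimal sketch: the events list and the extra 'Hi' message are hypothetical additions used to record the order in which things happen.

```python
import threading
import time

events = []  # hypothetical log of what finished, in order

def showMessage(msg):
    time.sleep(0.01 * len(msg))  # longer messages take longer, as in the article
    events.append(msg)

messages = ['MySecretMessage', 'BringLegos', 'Hi']
threads = [threading.Thread(target=showMessage, args=(m,)) for m in messages]

for t in threads:
    t.start()
for t in threads:
    t.join()  # returns immediately if that thread has already finished

events.append('All operations complete.')
print(events)
```

Starting every thread before joining any of them matters. Calling join() right after each start() would make the threads run one after another.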
Global Interpreter Lock
“Is Python multithreading true multithreading?”
No, not really. Due to the way the standard Python interpreter (CPython) runs, it cannot execute threads in parallel the way C or C++ can. Why? Because of this itsy-bitsy little feature called the Global Interpreter Lock (GIL). Also, Python is much, much slower than a true compiled language, such as C, so multithreading will be slower anyway…but that is a different topic.
When we enter the realm of multithreading, we create a plethora of issues that we never need to consider when writing single-threaded programs. Race conditions, locking, thread-safe variables, deadlocks, and semaphores are just a few of the topics that come up.
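To make one of these issues concrete, here is a small sketch of locking. Several threads update a shared counter, and a threading.Lock ensures that each read-modify-write step is done by one thread at a time. The names counter, counter_lock, and add_many are hypothetical.

```python
import threading

counter = 0
counter_lock = threading.Lock()

def add_many(n):
    global counter
    for _ in range(n):
        # Without the lock, two threads could read the same old value
        # and one of the increments could be lost (a race condition).
        with counter_lock:
            counter += 1

threads = [threading.Thread(target=add_many, args=(10_000,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(counter)  # 40000
```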
To avoid some of these complications inside the interpreter itself, Python implements the GIL, which keeps threads from corrupting the interpreter's internal state (such as the reference counts it uses for memory management). This is a good thing for an interpreted language like Python, but not good for speed or true parallel computing optimization. Python multithreading is more of an “illusion” of simultaneous execution: threads take turns, and one thread gets to run while another is idle, for example while it sleeps or waits for input or output.
The GIL is a mutex (fancy word for a lock) that locks the interpreter into executing a single thread at a time no matter how many cores are available on a physical CPU. Because of this, the GIL is often the culprit behind processor bottlenecks despite our best efforts to write streamlined, multithreaded Python code.
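Threads still pay off when they spend most of their time waiting, because a sleeping or waiting thread releases the GIL. The sketch below uses time.sleep() as a stand-in for I/O. The function name wait_for_something is hypothetical, and the exact timing will vary from machine to machine.

```python
import threading
import time

def wait_for_something():
    time.sleep(0.2)  # stand-in for waiting on a network reply or a disk read

start = time.perf_counter()

threads = [threading.Thread(target=wait_for_something) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

elapsed = time.perf_counter() - start
# Run one after another, four waits would take about 0.8 seconds.
# As threads, the waits overlap, so the total is much closer to 0.2 seconds.
print(f'{elapsed:.2f} seconds')
```

Replacing the sleep with a pure-Python calculation would not show the same gain, since only one thread can execute Python code at a time.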
If you want 100% true multithreading that can take advantage of multi-core CPUs while running as fast as possible, then there is no way around it: you will need to use C, C++, or another language that assumes you are a programming guru who can crunch code in your sleep.
This does not mean that Python multithreading is a pointless endeavor. There will be times when you want Python to do something in the background, maybe as a daemon thread, while waiting for something else to finish or for user input. Rather than bringing the program to a screeching halt with a blinking cursor, for example, multithreading allows Python to do something else behind the scenes. This could allow precaching data while the user completes a form. It could even be a background polling task checking for temperatures. The possibilities are endless, and Python multithreading allows us to write programs where concurrency would be beneficial.
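A background polling task along these lines might be sketched as follows. Everything here is an assumption for illustration: read_temperature() is a hypothetical function that returns a fixed value instead of talking to a real sensor, and a threading.Event is used to tell the poller when to stop.

```python
import threading
import time

readings = []
stop_polling = threading.Event()

def read_temperature():
    # Hypothetical sensor read; a real program would query hardware here
    return 21.5

def poll_temperature(interval):
    # Event.wait() returns False when the timeout expires
    # and True once the event has been set.
    while not stop_polling.wait(interval):
        readings.append(read_temperature())

# daemon=True means this thread will not keep the program alive on its own
poller = threading.Thread(target=poll_temperature, args=(0.02,), daemon=True)
poller.start()

time.sleep(0.2)      # the main thread would do its real work here
stop_polling.set()   # ask the poller to finish
poller.join()

print(len(readings), 'readings collected')
```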
Conclusion
This is meant to be a very brief introduction to the world of multithreading with Python and to show how to write simple threads. The best way to learn is to practice running basic functions as threads, and then look for ways to divide your program into nonessential chunks that can run asynchronously with the main thread as an accomplice.
After a while, you will see issues arise that are not readily apparent during single-threaded programming and begin to ask yourself, “Hmm. I never thought that would happen. Why? It makes no sense when my code is correct. How can I solve this?”
Have fun!