Modern applications often need to perform multiple tasks simultaneously to improve efficiency. Python provides three primary ways to achieve this:
Threads – Lightweight tasks running in the same process
Multiprocessing – Running tasks in parallel using multiple CPU cores
Async Programming – Handling I/O-bound operations efficiently
Understanding when and how to use these techniques will help you write more efficient Python programs. Let’s dive in!
1️⃣ Understanding Concurrency vs. Parallelism
Concurrency means making progress on multiple tasks over the same period of time, without necessarily executing them at the same instant.
Parallelism means executing multiple tasks simultaneously by utilizing multiple CPU cores.
Feature | Concurrency | Parallelism |
---|---|---|
Execution | Tasks take turns, pausing while they wait | Tasks run simultaneously |
CPU Cores Used | Works even on a single core | Requires multiple CPU cores |
Suitable For | I/O-bound tasks | CPU-bound tasks |
2️⃣ Using Threads in Python (Concurrency for I/O-Bound Tasks)
Threads allow multiple operations to run concurrently within a single process.
Python provides the threading module for creating and managing threads.
Basic Thread Example
```python
import threading

def print_numbers():
    for i in range(5):
        print(i)

thread = threading.Thread(target=print_numbers)
thread.start()  # Start the thread
thread.join()   # Wait for the thread to complete
```
Key Takeaways:
Threads are useful for I/O-bound tasks (e.g., network requests, file operations).
In CPython, the Global Interpreter Lock (GIL) lets only one thread execute Python bytecode at a time, so threads do not speed up CPU-bound code.
Use join() to ensure a thread finishes before proceeding.
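To make the I/O-bound point concrete, here is a minimal sketch that compares running four simulated waits one after another with running them in threads. The `fake_io` helper is a hypothetical stand-in that uses `time.sleep` in place of a real network or file operation, so the numbers only illustrate the pattern.

```python
import threading
import time


def fake_io(delay):
    """Hypothetical stand-in for an I/O wait such as a network call."""
    time.sleep(delay)


def run_sequentially(n, delay):
    """Run n simulated I/O waits one after another; return elapsed seconds."""
    start = time.perf_counter()
    for _ in range(n):
        fake_io(delay)
    return time.perf_counter() - start


def run_with_threads(n, delay):
    """Run n simulated I/O waits in separate threads; return elapsed seconds."""
    start = time.perf_counter()
    threads = [threading.Thread(target=fake_io, args=(delay,)) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()  # Wait for every thread before stopping the clock
    return time.perf_counter() - start


sequential = run_sequentially(4, 0.1)
threaded = run_with_threads(4, 0.1)
print(f"sequential: {sequential:.2f}s, threaded: {threaded:.2f}s")
```

Because a sleeping thread releases the GIL, the waits overlap and the threaded run should take roughly as long as a single wait rather than the sum of all four.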
3️⃣ Multiprocessing in Python (Parallelism for CPU-Bound Tasks)
For CPU-intensive tasks, Python provides the multiprocessing module to run code on multiple CPU cores.
Basic Multiprocessing Example
```python
import multiprocessing

def square_number(n):
    print(n * n)

# The guard is needed on platforms that spawn a fresh interpreter (Windows, macOS)
if __name__ == "__main__":
    process = multiprocessing.Process(target=square_number, args=(5,))
    process.start()
    process.join()  # Ensures process completes before moving forward
```
Why Use Multiprocessing?
Each process has its own interpreter and its own GIL, so processes can run truly in parallel on separate cores.
Ideal for CPU-heavy tasks like image processing and data analysis.
4️⃣ Using ThreadPoolExecutor for Simpler Thread Management
The concurrent.futures module provides a simpler way to manage threads.
```python
from concurrent.futures import ThreadPoolExecutor

def fetch_data(site):
    print(f"Fetching data from {site}")

sites = ["site1.com", "site2.com", "site3.com"]

with ThreadPoolExecutor(max_workers=3) as executor:
    executor.map(fetch_data, sites)  # Run tasks concurrently
```
Key Benefits of ThreadPoolExecutor:
Manages threads efficiently.
Easier than manually starting/stopping threads.
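When each task returns a value, `submit()` together with `as_completed()` is one way to collect results as they finish. The sketch below assumes a hypothetical `fetch_length` function that sleeps instead of making a real request; the site names are the same placeholders used above.

```python
import time
from concurrent.futures import ThreadPoolExecutor, as_completed


def fetch_length(site):
    """Hypothetical stand-in for a network call: sleeps, then returns a value."""
    time.sleep(0.1)
    return len(site)


sites = ["site1.com", "site2.com", "site3.com"]
lengths = {}

with ThreadPoolExecutor(max_workers=3) as executor:
    # submit() returns a Future immediately; map each Future back to its input
    futures = {executor.submit(fetch_length, site): site for site in sites}
    for future in as_completed(futures):
        site = futures[future]
        try:
            lengths[site] = future.result()  # Re-raises any exception from the worker
        except Exception as exc:
            print(f"{site} failed: {exc}")

print(lengths)
```

Leaving the `with` block waits for all pending tasks, so no explicit `join()` calls are needed.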
5️⃣ Using ProcessPoolExecutor for Multiprocessing
Similar to ThreadPoolExecutor but runs tasks in separate processes.
```python
from concurrent.futures import ProcessPoolExecutor

def square(n):
    return n * n

numbers = [1, 2, 3, 4, 5]

# The guard is needed on platforms that spawn a fresh interpreter (Windows, macOS)
if __name__ == "__main__":
    with ProcessPoolExecutor() as executor:
        results = executor.map(square, numbers)
    print(list(results))
```
When to Use?
ThreadPoolExecutor → Best for I/O-bound tasks.
ProcessPoolExecutor → Best for CPU-bound tasks.
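As a rough illustration of the CPU-bound case, the sketch below hands a deliberately slow prime-counting function to a process pool. The `count_primes` helper and the chosen limits are assumptions made for this example, not a benchmark.

```python
from concurrent.futures import ProcessPoolExecutor


def count_primes(limit):
    """Count primes below `limit` by trial division (deliberately CPU-heavy)."""
    count = 0
    for n in range(2, limit):
        if all(n % d for d in range(2, int(n ** 0.5) + 1)):
            count += 1
    return count


def main():
    limits = [20_000, 30_000, 40_000, 50_000]
    # Each limit goes to a worker process, so the work can spread across
    # CPU cores instead of contending for a single GIL.
    with ProcessPoolExecutor() as executor:
        for limit, total in zip(limits, executor.map(count_primes, limits)):
            print(f"primes below {limit}: {total}")


if __name__ == "__main__":
    main()
```

Swapping `ProcessPoolExecutor` for `ThreadPoolExecutor` here would still produce correct results, but in CPython the threads would take turns on the GIL rather than running in parallel.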
6️⃣ Async Programming in Python (asyncio)
For high-performance I/O-bound applications, Python provides async programming using asyncio.
Basic Async Example
```python
import asyncio

async def say_hello():
    print("Hello, World!")
    await asyncio.sleep(1)  # Simulates an I/O operation
    print("Goodbye!")

asyncio.run(say_hello())
```
Why Use Async Programming?
Can handle thousands of concurrent I/O-bound tasks in a single thread.
Best for network requests, web scraping, and database queries.
Uses an event loop to switch between tasks while they wait on I/O, instead of blocking.
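A single coroutine does not show much on its own, so here is a small sketch that runs three of them together with `asyncio.gather()`. The `fake_request` coroutine is a hypothetical stand-in that sleeps instead of doing real I/O.

```python
import asyncio
import time


async def fake_request(name, delay):
    """Hypothetical stand-in for a network call."""
    await asyncio.sleep(delay)  # Yields control to the event loop while waiting
    return f"{name} done"


async def main():
    # gather() runs the coroutines concurrently on one event loop and
    # returns their results in the order they were passed in.
    return await asyncio.gather(
        fake_request("a", 0.3),
        fake_request("b", 0.2),
        fake_request("c", 0.1),
    )


start = time.perf_counter()
results = asyncio.run(main())
elapsed = time.perf_counter() - start
print(results, f"{elapsed:.2f}s")
```

The total time should be close to the longest single delay, not the sum of all three, because the waits overlap.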
7️⃣ Combining asyncio with aiohttp for Async HTTP Requests
A common use case for async programming is making multiple API calls concurrently.
```python
import asyncio
import aiohttp

async def fetch(session, url):
    async with session.get(url) as response:
        return await response.text()

async def main():
    urls = ["https://example.com", "https://example.org"]
    # Share one ClientSession so connections can be pooled and reused
    async with aiohttp.ClientSession() as session:
        tasks = [fetch(session, url) for url in urls]
        results = await asyncio.gather(*tasks)
    print(results)

asyncio.run(main())
```
Key Benefits:
Typically faster than issuing the same calls one after another with the blocking requests library.
Non-blocking execution lets other requests proceed while one is waiting on the network.
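With many calls in flight, one failing request should not necessarily discard the rest. One option is passing `return_exceptions=True` to `asyncio.gather()`, sketched below. To keep the example runnable without a network, `fake_fetch` is a hypothetical stand-in for an aiohttp request, and the `bad.example` URL is made up to trigger a failure.

```python
import asyncio


async def fake_fetch(url):
    """Hypothetical stand-in for an aiohttp request; fails for one URL."""
    await asyncio.sleep(0.05)
    if "bad" in url:
        raise ConnectionError(f"could not reach {url}")
    return f"body of {url}"


async def main():
    urls = ["https://example.com", "https://bad.example", "https://example.org"]
    # return_exceptions=True places exceptions in the result list instead of
    # raising the first one out of gather().
    results = await asyncio.gather(
        *(fake_fetch(url) for url in urls), return_exceptions=True
    )
    pages = {u: r for u, r in zip(urls, results) if not isinstance(r, Exception)}
    errors = {u: r for u, r in zip(urls, results) if isinstance(r, Exception)}
    return pages, errors


pages, errors = asyncio.run(main())
print(f"{len(pages)} succeeded, {len(errors)} failed")
```

Results come back in the same order as the inputs, which is what makes the `zip(urls, results)` pairing safe.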
8️⃣ Choosing the Right Approach for Your Application
Scenario | Best Approach |
---|---|
Making multiple API calls | asyncio with aiohttp |
Reading large files | Threads (ThreadPoolExecutor) |
CPU-intensive computations | Multiprocessing (ProcessPoolExecutor) |
Running background tasks | Threads |
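For the background-task row, a minimal sketch might look like the following: a daemon thread does periodic work while the main thread carries on, and a `threading.Event` tells it when to stop. The `heartbeat` function and the intervals are assumptions chosen for illustration.

```python
import threading
import time

stop_event = threading.Event()
heartbeats = []


def heartbeat(interval):
    """Record a timestamp every `interval` seconds until asked to stop."""
    while not stop_event.is_set():
        heartbeats.append(time.monotonic())
        # wait() acts as an interruptible sleep: it returns early once set() is called
        stop_event.wait(interval)


worker = threading.Thread(target=heartbeat, args=(0.05,), daemon=True)
worker.start()

time.sleep(0.2)         # The main thread does its own work here
stop_event.set()        # Signal the background thread to finish
worker.join(timeout=1)  # Give it a moment to exit cleanly

print(f"recorded {len(heartbeats)} heartbeats")
```

Marking the thread as a daemon means it will not keep the program alive if the main thread exits first; the event gives it a chance to shut down in an orderly way instead.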
Conclusion
Threads → Best for I/O-bound tasks (network requests, file handling).
Multiprocessing → Best for CPU-heavy tasks (data processing, machine learning).
Async Programming → Ideal for high-performance I/O tasks (web scraping, API calls).
Mastering these concepts helps write efficient and scalable Python programs.
What’s Next?
In the next post, we’ll explore Error Handling and Debugging in Python, covering best practices, logging, and debugging tools. Stay tuned!
What Do You Think?
Have you used multiprocessing or asyncio in your projects? Let’s discuss in the comments!
Original article: Concurrency and Parallelism in Python – Threads, Multiprocessing, and Async Programming