Introduction
Multiprocessing is a technique that allows you to run multiple processes concurrently, each with its own Python interpreter and memory space.
Why Use Multiprocessing?
- Efficiency: Speeds up CPU-bound tasks by spreading work across multiple CPU cores. Because each process runs its own interpreter, the work is not serialized by the global interpreter lock (GIL).
- Isolation: Each process has its own memory space, so one process cannot accidentally overwrite another's data.
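The isolation point can be seen in a small experiment. This is only a sketch: the names `counter`, `bump`, and `run_isolation_demo` are made up for illustration. A child process increments a module-level variable and reports what it sees, while the parent's copy stays unchanged.

```python
from multiprocessing import Process, Queue

counter = 0  # module-level state; every process works on its own copy


def bump(results):
    """Increment the child's copy of `counter` and report the new value."""
    global counter
    counter += 1
    results.put(counter)


def run_isolation_demo():
    """Return (value seen in the child, value still seen in the parent)."""
    results = Queue()
    child = Process(target=bump, args=(results,))
    child.start()
    seen_in_child = results.get()  # read the report before joining
    child.join()
    return seen_in_child, counter


if __name__ == "__main__":
    print(run_isolation_demo())  # (1, 0)
```

The child sees 1, but the parent still sees 0, which is why shared state needs explicit tools such as a Queue or a Value.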
Key Concepts
- Process: An independent execution unit with its own memory space.
- Pool: A collection of worker processes used to execute tasks in parallel.
- Queue: A thread- and process-safe data structure used for inter-process communication.
- Lock: A synchronization primitive used to prevent race conditions when processes access shared resources.
When to Use Multiprocessing
- CPU-bound Tasks: Tasks that require significant CPU resources, like image processing, scientific computations and simulations.
- Data Processing: Large-scale data processing tasks that can be divided into smaller, independent tasks.
- Parallel Execution: Any application that can benefit from parallel execution across multiple CPU cores.
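As a rough illustration of dividing a large job into smaller, independent tasks, the sketch below splits a list into chunks and lets a Pool process each chunk separately. The helper names (`chunked`, `sum_of_squares`, `parallel_sum_of_squares`) and the chunk and worker counts are assumptions chosen for the example, not part of the multiprocessing module.

```python
from multiprocessing import Pool


def chunked(items, size):
    """Split a list into consecutive slices of at most `size` items."""
    return [items[i:i + size] for i in range(0, len(items), size)]


def sum_of_squares(chunk):
    """CPU-bound work on one independent chunk."""
    return sum(x * x for x in chunk)


def parallel_sum_of_squares(items, chunk_size=250, workers=4):
    """Process each chunk in a worker process, then combine the partial sums."""
    with Pool(processes=workers) as pool:
        partials = pool.map(sum_of_squares, chunked(items, chunk_size))
    return sum(partials)


if __name__ == "__main__":
    data = list(range(1000))
    print(parallel_sum_of_squares(data))  # 332833500
```

For a workload this small, the cost of starting processes outweighs the gain; the pattern pays off when each chunk takes noticeable CPU time.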
Module
Creating a Process: You can create a process using the Process class from the multiprocessing module.
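    from multiprocessing import Process

    def print_numb():
        print("-----Hello-----")

    if __name__ == "__main__":
        # The guard stops child processes from re-running this block on
        # platforms that start processes with "spawn" (Windows, macOS).
        process = Process(target=print_numb)
        process.start()  # launch the child process
        process.join()   # wait for it to finish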
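A target function usually needs inputs. The following sketch, with the hypothetical names `greet` and `run_greeting`, passes positional and keyword arguments to a Process and then inspects its `name` and `exitcode` attributes once it has finished.

```python
from multiprocessing import Process


def greet(name, punctuation="!"):
    print(f"Hello, {name}{punctuation}")


def run_greeting():
    """Start one child with arguments and return its name and exit code."""
    process = Process(
        target=greet,
        args=("world",),                # positional arguments as a tuple
        kwargs={"punctuation": "?"},    # keyword arguments as a dict
        name="greeter",
    )
    process.start()
    process.join()
    # exitcode is 0 when the target returned normally
    return process.name, process.exitcode


if __name__ == "__main__":
    print(run_greeting())  # ('greeter', 0)
```

Note the trailing comma in `args=("world",)`: without it, the parentheses would not create a tuple.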
Using Process Pool: The Pool class allows you to manage a pool of worker processes to perform tasks in parallel.
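    from multiprocessing import Pool

    def addition(n):
        return n + 3

    if __name__ == "__main__":
        num = [1, 2, 3, 4, 5]
        # With no argument, Pool() sizes itself from the number of available CPUs.
        with Pool() as pool:
            result = pool.map(addition, num)
        print(result)  # [4, 5, 6, 7, 8]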
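When the worker function takes more than one argument, `Pool.starmap` can unpack a tuple per task. This is a minimal sketch; `scale_and_shift`, `run_starmap`, and the job list are invented for the example.

```python
from multiprocessing import Pool


def scale_and_shift(n, factor, offset):
    return n * factor + offset


def run_starmap(workers=2):
    """Run one task per tuple; each tuple is unpacked into the arguments."""
    jobs = [(1, 2, 3), (2, 2, 3), (3, 2, 3)]
    with Pool(processes=workers) as pool:
        # Results come back in the same order as the jobs
        return pool.starmap(scale_and_shift, jobs)


if __name__ == "__main__":
    print(run_starmap())  # [5, 7, 9]
```

Passing `processes=workers` fixes the pool size explicitly, which can be useful when the default of one worker per CPU is more than the task needs.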
Inter-Process Communication with Queues: Processes can communicate with each other using a Queue.
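    from multiprocessing import Process, Queue

    def producer(queue):
        for i in range(5):
            queue.put(i)   # send each item to the consumer
        queue.put(None)    # sentinel: no more items

    def consumer(queue):
        # queue.empty() is unreliable across processes, so read until the sentinel
        while (item := queue.get()) is not None:
            print(item)

    if __name__ == "__main__":
        queue = Queue()
        pp = Process(target=producer, args=(queue,))
        cp = Process(target=consumer, args=(queue,))
        pp.start()
        cp.start()  # run both at once so items are consumed as they arrive
        pp.join()
        cp.join()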
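Queues also work in both directions. The sketch below assumes a common pattern rather than anything prescribed by the module: a task queue feeds several workers, a second queue carries results back, and one `None` sentinel per worker tells it to stop. The names `STOP`, `worker`, and `square_all` are hypothetical.

```python
from multiprocessing import Process, Queue

STOP = None  # sentinel value (an assumption of this sketch)


def worker(tasks, results):
    # iter(callable, sentinel) keeps calling tasks.get() until it returns STOP
    for n in iter(tasks.get, STOP):
        results.put((n, n * n))


def square_all(numbers, n_workers=2):
    """Square each number in a worker process and return {n: n squared}."""
    tasks, results = Queue(), Queue()
    workers = [Process(target=worker, args=(tasks, results))
               for _ in range(n_workers)]
    for w in workers:
        w.start()
    for n in numbers:
        tasks.put(n)
    for _ in workers:
        tasks.put(STOP)  # one sentinel per worker
    # Collect one result per task before joining the workers
    squares = dict(results.get() for _ in numbers)
    for w in workers:
        w.join()
    return squares


if __name__ == "__main__":
    print(sorted(square_all([0, 1, 2, 3, 4]).items()))
```

Results arrive in whatever order the workers finish, so each result carries its input `n` to let the parent match them up.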
Synchronizing Processes with Locks: Race conditions occur when multiple processes access and modify a shared resource (a file, a database connection, etc.) at the same time. To avoid them, use a Lock so that only one process touches the resource at a time.
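    from multiprocessing import Process, Lock, Value

    def increment(counter, lock):
        # Hold the lock for the whole read-modify-write step
        with lock:
            counter.value += 1

    if __name__ == "__main__":
        # A plain global int is copied into each child, so the parent would
        # never see the increments. A Value lives in shared memory instead;
        # lock=False leaves the locking to the explicit Lock below.
        counter = Value('i', 0, lock=False)
        lock = Lock()
        processes = []
        for _ in range(10):
            process = Process(target=increment, args=(counter, lock))
            processes.append(process)
            process.start()
        for process in processes:
            process.join()
        print(f"Final counter: {counter.value}")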
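Since files are one of the shared resources mentioned above, here is a hedged sketch of several processes appending to the same file under a Lock, so that each line is written as a whole. The functions `append_lines` and `write_log` and the log line format are illustrative assumptions.

```python
import os
import tempfile
from multiprocessing import Lock, Process


def append_lines(path, lock, worker_id, count):
    for i in range(count):
        # Only one process may open and write the file at a time
        with lock:
            with open(path, "a") as f:
                f.write(f"worker {worker_id} line {i}\n")


def write_log(path, n_workers=4, lines_each=5):
    """Let several processes append to one file; return the lines written."""
    lock = Lock()
    workers = [Process(target=append_lines, args=(path, lock, w, lines_each))
               for w in range(n_workers)]
    for w in workers:
        w.start()
    for w in workers:
        w.join()
    with open(path) as f:
        return f.read().splitlines()


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        lines = write_log(os.path.join(tmp, "shared.log"))
        print(len(lines))  # 20
```

The order of lines from different workers is not fixed, but every line arrives intact because the lock is passed to each process and held for the full open-write-close step.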
Sharing State Between Processes: You can share a single number between processes using Value, a ctypes object allocated in shared memory that comes with its own lock.
    from multiprocessing import Process, Value

    def increment(shared_value):
        # get_lock() returns the lock that guards this Value
        with shared_value.get_lock():
            shared_value.value += 1

    if __name__ == "__main__":
        shared_value = Value('i', 0)  # 'i' = signed C int, initial value 0
        processes = [Process(target=increment, args=(shared_value,)) for _ in range(10)]
        for p in processes:
            p.start()
        for p in processes:
            p.join()
        print(f"Final value: {shared_value.value}")
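For more than one number, the module also provides Array, which behaves like a fixed-size shared list of one C type. The sketch below is an illustration only, with the made-up names `record_square` and `fill_squares`; each process writes to its own slot, so the writes never collide.

```python
from multiprocessing import Array, Process


def record_square(shared, index):
    # Each process writes only to its own index
    shared[index] = index * index


def fill_squares(n=5):
    """Have n processes each fill one slot of a shared array of C ints."""
    shared = Array("i", n)  # n zero-initialised signed ints
    processes = [Process(target=record_square, args=(shared, i))
                 for i in range(n)]
    for p in processes:
        p.start()
    for p in processes:
        p.join()
    return shared[:]  # copy the shared contents into a normal list


if __name__ == "__main__":
    print(fill_squares())  # [0, 1, 4, 9, 16]
```

If several processes had to update the same slot, the update should be wrapped in `with shared.get_lock():`, as with Value.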
More Information
multiprocessing — Process-based parallelism — Python 3.13.1 documentation