Multiprocessing in Python

Introduction

Multiprocessing is a technique that allows you to run multiple processes concurrently, each with its own Python interpreter and memory space.

Why Use Multiprocessing?

  • Efficiency: Greatly improves the performance of CPU-bound tasks by leveraging multi-core processors.
  • Isolation: Each process has its own memory space, reducing the risk of memory-related bugs.

Key Concepts

  • Process:  An independent execution unit with its own memory space.
  • Pool: A collection of worker processes used to execute tasks in parallel.
  • Queue: A thread- and process-safe data structure used for inter-process communication.
  • Lock: A synchronization primitive used to prevent race conditions when processes access shared resources.

When to Use Multiprocessing

  • CPU-bound Tasks: Tasks that require significant CPU resources, such as image processing, scientific computations, and simulations (see the sketch after this list).
  • Data Processing: Large-scale data processing tasks that can be divided into smaller, independent tasks.
  • Parallel Execution: Any application that can benefit from parallel execution across multiple CPU cores.
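
As a quick illustration, here is a minimal sketch (not from the original article; the function name sum_of_squares is made up) of a CPU-bound computation split across worker processes using the Pool class covered later in this article:

from multiprocessing import Pool

def sum_of_squares(limit):
    # Pure CPU work with no I/O, so extra cores genuinely help
    return sum(i * i for i in range(limit))

if __name__ == "__main__":
    limits = [2_000_000] * 4
    with Pool() as pool:
        results = pool.map(sum_of_squares, limits)  # each chunk runs in its own process
    print(results)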

The multiprocessing Module

Creating a Process: You can create a process using the Process class from the multiprocessing module.

from multiprocessing import Process

def print_message():
    print("-----Hello-----")

if __name__ == "__main__":
    # Create the process, start it, and wait for it to finish
    process = Process(target=print_message)
    process.start()
    process.join()
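
Note that the process-creation code runs under an if __name__ == "__main__": guard. On platforms that start child processes with the spawn method (Windows, and macOS by default), the child re-imports the main module, and without the guard it would try to create processes again recursively. The same guard appears in the other examples in this article.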

Using a Process Pool: The Pool class lets you manage a pool of worker processes that execute tasks in parallel.

from multiprocessing import Pool

def addition(n):
    return n + 3

if __name__ == "__main__":
    nums = [1, 2, 3, 4, 5]
    # map() splits the list across the worker processes and collects the results in order
    with Pool() as pool:
        result = pool.map(addition, nums)

    print(result)  # [4, 5, 6, 7, 8]
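
Pool() with no arguments sizes the pool to the number of available CPU cores; pass a number, for example Pool(processes=4), to control the pool size explicitly.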

Inter-Process Communication with Queues: Processes can communicate with each other using a Queue.

from multiprocessing import Process, Queue

def producer(queue):
    # Put five items on the shared queue
    for i in range(5):
        queue.put(i)

def consumer(queue):
    # The producer has already finished, so empty() is a reliable stop condition here
    while not queue.empty():
        print(queue.get())

if __name__ == "__main__":
    queue = Queue()
    pp = Process(target=producer, args=(queue,))
    cp = Process(target=consumer, args=(queue,))

    pp.start()
    pp.join()

    cp.start()
    cp.join()

Synchronizing Processes with Locks: To avoid race conditions, use a Lock to synchronize access to shared resources. Race conditions occur when multiple processes attempt to access and modify a shared resource (a file, a database connection, a shared counter, etc.) concurrently. Because each process has its own memory space, a plain global variable would not be shared at all, so the example below stores the counter in a shared Value (with its built-in lock disabled) and protects it with an explicit Lock.

from multiprocessing import Process, Lock, Value

def increment(counter, lock):
    # Hold the lock for the whole read-modify-write so no other process interleaves
    with lock:
        counter.value += 1

if __name__ == "__main__":
    # A plain global int would not be shared between processes, so use a shared Value;
    # lock=False because we synchronize with our own explicit Lock below
    counter = Value('i', 0, lock=False)
    lock = Lock()

    processes = []
    for _ in range(10):
        process = Process(target=increment, args=(counter, lock))
        processes.append(process)
        process.start()

    for process in processes:
        process.join()

    print(f"Final counter: {counter.value}")

Sharing State Between Processes: You can share simple state between processes using Value, a ctypes object allocated in shared memory that comes with its own lock by default.

from multiprocessing import Process, Value

def increment(shared_value):
    # get_lock() returns the lock that Value creates by default, so += stays race-free
    with shared_value.get_lock():
        shared_value.value += 1

if __name__ == "__main__":
    shared_value = Value('i', 0)  # 'i' means a signed integer, initial value 0
    processes = [Process(target=increment, args=(shared_value,)) for _ in range(10)]

    for p in processes:
        p.start()

    for p in processes:
        p.join()

    print(f"Final value: {shared_value.value}")

More Information

Python – pytechie.com

multiprocessing — Process-based parallelism — Python 3.13.1 documentation
