Python Collections Module

Introduction

Python’s collections module provides specialized container data types that extend the built-in lists, tuples, and dictionaries. These containers make common patterns such as counting, grouping, and queueing simpler, and often faster, than with the built-in types alone.

Types

Namedtuple: An extension of the regular tuple that lets you access fields by name instead of by index position. It is used for storing structured data in a more readable format, and it can replace dictionaries where the field names are known in advance.

from collections import namedtuple

# Single record
Person = namedtuple('Person', ['name', 'age', 'city'])
p1 = Person(name='Sau', age=29, city='NY')

# Access an element by field name
print(p1.name)

# Convert to a dictionary
print(p1._asdict())

# Multiple records
people = [Person(name='Sau', age=29, city='NY'),
          Person(name='Erick', age=35, city='TX'),
          Person(name='Mat', age=45, city='NY')]

for p in people:
    print(p.name)
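
A few other namedtuple features are worth a quick sketch. The Point type and its values below are made up purely for illustration; the example shows tuple-style unpacking, the immutability-friendly `_replace` method, and the `_fields` attribute.

```python
from collections import namedtuple

# Hypothetical record type used only for this illustration
Point = namedtuple('Point', ['x', 'y'])
pt = Point(x=1, y=2)

# Fields are reachable by name and by position, and unpack like a tuple
x, y = pt
print(pt.x, pt[1])

# namedtuples are immutable: _replace returns a new instance
moved = pt._replace(x=10)
print(moved)

# _fields lists the field names in definition order
print(Point._fields)
```

Because instances are immutable, `pt` itself is unchanged after the `_replace` call.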

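Deque: A double-ended queue allows fast appends and pops from both ends, which makes it more efficient than a list for queue operations (inserting or removing at the front of a list takes O(n) time, while on a deque it takes O(1)). It is used where fast appends and removals at either end are needed, for example when processing large datasets as a queue.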

from collections import deque

dq = deque([11,23,45,67])

# add an element at the end
dq.append(34)

# add an element at the start
dq.appendleft(1)

# remove the last element
dq.pop()

# remove the first element
dq.popleft()
print(dq)
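
Two further deque features, sketched here with made-up values: the optional `maxlen` argument turns a deque into a fixed-size buffer, and `rotate` shifts the elements in place.

```python
from collections import deque

# Keep only the 3 most recent items; older ones are dropped automatically
recent = deque(maxlen=3)
for value in [1, 2, 3, 4, 5]:
    recent.append(value)
print(recent)  # deque([3, 4, 5], maxlen=3)

# rotate(n) shifts items n steps to the right (negative n shifts left)
ring = deque([1, 2, 3, 4])
ring.rotate(1)
print(ring)  # deque([4, 1, 2, 3])
```

A bounded deque like this is one simple way to keep a "last N items" history without trimming a list by hand.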

Counter: The Counter class is a dictionary subclass designed to count occurrences of hashable elements. It is used for tasks such as counting word frequencies in text analysis and finding the most common elements in a dataset.

from collections import Counter

num = [1,3,4,6,7,7,3,5,6,7,3,1,0]
counter = Counter(num)
print(counter)

# get the elements ordered from most to least common
print(counter.most_common())

# count the characters in a string
c = Counter("Hello World")
print(c)
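
The word-frequency use case mentioned above can be sketched as follows. The sample sentence is invented for illustration; the example also shows that a Counter returns 0 for missing keys and that `update` adds to existing counts.

```python
from collections import Counter

# Made-up sample sentence for illustration
text = "the quick brown fox jumps over the lazy dog the end"
words = Counter(text.split())

# The single most frequent word, as a list of (word, count) pairs
top = words.most_common(1)
print(top)  # [('the', 3)]

# Missing keys return 0 instead of raising KeyError
print(words['cat'])  # 0

# update() adds counts from another iterable
words.update(['fox', 'fox'])
print(words['fox'])  # 3
```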

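OrderedDict: An OrderedDict remembers the order in which keys were inserted. Since Python 3.7, regular dictionaries also preserve insertion order, so OrderedDict is mainly useful for its reordering methods such as move_to_end() and for its order-sensitive equality comparisons.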

from collections import OrderedDict

od = OrderedDict()
od['a']=13
od['z']=1
od['y']=10
print(od)

# move the key 'z' to the end
od.move_to_end('z')
print(od)
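
As a brief sketch of what OrderedDict offers beyond a regular dictionary (the keys and values are arbitrary): equality between two OrderedDicts takes order into account, and entries can be moved to or popped from either end.

```python
from collections import OrderedDict

a = OrderedDict([('x', 1), ('y', 2)])
b = OrderedDict([('y', 2), ('x', 1)])

# Equality between two OrderedDicts is order-sensitive
print(a == b)  # False
# Plain dicts compare equal regardless of insertion order
print(dict(a) == dict(b))  # True

# move_to_end(key, last=False) moves a key to the front
a.move_to_end('y', last=False)
print(list(a))  # ['y', 'x']

# popitem(last=False) removes and returns the first entry
first = a.popitem(last=False)
print(first)  # ('y', 2)
```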

Defaultdict: A defaultdict automatically creates a default value for a missing key by calling the factory function it was given, instead of raising KeyError. This avoids manual key existence checks in dictionaries.

from collections import defaultdict

# int factory: missing keys start at 0
dd = defaultdict(int)
dd['a']+=1
print(dd)

# list factory: missing keys start as an empty list
dd = defaultdict(list)
dd['a'].append(12)
print(dd)

# access a value
print(dd['a'])
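
A common use of defaultdict(list) is grouping items by a key. The sketch below uses invented (city, name) pairs and also shows one caveat: simply reading a missing key inserts it.

```python
from collections import defaultdict

# Made-up (city, name) pairs for illustration
records = [('NY', 'Sau'), ('TX', 'Erick'), ('NY', 'Mat')]

# Group names by city without checking whether the key exists yet
by_city = defaultdict(list)
for city, name in records:
    by_city[city].append(name)

grouped = dict(by_city)
print(grouped)  # {'NY': ['Sau', 'Mat'], 'TX': ['Erick']}

# Caveat: merely reading a missing key creates it
_ = by_city['CA']
print('CA' in by_city)  # True
```

If that side effect is unwanted, `by_city.get('CA')` looks up the key without creating it.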

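ChainMap: A ChainMap combines multiple dictionaries into a single view. Lookups search the dictionaries in order, so when a key appears in more than one of them, the value from the first dictionary wins.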

from collections import ChainMap

d1 = {'a':12,'x':87}
d2 = {'x':12,'z':87}

cm = ChainMap(d1,d2)
print(cm)

# access a value
print(cm['a'])
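
One typical pattern for ChainMap is layering overrides on top of defaults. The dictionary names and settings below are hypothetical; the sketch shows that lookups prefer the first mapping and that writes only affect the first mapping.

```python
from collections import ChainMap

# Hypothetical settings used only for this illustration
defaults = {'color': 'blue', 'size': 'M'}
overrides = {'color': 'red'}

settings = ChainMap(overrides, defaults)

# 'color' exists in both, so the first mapping wins
print(settings['color'])  # red
# 'size' is only in defaults, so the lookup falls through
print(settings['size'])   # M

# Writes go to the first mapping only; defaults stays untouched
settings['size'] = 'L'
print(overrides)  # {'color': 'red', 'size': 'L'}
print(defaults)   # {'color': 'blue', 'size': 'M'}
```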

Customizing Built-in types: The UserDict, UserList, and UserString classes are wrappers that let you customize dictionaries, lists, and strings. They are used for extending dictionary, list, or string behavior through subclassing.

from collections import UserDict

class MyDict(UserDict):
    def __setitem__(self, k, v):
        if not isinstance(k, str):
            raise TypeError("Keys must be strings")
        super().__setitem__(k,v)

d = MyDict()
d['a']='abc'
print(d)
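
One reason to subclass UserDict rather than dict is that its other methods, such as update(), route through `__setitem__`, so a single override covers them. The sketch below uses a hypothetical StrKeyDict class, similar in spirit to the example above, to show the check firing on update() as well.

```python
from collections import UserDict

class StrKeyDict(UserDict):
    """Illustrative dictionary that only accepts string keys."""
    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError("Keys must be strings")
        super().__setitem__(key, value)

d = StrKeyDict()
d['a'] = 'abc'

# update() goes through __setitem__, so the key check still applies
error = None
try:
    d.update({1: 'one'})
except TypeError as exc:
    error = str(exc)
print(error)  # Keys must be strings

# The underlying plain dict is available as .data
print(d.data)  # {'a': 'abc'}
```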

More Info:

Python – pytechie.com

collections — Container datatypes — Python 3.10.16 documentation
