Introduction
Python's collections module provides specialized container data types that extend the built-in data structures. These containers handle common patterns such as counting, queueing and grouping more conveniently, and in some cases more efficiently, than plain lists, tuples and dictionaries.
Types
Namedtuple: An extension of the regular tuple that lets you access fields by name instead of by index position. It stores structured data in a more readable format and can replace dictionaries when the field names are known in advance.
from collections import namedtuple

# Define the record type once
Person = namedtuple('Person', ['name', 'age', 'city'])

# Single record
p1 = Person(name='Sau', age=29, city='NY')
# Access a field by name
print(p1.name)
# Convert to a dictionary
print(p1._asdict())

# Multiple records
people = [Person(name='Sau', age=29, city='NY'),
          Person(name='erick', age=35, city='TX'),
          Person(name='Mat', age=45, city='NY')]
for p in people:
    print(p.name)
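Because a namedtuple is still a tuple, it also supports indexing, unpacking and immutability. The following is a minimal sketch of those properties; the `Employee` type and its values are made up for illustration.

```python
from collections import namedtuple

# Hypothetical record type used only for this illustration
Employee = namedtuple('Employee', ['name', 'age', 'city'])
emp = Employee(name='Sau', age=29, city='NY')

# Field names are available in definition order
print(Employee._fields)

# Indexing and unpacking work as with any tuple
name, age, city = emp
print(emp[0] == emp.name)

# Instances are immutable; _replace returns a modified copy
older = emp._replace(age=30)
print(emp.age, older.age)

# Direct assignment to a field raises AttributeError
try:
    emp.age = 31
except AttributeError as exc:
    print('cannot assign:', exc)
```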
Deque: A double-ended queue allows fast appends and pops from both ends, which makes it more efficient than a list for queue operations. It is useful when items are frequently added to or removed from either end of a large sequence.
from collections import deque
dq = deque([11, 23, 45, 67])
# Add an element at the right end
dq.append(34)
# Add an element at the left end
dq.appendleft(1)
# Remove the last element
dq.pop()
# Remove the first element
dq.popleft()
print(dq)
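Two further deque features often come up in queue-style code: a bounded length via `maxlen` and in-place rotation via `rotate`. A short sketch, with sample values chosen only for illustration:

```python
from collections import deque

# A bounded deque keeps only the most recent maxlen items;
# older items fall off the opposite end automatically
recent = deque(maxlen=3)
for n in [11, 23, 45, 67]:
    recent.append(n)
print(recent)  # deque([23, 45, 67], maxlen=3)

# rotate(1) shifts every item one step to the right
dq = deque([1, 2, 3, 4])
dq.rotate(1)
print(dq)  # deque([4, 1, 2, 3])

# FIFO queue: append on the right, pop from the left
queue = deque()
queue.append('first')
queue.append('second')
served = queue.popleft()
print(served)  # first
```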
Counter: The Counter class is a dictionary subclass designed to count occurrences of elements. Typical uses include counting word frequencies in text analysis and finding the most common elements in a dataset.
from collections import Counter
num = [1,3,4,6,7,7,3,5,6,7,3,1,0]
counter = Counter(num)
print(counter)
# Get the elements ordered from most to least common
print(counter.most_common())
# Count the characters in a string
c = Counter("Hello World")
print(c)
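The word-frequency use case mentioned above can be sketched as follows. The sample sentence is made up; the example also shows that a missing key counts as zero and that counters support addition and subtraction.

```python
from collections import Counter

# Hypothetical sample text
text = "the cat and the dog and the bird"
words = Counter(text.split())

# most_common(n) returns only the n highest counts
top_two = words.most_common(2)
print(top_two)  # [('the', 3), ('and', 2)]

# A missing element has a count of 0 instead of raising KeyError
print(words['fish'])  # 0

# Counters can be added; subtraction drops non-positive counts
total = Counter(a=3, b=1) + Counter(a=1, b=2)
diff = Counter(a=3, b=1) - Counter(a=1, b=2)
print(total)
print(diff)
```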
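OrderedDict: An OrderedDict remembers the order in which keys were inserted. Since Python 3.7 the built-in dict preserves insertion order as well, so OrderedDict is mainly useful for its reordering methods such as move_to_end and for its order-sensitive equality comparison.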
from collections import OrderedDict
od = OrderedDict()
od['a']=13
od['z']=1
od['y']=10
print(od)
# Move the key 'z' to the end
od.move_to_end('z')
print(od)
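Beyond moving a key to the end, an OrderedDict can move a key to the front, pop from either end, and compare in an order-sensitive way. A minimal sketch using the same keys as the example above:

```python
from collections import OrderedDict

od = OrderedDict([('a', 13), ('z', 1), ('y', 10)])

# last=False moves the key to the front instead of the end
od.move_to_end('y', last=False)
print(list(od))  # ['y', 'a', 'z']

# popitem(last=False) removes and returns the first item (FIFO order)
oldest = od.popitem(last=False)
print(oldest)    # ('y', 10)
print(list(od))  # ['a', 'z']

# Equality between two OrderedDicts takes order into account,
# unlike equality between two plain dicts
same_ordered = OrderedDict(a=1, b=2) == OrderedDict(b=2, a=1)
same_plain = dict(a=1, b=2) == dict(b=2, a=1)
print(same_ordered, same_plain)  # False True
```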
Defaultdict: A defaultdict automatically creates a default value for a missing key instead of raising KeyError, which removes the need for manual key-existence checks.
from collections import defaultdict
# Missing keys default to int(), which is 0
dd = defaultdict(int)
dd['a'] += 1
print(dd)
# Missing keys default to an empty list
dd = defaultdict(list)
dd['a'].append(12)
print(dd)
# Access a value
print(dd['a'])
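A common use of `defaultdict(list)` is grouping items by a key. The sketch below groups made-up (name, city) pairs by city, and also shows one thing to watch for: merely reading a missing key inserts it.

```python
from collections import defaultdict

# Hypothetical sample records: (name, city)
records = [('Sau', 'NY'), ('erick', 'TX'), ('Mat', 'NY')]

# Group names by city without checking whether the key exists yet
by_city = defaultdict(list)
for name, city in records:
    by_city[city].append(name)
print(dict(by_city))  # {'NY': ['Sau', 'Mat'], 'TX': ['erick']}

# Reading a missing key with [] creates it with the default value
before = 'CA' in by_city
_ = by_city['CA']
after = 'CA' in by_city
print(before, after)  # False True

# .get() does not trigger the default factory
print(by_city.get('FL'))  # None
```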
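ChainMap: A ChainMap combines multiple dictionaries into a single view. Lookups search the dictionaries in the order they were given, so when a key appears in more than one, the value from the first dictionary that contains it is returned.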
from collections import ChainMap
d1 = {'a':12,'x':87}
d2 = {'x':12,'z':87}
cm = ChainMap(d1,d2)
print(cm)
# access a value
print(cm['a'])
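The two dictionaries above share the key 'x', which makes them a convenient way to see how a ChainMap resolves duplicates and where writes end up. A minimal sketch:

```python
from collections import ChainMap

d1 = {'a': 12, 'x': 87}
d2 = {'x': 12, 'z': 87}
cm = ChainMap(d1, d2)

# 'x' exists in both; the first mapping in the chain wins
print(cm['x'])  # 87

# Keys found only in a later mapping are still visible
print(cm['z'])  # 87

# Writes and deletions affect only the first mapping
cm['z'] = 0
print(d1)  # {'a': 12, 'x': 87, 'z': 0}
print(d2)  # {'x': 12, 'z': 87}

# ChainMap is a view, not a copy: later changes to d2 show through
d2['new'] = 1
print(cm['new'])  # 1
```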
Customizing Built-in types: The UserDict, UserList and UserString classes are wrappers that make it easier to customize dictionaries, lists and strings. Subclass them when you want to extend or restrict dictionary, list or string behavior.
from collections import UserDict

class MyDict(UserDict):
    # Reject any key that is not a string
    def __setitem__(self, k, v):
        if not isinstance(k, str):
            raise TypeError("Keys must be strings")
        super().__setitem__(k, v)

d = MyDict()
d['a'] = 'abc'
print(d)
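To see the key check in action, the sketch below defines an equivalent class under a hypothetical name, `StrKeyDict`, and tries to store non-string keys. It also illustrates a practical reason to prefer UserDict over subclassing dict directly: `update()` goes through `__setitem__`, so the check applies there too.

```python
from collections import UserDict

class StrKeyDict(UserDict):
    """Hypothetical example: a dictionary that only accepts string keys."""

    def __setitem__(self, key, value):
        if not isinstance(key, str):
            raise TypeError("Keys must be strings")
        super().__setitem__(key, value)

d = StrKeyDict()
d['a'] = 'abc'

# A non-string key is rejected
try:
    d[1] = 'xyz'
except TypeError as exc:
    print('rejected:', exc)

# update() routes through __setitem__, so it is checked as well
rejected_by_update = False
try:
    d.update({2: 'two'})
except TypeError:
    rejected_by_update = True
print(rejected_by_update)  # True

# The underlying plain dict is available as .data
print(d.data)  # {'a': 'abc'}
```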
More Info:
collections — Container datatypes — Python 3.10.16 documentation
