^^Python. collections.defaultdict(default_factory)

https://docs.python.org/3.8/library/collections.html

 

class collections.defaultdict([default_factory[, ...]])

defaultdict(list)    builds a dictionary of lists

list of key-value pairs  TO  dictionary of lists

[('c', 6), ('b', 2), ('c', 3), ('b', 4), ('a', 1), ('b', 2)]

to

{'c': [6, 3], 'b': [2, 4, 2], 'a': [1]}

 

from collections import defaultdict
s = [('c', 6), ('b', 2), ('c', 3), ('b', 4), ('a', 1), ('b', 2)]
d = defaultdict(list)
for k, v in s:
    d[k].append(v)
 
d
sorted(d.items())

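When a key is

encountered the first time, it is not already in the mapping, so an entry is created automatically using the default_factory function, which returns an empty list. list.append() then attaches the value to the new list.

encountered again, it is already in the mapping, so the look-up proceeds normally (returning the list for that key) and list.append() adds another value to the list.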
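
The first-time / again behaviour described above can be checked directly. A minimal sketch; the key 'x' and the variable names are only illustrative. It also shows that `in` and get() do not create entries, only indexing with d[key] does.

```python
from collections import defaultdict

d = defaultdict(list)

# Membership tests and get() do not call default_factory.
assert 'x' not in d
assert d.get('x') is None
assert len(d) == 0

# Indexing a missing key calls list() and stores the new empty list.
first = d['x']
assert first == [] and 'x' in d

# A second lookup returns the same stored list, not a fresh one.
d['x'].append(1)
assert d['x'] is first
print(dict(d))  # {'x': [1]}
```

Because a plain d[key] read inserts the key, use `key in d` or d.get(key) when the mapping should stay unchanged.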

Grouping with defaultdict(list) is simpler and faster than an equivalent technique using dict.setdefault().
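
For comparison, a sketch of the dict.setdefault() version of the same grouping, on the same data as above. The name `grouped` is only illustrative.

```python
s = [('c', 6), ('b', 2), ('c', 3), ('b', 4), ('a', 1), ('b', 2)]

grouped = {}
for k, v in s:
    # setdefault() returns the existing list, or stores and returns the new one.
    grouped.setdefault(k, []).append(v)

print(grouped)  # {'c': [6, 3], 'b': [2, 4, 2], 'a': [1]}
```

Note that the `[]` argument is built on every iteration, even when the key already exists; defaultdict only calls its factory for missing keys.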

vo:

defaultdict(set) builds a dictionary of sets

list of key-value pairs TO dictionary of sets

[('c', 6), ('b', 2), ('c', 3), ('b', 4), ('a', 1), ('b', 2)]

to

{'c': {3, 6}, 'b': {2, 4}, 'a': {1}}

 

from collections import defaultdict
s = [('c', 6), ('b', 2), ('c', 3), ('b', 4), ('a', 1), ('b', 2)]
d = defaultdict(set) 
for k, v in s: 
    d[k].add(v)
 
d
sorted(d.items())

defaultdict(int) builds a dictionary of ints.

Useful for counting, like a bag or multiset in other languages.

string TO dictionary of int

'roberto antonio occa'

to

{'r': 2, 'o': 5, 'b': 1, 'e': 1, 't': 2, ' ': 2, 'a': 2, 'n': 2, 'i': 1, 'c': 2}

 

from collections import defaultdict
s = 'roberto antonio occa' 
d = defaultdict(int) 
for k in s: 
    d[k] += 1
 
d
sorted(d.items())
 

defaultdict(<class 'int'>, {'r': 2, 'o': 5, 'b': 1, 'e': 1, 't': 2, ' ': 2, 'a': 2, 'n': 2, 'i': 1, 'c': 2})

[(' ', 2), ('a', 2), ('b', 1), ('c', 2), ('e', 1), ('i', 1), ('n', 2), ('o', 5), ('r', 2), ('t', 2)]

 

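When a letter is

encountered the first time, it is not already in the mapping, so default_factory calls int() to supply a default count of zero. The increment operation then builds up the count for each letter.

The function int(), which always returns zero, is just a special case of a constant function.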
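
Any zero-argument callable can serve as default_factory, so a lambda can supply a constant other than zero. The sketch below is adapted from the example in the collections documentation; the `sentence` variable is added here only to hold the result.

```python
from collections import defaultdict

def constant_factory(value):
    # Return a zero-argument function that always yields `value`.
    return lambda: value

d = defaultdict(constant_factory('<missing>'))
d.update(name='John', action='ran')

# The % operator looks up d['object'], which is missing and triggers the factory.
sentence = '%(name)s %(action)s to %(object)s' % d
print(sentence)  # John ran to <missing>
```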

 

 

Counter objects

 

 

A counter tool is provided to support convenient and rapid tallies. For example:

>>> from collections import Counter
>>> # Tally occurrences of words in a list
>>> cnt = Counter()
>>> for word in ['red', 'blue', 'red', 'green', 'blue', 'blue']:
...     cnt[word] += 1
...
>>> cnt
Counter({'blue': 3, 'red': 2, 'green': 1})
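
Counter can also build the tally directly from an iterable, without the explicit loop. A small sketch comparing it with the defaultdict(int) approach on the same string used earlier; the variable names are only illustrative.

```python
from collections import Counter, defaultdict

s = 'roberto antonio occa'

# Manual tally with defaultdict(int), as in the earlier example.
d = defaultdict(int)
for ch in s:
    d[ch] += 1

# Counter builds the same tally directly from the iterable.
cnt = Counter(s)
assert dict(cnt) == dict(d)

print(cnt.most_common(1))  # [('o', 5)]
print(cnt['z'])            # 0
```

Unlike defaultdict(int), reading a missing key from a Counter returns 0 without adding the key to the mapping.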

 

Find the most common words in a text

 
from collections import Counter
import re
# Read the whole text, lowercase it, and split it into words
with open('collodi_pinocchio.txt') as f:
    words = re.findall(r'\w+', f.read().lower())
Counter(words).most_common(20)
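
The same word count can be tried without a file on disk. A sketch using a short in-memory string; the sample sentence is made up for illustration and stands in for the file contents.

```python
from collections import Counter
import re

# Hypothetical sample text standing in for the contents of a file.
text = "The cat saw the dog. The dog saw the cat, and the cat ran."

# \w+ matches runs of word characters, so punctuation is dropped.
words = re.findall(r'\w+', text.lower())
top = Counter(words).most_common(2)
print(top)  # [('the', 5), ('cat', 3)]
```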
 

Regular expressions. Module re (re.py)
https://docs.python.org/3.8/library/re.html