Hiprup

What is the difference between a dictionary and a defaultdict?

A dict is Python’s built-in hash map: key-value pairs where subscripting a missing key (d[key]) raises KeyError. A collections.defaultdict is a subclass of dict that automatically creates and stores a default value the first time a missing key is read with d[key].

How defaultdict differs: you pass a zero-argument factory callable at construction. When you access a key that isn’t there, the factory is called, the result is stored under that key, and that value is returned — all in one step.

from collections import defaultdict

words = "the quick brown fox".split()

counts = defaultdict(int)           # factory = int, produces 0
for word in words:
    counts[word] += 1               # no KeyError; first access creates 0

groups = defaultdict(list)          # factory = list, produces []
for word in words:
    groups[len(word)].append(word)  # auto-creates the list
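To make the "call the factory, store, return" step concrete, here is a minimal sketch of how a dict subclass can get the same behavior through the `__missing__` hook, which `dict.__getitem__` calls when a subscript lookup misses. The class name `SketchDefaultDict` is hypothetical and the code is a simplification for illustration, not the actual implementation of `collections.defaultdict`.

```python
class SketchDefaultDict(dict):
    """Hypothetical, simplified stand-in for collections.defaultdict."""

    def __init__(self, default_factory=None, *args, **kwargs):
        super().__init__(*args, **kwargs)
        self.default_factory = default_factory

    def __missing__(self, key):
        # dict.__getitem__ calls this hook only when d[key] misses.
        if self.default_factory is None:
            raise KeyError(key)
        value = self.default_factory()   # call the zero-argument factory
        self[key] = value                # store the result under the key
        return value                     # and hand it back to the caller


lengths = SketchDefaultDict(list)
lengths[3].append("fox")   # miss: __missing__ creates and stores []
lengths[3].append("the")   # hit: the stored list is reused
```

Because the hook is tied to subscript lookup, `in` and `.get()` never reach it, which is why those two leave the mapping unchanged.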

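The equivalent with a plain dict is doable but noisier: use d.setdefault(key, factory()) or an explicit check such as if k not in d: d[k] = []. Note that setdefault evaluates its default argument on every call, even when the key already exists.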

groups = {}
for word in words:
    groups.setdefault(len(word), []).append(word)
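One practical difference between the two spellings is how often the default gets built. The sketch below uses a hypothetical counting factory, `make_bucket`, to show that `setdefault` constructs its default argument on every call, while `defaultdict` calls the factory only when a key is actually missing.

```python
from collections import defaultdict

calls = {"n": 0}

def make_bucket():
    # Hypothetical factory that records how many times it runs.
    calls["n"] += 1
    return []

words = ["the", "fox", "dog"]   # all three words have length 3

plain = {}
for word in words:
    # The argument make_bucket() is evaluated before setdefault runs.
    plain.setdefault(len(word), make_bucket()).append(word)
setdefault_calls = calls["n"]   # one call per word

calls["n"] = 0
auto = defaultdict(make_bucket)
for word in words:
    auto[len(word)].append(word)
defaultdict_calls = calls["n"]  # one call per new key
```

For cheap defaults like `[]` the extra allocations rarely matter; the gap becomes relevant when the default is expensive to build.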

Common factories:

  • int — counters (0 default).

  • list — grouping values into buckets.

  • set — collecting unique values per key.

  • lambda: 0.0 or lambda: "N/A" — custom scalar defaults.

  • A class — lazily instantiate rich objects.
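As a quick illustration of the less common factories in the list above, the following sketch uses `set`, a `lambda`, and a class. The `Stats` dataclass and all sample data are made up for the example.

```python
from collections import defaultdict
from dataclasses import dataclass, field

# set: collect unique values per key
tags_by_user = defaultdict(set)
for user, tag in [("ana", "python"), ("ana", "python"), ("bo", "go")]:
    tags_by_user[user].add(tag)

# lambda: custom scalar default
prices = defaultdict(lambda: 0.0)
prices["tea"] += 2.5

# class: lazily build a richer object per key (Stats is hypothetical)
@dataclass
class Stats:
    hits: int = 0
    errors: list = field(default_factory=list)

stats_by_host = defaultdict(Stats)
stats_by_host["web-1"].hits += 1
stats_by_host["web-1"].errors.append("timeout")
```

Any zero-argument callable works as a factory, so a class with an argument-free constructor qualifies just as `int` or `list` does.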

Behavior to know:

  • Reading a missing key inserts and returns the factory result. This can surprise code that expects a “pure” read to leave the dict unchanged.

  • in and .get() do not trigger the factory — use them if you want to probe without auto-insert.

  • The factory is stored in the .default_factory attribute; setting it to None makes missing keys raise KeyError, as in a normal dict.

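  • Pickling/unpickling works as long as the factory itself is picklable: int, list, and set are fine, but a lambda factory is not. json.dumps serializes the current contents as a regular object, provided the keys and values are JSON-serializable (sets, for example, are not).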
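The behaviors in this list can be checked directly. The sketch below probes with `in` and `.get()`, triggers an insert with a subscript read, disables the factory, and then tries pickling and JSON. One caveat it demonstrates: pickling depends on the factory being picklable, so a `lambda` factory fails where `list` succeeds. Variable names are illustrative only.

```python
import json
import pickle
from collections import defaultdict

d = defaultdict(list)

probe_in = "a" in d            # membership test: no insert
probe_get = d.get("a")         # .get(): no insert, returns None
size_after_probes = len(d)

d["a"]                         # subscript read: inserts [] under "a"
size_after_read = len(d)

d.default_factory = None       # missing keys now raise KeyError
try:
    d["b"]
    raised = False
except KeyError:
    raised = True

# Pickling round-trips when the factory is picklable (list is).
restored = pickle.loads(pickle.dumps(defaultdict(list, {"k": [1]})))

# A lambda factory cannot be pickled.
try:
    pickle.dumps(defaultdict(lambda: 0))
    lambda_pickles = True
except (pickle.PicklingError, AttributeError):
    lambda_pickles = False

# JSON output contains only the current contents, as a plain object.
as_json = json.dumps(defaultdict(int, {"x": 1}))
```

If a custom default has to survive pickling, a named module-level function is a common substitute for the `lambda`.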

When to reach for each:

  • Use dict for general key-value storage or when you explicitly want missing-key errors to surface bugs.

  • Use defaultdict when you’re building per-key collections (counters, groupings, adjacency lists) and want to drop the if-exists boilerplate.

  • For pure counting, collections.Counter is even more direct: Counter(words).
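A short sketch of the three choices side by side, using a made-up edge list and config mapping: `defaultdict(list)` for an adjacency list, a plain dict where a missing key should be an error, and `Counter` for counting.

```python
from collections import Counter, defaultdict

edges = [("a", "b"), ("a", "c")]   # hypothetical undirected graph

# defaultdict: build per-key collections without existence checks
graph = defaultdict(list)
for src, dst in edges:
    graph[src].append(dst)
    graph[dst].append(src)

# plain dict: a missing key surfaces immediately as KeyError
config = {"host": "localhost"}
try:
    port = config["port"]
except KeyError:
    port = None                    # the caller decides how to handle it

# Counter: count directly from an iterable
degree = Counter(node for edge in edges for node in edge)
```

Here the `KeyError` from `config` is a useful signal that a setting is missing, whereas a `defaultdict` would have quietly stored a default under `"port"`.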

Gotcha to watch: because reading inserts, logging d["missing"] in an error path changes the dict. Use d.get("missing") for pure reads.

Interview-ready summary: a plain dict raises on missing keys; a defaultdict(factory) auto-creates and stores a default value. Use defaultdict to simplify grouping, counting, and bucketing code — and use Counter when all you’re doing is counting.

from collections import defaultdict

# Regular dict - KeyError on missing key
regular = {}
# regular['missing']  # KeyError!
value = regular.get('missing', 0)  # Safe but verbose

# Counting with regular dict (verbose)
words = ['apple', 'banana', 'apple', 'cherry', 'banana', 'apple']
counts = {}
for word in words:
    if word not in counts:    # Must check!
        counts[word] = 0
    counts[word] += 1

# Counting with defaultdict (clean)
counts = defaultdict(int)  # int() returns 0
for word in words:
    counts[word] += 1       # Auto-creates 0 if missing
# {'apple': 3, 'banana': 2, 'cherry': 1}

# Grouping with defaultdict
students = [('Math', 'Alice'), ('Science', 'Bob'), ('Math', 'Charlie')]
by_subject = defaultdict(list)
for subject, name in students:
    by_subject[subject].append(name)  # Auto-creates [] if missing
# {'Math': ['Alice', 'Charlie'], 'Science': ['Bob']}

# Counter (even cleaner for counting)
from collections import Counter
counts = Counter(words)  # Counter({'apple': 3, 'banana': 2, 'cherry': 1})
print(counts.most_common(2))  # [('apple', 3), ('banana', 2)]

Regular dict requires key existence checks before incrementing. defaultdict(int) auto-creates 0 for missing keys, making counting a one-liner. defaultdict(list) auto-creates empty lists for grouping. Counter is even more concise for counting — it counts elements directly from an iterable and provides most_common.

Show the before/after: verbose dict counting vs clean defaultdict counting. Know the three common factories: int (counting), list (grouping), set (unique grouping).

Mention Counter as the purpose-built solution for counting.
