Skip to content Skip to sidebar Skip to footer

Is The Defaultdict In Python's Collections Module Really Faster Than Using Setdefault?

I've seen other Python programmers use defaultdict from the collections module for the following use case: from collections import defaultdict s = [('yellow', 1), ('blue', 2), ('y

Solution 1:

Yes, there is something "wrong":

You have put the creation of the (default)dict into the statement instead of the setup. Constructing a new defaultdict is more expensive than a normal dict, and usually that's not the bottleneck you should be profiling in a program - after all, you build your data structures once but you use them many times.

If you do your tests like below, you see that defaultdict operations are indeed faster:

>>> import timeit
>>> setup1 = """from collections import defaultdict
... s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
... d = defaultdict(list)"""
>>> stmt1 = """for k, v in s:
...     d[k].append(v)"""
>>> setup2 = """s = [('yellow', 1), ('blue', 2), ('yellow', 3), ('blue', 4), ('red', 1)]
... d = {}"""
>>> stmt2 = """for k, v in s:
...     d.setdefault(k, []).append(v)"""
>>> timeit.timeit(setup=setup1, stmt=stmt1)
1.0283400125194078
>>> timeit.timeit(setup=setup2, stmt=stmt2)
1.7767367580925395

Python 2.7.3 on Win7 x64.


Post a Comment for "Is The Defaultdict In Python's Collections Module Really Faster Than Using Setdefault?"