Productively Distracted
Posted Monday June 27, 2011 around 11:38 PM

While profiling a bit of legacy code last night, I noticed some strange differences in how long it takes to look up a value in a dictionary when falling back to a default. I decided to investigate further with some quick and dirty benchmarks.


For my comparison, each method performs two lookups: a negative one (a key that is missing) and a positive one (a key that exists). The first method uses the dict.get method, passing in a default:

def with_get():
    result1 = a_dict.get(-1, None)
    result2 = a_dict.get(900, None)

The second method catches the KeyError exception to return a default:

def with_exception():
    try:
        result1 = a_dict[-1]
    except KeyError:
        result1 = None
    try:
        result2 = a_dict[900]
    except KeyError:
        result2 = None

The third and last method uses the in operator to return a default:

def with_in():
    result1 = a_dict[-1] if -1 in a_dict else None
    result2 = a_dict[900] if 900 in a_dict else None

The full benchmarking script that I created can be found here. It isn't hard science, but I think it's good enough for general comparisons.
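
For reference, here is a minimal sketch of how a benchmark like this can be put together with the standard timeit module. It is not the original script: the contents of a_dict, the run_benchmarks helper, the return statements, and the iteration count are all assumptions made for illustration.

```python
import timeit

# Assumption: the real a_dict is not shown in the post. A 1000-key dict
# makes -1 a miss and 900 a hit, matching the lookups above.
a_dict = dict((i, str(i)) for i in range(1000))

def with_get():
    result1 = a_dict.get(-1, None)
    result2 = a_dict.get(900, None)
    return result1, result2

def with_exception():
    try:
        result1 = a_dict[-1]
    except KeyError:
        result1 = None
    try:
        result2 = a_dict[900]
    except KeyError:
        result2 = None
    return result1, result2

def with_in():
    result1 = a_dict[-1] if -1 in a_dict else None
    result2 = a_dict[900] if 900 in a_dict else None
    return result1, result2

def run_benchmarks(number=100000):
    """Return {function name: total seconds for `number` calls}."""
    timings = {}
    for func in (with_get, with_exception, with_in):
        # timeit.timeit accepts a zero-argument callable directly.
        timings[func.__name__] = timeit.timeit(func, number=number)
    return timings

if __name__ == "__main__":
    for name, seconds in run_benchmarks().items():
        print("%s(): %s" % (name, seconds))
```

Returning the results from each function is only there so the three approaches can be checked against each other; it adds the same small cost to all of them.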


With Python 2.7:

2.7 (r27:82508, Jul  3 2010, 20:17:05)
[GCC 4.0.1 (Apple Inc. build 5493)]

with_get():       0.702752828598
with_exception(): 2.69198894501
with_in():        0.579782009125

With Python 3.2 (for giggles):

3.2 (r32:88452, Feb 20 2011, 11:12:31)
[GCC 4.2.1 (Apple Inc. build 5664)]

with_get():       0.49797797203063965
with_exception(): 0.8870129585266113
with_in():        0.4416959285736084

A couple of things surprise me about this. First, I thought exceptions were faster in Python, but the exception version is roughly four to five times slower than the other two under 2.7. Second, the example with in hits the dictionary twice for a key that exists, and yet it is the fastest. I am guessing that the dict.get method must do something similar internally.
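
One way to dig into the first surprise is to time the hit and the miss separately. The sketch below does that for the exception approach. Here lookup_with_exception and time_lookup are hypothetical helpers, and a_dict is an assumed 1000-key dictionary rather than the one from my script. In CPython the cost of try/except tends to be concentrated on the path where the exception is actually raised, so I would expect the miss to account for most of the difference, but the numbers will vary by machine and version.

```python
import timeit

# Assumption: a 1000-key dict, so 900 is a hit and -1 is a miss.
a_dict = dict((i, str(i)) for i in range(1000))

def lookup_with_exception(key):
    """Return a_dict[key], or None if the key is missing."""
    try:
        return a_dict[key]
    except KeyError:
        return None

def time_lookup(key, number=100000):
    """Total seconds spent performing `number` lookups of `key`."""
    return timeit.timeit(lambda: lookup_with_exception(key), number=number)

if __name__ == "__main__":
    print("hit  (900): %s" % time_lookup(900))
    print("miss (-1):  %s" % time_lookup(-1))
```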


Use the in operator whenever possible. The convenience of dict.get is worth the very small time difference in my mind, but the exception method is unacceptably slow.

The good news is that the exception performance issue appears to have been addressed to some extent in Python 3, where the exception method is now only about twice as slow as the other two.

Maybe this is common knowledge already, but it took me by surprise. I am interested to hear of other methods that are commonly used (if any). Feel free to fork the gist and give it a spin.
