Python Dictionaries and Tuples

March 08, 2018

Dictionaries

Python’s efficient key/value hash table structure is called a dict. The contents of a dict can be written as a series of key: value pairs within braces { }, e.g. dict = {key1:value1, key2:value2, ... }. The “empty dict” is just an empty pair of curly braces {}.

Looking up or setting a value in a dict uses square brackets, e.g. dict['foo'] looks up the value under the key 'foo'. Strings, numbers, and tuples work as keys, and any type can be a value. Other types may or may not work correctly as keys (strings and tuples work cleanly since they are immutable). Looking up a value which is not in the dict throws a KeyError — use in to check if the key is in the dict, or use dict.get(key) which returns the value or None if the key is not present (or get(key, not-found) allows you to specify what value to return in the not-found case).

## Can build up a dict by starting with the the empty dict {}
## and storing key/value pairs into the dict like this:
## dict[key] = value-for-that-key
dict = {}
dict['a'] = 'alpha'
dict['g'] = 'gamma'
dict['o'] = 'omega'
print dict  ## {'a': 'alpha', 'o': 'omega', 'g': 'gamma'}
print dict['a']     ## Simple lookup, returns 'alpha'
dict['a'] = 6       ## Put new key/value into dict
'a' in dict         ## True
## print dict['z']                  ## Throws KeyError
if 'z' in dict: print dict['z']     ## Avoid KeyError
print dict.get('z')  ## None (instead of KeyError)

Deleting items

The del operator does deletions. In the simplest case, it can remove the definition of a variable, as if that variable had not been defined. del can also be used on list elements or slices to delete that part of the list and to delete entries from a dictionary.

var = 6
del var  # var no more!
list = ['a', 'b', 'c', 'd']
del list[0]     ## Delete first element
del list[-2:]   ## Delete last two elements
print list      ## ['b']
dict = {'a':1, 'b':2, 'c':3}
del dict['b']   ## Delete 'b' entry
print dict      ## {'a':1, 'c':3}

Iterating over dictionaries

A for-loop on a dictionary iterates over its keys by default. The keys will appear in an arbitrary order. The methods dict.keys() and dict.values() return lists of the keys or values explicitly. There’s also an items() which returns a list of (key, value) tuples, which is the most efficient way to examine all the key value data in the dictionary.

## By default, iterating over a dict iterates over its keys.
## Note that the keys are in a random order.
for key in dict: print key
## prints a g o
## Exactly the same as above
for key in dict.keys(): print key
## Get the .keys() list:
print dict.keys()  ## ['a', 'o', 'g']
## Likewise, there's a .values() list of values
print dict.values()  ## ['alpha', 'omega', 'gamma']
## .items() is the dict expressed as (key, value) tuples
print dict.items()  ##  [('a', 'alpha'), ('o', 'omega'), ('g', 'gamma')]
## This loop syntax accesses the whole dict by looping
## over the .items() tuple list, accessing one (key, value)
## pair on each iteration.
for k, v in dict.items(): print k, '>', v
## a > alpha    o > omega     g > gamma

All of these lists can be passed to the sorted() function. For example:

## Common case -- loop over the keys in sorted order,
## accessing each key/value
for key in sorted(dict.keys()):
    print key, dict[key]

There are iter variants of these methods called iterkeys(), itervalues() and iteritems() which avoid the cost of constructing the whole list — a performance win if the data is huge. However, I generally prefer the plain keys() and values() methods with their sensible names. In Python 3.0 revision, the need for the iterkeys() variants is going away.

Dictionary Performance

From a performance point of view, the dictionary is one of your greatest tools, and you should use it where you can as an easy way to organize data. For example, you might read a log file where each line begins with an IP address, and store the data into a dict using the IP address as the key, and the list of lines where it appears as the value. Once you’ve read in the whole file, you can look up any IP address and instantly see its list of lines. The dictionary takes in scattered data and makes it into something coherent.

Formatting using Dictionaries

The % string formatting operator works conveniently to substitute values from a dict into a string by name:

hash = {}
hash['word'] = 'garfield'
hash['count'] = 42
s = 'I want %(count)d copies of %(word)s' % hash  # %d for int, %s for string
# 'I want 42 copies of garfield'

Tuples

A tuple is a fixed size grouping of elements, such as an (x, y) co-ordinate. Tuples are like lists, except they are immutable and do not change size (tuples are not strictly immutable since one of the contained elements could be mutable). Tuples play a sort of “struct” role in Python — a convenient way to pass around a little logical, fixed size bundle of values. A function that needs to return multiple values can just return a tuple of the values. For example, if I wanted to have a list of 3-d coordinates, the natural python representation would be a list of tuples, where each tuple is size 3 holding one (x, y, z) group.

To create a tuple, just list the values within parenthesis separated by commas. The “empty” tuple is just an empty pair of parenthesis. Accessing the elements in a tuple is just like a list — len(), [ ], for, in, etc. all work the same.

tuple = (1, 2, 'hi')
print len(tuple)  ## 3
print tuple[2]    ## hi
tuple[2] = 'bye'  ## NO, tuples cannot be changed
tuple = (1, 2, 'bye')  ## this works

To create a size-1 tuple, the lone element must be followed by a comma.

tuple = ('hi',)   ## size-1 tuple

It’s a funny case in the syntax, but the comma is necessary to distinguish the tuple from the ordinary case of putting an expression in parentheses. In some cases you can omit the parenthesis and Python will see from the commas that you intend a tuple.

Assigning a tuple to an identically sized tuple of variable names assigns all the corresponding values. If the tuples are not the same size, it throws an error. This feature works for lists too.

(x, y, z) = (42, 13, "hike")
print z  ## hike
(err_string, err_code) = Foo()  ## Foo() returns a length-2 tuple

_{Content was originally based on https://developers.google.com/edu/python/dict-files, but has been modified since. Licensed under CC BY 3.0. Code samples licensed under the Apache 2.0 License.}