Dictionaries
Python's efficient key/value hash table structure is called a dict. The contents of a dict can be written as a series of key: value pairs within braces { }, e.g. dict = {key1:value1, key2:value2, ... }. The "empty dict" is just an empty pair of curly braces {}.
Looking up or setting a value in a dict uses square brackets, e.g. dict['foo'] looks up the value under the key 'foo'. Any type can be a value, but a key must be hashable: strings, numbers, and tuples of immutable values work cleanly since they are immutable, while mutable types such as lists and dicts raise a TypeError when used as keys. Looking up a key which is not in the dict raises a KeyError -- use in to check if the key is in the dict, or use dict.get(key), which returns the value, or None if the key is not present (get(key, not-found) lets you specify what value to return in the not-found case).
## Can build up a dict by starting with the empty dict {}
## and storing key/value pairs into the dict like this:
## dict[key] = value-for-that-key
dict = {}
dict['a'] = 'alpha'
dict['g'] = 'gamma'
dict['o'] = 'omega'

print dict  ## {'a': 'alpha', 'o': 'omega', 'g': 'gamma'}

print dict['a']     ## Simple lookup, returns 'alpha'
dict['a'] = 6       ## Store a new value under the existing key 'a'
'a' in dict         ## True
## print dict['z']                  ## Raises KeyError
if 'z' in dict: print dict['z']     ## Avoid KeyError
print dict.get('z')  ## None (instead of KeyError)
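As a small sketch of the get(key, not-found) form described above, the snippet below counts how often each word occurs. The word list is made-up sample data, and print is written with parentheses around a single value so the snippet behaves the same under Python 2 and Python 3.

```python
## Count how often each word occurs, using get() with a default of 0
## so the first sighting of a word does not raise KeyError.
words = ['alpha', 'gamma', 'alpha', 'omega', 'alpha']  ## sample data (assumption)

counts = {}
for word in words:
    counts[word] = counts.get(word, 0) + 1

print(counts['alpha'])        ## 3
print(counts.get('zeta', 0))  ## 0, 'zeta' was never stored
print('zeta' in counts)       ## False, get() does not add the key
```

Using get() with a default keeps the loop body to a single line; the alternative is an explicit "if word in counts" test before each update.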
Deleting items
The del statement does deletions. In the simplest case, it can remove the definition of a variable, as if that variable had not been defined. del can also be used on list elements or slices to delete that part of the list, and to delete entries from a dictionary.
var = 6
del var  # var no more!

list = ['a', 'b', 'c', 'd']
del list[0]     ## Delete first element
del list[-2:]   ## Delete last two elements
print list      ## ['b']

dict = {'a':1, 'b':2, 'c':3}
del dict['b']   ## Delete 'b' entry
print dict      ## {'a': 1, 'c': 3}
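One detail worth a quick sketch: deleting a key that is not in the dict raises KeyError, just as a lookup does. The names and data below are hypothetical.

```python
ages = {'ann': 31, 'bob': 27}  ## hypothetical sample data

del ages['ann']                ## key exists, so the entry is removed
print('ann' in ages)           ## False

## Guard the delete with "in" when the key may be absent.
if 'zed' in ages:
    del ages['zed']

## Without the guard, del on a missing key raises KeyError.
try:
    del ages['zed']
    missing_raised = False
except KeyError:
    missing_raised = True
print(missing_raised)          ## True
```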
Iterating over dictionaries
A for-loop on a dictionary iterates over its keys by default. The keys will appear in an arbitrary order. The methods dict.keys() and dict.values() return lists of the keys or values explicitly. There's also items(), which returns a list of (key, value) tuples and is the most efficient way to examine all the key/value data in the dictionary.
## By default, iterating over a dict iterates over its keys.
## Note that the keys are in an arbitrary order.
for key in dict: print key
## prints a o g

## Exactly the same as above
for key in dict.keys(): print key

## Get the .keys() list:
print dict.keys()  ## ['a', 'o', 'g']

## Likewise, there's a .values() list of values
print dict.values()  ## ['alpha', 'omega', 'gamma']

## .items() is the dict expressed as (key, value) tuples
print dict.items()  ## [('a', 'alpha'), ('o', 'omega'), ('g', 'gamma')]

## This loop syntax accesses the whole dict by looping
## over the .items() tuple list, accessing one (key, value)
## pair on each iteration.
for k, v in dict.items(): print k, '>', v
## a > alpha    o > omega    g > gamma
All of these lists can be passed to the sorted() function. For example:
## Common case -- loop over the keys in sorted order,
## accessing each key/value
for key in sorted(dict.keys()):
    print key, dict[key]
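The items() list can be sorted the same way. The sketch below, using made-up counts, sorts the (key, value) tuples first by key and then by value; value_of is a hypothetical helper passed through sorted()'s key argument.

```python
counts = {'alpha': 3, 'omega': 1, 'gamma': 2}  ## sample data (assumption)

## sorted() on items() orders the (key, value) tuples by key.
by_key = sorted(counts.items())
print(by_key)    ## [('alpha', 3), ('gamma', 2), ('omega', 1)]

## A key function that picks element 1 of each tuple orders by value instead.
def value_of(pair):
    return pair[1]

by_value = sorted(counts.items(), key=value_of)
print(by_value)  ## [('omega', 1), ('gamma', 2), ('alpha', 3)]
```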
There are iter variants of these methods called iterkeys(), itervalues() and iteritems() which avoid the cost of constructing the whole list -- a performance win if the data is huge. However, I generally prefer the plain keys() and values() methods with their sensible names. In Python 3 the need for the iter variants goes away: keys(), values() and items() return lightweight views instead of lists, and the iter methods no longer exist.
Dictionary Performance
From a performance point of view, the dictionary is one of your greatest tools, and you should use it where you can as an easy way to organize data. For example, you might read a log file where each line begins with an IP address, and store the data into a dict using the IP address as the key, and the list of lines where it appears as the value. Once you've read in the whole file, you can look up any IP address and instantly see its list of lines. The dictionary takes in scattered data and makes it into something coherent.
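The log-file idea above can be sketched roughly as follows. The log lines are invented for illustration, and the code assumes the IP address is the first whitespace-separated field on each line; a real script would read the lines from a file.

```python
## Made-up log lines; each begins with an IP address followed by a space.
log_lines = [
    '10.0.0.1 GET /index.html',
    '10.0.0.2 GET /about.html',
    '10.0.0.1 POST /login',
]

lines_by_ip = {}
for line in log_lines:
    ip = line.split()[0]          ## first whitespace-separated field
    if ip not in lines_by_ip:
        lines_by_ip[ip] = []      ## first time this IP is seen
    lines_by_ip[ip].append(line)

print(lines_by_ip['10.0.0.1'])
## ['10.0.0.1 GET /index.html', '10.0.0.1 POST /login']
print(len(lines_by_ip))           ## 2 distinct addresses
```

Once the dict is built, each lookup by IP address goes straight to that address's list without rescanning the log.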
Formatting using Dictionaries
The % string formatting operator works conveniently to substitute values from a dict into a string by name:
hash = {}
hash['word'] = 'garfield'
hash['count'] = 42
s = 'I want %(count)d copies of %(word)s' % hash  # %d for int, %s for string
# 'I want 42 copies of garfield'
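The same template string can be reused across several dicts. This sketch uses hypothetical order records and also shows that a name missing from the dict raises KeyError.

```python
orders = [
    {'word': 'garfield', 'count': 42},
    {'word': 'odie', 'count': 7},
]  ## hypothetical records

template = 'I want %(count)d copies of %(word)s'
messages = [template % order for order in orders]
print(messages[0])  ## I want 42 copies of garfield
print(messages[1])  ## I want 7 copies of odie

## A name used in the template but missing from the dict raises KeyError.
try:
    template % {'word': 'nermal'}
    missing_count = False
except KeyError:
    missing_count = True
print(missing_count)  ## True
```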
Tuples
A tuple is a fixed-size grouping of elements, such as an (x, y) co-ordinate. Tuples are like lists, except they are immutable and do not change size (tuples are not strictly immutable since one of the contained elements could be mutable). Tuples play a sort of "struct" role in Python -- a convenient way to pass around a little logical, fixed-size bundle of values. A function that needs to return multiple values can just return a tuple of the values. For example, if I wanted to have a list of 3-d coordinates, the natural Python representation would be a list of tuples, where each tuple is size 3 holding one (x, y, z) group.
To create a tuple, just list the values within parentheses separated by commas. The "empty" tuple is just an empty pair of parentheses. Accessing the elements in a tuple is just like a list -- len(), [ ], for, in, etc. all work the same.
tuple = (1, 2, 'hi')
print len(tuple)  ## 3
print tuple[2]    ## hi
## tuple[2] = 'bye'  ## NO, raises TypeError: tuples cannot be changed
tuple = (1, 2, 'bye')  ## this works
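To illustrate the "struct" role described above, here is a minimal sketch of a function returning two values as a tuple, a list of 3-d coordinates, and a tuple used as a dict key. The function name min_max and all of the data are hypothetical.

```python
def min_max(values):
    ## Return two results at once by packing them into a tuple.
    return (min(values), max(values))

result = min_max([4, 9, 1, 7])
print(result)        ## (1, 9)
print(result[0])     ## 1

## A list of 3-d coordinates, one (x, y, z) tuple per point.
points = [(0, 0, 0), (1, 2, 3), (4, 5, 6)]
print(len(points))   ## 3
print(points[1][2])  ## 3, the z value of the second point

## Tuples of numbers are immutable, so a point can also serve as a dict key.
labels = {(0, 0, 0): 'origin'}
print(labels[points[0]])  ## origin
```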
To create a size-1 tuple, the lone element must be followed by a comma.
tuple = ('hi',) ## size-1 tuple
It's a funny case in the syntax, but the comma is necessary to distinguish the tuple from the ordinary case of putting an expression in parentheses. In some cases you can omit the parentheses and Python will see from the commas that you intend a tuple.
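A short sketch of both cases, with variable names chosen only for illustration:

```python
not_a_tuple = ('hi')     ## just the string 'hi' in parentheses
one_tuple = ('hi',)      ## the trailing comma makes a size-1 tuple
bare_tuple = 1, 2, 'hi'  ## no parentheses, the commas still build a tuple

print(type(not_a_tuple) is str)    ## True
print(len(one_tuple))              ## 1
print(bare_tuple == (1, 2, 'hi'))  ## True
```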
Assigning a tuple to an identically sized tuple of variable names assigns all the corresponding values. If the tuples are not the same size, it raises a ValueError. This feature works for lists too.
(x, y, z) = (42, 13, "hike")
print z  ## hike
(err_string, err_code) = Foo()  ## Foo() returns a length-2 tuple
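The snippet below is a runnable sketch of the same idea. parse_status is a hypothetical stand-in for a function like Foo() that returns a length-2 tuple, and the last part shows the error raised when the sizes do not match.

```python
def parse_status():
    ## Hypothetical stand-in for a function that returns a length-2 tuple.
    return ('not found', 404)

(err_string, err_code) = parse_status()
print(err_string)  ## not found
print(err_code)    ## 404

## The same assignment works with a list on the right-hand side.
(x, y, z) = [42, 13, 'hike']
print(z)           ## hike

## Mismatched sizes raise ValueError.
try:
    (a, b) = (1, 2, 3)
    size_mismatch = False
except ValueError:
    size_mismatch = True
print(size_mismatch)  ## True
```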
Content was originally based on https://developers.google.com/edu/python/dict-files, but has been modified since. Licensed under CC BY 3.0. Code samples licensed under the Apache 2.0 License.