In this section, we look at a few of Python’s many standard utility modules to solve common problems.
The os and os.path modules include many functions to interact with the file system. The shutil module can copy files.
- os — Python 2.7 documentation
- filenames = os.listdir(dir) -- list of filenames in that directory path (not including . and ..). The filenames are just the names in the directory, not their absolute paths.
- os.path.join(dir, filename) -- given a filename from the above list, use this to put the dir and filename together to make a path
- os.path.abspath(path) -- given a path, return an absolute form, e.g. /home/nick/dir/foo.txt
- os.path.dirname(path), os.path.basename(path) -- given dir/foo/bar.html, return the dirname "dir/foo" and basename "bar.html"
- os.path.exists(path) -- True if the path exists
- os.mkdir(dir_path) -- makes one dir; os.makedirs(dir_path) makes all the needed dirs in this path
- shutil.copy(source_path, dest_path) -- copy a file (the directories in the dest path should already exist)
    import os

    ## Example pulls filenames from a dir, prints their relative and absolute paths
    def printdir(dir):
        filenames = os.listdir(dir)
        for filename in filenames:
            print filename  ## foo.txt
            print os.path.join(dir, filename)  ## dir/foo.txt (relative to current dir)
            print os.path.abspath(os.path.join(dir, filename))  ## /home/nick/dir/foo.txt
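To see how several of the other calls fit together, here is a minimal sketch that copies a file into a backup directory, creating the directory first if it is missing. The function name backup_file and the paths in the usage note are hypothetical, chosen only for illustration. Unlike the Python 2 listings in this section, the sketch uses print-free, version-neutral code so it should run on both Python 2.7 and Python 3.

```python
import os
import shutil


def backup_file(src_path, backup_dir):
    """Copy src_path into backup_dir, creating backup_dir if needed.

    Hypothetical helper for illustration; returns the path of the copy.
    """
    if not os.path.exists(backup_dir):
        os.makedirs(backup_dir)  # creates any missing parent dirs too
    # Keep the original file name, but place it under backup_dir
    dest_path = os.path.join(backup_dir, os.path.basename(src_path))
    shutil.copy(src_path, dest_path)
    return dest_path
```

For example, backup_file('dir/foo.txt', 'backups/today') would create backups/today if necessary and copy foo.txt into it. os.makedirs is used rather than os.mkdir because the backup path may have more than one missing level.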
Exploring a module works well with the built-in Python help() and dir() functions. In the interpreter, do an import os, and then use these commands to look at what's available in the module: dir(os), help(os.listdir), dir(os.path).
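The same kind of exploration can be done from a script. The sketch below is only an illustration: it filters the names returned by dir() and prints a docstring directly, which is the text help() would display. The variable name public_names is made up for this example.

```python
from __future__ import print_function

import os

# dir() returns a list of attribute names as strings; names starting with
# an underscore are conventionally private, so filter them out.
public_names = [name for name in dir(os.path) if not name.startswith('_')]
print(public_names)

# help(os.listdir) prints formatted documentation; the raw text is also
# available as the function's __doc__ attribute.
print(os.listdir.__doc__)
```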
The subprocess module is a simple way to run an external command and capture its output.
- subprocess module docs
- status = subprocess.call(cmd, shell=True) -- runs the command, waits for it to exit, and returns its status int. Any output from the command is printed as usual. The status will be non-zero if the command failed.
- status = subprocess.call(cmd, shell=True, stdout=file1, stderr=file2) -- same as above, but captures the command's output and error into files
    import sys
    import subprocess

    ## Given a dir path, run an external 'ls -l' on it --
    ## shows how to call an external program
    def listdir(dir):
        cmd = 'ls -l ' + dir
        print "Command to run:", cmd  ## good to debug cmd before actually running it

        ## Run the command while capturing output and error
        f1 = open('tmp', 'w')
        f2 = open('tmp2', 'w')
        status = subprocess.call(cmd, shell=True, stdout=f1, stderr=f2)
        f1.close()
        f2.close()
        f1 = open('tmp', 'r')
        f2 = open('tmp2', 'r')

        ## print output and error
        if status:  ## Error case, print error and exit
            print "There was an error: "
            print f2.read()
            sys.exit(status)
        print f1.read()  ## print the captured output
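As an alternative to the temporary files, offered as a sketch rather than a replacement for the example above, subprocess.Popen with stdout=subprocess.PIPE and stderr=subprocess.PIPE hands the output back directly from communicate(). Passing the command as a list, without shell=True, also avoids quoting problems when an argument contains spaces. The helper name run_command is hypothetical, and the demo runs the current Python interpreter instead of ls so that it does not depend on a particular operating system.

```python
from __future__ import print_function

import subprocess
import sys


def run_command(args):
    """Run args (a list of strings); return (status, output, error).

    Hypothetical helper for illustration. On Python 3, output and error
    are byte strings.
    """
    proc = subprocess.Popen(args, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
    output, error = proc.communicate()  # waits for the command to exit
    return proc.returncode, output, error


# Demo: use the Python interpreter itself as the external command.
status, output, error = run_command([sys.executable, '-c', 'print("hello")'])
if status:
    print('There was an error:', error)
else:
    print(output)
```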
An exception represents a run-time error that halts the normal execution at a particular line and transfers control to error handling code. This section just introduces the most basic uses of exceptions. For example, a run-time error might be that a variable used in the program does not have a value (NameError; you've probably seen that one a few times), or that a file open operation fails because the file does not exist (IOError). Learn more in the exceptions tutorial and see the list of Built-in Exceptions.
Without any error handling code (as we have done thus far), a run-time exception just halts the program with an error message. That’s a good default behavior, and you’ve seen it many times. You can add a
try-except structure to your code to handle exceptions, like this:
    try:
        ## Either of these two lines could throw an IOError, say
        ## if the file does not exist or the read() encounters a low level error.
        f = open(filename, 'rU')
        text = f.read()
        f.close()
    except IOError:
        ## Control jumps directly to here if any of the above lines throws IOError.
        sys.stderr.write('problem reading:' + filename)
    ## In any case, the code then continues with the line after the try/except
The try section contains the code that might throw an exception. The except section holds the code to run if there is an exception. If there is no exception, the except section is skipped (that is, that code is for error handling only, not the "normal" case for the code). You can get a reference to the exception object itself with the syntax except IOError as e: (e then refers to the exception object).
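A minimal sketch of the except IOError as e: form follows. The function name read_file is hypothetical; the strerror attribute it uses is a standard field on IOError objects that holds the operating system's description of the failure. The code is written to run on both Python 2.7 and Python 3, where IOError is an alias of OSError.

```python
import sys


def read_file(filename):
    """Return the text of filename, or None if it cannot be read.

    Hypothetical helper for illustration.
    """
    try:
        with open(filename, 'r') as f:
            return f.read()
    except IOError as e:
        # e is the exception object; strerror describes what went wrong
        sys.stderr.write('problem reading %s: %s\n' % (filename, e.strerror))
        return None
```

Returning None on failure is just one possible design; a caller that cannot continue without the file might prefer to let the exception propagate.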
The urllib module provides url fetching, making a url look like a file you can read from. The urlparse module can take apart and put together urls.
- urllib — Python 2.7 documentation
- ufile = urllib.urlopen(url) -- returns a file-like object for that url
- text = ufile.read() -- can read from it, like a file (readlines() etc. also work)
- info = ufile.info() -- the meta info for that request. info.gettype() is the mime type, e.g. 'text/html'
- baseurl = ufile.geturl() -- gets the "base" url for the request, which may be different from the original because of redirects
- urllib.urlretrieve(url, filename) -- downloads the url data to the given file path
- urlparse.urljoin(baseurl, url) -- given a url that may or may not be full, and the baseurl of the page it comes from, return a full url. Use geturl() above to provide the base url.
In Python 3, urllib and urllib2 are merged into urllib.request, and urlparse becomes urllib.parse.
    import urllib

    ## Given a url, try to retrieve it. If it's text/html,
    ## print its base url and its text.
    def wget(url):
        ufile = urllib.urlopen(url)  ## get file-like object for url
        info = ufile.info()  ## meta-info about the url content
        if info.gettype() == 'text/html':
            print 'base url:' + ufile.geturl()
            text = ufile.read()  ## read all its text
            print text
The above code works fine, but does not include error handling if a url does not work for some reason. Here’s a version of the function which adds
try-except logic to print an error message if the url operation fails.
    import urllib

    ## Version that uses try/except to print an error message if the
    ## urlopen() fails.
    def wget2(url):
        try:
            ufile = urllib.urlopen(url)
            if ufile.info().gettype() == 'text/html':
                print ufile.read()
        except IOError:
            print 'problem reading url:', url
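The urljoin() function listed earlier needs no network access, so it is easy to try on its own. The sketch below shows how a relative, a root-relative, and a full url are each resolved against a base url. The example.com addresses are placeholders for illustration only, and the import falls back to urllib.parse so the same code runs on Python 3.

```python
from __future__ import print_function

try:
    from urlparse import urljoin  # Python 2
except ImportError:
    from urllib.parse import urljoin  # Python 3

# Placeholder base url, standing in for the value geturl() would return.
baseurl = 'http://example.com/dir/page.html'

# Relative url: resolved against the directory of the base page
print(urljoin(baseurl, 'images/pic.png'))  # http://example.com/dir/images/pic.png

# Root-relative url: resolved against the host
print(urljoin(baseurl, '/top.html'))  # http://example.com/top.html

# Full url: returned unchanged
print(urljoin(baseurl, 'http://other.example.com/x.html'))  # http://other.example.com/x.html
```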
Content was originally based on https://developers.google.com/edu/python/utilities, but has been modified since. Licensed under CC BY 3.0. Code samples licensed under the Apache 2.0 License.