Python Modules and Command-line execution

July 26, 2018

Python source code

Python source files use the .py extension and are called modules. Given a Python module hello.py, the easiest way to run it is with the shell command python hello.py Alice, which calls the Python interpreter to execute the code in hello.py and passes it the command-line argument Alice. See the official docs page on all the different options you have when running Python from the command line.

Here’s a very simple hello.py program (notice that blocks of code are delimited strictly using indentation rather than curly braces — more on this later!):

#!/usr/bin/env python

# import modules used here -- sys is a very standard one
import sys

# Gather our code in a main() function
def main():
    # Command line args are in sys.argv[1], sys.argv[2] ...
    # sys.argv[0] is the script name itself and can be ignored
    print('Hello there ' + sys.argv[1])

# Standard boilerplate to call the main() function to begin
# the program.
if __name__ == '__main__':
    main()

Running this program from the command line looks like:

$ python hello.py Guido
Hello there Guido
$ ./hello.py Alice  ## without needing 'python' first (Unix, after chmod +x hello.py)
Hello there Alice

Command-line arguments and ’main’

The outermost statements in a Python file, or module, do its one-time setup: those statements run from top to bottom the first time the module is imported somewhere, setting up its variables and functions. A Python module can be run directly, for example with python hello.py Bob, or it can be imported and used by some other module. When a Python file is run directly, the special variable __name__ is set to the string '__main__'. Therefore, it’s common to have the boilerplate if __name__ == '__main__' shown above to call a main() function when the module is run directly, but not when the module is imported by some other module.
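
To make the role of __name__ concrete, here is a minimal sketch. The helper how_was_i_loaded() is a hypothetical name introduced only for illustration, and json merely stands in for "some imported module"; neither comes from the original text.

```python
import json  # any importable module would do; json is just an example


def how_was_i_loaded(name):
    """Describe a module based on the value of its __name__ variable."""
    if name == '__main__':
        return 'run directly'
    return 'imported as ' + name


# An imported module sees its own module name in __name__.
print(how_was_i_loaded(json.__name__))

# This file's own __name__ depends on how the file was started.
print(how_was_i_loaded(__name__))
```

The first line printed is "imported as json". The second reads "run directly" when the file is launched with the python command, and "imported as" followed by the file's module name when some other module imports it.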

In a standard Python program, the list sys.argv contains the command-line arguments in the standard way, with sys.argv[0] being the program itself, sys.argv[1] the first argument, sys.argv[2] the second argument, and so on. If you know about argc, the number of arguments, you can get this value from Python with len(sys.argv), the same way len() gives you the length of a string. In general, len() can tell you how long a string is, the number of elements in lists and tuples (another array-like data structure), and the number of key-value pairs in a dictionary.
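
The following sketch shows one way to check the argument count before indexing into the list. The function name greet() and the usage message are assumptions made for this example; explicit lists stand in for sys.argv so the behavior is easy to see.

```python
def greet(argv):
    """Build a greeting from an argv-style list (argv[0] is the script name)."""
    if len(argv) < 2:
        # Only the script name is present, so no argument was supplied.
        return 'usage: ' + argv[0] + ' NAME'
    return 'Hello there ' + argv[1]


# Explicit lists stand in for what sys.argv would hold.
print(greet(['hello.py', 'Alice']))   # Hello there Alice
print(greet(['hello.py']))            # usage: hello.py NAME

# len() works the same way on other containers.
print(len('hello'))                   # 5
print(len(['a', 'b', 'c']))           # 3
print(len({'x': 1, 'y': 2}))          # 2
```

In a real script you would import sys and call greet(sys.argv). Note that len(sys.argv) counts the script name as well, so a program given one argument sees a length of 2.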

More on Modules and their Namespaces

Suppose you’ve got a module binky.py which contains a def foo(). The fully qualified name of that foo function is binky.foo. In this way, various Python modules can name their functions and variables whatever they want, and the variable names won’t conflict — module1.foo is different from module2.foo. In the Python vocabulary, we’d say that binky, module1, and module2 each have their own namespaces, which as you can guess are variable name-to-object bindings.
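
As a small illustration using real modules, the standard library's math and cmath both define a function named sqrt. They are used here only as stand-ins for module1 and module2; the text above does not mention them.

```python
import cmath
import math

# Both modules define a function called sqrt, but each lives in its own namespace.
print(math.sqrt(4))     # 2.0
print(cmath.sqrt(-4))   # 2j

# The two names refer to different function objects, so they never collide.
print(math.sqrt is cmath.sqrt)   # False
```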

For example, the standard sys module contains some standard system facilities, like the argv list and the exit() function. After the statement import sys, you can access the definitions in the sys module by their fully-qualified names, e.g. sys.exit(). (Yes, sys has a namespace too!)

import sys
# Now can refer to sys.xxx facilities
sys.exit(0)

There is another import form that looks like this: from sys import argv, exit. That makes argv and exit() available by their short names; however, we recommend the original form with the fully-qualified names because it’s a lot easier to determine where a function or attribute came from.
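
The two import forms can be compared side by side. This sketch uses math.sqrt instead of sys.exit so that the example keeps running; that substitution is a choice made for this illustration.

```python
import math
from math import sqrt

# Both spellings reach the same function object.
print(sqrt is math.sqrt)    # True

# The fully-qualified call shows at a glance where sqrt comes from.
print(math.sqrt(9))         # 3.0

# The short name works too, but its origin is only visible at the import line.
print(sqrt(9))              # 3.0
```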

There are many modules and packages which are bundled with a standard installation of the Python interpreter, so you don’t have to do anything extra to use them. These are collectively known as the Python Standard Library. Commonly used modules/packages include:

sys — access to exit(), argv, stdin, stdout, …
re — regular expressions
os — operating system interface, file system

You can find the documentation of all the Standard Library modules and packages at: The Python Standard Library — Python 2.7 documentation.
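
Here is a brief taste of the three modules listed above, one call from each. The sample strings are made up for this sketch.

```python
import os
import re
import sys

# sys: the major version number of the running interpreter.
print(sys.version_info[0])

# re: pull the words out of a string with a regular expression.
print(re.findall(r'\w+', 'Hello there, Guido'))   # ['Hello', 'there', 'Guido']

# os: split a file name into its base name and extension.
print(os.path.splitext('hello.py'))               # ('hello', '.py')
```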

Incremental Development

When you build a Python program, don’t write the whole thing in one step. Instead, identify just a first milestone, e.g. “well the first step is to extract the list of words.” Write the code to get to that milestone, print your data structures at that point, and then call sys.exit(0) so the program does not run ahead into its not-done parts. Once the milestone code is working, you can work on code for the next milestone. Being able to look at the printout of your variables at one state can help you think about how you need to transform those variables to get to the next state. Python is very quick with this pattern, allowing you to make a little change and run the program to see how it works. Take advantage of that quick turnaround to build your program in little steps.
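
A minimal sketch of this pattern might look as follows. The function extract_words() and the way it splits text are assumptions made for illustration; the point is the print followed by sys.exit(0) at the milestone.

```python
import sys


def extract_words(text):
    """Milestone 1: turn raw text into a list of lowercase words."""
    return text.lower().split()


def main(text):
    words = extract_words(text)
    # Inspect the data structure before building anything on top of it.
    print(words)
    # Stop here so the unfinished part below never runs.
    sys.exit(0)

    # Milestone 2 (not written yet) would start here.
    raise NotImplementedError('next milestone is not written yet')
```

Calling main('The quick brown Fox') prints ['the', 'quick', 'brown', 'fox'] and then exits with status 0. Once the printout looks right, remove the sys.exit(0) line and start on the next milestone.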

Content was originally based on https://developers.google.com/edu/python/introduction, but has been modified since. Licensed under CC BY 3.0. Code samples licensed under the Apache 2.0 License.