19 Jan 2012

Taming AppEngine Projects with Buildout

AppEngine is limited in ways many Python programmers are unprepared for, and its dependency management options are no exception. Unlike services like Heroku which support standard tools such as pip, AppEngine's dependency management amounts to "throw everything into the lib directory." Seemingly to drive this point home, Google's own dependency—the AppEngine SDK—is distributed as a zip file utterly uninstallable by any standard Python tool.

Because it's not easy to set up a maintainable AppEngine environment, most projects won't have one, and such is the case with the AppEngine project I've been working on for the past year. Today, we have a Franken-project consisting of:

  1. A mix-and-match lib directory of modified and outdated external dependencies and in-house modules
  2. A requirements.txt used by pip to install development-only dependencies we don't want on AppEngine
  3. One lonely Git submodule created in an attempt to find a better way to manage dependencies that has caused many more problems than it solved
  4. A pages-long tutorial on getting everything working properly, including the various path files required by AppEngine's SDK dependencies

Thanks to this complexity, many people can't keep an environment in working order. Some resort to their own solutions (MacPorts, etc.), others simply avoid those parts that don't work (like the test runner!), and almost everybody spends hours setting up (and later fixing) the environment.

Clearly, something had to be done. It was time to bring out the big guns.

Enter Buildout

Buildout is:

...a Python-based build system for creating, assembling and deploying applications from multiple parts, some of which may be non-Python-based. It lets you create a buildout configuration and reproduce the same software later.

Like many things to come out of the Zope community, Buildout is a very useful and powerful tool, provided you have the requisite patience to grok it. Buildouts are composed of recipes, themselves Python classes conforming to the Recipe API. Countless recipes are available on PyPI; they can also be installed locally. Recipes abstract common operations and provide customization hooks via configuration options. A single buildout and all its recipe configurations (parts) are defined in a buildout.cfg file.
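To make the recipe idea concrete, here's a minimal sketch of what one looks like. The constructor signature and the install/update methods follow the Recipe API; the MkdirRecipe class and its path option are hypothetical, invented for illustration, and plain dicts stand in for the configuration objects buildout would normally pass in.

```python
import os


class MkdirRecipe(object):
    """Hypothetical recipe: creates the directory named by its `path` option."""

    def __init__(self, buildout, name, options):
        # buildout: the whole configuration, keyed by section name.
        # name: the name of the part using this recipe.
        # options: that part's own options.
        self.name = name
        self.path = os.path.join(
            buildout['buildout']['directory'], options['path'])

    def install(self):
        # Runs when the part is first installed (or its config changed).
        # The returned paths are what buildout removes on uninstall.
        if not os.path.isdir(self.path):
            os.makedirs(self.path)
        return [self.path]

    def update(self):
        # Runs on later buildout runs when the part's config is unchanged.
        return self.install()
```

A real recipe is also registered under a zc.buildout entry point in its setup.py so that recipe = ... lines can find it; that part is omitted here.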

The following guide results in (and/or assumes) this directory structure:

</path/to/repo>
|+lib/ -- Your static dependencies
|+distlib/ -- Generated by buildout
|+parts/
| |+google_appengine/ -- Generated by buildout
|+myapp/ -- Your application package
| |+lib/ -- Used by AppEngine; populated by buildout
|-bootstrap.py -- Downloaded script; creates buildout env

To get started, let's identify our core recipes:

  • z3c.recipe.scripts — Installs eggs, scripts, interpreters, and kitchen sinks
  • appfy.recipe.gae — Downloads the AppEngine SDK and scripts; generates a deployable lib directory from eggs
  • cns.recipe.symlink — Symlinks (except on Windows); this currently links to a tarball of my fork which supports a convenience mode for "symlink all files in these directories."

Let's cover each section of buildout.cfg in turn. First, the main [buildout]:

[buildout]
parts =
    py
    app
    gae_sdk
    gae_tools
    symlink
    test
app-lib = ${buildout:directory}/lib
app-src = myapp
develop = recipe_symlink

The parts declaration lists the recipe sections we want included in a default buildout install, while app-lib and app-src are just variables referenced throughout the file. The develop declaration ensures my custom recipe is available to buildout (assuming you saved it as recipe_symlink in the root of your repository).
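If the ${section:option} references look opaque, this rough Python 3 sketch may help. It leans on the standard library's configparser, whose ExtendedInterpolation happens to use the same syntax. It is not how buildout itself parses the file, and the directory value is hard-coded here as an assumption (buildout fills it in for you).

```python
from configparser import ConfigParser, ExtendedInterpolation

# A trimmed-down config; `directory` is normally provided by buildout.
CFG = """
[buildout]
directory = /path/to/repo
parts =
    py
    app
app-lib = ${buildout:directory}/lib
app-src = myapp

[symlink]
symlink_target = ${buildout:directory}/${buildout:app-src}/lib
"""

parser = ConfigParser(interpolation=ExtendedInterpolation())
parser.read_string(CFG)

# Multi-line values such as `parts` are whitespace-separated lists.
parts = parser["buildout"]["parts"].split()

# References are expanded when the value is read.
app_lib = parser["buildout"]["app-lib"]
target = parser["symlink"]["symlink_target"]

print(parts)
print(app_lib)
print(target)
```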

Next, we set up our custom interpreter from which other recipes can inherit to get the proper path info:

[py]
recipe = z3c.recipe.scripts:interpreter
executable = ${buildout:directory}/bin/py
interpreter = py
extra-paths =
    ${buildout:directory}/${buildout:app-src}
    ${buildout:directory}/${buildout:app-src}/lib
    ${gae_sdk:destination}/google_appengine
    ${gae_sdk:destination}/google_appengine/lib/antlr3
    ${gae_sdk:destination}/google_appengine/lib/django_0_96
    ${gae_sdk:destination}/google_appengine/lib/fancy_urllib
    ${gae_sdk:destination}/google_appengine/lib/ipaddr
    ${gae_sdk:destination}/google_appengine/lib/protorpc
    ${gae_sdk:destination}/google_appengine/lib/webob
    ${gae_sdk:destination}/google_appengine/lib/yaml/lib
    ${gae_sdk:destination}/google_appengine/lib/simplejson
    ${gae_sdk:destination}/google_appengine/lib/graphy

Nothing special here, just setting up a custom interpreter at bin/py and manually setting all the SDK library paths. One important thing to note: although we will be using python = py in other parts, this does not mean scripts generated by those sections will have the same eggs and paths as [py]! You will want to use extends = py (elaborated on at the end of this guide) to ensure your path and egg access stays consistent.
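In plain Python terms, the effect of extra-paths is roughly the following. This is only a sketch of the outcome, not what z3c.recipe.scripts actually generates; SDK_ROOT and both helper functions are names I made up, and the location assumes the SDK was unpacked into parts/google_appengine.

```python
import os
import sys

# Assumption: the SDK location relative to the repository root.
SDK_ROOT = os.path.join("parts", "google_appengine")

# The SDK's bundled libraries listed under extra-paths in [py].
SDK_LIBS = [
    "antlr3", "django_0_96", "fancy_urllib", "ipaddr", "protorpc",
    "webob", os.path.join("yaml", "lib"), "simplejson", "graphy",
]


def sdk_paths(sdk_root=SDK_ROOT):
    """Return the SDK root followed by each bundled library directory."""
    paths = [sdk_root]
    paths.extend(os.path.join(sdk_root, "lib", name) for name in SDK_LIBS)
    return paths


def extend_sys_path(paths):
    """Prepend any paths not already on sys.path, keeping their order."""
    new = [p for p in paths if p not in sys.path]
    sys.path[0:0] = new
    return new
```

Maintaining that list by hand in every script is exactly the chore the [py] part takes off your plate.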

Next up, our application libraries (uploaded to AppEngine):

[app]
recipe = appfy.recipe.gae:app_lib
python = py
lib-directory = ${buildout:directory}/distlib
use-zipimport = false
eggs =
    beaker==1.6.2
    gdata==2.0.14
find-links =
    http://gdata-python-client.googlecode.com/files/gdata-2.0.14.zip
ignore-globs =
    *.c
    *.pyc
    *.pyo
    */test
    */tests
    */testsuite
    */django
ignore-packages =
    setuptools
    easy_install
    site

Here we tell buildout to install beaker and an outdated version of the gdata library (which we have to locate via a direct link) into the distlib/ directory. We also tell the recipe to ignore a bunch of file patterns and packages that we don't want uploaded, but which may be installed as a side effect of installing our eggs.
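To get a feel for what the ignore-globs patterns catch, here's a small sketch using the standard library's fnmatch. It only illustrates glob matching against relative paths; it is not the recipe's own filtering code, and is_ignored and filter_paths are hypothetical helpers.

```python
import fnmatch

# The patterns from the [app] part above.
IGNORE_GLOBS = [
    "*.c", "*.pyc", "*.pyo",
    "*/test", "*/tests", "*/testsuite", "*/django",
]


def is_ignored(path, patterns=IGNORE_GLOBS):
    """True if the path matches any ignore pattern (case-sensitive)."""
    return any(fnmatch.fnmatchcase(path, p) for p in patterns)


def filter_paths(paths, patterns=IGNORE_GLOBS):
    """Keep only the paths that no pattern matches."""
    return [p for p in paths if not is_ignored(p, patterns)]


candidates = [
    "beaker/cache.py",
    "beaker/cache.pyc",
    "beaker/tests",
    "gdata/_speedups.c",
]
kept = filter_paths(candidates)
print(kept)
```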

Next up, download the SDK and install its scripts:

[gae_sdk]
recipe = appfy.recipe.gae:sdk
url = http://googleappengine.googlecode.com/files/google_appengine_1.6.1.zip
hash-name = false
clear-destination = true

[gae_tools]
recipe = appfy.recipe.gae:tools
extra-paths =
    ${buildout:directory}/${buildout:app-src}/lib

Finally, symlink all the library directories' contents:

[symlink]
recipe = cns.recipe.symlink
symlink_target = ${buildout:directory}/${buildout:app-src}/lib
symlink_base =
    ${buildout:app-lib}
    ${app:lib-directory}
ignore = README*
autocreate = true
bulk = True

Our goal is to symlink everything in distlib/ (created by buildout) and lib/ (created by us; part of our project's repository) into myapp/lib so that it will be uploaded to AppEngine when we do bin/appcfg update.
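For the curious, the bulk mode boils down to something like the sketch below. This is my rough Python equivalent of the intended behavior, not the recipe's actual code; symlink_all is a hypothetical helper, and skipping entries that already exist in the target is an assumption of the sketch.

```python
import fnmatch
import os


def symlink_all(base_dirs, target_dir, ignore="README*"):
    """Symlink every entry of each base directory into target_dir."""
    if not os.path.isdir(target_dir):
        os.makedirs(target_dir)  # mirrors autocreate = true
    linked = []
    for base in base_dirs:
        for name in sorted(os.listdir(base)):
            if fnmatch.fnmatchcase(name, ignore):
                continue  # mirrors ignore = README*
            link = os.path.join(target_dir, name)
            if os.path.lexists(link):
                continue  # assumption: leave existing entries alone
            os.symlink(os.path.join(os.path.abspath(base), name), link)
            linked.append(name)
    return linked
```

Calling symlink_all(["lib", "distlib"], "myapp/lib") from the repository root would then mirror the [symlink] part.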

As a bonus, z3c.recipe.scripts parts can extend one another, making it easy to customize the creation of individual scripts while still retaining the proper eggs and paths. Here's one that installs a slightly modified nosetests into bin/test:

[test]
recipe = z3c.recipe.scripts
python = py
extends = py
scripts = nosetests=test
initialization =
    import os
    os.chdir('${buildout:directory}/${buildout:app-src}/tests')

If you're confused by all these configuration sections and their seemingly arbitrary options, fear not: most Buildout recipes are well-documented on PyPI, and missing options can generally be located in the release notes. Once you get the hang of Buildout, you'll love its flexibility and Python-powered recipes, most of which have already been written for you. I don't know if I'll always pick it over the simpler pip/virtualenv combo, but it's a damn fine tool for solving problems those cannot.

Next Steps

This guide intentionally glosses over numbers (2) and (3) in the original Franken-project list. The second item, regarding development dependencies, is solved by another recipe section, [develop], which extends [py] and installs a bunch of eggs. To replace submodules I found mr.developer, an incredibly handy buildout extension which supports arbitrary checkouts, even of code lacking setup.py! Here's a line from our sources:

repo_lib = git git://github.com/foo/repo_lib.git rev=2c989622 egg=false

That grabs revision 2c989622 and drops the directory into lib/. It doesn't get easier than that!
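If the shape of that sources line isn't obvious, it reads as name = kind url followed by key=value options. Here's a tiny sketch that pulls it apart; parse_source is a hypothetical helper for illustration only, not mr.developer's parser.

```python
def parse_source(line):
    """Split a 'name = kind url key=value ...' line into its pieces."""
    # Only the first '=' separates the name from the rest.
    name, _, spec = line.partition("=")
    fields = spec.split()
    kind, url = fields[0], fields[1]
    # Whatever remains is a list of key=value options.
    options = dict(field.split("=", 1) for field in fields[2:])
    return {
        "name": name.strip(),
        "kind": kind,
        "url": url,
        "options": options,
    }


source = parse_source(
    "repo_lib = git git://github.com/foo/repo_lib.git rev=2c989622 egg=false"
)
print(source)
```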

Tagged: python appengine buildout