Developer Guide

“Use the SOURCE, Luke.”

Source Tour

Pave is GPLv3+ licensed and may also become available under a commercial-friendly proprietary license in the future.

Follow along at Bitbucket if you’d like. There is an executable script, typically installed to /usr/local/bin/pave, that does little but hand things off to the main module pave/

This main module gets everything rolling...

  • In load_data(), the pavefile is parsed and validated via pave/ using the voluptuous library.

  • The task(s) are then handed off to fabric for execution at the end of main(); each host gets steamroll()’ed.

  • Each remote host is first inspected with pave/ A json blob is cached in the .cache folder and returned with the results.

  • With inspector results in hand, a set of platform-dependent shell commands and settings (as specified in pave/) are chosen.

    This is the place to start on new platform support.

  • The pave/lib/ folder contains the modules that implement items under the tasks: section of the pavefile.

    This is the place to add modules to support new functionality.
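As a rough illustration of the platform-selection step in the tour above, the sketch below picks a set of command strings from inspector-style results. Every name in it (the Debian and RedHat classes, the pkg_install attribute, the 'distro' key, choose_platform) is a hypothetical stand-in and not pave's actual definition; only the general idea of "inspector results in, platform object out" comes from the text.

```python
class Platform:
    """Base class: command strings use printf-style %s placeholders."""
    name = 'generic'
    pkg_install = None


class Debian(Platform):
    name = 'debian'
    pkg_install = 'apt-get install -y %s'


class RedHat(Platform):
    name = 'redhat'
    pkg_install = 'yum install -y %s'


# Hypothetical registry, keyed on a value the inspector might report.
PLATFORMS = {cls.name: cls for cls in (Debian, RedHat)}


def choose_platform(details):
    """Return the platform class matching the inspector results."""
    distro = details.get('distro', '').lower()
    try:
        return PLATFORMS[distro]
    except KeyError:
        raise LookupError('unsupported platform: %r' % distro)


# Pretend inspector output for one host.
cmds = choose_platform({'distro': 'Debian', 'arch': 'x86_64'})
command = cmds.pkg_install % 'nginx'
```

Under this layout, supporting a new platform would mean adding one more class with the same attribute names, so task modules never need to know which platform they are talking to.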

How to Write a Task Module

So, you’d like to create a module to simplify an ugly or tedious task? This is the right place. Task modules are written in Python and run locally, but generally speak shell script at the remote end. (If you don’t know either language, there’s still hope: an alternative is described in the FAQ - (Not So) Frequently Asked Questions.)

Let’s get started:

  1. You’ll probably want to copy, or at least refer to, the existing module in pave/lib that is most similar to the one you’d like to write.

    If the module will do a lot of repetitive tiny tasks, you may want the primary data structure to be a list. For a single task with lots of options, a mapping is a good choice. Keep this in mind when choosing a module to copy from.

  2. The module needs the following items:

    • a handler function of the following signature:

      def handle(data, cmds, context):
      • data is the task’s parsed structure passed in from the pavefile. The convention for mapping/dict keys is to use dashes instead of spaces, as in CSS, e.g. foo-bar:

      • cmds is the platform information object chosen by the inspector, the attributes of which should be used in place of command strings.

      • context is a dictionary that keeps track of properties such as the current user, whether to use sudo, and if so, as what user, etc.
        Make a copy if you need to modify it per-task:

        for task in tasks:
            mycon = context.copy()      # prevents spills
    • To check against data-entry errors, the passed data structure should have a robust schema definition (and validator functions), the building blocks of which can be imported from voluptuous.

    • Generally speaking, if the data is a list you’ll want to loop over it; if it is a dictionary, you might .get() or .pop() each member as you go.

    • There are a few helpful variables pave attaches to fabric that are accessible by importing its env variable:

      env.pave_platform           # the platform obj also known as cmds
      env.pave_platform_details   # a dict holding system parameters
      env.pave_vars               # a dict holding pavefile vars:
      env.pave_passwords          # a dict holding pavefile passwords:
      # configures pave to stop at the first error:
      env.pave_raise_errs         # set by /main/warn-only
      # makes copies of original files before changing them:
      env.pave_bak                # set by /main/bak-files
  3. Use the helper functions in pave/ whenever possible. They handle common situations for modules.

    runcmd() is probably the most important, as it wraps fabric operations. In the example below, cmds provides the command line for the current platform, and the arguments for printf-style %s string expansion follow it. Finally, context provides information that runcmd uses to make decisions:

    result = runcmd(cmds.pkg_upgrd_time, expires, **context)
    • A note about command strings: occasionally, string formatting with positional args is not sufficient, and the need arises to add or drop whole parameters based on vars, similar to Python keyword arguments. This is possible through what the source calls (for lack of a better term) “chunking”, referring to the chunks in the command line:

      createdb [--locale] [--encoding] [--tablespace]     # kwargs
      createuser [--encrypted|no-encrypted]               # boolean

      Chunking can be enabled via context.chunking = True and can be seen in the postgres, users, and groups modules, for example.

  4. Try to reuse the other modules if possible. For example, the postgres: module leverages the configure: module for its config file modifications.

  5. If a task is about to make a change, log it with log.change(). If the task should be skipped, log it with log.skip() instead.

  6. Keep a running tab of these “events,” and return the number of changes and/or errors (in that order) generated during the module handler run.

  7. When satisfied with the module, run pyflakes and/or pylint against it to light up dark corners. There is a in the root of the project to help with this. This would also be a good time to write a few tests for it.
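To tie the steps above together, here is a minimal sketch of what a list-style handler might look like. It is not taken from pave's source: DemoLog, DemoCmds, the dir_create attribute, REMOTE_DIRS, and this runcmd() are all stand-ins invented so the example runs on its own, and returning a (changes, errors) tuple is an assumption about how "in that order" is expressed. In a real module you would use pave's own log and runcmd() helpers instead.

```python
class DemoLog:
    """Stand-in for pave's log; records events instead of printing them."""

    def __init__(self):
        self.events = []

    def change(self, msg):
        self.events.append(('change', msg))

    def skip(self, msg):
        self.events.append(('skip', msg))

    def error(self, msg):
        self.events.append(('error', msg))


class DemoCmds:
    """Hypothetical platform object; the attribute name is invented."""
    dir_create = 'mkdir -p %s'


log = DemoLog()
REMOTE_DIRS = {'/srv/www'}     # pretend state of the remote host
RAN = []                       # command lines the stand-in "executed"


def runcmd(cmd, *args, **context):
    """Stand-in for pave's runcmd(): expand the %s placeholders and record
    the result. Nothing is executed, and context is ignored here."""
    cmdline = cmd % args
    RAN.append(cmdline)
    return cmdline


def handle(data, cmds, context):
    """Create each directory listed in data; return (changes, errors)."""
    changes = errors = 0
    for path in data:                       # list-style data: loop over it
        mycon = context.copy()              # per-task copy prevents spills
        mycon['use_sudo'] = path.startswith('/srv')   # local tweak only
        if not path.startswith('/'):
            log.error('not an absolute path: %s' % path)
            errors += 1
        elif path in REMOTE_DIRS:
            log.skip('%s already exists' % path)
        else:
            log.change('creating %s' % path)
            runcmd(cmds.dir_create, path, **mycon)
            changes += 1
    return changes, errors


context = {'user': 'deploy', 'use_sudo': False}
result = handle(['/srv/www', '/srv/app', 'relative/dir'], DemoCmds, context)
```

The handler never builds a command string itself; it only passes an attribute of cmds plus arguments to runcmd(), which is what keeps the module platform-independent.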

  • A final note: be careful about importing pave.main, pave.schema, or the handle functions of other library modules in new modules. Doing so can create circular dependencies, as the schema module imports all modules at startup looking for schemas. Instead, import these inside each function that needs them, which delays the import until the function is called.
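The circular-import problem and the deferred-import workaround can be reproduced in isolation. The sketch below writes two pairs of throwaway modules to a temporary directory; the demo_* module names are made up for the demonstration and only mimic the relationship described above (a schema module that imports every task module at startup).

```python
import importlib
import os
import sys
import tempfile
import textwrap

tmp = tempfile.mkdtemp()
sys.path.insert(0, tmp)


def write_module(name, source):
    """Write a small module into the temporary directory."""
    with open(os.path.join(tmp, name + '.py'), 'w') as f:
        f.write(textwrap.dedent(source))


# Broken pair: the task module imports from the schema module at the top
# level, while the schema module is still only half initialized.
write_module('demo_schema_bad', '''
    import demo_task_bad
    SCHEMAS = {'demo': demo_task_bad.schema}
''')
write_module('demo_task_bad', '''
    from demo_schema_bad import SCHEMAS    # runs at import time
    schema = {'name': str}
''')

# Working pair: same dependency, but the import lives inside handle(),
# so it is only resolved when handle() is called.
write_module('demo_schema', '''
    import demo_task
    SCHEMAS = {'demo': demo_task.schema}
''')
write_module('demo_task', '''
    schema = {'name': str}

    def handle(data, cmds, context):
        from demo_schema import SCHEMAS    # deferred import
        return sorted(SCHEMAS)
''')

importlib.invalidate_caches()

try:
    import demo_schema_bad
    circular_failed = False
except ImportError:
    circular_failed = True

import demo_schema
result = demo_schema.demo_task.handle({}, None, {})
```

The first pair fails because SCHEMAS does not exist yet when the task module asks for it. The second pair works because, by the time handle() runs, the schema module has finished loading.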

Roadmap and Wishlist

Can you help with any of these needs?

  • Platforms - Would appreciate some help getting it to run on:
    • RHEL6+ and other common distributions.
    • *BSD, OSX client
    • Windows
  • Common software support which has special needs or tedium that aren’t met with the current library.

  • The validation of pavefile sections and options is lacking. For example, data types and number of arguments are generally checked, but files aren’t checked for existence prior to execution. This is important for a 1.0 release. Check the voluptuous page linked above on how to do that.

  • Optimizations from gurus of various disciplines.

  • Necessary features from established systems that don’t harm the simple use-cases.

  • Tests — The design is starting to solidify, so writing tests would be useful.

  • Docs too. Sphinx doc building is currently a bit of a mess.

    Note: parts of the docs are built dynamically, such as schema-related details. Unfortunately, readthedocs doesn’t handle this. :/ So, docs will need to be re-built after changes to the schema and the output files checked in. There is a script called docs/bld_docs to automate the process.
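For the validation item in the list above, one possible shape for a pre-execution file check is sketched here using only the standard library. The existing_file() and check_sources() names and the src/dest task layout are hypothetical. As far as I know, voluptuous accepts plain callables like this as validators, but check its documentation before relying on that.

```python
import os
import tempfile


def existing_file(value):
    """Validator: return the path unchanged if it names an existing file."""
    path = os.path.expanduser(str(value))
    if not os.path.isfile(path):
        raise ValueError('file not found: %s' % value)
    return value


def check_sources(tasks):
    """Collect an error message for every task whose 'src' is unusable."""
    problems = []
    for task in tasks:
        try:
            existing_file(task['src'])
        except (KeyError, ValueError) as err:
            problems.append(str(err))
    return problems


# Demo: one task points at a real (temporary) file, the other does not.
with tempfile.NamedTemporaryFile(suffix='.conf', delete=False) as tmp:
    tmp.write(b'demo')

tasks = [
    {'src': tmp.name, 'dest': '/etc/demo.conf'},
    {'src': '/no/such/file.conf', 'dest': '/etc/other.conf'},
]
problems = check_sources(tasks)
os.remove(tmp.name)
```

Running a check like this before any host is contacted would let a typo in a local path fail fast instead of halfway through a run.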


  • Please consult with me before making major improvements if you’d like them included here. ;)

  • Currently we’re asking that you transfer copyright of your additions to the project to avoid issues if a change of gears is warranted. Is this a problem?

  • Pull requests:
    • Please fix/improve one thing (or highly-related things) at a time.
    • Please test thoroughly before submission, as I don’t have much time or resources available. VirtualBox is free and easy to use (if there isn’t something you’d rather use).
  • Code should be in the PEP 8 style.

    I tend to do a few other things like line up columns and put double blank lines between classes and standalone functions that might bug people. Sorry about that. I’ve got an extra high-res portrait monitor for coding/browsing and vertical space is plentiful.

  • Single quotes by default, since they are easier to look at and type on keyboards in common languages.

    One exception is the command-lines in
    Since fabric doesn’t escape single quotes in command-lines, it is easier to visually parse the shell debug output if you quote with single quotes in command-line tasks whenever possible. As single quotes will therefore be common inside command-line strings, it is easier to wrap the whole thing in double quotes (or, if mixed, triple-single quotes) in Python.