The portal uses a single "portal-jobs" script to schedule periodic
tasks.  Only this one script needs to be scheduled using cron; it
then executes whatever code is appropriate for the currently
installed collection of modules.

Configuration file:

    Underneath each module should be a file named module.conf which
    describes the various components of your module.  Components that
    are described can currently be either "panel"s or "job"s.  The
    module section of a module.conf file looks like:

        [module]
        panels = <panel-name-1>, <panel-name-2>, ...
        jobs = <job-name-1>, <job-name-2>, ...

    A given module may export more than one "job" to the job execution
    system, and this is where these are defined.  A job is defined
    with configuration code like the following:

        [<job-name>]
        frequency = <integer>
        exec_file = <path>

    frequency should be an integral number of seconds.  This defines
    how often the job should be run.  For example, a frequency of 3600
    should be used for jobs that run hourly.

    exec_file should be a path relative to the top level of the
    module.  This path specifies a file of Python code that will be
    executed at the appropriate time.
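
    As an illustration of the format above, the following sketch reads
    a module.conf with Python's standard configparser module.  The
    panel name, job name, and path are made up for the example, and
    read_jobs is a hypothetical helper, not part of portal-jobs.

```python
import configparser

# Hypothetical module.conf contents; the names and path are invented.
EXAMPLE_CONF = """
[module]
panels = status
jobs = rotate-logs

[rotate-logs]
frequency = 3600
exec_file = jobs/rotate_logs.py
"""


def read_jobs(conf_text):
    """Return {job_name: (frequency_seconds, exec_file)} for one module."""
    parser = configparser.ConfigParser()
    parser.read_string(conf_text)
    # "jobs" is a comma-separated list; treat a missing entry as no jobs.
    raw = parser.get("module", "jobs", fallback="")
    names = [name.strip() for name in raw.split(",") if name.strip()]
    jobs = {}
    for name in names:
        frequency = parser.getint(name, "frequency")  # seconds between runs
        exec_file = parser.get(name, "exec_file")     # relative to module top
        jobs[name] = (frequency, exec_file)
    return jobs


jobs = read_jobs(EXAMPLE_CONF)
```

    Here jobs maps "rotate-logs" to an hourly frequency and its
    exec_file path.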

Execution:

    Whenever the portal-jobs script runs, it performs the following steps.

    1) It checks to see if another copy of the process is already
       running.  It does this by looking for a pid file
       portal-jobs.pid in %(state_path)s, as defined by the
       portal.conf file.  If this file does not exist, it creates it
       and continues.  If it does exist, and it contains the PID of a
       currently-running process, it aborts and exits.  If it does
       exist, and contains an invalid PID, it is replaced.

    2) portal-jobs loads all of the jobs described in the module.conf
       files.

    3) The file %(state_path)s/portal-jobs.dat is read, and each line
       of "job-name: date" is loaded in as a "last-run date" for the
       named job.

    4) The job_manager object is told to run, which will execute every
       job that has not run recently.  Specifically, the jobs are
       randomly shuffled, and then for each one in order, one at a
       time:

       a) The current time is taken, rounded down to a whole multiple
          of the job's frequency
       b) The resulting time is compared with the job's last run time
       c) If the resulting time is newer, the job is run, with global
          variables last_run_time and current_run_time set
          appropriately.

    5) The new last-run times for each job are written out to the
       portal-jobs.dat file.
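
    The steps above can be sketched roughly as follows.  This is not
    the actual portal-jobs code: every function name is hypothetical,
    the dates in portal-jobs.dat are assumed to be integer seconds
    since the epoch, "modulo the job's frequency" is read as rounding
    the current time down to a multiple of the frequency, and a job
    that has never run is assumed to have a last run time of 0.  The
    pid check is POSIX-only and is not race-free.

```python
import os
import random
import time


def pid_is_running(pid):
    """Return True if a process with this PID exists (POSIX only)."""
    try:
        os.kill(pid, 0)  # signal 0 only checks; nothing is delivered
    except ProcessLookupError:
        return False
    except PermissionError:
        return True  # the process exists but belongs to another user
    return True


def claim_pid_file(path):
    """Step 1: return False if another live copy holds the pid file."""
    try:
        with open(path) as handle:
            old_pid = int(handle.read().strip())
        if old_pid > 0 and pid_is_running(old_pid):
            return False
    except (FileNotFoundError, ValueError):
        pass  # no pid file, or its contents are not a valid PID
    with open(path, "w") as handle:
        handle.write(str(os.getpid()))
    return True


def load_last_runs(path):
    """Step 3: read "job-name: date" lines into {job_name: last_run}."""
    last_runs = {}
    try:
        with open(path) as handle:
            for line in handle:
                name, sep, stamp = line.partition(":")
                if sep and stamp.strip():
                    last_runs[name.strip()] = int(stamp)  # assumed epoch secs
    except FileNotFoundError:
        pass  # first run: no job has a recorded time yet
    return last_runs


def run_due_jobs(jobs, last_runs, run_job, now=None):
    """Step 4: run each due job once, in random order, one at a time.

    jobs maps job name to frequency in seconds; run_job is a callable
    taking (name, last_run_time, current_run_time).
    """
    names = list(jobs)
    random.shuffle(names)
    for name in names:
        current = int(time.time()) if now is None else now
        slot = current - current % jobs[name]  # round down to the frequency
        last = last_runs.get(name, 0)          # assumption: 0 if never run
        if slot > last:
            run_job(name, last, slot)
            last_runs[name] = slot
    return last_runs


def save_last_runs(path, last_runs):
    """Step 5: write the updated last-run times back out."""
    with open(path, "w") as handle:
        for name, stamp in sorted(last_runs.items()):
            handle.write(f"{name}: {stamp}\n")
```

    Rounding down means an hourly job becomes due at most once per
    clock hour, no matter how often cron starts the script.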

Current Issues:

    The current implementation never runs jobs in parallel.  This
    means that while it works well when every job completes in time,
    it will begin to have trouble when the system bogs down.
    Specifically, since only one job runs at a time, it's not a very
    good idea to have one job that takes (say) 20 minutes to run on
    an hourly schedule, and another that takes 5 seconds to run on a
    five-minute schedule.

    In this case, the 20-minute job will block the five-minute job
    for long periods of time.
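
    To put a number on that example, the small calculation below
    counts how many of the short job's scheduled slots pass while the
    long job is running.  slots_blocked is a hypothetical helper used
    only for this illustration.

```python
def slots_blocked(long_runtime, short_frequency):
    """Count the short job's slot boundaries that pass during one long run.

    Any window of long_runtime seconds contains at least this many
    multiples of short_frequency.
    """
    return long_runtime // short_frequency


# A 20-minute job blocking a job scheduled every five minutes.
blocked = slots_blocked(20 * 60, 5 * 60)
```

    With these figures blocked is 4: four five-minute slots go by
    before the short job gets another chance to run.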

    Note that since the jobs are told when they last ran and what the
    current time is, they should always be able to "catch up" so long
    as the last run times are correctly maintained and the jobs don't
    take longer than real time to run.

    If the jobs *do* take longer than real time to run, what you'll
    get is a situation where the jobs take turns catching up to the
    present.  Things will always be somewhat behind, but each job will
    be given the opportunity to try to catch up before other jobs are
    run more than once.
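
    A job can catch up by processing everything between
    last_run_time and current_run_time instead of assuming that
    exactly one period has passed.  The sketch below shows the idea.
    The job body, the event_times data, and the run_job helper are all
    hypothetical, and using exec() to supply the two globals is an
    assumption about how the runner works.

```python
# Hypothetical job body, as it might appear in a module's exec_file.
JOB_SOURCE = """
# last_run_time and current_run_time are supplied as globals by the runner.
pending = [t for t in event_times if last_run_time <= t < current_run_time]
"""


def run_job(source, last_run_time, current_run_time, extra=None):
    """Execute job source with the two time globals set; return its globals."""
    scope = {
        "last_run_time": last_run_time,
        "current_run_time": current_run_time,
    }
    scope.update(extra or {})  # stand-in for whatever data the job reads
    exec(source, scope)
    return scope


events = [100, 3700, 7300, 9000]
# The job last ran at t=3600 and was then blocked for two hours.
result = run_job(JOB_SOURCE, 3600, 10800, {"event_times": events})
```

    Here result["pending"] holds the three events at 3700, 7300, and
    9000, so the delayed run covers the whole gap in one pass.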

    In short, if all of your jobs are run at most once an hour, this
    system is probably sufficient.

Future Development:

    It would be good if the restriction to one running copy applied
    to each job individually, rather than to the whole scheduling
    apparatus.  This way, quick little tasks could be kept up to date
    reliably.  The downside is that having multiple jobs running
    simultaneously is likely to increase disk contention, and
    therefore increase overhead.

    Because of that, such a future development should also have much
    more aggressive monitoring to keep track of how long various jobs
    are taking to run and to warn if the system begins to fall behind.
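
    One possible shape for such monitoring is to time each job and
    warn when a run takes longer than the job's own frequency, since
    that job can no longer keep up.  This is only a sketch; timed_run
    and its parameters are hypothetical and nothing like it exists in
    the current script.

```python
import time


def timed_run(name, frequency, job, warn, clock=time.monotonic):
    """Run job(), return its duration, and warn if it overran its frequency."""
    start = clock()
    job()
    elapsed = clock() - start
    if elapsed > frequency:
        # The job took longer than the interval between its scheduled runs.
        warn(f"job {name!r} took {elapsed:.0f}s "
             f"but is scheduled every {frequency}s")
    return elapsed
```

    The clock parameter exists so the function can be exercised with a
    fake clock; in normal use the default monotonic clock is enough,
    and warn could be something like logging.warning.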
