PyPI Mirror on an Ubuntu Machine

This post contains some details on how to create a PyPI mirror on an Ubuntu machine. I am writing it because I ran into a problem with the cronjob setup.

Setup the mirror

Use bandersnatch to sync your mirror from the master server. In my experience it works much better than the pep381run script. I followed the official guide from the PyPI site.

Use virtualenv:

I recommend using a Python virtualenv to keep your system clean.

# Setup virtualenv to /opt
virtualenv /opt/mirror-pypi

# Switch to virtualenv
source /opt/mirror-pypi/bin/activate

Install via pip:

Simply install bandersnatch via pip, which also handles its dependencies for you.

pip install bandersnatch

Issue with cronjob

The default shell for cronjobs on Ubuntu is /bin/sh, which is a symlink to /bin/dash. The bandersnatch documentation contains instructions on how to set up a cronjob.

The problem is that the documented example uses a bash-specific feature, the |& operator, to pipe both stdout and stderr to logger:

# Will not work in dash or sh
bandersnatch mirror |& logger -t bandersnatch[mirror]

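To fix the problem, switch to bash as the cronjob shell. Modify /etc/crontab and change the SHELL variable. Note that cron reads SHELL per crontab file, so a job that lives in a user crontab (edited with crontab -e) needs the same SHELL line at the top of that crontab: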

# Replace SHELL from /bin/sh to /bin/bash
SHELL=/bin/bash
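
If changing the cron shell is not an option, the same redirection can be written in a form that dash and sh understand as well: 2>&1 | instead of |&. The sketch below illustrates this with a hypothetical stand-in function called noisy (it is not part of bandersnatch), which writes one line to stdout and one to stderr.

```shell
#!/bin/sh
# Hypothetical stand-in for a command that writes to both stdout and stderr,
# as a mirror run may do.
noisy() {
    echo "line on stdout"
    echo "line on stderr" >&2
}

# POSIX spelling of bash's `noisy |& grep -c line`:
# send stderr to stdout first, then pipe the combined stream.
count=$(noisy 2>&1 | grep -c "line")
echo "lines seen through the pipe: ${count}"
```

Applied to the mirror job, the command part of the cron entry would then read: bandersnatch mirror 2>&1 | logger -t bandersnatch[mirror]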

Now it's time to set up the cronjob, which uses the virtualenv:

# Modify crontab
crontab -e

# Add the following line to run every hour (five time fields: minute hour day month weekday)
0 * * * * /opt/mirror-pypi/bin/python /opt/mirror-pypi/bin/bandersnatch mirror |& logger -t bandersnatch[mirror]

Monitoring mirror status

Because of the cronjob issue, I created a monitoring script for Nagios. The script checks the last-modified file created by bandersnatch. It is a simple bash script that converts each date to UTC and then to a Unix timestamp to check the age of the mirror.

#!/bin/bash
# Thomas Merkel <tm@core.io>
# Check PyPI mirror with nagios

PROGPATH=$(echo ${0} | sed -e 's,[\\/][^\\/][^\\/]*$,,')
REVISION="1.1"

source ${PROGPATH}/utils.sh

# Function to print help
function help() {
    print_revision ${0} ${REVISION}
    echo
    echo "${0} -u <url/last-modified> -w <warning seconds> -c <critical seconds>"
    echo
    echo "OPTIONS:"
    echo " -u <url/last-modified>:  URL to the last-modified file"
    echo " -w <warning seconds>:    Difference in seconds for warning (1800)"
    echo " -c <critical seconds>:   Difference in seconds for critical (3600)"
    exit ${STATE_UNKNOWN}
}

# Parse all options
WARN=1800
CRIT=3600
while getopts "h?u:w:c:" opt; do
    case "$opt" in
        h|\?)
            help
            ;;
        u)
            URL=${OPTARG}
            ;;
        w)
            WARN=${OPTARG}
            ;;
        c)
            CRIT=${OPTARG}
            ;;
    esac
done

if [ $# -eq 0 ] || [ -z "${URL}" ]; then
    help
fi

shift $((OPTIND-1))

# Record the local time, then download the last-modified file
# (-f makes curl return a non-zero exit code on HTTP errors)
l_date=$(date -u)
r_curl=$(curl -sfq "${URL}")

if [ ${?} -ne 0 ]; then
    echo "CRIT: failed to download last-modified file"
    exit ${STATE_CRITICAL}
fi

# The file stores the date in a compact format with a "T" between date
# and time; replace the "T" with a space and append UTC so date can parse it
r_date=$(echo "${r_curl} UTC" | sed "s:T: :")

# Convert to unix timestamp
l_unixtime=$(date -d "${l_date}" +%s)
r_unixtime=$(date -d "${r_date}" +%s)

# Check difference, most severe state first, so that a warning is
# reported when only the warning threshold is exceeded
if [ $((r_unixtime + CRIT)) -lt "${l_unixtime}" ]; then
    echo "CRIT: mirror out of sync [remote ${r_date}]"
    exit ${STATE_CRITICAL}
fi
if [ $((r_unixtime + WARN)) -lt "${l_unixtime}" ]; then
    echo "WARN: mirror out of sync [remote ${r_date}]"
    exit ${STATE_WARNING}
fi
echo "OK: mirror is up-to-date [remote ${r_date}]"
exit ${STATE_OK}

It is built for Nagios and uses the utils.sh script that ships with it. Move the file to /usr/lib/nagios/plugins or any other plugins folder that contains utils.sh.
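
To make the age check easier to follow in isolation, here is a minimal sketch of the same idea without curl or utils.sh. The helper names to_epoch and classify_age are hypothetical and not part of the plugin, and the sketch assumes the last-modified file holds a stamp such as 20150129T17:48:00 and that GNU date (the default on Ubuntu) is available.

```shell
#!/bin/bash
# Sketch only: to_epoch and classify_age are hypothetical helper names.
# Requires GNU date for the -d option.

# Convert a stamp like 20150129T17:48:00 (assumed format) to epoch seconds, in UTC.
to_epoch() {
    date -u -d "$(echo "$1" | sed 's/T/ /') UTC" +%s
}

# Print OK, WARNING or CRITICAL.
# $1 stamp, $2 current time (epoch), $3 warning seconds, $4 critical seconds
classify_age() {
    age=$(( $2 - $(to_epoch "$1") ))
    if [ "$age" -gt "$4" ]; then
        echo CRITICAL
    elif [ "$age" -gt "$3" ]; then
        echo WARNING
    else
        echo OK
    fi
}

# Fixed "now" so the example is reproducible: the stamp is 42 minutes old.
now=$(to_epoch "20150129T18:30:00")
classify_age "20150129T17:48:00" "$now" 1800 3600   # prints WARNING
```

The most severe threshold is tested first, so each age maps to exactly one state.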

Posted

January 29, 2015, 5:48 pm
