J.P. Larocque (jp_larocque) wrote,

Debian repository caching

There are about a dozen programs out there that lazily cache Debian repositories. They act as a proxy to Debian apt repositories, serving apt on local Debian machines. Three key features drew me to the idea:

  1. Caching! But any HTTP proxy can do this.
  2. Indirection. To change to another mirror, change your configuration in one place.
  3. Keeping your cache when you move to another repository. A standard HTTP proxy on its own can't do this.

It all started with apt-proxy. Being a shell script web server (no joke), it suffered from all kinds of slowness and reliability problems. I threw that out about a year ago.

I don't remember how I ended up with Approx. Interestingly, it's written in OCaml. It's moderately fast. But it does have its shortcomings:

  1. Stale objects, at random, don't get refreshed when requested by a client. That is, Release will be up-to-date, but Sources.bz2 won't, and you'll have "MD5Sum mismatch" errors.
  2. Approx does the worst possible thing on a failed download: pretend that it's complete.
  3. As I recall, it also does "bad things" when you interrupt a download. Or worse, when you request the same object concurrently from two clients.

And then I realized there's an out-of-the-box solution. Put any caching HTTP proxy in front of a web server that proxies requests for /foo to http://some-repository.example.com/debian/foo. Configure apt clients to go through the caching proxy to reach this indirection server. The cache is then keyed on the indirection server's URLs, not the mirror's, so your cached downloads carry over even if you switch to another repository. (In theory.)
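For the client side, something like the following should do. This is only a sketch: the hostnames apt-indirect.example.lan (the indirection server) and proxy.example.lan (the caching proxy) are made up, the /main/ and /security/ prefixes have to match whatever the indirection server maps, and port 8123 assumes Polipo is listening on its default port. First, /etc/apt/sources.list points at the indirection server rather than at a real mirror:

```
# /etc/apt/sources.list (hostname is hypothetical)
deb http://apt-indirect.example.lan/main/ etch main contrib non-free
deb http://apt-indirect.example.lan/security/ etch/updates main
```

Then apt is told to fetch everything through the caching proxy, for example in a file under /etc/apt/apt.conf.d/:

```
// Hypothetical proxy host; 8123 is Polipo's default port.
Acquire::http::Proxy "http://proxy.example.lan:8123/";
```

Since clients only ever see the indirection server's URLs, switching mirrors means editing one line on that server and nothing on the clients.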

Since I use Polipo as my proxy (the same instance I use for my normal Interwebs experience), I get the following additional wins: pipelining, IPv6, partially-cached objects, concurrent client access to the same object[1], and (oh oh!) STALE OBJECT INVALIDATION. You know, CORRECTNESS.

I have this stuffed in a <VirtualHost> section (it needs Apache's mod_proxy and mod_proxy_http loaded):

# Repositories.
#ProxyPass /main/                       http://debian.oregonstate.edu/debian/ # Slow!  2006-10-21
#ProxyPass /main/                       http://mirrors.kernel.org/debian/ # Down!  2006-11-16
#ProxyPass /main/                       http://mirrors.usc.edu/pub/linux/distributions/debian/ # Faulty!  etch Packages lists are here, but not some packages.
#ProxyPass /main/                       http://ftp.us.debian.org/debian/ # Faulty!  See above.
ProxyPass /main/                        http://ftp-mirror.internap.com/pub/debian/
#ProxyPass /main/                       http://debian.crosslink.net/debian/
#ProxyPass /main/                       http://ftp.debian.org/debian/ # Not carrying some things: powerpc testing, i386-hurd unstable
ProxyPass /non-US/                      http://non-us.debian.org/debian-non-US/
ProxyPass /security/                    http://security.debian.org/
ProxyPass /blackdown-java/              http://mirrors.ibiblio.org/pub/mirrors/blackdown/debian/
#ProxyPass /marillat/                   ftp://ftp.nerim.net/debian-marillat/
ProxyPass /debian-mm/                   http://www.debian-multimedia.org/
ProxyPass /bunk/                        http://www.fs.tum.de/~bunk/debian/
#ProxyPass /amd64/                      http://mirror.espri.arizona.edu/debian-amd64/debian/ # Mirrored archive missing, 2006-10-11.
ProxyPass /amd64/                       http://debian.csail.mit.edu/debian-amd64/debian/
ProxyPass /debian-secure-testing/       http://secure-testing.debian.net/debian-secure-testing/

ProxyPass /ubuntu/                      http://ftp.osuosl.org/pub/ubuntu/
ProxyPass /ubuntu-security/             http://security.ubuntu.com/ubuntu/

# For GNU/Hurd.
ProxyPass /gnuab/                       http://ftp.gnuab.org/debian/

  1. In theory. Empirical tests show that requests are serialized, but the second in line gets a cached copy.
Tags: debian