Bash alternative to Reflector for Ranking Mirrors

If you don't already know, I am a long-time user and supporter of Arch Linux. Arch uses a package manager called pacman that works similarly to yum or apt, but much better IMHO. Pacman downloads the actual package files from a list of mirrors, so you want the fastest mirrors to be in that list. The old way is to rank the mirrors by speed with reflector, which is a Python script. My way is pure Bash using curl, sed, awk, xargs, and sort. It's very simple and IMHO more effective than reflector.

Creating /etc/pacman.d/mirrorlist

Just run the script and pipe its output through sudo tee into /etc/pacman.d/mirrorlist, like this:

$ ./ | sudo tee /etc/pacman.d/mirrorlist
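
One thing to keep in mind with the pipe above: tee truncates the target file as soon as it starts, so if the script prints nothing (no network, for example) you are left with an empty mirrorlist. A more cautious variant could look roughly like the sketch below. The helper name write_mirrorlist and the script name rank-mirrors.sh are made up for illustration and are not part of the original script.

```bash
#!/usr/bin/env bash
# write_mirrorlist SRC DEST  (hypothetical helper, not part of the original script)
# Copies SRC over DEST only if SRC contains at least one "Server = " line,
# and keeps the previous DEST as DEST.bak.
write_mirrorlist() {
    local src=$1 dest=$2
    if ! grep -q '^Server = ' "$src"; then
        echo "refusing to install: no Server lines in $src" >&2
        return 1
    fi
    if [ -e "$dest" ]; then
        cp -- "$dest" "$dest.bak" || return 1
    fi
    cp -- "$src" "$dest"
}

# Example, run as root (rank-mirrors.sh is an assumed script name):
#   ./rank-mirrors.sh > /tmp/mirrorlist.new
#   write_mirrorlist /tmp/mirrorlist.new /etc/pacman.d/mirrorlist
```

The extra temp file costs nothing and means a failed run never touches the list pacman is actually using.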


How it works

It's simple. Essentially it performs these steps:

  1. Fetch the current list of mirrors (only those reported as 100% complete) from the official site.
  2. Use curl to request a small 257-byte file from each of those mirrors (about 200-300 of them), 40 at a time, and keep the fastest 50. This also gets the DNS lookups cached for the next step.
  3. Use curl to request a 100 KB file from each of those mirrors, 10 at a time, measuring the total time of the request and the speed of the download.
  4. Finally, merge the results of both tests into a de-duplicated list (up to 50 mirrors from each test) in the format for outputting directly to /etc/pacman.d/mirrorlist.
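
To make the ranking and merging part of the steps above concrete without touching the network, here is a minimal sketch that feeds made-up measurements through the same kind of sort/head/cut/sed/awk chain the script uses. The helper name rank_measurements, the example.org hosts, and the numbers are all invented for illustration, and the architecture is hard-coded to x86_64 to keep it short.

```bash
#!/usr/bin/env bash
# rank_measurements N  (hypothetical helper; reads measurements on stdin)
# Input lines look like curl's -w output in the script:
#   time_total@speed_download@url
# Sort by download speed (field 2), fastest first, keep the top N, turn each
# URL into a pacman "Server = ..." line, and drop duplicate lines.
rank_measurements() {
    local keep=${1:-50}
    sort -t@ -k2 -nr \
        | head -n "$keep" \
        | cut -d@ -f3 \
        | sed 's,core/os/x86_64/core\.db\.tar\.gz$,$repo/os/$arch,' \
        | sed 's,^,Server = ,' \
        | awk '!seen[$0]++'
}

# Made-up sample data: three mirrors, one of them measured twice.
sample='0.412@251340.000@http://mirror-a.example.org/archlinux/core/os/x86_64/core.db.tar.gz
0.198@612877.000@http://mirror-b.example.org/archlinux/core/os/x86_64/core.db.tar.gz
1.930@48211.000@http://mirror-c.example.org/archlinux/core/os/x86_64/core.db.tar.gz
0.205@598004.000@http://mirror-b.example.org/archlinux/core/os/x86_64/core.db.tar.gz'

printf '%s\n' "$sample" | rank_measurements 3
# Prints:
#   Server = http://mirror-b.example.org/archlinux/$repo/os/$arch
#   Server = http://mirror-a.example.org/archlinux/$repo/os/$arch
```

Note that mirror-b shows up twice in the top three but only once in the output, which is the same de-duplication the awk at the end of the real script performs after both tests are merged.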

Why pure bash over reflector?

Because I like my systems extra crazy lean, I often don't want to install Python right after an initial install of Arch. This is also much faster and easier on system resources, and I believe it is more accurate. I'd like to encourage others to turn to pure shell scripting for simple tasks like this; it is often a better long-term solution than building a new piece of software. But I'm not against reflector, it's a pretty awesome bit of Python with many features.

Source


# Updated: Tues May 07 21:04:12 2013 by webmaster@askapache
# @
# Copyright (C) 2013 Free Software Foundation, Inc.
#   This program is free software: you can redistribute it and/or modify
#   it under the terms of the GNU General Public License as published by
#   the Free Software Foundation, either version 3 of the License, or
#   (at your option) any later version.
#   This program is distributed in the hope that it will be useful,
#   but WITHOUT ANY WARRANTY; without even the implied warranty of
#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#   GNU General Public License for more details.
#   You should have received a copy of the GNU General Public License
#   along with this program.  If not, see .

# if mirrors exists, cat it, otherwise create it
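function get_mirrors () #{{{1
{
   if [[ -s $MIRRORS ]]; then
          cat "$MIRRORS";
   else
          # split the JSON into one object per line, drop rsync entries,
          # keep 100% complete mirrors, then extract the url field
          curl -LksS -o - '' | \
          sed 's,{,\n{,g' | sed -n '/rsync/d; /pct": 1.0/p' | sed 's,^.*"url": "\([^"]\+\)".*,\1,g' > "$MIRRORS"
          cat "$MIRRORS";
   fi
}

# append the path of the core database to every mirror url
function get_core_urls () #{{{1
{
   get_mirrors | sed "s,$,core/os/${ARCH}/core.db.tar.gz,g"
}

# append the path of the small gcc signature file to every mirror url
function get_gcc_urls () #{{{1
{
   get_mirrors | sed "s,$,core/os/${ARCH}/${GCC_URL},g"
}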

# rm tmp file on exit
trap "exitcode=\$?; (rm -f \$MIRRORS 2>/dev/null;) && exit \$exitcode" 0;
trap "exit 1" 1 2 13 15;

# file containing mirror urls
MIRRORS=`(mktemp -t reflector-mirrorsXXXX) 2>/dev/null` && test -w "$MIRRORS" || MIRRORS=~/reflector.mirrorsXXX

# arch
ARCH=`(uname -m) 2>/dev/null` || ARCH=x86_64

# the gcc file
GCC_URL=$( curl -LksSH --url${ARCH}/ 2>/dev/null | sed -n 's/^.*\ \(gcc-[0-9]\+.*.tar.xz.sig\)\ -.*$/\1/gp' );

{
   # faster, as it is primarily used to pre-resolve DNS for the 2nd (core) test
   get_gcc_urls | xargs -I'{}' -P40 curl -Lks -o /dev/null -m 3 --retry 0 --no-keepalive -w '%{time_total}@%{speed_download}@%{url_effective}\n' --url '{}' |\
   sort -t@ -k2 -nr | head -n 50 | cut -d'@' -f3 | sed 's,core/os/'"${ARCH}/${GCC_URL}"',$repo/os/$arch,g'

   get_core_urls | xargs -I'{}' -P10 curl -Lks -o /dev/null -m 5 --retry 0 --no-keepalive -w '%{time_total}@%{speed_download}@%{url_effective}\n' --url '{}' |\
   sort -t@ -k2 -nr | head -n 50 | cut -d'@' -f3 | sed 's,core/os/'"${ARCH}"'/core.db.tar.gz,$repo/os/$arch,g'
} | sed 's,^,Server = ,g' | awk '{ if (!h[$0]) { print $0; h[$0]=1 } }'

exit $?;
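
If you want to see what the sed chain in get_mirrors does without hitting the network, it can be tried on a small hand-written sample. This is only a sketch: the wrapper name extract_mirror_urls is made up, and the JSON field names in the sample are an assumption modelled on the patterns the script matches ("url", rsync, pct": 1.0). It relies on GNU sed, as the original does.

```bash
#!/usr/bin/env bash
# extract_mirror_urls: hypothetical wrapper around the sed chain from get_mirrors.
# Reads mirror-status JSON on stdin and prints one URL per line for entries
# that are not rsync and whose completion is 1.0 (100%).
# Needs GNU sed (\n in the replacement, \+ in the pattern).
extract_mirror_urls() {
    sed 's,{,\n{,g' \
        | sed -n '/rsync/d; /pct": 1.0/p' \
        | sed 's,^.*"url": "\([^"]\+\)".*,\1,'
}

# Made-up sample: one good http mirror, its rsync twin, and an incomplete mirror.
sample_json='{"urls": [{"url": "http://mirror-a.example.org/archlinux/", "protocol": "http", "completion_pct": 1.0}, {"url": "rsync://mirror-a.example.org/archlinux/", "protocol": "rsync", "completion_pct": 1.0}, {"url": "https://mirror-b.example.net/arch/", "protocol": "https", "completion_pct": 0.87}]}'

printf '%s\n' "$sample_json" | extract_mirror_urls
# Prints:
#   http://mirror-a.example.org/archlinux/
```

The first sed puts every JSON object on its own line, which is what lets the following two line-oriented sed commands filter and extract without a real JSON parser.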

Xargs running curl in parallel

[Screenshot: htop output while the script is running.]
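
As a rough offline illustration of what xargs -P is doing in the script, here is a tiny sketch where each job just sleeps for a second in place of a curl request. The helper name run_parallel and the mirror-a to mirror-d labels are invented for the example, and -P4 is used instead of the script's -P40 and -P10 only to keep it small.

```bash
#!/usr/bin/env bash
# run_parallel: hypothetical helper. Reads one item per line on stdin and runs
# up to 4 jobs at a time; each job sleeps for a second (standing in for a
# curl request) and then reports the item it was given.
run_parallel() {
    xargs -n1 -P4 sh -c 'sleep 1; echo "checked $1"' _
}

# All four jobs run at once, so this takes about one second rather than four.
# Jobs may finish in any order, hence the sort.
printf '%s\n' mirror-a mirror-b mirror-c mirror-d | run_parallel | sort
# Prints:
#   checked mirror-a
#   checked mirror-b
#   checked mirror-c
#   checked mirror-d
```

Run it with htop open in another terminal and you should see the four sh processes alive at the same time, just like the curl processes when the real script runs.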

