FREE THOUGHT · FREE SOFTWARE · FREE WORLD

Home » Shell Scripting » Bash Script to Create index.html of Dir Listing

by comment

Bash Script to Create index.html of Dir ListingIf you use Apache to auto-generate directory index listings of files/dirs, using the IndexOptions directive, like at http://gnu.askapache.com/, and you have a large number of files and directories in the root directory and/or slow IO speed, then generating the index could take Apache over a minute!

Fix for index Listings

I fixed that by writing a bash shell script, which is run by cron, in this case immediately after every rsync. This creates a static index.html file formatted exactly as the simple apache-generated one. Now apache serves the single index.html, which can be cached. Served in less than a second now.

Htaccess for Indexes

Here is the basic .htaccess to setup in the root directory.

Options SymLinksIfOwnerMatch Indexes

DirectoryIndex index.html
IndexOptions FancyIndexing TrackModified IgnoreClient ScanHTMLTitles SuppressRules VersionSort IgnoreCase NameWidth=* DescriptionWidth=*

Shell Script to create index.html

The source is below, or download it: mirror index.html creator.

#!/bin/bash
# Updated: Wed Apr 10 21:04:12 2013 by webmaster@askapache
# @ http://u.askapache.com/2013/04/gnu-mirror-index-creator.txt
# Copyright (C) 2013 Free Software Foundation, Inc.
#
#   This program is free software: you can redistribute it and/or modify
#   it under the terms of the GNU General Public License as published by
#   the Free Software Foundation, either version 3 of the License, or
#   (at your option) any later version.
#
#   This program is distributed in the hope that it will be useful,
#   but WITHOUT ANY WARRANTY; without even the implied warranty of
#   MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
#   GNU General Public License for more details.
#
#   You should have received a copy of the GNU General Public License
#   along with this program.  If not, see <http://www.gnu.org/licenses/>.

function create_gnu_index ()
{
    # call it right or die
    [[ $# != 3 ]] && echo "bad args. do: $FUNCNAME '/DOCUMENT_ROOT/' '/' 'gnu.askapache.com'" && exit 2
  
    # D is the doc_root containing the site
    local L= D="$1" SUBDIR="$2" DOMAIN="$3" F=

    # The index.html file to create
    F="${D}index.html"

    # if dir doesnt exist, create it
    [[ -d $D ]] || mkdir -p $D;

    # cd into dir or die
    cd $D || exit 2;

    # touch index.html and check if writable or die
    touch $F && test -w $F || exit 2;

    # remove empty directories, they dont need to be there and slow things down if they are
    find . -maxdepth 1 -type d -empty -exec rm -rf {} \;

    # start of total output for saving as index.html
    (

        # print the html header
        echo '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 3.2 Final//EN">';
        echo "<html><head><title>Index of http://${DOMAIN}${SUBDIR}</title></head>";
        echo "<body><h1><a id="Index_SUBDIR" href="#Index_SUBDIR">Index of ${SUBDIR}</a></h1><pre>      Name                                        Last modified      Size";

        # start of content output
        (
            # change IFS locally within subshell so the for loop saves line correctly to L var
            IFS=$'\n';

            # pretty sweet, will mimick the normal apache output
            for L in $(find -L . -mount -depth -maxdepth 1 -type f ! -name 'index.html' -printf "      <a href=\"%f\">%-44f@_@%Td-%Tb-%TY %Tk:%TM  @%f@\n"|sort|sed 's,\([\ ]\+\)@_@,</a>\1,g');
            do
                # file
                F=$(sed -e 's,^.*@\([^@]\+\)@.*$,\1,g'<<<"$L");

                # file with file size
                F=$(du -bh $F | cut -f1);

                # output with correct format
                sed -e 's,\ @.*$, '"$F"',g'<<<"$L";
            done;
        )

        # now output a list of all directories in this dir (maxdepth 1) other than '.' outputting in a sorted manner exactly like apache
        find -L . -mount -depth -maxdepth 1 -type d ! -name '.' -printf "      <a href=\"%f\">%-43f@_@%Td-%Tb-%TY %Tk:%TM  -\n"|sort -d|sed 's,\([\ ]\+\)@_@,/</a>\1,g'

        # print the footer html
        echo "</pre><address>Apache Server at ${DOMAIN}</address></body></html>";

    # finally save the output of the subshell to index.html
    )  > $F;

}

# start the run ( use function so everything is local and contained )
#    $1 is absolute document_root with trailing '/'
#    $2 is subdir like '/subdir/' if thats the web root, '/' if no subdir
#    $3 is the domain 'subdomain.domain.tld'
create_gnu_index "${HOME}/sites/gnu.askapache.com/htdocs/" "/" "gnu.askapache.com"

# takes about 1-5 seconds to complete
exit

Tags

Comments Welcome

Information is freedom. Freedom is non-negotiable. So please feel free to modify, copy, republish, sell, or use anything on this site in any way at any time ;)

My Online Tools

Popular Articles
Hacking and Hackers

The use of "hacker" to mean "security breaker" is a confusion on the part of the mass media. We hackers refuse to recognize that meaning, and continue using the word to mean someone who loves to program, someone who enjoys playful cleverness, or the combination of the two.
-- Richard M. Stallman


It's very simple - you read the protocol and write the code. -Bill Joy

Except where otherwise noted, content on this site is licensed under a Creative Commons Attribution 3.0 License, just credit with a link.
This site is not supported or endorsed by The Apache Software Foundation (ASF). All software and documentation produced by The ASF is licensed. "Apache" is a trademark of The ASF. NCSA HTTPd.
UNIX ® is a registered Trademark of The Open Group. POSIX ® is a registered Trademark of The IEEE.

+Askapache | htaccess.io | htaccess.guru

Site Map | Contact Webmaster | License and Disclaimer | Terms of Service | @Htaccess

↑ TOPMain