FREE THOUGHT · FREE SOFTWARE · FREE WORLD

Ultimate Htaccess Part II

Editing an Apache .htaccess file in VIM. Here is even more information from the Ultimate Htaccess Part I. For now this is very rough and you will want to come back later to read it.

Merging Notes

The order of merging is:

  1. <Directory> (except regular expressions) and .htaccess done simultaneously (with .htaccess, if allowed, overriding <Directory>)
  2. <DirectoryMatch> (and <Directory ~>)
  3. <Files> and <FilesMatch> done simultaneously
  4. <Location> and <LocationMatch> done simultaneously

Below is an artificial example to show the order of merging. Assuming they all apply to the request, the directives in this example will be applied in the order:

A > B > C > D > E

.
<Location />
E
</Location>
<Files askapache.txt>
D
</Files>
<VirtualHost *>
<Directory /a/b>
B
</Directory>
</VirtualHost>
<DirectoryMatch "^.*b$">
C
</DirectoryMatch>
<Directory /a/b>
A
</Directory>

Htaccess Software

Apache HTTP Server comes with the following programs.

httpd
Apache hypertext transfer protocol server
apachectl
Apache HTTP server control interface
ab
Apache HTTP server benchmarking tool
apxs
APache eXtenSion tool
dbmmanage
Create and update user authentication files in DBM format for basic authentication
fcgistarter
Start a FastCGI program
htcacheclean
Clean up the disk cache
htdigest
Create and update user authentication files for digest authentication
htdbm
Manipulate DBM password databases.
htpasswd
Create and update user authentication files for basic authentication
httxt2dbm
Create dbm files for use with RewriteMap
logresolve
Resolve hostnames for IP-addresses in Apache logfiles
log_server_status
Periodically log the server's status
rotatelogs
Rotate Apache logs without having to kill the server
split-logfile
Split a multi-vhost logfile into per-host logfiles
suexec
Switch User For Exec
Options All
AuthType "Basic"
AuthName "Protected Access"

#SetEnv APPLICATION_ENV production
SetEnv APPLICATION_ENV development
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} -s [OR]

ErrorDocument 404 /index.php

AddType text/html url

order allow,deny
allow from all
Options +ExecCGI

AddHandler python-program .py
PythonHandler mod_python.publisher
PythonDebug On
PythonAutoReload On

RewriteEngine on
RewriteRule .* - [E=HTTP_IF_MODIFIED_SINCE:%{HTTP:If-Modified-Since}]
RewriteRule .* - [E=HTTP_IF_NONE_MATCH:%{HTTP:If-None-Match}]

# updating tab.ser takes lots of memory
php_value memory_limit 64000000

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} -s [OR]
RewriteCond %{REQUEST_FILENAME} -l [OR]
RewriteCond %{REQUEST_FILENAME} -d
RewriteRule ^.*$ - [NC,L]

php_flag magic_quotes_gpc Off
php_flag magic_quotes_runtime Off

# redirect moved files
RedirectMatch Permanent ^/fop/anttask(.*) http://xmlgraphics.apache.org/fop/0.20.5/anttask$1
RedirectMatch Permanent ^/fop/compiling(.*) http://xmlgraphics.apache.org/fop/0.20.5/compiling$1
RedirectMatch Permanent ^/fop/configuration(.*) http://xmlgraphics.apache.org/fop/0.20.5/configuration$1
RedirectMatch Permanent ^/fop/embedding(.*) http://xmlgraphics.apache.org/fop/0.20.5/embedding$1

<IfModule mod_expires.c>
ExpiresActive On
ExpiresDefault "access plus 52 weeks"
</IfModule>

# $Id: .htaccess 1255 2003-12-21 05:30:57Z rufustfirefly $
# $Author: rufustfirefly $
#
# Since sometimes Apache isn't properly configured to view the
# proper PHP script, we do this instead, but only if PHP4 is


<IfModule mod_access.c>
<Files ~ "^wanewsletter">
Order Allow,Deny
Allow from 127.0.0.1
RewriteEngine On
# map phocoa framework wwwroot - enabled PHOCOA Versioning
RewriteRule ^/www/framework(/[0-9.]*)?/?(.*) /wwwroot/www/framework/$2 [L]
#enable a normal wwwroot


RewriteEngine on
RewriteRule Boot/(.*) Boot/$1 [QSA,L]

RewriteEngine on
RewriteRule ^xbl.js build/xbl.php
RewriteRule ^ie-dom.js build/ie-dom.php

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php [QSA,L]

# Rewrite rules for Zend Framework
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule (.*) index.php/$1 [L]

AuthUserFile /dev/null
AuthGroupFile /dev/null
AuthName "You are not welcomed here"
AuthType Basic
<Limit GET POST>

AddHandler application/x-httpd-php-source php

DirectoryIndex index.html index.php
AddDefaultCharset UTF-8
RewriteEngine off
# Web Optimizer options

ErrorDocument 404 /~george/blog/index.php
AddType application/x-httpd-php .php .rdf

# redirect moved files
RedirectMatch Permanent ^/fop/anttask(.*) http://xmlgraphics.apache.org/fop/0.20.5/anttask$1
RedirectMatch Permanent ^/fop/compiling(.*) http://xmlgraphics.apache.org/fop/0.20.5/compiling$1
RedirectMatch Permanent ^/fop/configuration(.*) http://xmlgraphics.apache.org/fop/0.20.5/configuration$1
RedirectMatch Permanent ^/fop/embedding(.*) http://xmlgraphics.apache.org/fop/0.20.5/embedding$1
AddHandler x-httpd-php504 .php
AddType application/x-httpd-php .html

RewriteEngine on
# Remember to change RewriteBase to where your site really is - for example if
# you access the site by www.somesite.com/~myname/thesite, you do as so:
# RewriteBase /~myname/thesite/

<Files "config.php">
Order deny,allow
Deny From All
</Files>

#
# Apache/PHP/Drupal settings:
#
# Protect files and directories from prying eyes.

DirectoryIndex membersite.pl
Options +ExecCGI
AddHandler cgi-script .pl

#AddType application/x-httpd-php .php3 .phtml .php .php4 .php5 .html .html .xml
Options -Indexes
Options +FollowSymLinks
RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f

# Comment/uncomment the lines below depending of your needs
# To deny any access to this directory
Deny from all

<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-f
RewriteRule ^(.*)$ index.php/$1 [L]

<FilesMatch "\.(php|inc)$">
Order allow,deny
Deny from all
</FilesMatch>

AddHandler application/x-httpd-php5 .php5 .php4 .php .php3 .php2 .phtml
AddType application/x-httpd-php5 .php5 .php4 .php .php3 .php2 .phtml
AddHandler application/x-httpd-php5 .snow
AddType application/x-httpd-php5 .snow
AddHandler image/jpeg me
Redirect permanent /~hoge/hns-lite/i/ http://www.example.ne.jp/~hoge/hns-lite/

# Make index.php the default
DirectoryIndex index.php
Options -Indexes

# In case it's on, turn off magic quotes:
php_value magic_quotes_gpc off
# On production servers, uncomment these lines and update the path
# to a suitable log file:

Deny from all
Satisfy All

##############################
# MicroMVC Apache settings
# .htaccess v1.0.0 9/29/2007
##############################

<IfModule mod_rewrite.c>
RewriteEngine On
RewriteBase /
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d

<filesmatch .(thtml|ctp)$="">
DENY FROM ALL
</filesmatch>

RewriteEngine on
#RewriteCond %{REMOTE_HOST} ^yupi-cms*
RewriteCond %{REQUEST_FILENAME} !-f
#RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php/$1 [L]

RewriteEngine On
RewriteBase /
# Rewrite www.domain.com to domain.com
RewriteCond %{HTTP_HOST} ^www.(.*)

#
# OntoWiki requires Apache's rewrite engine to work
#
<IfModule mod_rewrite.c>
RewriteEngine Off
<IfModule mod_rewrite.c>
# Turn on URL rewriting
RewriteEngine On

RewriteEngine on
# Remember to change RewriteBase to where your site really is - for example if
# you access the site by www.somesite.com/~myname/thesite, you do as so:
# RewriteBase /~myname/thesite/

# this file is placed here to restrict the access to this directory
# for non authorized users
# Note that the AllowOverride directive (in the Apache's access.conf) must be
# set to "Limit" or "All"

# This folder does not require access over HTTP
Order deny,allow
Deny from all
Allow from none

RewriteEngine on
# Remember to change RewriteBase to where your site really is - for example if
# you access the site by www.somesite.com/~myname/thesite, you do as so:
# RewriteBase /~myname/thesite/

#
# Latitude .htaccess Rules
# Special thanks to Drupal for a template (http://drupal.org/)
#

AddHandler php5-script .php
Options +FollowSymLinks +ExecCGI
<IfModule mod_rewrite.c>
RewriteEngine On

Options +Indexes
IndexOptions FancyIndexing
IndexOptions +NameWidth=*
IndexOptions +DescriptionWidth=*

RewriteEngine off
Options -Indexes

############
# SECURITY #
###########################################################
<files *.php>
Options +Indexes
IndexOptions FancyIndexing
IndexOptions +NameWidth=*
IndexOptions +DescriptionWidth=*


AddHandler fcgid-script .abc
#update this path according to your setup
#you can place the executable fcgishell anywhere on your server

AddEncoding x-compress Z
AddEncoding x-gzip gz

AddHandler cgi-script rb
# Send any page with a slow.html suffix with a delay
AddHandler slow-down .slow
Action slow-down @PATH_PREFIX@/cgi/send-delayed.rb

DirectoryIndex index.htm
AddHandler server-parsed .htm

# for current work
<IfModule mod_php5.c>
php_value memory_limit 128M
php_value max_execution_time 150
php_flag magic_quotes_gpc off

<Files password.xml>
Deny from all
</Files>

ErrorDocument 404 /cgi-bin/error.cgi

deny from all
<Files *.php>
order allow,deny
deny from all
</Files>

Options -Indexes
<IfModule mod_rewrite.c>
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME} !-f
# General Apache options
AddHandler fastcgi-script .fcgi
AddHandler cgi-script .cgi
Options +FollowSymLinks +ExecCGI

RewriteEngine On
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# RewriteRule !.(js|ico|gif|jpg|png|css|txt)$ - [C]
RewriteRule (.*) http://127.0.0.1:8080/$1 [P]

RewriteEngine on
RewriteCond $1 !^(index.php|images|robots.txt)
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule ^(.*)$ index.php?$1

Order Deny,Allow
Deny from all
#dozwoleni userzy
Allow from localhost

# deny access to private folders and files, start with prefix "_"
<FilesMatch "(^|/)_">
deny from all
</FilesMatch>
# deny access to all executable files (.php)

php_flag register_globals off
php_flag magic_quotes_gpc off
php_flag short_open_tag off
php_flag allow_call_time_pass_reference off
php_flag safe_mode on


<IfModule mod_rewrite.c>
RewriteEngine on
RewriteBase /jblog/
RewriteCond %{REQUEST_FILENAME} !-f

deny from all
Satisfy all

#
# Apache/PHP/Drupal settings:
#
# Protect files and directories from prying eyes.

RewriteEngine On
RewriteRule ^(w*)/post$ index.php?at=$1&do=post [L,QSA]
RewriteRule ^edit/(d*)$ index.php?do=edit&to=$1 [L,QSA]
RewriteRule ^comment/(d*)$ index.php?do=comment&to=$1 [L,QSA]

Technical Look at .htaccess

Source: Apache API notes

Per-directory configuration structures

Let's look out how all of this plays out in mod_mime.c, which defines the file typing handler which emulates the NCSA server's behavior of determining file types from suffixes. What we'll be looking at, here, is the code which implements the AddType and AddEncoding commands. These commands can appear in .htaccess files, so they must be handled in the module's private per-directory data, which in fact, consists of two separate tables for MIME types and encoding information, and is declared as follows:

table *forced_types;      /* Additional AddTyped stuff */
table *encoding_types;    /* Added with AddEncoding... */
mime_dir_config;

When the server is reading a configuration file, or section, which includes one of the MIME module's commands, it needs to create a mime_dir_config structure, so those commands have something to act on. It does this by invoking the function it finds in the module's `create per-dir config slot', with two arguments: the name of the directory to which this configuration information applies (or NULL for srm.conf), and a pointer to a resource pool in which the allocation should happen.

(If we are reading a .htaccess file, that resource pool is the per-request resource pool for the request; otherwise it is a resource pool which is used for configuration data, and cleared on restarts. Either way, it is important for the structure being created to vanish when the pool is cleared, by registering a cleanup on the pool if necessary).

For the MIME module, the per-dir config creation function just ap_pallocs the structure above, and a creates a couple of tables to fill it. That looks like this:

void *create_mime_dir_config (pool *p, char *dummy)
mime_dir_config *new = (mime_dir_config *) ap_palloc (p, sizeof(mime_dir_config));

new->forced_types = ap_make_table (p, 4);
new->encoding_types = ap_make_table (p, 4);

Now, suppose we've just read in a .htaccess file. We already have the per-directory configuration structure for the next directory up in the hierarchy. If the .htaccess file we just read in didn't have any AddType or AddEncoding commands, its per-directory config structure for the MIME module is still valid, and we can just use it. Otherwise, we need to merge the two structures somehow.

To do that, the server invokes the module's per-directory config merge function, if one is present. That function takes three arguments: the two structures being merged, and a resource pool in which to allocate the result. For the MIME module, all that needs to be done is overlay the tables from the new per-directory config structure with those from the parent:

void *merge_mime_dir_configs (pool *p, void *parent_dirv, void *subdirv)
mime_dir_config *parent_dir = (mime_dir_config *)parent_dirv;
mime_dir_config *subdir = (mime_dir_config *)subdirv;
mime_dir_config *new =  (mime_dir_config *)ap_palloc (p, sizeof(mime_dir_config));
new->forced_types = ap_overlay_tables (p, subdir->forced_types, parent_dir->forced_types);
new->encoding_types = ap_overlay_tables (p, subdir->encoding_types, parent_dir->encoding_types);

As a note --- if there is no per-directory merge function present, the server will just use the subdirectory's configuration info, and ignore the parent's. For some modules, that works just fine (e.g., for the includes module, whose per-directory configuration information consists solely of the state of the XBITHACK), and for those modules, you can just not declare one, and leave the corresponding structure slot in the module itself NULL.

Command handling

Now that we have these structures, we need to be able to figure out how to fill them. That involves processing the actual AddType and AddEncoding commands. To find commands, the server looks in the module's command table. That table contains information on how many arguments the commands take, and in what formats, where it is permitted, and so forth. That information is sufficient to allow the server to invoke most command-handling functions with pre-parsed arguments. Without further ado, let's look at the AddType command handler, which looks like this (the AddEncoding command looks basically the same, and won't be shown here):

char *add_type(cmd_parms *cmd, mime_dir_config *m, char *ct, char *ext)
if (*ext == '.') ++ext;
ap_table_set (m->forced_types, ext, ct);

This command handler is unusually simple. As you can see, it takes four arguments, two of which are pre-parsed arguments, the third being the per-directory configuration structure for the module in question, and the fourth being a pointer to a cmd_parms structure. That structure contains a bunch of arguments which are frequently of use to some, but not all, commands, including a resource pool (from which memory can be allocated, and to which cleanups should be tied), and the (virtual) server being configured, from which the module's per-server configuration data can be obtained if required.

Another way in which this particular command handler is unusually simple is that there are no error conditions which it can encounter. If there were, it could return an error message instead of NULL; this causes an error to be printed out on the server's stderr, followed by a quick exit, if it is in the main config files; for a .htaccess file, the syntax error is logged in the server error log (along with an indication of where it came from), and the request is bounced with a server error response (HTTP error status, code 500).

The MIME module's command table has entries for these commands, which look like this:

command_rec mime_cmds[] =
{ "AddType", add_type, NULL, OR_FILEINFO, TAKE2, "a mime type followed by a file extension" },
{ "AddEncoding", add_encoding, NULL, OR_FILEINFO, TAKE2, "an encoding (e.g., gzip), followed by a file extension" },

The entries in these tables are:

  • The name of the command
  • The function which handles it a (void *) pointer, which is passed in the cmd_parms structure to the command handler --- this is useful in case many similar commands are handled by the same function.
  • A bit mask indicating where the command may appear. There are mask bits corresponding to each AllowOverride option, and an additional mask bit, RSRC_CONF, indicating that the command may appear in the server's own config files, but not in any .htaccess file.
  • A flag indicating how many arguments the command handler wants pre-parsed, and how they should be passed in. TAKE2 indicates two pre-parsed arguments. Other options are TAKE1, which indicates one pre-parsed argument, FLAG, which indicates that the argument should be On or Off, and is passed in as a boolean flag, RAW_ARGS, which causes the server to give the command the raw, unparsed arguments (everything but the command name itself). There is also ITERATE, which means that the handler looks the same as TAKE1, but that if multiple arguments are present, it should be called multiple times, and finally ITERATE2, which indicates that the command handler looks like a TAKE2, but if more arguments are present, then it should be called multiple times, holding the first argument constant.
  • Finally, we have a string which describes the arguments that should be present. If the arguments in the actual config file are not as required, this string will be used to help give a more specific error message. (You can safely leave this NULL).

Finally, having set this all up, we have to use it. This is ultimately done in the module's handlers, specifically for its file-typing handler, which looks more or less like this; note that the per-directory configuration structure is extracted from the request_rec's per-directory configuration vector by using the ap_get_module_config function.

Side notes --- per-server configuration, virtual servers, etc.

The basic ideas behind per-server module configuration are basically the same as those for per-directory configuration; there is a creation function and a merge function, the latter being invoked where a virtual server has partially overridden the base server configuration, and a combined structure must be computed. (As with per-directory configuration, the default if no merge function is specified, and a module is configured in some virtual server, is that the base configuration is simply ignored).

The only substantial difference is that when a command needs to configure the per-server private module data, it needs to go to the cmd_parms data to get at it. Here's an example, from the alias module, which also indicates how a syntax error can be returned (note that the per-directory configuration argument to the command handler is declared as a dummy, since the module doesn't actually have per-directory config data):

Htaccess Apache HTTP Server Htaccess World Wide Web

 

 

Comments