Crazy Advanced Mod_Rewrite Debug Tutorial
September 11th, 2009
Are you an advanced mod_rewrite expert or guru? This article is for YOU too! Just make sure to read all the way to the bottom.
The following undocumented techniques and methods will let you use mod_rewrite at an "expert level" by showing you how to unlock its secrets.
Most, if not all, web developers and server administrators struggle with Apache mod_rewrite. It's very tough and only gets a little easier with practice. Until now! Get ready to explode your learning curve: I figured something out.
Why mod_rewrite is so tough
I have come to the conclusion, after many hours of zenful thought, that the reason mod_rewrite is so tough is pretty obvious: people are trying to apply regular expressions to URLs and variables that they don't really understand. They understand what they want, but they don't understand what the URLs and variables they are trying to rewrite actually look like.
Hit-Or-Miss with mod_rewrite
A lot of the mod_rewrite "experts" and "gurus" floating around the net absolutely know their mod_rewrite, but what separates them from a beginner is, for the most part, an understanding of what the URLs and variables targeted by the regular expressions look like. Take this simple RewriteRule that redirects requests made without the www to the www host.
RewriteEngine On
RewriteBase /
RewriteCond %{HTTP_HOST} !^www\.askapache\.com$ [NC]
RewriteRule .+ http://www.askapache.com%{REQUEST_URI}
Pretty simple, right? WRONG. Most people could not figure that out.
Why?
The reason intelligent people can't figure that out is because they have no idea what HTTP_HOST or REQUEST_URI actually looks like. How can you write a rule for something if you don't know what it looks like? You can't.
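To make that concrete, here is the same redirect annotated with the values the variables would hold for one sample request. The sample URL is an assumption chosen for illustration, and the `^` pattern and explicit [R=301,L] flags are my additions rather than part of the rule above.

```apache
# Sample request (hypothetical): http://askapache.com/protest/index.php?e=404
#   %{HTTP_HOST}    -> askapache.com        (the Host header: no scheme, no path)
#   %{REQUEST_URI}  -> /protest/index.php   (the path only, without the query string)
#   %{QUERY_STRING} -> e=404
RewriteEngine On
# The host is not www.askapache.com, so this condition is true...
RewriteCond %{HTTP_HOST} !^www\.askapache\.com$ [NC]
# ...and the client is redirected to http://www.askapache.com/protest/index.php?e=404
# (the original query string is re-appended because the target has none of its own).
RewriteRule ^ http://www.askapache.com%{REQUEST_URI} [R=301,L]
```

Read this way, the regular expression is the easy part; the work is knowing that HTTP_HOST is a bare hostname and REQUEST_URI is a path.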
When Not To Use Mod_Rewrite
OK, so here's an important concept that a lot of people haven't heard: you should only use mod_rewrite's RewriteRule when you also need a RewriteCond, or when you are rewriting internally, like my feedcount hack.
If you are simply redirecting one URL to another, you should definitely be using mod_alias's much easier Redirect and RedirectMatch directives; mod_alias is enabled on most Apache servers.
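As a rough sketch of what that looks like, the two mod_alias directives below handle a fixed redirect and a pattern-based redirect with no mod_rewrite involved. Every path and the example.com host are placeholders, not URLs from this site.

```apache
# Requires mod_alias. All paths and the example.com host are placeholders.
# Fixed redirect: /old-page.html -> /new-page.html
Redirect 301 /old-page.html http://www.example.com/new-page.html
# Regex redirect: /blog/2009/some-post -> /archive/2009/some-post
RedirectMatch 301 ^/blog/([0-9]{4})/(.*)$ http://www.example.com/archive/$1/$2
```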
When To Use Mod_Rewrite
So then, you should only use mod_rewrite's RewriteRule when you are checking against one of the Apache environment variables to determine whether to rewrite or not. This is where the Apache documentation is grossly lacking: it doesn't tell you what those variables look like, leaving us completely incapable of creating rewrites based on them. Not anymore.
Mod_Rewrite Environment Variables (The Secret)
Here are the variables I have found accessible by mod_rewrite (both documented and undocumented). One thing to note is that you can set variables early in an .htaccess file using SetEnv, RewriteRule, Header, etc., and they will be accessible further down in the .htaccess file.
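Here is a minimal sketch of setting a variable early and reading it later in the same .htaccess file. It uses SetEnvIf rather than SetEnv because, according to the Apache documentation, SetEnv is applied late in request processing and is not a reliable input to RewriteRule. The names IS_CURL and INFO_CLIENT are made up for this example.

```apache
# Early: flag requests whose User-Agent header contains "curl".
SetEnvIf User-Agent "curl" IS_CURL=1

RewriteEngine On
# Later: the flag is readable through %{ENV:name}.
RewriteCond %{ENV:IS_CURL} =1
# "-" means no substitution; the rule only sets another variable.
RewriteRule .* - [E=INFO_CLIENT:curl]
```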
| Name |
|---|
| API_VERSION |
| AUTH_TYPE |
| CONTENT_LENGTH |
| CONTENT_TYPE |
| DOCUMENT_ROOT |
| GATEWAY_INTERFACE |
| HTTPS |
| HTTP_ACCEPT |
| HTTP_ACCEPT_CHARSET |
| HTTP_ACCEPT_ENCODING |
| HTTP_ACCEPT_LANGUAGE |
| HTTP_CACHE_CONTROL |
| HTTP_CONNECTION |
| HTTP_COOKIE |
| HTTP_FORWARDED |
| HTTP_HOST |
| HTTP_KEEP_ALIVE |
| HTTP_PROXY_CONNECTION |
| HTTP_REFERER |
| HTTP_USER_AGENT |
| IS_SUBREQ |
| ORIG_PATH_INFO |
| ORIG_PATH_TRANSLATED |
| ORIG_SCRIPT_FILENAME |
| ORIG_SCRIPT_NAME |
| PATH |
| PATH_INFO |
| PHP_SELF |
| QUERY_STRING |
| REDIRECT_QUERY_STRING |
| REDIRECT_REMOTE_USER |
| REDIRECT_STATUS |
| REDIRECT_URL |
| REMOTE_ADDR |
| REMOTE_HOST |
| REMOTE_IDENT |
| REMOTE_PORT |
| REMOTE_USER |
| REQUEST_FILENAME |
| REQUEST_METHOD |
| REQUEST_TIME |
| REQUEST_URI |
| SCRIPT_FILENAME |
| SCRIPT_GROUP |
| SCRIPT_NAME |
| SCRIPT_URI |
| SCRIPT_URL |
| SCRIPT_USER |
| SERVER_ADDR |
| SERVER_ADMIN |
| SERVER_NAME |
| SERVER_PORT |
| SERVER_PROTOCOL |
| SERVER_SIGNATURE |
| SERVER_SOFTWARE |
| THE_REQUEST |
| TIME |
| TIME_DAY |
| TIME_HOUR |
| TIME_MIN |
| TIME_MON |
| TIME_SEC |
| TIME_WDAY |
| TIME_YEAR |
| TZ |
| UNIQUE_ID |
Decoding Mod_Rewrite Variables
So when I realized my problem was that I didn't know the value of the variable being tested by the RewriteCond, I set out to discover how to view those variables. Keep in mind you can also use rewrite logging (RewriteLog), but that is only available to users who can edit httpd.conf (typically root); this article is about .htaccess.
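For readers who do have access to the server configuration, rewrite logging looks roughly like the sketch below. The log path is a placeholder, and which form applies depends on your Apache version, so treat this as a starting point rather than a drop-in config.

```apache
# Server config or <VirtualHost> only; not allowed in .htaccess.
# Apache 2.2 and earlier (the log path is a placeholder):
RewriteLog "/var/log/apache2/rewrite.log"
RewriteLogLevel 3

# Apache 2.4 and later replaced both directives with per-module LogLevel;
# the trace output goes to the regular error log:
# LogLevel alert rewrite:trace3
```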
Setting Environment Variables with RewriteRule
I discovered a multitude of methods to set and view Apache environment variables, using various modules and some core tricks, but the method that lets me view the most environment variables is RewriteRule. I wanted to use SetEnvIf more, but it's just not as powerful as mod_rewrite because it can test far fewer variables.
This code sets the variable INFO_REQUEST_URI to have the value of REQUEST_URI.
RewriteEngine On
RewriteBase /
RewriteRule .* - [E=INFO_REQUEST_URI:%{REQUEST_URI},NE]
Saving the Apache Variable Values
Now the trick is how to view that environment variable. The method I came up with is nice: we send the environment variable's value in an HTTP header. Headers go through very little data manipulation or validation, so you get an accurate look at the actual value. At first I tried adding the variable value to a redirect's query string, but an HTTP_USER_AGENT value doesn't play well as a query string.
Using RequestHeader in .htaccess
This code takes advantage of the incredible mod_headers Apache module to actually ADD a whole new header to YOUR request. Seriously one of the coolest tricks I've found yet. It's almost the same as being able to spoof POST requests, since headers can carry protected data, especially the HTTP_COOKIE header.
RequestHeader set INFO_REQUEST_URI "%{INFO_REQUEST_URI}e"
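Putting the two steps together, a guarded version might look like the sketch below. The IfModule wrappers are my addition so that a missing module does not cause a 500 error. The hyphenated header name is also an assumption on my part: as far as I know, Apache 2.4 and later silently drop headers whose names contain underscores when building the CGI environment, while a hyphenated name still reaches the script as HTTP_INFO_REQUEST_URI.

```apache
<IfModule mod_rewrite.c>
    RewriteEngine On
    # Copy the value of REQUEST_URI into our own environment variable.
    RewriteRule .* - [E=INFO_REQUEST_URI:%{REQUEST_URI},NE]
</IfModule>

<IfModule mod_headers.c>
    # Hyphenated header name; a CGI-style script sees it as HTTP_INFO_REQUEST_URI.
    RequestHeader set INFO-REQUEST-URI "%{INFO_REQUEST_URI}e"
</IfModule>
```

Note that RequestHeader in .htaccess also needs AllowOverride FileInfo to be in effect for the directory.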
Viewing the Variable Values
Now you can use any kind of server-run interpreter, like Perl, PHP, or Ruby, to view all the variable values. All CGI-style script handlers like those are able to view request headers.
PHP Code to access Apache Variables
Works even in safe mode: any interpreter can view HTTP headers! Note that each of these variables is added as an HTTP header to the request for the script, which is kinda confusing, so each variable arrives in the script prefixed with HTTP_ to denote that it was a header.
<?php
// Dump the INFO_* request headers that the .htaccess rules added.
header("Content-Type: text/plain");
$INFO = $MISS = array();
foreach ($_SERVER as $v => $r)
{
    // Request headers arrive as HTTP_<NAME>, so ours start with HTTP_INFO_.
    if (substr($v, 0, 9) == 'HTTP_INFO')
    {
        // Strip the 10-character "HTTP_INFO_" prefix to recover the variable name.
        if (!empty($r)) $INFO[substr($v, 10)] = $r;
        else $MISS[substr($v, 10)] = $r;   // header was sent but its value is empty
    }
}
/* thanks Mike! */
ksort($INFO);
ksort($MISS);
ksort($_SERVER);
echo "Received These Variables:\n";
print_r($INFO);
echo "Missed These Variables:\n";
print_r($MISS);
echo "ALL Variables:\n";
print_r($_SERVER);
?>
Time to Get Crazy
Just create the above PHP file on your site as /test/index.php or whatever, then create /test/.htaccess containing the .htaccess snippet below. Now just request /test/index.php and be amazed! If you're looking for more general help, check out this excellent mod_rewrite cheat sheet.
OK, so I've prepared the .htaccess code you can use to view the values of all these variables. For this test I created an index.php file that printed out all the $_SERVER variables, and made requests to it.
RewriteEngine On
RewriteBase /
RewriteRule .* - [E=INFO_API_VERSION:%{API_VERSION},NE]
RewriteRule .* - [E=INFO_AUTH_TYPE:%{AUTH_TYPE},NE]
RewriteRule .* - [E=INFO_CONTENT_LENGTH:%{CONTENT_LENGTH},NE]
RewriteRule .* - [E=INFO_CONTENT_TYPE:%{CONTENT_TYPE},NE]
RewriteRule .* - [E=INFO_DOCUMENT_ROOT:%{DOCUMENT_ROOT},NE]
RewriteRule .* - [E=INFO_GATEWAY_INTERFACE:%{GATEWAY_INTERFACE},NE]
RewriteRule .* - [E=INFO_HTTPS:%{HTTPS},NE]
RewriteRule .* - [E=INFO_HTTP_ACCEPT:%{HTTP_ACCEPT},NE]
RewriteRule .* - [E=INFO_HTTP_ACCEPT_CHARSET:%{HTTP_ACCEPT_CHARSET},NE]
RewriteRule .* - [E=INFO_HTTP_ACCEPT_ENCODING:%{HTTP_ACCEPT_ENCODING},NE]
RewriteRule .* - [E=INFO_HTTP_ACCEPT_LANGUAGE:%{HTTP_ACCEPT_LANGUAGE},NE]
RewriteRule .* - [E=INFO_HTTP_CACHE_CONTROL:%{HTTP_CACHE_CONTROL},NE]
RewriteRule .* - [E=INFO_HTTP_CONNECTION:%{HTTP_CONNECTION},NE]
RewriteRule .* - [E=INFO_HTTP_COOKIE:%{HTTP_COOKIE},NE]
RewriteRule .* - [E=INFO_HTTP_FORWARDED:%{HTTP_FORWARDED},NE]
RewriteRule .* - [E=INFO_HTTP_HOST:%{HTTP_HOST},NE]
RewriteRule .* - [E=INFO_HTTP_KEEP_ALIVE:%{HTTP_KEEP_ALIVE},NE]
RewriteRule .* - [E=INFO_HTTP_MOD_SECURITY_MESSAGE:%{HTTP_MOD_SECURITY_MESSAGE},NE]
RewriteRule .* - [E=INFO_HTTP_PROXY_CONNECTION:%{HTTP_PROXY_CONNECTION},NE]
RewriteRule .* - [E=INFO_HTTP_REFERER:%{HTTP_REFERER},NE]
RewriteRule .* - [E=INFO_HTTP_USER_AGENT:%{HTTP_USER_AGENT},NE]
RewriteRule .* - [E=INFO_IS_SUBREQ:%{IS_SUBREQ},NE]
RewriteRule .* - [E=INFO_ORIG_PATH_INFO:%{ORIG_PATH_INFO},NE]
RewriteRule .* - [E=INFO_ORIG_PATH_TRANSLATED:%{ORIG_PATH_TRANSLATED},NE]
RewriteRule .* - [E=INFO_ORIG_SCRIPT_FILENAME:%{ORIG_SCRIPT_FILENAME},NE]
RewriteRule .* - [E=INFO_ORIG_SCRIPT_NAME:%{ORIG_SCRIPT_NAME},NE]
RewriteRule .* - [E=INFO_PATH:%{PATH},NE]
RewriteRule .* - [E=INFO_PATH_INFO:%{PATH_INFO},NE]
RewriteRule .* - [E=INFO_PHP_SELF:%{PHP_SELF},NE]
RewriteRule .* - [E=INFO_QUERY_STRING:%{QUERY_STRING},NE]
RewriteRule .* - [E=INFO_REDIRECT_QUERY_STRING:%{REDIRECT_QUERY_STRING},NE]
RewriteRule .* - [E=INFO_REDIRECT_REMOTE_USER:%{REDIRECT_REMOTE_USER},NE]
RewriteRule .* - [E=INFO_REDIRECT_STATUS:%{REDIRECT_STATUS},NE]
RewriteRule .* - [E=INFO_REDIRECT_URL:%{REDIRECT_URL},NE]
RewriteRule .* - [E=INFO_REMOTE_ADDR:%{REMOTE_ADDR},NE]
RewriteRule .* - [E=INFO_REMOTE_HOST:%{REMOTE_HOST},NE]
RewriteRule .* - [E=INFO_REMOTE_IDENT:%{REMOTE_IDENT},NE]
RewriteRule .* - [E=INFO_REMOTE_PORT:%{REMOTE_PORT},NE]
RewriteRule .* - [E=INFO_REMOTE_USER:%{REMOTE_USER},NE]
RewriteRule .* - [E=INFO_REQUEST_FILENAME:%{REQUEST_FILENAME},NE]
RewriteRule .* - [E=INFO_REQUEST_METHOD:%{REQUEST_METHOD},NE]
RewriteRule .* - [E=INFO_REQUEST_TIME:%{REQUEST_TIME},NE]
RewriteRule .* - [E=INFO_REQUEST_URI:%{REQUEST_URI},NE]
RewriteRule .* - [E=INFO_SCRIPT_FILENAME:%{SCRIPT_FILENAME},NE]
RewriteRule .* - [E=INFO_SCRIPT_GROUP:%{SCRIPT_GROUP},NE]
RewriteRule .* - [E=INFO_SCRIPT_NAME:%{SCRIPT_NAME},NE]
RewriteRule .* - [E=INFO_SCRIPT_URI:%{SCRIPT_URI},NE]
RewriteRule .* - [E=INFO_SCRIPT_URL:%{SCRIPT_URL},NE]
RewriteRule .* - [E=INFO_SCRIPT_USER:%{SCRIPT_USER},NE]
RewriteRule .* - [E=INFO_SERVER_ADDR:%{SERVER_ADDR},NE]
RewriteRule .* - [E=INFO_SERVER_ADMIN:%{SERVER_ADMIN},NE]
RewriteRule .* - [E=INFO_SERVER_NAME:%{SERVER_NAME},NE]
RewriteRule .* - [E=INFO_SERVER_PORT:%{SERVER_PORT},NE]
RewriteRule .* - [E=INFO_SERVER_PROTOCOL:%{SERVER_PROTOCOL},NE]
RewriteRule .* - [E=INFO_SERVER_SIGNATURE:%{SERVER_SIGNATURE},NE]
RewriteRule .* - [E=INFO_SERVER_SOFTWARE:%{SERVER_SOFTWARE},NE]
RewriteRule .* - [E=INFO_THE_REQUEST:%{THE_REQUEST},NE]
RewriteRule .* - [E=INFO_TIME:%{TIME},NE]
RewriteRule .* - [E=INFO_TIME_DAY:%{TIME_DAY},NE]
RewriteRule .* - [E=INFO_TIME_HOUR:%{TIME_HOUR},NE]
RewriteRule .* - [E=INFO_TIME_MIN:%{TIME_MIN},NE]
RewriteRule .* - [E=INFO_TIME_MON:%{TIME_MON},NE]
RewriteRule .* - [E=INFO_TIME_SEC:%{TIME_SEC},NE]
RewriteRule .* - [E=INFO_TIME_WDAY:%{TIME_WDAY},NE]
RewriteRule .* - [E=INFO_TIME_YEAR:%{TIME_YEAR},NE]
RewriteRule .* - [E=INFO_TZ:%{TZ},NE]
RewriteRule .* - [E=INFO_UNIQUE_ID:%{UNIQUE_ID},NE]
RequestHeader set INFO_API_VERSION "%{INFO_API_VERSION}e"
RequestHeader set INFO_AUTH_TYPE "%{INFO_AUTH_TYPE}e"
RequestHeader set INFO_CONTENT_LENGTH "%{INFO_CONTENT_LENGTH}e"
RequestHeader set INFO_CONTENT_TYPE "%{INFO_CONTENT_TYPE}e"
RequestHeader set INFO_DOCUMENT_ROOT "%{INFO_DOCUMENT_ROOT}e"
RequestHeader set INFO_GATEWAY_INTERFACE "%{INFO_GATEWAY_INTERFACE}e"
RequestHeader set INFO_HTTPS "%{INFO_HTTPS}e"
RequestHeader set INFO_HTTP_ACCEPT "%{INFO_HTTP_ACCEPT}e"
RequestHeader set INFO_HTTP_ACCEPT_CHARSET "%{INFO_HTTP_ACCEPT_CHARSET}e"
RequestHeader set INFO_HTTP_ACCEPT_ENCODING "%{INFO_HTTP_ACCEPT_ENCODING}e"
RequestHeader set INFO_HTTP_ACCEPT_LANGUAGE "%{INFO_HTTP_ACCEPT_LANGUAGE}e"
RequestHeader set INFO_HTTP_CACHE_CONTROL "%{INFO_HTTP_CACHE_CONTROL}e"
RequestHeader set INFO_HTTP_CONNECTION "%{INFO_HTTP_CONNECTION}e"
RequestHeader set INFO_HTTP_COOKIE "%{INFO_HTTP_COOKIE}e"
RequestHeader set INFO_HTTP_FORWARDED "%{INFO_HTTP_FORWARDED}e"
RequestHeader set INFO_HTTP_HOST "%{INFO_HTTP_HOST}e"
RequestHeader set INFO_HTTP_KEEP_ALIVE "%{INFO_HTTP_KEEP_ALIVE}e"
RequestHeader set INFO_HTTP_MOD_SECURITY_MESSAGE "%{INFO_HTTP_MOD_SECURITY_MESSAGE}e"
RequestHeader set INFO_HTTP_PROXY_CONNECTION "%{INFO_HTTP_PROXY_CONNECTION}e"
RequestHeader set INFO_HTTP_REFERER "%{INFO_HTTP_REFERER}e"
RequestHeader set INFO_HTTP_USER_AGENT "%{INFO_HTTP_USER_AGENT}e"
RequestHeader set INFO_IS_SUBREQ "%{INFO_IS_SUBREQ}e"
RequestHeader set INFO_ORIG_PATH_INFO "%{INFO_ORIG_PATH_INFO}e"
RequestHeader set INFO_ORIG_PATH_TRANSLATED "%{INFO_ORIG_PATH_TRANSLATED}e"
RequestHeader set INFO_ORIG_SCRIPT_FILENAME "%{INFO_ORIG_SCRIPT_FILENAME}e"
RequestHeader set INFO_ORIG_SCRIPT_NAME "%{INFO_ORIG_SCRIPT_NAME}e"
RequestHeader set INFO_PATH "%{INFO_PATH}e"
RequestHeader set INFO_PATH_INFO "%{INFO_PATH_INFO}e"
RequestHeader set INFO_PHP_SELF "%{INFO_PHP_SELF}e"
RequestHeader set INFO_QUERY_STRING "%{INFO_QUERY_STRING}e"
RequestHeader set INFO_REDIRECT_QUERY_STRING "%{INFO_REDIRECT_QUERY_STRING}e"
RequestHeader set INFO_REDIRECT_REMOTE_USER "%{INFO_REDIRECT_REMOTE_USER}e"
RequestHeader set INFO_REDIRECT_STATUS "%{INFO_REDIRECT_STATUS}e"
RequestHeader set INFO_REDIRECT_URL "%{INFO_REDIRECT_URL}e"
RequestHeader set INFO_REMOTE_ADDR "%{INFO_REMOTE_ADDR}e"
RequestHeader set INFO_REMOTE_HOST "%{INFO_REMOTE_HOST}e"
RequestHeader set INFO_REMOTE_IDENT "%{INFO_REMOTE_IDENT}e"
RequestHeader set INFO_REMOTE_PORT "%{INFO_REMOTE_PORT}e"
RequestHeader set INFO_REMOTE_USER "%{INFO_REMOTE_USER}e"
RequestHeader set INFO_REQUEST_FILENAME "%{INFO_REQUEST_FILENAME}e"
RequestHeader set INFO_REQUEST_METHOD "%{INFO_REQUEST_METHOD}e"
RequestHeader set INFO_REQUEST_TIME "%{INFO_REQUEST_TIME}e"
RequestHeader set INFO_REQUEST_URI "%{INFO_REQUEST_URI}e"
RequestHeader set INFO_SCRIPT_FILENAME "%{INFO_SCRIPT_FILENAME}e"
RequestHeader set INFO_SCRIPT_GROUP "%{INFO_SCRIPT_GROUP}e"
RequestHeader set INFO_SCRIPT_NAME "%{INFO_SCRIPT_NAME}e"
RequestHeader set INFO_SCRIPT_URI "%{INFO_SCRIPT_URI}e"
RequestHeader set INFO_SCRIPT_URL "%{INFO_SCRIPT_URL}e"
RequestHeader set INFO_SCRIPT_USER "%{INFO_SCRIPT_USER}e"
RequestHeader set INFO_SERVER_ADDR "%{INFO_SERVER_ADDR}e"
RequestHeader set INFO_SERVER_ADMIN "%{INFO_SERVER_ADMIN}e"
RequestHeader set INFO_SERVER_NAME "%{INFO_SERVER_NAME}e"
RequestHeader set INFO_SERVER_PORT "%{INFO_SERVER_PORT}e"
RequestHeader set INFO_SERVER_PROTOCOL "%{INFO_SERVER_PROTOCOL}e"
RequestHeader set INFO_SERVER_SIGNATURE "%{INFO_SERVER_SIGNATURE}e"
RequestHeader set INFO_SERVER_SOFTWARE "%{INFO_SERVER_SOFTWARE}e"
RequestHeader set INFO_THE_REQUEST "%{INFO_THE_REQUEST}e"
RequestHeader set INFO_TIME "%{INFO_TIME}e"
RequestHeader set INFO_TIME_DAY "%{INFO_TIME_DAY}e"
RequestHeader set INFO_TIME_HOUR "%{INFO_TIME_HOUR}e"
RequestHeader set INFO_TIME_MIN "%{INFO_TIME_MIN}e"
RequestHeader set INFO_TIME_MON "%{INFO_TIME_MON}e"
RequestHeader set INFO_TIME_SEC "%{INFO_TIME_SEC}e"
RequestHeader set INFO_TIME_WDAY "%{INFO_TIME_WDAY}e"
RequestHeader set INFO_TIME_YEAR "%{INFO_TIME_YEAR}e"
RequestHeader set INFO_TZ "%{INFO_TZ}e"
RequestHeader set INFO_UNIQUE_ID "%{INFO_UNIQUE_ID}e"
Mod_Rewrite Variables Decoded!
[API_VERSION] => 20020903:12
[AUTH_TYPE] => Digest
[DOCUMENT_ROOT] => /home/user/www_root/askapache.com
[HTTPS] => off
[HTTP_ACCEPT] => text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
[HTTP_COOKIE] => PHPSESSID=752ee6d56e15f305233e30045987e5ce568c034; __qca=1176541225-59967328-5223185;
[HTTP_HOST] => www.askapache.com
[HTTP_REFERER] => http://www.askapache.com/protest/index.php?askapache=awesomeness&you=rock
[HTTP_USER_AGENT] => Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.16) Gecko/20080702 Firefox/2.0.0.16
[IS_SUBREQ] => false
[QUERY_STRING] => e=404
[REMOTE_ADDR] => 22.162.144.211
[REMOTE_HOST] => 22.162.144.211
[REMOTE_PORT] => 4511
[REMOTE_USER] => administrator
[REQUEST_FILENAME] => /home/user/www_root/askapache.com/protest/index.php
[REQUEST_METHOD] => GET
[REQUEST_URI] => /protest/index.php
[SCRIPT_FILENAME] => /home/user/www_root/askapache.com/protest/index.php
[SCRIPT_GROUP] => daemonu
[SCRIPT_USER] => askapache
[SERVER_ADDR] => 208.113.134.190
[SERVER_ADMIN] => webmaster@askapache.com
[SERVER_NAME] => www.askapache.com
[SERVER_PORT] => 80
[SERVER_PROTOCOL] => HTTP/1.1
[SERVER_SOFTWARE] => Apache/2.0.61 (Unix) PHP/4.4.7 mod_ssl/2.0.61 OpenSSL/0.9.7e mod_fastcgi/2.4.2 DAV/2
[THE_REQUEST] => GET /protest/adf HTTP/1.1
[TIME] => 20080820014309
[TIME_DAY] => 20
[TIME_HOUR] => 01
[TIME_MIN] => 43
[TIME_MON] => 08
[TIME_SEC] => 09
[TIME_WDAY] => 3
[TIME_YEAR] => 2008
Request using HTTPS
[API_VERSION] => 20020903:12
[AUTH_TYPE] => Digest
[DOCUMENT_ROOT] => /home/user/www_root/askapache.com
[HTTPS] => on
[HTTP_ACCEPT] => text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
[HTTP_COOKIE] => PHPSESSID=752ee6d56e15f305233e30045987e5ce568c034; __qca=1176541225-59967328-5223185;
[HTTP_HOST] => www.askapache.com
[HTTP_REFERER] => http://www.askapache.com/protest/index.php?askapache=awesomeness&you=rock
[HTTP_USER_AGENT] => Mozilla/5.0 (Windows; U; Windows NT 5.1; en-US; rv:1.8.1.16) Gecko/20080702 Firefox/2.0.0.16
[IS_SUBREQ] => false
[QUERY_STRING] => hi=you&whats=&you
[REMOTE_ADDR] => 22.162.144.211
[REMOTE_HOST] => 22.162.144.211
[REMOTE_PORT] => 4605
[REMOTE_USER] => administrator
[REQUEST_FILENAME] => /home/user/www_root/askapache.com/protest/index.php
[REQUEST_METHOD] => GET
[REQUEST_URI] => /protest/index.php
[SCRIPT_FILENAME] => /home/user/www_root/askapache.com/protest/index.php
[SCRIPT_GROUP] => daemonu
[SCRIPT_USER] => askapache
[SERVER_ADDR] => 208.113.134.190
[SERVER_ADMIN] => webmaster@askapache.com
[SERVER_NAME] => www.askapache.com
[SERVER_PORT] => 443
[SERVER_PROTOCOL] => HTTP/1.1
[SERVER_SOFTWARE] => Apache/2.0.61 (Unix) PHP/4.4.7 mod_ssl/2.0.61 OpenSSL/0.9.7e mod_fastcgi/2.4.2 DAV/2
[THE_REQUEST] => GET /protest/index.php?hi=you&whats=&you HTTP/1.1
[TIME] => 20080820015016
[TIME_DAY] => 20
[TIME_HOUR] => 01
[TIME_MIN] => 50
[TIME_MON] => 08
[TIME_SEC] => 16
[TIME_WDAY] => 3
[TIME_YEAR] => 2008
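Once you know what a variable actually contains, writing the condition is straightforward. For instance, the two dumps above show %{HTTPS} as the literal strings off and on, which suggests a sketch like the following for forcing HTTPS. The rule is an illustration of using the decoded values, not something from the original test setup.

```apache
RewriteEngine On
# %{HTTPS} is "on" for SSL requests and "off" otherwise, as in the dumps above.
RewriteCond %{HTTPS} !=on
# Rebuild the same URL on https from the Host header and the request path.
RewriteRule ^ https://%{HTTP_HOST}%{REQUEST_URI} [R=301,L]
```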
Emulating ErrorDocuments with Mod_Rewrite
The ErrorDocument directive is helpful because an error document is called differently than a normal file, and it receives special variables that help an admin debug.
I've wanted to use a RewriteCond plus a RewriteRule to cause an Apache ErrorDocument to be displayed for a long time, and I finally figured it out. Simply use the HTTP status code trick in combination with a simple RewriteRule to trigger an Apache ErrorDocument.
This code emulates the internal 404 process Apache goes through. If the requested file is not found, the request is rewritten internally to /test/trigger-error/404, which triggers the 404 ErrorDocument.
ErrorDocument 404 /test/errordocument/404.html
Redirect 404 /test/trigger-error/404
RewriteEngine On
RewriteBase /
RewriteCond %{ENV:REDIRECT_STATUS} !=404
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
RewriteRule .* /test/trigger-error/404 [L]
Big deal, you might say. Well, consider that this works with any status code, and with this method you now have the power to trigger any ErrorDocument page based on any kind of RewriteCond. I'll be writing about some practical uses for this powerful method in the coming weeks, but here's a good example now so you can see how it can be used.
This bit of code triggers a 505 HTTP Version Not Supported response when a request is made to the server with a protocol version other than 0.9, 1.0, or 1.1.
ErrorDocument 505 /test/errordocument/505.html
Redirect 505 /test/trigger-error/505
RewriteEngine On
RewriteBase /
RewriteCond %{ENV:REDIRECT_STATUS} !=505
RewriteCond %{THE_REQUEST} !^[A-Z]{3,9}\ /.*\ HTTP/(0\.9|1\.0|1\.1) [NC]
RewriteRule .* /test/trigger-error/505 [L]
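A different way to get the same effect, which is not the method used in this article, is to let the R flag return the status directly. This sketch assumes an Apache version whose R flag accepts non-3xx status codes; in that case the substitution is ignored and rewriting stops, so the Redirect trigger URL is not needed. It reuses the 404 ErrorDocument path from the first example.

```apache
ErrorDocument 404 /test/errordocument/404.html

RewriteEngine On
# Skip requests that are already being handled as a 404.
RewriteCond %{ENV:REDIRECT_STATUS} !=404
RewriteCond %{REQUEST_FILENAME} !-f
RewriteCond %{REQUEST_FILENAME} !-d
# With a non-3xx status the substitution is ignored, so "-" is enough.
RewriteRule .* - [R=404,L]
```

The Redirect-based trigger has the advantage of working on older servers, which is likely why it is the one shown above.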
YES! I realize I didn't explain that very well; I didn't realize it was that complicated. I wanted to go into how to use these advanced tricks and methods to achieve some really cool stuff, but explaining just this little bit took me a while and I'm out of page space!
So play around with this, and I'll post back some of the untapped sicknesses you can give a website with such powerful methods at your disposal.
Ralf S. Engelschall
/*
 * URL Rewriting Module
 *
 * This module uses a rule-based rewriting engine (based on a
 * regular-expression parser) to rewrite requested URLs on the fly.
 *
 * It supports an unlimited number of additional rule conditions (which can
 * operate on a lot of variables, even on HTTP headers) for granular
 * matching and even external database lookups (either via plain text
 * tables, DBM hash files or even external processes) for advanced URL
 * substitution.
 *
 * It operates on the full URLs (including the PATH_INFO part) both in
 * per-server context (httpd.conf) and per-dir context (.htaccess) and even
 * can generate QUERY_STRING parts on result. The rewriting result finally
 * can lead to internal subprocessing, external request redirection or even
 * to internal proxy throughput.
 *
 * This module was originally written in April 1996 and
 * gifted exclusively to the The Apache Software Foundation in July 1997 by
 *
 * Ralf S. Engelschall
 * rse engelschall.com
 * www.engelschall.com
 */
Reader Comments
-
Where can I donate? This article is the missing link of all mod_rewrite so called documentation. Thank you so much!
-
Bookmarked. Forever. Big thanks, man.
-
Note that for some reason the original error status trick wasn't working for me if I followed it with an [L]; I had to use [PT] to pass the request back through. Here's an example of my code to return a 409 when certain headers are present:
ErrorDocument 409 /error_file_conflict.json
Redirect 409 /trigger-error/409
RewriteCond %{REQUEST_URI} ^/files/([^\/]+)/trunk/head/.*$
RewriteCond %{REQUEST_METHOD} ^PUT$
RewriteCond %{HTTP:X-Subversion-Revision} !283
Header set X-Subversion-Revision 283
RewriteRule ^.*$ /trigger-error/409 [PT]
Notice the [PT] at the very end. Here's an example cURL request that now correctly works with this:
curl http://api-local.inkling.com/files/ganongs_physiology_23e/trunk/head/s9ml/chapter04/ch04_1_testfigure_4-4.s9ml -i -H "Content-Type: text/xml" -H "X-Subversion-Revision: 282" -X PUT
It took me forever to find this, so posting in case it helps other people :)
Best, Brad Neuberg
-
So I have an interesting situation. I have two webservers running as virtual hosts, Site1 and Site2. They both run on the same set of hardware and both use the same IP address. When someone comes into Site2/site.gateway.html$ I need them to proxy to Site1/site.gateway.html$1.
RewriteCond %{REQUEST_URI} ^/site.gateway.html$
RewriteRule ^(.*)$ http://Site1$1 [NE,P]
Throws an error of:
Your browser sent a request that this server could not understand. Size of a request header field exceeds server limit. X-Forwarded-Host: site1, site1, site1 ...
The thing is, I have to have "ProxyPreserveHost On" because of other mod_proxy directives. Any ideas? I am thrown on this one.
-
Cheers, thanks for the awesome insights on mod_rewrite. Found this article while searching for information about the IS_SUBREQ variable. Still I couldn't find out, when exactly this variable is set to true. Could you maybe enlighten me? Thanks a lot, Pelle.
-
I'm lost. I need to change the variables SERVER_NAME and HTTP_POST, because behind Akamai these variables give me the name of the origin machine. Joomla uses these values when it rewrites URLs. I tried:
SetEnvIf SERVER_ADDR 192\.168\.74\.142 SERVER_NAME=i.like.this.name.com
RewriteRule .* - [E=SERVER_NAME:i.like.this.name.com,NE]
RequestHeader set SERVER_NAME "i.like.this.name.com"
In this case a variable HTTP_SERVER_NAME is created, but the value of SERVER_NAME does not change. No positive result.
-
This is what anybody searching Google for a solution needs. You have done a great job, especially in making mod_rewrite understandable in a unique and comprehensive way. Thanks for sharing such great and helpful content.
-
Dear Apache HTTP Server experts, when I applied RewriteMap with a text lookup, the text file had, say, 3 entries, and only the first entry's key is ever looked up successfully; lookups for the other keys fail although the entries are present. When I change the order of the entries in the text file, still only the first entry works and the rest fail. Please advise. Thanks
-
that was highly useful - the info I was looking for was already in the official mod_rewrite docs over at httpd.apache.org, but it was easier to find exactly what I was looking for on this page, and then I went back and got the more comprehensive docs. thanks a bunch!
-
And what if my rewriting rules fail and loop? Then I still can't display what error causes the recursion, and I can't display it using a script. The good way would be to turn on logging, but it's impossible for per-directory .htaccess configuration. Any ideas how to debug these variables without logging and without using scripts? Maybe some way to log these vars into AccessLog or ErrorLog?
-
It's a shame this article is not date-marked... I could only guess by some of the content that it may be from about 2008. As old as it is, it is not any less valuable now (2010) than it was then. Just writing to say Thank You so much for this detailed "expose"... even today this stuff is difficult to come by in comprehensive addressment.
-
Is it possible to use mod_rewrite to cache requests from a host Apache server to different guest vservers running Apache, emulating virtual hosts in which each virtual host is contained in its own vserver and has its own Apache process? I need it because I am tired of CGI, forbidden functions, and jails to disable access from one virtual host to another (file access), and I have ONLY one public IP address and I want all my requests to come to port :80 and be delivered to the local 192.168.x.x vservers... sorry for my English
-
What a pity such a helpful (definitive?) article is flooded with 'Gimme a solution - for free!' comments. This is exactly what I needed; to see what variables are available, how they are formatted and their typical (or current) value. I would have found this article sooner if it contained the keyword 'debug' | 'de-bug'. Many, many thanks, Alan.
-
This article is a very interesting read, and will come in handy no doubt in the future, thank you for the contribution :)
-
Thanks for explaining all these details! I thought mod_rewrite was everything about Regex but there are tons of other things to be aware of which makes it really powerful and something to spend time on it. Cheers!
-
I want to make sure this is posted on AS MANY mod_rewrite and mod_alias pages as I can find. Say you're wanting to redirect all MISSPELLED folder/directory names to the CORRECTLY SPELLED folder/directory URI, as I was wanting to do, and subsequently wasted nearly four hours trying to find out how to do it. You see, in my public_html directory, I had various folders with one index.php file in it, redirecting to the correct folder. However, I could not account for misspellings of the folder name. All my folders were spelled correctly, just with different capitalized letters. As you might imagine, this junked up my base folder quite a bit, and so I wanted to use .htaccess to redirect all misspelled AND wrongly capitalized variations (to a degree, of course) to the correct folder and thence, the correct index.php. Simple enough, right? Lots of web admins must have this same or very similar problem, Right????
Indubitably WRONG. I had to figure it out on my own. No code sample I sifted through even came close to what eventually ended up doing what I wanted to do. So I offer it to all of you now. Free of charge, no strings attached. You're welcome, and enjoy. :-)
Options +FollowSymLinks
RewriteEngine on
RewriteRule mad.*$ http://site.com/madeupeasytomisspellbadlynameddirectory [L,QSA,R=301,NC]
RewriteRule ano.*$ http://site.com/AnotherMadeUpDifficultToCapitalizeDirectoryOfWhichIHadOnMySite [L,QSA,R=301,NC]
Simple as that. The following is purely academic, for those of you (like me) who want to know how --and why (skip to the end)-- it works (Take a hint peoples!).
I had two directories that I needed to check the spelling of, and every single code snippet I found online had one or more corresponding RewriteConds preceding the RewriteRule (RewriteCond is obviously NOT always required to be there, which several tutorial sites alluded to). The problem with having a preceding RewriteCond was the first RewriteRule would always execute, regardless of the spelling of the directory. If I only had one folder on my site, this would have been sufficient. But I needed to check both. Independently of each other. And have it be NOT CASE-SENSITIVE. Which RedirectMatch is NOT capable of. So when I eventually found a site that said I could remove the preceding ^ from the regex on the RewriteRule to have it match ALL requests that ended in that (the regex pattern), I gave that a shot. This, of course, after I had tried every single possible iteration/permutation/variation of every single code snippet on the web that I thought maybe might actually finally do what I wanted it to do! And that's how it works. So here's why it works:
mad.* --This is the regex that matches any part of the URL, and catches any misspelled variation of the directory name. This MUST be unique, otherwise it will start redirecting when you don't want it to. You'll notice that in both directory names, mad and ano are both unique letter combinations. Other than that, it can be whatever you want it to be. The $ indicates the end of the regex. Not sure why removing the beginning signifier (^) of the regex was so crucial to the whole thing working...
http://www.askapache.com/madeupeasytomisspellbadlynameddirectory --This of course is the correct path you want them to go. It can be any valid URL string on your server, even a different domain, if applicable...
[L,QSA,R=301,NC]
--L: means STOP, of course (for this request).
--QSA: Not sure what that does. If it ain't broke... :-)
--R=301: This is what makes it act like a RedirectMatch, and 301 of course is the permanent type. I'm sure you know you can change this to 302 if you want.
--NC: This is the key that I was looking for for four straight hours. I just tried it out, left it in, and voila! Seemed to work perfectly for me. This is the Case Insensitive regex match that allows for all incorrectly capitalized spellings of the directory name to be caught and redirected, exactly what I originally wanted. :-D
So there you go! If this helps one person in the history of this website, I will have done my small part in making this world a better, more programmer-friendly place. :-)
-
Great tutorial. I have an issue though which I am still struggling to address. A client I manage a site for has put out a poster with a link similar to ~/VISIT US. I have added the following code to link from ~/visitus to ~/visitus.php. However any code I have used for the uppercase redirect seems to conflict with the existing code (below). Thanks for your help.
RewriteEngine on
RewriteCond %{REQUEST_FILENAME} !-d
RewriteCond %{REQUEST_FILENAME}\.php -f
RewriteRule ^(.*)$ $1.php [NC]
-
Hello, I have a small issue that I am trying to figure out, and for the life of me I just cannot seem to find the answer. I am working on a project that ties a few different technologies together for one social networking platform that I am building. What I am trying to accomplish is to hide the directory structure from site visitors and give the members of the site a clean URL to go to. For example, the home page for the site members is currently set up like this: http://www.example.com/ss/user/username. I am trying to figure out a way to give users a clean URL that eliminates the /ss/user/ and looks like this: http://www.example.com/username. I'm not even sure if this is possible with .htaccess. Has anyone ever accomplished this, and if so, how? Any help/insight would be greatly appreciated.
-
Yeh... this is crazy advanced! Pretty awesome though.
-
In case this throws a 500 for anyone, make sure that mod_headers is loaded:
LoadModule headers_module modules/mod_headers.so
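If you cannot edit the server config to add that line, a defensive variant may help. This is only a sketch, assuming the debug rules pass variables to PHP through RequestHeader lines as in the article; the INFO-REQUEST-URI header name and the matching environment variable are illustrative, not taken from the original code. Wrapping the RequestHeader lines in an IfModule block lets a server without mod_headers skip them instead of failing on an unknown directive.

```apache
RewriteEngine On

# Copy a rewrite variable into an environment variable.
# INFO_REQUEST_URI is a hypothetical name chosen for this sketch.
RewriteRule .* - [E=INFO_REQUEST_URI:%{REQUEST_URI},NE]

<IfModule mod_headers.c>
# Only evaluated when mod_headers is loaded. The header name uses hyphens;
# PHP should then expose it as $_SERVER['HTTP_INFO_REQUEST_URI'].
RequestHeader set INFO-REQUEST-URI "%{INFO_REQUEST_URI}e"
</IfModule>
```

With the block skipped, the test page would simply report the variable as missing rather than the whole request failing.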
-
Okay, this looks very good. I got it to work and display all the variables. But what I *really* wanted was a way to print out the variables while accessing my problematic reference. That is, this prints out the variables using a specific URL request. Namely, test/index.php ... or whatever the name given to it was. But I want to see what's going on for the specific URL I am trying to debug. Is there a way from within the .htaccess file ALONE that I can print out the various variables?
-
Kudos. I love your style of writing-- very cheesy infomercial style. Entertaining, informative, and it's not even 3am!
-
I am new to Apache mod_rewrite. I have a question. I would like to put a user-defined variable in the RewriteRule, with its value retrieved from a database. Is this possible?
-
Thanks for the great post, I didn't know it was possible to "decode mod rewrite variables" in this way! I also found that %{HTTP_ACCEPT_LANGUAGE} in .htaccess files for making multilingual websites doesn't work. However, there is another way to read the browser's preferred-language setting in order to localize the site using Apache mod_rewrite. Fortunately we can access every header sent by the browser using %{HTTP:header}, so the solution is not %{HTTP_ACCEPT_LANGUAGE} but %{HTTP:Accept-Language}. I write more about this in my blog post: mod rewrite and HTTP_ACCEPT_LANGUAGE
-
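As a rough illustration of the %{HTTP:Accept-Language} approach described above, here is a minimal .htaccess sketch. It assumes a hypothetical /de/ directory that holds a German copy of the site, and it only looks at the first language listed in the header, so treat it as a starting point rather than a complete localization setup.

```apache
RewriteEngine On

# Assumption: the browser lists German first in its Accept-Language header
RewriteCond %{HTTP:Accept-Language} ^de [NC]
# Skip requests that already point into the (hypothetical) /de/ directory
RewriteCond %{REQUEST_URI} !^/de/
# Temporary redirect, so the choice is not cached permanently by browsers
RewriteRule ^(.*)$ /de/$1 [R=302,L]
```

The second condition is there to keep the rule from redirecting in a loop once the visitor is already inside /de/.
-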
For some reason, I'm getting a 500 error. When I take out the "RequestHeader" portions of the .htaccess file, it actually brings me to the test/index.php file.
Is there some configuration in my config file I need to do to allow RequestHeader code in .htaccess?
-
Your php code isn't showing completely. For some reason, WordPress (or whatever) is interpreting your piece of quoted php code (index.php), and cuts off part of it. I can see in the source that you have:
$r) {
  if(substr($v,0,9)=='HTTP_INFO') {
    if(!empty($r))$INFO[substr($v,10)]=$r;
    else $MISS[substr($v,10)]=$r;
  }
}
/* thanks Mike! */
ksort($INFO);
ksort($MISS);
ksort($_SERVER);
echo "Received These Variables:\n";
print_r($INFO);
echo "Missed These Variables:\n";
print_r($MISS);
echo "ALL Variables:\n";
print_r($_SERVER);
However, the <?php is still being interpreted. The web page only lists from $r) onward. Maybe it's my Firefox browser?
-
Nice tutorial. Love this blog and your plugins are actual real plugins that are useful for everything. Cheers, Dan
-
An excellent article, just what I was looking for. However, I found one minor problem! Time to Get Crazy: I copied this into an .htaccess file. PHP Code to access Variables: Ran this code from a test page. The key values displayed were numeric and not the expected parameters. It looks as if the following lines do not preserve keys:
sort($INFO); sort($MISS); sort($_SERVER);
After changing the above to:
ksort($INFO); ksort($MISS); ksort($_SERVER);
it worked like a charm. For the first time I can explore the variables I am working with.
-
Wow! This is a fantastic resource! The lack of knowledge about exactly what the format of the RewriteCond variables was has been holding me back with mod_rewrite since I started trying to use it. Thank you so much for publishing this the day before I (once again) went on a search for this information!
-
At first I thought this was just another mod_rewrite tutorial but it's way better. Thanks for the great tip!
-
Nice. Will check that out. Look forward to your posts in a month or two then :)
-
How timely, especially the bit on "Emulating ErrorDocuments with Mod_Rewrite". I was trying to do that last night with little luck. On a friend's site, they are getting a ton of SQL injection attempts (though their site doesn't use a database!), so I was hoping for an .htaccess-only solution to force the 403 error document when we detect these. I came across the [F] flag, which I will try this evening, but your section looks like it will work, too.
