Blocking Bad Bots and Scrapers with .htaccess
Want to block a bad robot or web scraper using .htaccess files? Here are 2 methods that illustrate blocking 436 various user-agents.
Want to block a bad robot or web scraper using .htaccess files? Here are 2 methods that illustrate blocking 436 various user-agents.
FeedBurner is so RAD! I love it. Here's an alternative method to redirect scrapers and feed requests to your feedburner url, in my case, I use Branding by feedburner, which is so hot, taking advantage of CNAMEs in your DNS record.

3-Part article covering practical implementation of 3 advanced .htaccess features. Discover an easy way to boost your SEO the AskApache way (focus on visitors), a tip you might keep and use for life. Get some cool security tricks to use against spammers, crackers, and other nefarious sorts. Take your site's error handling to the next level, enhanced ErrorDocuments that go beyond 404's.
#### No https except to wp-admin -
# If the request is empty ( implies fopen or normal file access by a php script )
RewriteCond %{THE_REQUEST} ^$ [OR]
# OR if the request if for wp-admin or wp-login.php
RewriteCond %{REQUEST_URI} ^/(wp-admin|wp-login.php).*$ [NC,OR]
# OR if the Referer is https
RewriteCond %{HTTP_REFERER} ^https://www.askapache.com/.*$ [NC]
# THEN skip the following rule, basically all this does is force https or badhost to be redirected
# BUT because of the above 3 rewritecond's, this won't break poorly written admin scripts
RewriteRule .* - [S=1]
RewriteCond %{HTTPS} =on [OR]
RewriteCond %{HTTP_HOST} !^www.askapache.com$ [NC]
RewriteRule .* https://www.askapache.com%{REQUEST_URI} [R=301,L]
RewriteCond %{THE_REQUEST} ^[A-Z]{3,9} /(wp-admin/.*|wp-login.php.*) HTTP/ [NC]
RewriteCond %{HTTPS} !=on
RewriteRule .* https://%{SERVER_NAME}%{REQUEST_URI} [R=301,L]