Warning
.htaccess files can do more harm than good if used incorrectly and should only be used by experienced users. There are many other ways in which to control access to your site with Perl (CGI) or PHP scripting (users of Windows hosting accounts can also use ASP).
Uploading, Modifying and Deleting .htaccess Files
.htaccess and other files beginning with a '.' will not normally be visible in your FTP client. Create and save your files on your computer as plain text files such as htaccess.txt. You can then transfer these in ASCII mode to the correct directory and use your FTP client to rename them. Renaming htaccess.txt to .htaccess will replace your existing .htaccess file.
You can modify your .htaccess files by uploading the modified htaccess.txt file with FTP and then renaming it to .htaccess. This should replace the existing file. Because you cannot see .htaccess files with your FTP client, you will be unable to delete them via FTP. Rather than removing a file, you could upload an empty text file and rename it to .htaccess. You could also use the Perl unlink() function within a Perl / CGI script to delete the file completely.
Password Protected Directories
You can use .htaccess to password protect directories on your account. Your .htaccess file should look something like this:
AuthType basic
AuthName "Secure Directory"
AuthUserFile /home/sites/SITENAME/.htpasswd
AuthPAM_Enabled off
<limit GET PUT POST>
require valid-user
</limit>
AuthName can be any name you like for the secure area. This will be shown on the dialogue box that requests the username and password. AuthUserFile is the full path to and filename for your .htpasswd file containing your list of usernames and passwords.
You also need to create a .htpasswd file. This contains the list of usernames and encrypted passwords, one user per line, which looks like this:
john:joVrhn8vGVM1k
fred:fr51bDvIVirJM
You can create encrypted passwords using the Encrypt UNIX Password tool available from http://www.magsys.co.uk/. You can copy and paste the usernames and passwords into htpasswd.txt, FTP the file to your account and rename it to .htpasswd.
If you have shell access, you can also create the .htpasswd file with the htpasswd command. The general form is:
htpasswd [-c] passwordfile username
Add -c only when creating a new password file.
Example:
[Swamp@216 Swamp]$ htpasswd -c /var/www/.htpasswd Swamp
To add another user to the same file, simply drop the -c. Using -c again would overwrite the existing file.
NOTE: After my Apache 2 install on a Gentoo system (my new favorite distro), htpasswd seems to be called htpasswd2. The syntax appears to stay exactly the same, though.
If you need password protection on more than one directory, you can create separate .htpasswd files with different names. Ensure you reference these correctly from your .htaccess files. Likewise you can create numerous htaccess files pointing to the same htpasswd file.
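As a rough sketch of the point above, a second protected directory could carry its own .htaccess file pointing at a differently named password file. The "Members Area" name and the .htpasswd-members filename are made up for illustration. This version leaves out the <limit> block, so the password requirement applies to every request method rather than only GET, PUT and POST. It also leaves out AuthPAM_Enabled, which is host-specific; keep that line if your host needs it.

```apache
# .htaccess in a hypothetical second directory (names are placeholders)
AuthType Basic
AuthName "Members Area"
# A separate password file for this directory only
AuthUserFile /home/sites/SITENAME/.htpasswd-members
# No <Limit> block: all request methods require a valid login
Require valid-user
```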
As an alternative to using .htaccess for password protection, you can use PHP or Perl / CGI scripting. You will need a database or table with usernames and passwords and a login page with username and password boxes. You can use sessions or cookies to control how long the user stays logged in, adding a small piece of code to the top of each page to test whether the user is logged in (session is active or cookie is set).
Custom Error Pages
.htaccess allows you to use your own custom error pages very quickly and easily. Any time someone tries to access a page that does not exist, or a page they are not permitted to view, you can present them with a custom error page that matches the style of your site. Custom error pages can be used for many errors, but we'll demonstrate the use of a custom 404 error page.
Create a custom 404 error page in HTML and upload it to the root directory of your site as error404.html. Create a text file called error.txt with the following content:
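ErrorDocument 404 /error404.html
The leading slash matters: it marks the value as a path relative to the root of your site. Without it, Apache 2 treats the value as a text message and displays it as the error. Upload error.txt to your root directory in ASCII mode and rename it to .htaccess. Now whenever someone attempts to access a page on your site that does not exist, they will be presented with your own custom 404 error page.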
You can use similar syntax for other errors. You can use any name that suits you for the error page. You don't even need to use an HTML page: you can supply a short piece of HTML or text in quotes instead. You can also redirect users to other pages on your site, or even to other domains. The following are just some examples of how to use custom error pages:
Error  Meaning              Example Code To Use
400    Bad Request          ErrorDocument 400 /error400.html
401    Unauthorised         ErrorDocument 401 /unauthorised.html
402    Payment Required     ErrorDocument 402 /error402.html
403    Forbidden            ErrorDocument 403 /forbidden.html
404    Not Found            ErrorDocument 404 "<h1>Sorry, that page does not exist</h1>"
500    Internal Error       ErrorDocument 500 http://www.anothersite.com/anotherpage.html
501    Not Implemented      ErrorDocument 501 /error501.html
502    Bad Gateway          ErrorDocument 502 /error502.html
503    Service Unavailable  ErrorDocument 503 /error503.html
One final point is that your .htaccess file will affect all directories below it. If you want to use custom error pages in only one subdirectory, place the .htaccess file and the error page in that subdirectory and ensure the path in your .htaccess file points to the correct error page.
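For instance, a minimal sketch for a subdirectory, assuming a made-up directory called /members/ that holds both the .htaccess file and the error page. Note that the path is still written from the root of the site, not relative to the subdirectory.

```apache
# .htaccess inside the hypothetical /members/ subdirectory
# The path starts at the site root, so the subdirectory name is included
ErrorDocument 404 /members/error404.html
```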
Redirecting
Redirecting users to other pages or directories is very easy and useful if you remove pages or reorganise your site. To redirect a user from pagename.html to newfile.html in another directory, simply use code similar to this in your .htaccess file:
Redirect /olddirectory/pagename.html /newdirectory/newfile.html
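Two hedged variations on the line above are sketched below. The first adds the optional "permanent" status and writes the target as a full URL, which some older Apache versions expect; www.yoursite.com is a placeholder for your own domain. The second redirects a whole directory, so requests for anything under /olddirectory are sent to the same path under /newdirectory. They are alternatives and are not meant to be used together.

```apache
# Variation 1: single page, permanent (301) redirect, full URL as the target
Redirect permanent /olddirectory/pagename.html http://www.yoursite.com/newdirectory/newfile.html

# Variation 2: whole directory; /olddirectory/foo.html goes to /newdirectory/foo.html
Redirect permanent /olddirectory http://www.yoursite.com/newdirectory
```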
Rewriting URLs
Rewriting URLs is a technique used by webmasters to create search engine friendly URLs for forums, shopping carts and other dynamic sites. Search engines are often unable to spider dynamic URLs containing ?, &, = and other symbols necessary for your dynamic site to function correctly. URLs are rewritten using the Apache mod_rewrite module.
Rewriting URLs is a very advanced technique that should be extensively tested on a private (test) site prior to use with a live site.
A shopping cart or forum page might normally have a URL of say
www.yoursite.com/cgi-bin/search.cgi?ID=7&item=123
which is not very search engine friendly. mod_rewrite allows you to use a displayed URL of say
www.yoursite.com/search/7/123.html
which is search engine friendly. When the second (friendly) URL is entered, mod_rewrite converts it invisibly to the first so that the script (in this case search.cgi) can process it.
The scripts will also need to be modified to output search engine safe URLs that can be processed by mod_rewrite. Sometimes you will need to use multiple lines in .htaccess to force rewriting of URLs with different numbers of parameters.
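As a rough sketch of the search.cgi example above, rules along these lines could map the friendly URL onto the script. This assumes mod_rewrite is enabled, that your host allows rewrite directives in .htaccess, and that the file sits in the root directory of your site. The second rule, for a URL carrying only an ID, is a hypothetical illustration of handling a different number of parameters.

```apache
# Sketch only: assumes mod_rewrite is enabled and this file is in the site root
RewriteEngine On

# /search/7/123.html  ->  /cgi-bin/search.cgi?ID=7&item=123
RewriteRule ^search/([0-9]+)/([0-9]+)\.html$ /cgi-bin/search.cgi?ID=$1&item=$2 [L]

# Hypothetical one-parameter form: /search/7.html  ->  /cgi-bin/search.cgi?ID=7
RewriteRule ^search/([0-9]+)\.html$ /cgi-bin/search.cgi?ID=$1 [L]
```

The rewrite is internal, so the visitor's browser keeps showing the friendly URL. Each bracketed group in the pattern captures one number, which is then inserted into the script URL as $1 or $2.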
For more information on rewriting URLs and sample code, see the following sites:
Apache mod_rewrite guide
Ralf Engelschall's URL rewriting guide
Access Control and Banning Spiders
You may wish to exercise control over which browsers or spiders can access your site. For example, you may wish to ban "undesirable" spiders such as those that ignore your robots.txt file or only exist to gather email addresses in order to send spam.
You can do this using mod_rewrite. Ralf Engelschall's page as listed above has a section about banning robots.
If you are banning robots and spiders, you also need to know which user agents to block. Check your Webalizer statistics for user agent strings that you don't recognise and search for that string on a search engine. This should return some information about that user agent. If it appears to be undesirable, add it to your .htaccess file.
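A minimal sketch of such a block is shown here, assuming mod_rewrite is available. "BadBot" and "EmailHarvester" are placeholder names, not real user agents; replace them with the strings you find in your own statistics.

```apache
# Sketch only: the user agent names below are placeholders
RewriteEngine On

# Match a user agent that starts with "BadBot" (NC = ignore case, OR = or the next condition)
RewriteCond %{HTTP_USER_AGENT} ^BadBot [NC,OR]
# Match a user agent that contains "EmailHarvester" anywhere
RewriteCond %{HTTP_USER_AGENT} EmailHarvester [NC]

# Refuse any request from a matching user agent with 403 Forbidden
RewriteRule .* - [F]
```

To add another user agent, insert a further RewriteCond line with the OR flag above the final condition.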
Remember to test your .htaccess script on a private (hidden) area of your site to ensure everything works as it should prior to using your script on your main site.