Part 1 - Introduction
In this guide you will find out about the .htaccess file and the power it has to improve your website.
Although .htaccess is only a file, it can change settings on the server and lets you do many different things, the most popular being custom 404 error pages.
.htaccess isn't difficult to use and is really just made up of a few simple instructions in a text file.
Does Woktron Support .htaccess Files?
Yes, we fully support the creation of .htaccess files.
What Can I Do with .htaccess?
You may be wondering what .htaccess can do, or you may have read about some of its uses but don't realise how many things you can actually do with it.
There is a huge range of things .htaccess can do including:
- password protecting folders
- redirecting users automatically
- custom error pages
- changing your file extensions
- banning users with certain IP addresses
- only allowing users with certain IP addresses
- stopping directory listings
- use of a different file as the index file
Creating a .htaccess File
Creating a .htaccess file may cause you a few problems. Writing the file is easy: you just need to enter the appropriate code into a text editor (like Notepad). The problems, if any, come when saving the file.
Because .htaccess is a strange file name (the file actually has no name, only an 8-letter file extension), it may not be accepted by certain legacy systems. With most operating systems, though, all you need to do is save the file by entering the name as:
".htaccess" (excluding the quotes)
If this doesn't work, you will need to name it something else (e.g. htaccess.txt) and then upload it to the server. Once you have uploaded the file you can rename it using an FTP program.
Custom Error Pages
The first use of the .htaccess file which I will cover is custom error pages. These will allow you to have your own, personal error pages (for example when a file is not found) instead of using your host's error pages or having no page. This will make your site seem much more professional in the unlikely event of an error. It will also allow you to create scripts to notify you if there is an error.
You can use custom error pages for any error as long as you know its number (like 404 for page not found) by adding the following to your .htaccess file:
ErrorDocument errornumber /file.html
For example, if I had the file notfound.html in the root directory of my site and I wanted to use it for a 404 error page, I would use:
ErrorDocument 404 /notfound.html
If the file is not in the root directory of your site, you just need to put the path to it:
ErrorDocument 500 /errorpages/500.html
These are some of the most common errors:
- 400 - Bad Request
- 401 - Authorization Required
- 403 - Forbidden
- 404 - Not Found
- 500 - Internal Server Error
All you need to do is to create a file to display when the error happens and upload it, together with the .htaccess file.
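As a rough sketch, a .htaccess file covering the errors listed above might look like the following. The /errorpages/ directory and the file names are assumptions made for illustration; substitute whatever pages you have actually uploaded.

```apache
# Hypothetical layout: one page per error code, stored in /errorpages/
ErrorDocument 400 /errorpages/400.html
ErrorDocument 401 /errorpages/401.html
ErrorDocument 403 /errorpages/403.html
ErrorDocument 404 /errorpages/404.html
ErrorDocument 500 /errorpages/500.html
```

Each path starts with a slash, so it is resolved relative to the site's document root rather than to the directory that holds the .htaccess file.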
More information can be found in this tutorial.
Part 2 - .htaccess Commands
In the last part I introduced you to .htaccess and some of its useful features. In this part I will show you how to use the .htaccess file to implement some of these.
Stop a Directory Index From Being Shown
Sometimes, for one reason or another, you will have no index file in your directory. This will, of course, mean that if someone types the directory name into their browser, a full listing of all the files in that directory will be shown. This could be a security risk for your site.
To prevent this (without creating lots of new 'index' files), you can enter a command into your .htaccess file to stop the directory list from being shown:
Options -Indexes
Deny/Allow Certain IP Addresses
Blocking or allowing an IP address using .htaccess
In some situations, you may want to only allow people with specific IP addresses to access your site or you may want to ban certain IP addresses.
You can block an IP address by using:
deny from 000.000.000.000
where 000.000.000.000 is the IP address.
If you only specify the first 1 or 2 groups of numbers (octets), you will block a whole range.
You can allow an IP address by using:
allow from 000.000.000.000
where 000.000.000.000 is the IP address.
If you only specify the first 1 or 2 groups of numbers (octets), you will allow a whole range.
If you want to deny everyone from accessing a directory, you can use:
deny from all
To allow access to one IP address but deny access to all other IP addresses, you can use:
allow from 000.000.000.000
deny from all
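The deny and allow directives above use the older Apache 2.2 access-control syntax; on Apache 2.4 they only work if the mod_access_compat module is loaded. Assuming your server runs Apache 2.4 (check with your host), the same rules could be sketched with the Require directive instead. The address below comes from the 192.0.2.0/24 documentation range and is a placeholder only.

```apache
# Apache 2.4 syntax (mod_authz_core / mod_authz_host).
# Use only ONE of the alternatives below in a given .htaccess file.

# Alternative A: allow a single address, refuse everyone else
Require ip 192.0.2.10

# Alternative B: block a single address, allow everyone else
# <RequireAll>
#     Require all granted
#     Require not ip 192.0.2.10
# </RequireAll>

# Alternative C: refuse everyone
# Require all denied
```

As with the older directives, a partial address such as Require ip 192.0.2 covers the whole range that starts with those octets.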
Alternative Index Files
You may not always want to use index.htm or index.html as your index file for a directory. For example, if you are using PHP files in your site, you may want index.php to be the index file for a directory. You are not limited to 'index' files though. Using .htaccess you can set foofoo.blah to be your index file if you want to!
Alternate index files are entered in a list. The server will work from left to right, checking to see if each file exists. If none of them exist, it will display a directory listing (unless, of course, you have turned this off).
DirectoryIndex index.php index.php3 messagebrd.pl index.html index.htm
Part 3 - Redirection using htaccess
One of the most useful functions of the .htaccess file is to redirect requests to different files, either on the same server, or on a completely different web site.
Redirection
Redirect can be extremely useful if you change the name of one of your files but still want users to be able to find it. Another use is to redirect a short URL to a longer one.
The following can be done to redirect a specific file:
Redirect /location/from/root/file.ext http://www.othersite.com/new/file/location.xyz
In the above example, a file in the old root directory called oldfile.html would be entered as:
/oldfile.html
and a file in the old subdirectory would be entered as:
/old/oldfile.html
You can also redirect whole directories of your site using the .htaccess file, for example if you had a directory called olddirectory on your site and you had set up the same files on a new site at:
http://www.newsite.com/newdirectory/
You could redirect all the files in that directory without having to specify each one:
Redirect /olddirectory http://www.newsite.com/newdirectory
Then, any request to your site below /olddirectory will be redirected to the new site, with the extra information in the URL added on. For example, if someone typed in:
http://www.youroldsite.com/olddirectory/oldfiles/images/image.gif
They would be redirected to:
http://www.newsite.com/newdirectory/oldfiles/images/image.gif
This can prove to be extremely powerful if used correctly.
301 and 302 Redirects
A redirect is a way to send both users and search engines to a different URL from the one they originally requested. The three most commonly used redirects are 301, 302, and Meta Refresh.
Some of Google's employees have indicated that there are cases where 301s and 302s may be treated similarly, but our evidence suggests that the safest way to ensure search engines and browsers of all kinds give full credit is to use a 301 when permanently redirecting URLs.
The Internet runs on a protocol called HyperText Transfer Protocol (HTTP), which dictates how URLs work. The meaning of 302 changed between two of its versions, 1.0 and 1.1. In HTTP 1.0, 302 referred to the status code "Moved Temporarily." This was changed in version 1.1 to mean "Found."
301 (Permanent) Redirect: Point an entire site to a different URL on a permanent basis. This is the most common type of redirect and is useful in most situations. In this example, we are redirecting to the "example.com" domain:
# This allows you to redirect your entire website to any other domain
Redirect 301 / http://example.com/
301 (Permanent) Redirect: Point a page to a different URL on a permanent basis:
# This allows you to permanently redirect a page to any other url
Redirect 301 /someoldpage.php https://www.woktron.com/newpage.php
302 (Temporary) Redirect: Point an entire site to a different temporary URL. This is useful for SEO purposes when you have a temporary landing page and plan to switch back to your main landing page at a later date:
# This allows you to redirect your entire website to any other domain
Redirect 302 / http://example.com/
307 Temporary Redirect (HTTP 1.1 Only)
A 307 redirect is the HTTP 1.1 successor of the 302 redirect. While the major crawlers will treat it like a 302 in some cases, it is best to use a 301 for almost all cases. The exception to this is when content is really moved only temporarily (such as during maintenance) AND the server has already been identified by the search engines as 1.1 compatible.
Since it's essentially impossible to determine whether or not the search engines have identified a page as compatible, it is generally best to use a 302 redirect for content that has been temporarily moved.
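For reference, the Redirect directive also accepts keywords in place of the numeric codes, and it will take other 3xx codes such as 307 directly. The page names and the example.com targets below are placeholders, not real pages.

```apache
# "permanent" is equivalent to 301
Redirect permanent /old-page.html http://example.com/new-page.html

# "temp" is equivalent to 302 (also the default when no status is given)
Redirect temp /sale.html http://example.com/summer-sale.html

# A numeric status can be given for other 3xx codes, such as 307
Redirect 307 /checkout.php http://example.com/maintenance.html
```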
RedirectMatch
RedirectMatch redirects URLs that match a regular expression.
RedirectMatch 301 /Dir/Perl/.* https://www.woktron.com/perl
With the above command, any visitor hitting any URL under the /Dir/Perl root will be redirected to the new Perl page. If you're used to wildcard characters in DOS or Unix, the .* pattern is very similar to the usual * wildcard you can use at the DOS or Unix command line.
Here's a quick breakdown:
- The . means "any character"
- The * means "zero or more of the preceding character"
In a more complicated example, you can also use the Apache RedirectMatch syntax to find patterns in the URL pattern you're trying to match, and then use what you found in the URL you're redirecting users to. You do this with a similar pattern:
(.*)
With this syntax, the .* part of the pattern again means look for any number of any character, and on top of that, the parentheses mean "remember whatever you found here so I can use it in my redirect URL".
For instance, in the following RedirectMatch example, I'm telling Apache to remember whatever it finds in the two parenthesized groups and to insert those values into the target URL as $1 and $2:
RedirectMatch 301 /java/jwarehouse/org.eclipse.(.*)/(.*) https://www.woktron.com/java/jwarehouse/eclipse/org.eclipse.$1/$2
So, if someone tries to go to a URL like this:
https://www.woktron.com/java/jwarehouse/org.eclipse.FOO/BAR
The above RedirectMatch example will redirect them to this URL:
https://www.woktron.com/java/jwarehouse/eclipse/org.eclipse.FOO/BAR
Those URLs are fictitious, so you'll get error messages if you try to reach them, but I hope you get the idea of how this works.
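Because an unescaped dot matches any character and the pattern is not anchored, the rule above can match more than intended. A slightly stricter variant, offered here only as a sketch using the same fictitious URLs, escapes the literal dots, anchors the pattern to the start and end of the path, and keeps the first capture within a single path segment:

```apache
# ^ and $ anchor the match to the whole URL path
# \. matches a literal dot; [^/]+ matches exactly one path segment
RedirectMatch 301 ^/java/jwarehouse/org\.eclipse\.([^/]+)/(.*)$ https://www.woktron.com/java/jwarehouse/eclipse/org.eclipse.$1/$2
```

A request for /java/jwarehouse/org.eclipse.FOO/BAR still sets $1 to FOO and $2 to BAR, so the redirect target is the same as before.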
SEO Best Practice
It is common practice to redirect one URL to another. When doing this, it is critical to observe best practices in order to maintain SEO value.
The first common example of this takes place with a simple scenario: a URL that needs to redirect to another address permanently.
There are multiple options for doing this, but in general, the 301 redirect is preferable for both users and search engines. Serving a 301 indicates to both browsers and search engine bots that the page has moved permanently.
Search engines interpret this to mean that not only has the page changed location, but that the content—or an updated version of it—can be found at the new URL. The engines will carry any link weighting from the original page to the new URL.
Be aware that when moving a page from one URL to another, the search engines will take some time to discover the 301, recognize it, and credit the new page with the rankings and trust of its predecessor. This process can be lengthier if search engine spiders rarely visit the given web page, or if the new URL doesn't properly resolve.
Other options for redirection, like 302s and meta refreshes, are poor substitutes, as they generally will not pass search engine value the way a 301 redirect will. The only time these redirects are good alternatives is if a webmaster purposefully does not want to pass link equity from the old page to the new.
Transferring content becomes more complex when an entire site changes its domain or when content moves from one domain to another. Due to abuse by spammers and suspicion by the search engines, 301s between domains sometimes require more time to be properly spidered and counted.
Part 4 - Password Protection
Introduction
Although there are many uses of the .htaccess file, by far the most popular, and probably most useful, is being able to reliably password protect directories on websites.
The .htaccess File
Adding password protection to a directory using .htaccess takes two stages. The first part is to add the appropriate lines to your .htaccess file in the directory you would like to protect. Everything below this directory will be password protected:
AuthName "Section Name"
AuthType Basic
AuthUserFile /full/path/to/.htpasswd
Require valid-user
There are a few parts of this which you will need to change for your site. You should replace Section Name with the name of the part of the site you are protecting, e.g. Members Area.
The /full/path/to/.htpasswd should be changed to reflect the full server path to the .htpasswd file (more on this later). If you do not know what the full path to your webspace is, contact your system administrator for details.
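Filled in, the block might look like this. The realm name is the one suggested above; the path /home/exampleuser/.htpasswd is a made-up placeholder and must be replaced with the real location on your server.

```apache
# Label shown in the browser's login prompt
AuthName "Members Area"
# Basic authentication: credentials are checked against AuthUserFile
AuthType Basic
# Hypothetical absolute path, ideally outside the web root
AuthUserFile /home/exampleuser/.htpasswd
# Any user listed in the password file may log in
Require valid-user
```

To let in only specific accounts, Require valid-user can be swapped for a line such as Require user sammy, where sammy is the example username used later in this guide.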
The .htpasswd File
Password protecting a directory takes a little more work than any of the other .htaccess functions because you must also create a file to contain the usernames and passwords which are allowed to access the site.
These should be placed in a file which (by default) should be called .htpasswd. Like the .htaccess file, this is a file with no name and an 8-letter extension. The file can be created manually or can be created using the htpasswd command as root.
More information can be found in this tutorial.
Option 1: Manually creating the .htpasswd file
Create a .htpasswd file. This file can be placed anywhere within your website (as the passwords are encrypted), but it is advisable to store it outside the web root so that it is impossible to access from the web.
Once you have created your .htpasswd file (you can do this in a standard text editor) you must enter the usernames and passwords to access the site. They should be entered as follows:
username:password
where the password is the encrypted format of the password. To encrypt the password you will either need to use one of the premade scripts available on the web or write your own. There is a good username/password service at the htaccesstools website, which will allow you to enter the user name and password and will output it in the correct format.
For multiple users, just add extra lines to your .htpasswd file in the same format as the first. There are even scripts available for free which will manage the .htpasswd file and will allow automatic adding/removing of users etc.
Option 2: Create .htpasswd file as root
Install the Apache Utilities Package
In order to create the file that will store the passwords needed to access our restricted content, we will use a utility called htpasswd. This is found in the apache2-utils package.
In Ubuntu or Debian based systems, update the local package cache and install the package by typing this command:
sudo apt-get update
sudo apt-get install apache2-utils
In CentOS or RHEL based systems the package can be installed using yum:
yum install httpd-tools
Create the Password File
We now have access to the htpasswd command. We can use this to create a password file that Apache can use to authenticate users. We will create a hidden file for this purpose called .htpasswd within our /etc/apache2 configuration directory.
The first time we use this utility, we need to add the -c option to create the specified file. We specify a username (sammy in this example) at the end of the command to create a new entry within the file:
sudo htpasswd -c /etc/apache2/.htpasswd sammy
You will be asked to supply and confirm a password for the user.
Leave out the -c argument for any additional users you wish to add:
sudo htpasswd /etc/apache2/.htpasswd another_user
Accessing The Site
When you try to access a site which has been protected by .htaccess your browser will pop up a standard username/password dialog box. If you don't like this, there are certain scripts available which allow you to embed a username/password box in a website to do the authentication.
You can also send the username and password (unencrypted) in the URL as follows:
http://username:password@www.website.com/directory/
Summary
.htaccess is one of the most useful files a webmaster can use. It has a wide variety of uses which can save time and increase security on your website.