Create Search Engine Friendly URLs With PHP and mod_rewrite
In today’s modern, Web 2.0 world, friendly URLs can be seen on every major site, and not without good reason, but what do we mean by friendly URLs? Firstly, let’s address the point of a URL being search engine friendly. This means that keywords relevant to the page are included within its URL, and parameters are taken out. For example, instead of http://www.example.com/article.php?articleid=10294, we might decide to use the title of the article to create a URL such as http://www.example.com/articles/search-engine-friendly-urls. The benefits of this for the search engines are twofold. The search engines give some weight in ranking terms to keywords in URLs, and incoming links which are simply the URL will automatically contain anchor text relevant to the page.
Secondly, and perhaps more importantly, there is a benefit to your users. Looking again at the two examples above, which one would you be more likely to remember in a week? A month? A year? I hope you’d agree the second one. Another benefit is in creating a structure that can be understood and used by your users. In this example let’s take a social media site, where registered users have profiles. I decide to look at the profile of Tom Jones, and find it at example.com/user/tom-jones. If I now decide to look at the profile of Steve Smith, then I can logically deduce I can find it at example.com/user/steve-smith, rather than clicking through pages until I find it. Not only good for your users, but a bandwidth reduction for you as well!
Simple Rewriting
So now we’ve decided friendly URLs are a good idea all round, let’s get into the meat and bones of how to create them. Our first example will be a simple one, where we have a dynamically generated page with PHP, with a limited range of values for URL parameters. So for example, a site that has 3 pages, a CV, a portfolio and a contact page. These pages are currently accessed from index.php?page=cv, index.php?page=portfolio, and index.php?page=contact, respectively. We’d like our new schema to look like http://www.example.com/curriculum-vitae, http://www.example.com/portfolio etc. To do this we need to edit the “.htaccess” file in the root directory of our server, and the code we add would look like this:
Options +FollowSymLinks
RewriteEngine on
RewriteRule ^curriculum-vitae/?$ index.php?page=cv
RewriteRule ^(portfolio|contact)/?$ index.php?page=$1
The first two lines are directives to Apache: “Options +FollowSymLinks” enables an option that mod_rewrite requires when rules live in an .htaccess file, and “RewriteEngine on” switches the rewriting engine on. Other than including them above any rules, don’t worry about them. The work is done by the two lines commencing “RewriteRule”. Following the RewriteRule declaration, each line can be split into two parts. The left hand side is a regular expression describing the path that the address bar in the browser will show, which is also the address we will link to. Inside an .htaccess file this path is matched without its leading slash, which is why the patterns start straight after the “^”. This isn’t the place to teach regular expressions, but a quick search turns up numerous tutorials. The right hand side denotes where the page can actually be found on your server. The second rule needs a little more explaining, due to the presence of “$1” instead of the page variable. This means: take the value matched by the first set of brackets on the left hand side, and insert it here.
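To make the matching and the “$1” substitution more concrete, here is a minimal PHP sketch that imitates what the two rules do. It is only an illustration, not how Apache is implemented: the function name simulate_rewrite is hypothetical, and the sketch assumes the .htaccess behaviour of matching the path without its leading slash.

```php
<?php
// Hypothetical helper, not part of Apache: imitates how the two rules
// map a requested path onto the real script.
function simulate_rewrite(string $requestPath): ?string
{
    // Assumption: .htaccess rules see the path without its leading slash.
    $path = ltrim($requestPath, '/');

    // Pattern => substitution, checked in order like the RewriteRule lines.
    $rules = [
        '#^curriculum-vitae/?$#'    => 'index.php?page=cv',
        '#^(portfolio|contact)/?$#' => 'index.php?page=$1',
    ];

    foreach ($rules as $pattern => $target) {
        if (preg_match($pattern, $path)) {
            // preg_replace fills $1 with the text captured by the first brackets.
            return preg_replace($pattern, $target, $path);
        }
    }

    return null; // no rule matched, so the request would be left untouched
}

echo simulate_rewrite('/portfolio'), "\n";         // index.php?page=portfolio
echo simulate_rewrite('/curriculum-vitae/'), "\n"; // index.php?page=cv
var_dump(simulate_rewrite('/blog'));               // NULL
```

Because the pattern ends in “/?$”, both /portfolio and /portfolio/ resolve to the same script, while anything not listed in the brackets falls through unchanged.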
Dynamic Rewriting
The simple rewrites described above are all well and good for a small site, but there is a drawback – every time you add a new page you will need to add a new rule. Not only is this tedious and time consuming, but one mistake can prevent access to lots of your pages, so for frequently updated sites we need a better method than this. We will take a hypothetical article system, powered by PHP and MySQL to demonstrate this. Currently we’re storing the articles in a table, based on an id number, and referencing them by this id number in our URLs, for example, article.php?articleid=500. The PHP code to display these articles looks like this:
article.php
//connect to database
$articleid=intval($_GET['articleid']); // force the id to an integer so it is safe to use in the query
$query="SELECT articletitle,articletext,otherarticleinfo FROM articles WHERE articleid='$articleid' LIMIT 1";
$result=mysql_query($query);
$row=mysql_fetch_assoc($result);
echo $row['articletitle']; //and the body of the article etc etc.
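The mysql_* functions used in this listing were removed in PHP 7, so on a current installation the same lookup needs mysqli or PDO. The following is a rough sketch of the id lookup using a PDO prepared statement. The function name fetch_article is hypothetical, and an in-memory SQLite database (which assumes the pdo_sqlite extension is available) stands in for the MySQL table purely so that the example runs on its own.

```php
<?php
// Hypothetical helper: fetch one article by id using a prepared statement,
// so the value is never pasted directly into the SQL string.
function fetch_article(PDO $db, int $articleid): ?array
{
    $stmt = $db->prepare(
        'SELECT articletitle, articletext FROM articles WHERE articleid = :id LIMIT 1'
    );
    $stmt->execute([':id' => $articleid]);
    $row = $stmt->fetch(PDO::FETCH_ASSOC);

    return $row === false ? null : $row; // null when no such article exists
}

// Assumption: a throwaway in-memory SQLite database replaces the real
// MySQL connection so the sketch is self-contained.
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE articles (articleid INTEGER PRIMARY KEY, articletitle TEXT, articletext TEXT)');
$db->exec("INSERT INTO articles VALUES (500, 'Search Engine Friendly URLs', 'Body text')");

$row = fetch_article($db, 500);
echo $row['articletitle'], "\n";
```

With a real MySQL server only the PDO connection string would change; the prepared statement itself stays the same.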
Our first action is to add a new field to our articles table, called articleurlslug. This should also be made into a unique key. The URL slug is what will be used in the address, as for practical reasons, we don’t want to use the unmodified title. This is a simple enough change to make, and we can modify our article.php file to fetch articles by their slug, instead of their id, by changing our query:
$articleurlslug=strtolower(mysql_real_escape_string($_GET['articleurlslug']));
$query="SELECT articletitle,articletext,otherarticleinfo FROM articles WHERE articleurlslug='$articleurlslug' LIMIT 1";
If you want to test this out, insert some sample URL slugs into your table, and try using article.php?articleurlslug=yoururlslug. Now obviously we need to have some means of creating a URL slug each time an article is submitted, and inserting it into the table with the rest of the information. We must first look at how we want to modify our title to turn it into a slug, and there are several requirements we want of our slug:
· To contain keywords relevant to the page
· To be all lowercase
· To have spaces replaced with hyphens
· To have only letters and numbers, no punctuation
There is reasoning behind all four requirements. The inclusion of relevant keywords is a positive factor in the SEO of your site, and by basing our slug on the title of the page we are ensuring that we include keywords (if your title doesn’t have keywords in it then you need to write better titles!). Lowercase slugs are the most common form seen on the internet, and as such are what your users will expect. Removing variable cases simply means less to remember. Spaces in a URL will be converted to “%20”, which is confusing, and not user-friendly. The hyphen is the best replacement, as it is widely recognised by search engines as a word delimiter. Punctuation is removed for the same reason as spaces, i.e. it will be converted into URL safe forms, which is again neither user friendly nor search engine friendly.
The code shown below will take an article title, and transform it into a slug that fulfils these requirements:
$articleurlslug=strtolower($articletitle); // transform to lowercase
$punc[0]="/!/";
$punc[1]="/\"/";
$punc[2]="/£/";
// add any punctuation marks likely to be used in your title
$articleurlslug=preg_replace($punc,"",$articleurlslug);
// replace all instances of any punctuation found in the $punc array with ""
$articleurlslug=str_replace(" ","-",$articleurlslug);
// replace spaces with hyphens
So we should now be able to insert a new article, with an automatically created slug, and access that article by using the slug in the form article.php?articleurlslug=yoururlslug. If you have just a few articles in your database, you can work out and insert the slugs for existing articles yourself. This would obviously be a tedious job for many articles, and a PHP script to convert existing article titles is easily created.
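As a starting point for such a conversion script, here is one possible sketch. It is a variation on the approach described above rather than the same code: instead of listing each punctuation mark, it keeps only letters, numbers, spaces and hyphens. The function name make_slug and the sample titles are made up for illustration, and in a real script the titles would come from the articles table.

```php
<?php
// Hypothetical helper: turn an article title into a URL slug.
function make_slug(string $title): string
{
    $slug = strtolower($title);
    // Drop everything except lowercase letters, digits, spaces and hyphens.
    $slug = preg_replace('/[^a-z0-9 -]/', '', $slug);
    // Collapse runs of spaces or hyphens into a single hyphen.
    $slug = preg_replace('/[ -]+/', '-', $slug);

    return trim($slug, '-'); // no leading or trailing hyphen
}

// Assumption: sample titles keyed by article id stand in for rows
// that a real script would read from the database.
$existing = [
    500 => 'Create Search Engine Friendly URLs',
    501 => "What's New in Web 2.0?",
];

foreach ($existing as $articleid => $articletitle) {
    // A real script would run an UPDATE for each row here instead of echo.
    echo $articleid, ' => ', make_slug($articletitle), "\n";
}
```

Two titles can produce the same slug, so with a unique key on the slug field a real script would need to detect collisions, for example by appending a number. Accented letters are also dropped by this sketch rather than transliterated.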
The final step in our dynamic rewriting process is to create the necessary RewriteRule in our .htaccess file, and it looks like this:
RewriteRule ^articles/([a-zA-Z0-9\-]+)/?$ article.php?articleurlslug=$1
This is similar to our simple rewrites above, just with slightly more complex regex, to limit the user input to letters, numbers and hyphens. As before, the pattern has no leading slash, because rules in an .htaccess file are matched against the path with that slash removed. Note that I’m allowing uppercase letters in case of typos, and we have allowed for this in our article.php file as well, by converting the slug to lower case before querying the database.
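To see which addresses this rule accepts, the pattern can be tried out in PHP. The sketch below is illustrative only: slug_from_path is a hypothetical function that combines the rule’s regular expression (written without a leading slash, as .htaccess rules are matched) with the lowercase conversion done in article.php.

```php
<?php
// Hypothetical helper: returns the slug that article.php would look up,
// or null if the article rule would not match the requested path.
function slug_from_path(string $requestPath): ?string
{
    // Assumption: .htaccess rules see the path without its leading slash.
    $path = ltrim($requestPath, '/');

    if (preg_match('#^articles/([a-zA-Z0-9\-]+)/?$#', $path, $matches) !== 1) {
        return null;
    }

    // article.php lowercases the slug before querying the database.
    return strtolower($matches[1]);
}

var_dump(slug_from_path('/articles/Search-Engine-Friendly-URLs')); // string "search-engine-friendly-urls"
var_dump(slug_from_path('/articles/bad_slug!'));                   // NULL
```

Anything containing characters outside the bracketed set, such as underscores or punctuation, never reaches article.php through this rule.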
If you haven’t already done it, then now is the time to rewrite your URLs. Remember, your users will benefit, your rankings will benefit, and as such, you and your site will reap the rewards.