URL Rewriting for content pages

Many of the content sites (blogs, news sites) that you see these days have a specific url for each page. Eg. News.com

Many of the sites have something like /news/75-news-title.html. You can’t have a actual distinct page for each content. The most common solution to this is URL Rewrite the content.

The idea is to grab the specific content from the URL and then map that content, id, or whatever from the url as a get parameter’s value to a specific page.

In the above scenario, check the URL: /news/75-news-title.html. What is commonly done is the content_id the key by which the content is mapped in the content table is placed in the URL along with the title.

As in this case
Content ID: 75
Title Text: news-title (The hyphens are to make it readable instead of %20 for space)

So lets assume that we have a page called news.php in which we will give a get parameter as newsid. All we got to do now is write the URL Rewrite rule using Apache’s mod_rewrite engine.

We’ll use the RewriteCond and RewriteRule

<IfModule mod_rewrite.c>
    RewriteEngine On

    RewriteCond %{REQUEST_URI} /news/([0-9]+).*\.html$
    RewriteRule (.*) /news.php?newsid=%1 [L]
</IfModule>

Now Let us look how we built the thing.

  • First we used the Server Variable REQUEST_URI, to match the pattern with the request. The variable is referenced using %{SERVER_VARIABLE} format.
  • The RewriteCond is basically a If condition, which means if the condition is true, only then the condition or rules below that statement will be executed. That means the pattern should match for the rule to work
  • The regular expression pattern we made was accepting a integer value after /news/, After that integer value any text can come. But should end with .html. As emphasized by the $ at the end.
  • Now if the Condition works, we need to write the rule for it, so we use RewriteRule. The first argument is .*, which means accept any URL
  • The second argument is the actual mapping of the news.php with the newsid parameter. Check that we’ve used %1 which means the first back reference of the RewriteCond regex pattern
  • Since our pattern was /news/([0-9]+).*\.html$ and had just one class in it, that class i.e. ([0-9]+) should be referenced by %1 in the RewriteRule directive

Leave a Reply

Your email address will not be published. Required fields are marked *

This site uses Akismet to reduce spam. Learn how your comment data is processed.