Setting VirtualHosts

VirtualHosts
The most important part of setting up Apache is configuring the hosts, or VirtualHosts. The term “VirtualHost” comes from the fact that a single host or computer serves many hostnames. Apache was among the first servers to offer this type of hosting: it reads the Host header of a standard HTTP request and uses it to find the website associated with that hostname. This is known as name-based virtual hosting, and it is the most common of the hosting types. The other is IP-based hosting, which requires each domain to have a separate IP address.

What I will show you is how to set up a name-based VirtualHost.

A simple GET request for the root page of my site looks like this:

GET / HTTP/1.1
Host: www.ruturaj.net

Apache picks up “www.ruturaj.net” from the request header and maps it to the virtual host configured for www.ruturaj.net.

Let’s assume you have the IP 67.66.65.64 that you need to set up for virtual hosting. First, you need to tell Apache that this IP is used for name-based virtual hosting. (This applies to Apache 2.2 and earlier; in Apache 2.4 the NameVirtualHost directive is deprecated and has no effect, so it can be left out.)

NameVirtualHost 67.66.65.64

Now that the IP is set up for virtual hosting, you need to configure the VirtualHosts themselves.

Let us take ruturaj.net as the domain to set up. Here it goes:

<VirtualHost 67.66.65.64>
  ServerName ruturaj.net
  ServerAlias www.ruturaj.net
  DocumentRoot /www/domains/ruturaj.net
  CustomLog logs/ruturaj.net-access_log combined
  ErrorLog logs/ruturaj.net-error_log
  DirectoryIndex index.php
  ServerAdmin ruturaj@ruturaj.net
</VirtualHost>

Now let us review the directives:

  • ServerName: the main server name; it should be the domain name.
  • ServerAlias: an alias, e.g. www.ruturaj.net should mean the same as ruturaj.net over HTTP.
    You can add anything else here, such as default.ruturaj.net. Just make sure that default.ruturaj.net resolves to 67.66.65.64.
  • DocumentRoot: the file system path of the directory that holds the files for the ruturaj.net domain.
  • CustomLog: the access log for ruturaj.net. It uses the “combined” log format; if you want a different format, define it with a LogFormat directive before the CustomLog directive.
  • ErrorLog: any errors raised while serving requests are logged in this file.
  • DirectoryIndex: the default document for a directory, e.g. when you request http://ruturaj.net/ it tells the server to serve “index.php”. You can set it to whatever you want: default-page.html, default.pl, etc.
  • ServerAdmin: the administrator’s email address; it shows up on server-generated error pages.

So now if you want to add a configuration for host “johnsmith.com”…

<VirtualHost 67.66.65.64>
  ServerName johnsmith.com
  ServerAlias www.johnsmith.com
  DocumentRoot /www/domains/johnsmith.com
  CustomLog logs/johnsmith.com-access_log combined
  ErrorLog logs/johnsmith.com-error_log
  DirectoryIndex index.php
  ServerAdmin admin@johnsmith.com
</VirtualHost>
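
With both blocks in place, the name-based lookup can be pictured with a short Python sketch. It only illustrates the idea and is not how Apache is implemented: the `VHOSTS` table and the `host_header` and `pick_docroot` helpers are hypothetical names, and the document roots are copied from the two blocks above. Falling back to the first listed host when nothing matches follows Apache's documented behaviour for name-based virtual hosts.

```python
# Hypothetical table mirroring the two <VirtualHost> blocks above:
# (ServerName, [ServerAlias, ...], DocumentRoot)
VHOSTS = [
    ("ruturaj.net", ["www.ruturaj.net"], "/www/domains/ruturaj.net"),
    ("johnsmith.com", ["www.johnsmith.com"], "/www/domains/johnsmith.com"),
]


def host_header(raw_request):
    """Return the value of the Host header from a raw HTTP request, or None."""
    for line in raw_request.split("\r\n")[1:]:  # skip the request line
        if line == "":
            break  # a blank line ends the headers
        name, _, value = line.partition(":")
        if name.strip().lower() == "host":
            # Drop an optional :port suffix and normalise the case
            return value.strip().split(":")[0].lower()
    return None


def pick_docroot(raw_request):
    """Choose a document root the way a name-based lookup would."""
    host = host_header(raw_request)
    for server_name, aliases, docroot in VHOSTS:
        if host == server_name or host in aliases:
            return docroot
    # No match: fall back to the first listed virtual host
    return VHOSTS[0][2]


request = "GET / HTTP/1.1\r\nHost: www.ruturaj.net\r\n\r\n"
print(pick_docroot(request))  # prints /www/domains/ruturaj.net
```

The sketch ignores IPv6 literals and wildcard aliases, which a real server has to handle.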

The httpd.conf file

The httpd.conf file is the main configuration file of Apache. It lives in “apache-install-dir/conf”.

Now let’s take a look at some important and useful directives.

ServerName
This directive sets the default server name. It should generally be the FQDN (Fully Qualified Domain Name) of the machine, or its IP address if the machine doesn’t have an FQDN.

Directory
This is a container that encloses the settings for a given directory. You specify the physical directory as the argument, so if you have a directory /websites/mywebsite/somedir, you would do the following.

<Directory /websites/mywebsite/somedir>
... your settings
...
</Directory>

AllowOverride
AllowOverride lets users override some of the settings with a file of their own. That file is the magical .htaccess file. By default it is set to None, which means users can’t override settings by placing a .htaccess file in the directory. You can change the AllowOverride None setting to AllowOverride All to permit it.

Options
This directive takes several options; I’ll explain some of them.
Indexes: This allows a directory listing. You must have come across something like this:
[Screenshot: directory listing]

FollowSymLinks: This allows Apache to follow symbolic links. Symbolic links are simply links in *nix file systems, e.g. “files” in /etc/ can point to /files/myfiles/files.
You can set both of these options at once:

Options +Indexes -FollowSymLinks

The above setting allows directory listings but doesn’t allow symbolic links to be followed. So “+” applies a setting and “-” removes it.
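
The “+” and “-” behaviour can be sketched in a few lines of Python. This is only a model of the merging rule, not Apache code: `apply_options` is a hypothetical helper, and the starting set (a parent directory with only FollowSymLinks enabled) is an assumption made for the example.

```python
def apply_options(inherited, directive):
    """Apply an 'Options +A -B' style argument list to an inherited set."""
    result = set(inherited)
    for token in directive.split():
        if token.startswith("+"):
            result.add(token[1:])       # "+" switches the option on
        elif token.startswith("-"):
            result.discard(token[1:])   # "-" switches the option off
        else:
            raise ValueError("this sketch only handles +/- prefixed options")
    return result


# Assumed starting point: the parent directory allowed symlinks only
parent = {"FollowSymLinks"}
print(sorted(apply_options(parent, "+Indexes -FollowSymLinks")))  # prints ['Indexes']
```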

AccessFileName
I talked about the magic file .htaccess. This is where you specify the name of that file. By default it is “.htaccess”.
The leading “.” (period) makes it a hidden file on *nix systems.

Denying files
Denying access to files over the web is the job of the server. In Apache, we can do exactly that with the Files directive.

<Files ~ "^\.ht">
    Order allow,deny
    Deny from all
    Satisfy All
</Files>

Note the ~ sign; it is used when you give a regular expression to match the files. Once the files are selected, they can be denied with the Deny directive.
The above regex denies access to all files whose names start with “.ht”.
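
To see which names that pattern catches, here is a small Python check. It reuses the same regular expression; `is_blocked` and the list of file names are made up for the illustration, and Python's `re` module stands in for the regex engine Apache uses.

```python
import re

# Same pattern as in the <Files ~ "^\.ht"> block above
HT_PATTERN = re.compile(r"^\.ht")


def is_blocked(filename):
    """True when the file name starts with '.ht'."""
    return HT_PATTERN.search(filename) is not None


# Made-up file names, only to try the pattern out
for name in [".htaccess", ".htpasswd", "index.php", "notes.ht"]:
    print(name, is_blocked(name))
```

Only the first two names are blocked; “notes.ht” passes because the “^” anchors the match to the start of the name.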

Access Logs
To create access logs, we need to specify the format of the log and the file path.
First we need to set the LogFormat directive.
The most common is the “combined” log, which records the client IP, user, time, request line, status code, response size, referer and user agent.

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined

Note: the log format has been given the name “combined”. Feel free to create different formats for your needs and name them accordingly.
Then we need to set the filename of the log:

CustomLog /usr/local/apache/logs/access_log combined

The first parameter of the CustomLog directive sets the filename of the log; the second is the name of the log format that we defined earlier.
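
Once the log is written in the combined format, each line can be picked apart again. The Python sketch below shows one way to do that, assuming the fields appear exactly in the order of the LogFormat string above. The `COMBINED` pattern, the `parse_line` helper and the sample line are my own inventions for the example; the sample is not a real log entry.

```python
import re

# One named group per field of the combined format
COMBINED = re.compile(
    r'(?P<host>\S+) (?P<ident>\S+) (?P<user>\S+) \[(?P<time>[^\]]+)\] '
    r'"(?P<request>[^"]*)" (?P<status>\d{3}) (?P<size>\S+) '
    r'"(?P<referer>[^"]*)" "(?P<agent>[^"]*)"'
)

# Made-up sample line in the combined format, for illustration only
SAMPLE = ('192.168.0.84 - - [27/Apr/2004:12:39:24 -0400] '
          '"GET /index.php HTTP/1.1" 200 2326 '
          '"http://ruturaj.net/" "Mozilla/5.0"')


def parse_line(line):
    """Return a dict of fields, or None when the line does not match."""
    match = COMBINED.match(line)
    return match.groupdict() if match else None


entry = parse_line(SAMPLE)
print(entry["host"], entry["status"], entry["request"])
```

The pattern does not cope with escaped quotes inside the request or user agent, so treat it as a starting point only.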

Server-Status
When you want to look at the current status of the server, i.e. whom it is responding to, what pages it is serving, how many server processes are running and so on, there is no better way than to set up server-status.

[Screenshot: server-status page]

To enable it …

<Location /server-status>
    SetHandler server-status
    Order deny,allow
    Deny from all
    Allow from 192.168.0.84
</Location>

Check the configuration: it allows only the IP 192.168.0.84 to view the stats, and everyone else is forbidden. You can set your own IP as you wish.
If you want even more information, you can turn on the extended status:

ExtendedStatus On
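
The Order, Deny and Allow lines in the server-status block (and in the Files block earlier) are easy to mix up, so here is a rough Python model of how the two orderings are evaluated in this older access-control syntax. `is_allowed` is a hypothetical helper, and it only understands exact IPs and the word "all"; the real directives also accept partial addresses, networks and host names.

```python
def is_allowed(order, allow, deny, client_ip):
    """Evaluate Order/Allow/Deny rules for one client.

    allow and deny are lists of exact IPs or the word "all".
    """
    def matches(rules):
        return "all" in rules or client_ip in rules

    if order == "deny,allow":
        # Deny is checked first, Allow can override it; the default is allow
        if matches(allow):
            return True
        return not matches(deny)
    if order == "allow,deny":
        # Allow is checked first, Deny overrides it; the default is deny
        if matches(deny):
            return False
        return matches(allow)
    raise ValueError("unknown order: " + order)


# The server-status rules from above
print(is_allowed("deny,allow", ["192.168.0.84"], ["all"], "192.168.0.84"))  # prints True
print(is_allowed("deny,allow", ["192.168.0.84"], ["all"], "192.168.0.10"))  # prints False
```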

Apache beginnings

For those who have reached here but still don’t know what httpd is:
Apache is a web server. For all the web pages, websites, blogs and image galleries that are hosted on the web, there needs to be a server that “serves” these documents (pages, images, files) to the client (the user’s browser).

Apache got its name from … well … it’s nothing but “a patchy server”. Apache httpd is an open-source project, programmed by many programmers all over the world. Every time a bug fix or a new feature was required, the main code was just “patched”, and hence it got the name Apache.

Apache, being a standard web server, runs on port 80, the standard HTTP port. Before you go ahead, let me warn you: changing Apache’s settings can change the way a website behaves, and to edit them you need root or Administrator access.

To control Apache, you basically need to edit two important files: “httpd.conf” and “.htaccess”.

IP to Country

This is not exactly a tutorial, but a small trick to use the ip-to-country.webhosting.info demo as a web service.

ip-to-country.webhosting.info provides a nice CSV file for mapping IPs to countries 🙂 Pretty amazing.
But the problem is that you need good database support at your end to use it.

So guys like me, who can’t afford a database, have to rely on other ways, such as web services.
But it seems there is not much on offer from them in that line yet.
So I have written a mischievous script that fakes a valid HTTP POST and gets back the country 😀

Since this script relies on their HTML page, it is highly likely to fail IF THEY CHANGE THE HTML, and you will need to modify it to make it work again.
Please don’t abuse the server. It’s just for fun.

<?php
// IP address to look up, taken from the query string (?ip=...)
$ip = isset($_GET['ip']) ? $_GET['ip'] : '';

function get_country($ip)
{
    $f = fsockopen('ip-to-country.webhosting.info', 80);
    if (!$f)
    {
        return false;
    }

    $postdata = "ip_address=".urlencode($ip)."&submit=".urlencode('Find Country');


    $request = '';
    $request .= "POST /node/view/36 HTTP/1.1\r\n";
    $request .= "Host: ip-to-country.webhosting.info\r\n";
    $request .= "User-Agent: Its me again\r\n";
    $request .= "Content-Length: ".strlen($postdata)."\r\n";
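    $request .= "Content-Type: application/x-www-form-urlencoded\r\n";
    // Ask the server to close the connection so the feof() loop below ends promptly
    $request .= "Connection: close\r\n";
    $request .= "\r\n";
    // The body must be exactly Content-Length bytes, so no trailing CRLF
    $request .= $postdata;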
    
    fwrite($f, $request);
    $response = '';
    while (!feof($f))
    {
        $response .= fgets($f, 128);
    }
    fclose($f);

    $pos1 = strpos ( $response , '</from>');

    $pos2 = strpos ( $response , '<br><br><img' , $pos1 );

    $parse_from = substr( $response, $pos1+21, ($pos2-$pos1) );
    $pattern = "/<b>([^\/]*)<\/b>/si";
    preg_match_all($pattern, $parse_from, $matches);

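    // Return false when the page layout changed and nothing was matched
    return isset($matches[1][1]) ? $matches[1][1] : false;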
}

echo (get_country($ip));
 ?>
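
The part of the script that is easiest to get wrong is the hand-built request, so here is the same idea sketched in Python, without sending anything over the network. `build_post` is a hypothetical helper; the host, path and form fields are the ones used in the PHP script above, and the `Connection: close` header is my addition so that a reader loop can stop when the server hangs up.

```python
from urllib.parse import urlencode


def build_post(host, path, fields):
    """Return the raw bytes of a form-encoded HTTP/1.1 POST request."""
    body = urlencode(fields).encode("ascii")
    head = (
        f"POST {path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        "Content-Type: application/x-www-form-urlencoded\r\n"
        f"Content-Length: {len(body)}\r\n"   # must equal the body size in bytes
        "Connection: close\r\n"
        "\r\n"                               # blank line separates headers and body
    )
    return head.encode("ascii") + body


raw = build_post(
    "ip-to-country.webhosting.info",
    "/node/view/36",
    {"ip_address": "67.66.65.64", "submit": "Find Country"},
)
print(raw.decode("ascii"))
```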
 

Finally Displaying XML

First display where the RSS feed comes from, i.e. the site name, link, etc.

echo ("<h5><a href='$link' title='$desc'>".$title."</a></h5>\n<br />".date('F d, Y H:i:s', $dt)."<p> </p>"); 


Now move through a for loop to finally display the XML as well-formatted HTML.
Note how I have used the div tag to position the content. You can adjust the [width] and [left] style properties to suit your page.

echo ("<div style='position:absolute;left:5px;width:150px;font-size:8pt;font-family:trebuchet ms;border:1px dashed silver;padding:3px;'>");

for ($i=0; $i<count($arr['title']); $i++)
{
  echo ("<p>\n");
  echo ("<a href='".trim($arr['link'][$i])."'>".trim($arr['title'][$i])."</a><br />\n");
  echo (trim($arr['desc'][$i])."\n");
  echo ("</p>\n");
}
echo ("</div>");
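
One thing the loop above leaves out is escaping: a feed title containing “&” or “<” goes straight into the page. Below is a Python sketch of the same rendering step with escaping added. `render_items` and the sample item are hypothetical; the dictionary keys follow the 'title', 'link' and 'desc' names used in the loop above.

```python
from html import escape


def render_items(items):
    """Return an HTML fragment for a list of {'title', 'link', 'desc'} dicts."""
    parts = ["<div>"]
    for item in items:
        link = escape(item["link"].strip(), quote=True)
        title = escape(item["title"].strip())
        desc = escape(item["desc"].strip())
        parts.append(f'<p>\n<a href="{link}">{title}</a><br />\n{desc}\n</p>')
    parts.append("</div>")
    return "\n".join(parts)


# Made-up item, only to show the escaping
sample = [{"title": "Tips & <tricks>",
           "link": "http://example.com/?a=1&b=2",
           "desc": "Plain text"}]
print(render_items(sample))
```

Some feeds put intentional HTML in the description; escaping shows that markup as text, so whether to escape the description is a choice to make per feed.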

Reading XML RSS Feed

OK, now that we have all the content, we are ready to display it.
But before that we should check how the XML document is structured.
For that, paste the $url string into the browser and you’ll see something like this:

<!-- Copyright (C) 2000-2004 - Developer Shed, Inc. -->
<rss version="2.0">
  <channel>
    <title>Dev Shed - The Open Source Web Development Site</title>
    <link>http://www.devshed.com</link>
    <description>mos_rss</description>
    <language>en-us</language>
    <lastBuildDate>Tue, 27 Apr 2004 12:39:24 -0400</lastBuildDate>

      <item>
          <title>The Power of CSS</title>
          <link>
            http://www.devshed.com/c/a/Style-Sheets/The-Power-of-CSS/
          </link>
          <description>
            CSS or cascading style sheets are used to create a set of styles that can be applied to your fonts, ...
          </description>
      </item>
... and many more item tags...

If you look carefully, the most important tag is the [item] tag, which has the sub-tags [title], [link] and [description]; these sub-tags are at level 4. The tags [title], [link], [description], [language] and [lastBuildDate], which provide the general info about the website, are at level 3.

Now we just have to use the $vals array and pull out the information. We do it using a simple for loop:

for ($i=0; $i<count($vals); $i++)
{
	//get the primary info
	if ($vals[$i]['tag']=="TITLE" && $vals[$i]['level']==3) {
		$title = $vals[$i]['value'];
	}
	if ($vals[$i]['tag']=="LINK" && $vals[$i]['level']==3) {
		$link = $vals[$i]['value'];
	}
	if ($vals[$i]['tag']=="DESCRIPTION" && $vals[$i]['level']==3) {
		$desc = $vals[$i]['value'];
	}
	if ($vals[$i]['tag']=="LASTBUILDDATE" && $vals[$i]['level']==3) {
		$dt = strtotime($vals[$i]['value']);
	}

	//get the main feed
	if ($vals[$i]['tag']=="TITLE" && $vals[$i]['level']==4) {
		$arr['title'][] = $vals[$i]['value'];
	}
	if ($vals[$i]['tag']=="LINK" && $vals[$i]['level']==4) {
		$arr['link'][] = $vals[$i]['value'];
	}
	if ($vals[$i]['tag']=="DESCRIPTION" && $vals[$i]['level']==4) {
		$arr['desc'][] = $vals[$i]['value'];
	}
}

OK, now we have all the required information!
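
The level 3 / level 4 split is not tied to PHP. As a cross-check, here is the same separation sketched in Python with the standard ElementTree module. `split_feed` is a hypothetical helper, and `FEED` is a shortened copy of the feed shown above with the description cut down.

```python
import xml.etree.ElementTree as ET

# Shortened copy of the feed shown above
FEED = """<rss version="2.0">
  <channel>
    <title>Dev Shed - The Open Source Web Development Site</title>
    <link>http://www.devshed.com</link>
    <description>mos_rss</description>
    <item>
      <title>The Power of CSS</title>
      <link>
        http://www.devshed.com/c/a/Style-Sheets/The-Power-of-CSS/
      </link>
      <description>CSS or cascading style sheets ...</description>
    </item>
  </channel>
</rss>"""


def split_feed(xml_text):
    """Return (site_info, items) using the same level 3 / level 4 split."""
    channel = ET.fromstring(xml_text).find("channel")
    site, items = {}, []
    for child in channel:  # children of <channel> are the level 3 elements
        if child.tag == "item":
            # children of each <item> are the level 4 elements
            items.append({sub.tag: (sub.text or "").strip() for sub in child})
        else:
            site[child.tag] = (child.text or "").strip()
    return site, items


site, items = split_feed(FEED)
print(site["title"])
print(items[0]["title"], items[0]["link"])
```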

Fetching XML from a remote site

PHP provides a wonderful function, file_get_contents.

We first start by getting the XML feed into a string using file_get_contents.

//the $url is the link from where XML is received
$url = 'http://devshed.com/index2.php?option=mos_rss&no_html=1'; 
$con = file_get_contents($url); //the contents are in the variable $con

//now create the SAX xml parser
$xp = xml_parser_create();

Now, using the function xml_parse_into_struct, we parse the XML and create two arrays, $vals and $index. The $index array contains all the info about the node numbers, while the most important array is $vals.

xml_parse_into_struct($xp, $con, $vals, $index);
// we now free the xml parser.
xml_parser_free($xp);

If we print_r the $vals array we see something like this

Array
(
    [0] => Array
        (
            [tag] => RSS
            [type] => open
            [level] => 1
            [attributes] => Array
                (
                    [VERSION] => 2.0
                )

            [value] => 

        )

    [1] => Array
        (
            [tag] => CHANNEL
            [type] => open
            [level] => 2
            [value] => 
	
        )

    [2] => Array
        (
            [tag] => TITLE
            [type] => complete
            [level] => 3
            [value] => Dev Shed - The Open...
        )

    ..... and so on
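
To get a feel for where the tag, type and level entries come from, here is a simplified imitation of such a flat structure in Python, built on the standard expat bindings. It is a sketch under my own assumptions and not a reimplementation of the PHP function: `parse_into_struct` is a hypothetical name, and it leaves out details such as the separate cdata entries and the $index array.

```python
import xml.parsers.expat


def parse_into_struct(xml_text):
    """Build a flat list of dicts with 'tag', 'type', 'level' and 'value' keys."""
    vals, stack = [], []

    def start(name, attrs):
        entry = {"tag": name.upper(), "type": "open",
                 "level": len(stack) + 1, "value": ""}
        if attrs:
            entry["attributes"] = {k.upper(): v for k, v in attrs.items()}
        vals.append(entry)
        stack.append(entry)

    def text(data):
        if stack:
            stack[-1]["value"] += data

    def end(name):
        entry = stack.pop()
        entry["value"] = entry["value"].strip()
        if vals[-1] is entry:
            entry["type"] = "complete"  # no child elements in between
        else:
            vals.append({"tag": name.upper(), "type": "close",
                         "level": len(stack) + 1})

    parser = xml.parsers.expat.ParserCreate()
    parser.StartElementHandler = start
    parser.CharacterDataHandler = text
    parser.EndElementHandler = end
    parser.Parse(xml_text, True)
    return vals


doc = '<rss version="2.0"><channel><title>Dev Shed</title></channel></rss>'
for entry in parse_into_struct(doc):
    print(entry["level"], entry["type"], entry["tag"])
```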

Ruturaj’s Home Page