Tag Archives: PHP

My darling web scripting language.

Hiphop PHP

Facebook has been releasing a lot lately: Cassandra, Scribe, you name it. The new kid from Facebook is HipHop for PHP, an automated system that converts PHP to C++.

HipHop programmatically transforms your PHP source code into highly optimized C++ and then uses g++ to compile it.

Read More here: HipHop

Redis, Memcached, Tokyo Tyrant and MySQL comparison

I wanted to compare the following databases, NoSQL stores and caching solutions for speed and connection handling: Redis, Memcached, Tokyo Tyrant and MySQL (both MyISAM and InnoDB).

My test had the following criteria

  • 2 client boxes
  • All clients connecting to the server using Python
  • Used Python’s threads to create concurrency
  • Each thread made 10,000 open-close connections to the server
  • The server was
    • Intel(R) Pentium(R) D CPU 3.00GHz
    • Fedora 10 32bit
    • #1 SMP
    • 1GB RAM
  • Used an MD5 hash as the key, with a value saved against it
  • Created an index on the key column of the MySQL table
  • Ran SET and GET requests as separate tests on each server, at the same concurrency levels

Results please !

Work sheet

[Figure: SET throughput]

[Figure: GET throughput]

I wanted to simulate a situation where I had 2 servers (clients) serving my code, both connecting to 1 server (Memcached, Redis, or whatever). Note that I used Python as the client in all the tests; the results would definitely differ had I used PHP. Again, the test was meant to check how well the clients could make and break connections to the server, and I wanted the overall throughput including the cost of making and breaking those connections. I did not monitor response times. I didn’t change any server parameters at all, e.g. I didn’t touch innodb_buffer_pool_size or key_buffer_size.
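The test loop described above can be sketched roughly as follows. This is not the attached test source, only a minimal illustration: `connect_and_close` is a hypothetical callable standing in for "open a connection, do one SET or GET, close it", and the thread and iteration counts are placeholders.

```python
import threading
import time


def run_benchmark(connect_and_close, threads=4, iterations=1000):
    """Run `iterations` open-close cycles in each of `threads` threads.

    `connect_and_close` is a hypothetical stand-in for one
    connect / SET-or-GET / disconnect cycle against the server under test.
    Returns (total_operations, elapsed_seconds, operations_per_second).
    """
    def worker():
        for _ in range(iterations):
            connect_and_close()

    pool = [threading.Thread(target=worker) for _ in range(threads)]
    start = time.perf_counter()
    for t in pool:
        t.start()
    for t in pool:
        t.join()
    elapsed = time.perf_counter() - start

    total = threads * iterations
    rate = (total / elapsed) if elapsed > 0 else float("inf")
    return total, elapsed, rate
```

Because a fresh connection is made on every iteration, the measured rate includes connection setup and teardown, which is exactly what this comparison was after.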


MySQL lagged badly behind the rest. I monitored the MySQL server via MySQL Administrator and found hardly any concurrent inserts or selects. Instead I could see unauthenticated users, which meant the clients had connected to MySQL and were still in the authentication handshake (username and password). As you can see, I didn’t even perform the 40 and 60 thread tests for it.

I truncated the table before I switched my tests from MyISAM to InnoDB, and always started the tests with the lowest thread count. My table was as follows

CREATE TABLE `comp_dump` (
  `k` char(32) DEFAULT NULL,
  `v` char(32) DEFAULT NULL,
  KEY `ix_k` (`k`)
);


For Tokyo Tyrant I used a file.tch as the DB, which is a hash database. I also tried MongoDB, as you may have seen if you opened the worksheet, but the server kept failing: mongod died on an unhandled exception. I found something similar over here. I tried 1.0.1, 1.1.3 and the available nightly build, but all failed and I lost my patience.

Now what

If you need speed just to fetch data for a given key, Redis is a solution you need to look at. MySQL can in no way compare to Redis and Memcached for this. If you find Memcached good enough, you may want to look at Tokyo Tyrant, as it does synchronous writes. But you need to check which server or combination suits your application best. In Marathi there is a saying “मेल्या शिवाय स्वर्ग दिसत नाही”, which means “You can’t see heaven without dying”: you need to do the hard work yourself, you can’t escape that 😉

I’ve attached the source code used for the tests. If anybody has any doubts or questions, feel free to ask.

How sessions work in PHP

HTTP is a stateless protocol, which means the server can’t tell whether a request from the browser is a follow-up request from the same user/IP/browser or a brand new one.

HTTP doesn’t understand who is requesting. So how do sessions manage to make HTTP look intelligent? The answer lies in the request-response model with data.

When a normal request is made, e.g. to my website, the minimal data passed by the client/browser is this

GET / HTTP/1.1
Host: ruturaj.net

The server responds with the output. But when a developer calls session_start();, the PHP engine sets a PHPSESSID cookie, which is sent from the server in a Set-Cookie header. So the response goes somewhat like this

HTTP/1.x 200 OK
Date: xxxx
Set-Cookie: PHPSESSID=<32charhexvalue>; expires=xxxx

Assuming the browser accepts cookies, it saves the PHPSESSID cookie. Meanwhile the server creates a file in the configured session directory (by default /tmp on Linux) named /tmp/sess_32charid.
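To make the header concrete, here is a small Python sketch of what the server side does at this step: generate a 32-character hex ID and emit it in a Set-Cookie header. It only mimics the shape of the exchange and is not PHP's actual ID generator; the function name `new_session_cookie` is made up for this illustration.

```python
import secrets
from http.cookies import SimpleCookie


def new_session_cookie(name="PHPSESSID"):
    """Return (session_id, header_line) for a fresh session.

    Illustration only: PHP has its own ID generator, this just
    produces an ID of the same shape (32 hex characters).
    """
    session_id = secrets.token_hex(16)  # 16 random bytes -> 32 hex characters
    cookie = SimpleCookie()
    cookie[name] = session_id
    cookie[name]["path"] = "/"
    return session_id, cookie.output()


sid, header = new_session_cookie()
print(header)
```

The printed line has the form `Set-Cookie: PHPSESSID=<32 hex chars>; Path=/`, matching the response shown above.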

Now when the user/browser makes another request, the cookie is sent back to the server in the Cookie header of the request, something like this…

GET /session2.php HTTP/1.1
Host: ruturaj.net
Cookie: PHPSESSID=<32charid>; othercookies=othervalues;

The session2.php script, for example, stores a name in the session like this

$_SESSION['name'] = $name_obtained_from_somewhere;

As the script finishes, PHP flushes all the $_SESSION data into the /tmp/sess_32charid file associated with that session ID, saving the data in serialized format.
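For a feel of what ends up in that file, the sketch below writes a few scalar values in the layout PHP's default session handler uses: each top-level key, a `|`, then the serialized value. It is a simplified illustration covering only strings, integers and booleans, and the helper names are made up here.

```python
def php_serialize_scalar(value):
    """Serialize a str, int or bool the way PHP's serialize() does."""
    if isinstance(value, bool):  # check bool first: bool is a subclass of int
        return "b:%d;" % int(value)
    if isinstance(value, int):
        return "i:%d;" % value
    if isinstance(value, str):
        # PHP records the length in bytes, not characters
        return 's:%d:"%s";' % (len(value.encode("utf-8")), value)
    raise TypeError("only str, int and bool are handled in this sketch")


def encode_session(data):
    """Build the session file contents: key|serialized-value, concatenated."""
    return "".join("%s|%s" % (k, php_serialize_scalar(v)) for k, v in data.items())


print(encode_session({"name": "Ruturaj", "visits": 3}))
# name|s:7:"Ruturaj";visits|i:3;
```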

Consider the browser making another request, to session3.php, where $_SESSION['name'] is echoed. When the request is made, just like in the previous case, the PHPSESSID is passed in the cookie.

As php.net mandates, every page that needs the session must call session_start();. As soon as this function is invoked, PHP checks whether the browser’s request carried a PHPSESSID cookie. Since it did in our case, the PHP engine opens the /tmp/sess_32charid file (matching that session ID), unserializes its contents, and assigns the resulting data to the $_SESSION variable.
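The lookup step can be sketched in Python like this: pull PHPSESSID out of the Cookie header and map it to the session file path. The function names and the `save_path` parameter are assumptions for the illustration, and the alphanumeric check is this sketch's own precaution, not a description of PHP's validation.

```python
from http.cookies import SimpleCookie


def session_id_from_cookie_header(header_value, name="PHPSESSID"):
    """Return the session ID carried in a Cookie header value, or None."""
    cookie = SimpleCookie()
    cookie.load(header_value)
    morsel = cookie.get(name)
    return morsel.value if morsel is not None else None


def session_file_for_request(header_value, save_path="/tmp"):
    """Map the request's PHPSESSID cookie to /tmp/sess_<id>, or None."""
    sid = session_id_from_cookie_header(header_value)
    # Refuse anything that is not plain letters/digits before building a path
    if sid is None or not sid.isalnum():
        return None
    return "%s/sess_%s" % (save_path, sid)


header = "PHPSESSID=0123456789abcdef0123456789abcdef; othercookies=othervalues"
print(session_file_for_request(header))
# /tmp/sess_0123456789abcdef0123456789abcdef
```

A request without the cookie yields None, which corresponds to the case where session_start() has to create a brand new session.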

The simple echo $_SESSION['name']; will now be able to output the name!! Sessions working…

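On a session_destroy();, PHP unlinks (deletes) the /tmp/sess_32charid file, so the stored data is gone. Note that session_destroy() does not remove the PHPSESSID cookie by itself; to clear it in the browser as well, the script has to send the cookie again with an expiry time in the past (for example via setcookie()). Doing both ensures that no reference to that session is left.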


  • http://in3.php.net/manual/en/session.configuration.php

Scribe PHP logging

I put some effort into making Scribe logging work with PHP. What I did was follow the Python example script “scribe_cat” and write a similar PHP script, which meant generating a number of PHP files from the .thrift files. Anyway, I’ve got a working example. Here it is.

<?php
/*
 * As found on http://highscalability.com/product-scribe-facebooks-scalable-logging-system
 *
 *     $messages = array();
 *     $entry = new LogEntry;
 *     $entry->category = "buckettest";
 *     $entry->message = "something very interesting happened";
 *     $messages []= $entry;
 *     $result = $conn->Log($messages);
 */

// Directory holding the Thrift/Scribe generated PHP includes
$GLOBALS['THRIFT_ROOT'] = './includes';

include_once $GLOBALS['THRIFT_ROOT'] . '/scribe.php';
include_once $GLOBALS['THRIFT_ROOT'] . '/transport/TSocket.php';
include_once $GLOBALS['THRIFT_ROOT'] . '/transport/TFramedTransport.php';
include_once $GLOBALS['THRIFT_ROOT'] . '/protocol/TBinaryProtocol.php';
//include_once '/usr/local/src/releases/scribe-2.0/src/gen-php/scribe.php';

// Two log entries for the same category
$msg1['category'] = 'keyword';
$msg1['message'] = "This is some message for the category\n";
$msg2['category'] = 'keyword';
$msg2['message'] = "Some other message for the category\n";
$entry1 = new LogEntry($msg1);
$entry2 = new LogEntry($msg2);
$messages = array($entry1, $entry2);

// Talk to scribed over a framed transport with the binary protocol
$socket = new TSocket('localhost', 1464, true);
$transport = new TFramedTransport($socket);
$protocol = new TBinaryProtocol($transport, false, false);
$scribe_client = new scribeClient($protocol, $protocol);

// Open the connection, send both entries in one Log call, then close
$transport->open();
$scribe_client->Log($messages);
$transport->close();


You can put as many messages or entries into one Log call as you like, as demonstrated above. Please change the host and port to match your scribed. I’ve attached a working file along with all the required includes. Everything except the above script is generated by Scribe/Thrift.

PHP, Python Consistent Hashing

I found out that the hashing algorithm used by PHP-Memcache is different from that of Python-Memcache. Keys went to different servers because the hashes computed by Python and PHP differed.

I posted a question on the memcached group and was lucky to get this wonderful reply.

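import memcache
import binascii

# Fill in your own memcached 'host:port' entries
m = memcache.Client(['', '', ''])

def php_hash(key):
    # CRC32 of the key, shifted and masked down to 15 bits
    return (binascii.crc32(key.encode()) >> 16) & 0x7fff

for i in range(30):
    key = 'key' + str(i)
    # Passing a (hash, key) tuple makes the client pick the server from our hash
    a = m.get((php_hash(key), key))
    print(i, a)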

This is the only change needed on the Python end: change the way the hash is calculated. The code on the PHP end remains the same. Anyone using PHP for a web front-end with MySQL and Python for back-end scripts should find this helpful.
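To see why matching the hash matters, the sketch below maps a key to a server without needing a running memcached. It assumes equally weighted servers picked by a plain `hash % server_count`, which is my reading of how the clients choose a bucket rather than something stated in the reply; the server names are placeholders.

```python
import binascii


def php_hash(key):
    """CRC32 of the key, shifted right 16 bits and masked to 15 bits."""
    return (binascii.crc32(key.encode("utf-8")) >> 16) & 0x7fff


def pick_server(key, servers):
    """Choose a server by hash modulo server count (assumed, equal weights)."""
    return servers[php_hash(key) % len(servers)]


servers = ["cache-a", "cache-b", "cache-c"]  # placeholder names
for i in range(5):
    key = "key" + str(i)
    print(key, php_hash(key), pick_server(key, servers))
```

If two clients compute different hashes for the same key, they can land on different entries of the same server list, which is the symptom described above.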

Thanks Brian Rue.

Reference: http://groups.google.com/group/memcached/msg/7bb75a026c44ec43

Creating a Live Score Board Client Logic

The custom JavaScript object player is defined as below.

function player () {
	this.name = '';
	this.serving = false;
	this.gamepoint = 0;
	this.sets = new Array();
	this.toString = function() {
		return (this.name + ": " + (this.serving ? 'Serving' : 'Facing') + "\n" + this.sets + "\n" + this.gamepoint);
	};
}

The sets member variable is an array.

Next comes the important updateScoreBoard function

function updateScoreBoard() {
	sc = xmlScore;
	pDOM = sc.getElementsByTagName("player");
	players = new Array();
	for (i=0; i<pDOM.length; i++) {
		players[i] = new player(pDOM[i].getAttribute("name"));
		players[i].name = pDOM[i].getAttribute("name");
		if (pDOM[i].getAttribute("serving") == 1) {
			players[i].serving = true;
		}
		// Now get the match sets for the players
		setDOM = pDOM[i].getElementsByTagName("set");
		for (j=0; j<setDOM.length; j++) {
			players[i].sets[j] = setDOM[j].childNodes[0].nodeValue;
		}

		// Now get the current game point for the players
		players[i].gamepoint = pDOM[i].getElementsByTagName("gamepoints")[0].childNodes[0].nodeValue;

//		alert(players[i].toString());
	}

	// Now that all data has been gathered...
	document.getElementById("player1").innerHTML = players[0].name;
	document.getElementById("matchplayer1").innerHTML = players[0].name;
	document.getElementById("player2").innerHTML = players[1].name;
	document.getElementById("matchplayer2").innerHTML = players[1].name;

//	document.getElementById("serving1").innerHTML = (players[0].serving)?'o':'';
//	document.getElementById("serving2").innerHTML = (players[1].serving)?'o':'';

	// Sets
	document.getElementById("setplay11").innerHTML = players[0].sets[0];
	document.getElementById("setplay12").innerHTML = players[0].sets[1];
	document.getElementById("setplay13").innerHTML = players[0].sets[2];
	document.getElementById("setplay21").innerHTML = players[1].sets[0];
	document.getElementById("setplay22").innerHTML = players[1].sets[1];
	document.getElementById("setplay23").innerHTML = players[1].sets[2];

	// Game points
	document.getElementById("gamepoints1").innerHTML = players[0].gamepoint;
	document.getElementById("gamepoints2").innerHTML = players[1].gamepoint;
}

Notice that the players variable is an array of player objects. Traversing the XML content is easy. If you are unable to understand the JavaScript DOM methods, check the following links
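For readers who find the traversal easier to follow outside the browser, here is the same walk over the score document in Python. The XML layout is inferred from the DOM calls above (player elements with name and serving attributes, set and gamepoints children); the root element name and the sample values are made up for the illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample document; only the element/attribute names
# used by updateScoreBoard are taken from the post.
SAMPLE = """<score>
  <player name="Player A" serving="1">
    <set>6</set><set>4</set><set>2</set>
    <gamepoints>30</gamepoints>
  </player>
  <player name="Player B" serving="0">
    <set>3</set><set>6</set><set>1</set>
    <gamepoints>15</gamepoints>
  </player>
</score>"""


def parse_players(xml_text):
    """Collect name, serving flag, set scores and game point per player."""
    root = ET.fromstring(xml_text)
    players = []
    for node in root.iter("player"):
        players.append({
            "name": node.get("name"),
            "serving": node.get("serving") == "1",
            "sets": [s.text for s in node.iter("set")],
            "gamepoint": node.findtext(".//gamepoints"),
        })
    return players


for p in parse_players(SAMPLE):
    print(p)
```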

You can find a full working example here, Live Score Board

Creating a Live Score Board Client Logic

Let us start with a script tag that will enclose all the fundooo script. You can save in a .js file as well if you want…

The httpObj is the XMLHttpRequest object, and xmlScore is the response XMLDocument. I’ve created a player object class which is used by the players variable.

<script type="text/javascript">
var httpObj, xmlScore, players;
function getScore(round, category, sex, player) {
	url = "http://localhost/score_server.php";
	url += "?round="+round+"&category="+escape(category)+
		"&sex="+escape(sex)+"&player="+escape(player)+"&rand="+new Date() ;
	httpObj = false;
	// branch for native XMLHttpRequest object
	if(window.XMLHttpRequest) {
		try {
			httpObj = new XMLHttpRequest();
		} catch(e) {
			httpObj = false;
		}
	// branch for IE/Windows ActiveX version
	} else if(window.ActiveXObject) {
		try {
			httpObj = new ActiveXObject("Msxml2.XMLHTTP");
		} catch(e) {
			try {
				httpObj = new ActiveXObject("Microsoft.XMLHTTP");
			} catch(e) {
				httpObj = false;
			}
		}
	}
	if(httpObj) {
		httpObj.onreadystatechange = processReqChange;
		httpObj.open("GET", url, true);
		// nothing goes over the wire until send() is called
		httpObj.send(null);
	}

	// poll again in 30 seconds
	timerID = setTimeout("getScore('"+round+"', '"+category+"', '"+sex+"', '"+player+"')",30000);
}

The getScore function takes the params that the server needs. Notice that at the end, the function schedules itself to run again after 30 sec (30,000 msec), assuming one serve takes half a minute or so.
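The query string getScore builds can be sketched in Python as below, mainly to show the role of the rand parameter: it changes on every call so the browser or a proxy cannot answer from cache. The function name and the millisecond timestamp are choices made for this illustration; the JavaScript above appends the current Date instead.

```python
import time
from urllib.parse import urlencode


def build_score_url(base, round_, category, sex, player, now=None):
    """Build the polling URL with a cache-busting 'rand' parameter.

    `now` (seconds since the epoch) can be passed in to get a
    reproducible URL; by default the current time is used.
    """
    stamp = now if now is not None else time.time()
    params = {
        "round": round_,
        "category": category,
        "sex": sex,
        "player": player,
        "rand": int(stamp * 1000),  # changes on every call
    }
    return base + "?" + urlencode(params)


print(build_score_url("http://localhost/score_server.php", 1, "Mens Singles", "M", "A B"))
```

urlencode takes care of escaping every value, including spaces, so no parameter goes out unescaped.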

We define the function that handles the request’s change of state, i.e. when readyState turns 4 (and the HTTP status is 200), it assigns the response XMLDocument object to xmlScore.

function processReqChange() {
    // only if req shows "loaded"
    if (httpObj.readyState == 4) {
        // only if "OK"
        if (httpObj.status == 200) {
            xmlScore = httpObj.responseXML;
        } else {
            alert("There was a problem retrieving the XML data:\n" +