Script to make entries

We have our tables in place.
It's time to get to the code.

Let's get all the search engines and their attributes first.

$res_se = mysql_query('select * from search_engines', $conn);
$se = array();
while ($row_se = mysql_fetch_object($res_se)) {
    $se['id'][] = $row_se->se_id;
    $se['name'][] = $row_se->se_name;
    $se['regex'][] = $row_se->se_regex;
}


Let's collect our referers.

$sql = 'select view_ref from view_log';
$res = mysql_query($sql, $conn);
$refs = array();
while ($row = mysql_fetch_object($res)) {
    $refs[] = $row->view_ref;
}
mysql_free_result($res);
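
The mysql_* functions used above were deprecated in PHP 5.5 and removed in PHP 7.0, so on a current PHP the same two queries need another extension. The following is a minimal sketch using PDO, not the original method of this post. The table and column names come from the snippets above; the function names load_search_engines() and load_referers() and the connection details in the comment are hypothetical.

```php
<?php
// Sketch only: assumes a PDO connection is created elsewhere, for example
// $pdo = new PDO('mysql:host=localhost;dbname=stats', 'user', 'pass'); // hypothetical values

// Returns the same structure as $se above: parallel 'id', 'name' and 'regex' lists.
function load_search_engines(PDO $pdo): array
{
    $se = ['id' => [], 'name' => [], 'regex' => []];
    $stmt = $pdo->query('SELECT se_id, se_name, se_regex FROM search_engines');
    while ($row = $stmt->fetch(PDO::FETCH_OBJ)) {
        $se['id'][]    = $row->se_id;
        $se['name'][]  = $row->se_name;
        $se['regex'][] = $row->se_regex;
    }
    return $se;
}

// Returns the same structure as $refs above: a flat list of referer strings.
function load_referers(PDO $pdo): array
{
    // FETCH_COLUMN returns the first column of every row as a plain list.
    return $pdo->query('SELECT view_ref FROM view_log')->fetchAll(PDO::FETCH_COLUMN);
}
```

With these in place, $se = load_search_engines($pdo); and $refs = load_referers($pdo); would stand in for the two loops above, and the rest of the script stays the same.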

Before we extract the keywords from the referers, let me explain how to form a regular expression that grabs the keyword.
Suppose someone searches for the keyword “ruturaj” on google.com.
The URL of the results page on which Google lists my page will be
http://www.google.co.in/search?hl=en&q=ruturaj&btnG=Google+Search&meta=
This will also be the referer.

The most important part of this URL is the string “q=ruturaj”, from which we want the keyword “ruturaj”.
Let us start…

/ Pattern starts.
.*google.* Allow any characters around the string “google”.
?q= The characters prefixing the keyword. Note that the unescaped ? does not match a literal question mark here; it makes the preceding .* lazy, so matching stops at the first “q=” after “google”.
([^&]*) Start a capturing group that takes any run of characters except &, because & marks the start of the next GET parameter.
.* Allow any trailing characters.
/i End the pattern; the i flag makes it case-insensitive.

So the final pattern would be…
/.*google.*?q=([^&]*).*/i
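
To check the pattern, here is a small sketch that applies it to the example referer from above. The second URL and the stricter pattern in $strict are my own illustrative assumptions, not part of the original script; they show what happens when another parameter name happens to end in “q”.

```php
<?php
// The pattern built above, applied to the example referer from this post.
$pattern = '/.*google.*?q=([^&]*).*/i';
$referer = 'http://www.google.co.in/search?hl=en&q=ruturaj&btnG=Google+Search&meta=';

$keyword = null;
if (preg_match($pattern, $referer, $matches)) {
    // $matches[0] is the whole match, $matches[1] is the first capturing group.
    $keyword = $matches[1];
}
echo $keyword, "\n"; // prints "ruturaj"

// Hypothetical referer: "aq=f" also contains "q=", so the lazy .*? stops there.
$tricky = 'http://www.google.com/search?aq=f&q=php';
preg_match($pattern, $tricky, $loose);
echo $loose[1], "\n"; // prints "f"

// Assumed stricter variant: require ? or & directly before "q=".
$strict = '/google.*?[?&]q=([^&]*)/i';
preg_match($strict, $tricky, $tight);
echo $tight[1], "\n"; // prints "php"
```

Whether the stricter form is worth storing in se_regex depends on the referers you actually see; the original pattern is enough for the example URL.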

We now have the search engine attributes and the referers. It is time to apply each search engine's regular expression to the referers and grab the keyword.

$keywords = array(); // keyword (still URL-encoded) => number of hits
$se_cnt = array();   // search engine name => number of hits

for ($i=0; $i<count($refs); $i++) {
   for ($j=0; $j<count($se['id']); $j++) {
       // Try each search engine's regex against the referer
       if(preg_match($se['regex'][$j], $refs[$i], $matches))
       {
           // $matches[1] is the captured keyword; lowercase it so "PHP" and "php" count together
           $k = strtolower($matches[1]);
           if ( !isset($keywords[$k]) ) { // first time we see this keyword
               $keywords[$k] = 1;
           } else {
               $keywords[$k] += 1;
           }
           if ( !isset( $se_cnt[$se['name'][$j]] ) ) { // first hit for this search engine
               $se_cnt[$se['name'][$j]] = 1;
           } else {
               $se_cnt[$se['name'][$j]] += 1;
           }
           //echo "<p>$refs[$i]<br/><b>{$se['name'][$j]}</b> - " . urldecode($matches[1]) . '</p>';
           //echo "<b>{$se['name'][$j]}</b> - " . urldecode($matches[1]) . '<br/>';
           break; // a referer belongs to one search engine only, so stop at the first match
       }
   }
}
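
The same counting logic can be tried without a database. The sketch below is an assumption-laden rewrite for illustration: count_keywords() is a hypothetical helper, the engines are passed as a name => regex map instead of the $se structure, the Yahoo pattern and the sample referers are made up, and the syntax needs PHP 7.1 or later.

```php
<?php
/**
 * Count keywords and search-engine hits for a list of referers.
 * $engines maps an engine name to its regex; capture group 1 must hold the keyword.
 */
function count_keywords(array $referers, array $engines): array
{
    $keywords = [];
    $seCount  = [];
    foreach ($referers as $ref) {
        foreach ($engines as $name => $regex) {
            if (preg_match($regex, $ref, $m)) {
                $k = strtolower($m[1]);
                $keywords[$k]   = ($keywords[$k] ?? 0) + 1;
                $seCount[$name] = ($seCount[$name] ?? 0) + 1;
                break; // the first matching engine wins
            }
        }
    }
    return [$keywords, $seCount];
}

// Hypothetical sample data, for illustration only.
$engines = [
    'Google' => '/.*google.*?q=([^&]*).*/i',
    'Yahoo'  => '/.*yahoo.*?p=([^&]*).*/i',
];
$referers = [
    'http://www.google.co.in/search?hl=en&q=ruturaj&btnG=Google+Search&meta=',
    'http://www.google.com/search?q=Ruturaj',
    'http://search.yahoo.com/search?p=php+regex',
    'http://example.com/some/page', // not a search engine, so it is ignored
];

[$keywords, $seCount] = count_keywords($referers, $engines);
print_r($keywords);
print_r($seCount);
```

Note that the keywords are still URL-encoded at this point (“php+regex”), exactly as in the loop above; decoding happens in the last step.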

Sort both arrays in descending order of count while keeping each count attached to its key.

arsort($keywords);
arsort($se_cnt);

Grab the keys and the values into separate arrays…

$query_term = array_keys($keywords);
$query_term_cnt = array_values($keywords);

Finally, strip the URL encoding from the keywords.

$final = array();
for ($i=0; $i<count($query_term); $i++) {
    $final[] = array(urldecode($query_term[$i]), $query_term_cnt[$i]);
}
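
The last three steps (sorting, splitting keys from values, and decoding) can be seen together in this short sketch. The counts in $keywords are hypothetical sample values, and the foreach loop is simply a compact alternative to the array_keys()/array_values() pair used above.

```php
<?php
// Hypothetical counts keyed by the still URL-encoded keyword.
$keywords = ['php+regex' => 1, 'ruturaj' => 3, 'mysql%20tips' => 2];

// Sort by count, descending, keeping each count attached to its keyword.
arsort($keywords);

// Build [decoded keyword, count] pairs in sorted order.
$final = [];
foreach ($keywords as $term => $count) {
    // urldecode() turns both "+" and "%20" into a space.
    $final[] = [urldecode($term), $count];
}

print_r($final);
```

Because decoding happens after counting, two encodings of the same phrase (for example “a+b” and “a%20b”) are counted as separate keywords; decoding before counting would merge them.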
