We have our tables in place.
It's time to get on with the code.
Let's get all the search engines and their attributes first.
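// Load every search engine and its attributes into parallel arrays.
// The keys are initialised up front so the later count() calls work even if the table is empty.
$res_se = mysql_query('select * from search_engines', $conn);
$se = array('id' => array(), 'name' => array(), 'regex' => array());
while ($row_se = mysql_fetch_object($res_se)) {
    $se['id'][]    = $row_se->se_id;
    $se['name'][]  = $row_se->se_name;
    $se['regex'][] = $row_se->se_regex;
}
mysql_free_result($res_se);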
Let's collect our referers.
$sql = 'select view_ref from view_log';
$res = mysql_query($sql, $conn);
$refs = array();
while ($row = mysql_fetch_object($res)) {
    $refs[] = $row->view_ref;
}
mysql_free_result($res);
Before we extract the keywords from the referers, let me explain how to form a regular expression that grabs the keyword.
Suppose someone searches for the keyword “ruturaj” on Google.
The URL of the results page on which Google lists my page will be
http://www.google.co.in/search?hl=en&q=ruturaj&btnG=Google+Search&meta=
This will also be the referer.
The most important part of this URL is the string “q=ruturaj”, and then the keyword “ruturaj” from that string.
Let us start…
/ | Pattern starts (opening delimiter) |
.*google | Allow any characters before the string “google” |
.*? | Allow any characters after “google”, but as few as possible. The ? here makes .* lazy; it is not a literal question mark |
q= | The characters that prefix the keyword |
([^&]*) | Start a capturing group with a negated character class. It allows any characters except &, which marks the start of the next GET parameter |
.* | Allow any trailing characters |
/i | End the pattern, specifying that it is case-insensitive |
So the final pattern would be… /.*google.*?q=([^&]*).*/i
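As a quick check, the pattern can be run against the example referer with preg_match(). This is only a minimal sketch: the variable names are made up for illustration, and the stricter pattern at the end is an assumption of mine, not something stored in the search_engines table. It is there because the q= in the pattern is not tied to a preceding ? or &, so a parameter such as oq= that appears earlier in the URL would be picked up first.

```php
<?php
// The pattern built up piece by piece above.
$pattern = '/.*google.*?q=([^&]*).*/i';

// The example referer from this article.
$referer = 'http://www.google.co.in/search?hl=en&q=ruturaj&btnG=Google+Search&meta=';

$keyword = null;
if (preg_match($pattern, $referer, $matches)) {
    // $matches[0] holds the whole match, $matches[1] the first capturing group.
    $keyword = $matches[1];
}
echo $keyword, "\n"; // ruturaj

// Assumption: a stricter variant that insists on ? or & directly before q=.
$strict = '/google.*?[?&]q=([^&]*)/i';
$tricky = 'http://www.google.com/search?oq=old&q=new';

preg_match($pattern, $tricky, $loose_match);
preg_match($strict, $tricky, $strict_match);
echo $loose_match[1], ' vs ', $strict_match[1], "\n"; // old vs new
```

Whether the stricter form is worth it depends on the referers you actually see in your log.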
We now have the search engine attributes and the referers, so it is time to apply each search engine's regular expression to the referers and grab the keyword.
$keywords = array();   // keyword => number of hits
$keyword = '';
$keyword_count = 0;
$se_cnt = array();     // search engine name => number of hits
for ($i=0; $i<count($refs); $i++) {
    for ($j=0; $j<count($se['id']); $j++) {
        if (preg_match($se['regex'][$j], $refs[$i], $matches)) {
            $k = strtolower($matches[1]);
            if ( !isset($keywords[$k]) ) {
                // first time we see this keyword
                $keywords[$k] = 1;
            } else {
                $keywords[$k] += 1;
            }
            if ( !isset( $se_cnt[$se['name'][$j]] ) ) {
                // first hit for this search engine
                $se_cnt[$se['name'][$j]] = 1;
            } else {
                $se_cnt[$se['name'][$j]] += 1;
            }
            //echo "<p>$refs[$i]<br/><b>{$se['name'][$j]}</b> - " . urldecode($matches[1]) . '</p>';
            //echo "<b>{$se['name'][$j]}</b> - " . urldecode($matches[1]) . '<br/>';
            break; // stop at the first search engine that matches
        }
    }
}
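To try the counting logic without a database, here is a small self-contained sketch. Everything in it is hypothetical sample data: the $engines array stands in for the search_engines table, the Yahoo pattern (with p= as the query parameter) is my assumption, and count_keywords() is just a name I picked for illustration.

```php
<?php
// Hypothetical stand-ins for the search_engines and view_log tables.
$engines = array(
    array('name' => 'Google', 'regex' => '/.*google.*?q=([^&]*).*/i'),
    array('name' => 'Yahoo',  'regex' => '/.*yahoo.*?p=([^&]*).*/i'), // assumed pattern
);
$refs = array(
    'http://www.google.co.in/search?hl=en&q=ruturaj&btnG=Google+Search&meta=',
    'http://www.google.com/search?q=Ruturaj',
    'http://search.yahoo.com/search?p=php+tips',
    'http://example.com/some/page', // not a search engine, so it is ignored
);

// Returns array(keyword counts, search engine counts).
function count_keywords(array $refs, array $engines)
{
    $keywords = array();
    $se_cnt = array();
    foreach ($refs as $ref) {
        foreach ($engines as $engine) {
            if (preg_match($engine['regex'], $ref, $matches)) {
                // Lowercase so that "Ruturaj" and "ruturaj" count as one keyword.
                $k = strtolower($matches[1]);
                $keywords[$k] = isset($keywords[$k]) ? $keywords[$k] + 1 : 1;
                $name = $engine['name'];
                $se_cnt[$name] = isset($se_cnt[$name]) ? $se_cnt[$name] + 1 : 1;
                break; // the first matching search engine wins
            }
        }
    }
    return array($keywords, $se_cnt);
}

list($keywords, $se_cnt) = count_keywords($refs, $engines);
print_r($keywords); // ruturaj => 2, php+tips => 1
print_r($se_cnt);   // Google => 2, Yahoo => 1
```

Note that the keywords are still URL-encoded at this point (php+tips), exactly as in the loop above.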
Sort the arrays in descending order of count, keeping each key attached to its value.
arsort($keywords);
arsort($se_cnt);
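The reason for arsort() rather than rsort() is that rsort() throws the keys away, and here the keys are the keywords themselves. A minimal sketch with made-up counts shows the effect:

```php
<?php
// Hypothetical counts, in the shape the matching loop produces.
$keywords = array('php+tips' => 1, 'ruturaj' => 5, 'regex' => 3);

// arsort() sorts by value, highest first, and keeps each key with its value.
arsort($keywords);

$terms  = array_keys($keywords);   // keywords in ranked order
$counts = array_values($keywords); // the matching counts, same order

print_r($terms);  // ruturaj, regex, php+tips
print_r($counts); // 5, 3, 1
```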
Grab the keys (the keywords) and the values (their counts) into separate arrays…
$query_term = array_keys($keywords);
$query_term_cnt = array_values($keywords);
Finally, strip the URL encoding from each keyword.
$final = array();
for ($i=0; $i<count($query_term); $i++) {
    $final[] = array(urldecode($query_term[$i]), $query_term_cnt[$i]);
}
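If you do not need the two intermediate arrays, the same list can probably be built in one pass with foreach. This is only a sketch of an alternative, using made-up sample counts:

```php
<?php
// Hypothetical sorted counts; the keys are still URL-encoded.
$keywords = array('php+tips' => 3, 'search%20engine' => 2, 'ruturaj' => 1);

$final = array();
foreach ($keywords as $term => $count) {
    // urldecode() turns + into a space and %XX sequences into their characters.
    $final[] = array(urldecode($term), $count);
}

print_r($final); // php tips => 3, search engine => 2, ruturaj => 1
```

Each entry of $final is a pair of decoded keyword and count, in the same order as the sorted input.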