Article:   Building a Simple Search Engine with PHP
Subject:   Simplifying the query
Date:   2007-01-25 15:45:35
From:   pjdevitt
This seems obvious to me, but why are you storing every occurrence of a keyword within a document? A simpler solution would be to count the number of occurrences of each word in the PHP code and write that value to the database. That would remove the GROUP BY used in most of your queries. The occurrence table would need to be modified to include a 'count' field. Here's a snippet of PHP that creates an array of words and the number of times each one occurs in the document.



<code>
// $buf (the document text) and $filterWords (the stop-word list)
// are assumed to be set earlier in the indexing script.
$wordbank = array();

preg_match_all('/\b\w+\b/', $buf, $words);
foreach ($words[0] as $word) {
    $cur_word = addslashes(strtolower($word));
    if (!in_array($cur_word, $filterWords)) {
        if (!isset($wordbank[$cur_word])) {
            $wordbank[$cur_word] = 0;
        }
        $wordbank[$cur_word]++;
    }
}
</code>


You can then iterate through the $wordbank array and add a new record to the occurrence table for each word.
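For what it's worth, here is a rough sketch of that last step. It is not taken from the article: the helper name saveWordCounts, the simplified table layout (occurrence with page_id, word and count columns) and the use of PDO with an in-memory SQLite database are all my own assumptions, so adapt the names to whatever your schema actually uses. It uses a prepared statement, so the words need no manual escaping here.

```php
<?php
// Hypothetical helper: writes one row per distinct word together with its count.
// The table and column names (occurrence, page_id, word, count) are assumptions.
function saveWordCounts(PDO $db, $pageId, array $wordbank)
{
    $stmt = $db->prepare(
        'INSERT INTO occurrence (page_id, word, count) VALUES (?, ?, ?)'
    );
    foreach ($wordbank as $word => $count) {
        // PHP turns numeric-looking array keys into integers, so cast back to string.
        $stmt->execute(array($pageId, (string) $word, $count));
    }
}

// Usage with a throwaway in-memory database (assumes the pdo_sqlite extension).
$db = new PDO('sqlite::memory:');
$db->setAttribute(PDO::ATTR_ERRMODE, PDO::ERRMODE_EXCEPTION);
$db->exec('CREATE TABLE occurrence (page_id INTEGER, word TEXT, count INTEGER)');

saveWordCounts($db, 1, array('php' => 3, 'search' => 2));
saveWordCounts($db, 2, array('php' => 1));

// Ranking pages for a keyword now reads the stored count directly, no GROUP BY.
$stmt = $db->prepare(
    'SELECT page_id, count FROM occurrence WHERE word = ? ORDER BY count DESC'
);
$stmt->execute(array('php'));
$results = $stmt->fetchAll(PDO::FETCH_ASSOC);
```

With one row per word per page, the search query only has to filter on the word and sort by the stored count, which is the saving I was getting at above.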