|
Hi All,
Thanks for the code - it is great.
I just want to let you know about my edits and how I handled multiple keywords.
1. First of all, I put the keyword extraction script in a function.
I use it to index the titles and keywords of pages as well as the main body text. I call the function three times for words in a title, twice for keywords and once for main body text. This is a way of scoring a page.
E.g. when a search is done on 'business' and a page title has the word 'business' in it, that page will show 3 occurrences (although I don't display occurrences, I just use the score to order the list).
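To make the weighting concrete, here is a rough sketch of the idea rather than my actual indexing script. It folds the three passes into one helper. The function name score_page_words(), the $weights array and the simple tokenizer are all made up for illustration:

```php
<?php
// Hypothetical helper: every occurrence of a word adds a weight that depends
// on where the word was found (title, keywords or body).
function score_page_words($title, $keywords, $body)
{
    // Weights from the description above: title x3, keywords x2, body x1.
    $weights = array('title' => 3, 'keywords' => 2, 'body' => 1);
    $fields  = array('title' => $title, 'keywords' => $keywords, 'body' => $body);
    $scores  = array();

    foreach ($fields as $field => $text) {
        // Assumed tokenizer: lowercase, then split on anything that is not a letter or digit.
        $words = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
        foreach ($words as $word) {
            if (!isset($scores[$word])) {
                $scores[$word] = 0;
            }
            $scores[$word] += $weights[$field];
        }
    }
    return $scores;
}

$scores = score_page_words('Business news', 'business, finance', 'Local business is growing');
echo $scores['business']; // title (3) + keywords (2) + body (1) = 6
```

Calling the extraction function three times for the title has the same effect as the weight of 3 here; the single weighted pass just avoids inserting the same rows repeatedly.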
2. Multiple keywords.
Basically, I use the explode() function to get an array of keywords and loop through them, running the query once per keyword. I keep the scores for each word and add them together before displaying the list ordered by highest score.
//CODE
/* Get timestamp before executing the query: */
$start_time = getmicrotime();

/* Split the search string into individual keywords: */
$keyword_array = explode(" ", $_GET['keyword']);
$score = array();

foreach ($keyword_array as $keyword)
{
    /* Skip empty strings produced by repeated spaces: */
    if ($keyword === '') { continue; }

    /* Escape $keyword to minimize the risk of executing
     * unwanted SQL commands: */
    $keyword = mysql_real_escape_string($keyword);

    /* Execute the query that performs the actual search in the DB: */
    $result = mysql_query("SELECT p.page_title AS title,
                                  COUNT(*) AS occurrences
                           FROM pages p, word w, occurrence o
                           WHERE p.pageID = o.page_id AND
                                 w.word_id = o.word_id AND
                                 w.word_word = \"$keyword\"
                           GROUP BY p.pageID
                           ORDER BY occurrences DESC
                           LIMIT 0, 5");

    while ($row = mysql_fetch_array($result))
    {
        /* Start each title at 0 so += never reads an undefined index: */
        if (!isset($score[$row['title']])) { $score[$row['title']] = 0; }
        $score[$row['title']] += $row['occurrences']; // Array of scores
    }
}

if (count($score) > 0)
{
    arsort($score); // Sort the scores from highest to lowest, keeping the titles as keys

    /* Get timestamp when the query is finished: */
    $end_time = getmicrotime();

    /* Present the search-results: */
    print "<h2>Search results for '".htmlspecialchars($_GET['keyword'])."':</h2>\n";

    // Loop through the array and display the results
    foreach ($score as $title => $total)
    {
        echo $title;
        echo " - ";
        echo $total;
        echo " ";
    }

    /* Present how long it took to execute the query: */
    print "query executed in ".(substr($end_time-$start_time,0,5))." seconds.";
}
else
{
    // Display a no pages found page
}
// END CODE
This works fine, but because it runs one query per keyword it gets a little slower when the user searches for a whole sentence. All in all, it is an easy add-on to the supplied code that provides multiple-keyword searching.
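If the per-keyword queries become a problem, one possible workaround is to send all keywords in a single query using IN (...). I have not tested this against the real tables, so treat it as a sketch. The helper name build_search_query() and the use of "?" placeholders (as accepted by PDO prepared statements) are my assumptions, and the LIMIT now applies to the combined result instead of to each keyword:

```php
<?php
// Hypothetical helper: builds ONE query for all keywords instead of one query
// per keyword. Table and column names are taken from the code above.
function build_search_query(array $keywords)
{
    // Drop empty strings left behind by repeated spaces.
    $keywords = array_values(array_filter($keywords, 'strlen'));
    if (count($keywords) === 0) {
        return null;
    }

    // One "?" placeholder per keyword, e.g. "?, ?, ?".
    $placeholders = implode(', ', array_fill(0, count($keywords), '?'));

    $sql = "SELECT p.page_title AS title,
                   COUNT(*) AS occurrences
            FROM pages p, word w, occurrence o
            WHERE p.pageID = o.page_id AND
                  w.word_id = o.word_id AND
                  w.word_word IN ($placeholders)
            GROUP BY p.pageID
            ORDER BY occurrences DESC
            LIMIT 0, 5";

    return array($sql, $keywords);
}

list($sql, $params) = build_search_query(explode(' ', 'good web  sites'));
// With a PDO connection in $pdo (assumed, not part of the original code),
// the query could then be run as:
//   $stmt = $pdo->prepare($sql);
//   $stmt->execute($params);
```

Here COUNT(*) already adds up the occurrences of every keyword per page, so the scores no longer need to be summed in PHP.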
Hope this helps someone!
|
Just a small edit to the code above.
If a user searched for "good web sites" and a page contained hundreds of occurrences of 'good' but no 'web' or 'sites', it would rank higher than a page that contains all three. This is not what we want, so amend the above code with this part:
while ($row = mysql_fetch_array($result))
{
    if (!isset($score[$row['title']])) { $score[$row['title']] = 0; }
    $score[$row['title']] += $row['occurrences']; // Array of scores
    if ($row['occurrences'] > 0) { $score[$row['title']] += 1000; } // Each matched keyword adds a bonus, so pages containing all keywords rank highest
}
You can change the 1000 to whatever you like, as long as it stays larger than the total number of occurrences a single page is likely to score; you should be safe with that number.
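To see the effect of the bonus without a database, here is a small stand-alone sketch. The function combine_scores() and the sample numbers are invented for illustration; the nested array stands in for the rows each per-keyword query would return:

```php
<?php
// Hypothetical helper: merges per-keyword results into one score per page.
// $results_per_keyword maps keyword => (page title => occurrences).
function combine_scores(array $results_per_keyword, $bonus = 1000)
{
    $score = array();
    foreach ($results_per_keyword as $keyword => $rows) {
        foreach ($rows as $title => $occurrences) {
            if (!isset($score[$title])) {
                $score[$title] = 0;
            }
            // Occurrences plus a fixed bonus for every keyword the page matches.
            $score[$title] += $occurrences + $bonus;
        }
    }
    arsort($score); // highest score first, titles kept as keys
    return $score;
}

$score = combine_scores(array(
    'good'  => array('Page A' => 300, 'Page B' => 2),
    'web'   => array('Page B' => 1),
    'sites' => array('Page B' => 1),
));
print_r($score);
```

Page B matches all three keywords and ends up with 3004, while Page A, with 300 occurrences of 'good' only, gets 1300, so Page B is listed first.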