Article:   Building a Simple Search Engine with PHP
Subject:   RE: multiple keywords
Date:   2004-09-21 10:57:36
From:   cityslicker
Response to: RE: multiple keywords

Hi All,


Thanks for the code - it is great.


I just want to let you know about my edits and how I handled multiple keywords.


1. I put the keyword extraction script in a function.


I use it on page titles and keywords as well as the main body text. I call the function three times for the words in a title, twice for the keywords and once for the main body text. This is a way of scoring a page.


E.g. when a search is done on 'business' and a page title has the word 'business' in it, that page will show 3 occurrences (although I don't display the occurrences, I just use the score to order the list).
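To make the scoring idea concrete, here is a minimal in-memory sketch. It is not the indexing function from the article: add_weighted_words() is a hypothetical helper, and instead of calling an extraction function three times it adds a weight of 3, 2 or 1 per occurrence, which should give the same totals.

```php
<?php
// Hypothetical helper: counts how often each word occurs in $text and
// adds $weight points per occurrence to the running $scores array.
function add_weighted_words(array $scores, $text, $weight)
{
    // Lower-case the text and split on anything that is not a letter or digit.
    $words = preg_split('/[^a-z0-9]+/', strtolower($text), -1, PREG_SPLIT_NO_EMPTY);
    foreach ($words as $word) {
        if (!isset($scores[$word])) {
            $scores[$word] = 0;
        }
        $scores[$word] += $weight;
    }
    return $scores;
}

// Weights as described in the post: title 3, keywords 2, body text 1.
// The sample texts are made up for illustration.
$scores = array();
$scores = add_weighted_words($scores, 'Business news', 3);
$scores = add_weighted_words($scores, 'business, finance', 2);
$scores = add_weighted_words($scores, 'Local business owners met today.', 1);

echo $scores['business'], "\n"; // 3 + 2 + 1 = 6
```

A word that appears only in the body text scores 1, so title and keyword matches dominate the ordering.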


2. Multiple keywords.


Basically, I use the explode() function to get an array of keywords and loop through them, running the query once for each. I keep the scores for each word and add them together before displaying the list ordered by highest score.


//CODE


/* Get timestamp before executing the query: */
$start_time = getmicrotime();

$keyword_array = explode(" ", $_GET['keyword']);

$score = array();
foreach($keyword_array as $keyword)
{


/* Use addslashes() on $keyword to minimize the risk of
* executing unwanted SQL commands: */
$keyword = addslashes($keyword);


/* Execute the query that performs the actual search in the DB: */
$result = mysql_query(" SELECT p.page_title AS title,
COUNT(*) AS occurrences
FROM pages p, word w, occurrence o
WHERE p.pageID = o.page_id AND
w.word_id = o.word_id AND
w.word_word = \"$keyword\"
GROUP BY p.pageID
ORDER BY occurrences DESC
LIMIT 0, 5" );



for( $i = 1; $row = mysql_fetch_array($result); $i++ )
{


$score[$row['title']] += $row['occurrences']; //Array of scores

}
}


if(count($score) > 0)
{
arsort($score); //Reverse sort the associative array scores by highest

/* Get timestamp when the query is finished: */
$end_time = getmicrotime();


/* Present the search-results: */
print "<h2>Search results for '".htmlspecialchars($_GET['keyword'])."':</h2>\n";

while ($element = each($score)) //Loop through array and output results
{

echo $element[ "key" ];
echo " - ";
echo $element[ "value"];
echo "<br />\n";

}


/* Present how long it took to execute the queries: */
print "query executed in ".(substr($end_time-$start_time,0,5))." seconds.";
}
else
{

//Display a no pages found page


}



// END CODE


This works fine but is a little slower when the user searches for a whole sentence, because it runs one query per word. All in all, it is an easy add-on to the already supplied code that provides multiple keyword searching.
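One possible way to avoid a query per word is to send all keywords in a single statement with IN (...). The sketch below only builds the SQL string; build_multi_keyword_query() and the "matched" column are hypothetical names, the table and column names are copied from the code above, and whether this is actually faster depends on your indexes.

```php
<?php
// Hypothetical helper: builds one SQL statement for all keywords instead of
// running one query per keyword. Table and column names follow the code above.
function build_multi_keyword_query(array $keywords, $limit = 5)
{
    $quoted = array();
    foreach ($keywords as $keyword) {
        $keyword = trim($keyword);
        if ($keyword === '') {
            continue; // skip empty strings caused by repeated spaces
        }
        // addslashes() mirrors the code above; the escaping function or
        // prepared statements of your database extension are the safer choice.
        $quoted[] = "'" . addslashes($keyword) . "'";
    }
    if (count($quoted) === 0) {
        return null; // nothing to search for
    }
    return "SELECT p.page_title AS title,
                   COUNT(*) AS occurrences,
                   COUNT(DISTINCT w.word_id) AS matched
            FROM pages p, word w, occurrence o
            WHERE p.pageID = o.page_id AND
                  w.word_id = o.word_id AND
                  w.word_word IN (" . implode(', ', $quoted) . ")
            GROUP BY p.pageID, p.page_title
            ORDER BY matched DESC, occurrences DESC
            LIMIT 0, " . (int) $limit;
}

echo build_multi_keyword_query(explode(' ', 'good web  sites'));
```

Here occurrences is the total across all keywords, and matched counts how many distinct keywords a page contains, so pages matching more of the keywords are listed first.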


Hope this helps someone!


  • RE: multiple keywords EDIT
    2004-09-21 11:10:36  cityslicker

    Hi again,

    Just a small edit from the code above.

    If a user searched for "good web sites" and a page contained hundreds of occurrences of 'good' but no 'web' or 'sites', it would rank higher than a page that contains all three. This is not what we want, so amend the above code with this part:

    for( $i = 1; $row = mysql_fetch_array($result); $i++ )
    {

    $score[$row['title']] += $row['occurrences']; //Array of scores
    if($row['occurrences'] > 0) { $score[$row['title']] += 1000; } //Bonus per matched keyword, so pages containing all keywords rank highest
    }

    You can set the 1000 to whatever you like, as long as it is larger than the highest total of occurrences a single page can reach. You should be safe with that number.
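As a rough illustration of the effect, the sketch below replays the scoring loop on made-up data instead of database rows. The $results_per_keyword array and the page titles are hypothetical; only the scoring lines come from the post.

```php
<?php
// Hypothetical per-keyword results: page title => occurrences, standing in
// for the rows each query in the loop would return.
$results_per_keyword = array(
    'good'  => array('Page A' => 300, 'Page B' => 2),
    'web'   => array('Page B' => 1),
    'sites' => array('Page B' => 4),
);

$bonus = 1000; // same constant as in the post
$score = array();
foreach ($results_per_keyword as $keyword => $rows) {
    foreach ($rows as $title => $occurrences) {
        if (!isset($score[$title])) {
            $score[$title] = 0; // avoid an undefined-index notice
        }
        $score[$title] += $occurrences;
        if ($occurrences > 0) {
            $score[$title] += $bonus; // one bonus per matched keyword
        }
    }
}

arsort($score); // highest score first
print_r($score);
```

Page B matches all three keywords and ends up with 3007 points, ahead of Page A with 1300, even though Page A contains 'good' far more often.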