Article:
  Building a Simple Search Engine with PHP
Subject:   RE: multiple keywords
Date:   2004-09-21 10:57:36
From:   cityslicker
Response to: RE: multiple keywords

Hi All,


Thanks for the code - it is great.


Just to let you know about my edits and how I handled multiple keywords.


1. First of all, I put the keyword extraction script in a function.


I use it to index the titles and keywords of pages as well as the main body text. I call the function three times for words in a title, twice for keywords and once for main body text. This is a way of scoring a page.


E.g. when a search is done on 'business' and a page title contains the word 'business', the title alone counts as 3 occurrences (I don't display the occurrences, I just use the score to order the list).
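
As a rough sketch of the weighting described above (not the actual indexing script: the function names count_word() and page_score() and the sample text are made up for illustration), counting a word three times in the title, twice in the keywords and once in the body could look like this:

```php
<?php
// Hypothetical helper: count whole-word, case-insensitive occurrences of $word in $text.
function count_word($text, $word)
{
    $words  = str_word_count(strtolower($text), 1); // format 1 returns an array of the words found
    $counts = array_count_values($words);
    $key    = strtolower($word);
    return isset($counts[$key]) ? $counts[$key] : 0;
}

// Assumed weights, taken from the description above: title x3, keywords x2, body x1.
function page_score($title, $keywords, $body, $word)
{
    return 3 * count_word($title, $word)
         + 2 * count_word($keywords, $word)
         + 1 * count_word($body, $word);
}

// One hit in each of title, keywords and body.
echo page_score('Business news', 'business, finance', 'Small business tips', 'business');
```

In the real setup the same effect comes from inserting the occurrence rows several times, so the COUNT(*) in the search query already carries the weight.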


2. Multiple keywords.


Basically, I use the explode() function to get an array of keywords and loop through them, running the query once per keyword. I keep the scores for each word and add them together before displaying the list ordered by highest score.


//CODE


/* Get timestamp before executing the query: */
$start_time = getmicrotime();

$keyword_array = explode(" ", $_GET['keyword']);

$score = array();
foreach($keyword_array as $keyword)
{


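/* Set $keyword, and use addslashes() to
* minimize the risk of executing unwanted SQL commands: */
$keyword = addslashes(trim($keyword));
if($keyword == '') { continue; } //Skip empty strings caused by repeated spaces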


/* Execute the query that performs the actual search in the DB: */
$result = mysql_query(" SELECT p.page_title AS title,
COUNT(*) AS occurrences
FROM pages p, word w, occurrence o
WHERE p.pageID = o.page_id AND
w.word_id = o.word_id AND
w.word_word = \"$keyword\"
GROUP BY p.pageID
ORDER BY occurrences DESC
LIMIT 0, 5" );



for( $i = 1; $row = mysql_fetch_array($result); $i++ )
{


if(!isset($score[$row['title']])) { $score[$row['title']] = 0; } //Start each page at zero
$score[$row['title']] += $row['occurrences']; //Array of scores

}
}


if(count($score) > 0)
{
arsort($score); //Reverse sort the associative array scores by highest

/* Get timestamp when the query is finished: */
$end_time = getmicrotime();


/* Present the search-results: */
print "<h2>Search results for '".htmlspecialchars($_GET['keyword'])."':</h2>\n";

foreach ($score as $title => $total) //Loop through array and output results
{

echo $title;
echo " - ";
echo $total;
echo "<br />\n";

}


/* Present how long it took to execute the query: */
print "query executed in ".(substr($end_time-$start_time,0,5))." seconds.";
}
else
{

//Display a no pages found page


}



// END CODE


This works fine but gets a little slower when the user searches for a whole sentence, since it runs one query per word. All in all, it is an easy add-on to the already supplied code that provides multiple keyword searching.
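
One possible way around the per-word slowdown, sketched here as an assumption rather than something I have run against the article's database, is to build a single query with an IN list. The helper name build_search_query() is made up; the table and column names are the ones from the query above. Note that this ranks by total occurrences across all words and applies the LIMIT once, so the ordering can differ from the per-keyword loop.

```php
<?php
// Hypothetical helper: build one query for all keywords instead of one query per keyword.
function build_search_query($search)
{
    // Split on whitespace, drop empty strings and duplicate words.
    $keywords = array_unique(preg_split('/\s+/', trim($search), -1, PREG_SPLIT_NO_EMPTY));
    if (count($keywords) == 0) {
        return null; // nothing to search for
    }

    $quoted = array();
    foreach ($keywords as $keyword) {
        // addslashes() as in the code above; with a live connection
        // mysql_real_escape_string() would be the safer choice.
        $quoted[] = '"' . addslashes($keyword) . '"';
    }

    return 'SELECT p.page_title AS title, COUNT(*) AS occurrences'
         . ' FROM pages p, word w, occurrence o'
         . ' WHERE p.pageID = o.page_id AND w.word_id = o.word_id'
         . ' AND w.word_word IN (' . implode(', ', $quoted) . ')'
         . ' GROUP BY p.pageID ORDER BY occurrences DESC LIMIT 0, 5';
}

echo build_search_query('good web sites');
```

The returned string would be passed to mysql_query() in place of the query inside the foreach loop.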


Hope this helps someone!

  • RE: multiple keywords EDIT
    2004-09-21 11:10:36  cityslicker

    Hi again,

    Just a small edit from the code above.

    If a user searched for "good web sites" and a page contained hundreds of occurrences of 'good' but no 'web' or 'sites', it would rank higher than a page that contains all three. This is not what we want, so amend the above code with this part:

    for( $i = 1; $row = mysql_fetch_array($result); $i++ )
    {

    $score[$row['title']] += $row['occurrences']; //Array of scores
    if($row['occurences'] > 0) { $score[$row['title']] += 1000; } //This makes pages containing all keywords rank highest
    }

    You can set the 1000 to whatever you like but you should be safe with that number.
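
    To see the effect of the bonus without a database, here is a small sketch with made-up data: the array $results_per_keyword and the page names are hypothetical stand-ins for the rows each query would return.

```php
<?php
// Hypothetical per-keyword results: page title => occurrences.
$results_per_keyword = array(
    'good'  => array('Page A' => 300, 'Page B' => 2),
    'web'   => array('Page B' => 1),
    'sites' => array('Page B' => 4),
);

$bonus = 1000; // same value as in the snippet above
$score = array();

foreach ($results_per_keyword as $keyword => $rows) {
    foreach ($rows as $title => $occurrences) {
        if (!isset($score[$title])) {
            $score[$title] = 0;
        }
        $score[$title] += $occurrences;
        if ($occurrences > 0) {
            $score[$title] += $bonus; // one bonus per keyword the page matches
        }
    }
}

arsort($score); // highest score first
print_r($score);
```

    Page B matches all three words and ends up with 3007, ahead of Page A with 1300, even though Page A has far more raw occurrences. The bonus only needs to be larger than the highest occurrence count you expect for a single word.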

    • RE: multiple keywords EDIT
      2004-09-21 11:19:15  cityslicker

      Hi once more!!

      Make sure you spell occurrences correctly unlike in my code above!!

      • RE: multiple keywords EDIT
        2009-11-04 09:45:30  xoqqa

        Would you be so kind as to send me your version of this search engine please? I've been trying to figure out what is wrong with mine and noticed that yours is somewhat different. For example, I don't have page_title but just URLs... I think you've also modified the populate script.

        I would appreciate it if you could send me the script files at sammutmatu[at]gmail.com

        Thanks