|
Hi All,
Thanks for this useful search engine.
The accents ("à" in French, for example) are encoded as "&agrave;" by HTML editors.
To make your search engine handle them, I added the following lines to populate.php:
/* Foreign site: convert French character entities produced by HTML editors: */
$patterns[0] = "/&nbsp;/";
$patterns[1] = "/&agrave;/";
$patterns[2] = "/&acirc;/";
$patterns[3] = "/&eacute;/";
$patterns[4] = "/&egrave;/";
$patterns[5] = "/&ecirc;/";
$patterns[6] = "/&icirc;/";
$patterns[7] = "/&ugrave;/";
$patterns[8] = "/&ucirc;/";
$patterns[9] = "/&ccedil;/";
$patterns[10] = "/&oelig;/";
$patterns[11] = "/&euro;/";
$patterns[12] = "/&copy;/";
$replacements[0] = " ";
$replacements[1] = "à";
$replacements[2] = "â";
$replacements[3] = "é";
$replacements[4] = "è";
$replacements[5] = "ê";
$replacements[6] = "î";
$replacements[7] = "ù";
$replacements[8] = "û";
$replacements[9] = "ç";
$replacements[10] = "œ";
$replacements[11] = "€";
$replacements[12] = "©";
$buf = preg_replace($patterns, $replacements, $buf);
Insert them between this line:
$buf = ereg_replace('/&\w;/', '', $buf);
and this line:
/* Extract all words matching the regexp from the current line: */
It's no big deal, but it works and is easy to adapt to other languages.
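For anyone adapting this to another language, the same replacement could also be written as a single lookup table instead of two parallel arrays, so each entity sits next to its character. This is only a sketch: convert_entities() is a hypothetical name (it is not part of the search engine), and it assumes $buf holds one line of the page, as in populate.php, and that the file is saved as UTF-8.

```php
<?php
// Hypothetical helper, not part of the original populate.php.
// Each key is an HTML entity, each value the character it stands for.
function convert_entities($buf)
{
    $map = array(
        '&nbsp;'   => ' ',
        '&agrave;' => 'à',
        '&acirc;'  => 'â',
        '&eacute;' => 'é',
        '&egrave;' => 'è',
        '&ecirc;'  => 'ê',
        '&icirc;'  => 'î',
        '&ugrave;' => 'ù',
        '&ucirc;'  => 'û',
        '&ccedil;' => 'ç',
        '&oelig;'  => 'œ',
        '&euro;'   => '€',
        '&copy;'   => '©',
    );
    // strtr() with an array replaces every key by its value in one pass.
    return strtr($buf, $map);
}

echo convert_entities("d&eacute;j&agrave;&nbsp;vu"), "\n"; // prints "déjà vu"
```

To support another language, adding a line to $map should be enough.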
Regards,
Louis
http://www.interactive-trails.com
|
Space = & n b s p ;
à = & a g r a v e ;
â = & a c i r c ;
é = & e a c u t e ;
€ = & e u r o ;
and so on...
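For the "and so on" part, PHP's built-in html_entity_decode() already knows the full list of named entities, so a possible alternative to listing them by hand might look like the sketch below. The assumptions: decode_line() is a hypothetical name, the pages are UTF-8, PHP is 5.4 or later (for ENT_HTML401), and the non-breaking space should end up as a normal space. Note that this also decodes entities such as &amp;amp; and &amp;lt;, which may or may not be what you want before indexing.

```php
<?php
// Hypothetical helper, not part of the search engine.
function decode_line($buf)
{
    // Decode named and numeric entities from the HTML 4.01 table into UTF-8.
    $buf = html_entity_decode($buf, ENT_QUOTES | ENT_HTML401, 'UTF-8');
    // The non-breaking space entity decodes to U+00A0 (bytes C2 A0 in UTF-8);
    // turn it into an ordinary space so words split normally.
    return str_replace("\xC2\xA0", ' ', $buf);
}

echo decode_line("c&oelig;ur, 5&nbsp;&euro;"), "\n";
```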