|
Maybe a misspelling/typo operator, or maybe Soundex, or maybe a regex match.
I guess most operators would require that each token be reduced to some type of "hash code", with that hash code stored in a separate field. The search would then hash the query terms and check them against the hash-codes field.
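To make the "hash code" idea concrete, here is a minimal sketch in plain Python, assuming American Soundex as the hash and an in-memory dict standing in for the separate hash-codes field. The names `soundex`, `build_index`, and `search` are made up for illustration and do not come from any particular search library.

```python
from collections import defaultdict

# Letter-to-digit table for American Soundex.
_CODES = {}
for letters, digit in (("bfpv", "1"), ("cgjkqsxz", "2"), ("dt", "3"),
                       ("l", "4"), ("mn", "5"), ("r", "6")):
    for ch in letters:
        _CODES[ch] = digit


def soundex(word):
    """Reduce a word to a 4-character Soundex code (the "hash code")."""
    word = "".join(ch for ch in word.lower() if ch.isalpha())
    if not word:
        return ""
    result = word[0].upper()
    prev = _CODES.get(word[0], "")
    for ch in word[1:]:
        code = _CODES.get(ch, "")
        if code and code != prev:
            result += code
        # 'h' and 'w' do not separate letters that share a code; vowels do.
        if ch not in "hw":
            prev = code
    return (result + "000")[:4]


def build_index(docs):
    """Hypothetical hash-codes field: map each code to the doc ids containing it."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.split():
            code = soundex(token)
            if code:
                index[code].add(doc_id)
    return index


def search(index, term):
    """Hash the query term the same way, then look it up in the hash-codes field."""
    return index.get(soundex(term), set())


docs = {1: "Robert went home", 2: "Nothing here"}
index = build_index(docs)
matches = search(index, "Rupert")  # "Rupert" and "Robert" share a code
```

The same shape should work for any operator where a word maps to exactly one code: only the hashing function changes.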
But some operators seem difficult to implement if a source word cannot be mapped directly to a single "hash code". A regex match, for example.
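One possible workaround for the regex case, sketched here under the assumption that the index keeps a dictionary of distinct terms: skip hashing and test the pattern against each indexed term, then union the documents of the terms that match. The function names `build_term_index` and `regex_search` are hypothetical, and the cost grows with the number of distinct terms rather than with the number of documents.

```python
import re
from collections import defaultdict


def build_term_index(docs):
    """Plain inverted index: lowercased term -> set of doc ids."""
    index = defaultdict(set)
    for doc_id, text in docs.items():
        for token in text.lower().split():
            index[token].add(doc_id)
    return index


def regex_search(index, pattern):
    """Scan the term dictionary and collect docs for every fully matching term."""
    compiled = re.compile(pattern)
    hits = set()
    for term, doc_ids in index.items():
        if compiled.fullmatch(term):
            hits |= doc_ids
    return hits


docs = {1: "color chart", 2: "colour wheel", 3: "collar"}
term_index = build_term_index(docs)
found = regex_search(term_index, r"colou?r")
```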
Also, it would be nice to boost documents where two matching words are closer together.
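As a rough illustration of the proximity idea, the sketch below finds the smallest gap (in token positions) between two terms and turns it into a boost. The formula `1 / (1 + gap)` is an arbitrary assumption chosen only so that closer pairs score higher, and `min_term_distance` and `proximity_boost` are hypothetical names.

```python
def min_term_distance(tokens, term_a, term_b):
    """Smallest position gap between any occurrence of term_a and term_b, or None."""
    best = None
    last_a = last_b = None
    for i, tok in enumerate(tokens):
        if tok == term_a:
            last_a = i
        if tok == term_b:
            last_b = i
        if last_a is not None and last_b is not None:
            gap = abs(last_a - last_b)
            if best is None or gap < best:
                best = gap
    return best


def proximity_boost(text, term_a, term_b):
    """Assumed scoring: 0.0 if either term is missing, else 1 / (1 + gap)."""
    tokens = text.lower().split()
    gap = min_term_distance(tokens, term_a.lower(), term_b.lower())
    return 0.0 if gap is None else 1.0 / (1 + gap)


near = proximity_boost("the quick fox", "quick", "fox")
far = proximity_boost("quick and very lazy brown fox", "quick", "fox")
```

In a real index this would need token positions stored per document, so the gap can be computed without re-reading the text.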
|