IMPROVING SEARCH EXPERIENCE

FRIDAY, JUNE 11 2004 @ 03:42 PM

Those of you who have used the primitive search engine I had in place here, and tried to search for more than one word, may have been disappointed by the results; I know I was. That's why I have been working on a PHP class to help deliver better results when searching within records in a MySQL table, and I just made it operational today.

Using this class, the search happens at two levels, automatically: first, it tries to match the whole search string against the records in the database; second, it separates the search string into meaningful segments (most likely words) and loops through them trying to find a match. If at least one of the terms is found in a given record, that record is added to the results array, which is ordered by relevance.
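
To make the two levels more concrete, here is a rough sketch in Python. It is not the PHP class itself, only an illustration of the idea under a few assumptions: records live in a plain dictionary instead of a MySQL table, "meaningful segments" are simply runs of word characters, and matching is case-insensitive. The names split_terms and find_matches are made up for this example.

```python
import re


def split_terms(query):
    """Split a search string into lowercase word-like segments (assumed rule)."""
    return re.findall(r"\w+", query.lower())


def find_matches(query, records):
    """Return (record_id, phrase_hits, term_hits) for every matching record.

    records is a dict mapping a record id to its text; in the real setup
    this would be rows from a MySQL table.
    """
    phrase = query.strip().lower()
    terms = split_terms(query)
    matches = []
    for record_id, text in records.items():
        haystack = text.lower()
        # Level 1: occurrences of the whole search string.
        phrase_hits = haystack.count(phrase) if phrase else 0
        # Level 2: occurrences of each individual term (plain substrings).
        term_hits = {term: haystack.count(term) for term in terms}
        # A record qualifies as soon as the phrase or any single term is found.
        if phrase_hits or any(term_hits.values()):
            matches.append((record_id, phrase_hits, term_hits))
    return matches


records = {
    1: "PHP search class for MySQL",
    2: "A MySQL table",
    3: "Nothing relevant",
}
results = find_matches("mysql table", records)
```

With these sample records, record 1 qualifies on a single term, record 2 matches the whole phrase, and record 3 is left out. Because the terms are matched as plain substrings, "table" would also be counted inside a word like "tables".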

Relevance is measured by these criteria: first come all records that contain the whole phrase; next, all records containing all of the search terms, ordered by number of occurrences; finally, all records matching at least one of the search terms, ordered in the same fashion.
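
The ordering described above could be sketched like this, again in Python and not taken from the PHP class. It assumes case-insensitive substring matching and an in-memory dictionary of records, and it also orders the whole-phrase group by number of occurrences, which is my own guess since the criteria do not say how that group is sorted. The function name rank is hypothetical.

```python
import re


def rank(query, records):
    """Return record ids ordered by the three relevance tiers.

    Tier 0: record contains the whole phrase.
    Tier 1: record contains every search term.
    Tier 2: record contains at least one search term.
    Inside a tier, records with more term occurrences come first.
    """
    phrase = query.strip().lower()
    terms = set(re.findall(r"\w+", phrase))
    scored = []
    for record_id, text in records.items():
        haystack = text.lower()
        counts = [haystack.count(term) for term in terms]
        total = sum(counts)
        if total == 0:
            continue  # no term found, so the record is not a result
        if phrase in haystack:
            tier = 0
        elif all(counts):
            tier = 1
        else:
            tier = 2
        scored.append((tier, -total, record_id))
    # Sort by tier, then by descending occurrence count; ties keep input order.
    scored.sort(key=lambda item: (item[0], item[1]))
    return [record_id for _, _, record_id in scored]


records = {
    "a": "fast search engine",
    "b": "engine for search, search, search",
    "c": "search only search",
    "d": "a search engine, the best search engine",
    "e": "unrelated",
}
ordered = rank("search engine", records)
```

Here "d" and "a" come first because they contain the whole phrase, "b" follows because it has both terms but not the phrase, "c" is last with only one of the terms, and "e" is dropped.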

Among the drawbacks, it is still incapable of understanding exclusion operators (e.g. -keyword) or quoted phrases. I'm not really sure I want to complicate it that much, unless there's real use for those features. So far, I haven't found any records of visitors using operators in their searches.

Keeping track of visitors' searches helps me to understand what kind of information they expect to find here. So, if you care enough to give a hand testing the new search, please try meaningful stuff :)

----
Update (06/20):

You can check the code here. For some reason, the source coloring feature is not working properly on my server, but you can download the file and check the source.

Archived under: PHP.


DYLAN

JUNE 11 2004 @ 05:22 PM

If you are using MySQL, why don't you use its Full Text search capabilities? This will do all that tricky parsing for you, and probably faster and more effectively. It will also return a relevance number.
http://dev.mysql.com/doc/mysql/en/Fulltext_Search.html

P.S. One thing to watch out for with Full Text searches: it will only work if you have at least 4 or 5 rows in your database.

OSCAR TRELLES

JUNE 11 2004 @ 05:47 PM

Yes, that's what my search engine was relying on before. I followed the document you are sharing, and had been using the relevance elasticity factor they propose there. What you are saying is correct: using MySQL's indexing capabilities allows for faster results, but for some reason, accuracy is not as good.

Experimenting with this class, I've found better results than with the other method. However, before sharing it, I'd like to make sure my impressions are more objective.

Maybe something needs to be changed in MySQL's default configuration to get better results with match(); I don't know, I'm far from being an expert. But a method proposed as a standard technique should not require the user to manipulate the underlying basic functionality.

ADEDEJI OLOWE

JUNE 15 2004 @ 05:09 PM

Why don't you use regular expressions?

OSCAR TRELLES

JUNE 20 2004 @ 03:53 PM

Well, there is no need to complicate things. As long as we are looking for occurrences of simple strings, matching substrings is good enough, and it makes it easy to keep track of the number of occurrences for ordering purposes.

A5USEGTV

JULY 13 2006 @ 01:55 AM
