Click to See Complete Forum and Search --> : More Lenient Search Engine


Xiupan
06-27-2007, 11:49 AM
Sup Forum! ^^

Ok, here's my problem.
Right now I have a working search engine on my page that uses ASP coding to search a product database and return results. It works nicely right now, but if the user does not type in a name or product ID or SKU ~exactly~, it doesn't display the proper results. For example, there is an item in the database that has the word "cafe" in it, but the "e" has a little accent mark on it. If the user doesn't type in cafe with the little accented "e", it won't find it.

Is there a solution to make the search engine more lenient? Like to find all cases of "e", capitalized, accented or not without changing all the "cafe"s in my database to non-accented "e"s?

Thanks everyone! :D

nbcrockett
06-27-2007, 01:02 PM
I don't know how you would look for the different e's, but would be very interested to know how. However, if you want to search for a partial word just change your Select Statement to something like this.

Say your customer searches for the word work.

strSearchCriteria = "work"
SELECT * FROM Table WHERE ProductName LIKE '%" & strSearchCriteria & "%'"

The recordset would return any record in the database that included the word work in the ProductName field.
i.e. Work, Network, Networking

Xiupan
06-27-2007, 01:17 PM
Hmm... it seems my search engine already can find partial words or part of words. Yeah I just need to know how to find different characters that are similar like the accented "e" thing. Another problem I just realized is that some of the product IDs have "-" or "_" in them and if the user searches say AF10, it won't find "AF-10". Is there maybe a way to disregard the "-" when looking for likenesses to the user input?

nbcrockett
06-27-2007, 02:00 PM
You could remove them and look for both what they typed in and what they typed in minus those characters.

Removed Characters:

strSearchCriteria1 = "AF-10"
strSearchCriteria2 = Replace(strSearchCriteria,"-","")
SELECT * FROM Table WHERE ProductName LIKE '%" & strSearchCriteria1 & "%' OR ProductName LIKE '%" & strSearchCriteria2 & "%'"

This returns both AF-10 and AF10.

You could also use this method to search for the letter variations, but the coding could be very long and complex. Hopefully there's a better way to do this than my example, but here it is anyway. I also don't know if you can just type in "é" or if you have to use a special character code for it to work.

strSearchCriteria1 = "cafe"
strSearchCriteria2 = Replace(strSearchCriteria,"e","é")
SELECT * FROM Table WHERE ProductName LIKE '%" & strSearchCriteria1 & "%' OR ProductName LIKE '%" & strSearchCriteria2 & "%'"

Xiupan
06-27-2007, 02:17 PM
Yay~!
The cafe thing is working now. I'll try to look for more odd instances of different characters but I think this will suit my needs for now. Thanks! :D

If anyone wants to suggest an all-powerful code that encompasses all exceptions, please do! :P

nbcrockett
06-27-2007, 02:22 PM
You're welcome and yes, if anyone does know of a way to search for those characters more effectively please post it! Thanks.

Chikara
06-28-2007, 11:42 AM
Personally, I think the easiest way to do this would be to just remove all accented chars when you are adding products into the database. The problem you will run into is that it will be very difficult to search every possible combination of accented characters/unaccented characters. Not to mention that it will only get more and more complex if a string has more than one accented char. For instance cáfé would be hard to catch.

Xiupan
06-28-2007, 02:31 PM
Well that option is simply unavailable to me as that decision is for my client to make, and I don't think they would like me changing the existing database to make things easier for me.

Chikara
06-28-2007, 02:36 PM
This might not be the most effective way to handle this, but have you tried just substituting all accented chars with %? That way you would end up with a select statement like this.

Search for cáfé would result in a call like this.
Select * from products where name LIKE 'c%f%'

That might bring back a very large result though.

nbcrockett
06-28-2007, 02:47 PM
I don't suggest using that. If you do you'll end up returning every intro in the database that has a c before an f. However, you could use the same approach and use an underscore instead of a percent sign. The percent sign can return any number of characters between c and f, but the underscore will only allow for one character to be between them. Again you will have some false postives like say coffee.

Chikara
06-28-2007, 03:46 PM
nbcrockett's way is probably the best choice.
Questions
=======

How many different accented characters exists your DB's?
How many strings exist in your DB that have multiple unique accented chars?