AmiGO: Search Relevance


[[Category:AmiGO working group]]
[[Category:Web Presence Working Group]]

Latest revision as of 15:48, 11 February 2008

[From Amelia's WG message:]

If you just want to try it out, please go to:


to give it a whirl. I've left in the info about the best match. Comments and bug reports gratefully received! The relevance algorithm now has the following tweaks:

  • automatically remove 'complex' or 'activity' from the end of terms so that users are not penalised for not knowing GO speak
  • search results where words appear in the same order as the query phrase score higher than where the order is different
  • whole word matches score higher than partial matches
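As a rough illustration of the first tweak, here is a minimal Python sketch of stripping a trailing 'complex' or 'activity' from a term. The function name `strip_go_suffix` is hypothetical, and the case-insensitive matching and the choice to leave a bare 'complex' or 'activity' untouched are assumptions; the message does not say how AmiGO handles these cases.

```python
import re

# Assumption: the suffix is removed only when it is a separate trailing word,
# and matching ignores case. AmiGO's real rules may differ.
_GO_SUFFIX = re.compile(r"\s+(?:complex|activity)$", re.IGNORECASE)


def strip_go_suffix(term: str) -> str:
    """Return the term without a trailing 'complex' or 'activity' word."""
    return _GO_SUFFIX.sub("", term.strip())
```

With this sketch, 'kinase activity' and 'kinase' reduce to the same string, so a query for 'kinase' is not scored against the extra word.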

For those who are interested, the basic calculation performed to generate the relevance score is:

relevance = 1 - ( remainder / ( querystr + remainder ))

where:

  • querystr = length of the query phrase
  • remainder = a figure based on the length of the search result after the query phrase has been removed

This is then multiplied by a factor that depends on which field the search result appears in; e.g. the factor is 1.0 for a term name and 0.5 for a related synonym.
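The basic calculation and the field factor can be sketched in Python as follows. This is only an illustration of the arithmetic described above: the function name, the dictionary keys and the handling of an empty query are assumptions, and only the two factors quoted in the message are included.

```python
# Field factors quoted in the message; the key names are hypothetical and
# other fields would need their own entries.
FIELD_FACTORS = {
    "term_name": 1.0,
    "related_synonym": 0.5,
}


def relevance(querystr: float, remainder: float, field: str) -> float:
    """Relevance score for one search result.

    querystr  -- length of the query phrase
    remainder -- figure based on what is left of the result once the
                 query phrase has been removed
    field     -- field the result appears in (a key of FIELD_FACTORS)
    """
    total = querystr + remainder
    if total == 0:
        # Assumption: an empty query matching an empty result scores zero.
        return 0.0
    base = 1 - (remainder / total)
    return base * FIELD_FACTORS[field]
```

For example, a 6-character query that matches a term name exactly (remainder 0) scores 1.0; if the remainder is also 6, the score is 0.5 for a term name and 0.25 for a related synonym.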

The figure for the 'remainder' is now calculated using this formula:

remainder =
         (# word chars x word char weighting)  (currently 1.0)
       + (# word boundaries x boundary weighting)  (currently 0.25)
       + ((# non word chars + # search matches in same order as query phrase) x non word char weighting)  (currently 0.25)

Here '#' means 'number of', and word characters are a-z, 0-9 and _.
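The remainder formula can be sketched in Python along these lines. The weightings are the current values quoted above; everything else is an assumption made for illustration: the function name, lower-casing the text before counting, counting word boundaries with the regex `\b`, and passing the number of in-order matches in as an argument rather than deriving it.

```python
import re

# Current weightings quoted in the message.
WORD_CHAR_WEIGHT = 1.0
BOUNDARY_WEIGHT = 0.25
NON_WORD_CHAR_WEIGHT = 0.25


def remainder_from_text(leftover: str, ordered_matches: int = 0) -> float:
    """Compute 'remainder' for the text left after removing the query phrase.

    leftover        -- result text with the query phrase removed
    ordered_matches -- number of search matches in the same order as the
                       query phrase (supplied by the caller here)
    """
    text = leftover.lower()
    # Word characters are a-z, 0-9 and _.
    word_chars = len(re.findall(r"[a-z0-9_]", text))
    non_word_chars = len(text) - word_chars
    # Assumption: a word boundary is any position matched by \b.
    boundaries = len(re.findall(r"\b", text, flags=re.ASCII))
    return (
        word_chars * WORD_CHAR_WEIGHT
        + boundaries * BOUNDARY_WEIGHT
        + (non_word_chars + ordered_matches) * NON_WORD_CHAR_WEIGHT
    )
```

Under these assumptions, the leftover text ' kinase' has 6 word characters, 2 word boundaries and 1 non-word character, giving 6 x 1.0 + 2 x 0.25 + 1 x 0.25 = 6.75.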

I can mess around with the weightings to alter the relative importance of exact word matches, etc.

Please try it out and give me any feedback you might have.