Testing Search
How do you go about testing your site's search functionality? Start by identifying what can be tested. Some characteristics of search engines and systems are common to most implementations, and I've identified some of these below. If your search has any special modifications or search strategies unique to your site or company, I recommend you get a handle on these common points first, before exploring how your search extends or pushes any information retrieval boundaries. Please note that all of the information on this page is aimed specifically at search against product catalogues.

Accuracy

The accuracy of a search system is its ability to find all the matching items in the information collection, usually a product database; in other words, if you search on the word "metonymy", the query should find every instance of this word.

Test: Comparison of query results for back-end and front-end

Since the user interacts with an interface that mediates a back-end query against the database, the simplest way to measure accuracy is to compare the results of a query made through the web interface with the results of the same query made directly against the database. If the queries are identical, the returned results should also be identical. Any difference in returned hits indicates a problem. If there is a systemic reason why the web interface returns fewer results than a direct query, that reason should be treated as a significant quality and usability problem. If the web interface returns more results than the direct query, that indicates a scope problem, with the query generated via the interface running against too many table columns.

Test: Consistency of results over time

Build a list of search terms and queries, and run them at periodic intervals; if the product catalogue is updated regularly, running the same searches each time will allow you to track trends in the data.
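A minimal sketch of how such a periodic run might be recorded, assuming you already have some way to obtain a hit count per query. The names record_run and flag_drops, and the sample dates and counts, are illustrative assumptions rather than part of any particular search system.

```python
from datetime import date

def record_run(history, run_date, counts):
    """Append one benchmark run (query -> hit count) to the history list."""
    history.append({"date": run_date, "counts": dict(counts)})

def flag_drops(history):
    """Return queries whose hit count fell between the last two runs."""
    if len(history) < 2:
        return {}
    previous = history[-2]["counts"]
    latest = history[-1]["counts"]
    return {
        query: (previous[query], latest[query])
        for query in latest
        if query in previous and latest[query] < previous[query]
    }

# Illustrative sample data: two monthly runs of the same benchmark queries.
history = []
record_run(history, date(2024, 1, 1), {"spanish": 40, "cooking": 120})
record_run(history, date(2024, 2, 1), {"spanish": 44, "cooking": 95})

print(flag_drops(history))  # {'cooking': (120, 95)}
```

A falling count is not necessarily a defect, since products may have been withdrawn, but it is the kind of change worth explaining.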
For example, if your data collection is increasing, you should reasonably expect your result counts to increase. Having a set of "benchmark" queries is also useful when evaluating changes and enhancements to your site's search functionality.

Search Performance

Performance in the context of search refers to the speed of returning results: the time between the user clicking the submit button and the page of results being fully displayed on the client's screen. Most search systems query a database or index for the matching data, which means you actually have two lengths of time to measure: the back-end time taken by the query itself, and the front-end time the user waits for the results page.
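The two durations, back-end query time and front-end response time, can each be captured with a wall-clock timer. The sketch below assumes the query and the full search request can be called as Python functions; fake_backend_query and fake_frontend_search are hypothetical stand-ins that only sleep, not real search code.

```python
import time

def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed_seconds) using a monotonic clock."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    return result, time.perf_counter() - start

def fake_backend_query(term):
    # Hypothetical stand-in for the real database or index query.
    time.sleep(0.01)
    return [f"{term}-product-{i}" for i in range(3)]

def fake_frontend_search(term):
    # Hypothetical stand-in for the whole request: query plus page building.
    hits = fake_backend_query(term)
    time.sleep(0.01)
    return "<ul>" + "".join(f"<li>{hit}</li>" for hit in hits) + "</ul>"

_, backend_s = timed(fake_backend_query, "spanish")
_, frontend_s = timed(fake_frontend_search, "spanish")
print(f"back-end: {backend_s:.3f}s, front-end: {frontend_s:.3f}s")
```

Note that a script like this cannot see browser rendering time; for the full front-end figure the measurement still has to end when the page is completely displayed on the client.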
Test: Back-end Time

In practice, timing the query can be a chore if the measurement must be done manually from a command line. If possible, have your programmers code the search program to output the query timings to the interface -- just for testing, not for use by your customers.

Test: Front-end Time

Testing general performance for a particular search should be a simple matter of recording the length of time from the submission of the form to the complete display and rendering of the search results page on the tester's client.

Precision

The precision of a system is its ability to select only relevant products and reject the irrelevant ones. Relevancy is more difficult to pin down, because of the following issues:
Test: Apparent or Visible Relevancy

This test measures how obvious the relevance of the search results is by verifying whether the search terms are visible in the surfaced product information. For example, if you search for a video using the term "spanish", any product that includes the word "spanish" in the product information displayed on the results page would be visibly relevant, while a product with no mention of the word "spanish", even if it was about Spanish cooking, would not be visibly relevant. This distinction is important because users should have an indication of why a particular result was returned; users shouldn't have to ponder the search engine's reasoning. The measurement takes the form of the ratio of obvious inclusion to non-inclusion, which will trap those terms that were matched against non-displayed fields. The ratio should be higher for searches that don't query against non-displayed fields; the operative usability principle here is that clear relevance provides better information to the user. To generate the VR (visible relevancy),
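As a rough sketch of that measurement, assuming you can collect the text shown for each result on the results page: the function below counts results that visibly contain the search term and those that do not. It also reports the visible share of all results, which is a choice made here to avoid dividing by zero when every result is visibly relevant. The function name and the sample titles are illustrative assumptions.

```python
def visible_relevancy(term, displayed_texts):
    """Return (visible, hidden, share_visible) for one search term.

    displayed_texts holds, per result, the text actually shown to the user.
    """
    needle = term.lower()
    visible = sum(1 for text in displayed_texts if needle in text.lower())
    hidden = len(displayed_texts) - visible
    share = visible / len(displayed_texts) if displayed_texts else 0.0
    return visible, hidden, share

# Illustrative results page for the query "spanish".
results = [
    "Learn Spanish in 30 Days (video)",
    "Spanish Grammar Basics",
    "Paella and Tapas at Home",  # presumably matched on a non-displayed field
    "Conversational Spanish",
]

print(visible_relevancy("spanish", results))  # (3, 1, 0.75)
```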
Test: Literal Accuracy

This test introduces specific errors into a query and measures how often the results for the corrected query are returned. This is especially useful for evaluating fuzzy logic. To generate the literal accuracy,
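One possible way to script this, sketched under stated assumptions: generate single-error variants of a term (a dropped character or two swapped neighbours), run each through the search, and report the share that return the same hits as the correctly spelled term. The toy_search function and its tiny catalogue are hypothetical stand-ins built on Python's difflib, not a model of any real search engine's fuzzy matching.

```python
import difflib

# Illustrative sample data: keyword -> product titles.
CATALOGUE = {
    "spanish": ["Learn Spanish", "Spanish Grammar"],
    "cooking": ["Cooking Basics"],
}

def toy_search(term):
    """Hypothetical fuzzy search: snap the term to the closest known keyword."""
    match = difflib.get_close_matches(term.lower(), list(CATALOGUE), n=1, cutoff=0.8)
    return CATALOGUE[match[0]] if match else []

def misspellings(term):
    """Generate single-error variants: dropped and transposed characters."""
    variants = set()
    for i in range(len(term)):
        variants.add(term[:i] + term[i + 1:])  # drop one character
    for i in range(len(term) - 1):
        # swap two neighbouring characters
        variants.add(term[:i] + term[i + 1] + term[i] + term[i + 2:])
    variants.discard(term)
    return sorted(variants)

def literal_accuracy(term, search):
    """Share of misspelled variants that return the same hits as the term."""
    expected = search(term)
    variants = misspellings(term)
    recovered = sum(1 for variant in variants if search(variant) == expected)
    return recovered / len(variants) if variants else 0.0

score = literal_accuracy("spanish", toy_search)
print(f"literal accuracy for 'spanish': {score:.2f}")
```

In a real test the search argument would wrap a call to the site's own search, and the error types would be chosen to match the mistakes your users actually make.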
Test: Objective Accuracy

This test measures the efficiency of the search functionality against the results of a separate database query. To generate the objective accuracy,
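A minimal sketch of such a comparison, assuming product ids can be collected both from the search interface and from a direct SQL query. It uses an in-memory SQLite table as a stand-in for the product database; the products table, its columns, and the search_hits set are illustrative assumptions.

```python
import sqlite3

def direct_query_ids(conn, term):
    """Ids a direct SQL query finds for the term (title column only)."""
    rows = conn.execute(
        "SELECT id FROM products WHERE title LIKE ?", (f"%{term}%",)
    )
    return {row[0] for row in rows}

def objective_accuracy(search_ids, db_ids):
    """Compare search-interface hits against the direct query's hits."""
    missing = db_ids - search_ids  # in the database, not surfaced by search
    extra = search_ids - db_ids    # surfaced by search, not in the direct query
    share = len(search_ids & db_ids) / len(db_ids) if db_ids else 1.0
    return share, missing, extra

# Stand-in product database with illustrative rows.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE products (id INTEGER PRIMARY KEY, title TEXT)")
conn.executemany(
    "INSERT INTO products (id, title) VALUES (?, ?)",
    [(1, "Learn Spanish"), (2, "Spanish Grammar"), (3, "Cooking Basics")],
)

search_hits = {1, 3}  # hypothetical ids the site's search returned for "spanish"
print(objective_accuracy(search_hits, direct_query_ids(conn, "spanish")))
# (0.5, {2}, {3})
```

Here a non-empty missing set corresponds to the interface returning fewer results than the direct query, and a non-empty extra set to the interface searching more widely than the direct query does.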