Rethinking the Recall Measure in Appraising Information Retrieval Systems and Providing a New Measure by Using Persian Search Engines

Mohsen Nowkarizi, Mahdi Zeynali Tazehkandi


The aim of the study was to improve the retrieval performance of Persian search engines by using a new measure, comprehensiveness. To this end, in consultation with three experts from the Department of Knowledge and Information Science (KIS) at Ferdowsi University of Mashhad (FUM), 192 FUM students, male and female, at different degree levels and from different fields of study, were asked to run searches based on 32 simulated work tasks (SWT) on the selected search engines and to report the results by citing the relevant URLs. The findings indicated that measuring recall does not take into account how documents are selected; it considers only the retrieval of relevant documents that are indexed in the information retrieval system's database. Measuring comprehensiveness, by contrast, considers the retrieval of relevant documents in the system's database and also the performance of document selection on the web (the performance of the crawler). At the practical level, there was no strong correlation between the two measures (recall and comprehensiveness); the two measure different features. A repeated-measures test also showed that a search engine's performance score changes when the measure is changed from recall to comprehensiveness. In conclusion, if the purpose of a search engine evaluation is to assess the performance of the indexer program, recall is the suggested measure; if the purpose is to determine which search engine retrieves the most relevant documents, comprehensiveness is proposed.
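The distinction the abstract draws between recall and comprehensiveness can be illustrated with a minimal sketch. The abstract gives no formulas, so the definitions here are assumptions: recall is taken as the share of relevant documents in an engine's own index that it retrieves, and comprehensiveness as the share of all relevant documents on the web that it retrieves. The engine names, URLs, and the known set of relevant web documents are hypothetical.

```python
# Assumed definitions (not stated in the source):
#   recall            = retrieved relevant / relevant documents in the engine's index
#   comprehensiveness = retrieved relevant / relevant documents on the web

def ratio(found: set, reference: set) -> float:
    """Share of the reference set that was found; 0.0 if the reference set is empty."""
    if not reference:
        return 0.0
    return len(found & reference) / len(reference)

# Hypothetical data for a single search task.
relevant_on_web = {"u1", "u2", "u3", "u4", "u5", "u6", "u7", "u8"}

# Relevant and other documents each engine's crawler has collected and indexed.
indexed = {
    "engine_a": {"u1", "u2", "u3", "u4"},
    "engine_b": {"u1", "u2", "u3", "u4", "u5", "u6", "u7", "u8"},
}

# Documents each engine actually returned for the task.
retrieved = {
    "engine_a": {"u1", "u2", "u3", "u4"},
    "engine_b": {"u1", "u2", "u3", "u4", "u5"},
}

def recall(engine: str) -> float:
    """Retrieval judged only against relevant documents in the engine's index."""
    relevant_indexed = relevant_on_web & indexed[engine]
    return ratio(retrieved[engine], relevant_indexed)

def comprehensiveness(engine: str) -> float:
    """Retrieval judged against every relevant document on the web,
    so a crawler that missed documents lowers the score."""
    return ratio(retrieved[engine], relevant_on_web)

for name in ("engine_a", "engine_b"):
    print(name, recall(name), comprehensiveness(name))
# engine_a 1.0 0.5
# engine_b 0.625 0.625
```

In this toy case engine_a ranks first by recall but second by comprehensiveness, because its crawler indexed only half of the relevant documents. This mirrors the reported finding that a search engine's score changes when the measure changes.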


Keywords: Recall, Comprehensiveness, Evaluation of Information Retrieval Systems, Search Engines







E-ISSN: 2008-8310

   ISSN: 2008-8302