Document Type : Articles


1 Associate Prof., Information Technology Department, Iranian Research Institute for Information Science and Technology (IranDoc), Tehran, Iran

2 B.Sc., Industrial Engineering Department, Islamic Azad University, Central Tehran Branch, Tehran, Iran.

3 Professor, Industrial Engineering Department, Sharif University of Technology, Tehran, Iran


Academic research plays an influential role in the economic development of countries. In scientific institutes, this research is commonly recorded and disseminated as theses and dissertations. The higher the quality of this data in the systems that collect and distribute it, the more readily organizations and businesses can use it. Providing this data therefore requires proper monitoring so that the output of the recording and dissemination process is of good quality. This paper offers a framework for evaluating the data quality of theses and dissertations. The framework introduces a coding structure for data inconsistencies found in Word and PDF files and in metadata (bibliographic information). Approaches from the data quality methodologies TDQM and DWQ are used to provide solutions for improving data quality in the provisioning phase. At this stage, approaches such as assigning an owner to data or processes, root cause analysis, process control, and continuous monitoring are considered. The focus group method is used to determine the operational strategies for quality improvement. Finally, process-oriented techniques, such as quality control checklists and image processing, and data-driven approaches, such as data cleansing, are localized and developed to improve the quality of thesis and dissertation documents. The resulting improvement solutions fall into two groups. Guiding the user through the thesis/dissertation registration process is a process-driven solution, whereas introducing a specific format for thesis/dissertation files and resolving the quality issues of PDF files are data-driven solutions.
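As a rough illustration of how an inconsistency coding structure and a quality control checklist might work together, the sketch below runs a thesis record through a small set of checks and returns the codes of those that fail. The codes (M01, F02, and so on), the field names, and the rules are hypothetical and chosen only for illustration; they are not the coding structure defined in the paper.

```python
# Hypothetical inconsistency codes mapped to (scope, check) pairs.
# "M" codes concern metadata; "F" codes concern the Word/PDF files.
# Neither the codes nor the rules come from the paper itself.
CHECKS = {
    "M01": ("metadata", lambda r: bool(r.get("title", "").strip())),
    "M02": ("metadata", lambda r: bool(r.get("author", "").strip())),
    "M03": ("metadata", lambda r: isinstance(r.get("year"), int)
            and 1900 <= r["year"] <= 2100),
    "F01": ("pdf", lambda r: r.get("pdf_pages", 0) > 0),
    "F02": ("word", lambda r: r.get("word_pages") == r.get("pdf_pages")),
}


def find_inconsistencies(record):
    """Return the sorted codes of every check that the record fails."""
    return sorted(
        code for code, (_scope, passes) in CHECKS.items() if not passes(record)
    )


# Example record with an empty author field and mismatched page counts.
sample = {
    "title": "A thesis",
    "author": "",
    "year": 2021,
    "pdf_pages": 120,
    "word_pages": 118,
}
print(find_inconsistencies(sample))
```

Keeping the checks in a single table means a new inconsistency code can be added without touching the evaluation logic, and the returned codes could feed later steps such as root cause analysis or continuous monitoring.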

