Metadata Ontology of Dissertations: Designing a Model

Document Type: Research Article

Author

Assistant Professor of Knowledge and Information Science, Data Science, Information and Artificial Intelligence Group, National Library and Archives of Iran, Tehran, Iran

10.30484/nastinfo.2024.3498.2247

Abstract

Purpose: To design a metadata ontology model for the semantic representation of theses and dissertations using the SPAR (Semantic Publishing and Referencing) Ontologies.
Method: This applied study used two methods, content analysis and mapping. The metadata of 69 theses and dissertations on the National Library and Archives of Iran (NLAI) were selected from three databases: 1) the Digital Library of the National Library and Archives of Iran, 2) the RASA software, and 3) Ganj at the Iranian Research Institute for Information Science and Technology. The records were then modified and completed by mapping. In parallel, a checklist was formed by analyzing the entities of each SPAR ontology and adding further entities proposed by the researcher. The checklist included classes, properties, and individuals. Finally, these were entered into Protégé version 5.5, and the metadata ontology model, MdOntTDs, was drawn.
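The mapping step can be pictured as merging the records that describe the same thesis in the three databases and noting which elements each source lacks. The Python sketch below is only an illustration under stated assumptions: the element list, the sample values, and the function names `missing_elements` and `merge_records` are hypothetical and are not taken from the study's checklist or data.

```python
from typing import Dict, List, Optional

Record = Dict[str, Optional[str]]

# Hypothetical element list; the study's actual checklist is far larger.
REQUIRED_ELEMENTS = ["title", "author", "subject", "supervisor", "advisor", "abstract"]


def missing_elements(record: Record, required: List[str] = REQUIRED_ELEMENTS) -> List[str]:
    """Return the required elements that are absent or empty in one record."""
    return [name for name in required if not record.get(name)]


def merge_records(sources: Dict[str, Record]) -> Record:
    """Complete a record by taking, for each element, the first non-empty
    value found across the sources (in insertion order)."""
    merged: Record = {}
    for record in sources.values():
        for name, value in record.items():
            if value and not merged.get(name):
                merged[name] = value
    return merged


# Invented sample data for one thesis as it might appear in three databases.
sources = {
    "NLAI Digital Library": {"title": "Sample thesis", "author": "A. Author"},
    "RASA": {"title": "Sample thesis", "author": "A. Author", "supervisor": ""},
    "Ganj": {
        "title": "Sample thesis",
        "author": "A. Author",
        "subject": "Ontologies",
        "supervisor": "B. Supervisor",
        "advisor": "C. Advisor",
        "abstract": "Sample abstract.",
    },
}

gaps = {database: missing_elements(record) for database, record in sources.items()}
merged = merge_records(sources)
```

Here `gaps` lists the elements each source lacks, and `merged` is the completed record that would be carried forward into the ontology.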
Findings: The analysis identified gaps in four important metadata elements (subject, supervisor, advisor, and abstract) in RASA and the NLAI Digital Library. Among the 18 SPAR Ontologies, the most entities were selected from FaBiO, FRAPO, and CiTO, in that order. All entities of BiDO, BiRO, C4O, Fivestar, FR, FRBR, PO, PRO, PSO, and PWO were suitable for theses. The researcher selected 195 individuals from 6 SPAR Ontologies, 292 individuals labeled MdTDs from the theses, and 100 individuals labeled SUNMdTDs, and entered them into the software. In total, 1558 entities, categorized as classes, properties (object, data, and annotation), and individuals, were entered into the software together with a description and definition of each, with hierarchical and defining axioms for classes and a specified domain and range for relationships. Finally, the RDF graph was drawn using the OntoGraf plugin, and the final model, MdOntTDs, was developed. This research proposes three new types of metadata: 1) Apart from the existing keywords, topics were categorized and modeled in up to three levels, comprising 4 main categories, 16 subcategories, and many units; each final topic is linked through the “hasSubject” and “isSubjectOf” properties. 2) The research methods of the theses were linked through the “hasMethod” and “used in” properties. 3) Papers derived from the theses were searched for, as far as possible, and linked through the “hasJournalArticle” and “journalArticleOf” properties.
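The three property pairs named above behave as inverse relationships, and the topic hierarchy lets a search on a broad category reach theses indexed under a narrower unit. The following Python sketch illustrates both ideas with plain tuples standing in for RDF triples. It is not the MdOntTDs model itself: the individuals (`thesis1`, `article1`), the topic names, the `broader` map, and the identifier `usedIn` (the abstract writes “used in”) are assumptions made for illustration.

```python
from typing import Dict, Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, property, object)

# Property pairs from the abstract; "usedIn" is an assumed spelling of "used in".
INVERSE_OF: Dict[str, str] = {
    "hasSubject": "isSubjectOf",
    "hasMethod": "usedIn",
    "hasJournalArticle": "journalArticleOf",
}


def with_inverses(triples: Set[Triple]) -> Set[Triple]:
    """Add the inverse statement for every triple whose property has a declared inverse."""
    lookup = dict(INVERSE_OF)
    lookup.update({inverse: prop for prop, inverse in INVERSE_OF.items()})
    closed = set(triples)
    for subject, prop, obj in triples:
        if prop in lookup:
            closed.add((obj, lookup[prop], subject))
    return closed


def theses_about(topic: str, graph: Set[Triple], broader: Dict[str, str]) -> Set[str]:
    """Find theses whose subject is the topic or any narrower topic.
    Assumes the broader-topic map contains no cycles."""
    found: Set[str] = set()
    for subject, prop, obj in graph:
        if prop != "hasSubject":
            continue
        current: Optional[str] = obj
        while current is not None:
            if current == topic:
                found.add(subject)
                break
            current = broader.get(current)
    return found


# Hypothetical three-level topic hierarchy (unit -> subcategory -> main category).
broader = {
    "MetadataOntologies": "KnowledgeOrganization",
    "KnowledgeOrganization": "InformationScience",
    "DigitalLibraries": "InformationSystems",
}

# Invented statements about two theses.
asserted: Set[Triple] = {
    ("thesis1", "hasSubject", "MetadataOntologies"),
    ("thesis1", "hasMethod", "ContentAnalysis"),
    ("thesis1", "hasJournalArticle", "article1"),
    ("thesis2", "hasSubject", "DigitalLibraries"),
}

graph = with_inverses(asserted)
matches = theses_about("InformationScience", graph, broader)
```

In an OWL editor such as Protégé, the same effect would normally come from declaring the properties as inverses and letting a reasoner derive the second statement; the explicit loop here only makes that behavior visible.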
Conclusion: If implemented, this model can overcome the limitations of keyword search, the problems of linking and sharing data on the web, and data inconsistency. In the software, classes and their related individuals are visible as a hierarchical network of RDF triples, and the connections between entities, by increasing the number of access points, promise deeper semantic searches. However, because tagged and linked data are absent or scarce, some of the selected entities cannot be used.

Keywords

Main Subjects

