Article type: Research article
Author
Assistant Professor, Knowledge and Information Science, Department of Data Science, Information and Artificial Intelligence, National Library and Archives of the Islamic Republic of Iran, Tehran, Iran
Abstract
Keywords
Subjects
Article title [English]
Author [English]
Purpose: Synonymy is an important feature of natural languages. Because a single concept may be expressed by two or more lexical forms, and it cannot be predicted which form a user will search for, a retrieval system must be able to refer from every synonym of a concept to the documents in which that concept is discussed. This research investigated the use of synonyms as non-preferred headings/terms in Persian Subject Headings and the Asfa Thesaurus, using FarsNet as a comprehensive lexical source for the Persian language.
Method: This was applied research in terms of its goals. It used content analysis as its general methodology and, specifically, Natural Language Processing techniques and tools to measure the extent to which synonyms are used to build non-preferred headings/terms in both controlled vocabularies, by measuring the similarity of the two groups of data. Through purposive sampling, 3270 main subject headings and 2020 main thesaurus terms were selected from Persian Subject Headings and the Asfa Thesaurus, two controlled vocabularies used in compiling the Iran National Bibliography. The non-preferred headings/terms related to each main heading/term were also extracted, as were the synonyms of each main heading/term in FarsNet. Reliability was assessed by having a second researcher repeat the extraction for part of the headings/terms, yielding scores of 0.618 and 0.706, respectively, on a scale from 0 to 1. The similarity between the two data sets, namely the non-preferred headings/terms and the FarsNet synonyms of their corresponding main headings/terms, was measured using cosine similarity.
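As an illustration of the cosine similarity measure named in the Method paragraph, the following is a minimal sketch in Python. The abstract does not state how the term sets were turned into vectors, so the bag-of-terms count representation, the function name `cosine_similarity`, and the English example terms are all assumptions made for illustration, not the study's actual procedure or data.

```python
import math
from collections import Counter


def cosine_similarity(terms_a, terms_b):
    """Cosine similarity between two term lists treated as count vectors.

    Assumption: each distinct term is one vector dimension and its count
    is the coordinate. The study's real vectorization is not described.
    """
    a, b = Counter(terms_a), Counter(terms_b)
    # Dot product over the terms the two lists share.
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(c * c for c in a.values()))
    norm_b = math.sqrt(sum(c * c for c in b.values()))
    if norm_a == 0 or norm_b == 0:
        # A main heading with no non-preferred terms or no synonyms.
        return 0.0
    return dot / (norm_a * norm_b)


# Hypothetical data for one main heading (not taken from the study).
non_preferred = ["automobile", "motor car"]
synonyms = ["automobile", "vehicle", "motorcar", "auto"]

score = cosine_similarity(non_preferred, synonyms)
print(round(score, 3))
```

Under this representation a score near 0 means that few non-preferred terms coincide with the lexicon's synonyms, and a score near 1 means the two sets are almost identical.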
Findings: In the sample taken from Persian Subject Headings, 2561 main subject headings (78.3%) have non-preferred headings that refer to them, and 2316 main subject headings (70.8%) have synonyms in FarsNet. The similarity score between the non-preferred headings and the synonyms of the corresponding main headings was 0.125, which is very low. In the sample taken from Asfa, 545 main terms (about 27%) have non-preferred terms, so 1475 main terms (73%) have no non-preferred term referring to them. Of the sampled main terms, 1376 (68%) have synonyms in FarsNet. The similarity score between the non-preferred terms in the Asfa Thesaurus and the synonyms of the corresponding main terms was 0.131, also very low.
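The percentages in the Findings paragraph can be recomputed from the reported counts and sample sizes (3270 headings and 2020 terms). The short Python sketch below does only that arithmetic; the helper name `share` and the variable names are illustrative assumptions.

```python
def share(part, total):
    """Percentage of `total` represented by `part`, to one decimal place."""
    return round(100 * part / total, 1)


PSH_SAMPLE = 3270   # main subject headings sampled from Persian Subject Headings
ASFA_SAMPLE = 2020  # main terms sampled from the Asfa Thesaurus

psh_with_refs = share(2561, PSH_SAMPLE)       # headings with non-preferred headings
psh_with_synonyms = share(2316, PSH_SAMPLE)   # headings with synonyms in the lexicon
asfa_with_refs = share(545, ASFA_SAMPLE)      # terms with non-preferred terms
asfa_with_synonyms = share(1376, ASFA_SAMPLE) # terms with synonyms in the lexicon
asfa_without_refs = ASFA_SAMPLE - 545         # terms with no non-preferred term

print(psh_with_refs, psh_with_synonyms)
print(asfa_with_refs, asfa_with_synonyms, asfa_without_refs)
```

The counts are consistent with the stated sample sizes: 545 terms with references and 1475 without add up to the 2020 sampled Asfa terms.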
Conclusion: Persian Subject Headings shows a stronger commitment to constructing and using subject references in the form of non-preferred headings, but only a small number of referential (non-preferred) headings and terms have been selected from among the Persian-language synonyms of the main headings/terms. This research recommends making synonyms of terms available to all users, including catalogers and those involved in creating controlled vocabularies, both when searching for concepts and when creating terms, because doing so can be a step towards improving subject authority databases and, ultimately, towards a more exhaustive subject search and retrieval experience for users.
Keywords [English]