Investigating the Correctness of Attributing the Munajat Khams 'Ashar to Imam Sajjad (PBUH) Based on Stylometry Techniques (Ministry of Science research article)
Academic rank: Scientific journal (Ministry of Science)
Abstract
With the progress of science and technology, works of doubtful authorship are no longer acceptable. Stylometry is a method that determines the author of a literary work through statistical analysis. Authorship attribution methods rely mainly on identifying the author from writing style, on the assumption that each person's writing style has unique features. Automated author identification is applied in areas such as plagiarism detection, criminology, and the identification of unknown authors. Because many factors are involved in identifying the author of a text, no method with 100 percent accuracy has yet been proposed, and researchers are still working toward a method that minimizes computational error. One method claimed to have good accuracy is Yule's theory. In the present study, Yule's theory and four other theories of vocabulary richness are combined, and, using a descriptive-analytical method and the explanation of statistical data, the vocabulary richness of the Munajat Khams 'Ashar is compared with that of the prayers of Al-Sahifa al-Sajjadiyya in order to examine the correctness of attributing this Munajat to Imam Sajjad (PBUH). The results indicate the high accuracy of the calculations and the independence of the theories' output from text length. Moreover, because there is no considerable difference between the vocabulary richness of the Munajat Khams 'Ashar and that of the prayers of Al-Sahifa al-Sajjadiyya, the attribution of this Munajat to Imam Sajjad (PBUH) is confirmed.
Investigating the Correctness of Attributing the Munajat Khams 'Ashar to Imam Sajjad (PBUH) Based on Stylometry Techniques
Advances in science and technology have made works of dubious authorship no longer acceptable. Stylometry is a method that uses statistical analysis to determine the author of a literary work. Authorship attribution methods rely heavily on writing style, on the assumption that each person has a unique style. Author identification is used in areas such as plagiarism detection, criminology, and the identification of unknown authors. Because many factors are involved in identifying the author of a text, no method with 100% accuracy has been presented so far, and researchers are still trying to find a way to minimize computational errors. One of the methods claimed to have good accuracy is Yule's theory. In this article, Yule's theory and four other theories are combined to compare the vocabulary richness of the Munajat Khams 'Ashar with that of the prayers of Al-Sahifa al-Sajjadiyya. Then, using a descriptive-analytical method and an explanation of the statistical data, the correctness of attributing the Munajat Khams 'Ashar to Imam Sajjad (PBUH) is investigated. The results show the high accuracy of the calculations and the independence of the theories' output from the length of the text. Also, because the difference between the vocabulary richness of the Munajat Khams 'Ashar and that of the prayers of Al-Sahifa al-Sajjadiyya is slight, its attribution to Imam Sajjad (PBUH) is confirmed.
1. Introduction
The attribution of a text to someone who did not actually write it has always been a focus of researchers. With the advancement of science in the twentieth century, the need to verify the attribution of a text to a particular author has intensified, and with the advancement of information technology, automated methods of author recognition have become more popular. Today, various methods are used to identify the author of a text; one of the most important is the study of writing style.
The study of writing style is a subset of the new rhetoric. The new rhetoric aims to add to formal logic a field of reasoning, and applies whenever action is linked to rationality (Perelman, 1971). In stylistics, reasoning about and analyzing a text are used to identify the characteristics of its author's style.
A variety of methods for attribution have been proposed. There are three main approaches: lexical methods, syntactic or grammatical methods, and language-model methods, including methods based on compression (Zhao & Zobel, 2005). In this article, the lexical method is used. One of the most practical lexical methods for capturing an author's style is the "vocabulary richness" method. Unfortunately, the output of many methods depends on the length of the text, so a method should be chosen that depends on text length as little as possible. In this paper, we combine five theories for calculating vocabulary richness in order to achieve the most accurate results.
Research Question(s)
1. How accurate and reliable are the results of the five equations used in this research?
2. How much does the output of the theories depend on the length of the text?
3. What is the difference between the vocabulary richness of the Munajat Khams 'Ashar and that of the prayers of Al-Sahifa al-Sajjadiyya?
2. Literature Review
Authorship attribution (AA) is the process of attempting to identify the likely authorship of a given document, given a collection of documents whose authorship is known (Bozkurt et al., 2007). The accepted assumption behind AA is that every author writes in a distinct way; some writing characteristics cannot be manipulated by the writer’s will, and therefore can be identified by an automated process (Howedi & Mohd, 2014).
One of the fundamental sub-problems of AA is the extraction of the most suitable features to represent the writing style of each author. This problem is known as “stylometry” (Howedi et al., 2020, p. 1334). Stylometry is defined as the set of techniques that make it possible to measure an author's style by identifying its features of style (stylemas). These stylemas, also called style markers, are obtained from textual measurements, normally calculated by statistical methods (Escobedo et al., 2013; Stamatatos, 2009).
Some researchers have used a combination of lexical richness functions to achieve better results, namely: K proposed by Yule (1944), R proposed by Honoré (1979), W proposed by Brunet (1978), S proposed by Sichel (1975), and D proposed by Simpson (1949), which are defined as follows (Stamatatos et al., 2000):
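The equations themselves do not appear in this copy of the text. As a hedged reconstruction, the forms in which these five measures are usually stated, using V_i, N, V, and α as defined in this section, are sketched here; the summation limits and the base of the logarithm are assumptions and should be checked against Stamatatos et al. (2000).

```latex
\begin{align*}
K &= \frac{10^{4}\left(\sum_{i} i^{2} V_i - N\right)}{N^{2}} \\
R &= \frac{100 \log N}{1 - V_1 / V} \\
W &= N^{V^{-\alpha}} \\
S &= \frac{V_2}{V} \\
D &= \sum_{i} V_i \, \frac{i}{N} \, \frac{i-1}{N-1}
\end{align*}
```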
where:
V_i: the number of words used exactly i times
N: the total number of words
V: the number of distinct words (vocabulary size)
α: a parameter usually fixed at 0.17
The final output for calculating vocabulary richness is obtained by combining these five equations.
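As a minimal sketch of how the five measures could be computed from a tokenized text, the following function uses the definitions of V_i, N, V, and α given above. The function name `richness`, the natural logarithm in R, and the assumption that the text is already split into word tokens are illustrative choices, not details taken from the study. The text does not say how the five values are combined into a final output, so the sketch returns them separately.

```python
import math
from collections import Counter

ALPHA = 0.17  # the parameter α in Brunet's W, usually fixed at 0.17


def richness(tokens):
    """Return the five vocabulary-richness measures for a list of word tokens."""
    n = len(tokens)                      # N: total number of words
    if n < 2:
        raise ValueError("need at least two tokens")
    freqs = Counter(tokens)              # word -> number of occurrences
    v = len(freqs)                       # V: number of distinct words
    spectrum = Counter(freqs.values())   # i -> V_i (words used exactly i times)
    v1, v2 = spectrum.get(1, 0), spectrum.get(2, 0)

    # Yule's K
    k = 1e4 * (sum(i * i * vi for i, vi in spectrum.items()) - n) / (n * n)
    # Honoré's R (natural log assumed); undefined when every word occurs once
    r = 100 * math.log(n) / (1 - v1 / v) if v1 < v else math.inf
    # Brunet's W
    w = n ** (v ** -ALPHA)
    # Sichel's S
    s = v2 / v
    # Simpson's D
    d = sum(vi * i * (i - 1) for i, vi in spectrum.items()) / (n * (n - 1))
    return {"K": k, "R": r, "W": w, "S": s, "D": d}
```

A call such as `richness(text.split())` would apply it to a whitespace-tokenized string; real Arabic text would likely need more careful tokenization and normalization, which this sketch does not attempt.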
Since the chain of narrators and the documentation of the Munajat Khams 'Ashar are not fully recorded in the available sources, its attribution to Imam Sajjad (PBUH) needs to be established; this research therefore examines the attribution using stylometry techniques.
3. Methodology
In the present article, the correctness of attributing the Munajat Khams 'Ashar to Imam Sajjad (PBUH) is examined by sampling the prayers of Al-Sahifa al-Sajjadiyya and comparing their vocabulary richness with that of the Munajat Khams 'Ashar. Since the output of the theories is claimed not to depend on the length of the text, two statistical populations are selected: the first consists of prayers from each of which 80 words have been selected, and the second consists of prayers with different numbers of words. Therefore, in addition to comparing the vocabulary richness of the samples, the dependence of the equations on the length of the text is also examined. From the Munajat Khams 'Ashar, we chose the first, fifth, tenth, and fifteenth prayers as samples.
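The sampling design described here can be sketched as follows. This is an illustration under stated assumptions, not the procedure used in the study: the text does not say how the 80 words were selected, so the sketch simply takes the first 80 tokens, and it uses a Pearson correlation between sample length and Yule's K as one possible way to probe length dependence. The names `build_populations` and `length_dependence` are hypothetical.

```python
from collections import Counter
from math import sqrt


def yule_k(tokens):
    """Yule's K for a list of word tokens."""
    n = len(tokens)
    spectrum = Counter(Counter(tokens).values())  # i -> V_i
    return 1e4 * (sum(i * i * vi for i, vi in spectrum.items()) - n) / (n * n)


def build_populations(samples, fixed_len=80):
    """Split name -> tokens into a fixed-length and a variable-length population.

    Assumption: the fixed-length sample is the first `fixed_len` tokens.
    """
    fixed = {name: toks[:fixed_len]
             for name, toks in samples.items() if len(toks) >= fixed_len}
    return fixed, dict(samples)


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = sqrt(sum((x - mx) ** 2 for x in xs))
    sy = sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)


def length_dependence(population):
    """Correlation between sample length and Yule's K.

    Only meaningful for the variable-length population; with identical
    lengths the correlation is undefined (division by zero).
    """
    lengths = [len(toks) for toks in population.values()]
    scores = [yule_k(toks) for toks in population.values()]
    return pearson(lengths, scores)
```

A correlation near zero on the variable-length population would be consistent with the claim that the measure does not grow with text length, although a small number of samples gives only weak evidence either way.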
4. Results
The results show that:
1. The accuracy of the calculations is very high and therefore the output of the theories is reliable.
2. The output of the theories was not dependent on the length of the text and did not increase in proportion to the increase in the number of words.
3. There is not much difference between the vocabulary richness of the Munajat Khams 'Ashar and that of the prayers of Al-Sahifa al-Sajjadiyya in either statistical population; therefore, from the perspective of stylometry techniques, the attribution of the Munajat Khams 'Ashar to Imam Sajjad (PBUH) is confirmed.