Relation Extraction

1.

RePersian - A Fast Relation Extraction Tool in Persian (Scientific article, Ministry of Science)

Authors: Rana Sahebnasagh, Majid Asgari, Behrouz Minaei-Bidgoli

Source: International Journal of Web Research, Volume 2, Issue 2, Autumn-Winter 2019, pp. 15-22

Keywords: Relation Extraction, Persian Language, Regex, POS Tag

The task of extracting semantic relations from raw data is called relation extraction. One of the most important problems in open information extraction is the automatic extraction of relations in any domain, especially in web mining. Many approaches to relation extraction exist for English and other languages, and some of them are based on parse trees. Dependency parsing in Persian is difficult and time-consuming, because Persian is a low-resource language and its dependency grammar and lexical structure also reduce the speed of relation extraction. In this paper we introduce a fast relation extraction method for Persian called RePersian. RePersian depends on the part-of-speech (POS) tags of a sentence and on special relation patterns, which are obtained by analyzing sentence structures in Persian. To find relation patterns, RePersian searches through POS tags using patterns given as regular expressions. By matching the correct POS pattern to a relation pattern, RePersian extracts the semantic relations in a sentence. We evaluate RePersian in two different scenarios on the Dadegan Persian dependency tree dataset. On average, RePersian achieved precisions of 78.05%, 80.4%, and 54.85% in finding the first argument of a relation, the second argument, and the correct relation between them, respectively.
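
The idea of matching regular expressions against a sentence's POS-tag sequence can be illustrated with a minimal sketch. This is not the RePersian implementation: the tag set, the single-character tag codes, the single pattern, and the function name `extract_relations` are all assumptions made for illustration, and the example tokens are Latin transliterations of a Persian sentence.

```python
import re

# Hypothetical coarse tag set, mapped to one character per token so that
# regex match offsets line up with token indices. Not the paper's tag set.
TAG_CODES = {"N": "N", "ADJ": "A", "P": "P", "ADV": "D", "V": "V"}

# Assumed pattern for a simple subject-object-verb clause:
#   arg1 = noun + adjectives, arg2 = optional preposition + noun + adjectives,
#   rel  = optional adverb + verb.
PATTERN = re.compile(r"(?P<arg1>NA*)(?P<arg2>P?NA*)(?P<rel>D?V)")


def extract_relations(tagged_tokens):
    """Return (arg1, relation, arg2) triples from a list of (word, tag) pairs."""
    words = [word for word, _ in tagged_tokens]
    # Unknown tags become "?" so they can never match the pattern.
    codes = "".join(TAG_CODES.get(tag, "?") for _, tag in tagged_tokens)

    def span_text(match, name):
        return " ".join(words[match.start(name):match.end(name)])

    return [
        (span_text(m, "arg1"), span_text(m, "rel"), span_text(m, "arg2"))
        for m in PATTERN.finditer(codes)
    ]


# Transliteration of "Ali lives in Tehran" (hand-tagged, illustrative only).
sentence = [("Ali", "N"), ("dar", "P"), ("Tehran", "N"), ("zendegi_mikonad", "V")]
print(extract_relations(sentence))
```

Because the matching runs over a short string of tag codes rather than a parse tree, each sentence costs little more than one regex scan, which is the kind of speed advantage the abstract attributes to avoiding dependency parsing.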

2.

A Distant Supervised Approach for Relation Extraction in Farsi Texts (Scientific article, Ministry of Science)

Authors: Shirin Atarod, Alireza Yari

Source: International Journal of Web Research, Volume 3, Issue 2, Autumn-Winter 2020, pp. 1-8

Keywords: Relation Extraction, Information Extraction, Distant Supervision, Persian Wikipedia

The volume of Farsi information on the Internet has been increasing in recent years. However, most of this information is in the form of unstructured or semi-structured free text. For quick and accurate access to the vast knowledge contained in these texts, information extraction methods are essential for generating knowledge bases. In recent years, relation extraction, a sub-task of information extraction, has received much attention. While many such systems have been developed for English and other widely studied languages, information extraction systems for Farsi have received less attention from researchers. In this study on semi-automatic relation extraction, Persian Wikipedia articles are used as reliable, semi-structured sources. In this system, relation extraction is performed with the help of patterns that are obtained automatically through an approach based on distant supervision. To apply distant supervision, the large Wikidata knowledge base is used as a source, since it is synchronized with Wikipedia. The results show that the average precision over all relations is 76.81%, which indicates an improvement in precision over other methods for Farsi.
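
The general idea of distant supervision, aligning known knowledge-base triples with sentences to harvest textual patterns, can be sketched as follows. This is a toy illustration, not the authors' system: the function name `harvest_patterns`, the `<SUBJ>`/`<OBJ>` placeholders, the plain substring matching, and the English example data are all assumptions; the paper works with Persian Wikipedia text and Wikidata.

```python
from collections import Counter, defaultdict


def harvest_patterns(kb_triples, sentences):
    """Collect the text between subject and object as a pattern for each relation.

    kb_triples: iterable of (subject, relation, object) strings.
    sentences:  iterable of raw sentence strings.
    Returns {relation: Counter({pattern: count})}.
    """
    patterns = defaultdict(Counter)
    for subj, rel, obj in kb_triples:
        for sent in sentences:
            i, j = sent.find(subj), sent.find(obj)
            # Distant-supervision assumption: a sentence mentioning both
            # entities (subject first, no overlap) expresses the relation.
            if i == -1 or j == -1 or i + len(subj) > j:
                continue
            middle = sent[i + len(subj):j].strip()
            if middle:
                patterns[rel][f"<SUBJ> {middle} <OBJ>"] += 1
    return patterns


# Hypothetical toy knowledge base and corpus (illustrative only).
triples = [
    ("Hafez", "place_of_birth", "Shiraz"),
    ("Saadi", "place_of_birth", "Shiraz"),
]
corpus = [
    "Hafez was born in Shiraz.",
    "Saadi was born in Shiraz and travelled widely.",
    "Shiraz is a city in Iran.",
]
found = harvest_patterns(triples, corpus)
print(found["place_of_birth"].most_common(1))
```

In a fuller pipeline, frequently occurring patterns would then be applied to sentences about entity pairs that are not yet in the knowledge base; the alignment assumption is noisy, so pattern counts or manual review are typically used to filter out spurious patterns.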