Articles related to the keyword

topic modeling


1.

Topic Modeling and Classification of Cyberspace Papers Using Text Mining

Keywords: Cyberspace, text mining, trend discovery, topic modeling

Subject areas:
Views: 658 Downloads: 338
Global cyberspace networks provide individuals with platforms to interact, exchange ideas, share information, provide social support, conduct business, create artistic media, play games, engage in political discussions, and much more. The term cyberspace has become a conventional means of describing anything associated with the Internet and the diverse Internet culture. In fact, cyberspace is an umbrella term that covers all issues arising from the interaction of information systems and humans over these networks. In-depth evaluation of scientific articles in the cyberspace domain provides concentrated knowledge and insight into the major trends of the field. Text mining tools and techniques enable practitioners and scholars to discover significant trends in a large set of internationally validated papers. This study uses text mining algorithms to extract, validate, and analyze 1860 scientific articles in the cyberspace domain and provides insight into the future scientific directions of cyberspace studies.
2.

A Movie Recommender System Based on Topic Modeling using Machine Learning Methods (Scientific article accredited by the Ministry of Science)

Views: 160 Downloads: 97
In recent years, we have seen an increase in the production of films in a variety of categories and genres. Many of these products contain concepts that are inappropriate for children and adolescents. Hence, parents are concerned that their children may be exposed to these products. As a result, a smart recommendation system that provides appropriate movies based on the user's age range could be a useful tool for parents. Existing movie recommender systems use quantitative factors and metadata that lead to less attention being paid to the content of the movies. This research is motivated by the need to extract movie features using information retrieval methods in order to provide effective suggestions. The goal of this study is to propose a movie recommender system based on topic modeling and text-based age ratings. The proposed method uses latent Dirichlet allocation (LDA) modelling to identify hidden associations between words, document topics, and the levels of expression of each topic in each document. Machine learning models are then used to recommend age-appropriate movies. It has been demonstrated that the proposed method can determine the user's age and recommend movies based on the user's age with 93% accuracy, which is highly satisfactory.
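The LDA step described in this abstract can be illustrated with a minimal sketch. It assumes a collapsed Gibbs sampler written in plain Python, which is one common way to fit LDA and not necessarily the authors' implementation. The toy synopses, the function name `lda_gibbs`, and the hyperparameter values are all hypothetical.

```python
import random


def lda_gibbs(docs, num_topics, iterations=200, alpha=0.1, beta=0.01, seed=0):
    """Fit LDA with collapsed Gibbs sampling on tokenized documents.

    Returns (theta, phi, vocab): per-document topic proportions,
    per-topic word distributions, and the sorted vocabulary.
    """
    rng = random.Random(seed)
    vocab = sorted({w for doc in docs for w in doc})
    word_id = {w: i for i, w in enumerate(vocab)}
    V = len(vocab)

    doc_topic = [[0] * num_topics for _ in docs]        # topic counts per document
    topic_word = [[0] * V for _ in range(num_topics)]   # word counts per topic
    topic_total = [0] * num_topics                      # total tokens per topic
    assignments = []                                    # topic of every token

    # Start from a random topic assignment for every token.
    for d, doc in enumerate(docs):
        z_doc = []
        for w in doc:
            k = rng.randrange(num_topics)
            z_doc.append(k)
            doc_topic[d][k] += 1
            topic_word[k][word_id[w]] += 1
            topic_total[k] += 1
        assignments.append(z_doc)

    for _ in range(iterations):
        for d, doc in enumerate(docs):
            for i, w in enumerate(doc):
                v = word_id[w]
                k = assignments[d][i]
                # Remove the token from the counts before resampling it.
                doc_topic[d][k] -= 1
                topic_word[k][v] -= 1
                topic_total[k] -= 1
                # Conditional probability of each topic, up to a constant.
                weights = [
                    (doc_topic[d][t] + alpha)
                    * (topic_word[t][v] + beta)
                    / (topic_total[t] + V * beta)
                    for t in range(num_topics)
                ]
                k = rng.choices(range(num_topics), weights=weights)[0]
                assignments[d][i] = k
                doc_topic[d][k] += 1
                topic_word[k][v] += 1
                topic_total[k] += 1

    # Smoothed estimates of the two distributions LDA is meant to uncover.
    theta = [
        [(doc_topic[d][k] + alpha) / (len(doc) + num_topics * alpha)
         for k in range(num_topics)]
        for d, doc in enumerate(docs)
    ]
    phi = [
        [(topic_word[k][v] + beta) / (topic_total[k] + V * beta)
         for v in range(V)]
        for k in range(num_topics)
    ]
    return theta, phi, vocab


# Hypothetical tokenized movie synopses, invented for illustration only.
docs = [
    "friendly animals sing songs in a magic forest".split(),
    "animals and children find a magic castle".split(),
    "detective hunts violent killer in dark city".split(),
    "violent gang war erupts in dark city streets".split(),
]
theta, phi, vocab = lda_gibbs(docs, num_topics=2)
for row in theta:
    print([round(p, 3) for p in row])
```

Each row of `theta` gives the level of expression of each topic in one document. In a pipeline like the one the abstract describes, such rows could serve as input features for a separate classifier that predicts an age rating; that classifier is not shown here.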
3.

Contextualized Text Representation Using Latent Topics for Classifying Scientific Papers (Scientific article accredited by the Ministry of Science)

Keywords: Article Content Analysis, Contextualized Representation, Distributional Semantics, Neural Network, Scientific Article Classification, topic modeling

Subject areas:
Views: 91 Downloads: 45
Every year, researchers in various scientific fields publish their research results as technical reports or as articles in proceedings or journals. Search engines and digital libraries use collections of this type of data to search and access research publications, and they usually retrieve related articles based on the query keywords rather than the articles' subjects. Consequently, accurate classification of scientific articles can increase the quality of users' searches when seeking a scientific document in databases. The primary purpose of this paper is to provide a classification model to determine the scope of scientific articles. To this end, we propose a model that uses the enriched contextualized knowledge of Persian articles through distributional semantics. Accordingly, identifying the specific field of each document and defining its domain by prominent enriched knowledge enhances the accuracy of scientific article classification. To reach this goal, we enriched the contextualized embedding models, either ParsBERT or XLM-RoBERTa, with latent topics to train a multilayer perceptron model. According to the experimental results, the overall F-measure of ParsBERT-NMF-1HT was 72.37% (macro) and 75.21% (micro), a statistically significant difference from the baseline (p<0.05).
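The abstract reports both macro and micro F-measure, and the gap between them is easier to see on a small example. The sketch below computes both from scratch in plain Python. The labels and predictions are invented for illustration and are not the paper's data. The `enrich` helper shows one plausible reading of "enriching" an embedding with latent topics, namely simple concatenation; the abstract does not specify the actual mechanism, so this is an assumption.

```python
def enrich(embedding, topic_weights):
    """Concatenate a contextual embedding with a topic-weight vector.

    Assumption: enrichment is plain concatenation; the source does not say.
    """
    return list(embedding) + list(topic_weights)


def f_measures(y_true, y_pred):
    """Return (macro_f1, micro_f1) for single-label classification."""
    labels = sorted(set(y_true) | set(y_pred))
    tp = {c: 0 for c in labels}
    fp = {c: 0 for c in labels}
    fn = {c: 0 for c in labels}
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class gains a false positive
            fn[t] += 1  # true class gains a false negative

    def f1(tp_, fp_, fn_):
        denom = 2 * tp_ + fp_ + fn_
        return 2 * tp_ / denom if denom else 0.0

    # Macro: average the per-class F1 scores, so every class counts equally.
    macro = sum(f1(tp[c], fp[c], fn[c]) for c in labels) / len(labels)
    # Micro: pool the counts over all classes, so every document counts equally.
    micro = f1(sum(tp.values()), sum(fp.values()), sum(fn.values()))
    return macro, micro


# Hypothetical gold labels and predictions for six articles.
y_true = ["cs", "cs", "cs", "cs", "bio", "bio"]
y_pred = ["cs", "cs", "cs", "bio", "bio", "cs"]
macro, micro = f_measures(y_true, y_pred)
print(f"macro F1 = {macro:.4f}, micro F1 = {micro:.4f}")

# Hypothetical 2-dimensional embedding joined with 2 topic weights.
features = enrich([0.1, 0.2], [0.7, 0.3])
print(features)
```

In this toy case the minority class is classified worse than the majority class, so the macro score falls below the micro score, the same direction as the 72.37% versus 75.21% figures reported above.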