Articles related to the keyword

text mining


1.

Topic Modeling and Classification of Cyberspace Papers Using Text Mining

Keywords: Cyberspace, text mining, trend discovery, topic modeling

Views: 659, Downloads: 339
Global cyberspace networks provide individuals with platforms to interact, exchange ideas, share information, provide social support, conduct business, create artistic media, play games, engage in political discussions, and much more. The term cyberspace has become a conventional means to describe anything associated with the Internet and the diverse Internet culture. In fact, cyberspace is an umbrella term that covers all issues arising from the interaction of information systems and humans over these networks. In-depth evaluation of the scientific articles in the cyberspace domain provides concentrated knowledge and insight into the major trends of the field. Text mining tools and techniques enable practitioners and scholars to discover significant trends in a large set of internationally validated papers. This study uses text mining algorithms to extract, validate, and analyze 1860 scientific articles in the cyberspace domain and provides insight into the future scientific directions of cyberspace studies.
2.

An Investigation on the User Behavior in Social Commerce Platforms: A Text Analytics Approach (Ministry of Science scientific article)

Keywords: Social commerce, Tripadvisor, social media, text mining, Data mining

Views: 325, Downloads: 140
The tourism industry currently accounts for approximately 10% of global GDP, while it contributes only 3% of the economy in Iran. As the pressure of US sanctions on the Iranian economy increases day by day, the need to pay attention to this industry as a source of foreign currency is felt more than ever. The purpose of this research is to analyze the reviews of users of social commerce websites using a combination of text mining and data mining techniques. For this purpose, the database of the TripAdvisor website (TripAdvisor.com) was evaluated, and all profile information of users who commented on hotels in Iran was collected. These users' comments on all the content of the website, such as hotels, restaurants, and attractions, were then extracted and analyzed. By calculating the Davies-Bouldin index, the optimal number of clusters was determined to be four, namely: water therapy tourists; boutique hotel style and Iran urban tourists; travelholics and food tourists; and business and health tourists. Each cluster has unique attributes and features. Association rules were then identified for each cluster according to its characteristics and the information in the users' profiles. Finally, a solution is proposed to increase user participation on the website, and targeted promotional plans are presented in accordance with the known features of each cluster.
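The abstract above says the number of clusters was chosen with the Davies-Bouldin index. As a minimal sketch of how that index can be computed for a candidate partition (the function names and the toy points are assumptions for illustration, not the authors' data or code), one could write:

```python
import math

def centroid(points):
    """Mean of a list of equal-length numeric vectors, returned as a tuple."""
    n = len(points)
    return tuple(sum(col) / n for col in zip(*points))

def davies_bouldin(clusters):
    """Davies-Bouldin index for a list of clusters (each a list of vectors).

    Lower is better: compact clusters that lie far apart score low.
    Assumes at least two clusters with distinct centroids.
    """
    centers = [centroid(c) for c in clusters]
    # Scatter: mean distance from each member to its own cluster centroid.
    scatter = [
        sum(math.dist(p, ctr) for p in c) / len(c)
        for c, ctr in zip(clusters, centers)
    ]
    k = len(clusters)
    worst = []
    for i in range(k):
        # For cluster i, keep the ratio against its worst-separated neighbour.
        worst.append(max(
            (scatter[i] + scatter[j]) / math.dist(centers[i], centers[j])
            for j in range(k) if j != i
        ))
    return sum(worst) / k

# Hypothetical toy partitions of the same four points.
tight = [[(0, 0), (0, 1)], [(10, 0), (10, 1)]]
loose = [[(0, 0), (10, 0)], [(0, 1), (10, 1)]]
better = min((tight, loose), key=davies_bouldin)
```

In practice the index would be evaluated on the clustering produced for each candidate number of clusters, and the count with the lowest value kept.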
3.

Big Data Quality: From Content to Context (Ministry of Science scientific article)

Keywords: Big Data, Big data quality, Data quality, text mining

Views: 121, Downloads: 88
Over the last 20 years, and particularly with the advent of Big Data and analytics, Data and Information Quality (DIQ) has remained a fast-growing research area. There are many views and streams in DIQ research, generally aiming at improving the effectiveness of decision making in organizations. Although many studies have aimed to clarify the role of Big Data quality for organizations, there is no comprehensive literature review that shows the main differences between traditional data quality research and Big Data quality research. This paper analyzes the papers published on Big Data quality and finds that there is almost no new mainstream in Big Data quality research. It shows that the main concepts of data quality do not change in the Big Data context and that only some new issues have been added to this area.
4.

Text Analytics of Customers on Twitter: Brand Sentiments in Customer Support (Ministry of Science scientific article)

Keywords: Brand community, Sentiment Analysis, text mining, Twitter, Customer support

Views: 341, Downloads: 124
Brand community interactions and online customer support have become major platforms for strengthening brand sentiment and creating loyalty. Rapid brand responses to each customer request through inbound tweets on Twitter, and proper actions to cover customers' needs, are the key elements of creating positive brand sentiment and managing product or service initiatives under intense competition. In this research, nearly three million tweets of inbound customer requests and outbound brand responses of international enterprises were collected for the purpose of brand sentiment analysis. The steps of CRISP-DM were chosen as the reference guide for business and data understanding, data preparation, text mining, validation of results, and the final discussion and contribution. An extensive phase of text pre-processing was conducted, and various sentiment analysis algorithms were applied to reach the most significant analytical conclusions about the sentiment trends. The findings show that customers' sentiment toward a brand is significantly correlated with the brand's proper response to its community over social media, as well as with giving customers a deep feeling of reciprocal understanding of their needs in mid- to long-range planning.
5.

Graph-Based Extractive Text Summarization Models: A Systematic Review (Ministry of Science scientific article)

Keywords: Natural Languages Processing, text mining, Graph approaches

Views: 202, Downloads: 104
The volume of digital text data is continuously increasing in both online and offline storage, which makes it difficult to read across documents on a particular topic and find the desired information within the available time. This necessitates the use of techniques such as automatic text summarization. Many approaches and algorithms have been proposed for automatic text summarization, including supervised machine learning, clustering, graph-based methods, and lexical chains, among others. This paper presents a novel systematic review of various graph-based automatic text summarization models.
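To make the idea of graph-based extractive summarization concrete, here is a minimal sketch in the spirit of TextRank-style models: sentences are nodes, edges are weighted by word overlap, and a PageRank-like iteration ranks the sentences. The similarity measure, damping factor, and function names are assumptions chosen for illustration and are not taken from the reviewed paper.

```python
import re

def split_sentences(text):
    """Naive sentence splitter on ., ! and ? followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]

def similarity(a, b):
    """Jaccard overlap between the word sets of two sentences."""
    wa = set(re.findall(r"\w+", a.lower()))
    wb = set(re.findall(r"\w+", b.lower()))
    if not wa or not wb:
        return 0.0
    return len(wa & wb) / len(wa | wb)

def rank_sentences(sentences, damping=0.85, iterations=50):
    """PageRank-style scores over the weighted sentence similarity graph."""
    n = len(sentences)
    weights = [
        [similarity(sentences[i], sentences[j]) if i != j else 0.0
         for j in range(n)]
        for i in range(n)
    ]
    out_sum = [sum(row) for row in weights]
    scores = [1.0 / n] * n
    for _ in range(iterations):
        scores = [
            (1 - damping) / n + damping * sum(
                weights[j][i] / out_sum[j] * scores[j]
                for j in range(n) if out_sum[j] > 0
            )
            for i in range(n)
        ]
    return scores

def summarize(text, k=2):
    """Return the k highest-ranked sentences in their original order."""
    sentences = split_sentences(text)
    if not sentences:
        return []
    scores = rank_sentences(sentences)
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]
```

Sentences that share vocabulary with many others accumulate score, while an isolated sentence keeps only the baseline share, so it is the first to be dropped from the summary.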
6.

Social Media Toxic Content Filtering System using SOIR Model (Ministry of Science scientific article)

Views: 79, Downloads: 63
Social media is a popular data source in the research community. It provides different opportunities to design practical applications that benefit humanity and society. A significant number of people consume social media content, and content promoters and influencers sometimes publish misleading and toxic content. Therefore, this paper proposes an unhealthy content filtering system that uses the information retrieval model SOIR to identify and remove toxic content from social media. The Semantic query Optimization-based Information Retrieval (SOIR) model uses Fuzzy C-Means (FCM) clustering to produce a particular data structure. It also incorporates a query generation technique that generates multiple queries to increase the probability of correct outcomes. In this work, the SOIR model is modified for use in social media toxic content filtering. The model uses linguistic and semantic information to craft new feature sets. Part-of-Speech (POS) tagging is used to construct the linguistic feature. Finally, a pattern-matching algorithm is designed to classify tweets as toxic or nontoxic. The class labels of the tweets are identified based on lexical and semantic analysis of similar semantic queries (tweets). Twitter text posts are used to create the training and test samples. A total of 2002 tweets are used for the experiment. The experimental study compares the proposed model with different IR models (K-NN, cosine) on precision, recall, and F1-score, demonstrating the superiority of the proposed classification model.
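The comparison above rests on precision, recall, and F1-score for the toxic class. A minimal sketch of how these three metrics are computed from true and predicted labels follows; the function name and the label strings are assumptions for illustration, and the example labels are invented rather than drawn from the study.

```python
def precision_recall_f1(y_true, y_pred, positive="toxic"):
    """Precision, recall and F1 for one class, given parallel label lists."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    # Guard against empty denominators when a class is never predicted or present.
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)
    return precision, recall, f1

# Hypothetical labels for four tweets.
y_true = ["toxic", "toxic", "nontoxic", "nontoxic"]
y_pred = ["toxic", "nontoxic", "toxic", "nontoxic"]
metrics = precision_recall_f1(y_true, y_pred)
```

Precision penalizes nontoxic tweets flagged as toxic, recall penalizes toxic tweets that were missed, and F1 is their harmonic mean.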
7.

Predicting Court Judgment in Criminal Cases by Text Mining Techniques (Ministry of Science scientific article)

Keywords: Legal Judgment Prediction, text mining, Sentiment Analysis, Emotions Analysis, Machine Learning

Views: 167, Downloads: 124
Judges usually decide cases based on their knowledge, experience, personality, and sentiment. Under high pressure and stress, it may be difficult for them to examine documents and evidence carefully, which leads to more subjective judgments. Legal judgment prediction with artificial intelligence algorithms can benefit judicial bodies, legal experts, and litigants as well as judges. This research uses machine learning methods to predict legal sentences in drug cases involving the purchase, possession, concealment, or transportation of illicit drugs, and examines the effect of sentiment and emotions in case texts on predicting the severity of whipping, fines, and imprisonment. The text documents of 6000 Persian drug-related cases were pre-processed, and a translation of the NRC emotion and sentiment lexicon was used to give each item a score for positive or negative sentiment and a score for emotion. Machine learning methods were then used for modeling. BERT, TF-IDF+AdaBoost, and Skip-gram+LSTM+CNN achieved the highest accuracy, in that order. The evaluation criteria were also analyzed for settings in which sentiment scores, emotion scores, or both were used in the prediction process along with the judicial texts. Finally, it was found that using sentiment and emotion scores improves the accuracy of legal judgment prediction for all three types of sentences, and that sentiment has a greater impact on accuracy than emotion.
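The lexicon-based scoring step described above can be sketched as follows. This is only an illustration under stated assumptions: the toy lexicon entries are hypothetical and in English, whereas the study used a Persian translation of the NRC lexicon, and the normalization by token count is one plausible choice rather than the authors' documented method.

```python
import re
from collections import Counter

# Hypothetical toy lexicon mapping words to sentiment and emotion categories.
# These entries are invented for illustration and are not NRC data.
TOY_LEXICON = {
    "violence": {"negative", "fear", "anger"},
    "remorse": {"negative", "sadness"},
    "cooperation": {"positive", "trust"},
    "family": {"positive", "trust"},
}

def lexicon_scores(text, lexicon=TOY_LEXICON):
    """Share of tokens that trigger each lexicon category in a document."""
    tokens = re.findall(r"\w+", text.lower())
    counts = Counter()
    for tok in tokens:
        for category in lexicon.get(tok, ()):
            counts[category] += 1
    total = len(tokens) or 1  # avoid division by zero on empty text
    return {category: n / total for category, n in counts.items()}

# Hypothetical case snippet.
scores = lexicon_scores("Remorse and cooperation")
```

Scores of this kind can then be appended to the text features (for example TF-IDF vectors) before training a classifier, which is how sentiment and emotion signals would enter the prediction step.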
8.

Quantitative Evaluation and Comparative Study of the Chinese Curriculum Standards for Physical Education and Health in Compulsory Education (Ministry of Science scientific article)

Keywords: Curriculum Standards, PMC Index Model, Policy Evaluation, text mining

Views: 76, Downloads: 51
Comparing the old and the new Chinese Curriculum Standards for Physical Education and Health in Compulsory Education is important for observing what has and has not changed, so that the updated standards can be better enforced. This study combines text mining and content analysis to quantitatively evaluate the texts of the 2011 and 2022 editions of the Curriculum Standards for Physical Education and Health in Compulsory Education by constructing a Policy Model Consistency Index. The results showed that the two editions had a high degree of consistency, which is in line with the standard design. However, structural imbalances were observed in the effectiveness of the 2022 edition, which calls for reconsidering the curriculum difficulty, enhancing curriculum implementation, and improving the effectiveness of curriculum objects. Suggestions are also provided to address these issues in the hope of shedding light on related areas for front-line educators.
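In the commonly used formulation of the PMC index, each primary variable is the mean of its binary (0 or 1) sub-variables, and the index is the sum of the primary variables. The sketch below assumes that formulation; the variable names and scores are hypothetical and do not reproduce the study's actual variable set or results.

```python
def pmc_index(policy):
    """Compute primary-variable scores and their sum for one policy text.

    `policy` maps each primary variable name to a list of 0/1 sub-variable
    scores (1 if the text covers that sub-variable, 0 otherwise).
    """
    primary = {name: sum(subs) / len(subs) for name, subs in policy.items()}
    return primary, sum(primary.values())

# Hypothetical scoring of one edition on two primary variables.
edition = {
    "X1": [1, 1, 0, 1],  # three of four sub-variables covered
    "X2": [1, 0],        # one of two sub-variables covered
}
primary_scores, index = pmc_index(edition)
```

Computing the same table for two editions and comparing the primary-variable scores side by side is what allows the kind of consistency and imbalance analysis the abstract reports.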