Pre-processing

1.

Sentiment Analysis of Tweets Using Supervised Machine Learning Techniques Based on Term Frequency (Ministry of Science scientific article)

Authors: Deepti Aggarwal, Vikram Bali, Abhishek Agarwal, Kshitiz Poswal, Madhav Gupta, Abhishek Gupta

Source: Journal of Information Technology Management, Volume 13, Issue 1, 2021, pp. 119-141

Keywords: Feature representation, TFIDF, N-grams, Pre-processing, Tokenization, Word Cloud

Specialized fields:

Specialized fields > Management > Knowledge management and IT

Views: 592 Downloads: 245

Technology gives everyone an outlet to voice their opinions through social media platforms such as Twitter. This paper employs machine learning methods for text analysis to determine the sentiment of reviews that people post on Twitter. Sentiment analysis applies natural language processing and machine learning to determine the orientation of the opinion expressed in a piece of text. The system extracts attributes from the text such as (a) its polarity, that is, whether the speaker is criticizing or appreciating, and (b) the topic of discussion, that is, the subject of the text. A comparison of prior work on sentiment analysis of tweets is presented, followed by a detailed discussion of feature extraction and feature representation. Six classifiers (Naïve Bayes, Decision Tree, Logistic Regression, Support Vector Machine, XGBoost, and Random Forest) are compared by accuracy for each type of feature. The paper also analyses the sentiment of political views and public opinion on the lockdown in India: tweets tagged ‘#lockdown’ are analysed by sentiment category, and a schematic analysis is shown.
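The abstract names tokenization, N-grams, and TFIDF as feature representations but does not spell out the pipeline. The following is a minimal sketch of how such features could be computed, using only the Python standard library. The cleaning rules, the function names `tokenize`, `ngrams`, and `tfidf`, the unsmoothed `log(N / df)` weighting, and the sample tweets are all assumptions made for illustration, not the authors' implementation.

```python
import math
import re
from collections import Counter


def tokenize(tweet):
    """Lowercase a tweet, drop URLs and @mentions, and split into word tokens."""
    text = tweet.lower()
    text = re.sub(r"https?://\S+", " ", text)  # remove links
    text = re.sub(r"@\w+", " ", text)          # remove user mentions
    return re.findall(r"[a-z0-9']+", text)     # '#lockdown' becomes 'lockdown'


def ngrams(tokens, n):
    """Return the list of n-grams (joined by spaces) found in a token list."""
    return [" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]


def tfidf(docs, n_max=2):
    """Return one {term: weight} dict per document, over 1-grams up to n_max-grams."""
    term_lists = []
    for doc in docs:
        tokens = tokenize(doc)
        terms = []
        for n in range(1, n_max + 1):
            terms.extend(ngrams(tokens, n))
        term_lists.append(terms)

    # Document frequency: in how many documents each term occurs.
    doc_freq = Counter()
    for terms in term_lists:
        doc_freq.update(set(terms))

    n_docs = len(docs)
    vectors = []
    for terms in term_lists:
        counts = Counter(terms)
        total = len(terms) or 1  # avoid division by zero for empty documents
        vectors.append({
            term: (count / total) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return vectors


# Hypothetical sample tweets, invented for this sketch.
tweets = [
    "Lockdown extended again, this is terrible @someone",
    "Great decision! #lockdown keeps us safe https://example.com",
    "lockdown lockdown lockdown",
]
vectors = tfidf(tweets)
```

With this weighting, a term that occurs in every document (here `lockdown`) receives a weight of zero, while terms that distinguish one tweet from the others keep a positive weight. The resulting dictionaries could then be fed to any of the classifiers the abstract lists.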

2.

HFC: Towards an Effective Model for the Improvement of heart Diagnosis with Clustering Techniques (Ministry of Science scientific article)

Authors: Razieh Asgarnezhad (رضیه اصغرنژاد), Karrar Ali Mohsin Alhameedawi

Source: International Journal of Web Research, Volume 4, Issue 2, Autumn-Winter 2021, pp. 16-24

Keywords: Data mining, Pre-processing, Heart disease, Clustering, Machine learning techniques

Views: 387 Downloads: 271

Heart disease poses a great danger to people and has recently become a serious threat to human health. It affects all age groups, from young to old. The biggest challenge addressed in this paper is data pre-processing and finding a remedy for the failures in clinical heart failure records; an effective, high-performance model is proposed to improve heart disease diagnosis and to treat those failures. We applied k-means, expectation-maximization clustering, DBSCAN, support vector clustering, and random clustering. Using these clustering techniques, we obtained results good enough to significantly improve heart disease prediction performance. The goal of the model is to propose a reduction method that finds the features of heart disease by applying several techniques. Our most important result is faster and better prediction. The results obtained in this research indicate that the proposed model clearly outperforms its counterparts. We obtained values of 130, 980, 183, 125.133, 133, 203, and 125.800. These confirm that the model predicts well and improves performance on the data studied.
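Of the clustering techniques listed in the abstract, k-means is the simplest to illustrate. The sketch below is a plain implementation of Lloyd's algorithm in standard-library Python. The function name `kmeans`, the deterministic choice of the first `k` points as initial centroids, and the toy two-feature records are assumptions for illustration only; they are not the authors' model or their clinical data.

```python
import math


def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm; the first k points serve as initial centroids."""
    centroids = [tuple(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda j: math.dist(p, centroids[j]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for j in range(k):
            members = [p for p, lab in zip(points, labels) if lab == j]
            if members:
                new_centroids.append(
                    tuple(sum(col) / len(members) for col in zip(*members))
                )
            else:
                new_centroids.append(centroids[j])  # keep an empty cluster's centroid
        if new_centroids == centroids:
            break  # converged: centroids no longer move
        centroids = new_centroids
    return labels, centroids


# Hypothetical two-feature records, invented for this sketch.
records = [(45, 60), (70, 25), (50, 55), (48, 58), (72, 30), (68, 20)]
labels, centroids = kmeans(records, k=2)
```

On real clinical records the features usually sit on very different scales, so a pre-processing step such as standardization would normally come before clustering; that step is omitted here to keep the sketch short.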