A predictive model for analyzing users' sentiments about the city based on the Twitter social network; case study: the metropolises of Iran (scientific article, Ministry of Science)
Academic rank: Scientific journal (Ministry of Science)
Abstract
Analyzing users' sentiments through social networks has become an influential field across many disciplines, and its audience includes not only company owners and politicians but also users themselves. The field has also made its way into urban studies and, owing to its methodical nature, has been used by urban planners and designers, whether in studies whose sole aim is sentiment analysis or as an integrated layer within broader research. Aiming to explain this field within urban sentiment analysis in the form of model-oriented methods, this article examines the importance of sentiment and the prominent methods for studying it in the city in order to show the place of this field in urban studies, and then turns to training a machine to provide a predictive model for urban sentiment analysis. The dataset of this study covers 8 Iranian metropolises and was extracted from Twitter; the analysis focuses on textual data. To train the machine for sentiment analysis, both machine learning and deep learning were used and their results were compared. The machine learning algorithms used were support vector machine, logistic regression, and decision tree; in deep learning, the machine was trained and tested using a neural network and a hybrid network. According to the results, deep learning performed better at predicting sentiment and text polarity in the metropolises of Iran, reaching an accuracy of 80%.

A predictive model for analyzing users’ perception of the cities based on the Twitter; Case studies: Iran metropolitans
Extended Abstract

Background and Objectives: The examination of users’ emotions through social media has developed into an impactful domain across diverse scientific fields, appealing not only to business proprietors and politicians but also to general users. This field has also entered urban studies and, because of its methodical nature, has been used by urban planners and designers, whether in research that aims solely at emotion analysis or as an integrated layer within broader research endeavors. The aim of this article is to explain this field within the analysis of urban emotions using model-oriented methods, and to identify its position in urban studies by examining the importance of emotion and the methods used to study it in the city.

Methods: This research used a supervised machine learning approach to analyze the sentiment of tweets related to eight major cities in Iran. The dataset consists of 930 tweets collected over a period of about 10 years, from 2011 to 2022. Initially, over 5000 tweets were collected; during the tagging process, 80% of them were excluded because of their limited relevance to the city, with priority given to tweets related to urban space. The names of cities and tourist areas were searched to establish a balance between positive and negative data. The tweets were downloaded through the Twitter streaming API together with their metadata: number of retweets, number of likes, tweet ID, language, and location. The dataset was used for training after standardization and normalization steps. The ratio of training data to testing data is 80 to 20. In line with the supervised approach, the researcher labeled the data with three labels (negative, neutral, and positive), and where the researcher had doubts, the opinions of two other experts were used. Both machine learning and deep learning were used.
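The 80 to 20 split over three labels described in the Methods can be sketched as follows. This is a minimal illustration, not the authors' procedure: the abstract does not say whether the split was stratified by label, so the per-label split here is an assumption, and the function name `stratified_split`, the seed, and the placeholder tweets are all hypothetical.

```python
import random
from collections import defaultdict

# The three polarity labels named in the abstract.
LABELS = ("negative", "neutral", "positive")


def stratified_split(samples, test_ratio=0.2, seed=0):
    """Split (text, label) pairs into train and test lists, label by label.

    Assumption: splitting per label keeps the class proportions similar
    in both sets; the abstract only states the 80/20 ratio.
    """
    rng = random.Random(seed)
    by_label = defaultdict(list)
    for text, label in samples:
        if label not in LABELS:
            raise ValueError(f"unknown label: {label!r}")
        by_label[label].append((text, label))

    train, test = [], []
    for label in LABELS:
        group = by_label[label]
        rng.shuffle(group)
        n_test = round(len(group) * test_ratio)
        test.extend(group[:n_test])
        train.extend(group[n_test:])

    # Shuffle again so examples are not grouped by label.
    rng.shuffle(train)
    rng.shuffle(test)
    return train, test


# Placeholder data only: 930 items to match the reported corpus size.
# The even label distribution is made up for the example.
samples = [(f"tweet {i}", LABELS[i % 3]) for i in range(930)]
train, test = stratified_split(samples)
print(len(train), len(test))
```

With real data, `samples` would hold the cleaned tweet texts and the researcher's labels; a library routine such as scikit-learn's `train_test_split` with its `stratify` argument would serve the same purpose.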
To check the validity of the model and to test it, the confusion matrix was used.

Findings: First, the machine was trained with 3 algorithms that have been used in many studies on text sentiment analysis. Based on the test results presented in the confusion matrix, the accuracy of the trained machine in assigning text to one of three polarities was determined. Among the three algorithms, support vector machine and random forest performed better than the others. Given that the model’s highest accuracy was approximately 70%, deep learning was employed to assess the potential for achieving improved results. Training with a convolutional neural network and with a hybrid algorithm was then considered. The machine was first trained using a convolutional neural network, and the results showed a predictive accuracy of up to 75%. Next, an attempt was made to improve the predictive accuracy by writing a hybrid algorithm based on the convolutional neural network. In this architecture, two types of data are fed to the neural network as input: the text data, and other features in the dataset, including location, number of retweets, number of likes, city codes, and searched content (as metadata). Using this input and the output (the researcher's classification of text polarity), the machine was trained and finally tested. As depicted in the structure of the hybrid algorithm, the text is assigned a weight of 90% and the metadata a weight of 10%. Different weightings of the two inputs were tried, and the predictive accuracy of the model was checked for each. As the model test results show, the designed algorithm improved the predictive accuracy of the machine by 4%.
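The 90%/10% weighting and the confusion-matrix evaluation above can be illustrated with a small sketch. It rests on a strong assumption: the abstract does not say how the weights enter the hybrid network, so the code below uses one simple reading, a weighted average of class probabilities from a text branch and a metadata branch (late fusion). The functions `fuse`, `predict`, `confusion_matrix`, and `accuracy`, and all probability values, are hypothetical and are not the authors' results.

```python
LABELS = ("negative", "neutral", "positive")


def fuse(text_probs, meta_probs, text_weight=0.9):
    """Weighted average of two class-probability vectors.

    Assumption: text_weight=0.9 mirrors the 90%/10% weighting reported
    in the abstract; the real network may apply it differently.
    """
    meta_weight = 1.0 - text_weight
    return [text_weight * t + meta_weight * m
            for t, m in zip(text_probs, meta_probs)]


def predict(probs):
    """Return the label with the highest fused probability."""
    best = max(range(len(LABELS)), key=lambda i: probs[i])
    return LABELS[best]


def confusion_matrix(y_true, y_pred):
    """Rows are true labels, columns are predicted labels."""
    index = {label: i for i, label in enumerate(LABELS)}
    matrix = [[0] * len(LABELS) for _ in LABELS]
    for t, p in zip(y_true, y_pred):
        matrix[index[t]][index[p]] += 1
    return matrix


def accuracy(matrix):
    """Share of predictions on the diagonal of the confusion matrix."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total if total else 0.0


# Made-up branch outputs for four tweets (negative, neutral, positive).
text_probs = [[0.70, 0.20, 0.10], [0.10, 0.30, 0.60],
              [0.40, 0.38, 0.22], [0.50, 0.30, 0.20]]
meta_probs = [[0.20, 0.50, 0.30], [0.30, 0.40, 0.30],
              [0.10, 0.80, 0.10], [0.30, 0.30, 0.40]]
y_true = ["negative", "positive", "neutral", "positive"]

y_pred = [predict(fuse(t, m)) for t, m in zip(text_probs, meta_probs)]
matrix = confusion_matrix(y_true, y_pred)
print(matrix, accuracy(matrix))
```

In the third toy example the text branch alone slightly favors "negative", and the small metadata weight is enough to tip the fused prediction to "neutral". That is the kind of marginal gain such a weighting could plausibly produce, though the abstract does not describe the mechanism.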
Conclusion: In this article, sentiment analysis based on model-oriented methods - mac