Sentiment analysis of tweets based on document vectors using supervised and semi-supervised learning

Metin Bilgin, İzzet Fatih Şentürk

Abstract


As the Internet plays a growing role in daily life, social media platforms have developed in parallel. Expressing momentary feelings and thoughts through microblogging applications such as Facebook and Twitter has become extremely common, and Twitter is one of the most widely used microblogging sites. Messages shared on Twitter may concern a product or service, or may be a comment about a person. Determining the meaning and sentiment that a comment is intended to convey has become a popular research topic in recent years. Reading thousands of comments about a product or service one by one, interpreting them, and classifying the opinions of the commenters takes considerable time and effort with traditional methods. Thanks to advances in machine learning and deep learning algorithms, and to the parallel development of the computer systems that run them, sentiment classification over millions of data items has become feasible. In this study, sentiment classification was performed on Turkish and English tweets. Using document vectors (Doc2Vec), the study examined both the performance of two document vector methods, DBoW and DM, and the effects of semi-supervised and supervised learning. Results are reported in terms of accuracy, precision, recall, specificity, and F-measure. The semi-supervised learning method achieved better results than supervised learning on both the Turkish and the English datasets.
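The five reported metrics can all be derived from a binary confusion matrix. The following is a minimal sketch of that calculation, not the authors' evaluation code: the function name `binary_metrics`, the label encoding (1 = positive sentiment, 0 = negative), and the sample labels are hypothetical. It also assumes that the F-measure is the balanced F1 score and that the fourth metric is specificity (true negative rate).

```python
from dataclasses import dataclass


@dataclass
class BinaryMetrics:
    accuracy: float
    precision: float
    recall: float
    specificity: float
    f_measure: float


def _safe_div(num: float, den: float) -> float:
    # Return 0.0 instead of raising when a denominator is empty.
    return num / den if den else 0.0


def binary_metrics(y_true, y_pred, positive=1) -> BinaryMetrics:
    """Compute the five metrics from true and predicted labels.

    Assumption: two classes only; `positive` marks the positive class.
    """
    tp = fp = tn = fn = 0
    for t, p in zip(y_true, y_pred):
        if p == positive and t == positive:
            tp += 1
        elif p == positive and t != positive:
            fp += 1
        elif p != positive and t != positive:
            tn += 1
        else:
            fn += 1

    precision = _safe_div(tp, tp + fp)
    recall = _safe_div(tp, tp + fn)
    return BinaryMetrics(
        accuracy=_safe_div(tp + tn, tp + tn + fp + fn),
        precision=precision,
        recall=recall,
        specificity=_safe_div(tn, tn + fp),
        # Balanced F1: harmonic mean of precision and recall (assumed).
        f_measure=_safe_div(2 * precision * recall, precision + recall),
    )


# Hypothetical labels for ten tweets (1 = positive, 0 = negative).
y_true = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 1, 0, 0, 0, 0, 0]
metrics = binary_metrics(y_true, y_pred)
print(metrics)
```

In this sample there are 2 true positives, 2 false negatives, 1 false positive, and 5 true negatives, so recall (0.5) and specificity (5/6) diverge noticeably, which is why reporting both alongside accuracy is informative on imbalanced tweet data.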

Copyright (c) 2019 Metin Bilgin, İzzet Fatih Şentürk

Creative Commons License
This work is licensed under a Creative Commons Attribution-NonCommercial 4.0 International License.