Slovene Sentiment Lexicon KSS

Version 1.1, 2017-04-14

The Slovene opinion lexicon is based on a manual translation of the opinion lexicon of Hu & Liu (2004), extended with some positive and negative words typical of the Slovene language. There are three versions of the lexicon.
Each version consists of two plain-text files, one for negative and one for positive words, with one word per line. The lexicon also contains some multi-word units, in which the individual words are joined with an underscore, e.g. "bolezenska_znamenja".
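
As an illustration of the file format described above, here is a minimal Python sketch that reads such a word list and counts lexicon matches in a tokenized text. It is not part of the lexicon distribution. The function names, the UTF-8 encoding, the lowercasing, the three-token limit on multi-word units, and the toy entries other than "bolezenska_znamenja" are all assumptions made for the example, and placing "bolezenska_znamenja" in the negative list is also only a guess.

```python
from pathlib import Path


def parse_lexicon(text):
    """Return the set of entries in a one-word-per-line lexicon file."""
    return {line.strip().lower() for line in text.splitlines() if line.strip()}


def load_lexicon(path, encoding="utf-8"):
    """Read a lexicon file from disk (path and encoding are assumptions)."""
    return parse_lexicon(Path(path).read_text(encoding=encoding))


def count_matches(tokens, lexicon, max_len=3):
    """Count lexicon entries in a token list, trying the longest match first.

    Multi-word units are matched by joining adjacent tokens with "_",
    mirroring the underscore convention used in the lexicon files.
    """
    tokens = [t.lower() for t in tokens]
    hits, i = 0, 0
    while i < len(tokens):
        for n in range(min(max_len, len(tokens) - i), 0, -1):
            if "_".join(tokens[i:i + n]) in lexicon:
                hits += 1
                i += n  # skip past the matched unit
                break
        else:
            i += 1  # no entry starts at this token
    return hits


def polarity(tokens, positive, negative):
    """Positive minus negative match count: a simple, unweighted score."""
    return count_matches(tokens, positive) - count_matches(tokens, negative)


# Toy word lists for illustration only, not taken from the actual files.
positive = parse_lexicon("dober\nodličen\n")
negative = parse_lexicon("slab\nbolezenska_znamenja\n")
print(polarity(["Dober", "film"], positive, negative))
```

With the real files, `load_lexicon` would be called once for the positive list and once for the negative list. The file names are not given here, so they must be taken from the downloaded archive.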

The lexicon can be downloaded from the Clarin.si repository.

Empirical evaluation on a corpus of web commentaries

The lexicon was developed as part of a BSc thesis (Kadunc, 2016) and empirically evaluated on KKS, a Slovene corpus of web commentaries. The commentaries cover different topics (business, politics, sport, and others) and come from four Slovene web portals (RtvSlo, 24ur, Finance, Reporter).

References:

  1. Minqing Hu and Bing Liu (2004). Mining opinion features in customer reviews. In Proceedings of AAAI Conference on Artificial Intelligence, vol. 4, pp. 755–760.
  2. Klemen Kadunc (2016). Določanje sentimenta slovenskim spletnim komentarjem s pomočjo strojnega učenja. BSc thesis. University of Ljubljana, Faculty of Computer and Information Science (in Slovene).
  3. Klemen Kadunc and Marko Robnik-Šikonja (2016). Analiza mnenj s pomočjo strojnega učenja in slovenskega leksikona sentimenta. In Conference on Language Technologies & Digital Humanities, Ljubljana (in Slovene).

