The Slovene opinion lexicon is based on the manually translated
opinion lexicon of Hu & Liu (2004). It has been extended
with some positive and negative words typical of the Slovene
language. There are three versions of the lexicon:
1. All word forms, extended with Sloleks, a lexicon of Slovene
   word forms. It contains 90,620 entries: 62,941 negative word
   forms and 27,679 positive word forms.
2. Only lemmas, containing 5,125 negative words and 1,911
   positive words.
3. The original version used in Kadunc & Robnik-Šikonja
   (2016), containing 6,687 negative entries and 2,645
   positive entries.
Each version of the lexicon consists of two plain-text files, one for
negative and one for positive words, with one word per line.
The lexicon also contains some multi-word units where the
individual words are joined with an underscore, e.g.
"bolezenska_znamenja".
Empirical evaluation on a corpus of web commentaries
The lexicon was developed as part of a BSc thesis (Kadunc, 2016) and
empirically evaluated on KKS, a Slovene corpus of web commentaries.
The commentaries cover different topics (business, politics, sport,
and other) and come from four Slovene web portals (RtvSlo, 24ur,
Finance, Reporter).