Dataset for Targeted Sentiment Analysis in Turkish

Publisher

Boğaziçi University

Description

This dataset contains 3440 public Turkish tweets about six different brands, with timestamps spanning the six-month period from January 2020 to June 2020. The tweets were collected through the official Twitter API by searching separately for each of the six targets, which were selected from well-known companies and brands. The dataset is manually annotated with three labels: positive, negative, and neutral. Two factors are considered in the annotation process, namely sentence sentiment and targeted sentiment, so each tweet has two labels. The sentence sentiment label expresses the overall sentiment of the sentence, regardless of the target word, as in traditional sentiment analysis. The targeted sentiment label, by contrast, reflects the sentiment toward the target in that sentence. The dataset is split into train, validation, and test sets containing 2200, 548, and 692 tweets, respectively.
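
The two-label scheme and the split sizes described above can be sketched in Python as follows. This is a minimal illustration under assumptions, not the dataset's actual schema: the class name, the field names, and the example tweet are hypothetical, and the description does not specify the file format or column names.

```python
from dataclasses import dataclass
from typing import Dict, Literal

# The three annotation labels named in the description.
Label = Literal["positive", "negative", "neutral"]


@dataclass(frozen=True)
class AnnotatedTweet:
    """One annotated tweet. All field names here are hypothetical."""
    text: str
    target: str                  # the brand the tweet was retrieved for
    sentence_sentiment: Label    # overall sentiment, ignoring the target
    targeted_sentiment: Label    # sentiment toward the target only


# Split sizes and total exactly as stated in the description.
SPLIT_SIZES: Dict[str, int] = {"train": 2200, "validation": 548, "test": 692}
TOTAL_TWEETS = 3440


def split_fractions(sizes: Dict[str, int]) -> Dict[str, float]:
    """Return each split's share of the total number of tweets."""
    total = sum(sizes.values())
    return {name: count / total for name, count in sizes.items()}


# The three splits account for every tweet in the dataset.
assert sum(SPLIT_SIZES.values()) == TOTAL_TWEETS

# Invented example (not taken from the dataset): the two labels can differ
# when the tweet as a whole is positive but the target is criticized.
example = AnnotatedTweet(
    text="Great trip overall, but BrandX lost my luggage again.",
    target="BrandX",
    sentence_sentiment="positive",
    targeted_sentiment="negative",
)
```

Keeping the two labels as separate fields makes it straightforward to train or evaluate a sentence-level model and a target-level model on the same records.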

Creative Commons license

Except where otherwise noted, this item's license is described as The MIT License (MIT)