A framework for automated detection of offensive messages in social networks in Kiswahili

dc.contributor.authorBarongo, Everyjustus
dc.date.accessioned2018-10-15T17:55:53Z
dc.date.available2018-10-15T17:55:53Z
dc.date.issued2017
dc.descriptionDissertation (MSc Computer Science)en_US
dc.description.abstractThe diffusion of information generated on social networking sites is the result of more people being connected. Connected people chat and comment by posting content such as images, videos, and messages. Social networks have been, and remain, useful to communities in that they bring relatives together, especially for sharing experiences and feelings. Although social networks have benefited users, some of the shared messages and comments contain sexual and political harassment. This is particularly true in Kiswahili-speaking countries such as Tanzania. On most, if not all, Kiswahili social networking sites, offensive messages are posted publicly. These messages harass, embarrass, and even assault users, and to some extent cause psychological harm. This study proposes a framework for automating the detection of offensive messages on social networks in Kiswahili settings by applying selected machine learning techniques. Specifically, the study created a Kiswahili dataset containing sexually and politically offensive messages and normal messages. All of these messages were collected from Facebook, YouTube, and JamiiForum and were used to evaluate the performance of the selected text classification algorithms. The collected messages were preprocessed using the Bag-of-Words (BoW) model, Term Frequency-Inverse Document Frequency (TF-IDF), and N-gram techniques to generate feature vectors. The experimental findings using the generated feature vectors showed that the Random Forest classifier assigned messages to the correct class label with an accuracy of 95.0259%, an F1-measure of 0.950 (95.0%), and a false positive rate of 2.8% when applied to the three-category dataset. On the other hand, the SVM-Linear classifier showed better results when applied to the two-category data.
The study suggests that the REST API-based framework, with the Random Forest classifier and the Kiswahili dataset, be deployed in real social neten_US
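The feature-generation step named in the abstract (Bag-of-Words counts over word N-grams, weighted by TF-IDF) can be illustrated with a minimal sketch. This is not the dissertation's code: the function names `ngrams` and `tfidf_vectors`, the whitespace tokenizer, the unigram-plus-bigram range, the smoothed IDF formula, and the three placeholder messages are all assumptions made for illustration, and the record does not state which toolkit or parameter settings the study used.

```python
import math
from collections import Counter


def ngrams(text, n_max=2):
    """Lowercase, split on whitespace, and return all word n-grams up to n_max."""
    tokens = text.lower().split()
    grams = []
    for n in range(1, n_max + 1):
        grams.extend(" ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1))
    return grams


def tfidf_vectors(messages, n_max=2):
    """Return (vocabulary, vectors) with one TF-IDF weighted vector per message."""
    # Bag-of-Words step: count each n-gram per message, ignoring word order.
    counts = [Counter(ngrams(m, n_max)) for m in messages]
    vocabulary = sorted(set().union(*counts))
    n_docs = len(messages)
    # Document frequency: number of messages that contain each term.
    df = Counter(term for c in counts for term in c)
    # Smoothed inverse document frequency (an assumed variant, not the study's).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1.0 for t in vocabulary}
    vectors = []
    for c in counts:
        total = sum(c.values()) or 1  # guard against empty messages
        vectors.append([c[t] / total * idf[t] for t in vocabulary])
    return vocabulary, vectors


# Placeholder greetings for illustration only; not taken from the study's dataset.
messages = ["habari ya leo", "habari za asubuhi", "asante sana"]
vocabulary, vectors = tfidf_vectors(messages)
print(len(vocabulary), "features per message")
```

Each resulting vector could then be passed, together with its class label, to a classifier such as Random Forest or a linear SVM. Terms that appear in many messages (here `habari`) receive a lower weight than terms confined to a single message.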
dc.identifier.citationBarongo, E. (2017). A framework for automated detection of offensive messages in social networks in Kiswahili. Dodoma: The University of Dodoma.en_US
dc.identifier.urihttp://hdl.handle.net/20.500.12661/507
dc.language.isoenen_US
dc.publisherThe University of Dodomaen_US
dc.subjectSocial Mediaen_US
dc.subjectKiswahilien_US
dc.subjectJamii Forumen_US
dc.subjectSocial networksen_US
dc.subjectSocial mediaen_US
dc.subjectOffensive messagesen_US
dc.subjectSexual offensive messagesen_US
dc.subjectPolitical offensive messagesen_US
dc.subjectSexual harassmenten_US
dc.subjectPolitical harassmenten_US
dc.subjectHarassmenten_US
dc.subjectoffensive messagesen_US
dc.titleA framework for automated detection of offensive messages in social networks in Kiswahilien_US
dc.typeDissertationen_US
Files
Original bundle
Name: EVERYJUSTUS BARONGO.pdf
Size: 1.1 MB
Format: Adobe Portable Document Format