A framework for automated detection of offensive messages in social networks in Kiswahili

Barongo, Everyjustus

dc.contributor.author	Barongo, Everyjustus
dc.date.accessioned	2018-10-15T17:55:53Z
dc.date.available	2018-10-15T17:55:53Z
dc.date.issued	2017
dc.description	Dissertation (MSc Computer Science)	en_US
dc.description.abstract	The diffusion of information generated on social network sites is the result of more people being connected. Connected people chat and comment by posting content such as images, videos, and messages. Social networks have been, and remain, useful to communities in that they bring relatives together, especially in sharing experiences and feelings. Although social networks have been beneficial to users, some of the shared messages and comments contain sexual and political harassment. This is particularly the case in Kiswahili-speaking countries like Tanzania. On most, if not all, Kiswahili social network sites, offensive messages have been and are publicly posted. These messages harass, embarrass, and even assault users and to some extent lead to psychological effects. This study proposes a framework for automating the detection of offensive messages on social networks in Kiswahili settings by applying selected machine learning techniques. Specifically, the study created a Kiswahili dataset containing sexual and political offensive messages and normal messages. All of these messages were collected from Facebook, YouTube, and JamiiForum, and they were used to evaluate the performance of the selected text classification algorithms. The collected messages were preprocessed using the Bag-of-Words (BoW) model, Term Frequency-Inverse Document Frequency (TF-IDF), and N-gram techniques to generate feature vectors. The experimental findings using the generated feature vectors showed that the Random Forest classifier assigned messages to the correct class label with an accuracy of 95.0259%, an F1-measure of 0.950 (95.0%), and a false positive rate of 2.8% when applied to the three-category dataset. On the other hand, SVM-Linear showed better results when applied to the two-category data.
The study suggests that the REST API-based framework, with the Random Forest classifier and the Kiswahili dataset, be deployed in real social net	en_US
dc.identifier.citation	Barongo, E. (2017). A framework for automated detection of offensive messages in social networks in Kiswahili. Dodoma: The University of Dodoma.	en_US
dc.identifier.uri	http://hdl.handle.net/20.500.12661/507
dc.language.iso	en	en_US
dc.publisher	The University of Dodoma	en_US
dc.subject	Social Media	en_US
dc.subject	Kiswahili	en_US
dc.subject	Jamii Forum	en_US
dc.subject	Social networks	en_US
dc.subject	Offensive messages	en_US
dc.subject	Sexual offensive messages	en_US
dc.subject	Political offensive messages	en_US
dc.subject	Sexual harassment	en_US
dc.subject	Political harassment	en_US
dc.subject	Harassment	en_US
dc.title	A framework for automated detection of offensive messages in social networks in Kiswahili	en_US
dc.type	Dissertation	en_US
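
The abstract describes generating feature vectors from messages with Bag-of-Words, TF-IDF, and N-grams. The following is a minimal sketch of that idea only, not the dissertation's implementation: the whitespace tokenizer, the unigram-plus-bigram range, the `tf * (ln(N / df) + 1)` weighting, and the function names `word_ngrams` and `tfidf_vectors` are all assumptions, and the two sample strings are ordinary Kiswahili greetings chosen for illustration, not items from the study's dataset. The classification step (Random Forest or SVM-Linear) is not shown.

```python
import math
from collections import Counter


def word_ngrams(text, n_min=1, n_max=2):
    """Lowercase, split on whitespace, and return word n-grams (assumed tokenizer)."""
    tokens = text.lower().split()
    grams = []
    for n in range(n_min, n_max + 1):
        for i in range(len(tokens) - n + 1):
            grams.append(" ".join(tokens[i:i + n]))
    return grams


def tfidf_vectors(docs, n_min=1, n_max=2):
    """Return (vocabulary, vectors) weighted by tf * (ln(N / df) + 1).

    This weighting is one common TF-IDF variant, assumed here; the
    abstract does not state which variant the study used.
    """
    # Bag-of-Words step: count each n-gram per document.
    bags = [Counter(word_ngrams(d, n_min, n_max)) for d in docs]
    vocab = sorted(set().union(*bags))
    n_docs = len(docs)
    # Document frequency: number of documents containing each term.
    df = {term: sum(1 for bag in bags if term in bag) for term in vocab}
    vectors = []
    for bag in bags:
        total = sum(bag.values()) or 1  # guard against empty documents
        vectors.append([
            (bag[term] / total) * (math.log(n_docs / df[term]) + 1.0)
            for term in vocab
        ])
    return vocab, vectors


# Illustrative placeholder messages (not from the study's dataset).
messages = ["habari ya leo", "habari za asubuhi"]
vocab, vectors = tfidf_vectors(messages)
```

Each row of `vectors` could then be passed, together with its class label, to whichever classifier is being evaluated.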

Files

Original bundle

Name: EVERYJUSTUS BARONGO.pdf
Size: 1.1 MB
Format: Adobe Portable Document Format

Collections

Master Dissertations