Amharic Language Hate Speech Detection on Social Media

Beyene Kassa Wondie; Ermias Melku Tadesse; Tarekegn Walle Yirdaw

doi:10.11648/j.ajai.20250901.12

Research Article | Peer-Reviewed

Published in American Journal of Artificial Intelligence (Volume 9, Issue 1)

Received: 10 March 2025 | Accepted: 31 March 2025 | Published: 9 May 2025

Abstract

Social media platforms enable rapid communication, information sharing, and opinion expression. However, their misuse for hate speech targeting race, religion, and political differences has become a growing concern. The issue is particularly sensitive for underrepresented languages such as Amharic, the working language of Ethiopia and the second most widely spoken Semitic language after Arabic. This study addresses the challenge of detecting hate speech in Amharic text by analyzing posts and comments from Facebook, YouTube, and Twitter. A dataset of 7,590 labeled entries was collected using the Facepager tool, covering hate speech related to race, religion, and politics, as well as neutral content. The dataset was annotated with the guidance of researchers, legal experts, and language specialists. Preprocessing techniques, including data cleaning, tokenization, and normalization, were applied, and feature extraction was performed using embedding layers. The dataset was split into training (80%), validation (10%), and testing (10%) sets. Five deep learning models (LSTM, BiLSTM, GRU, BiGRU, and RoBERTa) were developed and evaluated using precision, recall, F1-score, and accuracy. The RoBERTa model outperformed the others, achieving an accuracy of 91%. This research highlights the effectiveness of advanced deep learning techniques in detecting Amharic hate speech, offering a valuable tool for mitigating this critical issue in Ethiopian social media contexts.
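
As a rough illustration of the evaluation protocol summarized in the abstract, the following Python sketch shows how an 80/10/10 split of 7,590 entries and the four reported metrics could be computed. It is a minimal sketch and not the authors' implementation: the function names `split_indices` and `evaluate`, the random seed, the label strings, and the toy predictions are all assumptions made for illustration.

```python
import random
from collections import Counter


def split_indices(n, train=0.8, val=0.1, seed=0):
    """Shuffle range(n) and cut it into train/validation/test index lists.

    The seed and the plain random shuffle are assumptions; the abstract
    only states the 80/10/10 proportions.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)
    n_train = round(n * train)
    n_val = round(n * val)
    return idx[:n_train], idx[n_train:n_train + n_val], idx[n_train + n_val:]


def evaluate(y_true, y_pred):
    """Return overall accuracy and per-class precision, recall and F1."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # predicted class p, but it was wrong
            fn[t] += 1  # true class t was missed
    per_class = {}
    for c in labels:
        precision = tp[c] / (tp[c] + fp[c]) if tp[c] + fp[c] else 0.0
        recall = tp[c] / (tp[c] + fn[c]) if tp[c] + fn[c] else 0.0
        f1 = (2 * precision * recall / (precision + recall)
              if precision + recall else 0.0)
        per_class[c] = {"precision": precision, "recall": recall, "f1": f1}
    accuracy = sum(tp.values()) / len(y_true)
    return accuracy, per_class


# Split sizes for a dataset of 7,590 entries.
train_idx, val_idx, test_idx = split_indices(7590)
print(len(train_idx), len(val_idx), len(test_idx))  # 6072 759 759

# Toy labels (made up) using the four categories named in the abstract.
y_true = ["race", "religion", "politics", "neutral", "neutral", "race"]
y_pred = ["race", "religion", "neutral", "neutral", "neutral", "politics"]
accuracy, per_class = evaluate(y_true, y_pred)
print(round(accuracy, 3), per_class["neutral"])
```

With these proportions the 7,590 entries divide exactly into 6,072 training, 759 validation, and 759 test examples. Per-class scores are worth reporting alongside accuracy because a single accuracy figure can hide weak performance on a minority class.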

Published in	American Journal of Artificial Intelligence (Volume 9, Issue 1)
DOI	10.11648/j.ajai.20250901.12
Page(s)	16-21
Creative Commons	This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright	Copyright © The Author(s), 2025. Published by Science Publishing Group

Keywords

Ge’ez, Fidel, LSTM, BiLSTM, GRU, BiGRU

Cite This Article

APA Style

Wondie, B. K., Tadesse, E. M., & Yirdaw, T. W. (2025). Amharic Language Hate Speech Detection on Social Media. American Journal of Artificial Intelligence, 9(1), 16-21. https://doi.org/10.11648/j.ajai.20250901.12

Author Information

Beyene Kassa Wondie

Department of Information System, Kombolcha Institute of Technology, Wollo University, Kombolcha, Ethiopia

Ermias Melku Tadesse

Department of Information Technology, Kombolcha Institute of Technology, Wollo University, Kombolcha, Ethiopia

ORCID: http://orcid.org/0009-0005-1187-9190

Tarekegn Walle Yirdaw

Department of Information System, Kombolcha Institute of Technology, Wollo University, Kombolcha, Ethiopia
