Predicting Customer Churn in the Telecommunications Industry using Machine Learning Techniques

Adeline Makokha; Kevin Obote; Henry Muchiri; Kennedy Senagi

doi:10.11648/j.ajnc.20261501.12

Research Article | Peer-Reviewed

Published in American Journal of Networks and Communications (Volume 15, Issue 1)

Received: 11 February 2026 Accepted: 8 April 2026 Published: 24 April 2026

Abstract

Voluntary customer churn constitutes a persistent financial risk for telecommunications operators, particularly within enterprise customer segments where high-value accounts administer complex, multi-subscription portfolios. Industry data indicate that acquiring a new account costs between five and seven times more than retaining an existing one. Despite heightened industry awareness, the majority of operational retention platforms remain reactive, detecting departure only after the event has occurred. This investigation constructs and evaluates a machine learning pipeline engineered to identify enterprise customer churn risk proactively, drawing on authentic operational records extracted from a business-to-business telecommunications environment. The study follows the Cross-Industry Standard Process for Data Mining (CRISP-DM) lifecycle. A dataset of 8,454 unique business accounts, characterised by 14 raw attributes and enriched to a final 22-variable feature set, underpins the empirical work. Pronounced class imbalance, with churned accounts representing approximately 6.5% of records (a majority-to-minority ratio of 14.3:1), necessitated specialised resampling prior to classifier training. Five oversampling strategies were benchmarked; SVMSMOTE produced the largest gain in minority-class sensitivity and was adopted for all subsequent training cycles. Ten classifier families were trained and assessed: EasyEnsembleClassifier, RUSBoostClassifier, XGBoost, LightGBM, CatBoost, Histogram Gradient Boosting, Balanced Bagging, a multilayer perceptron, a soft-voting ensemble, and a stacking ensemble. EasyEnsembleClassifier emerged as the leading model, attaining an F1-score of 0.129 and a recall of 38.2% (42 of 110 churned accounts). Post-hoc explainability analysis through SHAP and LIME identified active subscriber rate, geographic billing zone, and engineered interaction terms as the dominant predictive signals.
The framework was operationalised within a FastAPI-based application supporting real-time individual scoring, batch CSV prediction, and retention campaign monitoring. The projected annual revenue protection under conservative assumptions exceeds 74,000 currency units. The study illustrates that interpretable, explainability-augmented machine learning frameworks can bridge the gap between quantitative model output and managerial action, offering a replicable blueprint for data-driven churn governance in both emerging and mature telecommunications markets.
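
The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not taken from the authors' pipeline. It assumes that the recall corresponds to 42 of 110 churned accounts, and that the F1-score and recall refer to the churn class on the same evaluation set. The precision and the number of flagged accounts are derived here from the rounded published values; they are not reported in the abstract. The function names are hypothetical.

```python
"""Sanity-check arithmetic on the figures quoted in the abstract."""


def minority_share(majority_to_minority: float) -> float:
    """Fraction of minority-class rows implied by a majority:minority ratio."""
    return 1.0 / (1.0 + majority_to_minority)


def precision_from_f1(f1: float, recall: float) -> float:
    """Solve F1 = 2PR / (P + R) for precision P, giving P = F1*R / (2R - F1)."""
    denom = 2.0 * recall - f1
    if denom <= 0:
        raise ValueError("F1 cannot be as large as 2 * recall")
    return f1 * recall / denom


# Values quoted in the abstract (rounded as published).
share = minority_share(14.3)                  # 14.3:1 majority-to-minority ratio
recall = 42 / 110                             # assumed reading: 42 of 110 churners
precision = precision_from_f1(0.129, recall)  # derived, not reported in the source
flagged = 42 / precision                      # implied number of accounts flagged

print(f"minority share:           {share:.1%}")
print(f"recall:                   {recall:.1%}")
print(f"implied precision:        {precision:.1%}")
print(f"implied flagged accounts: {flagged:.0f}")
```

Under these assumptions the 14.3:1 ratio is consistent with a churn share of about 6.5%, and 42 of 110 is consistent with a recall of about 38.2%. The implied precision of roughly 8% suggests that most flagged accounts would be false alarms, which is worth bearing in mind when sizing a retention campaign around the model's output.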

Published in	American Journal of Networks and Communications (Volume 15, Issue 1)
DOI	10.11648/j.ajnc.20261501.12
Page(s)	10-26
Creative Commons	This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright	Copyright © The Author(s), 2026. Published by Science Publishing Group

Keywords

Customer Churn Prediction, Telecommunications, Ensemble Learning, EasyEnsembleClassifier, SVMSMOTE, Class Imbalance, Feature Engineering, Model Explainability, CRISP-DM, FastAPI Dashboard

Cite This Article

APA Style

Makokha, A., Obote, K., Muchiri, H., Senagi, K. (2026). Predicting Customer Churn in the Telecommunications Industry using Machine Learning Techniques. American Journal of Networks and Communications, 15(1), 10-26. https://doi.org/10.11648/j.ajnc.20261501.12

ACS Style

Makokha, A.; Obote, K.; Muchiri, H.; Senagi, K. Predicting Customer Churn in the Telecommunications Industry using Machine Learning Techniques. Am. J. Netw. Commun. 2026, 15(1), 10-26. doi: 10.11648/j.ajnc.20261501.12

AMA Style

Makokha A, Obote K, Muchiri H, Senagi K. Predicting Customer Churn in the Telecommunications Industry using Machine Learning Techniques. Am J Netw Commun. 2026;15(1):10-26. doi: 10.11648/j.ajnc.20261501.12

@article{10.11648/j.ajnc.20261501.12,
  author = {Adeline Makokha and Kevin Obote and Henry Muchiri and Kennedy Senagi},
  title = {Predicting Customer Churn in the Telecommunications Industry using Machine Learning Techniques},
  journal = {American Journal of Networks and Communications},
  volume = {15},
  number = {1},
  pages = {10-26},
  doi = {10.11648/j.ajnc.20261501.12},
  url = {https://doi.org/10.11648/j.ajnc.20261501.12},
  eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajnc.20261501.12},
  year = {2026}
}

TY  - JOUR
T1  - Predicting Customer Churn in the Telecommunications Industry using Machine Learning Techniques
AU  - Adeline Makokha
AU  - Kevin Obote
AU  - Henry Muchiri
AU  - Kennedy Senagi
Y1  - 2026/04/24
PY  - 2026
N1  - https://doi.org/10.11648/j.ajnc.20261501.12
DO  - 10.11648/j.ajnc.20261501.12
T2  - American Journal of Networks and Communications
JF  - American Journal of Networks and Communications
JO  - American Journal of Networks and Communications
SP  - 10
EP  - 26
PB  - Science Publishing Group
SN  - 2326-8964
UR  - https://doi.org/10.11648/j.ajnc.20261501.12
VL  - 15
IS  - 1
ER  -

Author Information

Adeline Makokha
Strathmore Institute of Mathematical Sciences, Strathmore University, Nairobi, Kenya
ORCID: http://orcid.org/0009-0006-5328-6198

Kevin Obote
Strathmore Institute of Mathematical Sciences, Strathmore University, Nairobi, Kenya
ORCID: http://orcid.org/0009-0000-7099-2154

Henry Muchiri
Strathmore Institute of Mathematical Sciences, Strathmore University, Nairobi, Kenya
ORCID: http://orcid.org/0000-0003-1773-6674

Kennedy Senagi
Strathmore Institute of Mathematical Sciences, Strathmore University, Nairobi, Kenya
ORCID: http://orcid.org/0000-0002-0757-3907
