Research Article | Peer-Reviewed

Predicting Customer Churn in the Telecommunications Industry using Machine Learning Techniques

Received: 11 February 2026     Accepted: 8 April 2026     Published: 24 April 2026
Abstract

Voluntary customer churn is a persistent financial risk for telecommunications operators, particularly in enterprise segments where high-value accounts administer complex, multi-subscription portfolios. Industry data indicate that acquiring a new account costs between five and seven times more than retaining an existing one. Despite heightened industry awareness, most operational retention platforms remain reactive, detecting departure only after it has occurred. This investigation constructs and evaluates a machine learning pipeline engineered to identify enterprise customer churn risk proactively, drawing on authentic operational records extracted from a business-to-business telecommunications environment. The study follows the Cross-Industry Standard Process for Data Mining (CRISP-DM) lifecycle. A dataset of 8,454 unique business accounts, characterised by 14 raw attributes and enriched to a final 22-variable feature set, underpins the empirical work. Pronounced class imbalance, with churned accounts representing approximately 6.5% of the dataset (a majority-to-minority ratio of 14.3:1), necessitated specialised resampling prior to classifier training. Five oversampling strategies were benchmarked; SVMSMOTE produced the largest gain in minority-class sensitivity and was adopted for all subsequent training cycles. Ten classifiers were trained and assessed: EasyEnsembleClassifier, RUSBoostClassifier, XGBoost, LightGBM, CatBoost, Histogram Gradient Boosting, Balanced Bagging, a multilayer perceptron, a soft-voting ensemble, and a stacking ensemble. EasyEnsembleClassifier emerged as the leading model, attaining an F1-score of 0.129 and a recall of 38.2% (42 of 110 churned accounts). Post-hoc explainability analysis through SHAP and LIME identified active subscriber rate, geographic billing zone, and engineered interaction terms as the dominant predictive signals.
The framework was operationalised within a FastAPI-based application supporting real-time individual scoring, batch CSV prediction, and retention campaign monitoring. The projected annual revenue protection under conservative assumptions exceeds 74,000 currency units. The study illustrates that interpretable, explainability-augmented machine learning frameworks can bridge the gap between quantitative model output and managerial action, offering a replicable blueprint for data-driven churn governance in both emerging and mature telecommunications markets.
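
The headline figures in the abstract can be cross-checked with a few lines of arithmetic. The sketch below is illustrative only and is not the authors' code. It assumes the abstract's figures read as a 14.3:1 majority-to-minority ratio, 42 of 110 churned accounts recovered, and an F1-score of 0.129 measured for the churn class on the same test set and decision threshold. The function names are hypothetical.

```python
# Cross-check of the abstract's reported figures (illustrative sketch, not the authors' code).
# Assumed inputs: 14.3:1 class ratio, 42 of 110 churners recovered, F1 = 0.129.


def minority_share(majority_to_minority: float) -> float:
    """Fraction of records in the minority class for a ratio of `majority_to_minority`:1."""
    return 1.0 / (1.0 + majority_to_minority)


def recall(true_positives: int, actual_positives: int) -> float:
    """Share of actual positives that the model recovered."""
    return true_positives / actual_positives


def implied_precision(f1: float, rec: float) -> float:
    """Solve F1 = 2PR / (P + R) for P, given F1 and recall R."""
    denominator = 2.0 * rec - f1
    if denominator <= 0:
        raise ValueError("F1 and recall are inconsistent: no valid precision exists")
    return f1 * rec / denominator


share = minority_share(14.3)         # churn share implied by the class ratio
rec = recall(42, 110)                # recall implied by the raw counts
prec = implied_precision(0.129, rec) # precision implied by F1 and recall together

print(f"minority share:    {share:.1%}")
print(f"recall:            {rec:.1%}")
print(f"implied precision: {prec:.1%}")
```

Under these assumptions the ratio and the counts agree with the quoted 6.5% and 38.2%, and the implied precision comes out at roughly 8%. That value is a derived estimate, not a figure reported by the study.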

Published in American Journal of Networks and Communications (Volume 15, Issue 1)
DOI 10.11648/j.ajnc.20261501.12
Page(s) 10-26
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2026. Published by Science Publishing Group

Keywords

Customer Churn Prediction, Telecommunications, Ensemble Learning, EasyEnsembleClassifier, SVMSMOTE, Class Imbalance, Feature Engineering, Model Explainability, CRISP-DM, FastAPI Dashboard

Cite This Article
  • APA Style

    Makokha, A., Obote, K., Muchiri, H., Senagi, K. (2026). Predicting Customer Churn in the Telecommunications Industry using Machine Learning Techniques. American Journal of Networks and Communications, 15(1), 10-26. https://doi.org/10.11648/j.ajnc.20261501.12

    ACS Style

    Makokha, A.; Obote, K.; Muchiri, H.; Senagi, K. Predicting Customer Churn in the Telecommunications Industry using Machine Learning Techniques. Am. J. Netw. Commun. 2026, 15(1), 10-26. doi: 10.11648/j.ajnc.20261501.12

    AMA Style

    Makokha A, Obote K, Muchiri H, Senagi K. Predicting Customer Churn in the Telecommunications Industry using Machine Learning Techniques. Am J Netw Commun. 2026;15(1):10-26. doi: 10.11648/j.ajnc.20261501.12

  • @article{10.11648/j.ajnc.20261501.12,
      author = {Adeline Makokha and Kevin Obote and Henry Muchiri and Kennedy Senagi},
      title = {Predicting Customer Churn in the Telecommunications Industry using Machine Learning Techniques},
      journal = {American Journal of Networks and Communications},
      volume = {15},
      number = {1},
      pages = {10-26},
      doi = {10.11648/j.ajnc.20261501.12},
      url = {https://doi.org/10.11648/j.ajnc.20261501.12},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajnc.20261501.12},
     year = {2026}
    }
    

  • TY  - JOUR
    T1  - Predicting Customer Churn in the Telecommunications Industry using Machine Learning Techniques
    
    AU  - Adeline Makokha
    AU  - Kevin Obote
    AU  - Henry Muchiri
    AU  - Kennedy Senagi
    Y1  - 2026/04/24
    PY  - 2026
    N1  - https://doi.org/10.11648/j.ajnc.20261501.12
    DO  - 10.11648/j.ajnc.20261501.12
    T2  - American Journal of Networks and Communications
    JF  - American Journal of Networks and Communications
    JO  - American Journal of Networks and Communications
    SP  - 10
    EP  - 26
    PB  - Science Publishing Group
    SN  - 2326-8964
    UR  - https://doi.org/10.11648/j.ajnc.20261501.12
    VL  - 15
    IS  - 1
    ER  - 
