Research Article | Peer-Reviewed

Estimating Probabilities and Odds Ratios in Gestational Outcomes: A Dummy Variable Regression Illustration

Received: 10 September 2025     Accepted: 23 September 2025     Published: 20 December 2025
Abstract

A dummy variable regression model assumes a linear relationship between the categorical predictor variables and the outcome variable. Its regression coefficients may also not be directly interpretable as changes in probability or as odds ratios, which can limit the usefulness of the model. This might not hold true if the gestation length is dichotomous. This study explores the use of dummy variable regression, as an alternative to logistic regression, for estimating probabilities, odds and odds ratios in gestational outcomes when the outcome variable is continuous. The method first partitions each parent independent variable into a set of mutually exclusive categories or subgroups and then uses dummy variables to represent these categories in a regression model. In such a model, each parent independent variable is represented by dummy variables of 1’s and 0’s, one fewer than the number of its categories. Any level of a parent independent variable that is not specifically represented by a dummy variable is referred to as the excluded level of that parent variable, while the others are termed the included levels in the regression model. The approach is limited by its assumptions: linearity, independence of observations, continuity of the outcome variable and the categorical nature of the predictor variables must all be met. A pilot cross-sectional study was carried out at Alex Ekwueme Federal University Teaching Hospital, Abakaliki, where data on age, parity and sex of last birth were collected from 41 antenatal women. The overall analysis gave R² = 0.066 (95% CL = 1.234-2.765, p-value = 0.780), indicating no significant relationship between the outcome variable and the categorical predictor variables, which would ordinarily signal the end of the analysis. For illustration purposes only, we estimated probabilities and used them to estimate any desired odds and odds ratios. For example, the odds that a randomly selected mother has a male or a female birth with a gestation length of more than 39.5 weeks were 0.636 and 0.639 respectively, and the odds ratio was 0.995, implying that for every 1000 female births with a gestation length of more than 39.5 weeks, there are 995 male births with a gestation length of more than 39.5 weeks. We conclude that dummy variable regression enables one to estimate probabilities and odds ratios for a continuous outcome variable and so compares favorably with logistic regression.
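The coding scheme and the odds arithmetic described in the abstract can be sketched in a few lines of Python. This is a minimal illustration, not the authors' implementation: the age groups are hypothetical, the function names are made up for this sketch, and the normal-error step in `prob_exceeds` is an assumption, since the abstract does not say how the probabilities were obtained from the fitted regression. Only the two odds values (0.636 and 0.639) are taken from the abstract.

```python
from statistics import NormalDist


def dummy_code(values, excluded):
    """Code a categorical variable as 0/1 columns, one per included level.

    The excluded (reference) level gets no column, so a variable with
    c categories yields c - 1 dummy variables.
    """
    included = [lv for lv in sorted(set(values)) if lv != excluded]
    rows = [[1 if v == lv else 0 for lv in included] for v in values]
    return included, rows


def prob_exceeds(fitted_mean, resid_sd, threshold):
    """P(outcome > threshold) around a fitted mean.

    ASSUMPTION of this sketch: errors are normal with standard deviation
    resid_sd. The abstract does not state the method actually used.
    """
    return 1.0 - NormalDist(mu=fitted_mean, sigma=resid_sd).cdf(threshold)


def odds(p):
    """Convert a probability into odds, p / (1 - p)."""
    return p / (1.0 - p)


def prob_from_odds(o):
    """Convert odds back into a probability, o / (1 + o)."""
    return o / (1.0 + o)


def odds_ratio(odds_a, odds_b):
    """Ratio of two odds (group a relative to group b)."""
    return odds_a / odds_b


# Hypothetical age groups, used only to show the c - 1 coding.
age_groups = ["<25", "25-34", "35+", "25-34"]
included_levels, dummy_rows = dummy_code(age_groups, excluded="<25")
print(included_levels)  # the two included levels
print(dummy_rows)       # the excluded level "<25" is coded as all zeros

# Odds reported in the abstract for gestation length > 39.5 weeks.
odds_male, odds_female = 0.636, 0.639
or_male_vs_female = odds_ratio(odds_male, odds_female)
print(round(or_male_vs_female, 3))  # 0.995, matching the abstract

# Probabilities implied by those odds.
print(prob_from_odds(odds_male), prob_from_odds(odds_female))
```

The excluded level appears as a row of zeros, so its mean is carried by the intercept and each dummy coefficient is read as a difference from that level. The odds ratio here is simply the male odds divided by the female odds, which reproduces the reported 0.995.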

Published in International Journal of Theoretical and Applied Mathematics (Volume 11, Issue 6)
DOI 10.11648/j.ijtam.20251106.11
Page(s) 86-100
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2025. Published by Science Publishing Group

Keywords

Odds, Odds Ratio, Continuous Outcome, Dummy Variable Regression, Probabilities, Maternal Age

Cite This Article
  • APA Style

    Okeh, U. M., Igwe, T. S., & Okpara, P. A. (2025). Estimating Probabilities and Odds Ratios in Gestational Outcomes: A Dummy Variable Regression Illustration. International Journal of Theoretical and Applied Mathematics, 11(6), 86-100. https://doi.org/10.11648/j.ijtam.20251106.11

    ACS Style

    Okeh, U. M.; Igwe, T. S.; Okpara, P. A. Estimating Probabilities and Odds Ratios in Gestational Outcomes: A Dummy Variable Regression Illustration. Int. J. Theor. Appl. Math. 2025, 11(6), 86-100. doi: 10.11648/j.ijtam.20251106.11

    AMA Style

    Okeh UM, Igwe TS, Okpara PA. Estimating Probabilities and Odds Ratios in Gestational Outcomes: A Dummy Variable Regression Illustration. Int J Theor Appl Math. 2025;11(6):86-100. doi: 10.11648/j.ijtam.20251106.11

  • @article{10.11648/j.ijtam.20251106.11,
      author = {Uchechukwu Marius Okeh and Theophilus Sunday Igwe and Patrick Agwu Okpara},
      title = {Estimating Probabilities and Odds Ratios in Gestational Outcomes: A Dummy Variable Regression Illustration},
      journal = {International Journal of Theoretical and Applied Mathematics},
      volume = {11},
      number = {6},
      pages = {86-100},
      doi = {10.11648/j.ijtam.20251106.11},
      url = {https://doi.org/10.11648/j.ijtam.20251106.11},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ijtam.20251106.11},
     year = {2025}
    }
    

  • TY  - JOUR
    T1  - Estimating Probabilities and Odds Ratios in Gestational Outcomes: A Dummy Variable Regression Illustration
    AU  - Uchechukwu Marius Okeh
    AU  - Theophilus Sunday Igwe
    AU  - Patrick Agwu Okpara
    Y1  - 2025/12/20
    PY  - 2025
    N1  - https://doi.org/10.11648/j.ijtam.20251106.11
    DO  - 10.11648/j.ijtam.20251106.11
    T2  - International Journal of Theoretical and Applied Mathematics
    JF  - International Journal of Theoretical and Applied Mathematics
    JO  - International Journal of Theoretical and Applied Mathematics
    SP  - 86
    EP  - 100
    PB  - Science Publishing Group
    SN  - 2575-5080
    UR  - https://doi.org/10.11648/j.ijtam.20251106.11
    VL  - 11
    IS  - 6
    ER  - 

Author Information
  • Department of Industrial Mathematics and Health Statistics, David Umahi Federal University of Health Sciences, Uburu, Nigeria; International Institute for Nuclear Medicine and Allied Health Research, David Umahi Federal University of Health Sciences, Uburu, Nigeria

  • Department of Industrial Mathematics and Health Statistics, David Umahi Federal University of Health Sciences, Uburu, Nigeria

  • Department of Industrial Mathematics and Health Statistics, David Umahi Federal University of Health Sciences, Uburu, Nigeria
