Peer-Reviewed

Non-linear Approximations of Shape and Location Parameters in the Poisson Inverse Gaussian Model in Analysis of Infectious Count Data

Received: 13 November 2020     Accepted: 21 November 2020     Published: 30 November 2020
Abstract

Statistical models form the basis for the analysis of infectious disease counts. These data sets exhibit distinctive characteristics such as low counts, delayed reporting and underreporting, among others. Most research models these counts with linear models because of their simplicity. Further, the assumption of a fixed dispersion is very common in modeling infectious disease counts. Predictions relating to infectious disease counts have largely been based on the Poisson model framework. Extensions of the Poisson model, such as the Negative Binomial (NB) and Poisson Inverse Gaussian (PIG) distributions, have gained popularity in the recent past for modeling count responses that show overdispersion relative to the Poisson distribution. In this study we propose non-linear models for these data sets, modeling the mean and dispersion parameters as additive terms. We fit NB and PIG generalized linear models (GLMs) with a fixed and with a varying dispersion parameter and compare them with NB and PIG generalized additive models (GAMs) in which both the mean and the dispersion are modeled as additive terms. The models are fitted to overdispersed infectious disease counts from the Salmonella Hadar data set. Residual plots are constructed to explore the quality of the fits, and a goodness-of-fit analysis is carried out to assess the best-fitting model. The results show that the PIG models perform better on both the linear and the non-linear model platforms. Further, modeling both the mean and the dispersion proved better than assuming a constant dispersion.
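To make the notion of overdispersion relative to the Poisson distribution concrete, the following is a minimal sketch rather than the authors' code. It assumes a parameterisation that the abstract does not state: a PIG response with mean mu and dispersion sigma, obtained by mixing a Poisson rate with an inverse Gaussian variable of mean 1 and variance sigma, so that the variance is mu + sigma * mu^2. All function names are hypothetical and only the Python standard library is used.

```python
import math
import random
import statistics


def pig_variance(mu: float, sigma: float) -> float:
    """Variance under the ASSUMED parameterisation: mu + sigma * mu**2."""
    return mu + sigma * mu ** 2


def sample_inverse_gaussian(mean: float, shape: float, rng: random.Random) -> float:
    """Draw one inverse Gaussian variate (Michael-Schucany-Haas method)."""
    y = rng.gauss(0.0, 1.0) ** 2
    root = math.sqrt(4.0 * mean * shape * y + (mean * y) ** 2)
    x = mean + mean ** 2 * y / (2.0 * shape) - mean / (2.0 * shape) * root
    # Accept the smaller root with probability mean / (mean + x).
    if rng.random() <= mean / (mean + x):
        return x
    return mean ** 2 / x


def sample_poisson(rate: float, rng: random.Random) -> int:
    """Draw one Poisson variate with Knuth's method (fine for small rates)."""
    threshold = math.exp(-rate)
    count = 0
    product = rng.random()
    while product > threshold:
        count += 1
        product *= rng.random()
    return count


def sample_pig(mu: float, sigma: float, size: int, seed: int = 0) -> list:
    """Simulate PIG counts as a Poisson mixed over an inverse Gaussian rate."""
    rng = random.Random(seed)
    shape = 1.0 / sigma  # mixing variable has mean 1 and variance sigma
    return [
        sample_poisson(mu * sample_inverse_gaussian(1.0, shape, rng), rng)
        for _ in range(size)
    ]


if __name__ == "__main__":
    mu, sigma = 5.0, 0.5  # illustrative values, not estimates from the paper
    draws = sample_pig(mu, sigma, size=100_000)
    print("sample mean:        ", statistics.fmean(draws))
    print("sample variance:    ", statistics.variance(draws))
    print("Poisson variance:   ", mu)
    print("assumed PIG variance:", pig_variance(mu, sigma))
```

Under these assumptions the sample variance should sit well above the mean, which is the pattern a fixed-variance Poisson model cannot reproduce. Letting sigma depend on covariates, as the study proposes for the dispersion, would amount to passing a different sigma for each observation.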

Published in International Journal of Data Science and Analysis (Volume 6, Issue 6)
DOI 10.11648/j.ijdsa.20200606.14
Page(s) 204-212
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2020. Published by Science Publishing Group

Keywords

Poisson Inverse Gaussian Distribution, General Additive Model, Dispersion, Count Models

Cite This Article
  • APA Style

    Symon Kamuyu Matonyo, Oscar Ngesa, Anthony Wanjoya. (2020). Non-linear Approximations of Shape and Location Parameters in the Poisson Inverse Gaussian Model in Analysis of Infectious Count Data. International Journal of Data Science and Analysis, 6(6), 204-212. https://doi.org/10.11648/j.ijdsa.20200606.14

Author Information
  • Symon Kamuyu Matonyo, Department of Statistics and Actuarial Science, Jomo Kenyatta University of Agriculture and Technology, Nairobi, Kenya

  • Oscar Ngesa, Department of Statistics and Actuarial Science, Jomo Kenyatta University of Agriculture and Technology, Nairobi, Kenya

  • Anthony Wanjoya, Department of Statistics and Actuarial Science, Jomo Kenyatta University of Agriculture and Technology, Nairobi, Kenya
