Peer-Reviewed

The Analysis of GCFS Algorithm in Medical Data Processing and Mining

Received: 20 November 2014    Accepted: 2 December 2014    Published: 5 December 2014
Abstract

Feature selection plays a significant part in medical data processing and mining: it reduces the dimensionality of datasets, enhances the performance of classifiers, and is of great help to clinical decision support. At present, clinical decision support is performed largely by physicians, subjectively and on the basis of clinical knowledge, which may hinder diagnosis and treatment. This paper outlines the performance of the GCFS (Genetic Correlation-based Feature Selection) algorithm in the processing and mining of medical data, using medical datasets from the UCI repository to demonstrate how feature selection improves data classification. GCFS is compared with the CFS and GA (Genetic Algorithm) algorithms, with ensemble learning methods employed as the testing classifiers, and the results show that GCFS improves the performance of the testing classifiers more than CFS and GA in almost all cases.
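To make the correlation-based idea concrete, the sketch below computes the standard CFS merit heuristic, k * r_cf / sqrt(k + k(k-1) * r_ff), where r_cf is the mean feature-class correlation and r_ff the mean feature-feature correlation of a subset of k features. It is an illustration only, not the authors' GCFS implementation: the use of Pearson correlation, the function names `pearson` and `cfs_merit`, and the toy data are all assumptions made here. A genetic search of the kind the name GCFS suggests could plausibly use such a merit score as its fitness function, but the abstract does not confirm that detail.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    if vx == 0 or vy == 0:
        return 0.0
    return cov / sqrt(vx * vy)


def cfs_merit(columns, labels, subset):
    """CFS merit of a feature subset (hypothetical helper, Pearson-based)."""
    k = len(subset)
    if k == 0:
        return 0.0
    # Mean absolute feature-class correlation over the subset.
    r_cf = sum(abs(pearson(columns[i], labels)) for i in subset) / k
    # Mean absolute feature-feature correlation over all pairs in the subset.
    pairs = [(i, j) for pos, i in enumerate(subset) for j in subset[pos + 1:]]
    r_ff = 0.0
    if pairs:
        r_ff = sum(abs(pearson(columns[i], columns[j])) for i, j in pairs) / len(pairs)
    return k * r_cf / sqrt(k + k * (k - 1) * r_ff)


# Toy data (made up for illustration): feature 0 is informative,
# feature 1 is a scaled copy of feature 0, feature 2 is weakly related noise.
labels = [0, 0, 0, 1, 1, 1]
columns = [
    [1, 2, 3, 4, 5, 6],
    [2, 4, 6, 8, 10, 12],
    [1, 0, 1, 0, 1, 0],
]

for subset in ([0], [0, 1], [0, 2]):
    print(subset, round(cfs_merit(columns, labels, subset), 4))
```

In this toy setting, adding the redundant copy (feature 1) leaves the merit unchanged, while adding the noisy feature 2 lowers it, which is the behaviour that lets a merit-driven search discard redundant or weakly relevant features.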

Published in American Journal of Software Engineering and Applications (Volume 3, Issue 6)
DOI 10.11648/j.ajsea.20140306.11
Page(s) 68-73
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2024. Published by Science Publishing Group

Keywords

Feature Selection, GCFS, Ensemble learning

Cite This Article
  • APA Style

    Xiao Yu Chen, Bo Liu, Zhe Feng Zhang, Xin Xia. (2014). The Analysis of GCFS Algorithm in Medical Data Processing and Mining. American Journal of Software Engineering and Applications, 3(6), 68-73. https://doi.org/10.11648/j.ajsea.20140306.11

    ACS Style

    Xiao Yu Chen; Bo Liu; Zhe Feng Zhang; Xin Xia. The Analysis of GCFS Algorithm in Medical Data Processing and Mining. Am. J. Softw. Eng. Appl. 2014, 3(6), 68-73. doi: 10.11648/j.ajsea.20140306.11

    AMA Style

    Xiao Yu Chen, Bo Liu, Zhe Feng Zhang, Xin Xia. The Analysis of GCFS Algorithm in Medical Data Processing and Mining. Am J Softw Eng Appl. 2014;3(6):68-73. doi: 10.11648/j.ajsea.20140306.11

  • @article{10.11648/j.ajsea.20140306.11,
      author = {Xiao Yu Chen and Bo Liu and Zhe Feng Zhang and Xin Xia},
      title = {The Analysis of GCFS Algorithm in Medical Data Processing and Mining},
      journal = {American Journal of Software Engineering and Applications},
      volume = {3},
      number = {6},
      pages = {68-73},
      doi = {10.11648/j.ajsea.20140306.11},
      url = {https://doi.org/10.11648/j.ajsea.20140306.11},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajsea.20140306.11},
      year = {2014}
    }
    

  • TY  - JOUR
    T1  - The Analysis of GCFS Algorithm in Medical Data Processing and Mining
    AU  - Xiao Yu Chen
    AU  - Bo Liu
    AU  - Zhe Feng Zhang
    AU  - Xin Xia
    Y1  - 2014/12/05
    PY  - 2014
    N1  - https://doi.org/10.11648/j.ajsea.20140306.11
    DO  - 10.11648/j.ajsea.20140306.11
    T2  - American Journal of Software Engineering and Applications
    JF  - American Journal of Software Engineering and Applications
    JO  - American Journal of Software Engineering and Applications
    SP  - 68
    EP  - 73
    PB  - Science Publishing Group
    SN  - 2327-249X
    UR  - https://doi.org/10.11648/j.ajsea.20140306.11
    VL  - 3
    IS  - 6
    ER  - 

Author Information
  • Xiao Yu Chen, Bo Liu, Zhe Feng Zhang, Xin Xia: Department of Information Centre, East Hospital, Tongji University, School of Medicine, Shanghai, China
