International Journal on Data Science and Technology

Peer-Reviewed

Performance Engineering for Scientific Computing with R

Received: 25 June 2018    Published: 26 June 2018


Abstract

R has become a popular tool for data analysis and data mining in many domain fields over the past decade. As Big Data overwhelms those fields, the computational needs and workloads of existing R solutions increase significantly. With recent hardware and software developments, it is possible to enable massive parallelism in existing R solutions with little to no modification. In this paper, three approaches to speeding up R computations are evaluated: using multiple cores, the Intel Xeon Phi SE10P coprocessor, and the general-purpose graphics processing unit (GPGPU). The performance engineering and evaluation efforts in this study are based on a popular R benchmark script. The paper presents preliminary results from running R-benchmark with combinations of the above packages and hardware technologies.

DOI 10.11648/j.ijdst.20180402.11
Published in International Journal on Data Science and Technology (Volume 4, Issue 2, June 2018)
Page(s) 42-48
Creative Commons

This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.

Copyright

Copyright © The Author(s), 2024. Published by Science Publishing Group

Keywords

Performance Evaluation, R, Intel Xeon Phi, Multi-Core Computing, GPGPU

Cite This Article
  • APA Style

    Hui Zhang. (2018). Performance Engineering for Scientific Computing with R. International Journal on Data Science and Technology, 4(2), 42-48. https://doi.org/10.11648/j.ijdst.20180402.11


    ACS Style

    Hui Zhang. Performance Engineering for Scientific Computing with R. Int. J. Data Sci. Technol. 2018, 4(2), 42-48. doi: 10.11648/j.ijdst.20180402.11


    AMA Style

    Hui Zhang. Performance Engineering for Scientific Computing with R. Int J Data Sci Technol. 2018;4(2):42-48. doi: 10.11648/j.ijdst.20180402.11


  • @article{10.11648/j.ijdst.20180402.11,
      author = {Hui Zhang},
      title = {Performance Engineering for Scientific Computing with R},
      journal = {International Journal on Data Science and Technology},
      volume = {4},
      number = {2},
      pages = {42-48},
      doi = {10.11648/j.ijdst.20180402.11},
      url = {https://doi.org/10.11648/j.ijdst.20180402.11},
      eprint = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ijdst.20180402.11},
     year = {2018}
    }
    


  • TY  - JOUR
    T1  - Performance Engineering for Scientific Computing with R
    AU  - Hui Zhang
    Y1  - 2018/06/26
    PY  - 2018
    N1  - https://doi.org/10.11648/j.ijdst.20180402.11
    DO  - 10.11648/j.ijdst.20180402.11
    T2  - International Journal on Data Science and Technology
    JF  - International Journal on Data Science and Technology
    JO  - International Journal on Data Science and Technology
    SP  - 42
    EP  - 48
    PB  - Science Publishing Group
    SN  - 2472-2235
    UR  - https://doi.org/10.11648/j.ijdst.20180402.11
    VL  - 4
    IS  - 2
    ER  - 


Author Information
  • Computer Engineering and Computer Science Department, University of Louisville, Louisville, USA
