Visual saliency refers to the regions of an image that attract human attention. The Human Visual System (HVS) focuses on specific parts of a scene rather than on the whole image, and visual attention describes the set of cognitive processes that select important information and filter out unnecessary information from cluttered visual scenes. Images are central to computer vision because they carry a wealth of information, and humans are estimated to receive about 80% of their information through vision. Processing an entire image when only a certain part of it is needed wastes resources; restricting computation to the relevant pixels is therefore more efficient. In this study, this is achieved with a Generative Adversarial Network (GAN) equipped with a Cascaded Hierarchical Atrous Spatial Pyramid Pooling (CHASPP) module, and EfficientNet-B7, which uniformly scales all network dimensions (depth, width, and input resolution), is selected as the feature extractor, improving feature extraction for visual saliency prediction. Datasets including CAT2000, MIT1003, DUT-OMRON, and PASCAL-S are used to demonstrate the efficiency of the selected models and techniques. We develop an effective visual saliency predictor using a GAN with CHASPP together with additional loss terms such as edge loss and perceptual loss; the CHASPP-based model scored the best results on these datasets as measured by several evaluation metrics.
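To make the ideas in the abstract concrete, the following PyTorch sketch (ours, not the authors' released code) illustrates a cascaded hierarchical atrous spatial pyramid pooling block and the edge and perceptual loss terms the abstract mentions. The dilation rates, the VGG-16 perceptual backbone, the Sobel-based edge loss, and the loss weights are all illustrative assumptions rather than the paper's exact formulation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
from torchvision.models import vgg16


class CHASPP(nn.Module):
    """Cascaded hierarchical ASPP sketch: parallel atrous (dilated) branches,
    where each branch also receives the previous branch's output, so larger
    receptive fields refine smaller ones. Rates and widths are assumptions."""

    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        self.branches = nn.ModuleList([
            nn.Conv2d(in_ch if i == 0 else in_ch + out_ch, out_ch,
                      kernel_size=3, padding=r, dilation=r)
            for i, r in enumerate(rates)
        ])
        self.fuse = nn.Conv2d(out_ch * len(rates), out_ch, kernel_size=1)

    def forward(self, x):
        outs = []
        for i, branch in enumerate(self.branches):
            # Cascade: every branch after the first sees the input plus
            # the previous branch's output, concatenated on channels.
            inp = x if i == 0 else torch.cat([x, outs[-1]], dim=1)
            outs.append(F.relu(branch(inp)))
        return self.fuse(torch.cat(outs, dim=1))


class PerceptualLoss(nn.Module):
    """L1 distance between frozen VGG-16 features of the predicted and
    ground-truth saliency maps (a common choice; the backbone is assumed)."""

    def __init__(self):
        super().__init__()
        self.features = vgg16(weights="IMAGENET1K_V1").features[:16].eval()
        for p in self.features.parameters():
            p.requires_grad = False

    def forward(self, pred, target):
        # Saliency maps are single-channel; tile to 3 channels for VGG.
        return F.l1_loss(self.features(pred.repeat(1, 3, 1, 1)),
                         self.features(target.repeat(1, 3, 1, 1)))


def edge_loss(pred, target):
    """L1 distance between Sobel gradients, penalizing blurred boundaries."""
    kx = torch.tensor([[-1., 0., 1.], [-2., 0., 2.], [-1., 0., 1.]],
                      device=pred.device).view(1, 1, 3, 3)
    ky = kx.transpose(2, 3)
    grad = lambda m: torch.cat([F.conv2d(m, kx, padding=1),
                                F.conv2d(m, ky, padding=1)], dim=1)
    return F.l1_loss(grad(pred), grad(target))


def generator_loss(d_fake, pred, target, perceptual, w=(0.05, 1.0, 1.0)):
    """Adversarial + perceptual + edge objective; the weights are illustrative."""
    adv = F.binary_cross_entropy_with_logits(d_fake, torch.ones_like(d_fake))
    return (w[0] * adv + w[1] * perceptual(pred, target)
            + w[2] * edge_loss(pred, target))
```

In this sketch the generator would emit a single-channel saliency map of shape (N, 1, H, W), the discriminator's logits on that map feed `d_fake`, and the three terms are summed into one objective; the cascade inside `CHASPP` is what distinguishes it from a plain ASPP, where the branches are fully independent.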
Published in: American Journal of Mathematical and Computer Modelling (Volume 10, Issue 2)
DOI: 10.11648/j.ajmcm.20251002.13
Page(s): 66-73
Creative Commons: This is an Open Access article, distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution and reproduction in any medium or format, provided the original work is properly cited.
Copyright: Copyright © The Author(s), 2025. Published by Science Publishing Group
Keywords: Visual Saliency Prediction, Attention Area, Generative Adversarial Network, Low-level Features, High-level Features, Feature Extraction
APA Style
Dufera, D., & Abate, F. (2025). Generative Adversarial Network Based Visual Saliency Prediction with Cascaded Hierarchical Atrous Spatial Pyramid Pooling. American Journal of Mathematical and Computer Modelling, 10(2), 66-73. https://doi.org/10.11648/j.ajmcm.20251002.13
ACS Style
Dufera, D.; Abate, F. Generative Adversarial Network Based Visual Saliency Prediction with Cascaded Hierarchical Atrous Spatial Pyramid Pooling. Am. J. Math. Comput. Model. 2025, 10(2), 66-73. doi: 10.11648/j.ajmcm.20251002.13
@article{10.11648/j.ajmcm.20251002.13,
  author  = {Daniel Dufera and Felmeta Abate},
  title   = {Generative Adversarial Network Based Visual Saliency Prediction with Cascaded Hierarchical Atrous Spatial Pyramid Pooling},
  journal = {American Journal of Mathematical and Computer Modelling},
  volume  = {10},
  number  = {2},
  pages   = {66-73},
  year    = {2025},
  doi     = {10.11648/j.ajmcm.20251002.13},
  url     = {https://doi.org/10.11648/j.ajmcm.20251002.13},
  eprint  = {https://article.sciencepublishinggroup.com/pdf/10.11648.j.ajmcm.20251002.13}
}
TY  - JOUR
T1  - Generative Adversarial Network Based Visual Saliency Prediction with Cascaded Hierarchical Atrous Spatial Pyramid Pooling
AU  - Daniel Dufera
AU  - Felmeta Abate
Y1  - 2025/06/16
PY  - 2025
N1  - https://doi.org/10.11648/j.ajmcm.20251002.13
DO  - 10.11648/j.ajmcm.20251002.13
T2  - American Journal of Mathematical and Computer Modelling
JF  - American Journal of Mathematical and Computer Modelling
JO  - American Journal of Mathematical and Computer Modelling
SP  - 66
EP  - 73
VL  - 10
IS  - 2
PB  - Science Publishing Group
SN  - 2578-8280
UR  - https://doi.org/10.11648/j.ajmcm.20251002.13
ER  -