Citation: JIANG L, DAI N, XU M, et al. Saliency-guided image translation[J]. Journal of Beijing University of Aeronautics and Astronautics, 2023, 49(10): 2689-2698 (in Chinese). doi: 10.13700/j.bh.1001-5965.2021.0732
This paper proposes a novel task of saliency-guided image translation, whose goal is image-to-image translation conditioned on a user-specified saliency map. To address this problem, we develop a novel generative adversarial network (GAN)-based model, called SalG-GAN. Given an original image and a target saliency map, the proposed method generates a translated image that satisfies the target saliency map. In the proposed method, a disentangled representation framework is designed to encourage the model to learn diverse translations for the same target saliency condition. A saliency-based attention module is introduced as a special attention mechanism to facilitate the structures of the saliency-guided generator, the saliency cue encoder, and the saliency-guided global and local discriminators. Furthermore, we build a synthetic dataset and a real-world dataset with labeled visual attention for training and evaluating the proposed method. Experimental results on both datasets verify the effectiveness of our model for saliency-guided image translation.
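The abstract describes a saliency-based attention module that lets a target saliency map modulate intermediate features. The paper does not give its exact form here, so the following is only a minimal, hypothetical sketch of the general idea: normalize the saliency map and use it to reweight a feature tensor, with a residual term so non-salient regions are attenuated rather than erased. All names (`saliency_attention`, the residual formulation) are assumptions for illustration, not the authors' actual module.

```python
import numpy as np

def saliency_attention(features, saliency):
    """Hypothetical saliency-based attention sketch (not the authors' exact module).

    features: array of shape (C, H, W)
    saliency: array of shape (H, W), arbitrary range
    Returns features reweighted so salient locations are amplified.
    """
    # Normalize the saliency map to [0, 1]; the epsilon guards a flat map.
    s = (saliency - saliency.min()) / (saliency.max() - saliency.min() + 1e-8)
    # Broadcast over channels; the residual "1 +" keeps non-salient
    # regions attenuated relative to salient ones rather than zeroed out.
    return features * (1.0 + s[None, :, :])

feats = np.random.rand(8, 16, 16)  # C x H x W feature tensor
sal = np.random.rand(16, 16)       # H x W target saliency map
out = saliency_attention(feats, sal)
print(out.shape)  # (8, 16, 16)
```

In the paper's architecture this kind of modulation would be applied inside the generator and discriminators; the sketch only illustrates the reweighting step itself.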
[1] EL-NOUBY A, SHARMA S, SCHULZ H, et al. Tell, draw, and repeat: Generating and modifying images based on continual linguistic instruction[C]//2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2020: 10303-10311.
[2] HONG S, YANG D D, CHOI J, et al. Inferring semantic layout for hierarchical text-to-image synthesis[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 7986-7994.
[3] ISOLA P, ZHU J Y, ZHOU T H, et al. Image-to-image translation with conditional adversarial networks[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 5967-5976.
[4] ZHAO B, MENG L L, YIN W D, et al. Image generation from layout[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 8576-8585.
[5] CHOI Y, CHOI M, KIM M, et al. StarGAN: Unified generative adversarial networks for multi-domain image-to-image translation[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 8789-8797.
[6] YIN W D, LIU Z W, CHANGE LOY C. Instance-level facial attributes transfer with geometry-aware flow[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2019, 33(1): 9111-9118. doi: 10.1609/aaai.v33i01.33019111
[7] JOHNSON J, GUPTA A, LI F F. Image generation from scene graphs[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 1219-1228.
[8] KARRAS T, LAINE S, AILA T M. A style-based generator architecture for generative adversarial networks[C]//2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 4396-4405.
[9] WANG T C, LIU M Y, ZHU J Y, et al. High-resolution image synthesis and semantic manipulation with conditional GANs[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 8798-8807.
[10] ZHU J Y, PARK T, ISOLA P, et al. Unpaired image-to-image translation using cycle-consistent adversarial networks[C]//2017 IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 2242-2251.
[11] BAU D, ZHU J Y, STROBELT H, et al. GAN dissection: Visualizing and understanding generative adversarial networks[EB/OL]. (2018-12-26)[2021-10-20].
[12] YU J H, LIN Z, YANG J M, et al. Free-form image inpainting with gated convolution[C]//2019 IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2020: 4470-4479.
[13] MATEESCU V A, BAJIC I V. Visual attention retargeting[J]. IEEE MultiMedia, 2016, 23(1): 82-91. doi: 10.1109/MMUL.2015.59
[14] MECHREZ R, SHECHTMAN E, ZELNIK-MANOR L. Saliency driven image manipulation[J]. Machine Vision and Applications, 2019, 30(2): 189-202. doi: 10.1007/s00138-018-01000-w
[15] FRIED O, SHECHTMAN E, GOLDMAN D B, et al. Finding distractors in images[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2015: 1703-1712.
[16] NGUYEN T V, NI B B, LIU H R, et al. Image re-attentionizing[J]. IEEE Transactions on Multimedia, 2013, 15(8): 1910-1919. doi: 10.1109/TMM.2013.2272919
[17] ITTI L, KOCH C, NIEBUR E. A model of saliency-based visual attention for rapid scene analysis[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 1998, 20(11): 1254-1259. doi: 10.1109/34.730558
[18] JIANG L, XU M, WANG X F, et al. Saliency-guided image translation[C]//2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 16504-16513.
[19] CHEN Y C, CHANG K J, TSAI Y H, et al. Guide your eyes: Learning image manipulation under saliency guidance[C]//British Machine Vision Conference. Cardiff: BMVA Press, 2019.
[20] WONG L K, LOW K L. Saliency retargeting: An approach to enhance image aesthetics[C]//2011 IEEE Workshop on Applications of Computer Vision. Piscataway: IEEE Press, 2011: 73-80.
[21] GATYS L A, KÜMMERER M, WALLIS T S A, et al. Guiding human gaze with convolutional neural networks[EB/OL]. (2017-09-18)[2021-10-20].
[22] MEJJATI Y A, GOMEZ C F, KIM K I, et al. Look here! A parametric learning based approach to redirect visual attention[C]//European Conference on Computer Vision. Berlin: Springer, 2020: 343-361.
[23] HAGIWARA A, SUGIMOTO A, KAWAMOTO K. Saliency-based image editing for guiding visual attention[C]//Proceedings of the 1st International Workshop on Pervasive Eye Tracking & Mobile Eye-Based Interaction. New York: ACM, 2011: 43-48.
[24] MENDEZ E, FEINER S, SCHMALSTIEG D. Focus and context in mixed reality by modulating first order salient features[C]//International Symposium on Smart Graphics. Berlin: Springer, 2010: 232-243.
[25] BERNHARD M, ZHANG L, WIMMER M. Manipulating attention in computer games[C]//2011 IEEE 10th IVMSP Workshop: Perception and Visual Signal Analysis. Piscataway: IEEE Press, 2011: 153-158.
[26] ZHU J Y, ZHANG R, PATHAK D, et al. Multimodal image-to-image translation by enforcing bi-cycle consistency[C]//Advances in Neural Information Processing Systems. Long Beach: NIPS, 2017: 465-476.
[27] LARSEN A B L, SØNDERBY S K, LAROCHELLE H, et al. Autoencoding beyond pixels using a learned similarity metric[C]//Proceedings of the 33rd International Conference on International Conference on Machine Learning. New York: ACM, 2016: 1558-1566.
[28] MAO X D, LI Q, XIE H R, et al. Least squares generative adversarial networks[C]//2017 IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 2813-2821.
[29] JIANG M, HUANG S S, DUAN J Y, et al. SALICON: Saliency in context[C]//2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2015: 1072-1080.
[30] JOHNSON J, HARIHARAN B, VAN DER MAATEN L, et al. CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning[C]//2017 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 1988-1997.
[31] ZHOU B L, LAPEDRIZA A, KHOSLA A, et al. Places: A 10 million image database for scene recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 40(6): 1452-1464. doi: 10.1109/TPAMI.2017.2723009
[32] MIYATO T, KATAOKA T, KOYAMA M, et al. Spectral normalization for generative adversarial networks[EB/OL]. (2018-02-16)[2021-10-18].
[33] ULYANOV D, VEDALDI A, LEMPITSKY V. Instance normalization: The missing ingredient for fast stylization[EB/OL]. (2016-07-27)[2021-10-24].
[34] KINGMA D P, BA J. Adam: A method for stochastic optimization[EB/OL]. (2014-12-22)[2021-10-25].
[35] HEUSEL M, RAMSAUER H, UNTERTHINER T, et al. GANs trained by a two time-scale update rule converge to a local Nash equilibrium[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. New York: ACM, 2017: 6629-6640.
[36] ZHANG R, ISOLA P, EFROS A A, et al. The unreasonable effectiveness of deep features as a perceptual metric[C]//2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 586-595.
[37] LEE H Y, TSENG H Y, HUANG J B, et al. Diverse image-to-image translation via disentangled representations[C]//European Conference on Computer Vision. Berlin: Springer, 2018: 36-52.