A matching method based on improved SuperPoint and linear Transformer for optical and infrared images

WU Wei; XIAN Yong; SU Juan; ZHANG Daqiao; LI Shaopeng; LI Bing

doi:10.13700/j.bh.1001-5965.2022.1022

Volume 51 Issue 1

Jan. 2025

Turn off MathJax

Article Contents

Journal of Beijing University of Aeronautics and Astronautics > 2025 > 51(1): 340-348.

WU W，XIAN Y，SU J，et al. A matching method based on improved SuperPoint and linear Transformer for optical and infrared images[J]. Journal of Beijing University of Aeronautics and Astronautics，2025，51（1）：340-348 （in Chinese） doi: 10.13700/j.bh.1001-5965.2022.1022

Citation:

PDF( 3825 KB)

A matching method based on improved SuperPoint and linear Transformer for optical and infrared images

doi: 10.13700/j.bh.1001-5965.2022.1022

WU Wei¹,
XIAN Yong¹,
SU Juan²,
ZHANG Daqiao¹,
LI Shaopeng^{1, 3
,
,},
LI Bing¹

1.
College of War Support，Rocket Force University of Engineering，Xi’an 710025，China
2.
College of Nuclear Engineering，Rocket Force University of Engineering，Xi’an 710025，China
3.
Department of Automation，Tsinghua University，Beijing 100084，China

Funds:

National Natural Science Foundation of China (62103432); China Postdoctoral Science Foundation (2022M721841)

More Information

Corresponding author: E-mail：sp-li16@mails.tsinghua.edu.cn
Received Date: 29 Dec 2022
Accepted Date: 29 May 2023

Available Online: 15 Jan 2025

Publish Date: 08 Sep 2023

Abstract

Abstract

A deep learning matching algorithm based on improved SuperPoint and linear transformer was proposed to solve the problem of difficult matching and high mismatching rates between visible and infrared heterologous images. Firstly, based on the SuperPoint network structure, the algorithm introduced the idea of a feature pyramid to build a feature description branch and trained it based on the hinge loss function, so as to better learn the multi-scale deep features of visible and infrared images and increase the similarity of the image with correspondence points to the descriptor. In the feature matching module, SuperGlue was improved by adopting a linear transformer for aggregating features to obtain better matching results. Experiments conducted on multiple datasets demonstrate that the proposed method improves the matching precision and provides better matching performance in comparison with existing matching methods.
- infrared image,
- visible light image,
- heterologous image matching,
- feature pyramid network,
- linear transformer

FullText(HTML)

References(22)

References

[1]	MA J Y, JIANG X Y, FAN A X, et al. Image matching from handcrafted to deep features: A survey[J]. International Journal of Computer Vision, 2021, 129(1): 23-79. doi: 10.1007/s11263-020-01359-2
[2]	ZITOVÁ B, FLUSSER J. Image registration methods: A survey[J]. Image and Vision Computing, 2003, 21(11): 977-1000. doi: 10.1016/S0262-8856(03)00137-9
[3]	LOWE D G. Distinctive image features from scale-invariant keypoints[J]. International Journal of Computer Vision, 2004, 60(2): 91-110. doi: 10.1023/B:VISI.0000029664.99615.94
[4]	MA J Y, ZHAO J, MA Y, et al. Non-rigid visible and infrared face registration via regularized Gaussian fields criterion[J]. Pattern Recognition, 2015, 48(3): 772-784. doi: 10.1016/j.patcog.2014.09.005
[5]	YE Y X, SHAN J, BRUZZONE L, et al. Robust registration of multimodal remote sensing images based on structural similarity[J]. IEEE Transactions on Geoscience and Remote Sensing, 2017, 55(5): 2941-2958. doi: 10.1109/TGRS.2017.2656380
[6]	CHEN Y J, ZHANG X W, ZHANG Y N, et al. Visible and infrared image registration based on region features and edginess[J]. Machine Vision and Applications, 2018, 29(1): 113-123. doi: 10.1007/s00138-017-0879-6
[7]	LI J Y, HU Q W, AI M Y. RIFT: Multi-modal image matching based on radiation-variation insensitive feature transform[J]. IEEE Transactions on Image Processing: a Publication of the IEEE Signal Processing Society, 2019, 29: 3296-3310.
[8]	ARAR M, GINGER Y, DANON D, et al. Unsupervised multi-modal image registration via geometry preserving image-to-image translation[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2020: 13407-13416.
[9]	PIELAWSKI N, WETZER E, ÖFVERSTEDT J, et al. CoMIR: Contrastive multimodal image representation for registration[C]// Proceedings of Annual Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2020: 1-21.
[10]	YI K M, TRULLS E, LEPETIT V, et al. LIFT: Learned invariant feature transform[M]// Lecture Notes in Computer Science. Cham: Springer International Publishing, 2016: 467-483.
[11]	DETONE D, MALISIEWICZ T, RABINOVICH A. SuperPoint: Self-supervised interest point detection and description[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops (CVPRW). Piscataway: IEEE Press, 2018: 337-33712.
[12]	LI K H, WANG L G, LIU L, et al. Decoupling makes weakly supervised local feature better[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2022: 15817-15827.
[13]	SARLIN P E, DETONE D, MALISIEWICZ T, et al. SuperGlue: Learning feature matching with graph neural networks[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2020: 4937-4946.
[14]	SHI Y, CAI J X, SHAVIT Y, et al. ClusterGNN: Cluster-based coarse-to-fine graph neural network for efficient feature matching[C]// 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2022: 12507-12516.
[15]	LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2017: 936-944.
[16]	CUTURI M. Sinkhorn distance: Lightspeed computation of optimal transport[C]// Proceedings of Annual Conference on Neural Information Processing Systems. Cambridge: MIT Press, 2013: 1-9.
[17]	KATHAROPOULOS A, VYAS A, PAPPAS N, et al. Transformers are RNNs: Fast autoregressive transformers with linear attention[C]//Proceedings of 2020 International Conference on Machine Learing (ICML). New York: ACM Press, 2020: 1-10.
[18]	DUSMANU M, ROCCO I, PAJDLA T, et al. D2-net: a trainable CNN for joint description and detection of local features[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Piscataway: IEEE Press, 2019: 8084-8093.
[19]	QUAN D, WANG S, NING H Y, et al. Element-wise feature relation learning network for cross-spectral image patch matching[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(8): 3372-3386. doi: 10.1109/TNNLS.2021.3052756
[20]	RAZAKARIVONY S, JURIE F. Vehicle detection in aerial imagery: A small target detection benchmark[J]. Journal of Visual Communication and Image Representation, 2016, 34: 187-203. doi: 10.1016/j.jvcir.2015.11.002
[21]	ROCCO I, ARANDJELOVIC R, SIVIC J. Convolutional neural network architecture for geometric matching[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2019, 41(11): 2553-2567. doi: 10.1109/TPAMI.2018.2865351
[22]	HARTLEY R, ZISSERMAN A. Multiple view geometry in computer vision[M]. New York: Cambridge University Press, 2004: 117-123.

Relative Articles

Supplements(0)

Cited By

Proportional views

Proportional views

通讯作者: 陈斌, bchen63@163.com

1.
沈阳化工大学材料科学与工程学院沈阳 110142

Figures(9) / Tables(3)

Get Citation

PDF

XML

Article Metrics

Article views(551) PDF downloads(24)

A matching method based on improved SuperPoint and linear Transformer for optical and infrared images

doi: 10.13700/j.bh.1001-5965.2022.1022

Abstract

References

Proportional views

Catalog

通讯作者: 陈斌, bchen63@163.com

Article Metrics

Proportional views

Related

A matching method based on improved SuperPoint and linear Transformer for optical and infrared images

doi: 10.13700/j.bh.1001-5965.2022.1022

Abstract

References

Proportional views

Catalog

通讯作者: 陈斌, bchen63@163.com

Article Metrics

Proportional views

Related

Export File

Citation

Format

Content