Semi-supervised image retrieval based on triplet hash loss

SHAO Weizhi, XIONG Siyu, PAN Lili

Citation: SHAO W Z, XIONG S Y, PAN L L. Semi-supervised image retrieval based on triplet hash loss[J]. Journal of Beijing University of Aeronautics and Astronautics, 2025, 51(7): 2526-2537 (in Chinese). doi: 10.13700/j.bh.1001-5965.2023.0451

doi: 10.13700/j.bh.1001-5965.2023.0451
    Corresponding author: E-mail: lily_pan@163.com

  • CLC number: TP391

Funds: The General Program of Natural Science Foundation of Hunan Province (2021JJ31164); The Key Program of Science Research Foundation of Education Department of Hunan Province (22A0195)

  • Abstract:

    Most current deep-learning-based image retrieval methods are trained under full supervision and require large amounts of labeled data, yet collecting such labels in practice is difficult and costly. Moreover, the existing Euclidean-distance-based triplet loss is computed imprecisely, which weakens a model's ability to learn image similarity. This paper combines entropy-minimized pseudo-labels, a triplet loss, and semi-supervised learning to propose a new semi-supervised hashing model for image retrieval (SSITL). Multi-stage model joining and sharpening are applied to generate pseudo-labels for the unlabeled data, and entropy minimization is used to raise the confidence of these pseudo-labels. Meanwhile, triplets are selected from the clustering results of the labeled and unlabeled data, and a triplet hash loss based on a channel weight matrix (CWT loss) helps SSITL learn image similarity. To produce better hash codes, MixUp is applied between two Hamming embeddings to obtain a new Hamming embedding, which improves retrieval performance. Experimental results show that, at a similar time cost to other methods, SSITL improves mean average precision by 1.2% on CIFAR-10 and 0.7% on NUS-WIDE, strongly confirming SSITL as an excellent semi-supervised hashing model for image retrieval.
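    As a concrete illustration of the pseudo-label step described in the abstract, the sketch below shows MixMatch-style prediction averaging followed by temperature sharpening (MixMatch is cited as [16]). The function names, the temperature T, and the choice of PyTorch are illustrative assumptions, not the authors' exact implementation.

        import torch
        import torch.nn.functional as F

        def sharpen(p: torch.Tensor, T: float = 0.5) -> torch.Tensor:
            # Raising probabilities to the power 1/T (T < 1) lowers the entropy
            # of the distribution, which raises pseudo-label confidence.
            p_t = p ** (1.0 / T)
            return p_t / p_t.sum(dim=-1, keepdim=True)

        def make_pseudo_labels(models, x_unlabeled, T: float = 0.5):
            # Multi-stage joining: average class predictions from several model
            # stages on the unlabeled batch, then sharpen the average.
            with torch.no_grad():
                probs = torch.stack([F.softmax(m(x_unlabeled), dim=-1) for m in models])
            return sharpen(probs.mean(dim=0), T)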

     
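    The MixUp step between two Hamming embeddings can be sketched in the same way; this follows the standard MixUp formulation [21] with the mixing coefficient drawn from a Beta distribution, under the assumption (ours, for illustration) that the mix is applied to continuous embeddings before binarization.

        import torch

        def mixup_hamming(h1: torch.Tensor, h2: torch.Tensor, alpha: float = 0.75):
            # lam ~ Beta(alpha, alpha); taking max(lam, 1 - lam) keeps the mix
            # closer to h1, as in MixMatch's variant of MixUp.
            lam = torch.distributions.Beta(alpha, alpha).sample().item()
            lam = max(lam, 1.0 - lam)
            return lam * h1 + (1.0 - lam) * h2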

  • Figure 1.  A semi-supervised image retrieval model based on triplet hash loss

    Figure 2.  Pseudo-label generation process

    Figure 3.  Graphical representation of loss function calculation
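    The CWT loss itself is defined in the full text rather than on this page; the sketch below only illustrates the general shape of a channel-weighted triplet loss, in which a per-channel weight vector replaces the plain Euclidean distance. The weight vector w, the margin, and the squared-difference distance are all assumptions for illustration.

        import torch

        def cwt_triplet_loss(anchor, positive, negative, w, margin: float = 1.0):
            # w holds per-channel weights (shape: [n_bits]), broadcast over the
            # batch; more informative channels contribute more to the distance.
            d_pos = (w * (anchor - positive) ** 2).sum(dim=-1)
            d_neg = (w * (anchor - negative) ** 2).sum(dim=-1)
            return torch.clamp(d_pos - d_neg + margin, min=0.0).mean()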

    Figure 4.  MAP scores for different amounts of labeled data on different datasets

    Figure 5.  MAP scores with different hash code lengths on different datasets

    Figure 6.  Seen class retrieval on different datasets

    Figure 7.  Unseen class retrieval on different datasets

    Figure 8.  Visualization of the original image and activation values of different layers
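    Figure 8's layer-activation visualizations can be reproduced with a standard forward hook; the sketch below uses a torchvision ResNet-50 as a stand-in backbone (an assumption; this page does not name the paper's network).

        import torch
        from torchvision.models import resnet50

        model = resnet50(weights=None).eval()   # stand-in backbone
        activations = {}

        def save_activation(name):
            def hook(module, inputs, output):
                activations[name] = output.detach()
            return hook

        model.layer3.register_forward_hook(save_activation("layer3"))
        with torch.no_grad():
            model(torch.randn(1, 3, 224, 224))  # dummy input image
        # Each channel of the stored tensor can be rendered as a heat map and
        # compared against the original image, as in Figure 8.
        print(activations["layer3"].shape)      # torch.Size([1, 1024, 14, 14])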

    Table 1.  MAP scores of different methods on CIFAR-10

    Method 12 bits 24 bits 32 bits 48 bits
    SSITL 0.818 0.838 0.853 0.854
    ABML[19] 0.815 0.832 0.850 0.851
    CPQN[18] 0.817 0.830 0.848 0.852
    BGDH[30] 0.805 0.824 0.826 0.833
    DSH-GAN[31] 0.751 0.801 0.807 0.811
    SSDH[17] 0.802 0.810 0.816 0.819
    DPSH[27] 0.737 0.775 0.801 0.798
    DSDH[29] 0.738 0.784 0.795 0.818
    DRSCH[28] 0.616 0.625 0.630 0.629
    SDH[25] 0.438 0.520 0.558 0.587
    ITQ[26] 0.219 0.242 0.250 0.252
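    The MAP figures in Tables 1-10 follow the usual Hamming-ranking protocol: each query ranks the database by Hamming distance, and average precision is computed over the ranked relevance list. A minimal single-label sketch is shown below (for the multi-label NUS-WIDE data, relevance typically means sharing at least one tag).

        import numpy as np

        def mean_average_precision(q_codes, db_codes, q_labels, db_labels):
            # Codes: {0, 1} arrays of shape (n, n_bits); labels: integer class ids.
            aps = []
            for q, ql in zip(q_codes, q_labels):
                dist = np.count_nonzero(db_codes != q, axis=1)  # Hamming distance
                order = np.argsort(dist, kind="stable")
                rel = (db_labels[order] == ql).astype(float)
                if rel.sum() == 0:
                    continue
                ranks = np.nonzero(rel)[0] + 1                  # 1-based hit ranks
                aps.append((np.cumsum(rel)[rel == 1] / ranks).mean())
            return float(np.mean(aps))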

    Table 2.  MAP scores of different methods on NUS-WIDE dataset

    Method 12 bits 24 bits 32 bits 48 bits
    SSITL 0.838 0.857 0.880 0.873
    ABML[19] 0.835 0.851 0.872 0.869
    CPQN[18] 0.833 0.849 0.869 0.870
    BGDH[30] 0.805 0.824 0.826 0.833
    DSH-GAN[31] 0.828 0.843 0.848 0.851
    SSDH[17] 0.803 0.808 0.826 0.833
    DPSH[27] 0.767 0.778 0.795 0.798
    DSDH[29] 0.772 0.804 0.821 0.831
    DRSCH[28] 0.616 0.623 0.627 0.627
    SDH[25] 0.541 0.548 0.579 0.621
    ITQ[26] 0.663 0.700 0.707 0.723

    Table 3.  MAP scores of different SSITL variants on CIFAR-10 and NUS-WIDE datasets

    Method (CIFAR-10) 12 bits 24 bits 32 bits 48 bits
    SSITL 0.818 0.838 0.853 0.854
    Nmid SSITL 0.781 0.812 0.828 0.831
    Nsim SSITL 0.737 0.764 0.785 0.778
    Method (NUS-WIDE) 12 bits 24 bits 32 bits 48 bits
    SSITL 0.838 0.857 0.880 0.873
    Nmid SSITL 0.807 0.823 0.844 0.857
    Nsim SSITL 0.775 0.788 0.815 0.820

    Table 4.  MAP scores of unseen class retrieval with different methods on CIFAR-10 dataset

    Method 12 bits 24 bits 32 bits 48 bits
    SSITL 0.319 0.327 0.349 0.358
    ABML[19] 0.316 0.325 0.337 0.348
    BGDH[30] 0.267 0.279 0.284 0.294
    DSH-GAN[31] 0.281 0.288 0.299 0.310
    SSDH[17] 0.287 0.291 0.309 0.317
    DPSH[27] 0.264 0.275 0.281 0.293
    DSDH[29] 0.255 0.263 0.278 0.288
    DRSCH[28] 0.217 0.218 0.233 0.251
    SDH[25] 0.190 0.192 0.197 0.206
    ITQ[26] 0.153 0.162 0.192 0.199

    Table 5.  MAP scores of unseen class retrieval with different methods on NUS-WIDE dataset

    Method 12 bits 24 bits 32 bits 48 bits
    SSITL 0.537 0.553 0.581 0.584
    ABML[19] 0.532 0.550 0.574 0.582
    BGDH[30] 0.511 0.529 0.545 0.538
    DSH-GAN[31] 0.508 0.539 0.542 0.541
    SSDH[17] 0.514 0.534 0.538 0.549
    DPSH[27] 0.487 0.512 0.514 0.527
    DSDH[29] 0.255 0.263 0.278 0.288
    DRSCH[28] 0.458 0.463 0.471 0.468
    SDH[25] 0.468 0.489 0.491 0.505
    ITQ[26] 0.490 0.486 0.493 0.507

    Table 6.  Unseen class retrieval using activation channels of different layers on CIFAR-10 dataset

    Activation layer 12 bits 24 bits 32 bits 48 bits
    Layer 5 0.226 0.233 0.251 0.273
    Layer 30 0.286 0.305 0.329 0.343
    Layer 100 0.319 0.327 0.349 0.358
    Layer 150 0.308 0.319 0.346 0.354

    Table 7.  Unseen class retrieval using activation channels of different layers on NUS-WIDE dataset

    Activation layer 12 bits 24 bits 32 bits 48 bits
    Layer 5 0.462 0.473 0.481 0.483
    Layer 30 0.504 0.511 0.539 0.542
    Layer 100 0.537 0.553 0.581 0.584
    Layer 150 0.518 0.532 0.574 0.579

    Table 8.  Unseen class retrieval using different weight values on CIFAR-10 dataset

    Weight value 12 bits 24 bits 32 bits 48 bits
    $\lambda_2 = 0$ 0.275 0.286 0.305 0.311
    $\lambda_2 = 0.2$ 0.302 0.311 0.332 0.343
    $\lambda_2 = 0.5$ 0.319 0.327 0.349 0.358
    $\lambda_2 = 1$ 0.284 0.306 0.329 0.338
    $\lambda_2 = 2$ 0.265 0.281 0.297 0.301

    Table 9.  Unseen class retrieval using different weight values on NUS-WIDE dataset

    Weight value 12 bits 24 bits 32 bits 48 bits
    $\lambda_2 = 0$ 0.508 0.517 0.526 0.539
    $\lambda_2 = 0.2$ 0.519 0.542 0.559 0.571
    $\lambda_2 = 0.5$ 0.537 0.553 0.581 0.584
    $\lambda_2 = 1$ 0.513 0.537 0.541 0.564
    $\lambda_2 = 2$ 0.497 0.511 0.530 0.543
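    Tables 8 and 9 sweep the weight $\lambda_2$; its exact role is defined in the full text, but a natural reading (an assumption on our part) is that it balances a secondary loss term in an overall objective of the form $L = L_{\text{main}} + \lambda_2 L_{\text{aux}}$, so that $\lambda_2 = 0$ disables the term entirely. On both datasets retrieval peaks at $\lambda_2 = 0.5$ and degrades when the term is over-weighted.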

    Table 10.  Retrieval time comparison of different methods (unit: ms)

    Method CIFAR-10 NUS-WIDE
    SSITL 3.58 12.04
    SSGAH[32] 3.51 12.01
    SSDH[17] 3.44 11.63
    DSH-GAN[31] 3.45 12.07
    DRSCH[28] 4.49 12.64
    NINH[3] 4.01 12.04
    CNNH[4] 3.83 12.07
  • [1] LI W, DUAN L X, XU D, et al. Text-based image retrieval using progressive multi-instance learning[C]//Proceedings of the 2011 International Conference on Computer Vision. Piscataway: IEEE Press, 2011: 2049-2055.
    [2] LIU Y, ZHANG D S, LU G J, et al. A survey of content-based image retrieval with high-level semantics[J]. Pattern Recognition, 2007, 40(1): 262-282. doi: 10.1016/j.patcog.2006.04.045
    [3] CHEN R Y, PAN L L, LI C, et al. An improved deep fusion CNN for image recognition[J]. Computers, Materials & Continua, 2020, 65(2): 1691-1706.
    [4] LAI H J, PAN Y, YE L, et al. Simultaneous feature learning and hash coding with deep neural networks[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2015: 3270-3278.
    [5] CHEN Y B, MANCINI M, ZHU X T, et al. Semi-supervised and unsupervised deep visual learning: a survey[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2024, 46(3): 1327-1347. doi: 10.1109/TPAMI.2022.3201576
    [6] LIU Y, CHENG M, WANG F P, et al. Deep hashing image retrieval methods[J]. Journal of Image and Graphics, 2020, 25(7): 1296-1317 (in Chinese). doi: 10.11834/jig.190518
    [7] ZHU X, GOLDBERG A B. Introduction to semi-supervised learning[J]. Synthesis Lectures on Artificial Intelligence and Machine Learning, 2009, 3(1): 1-130.
    [8] SCHROFF F, KALENICHENKO D, PHILBIN J. FaceNet: a unified embedding for face recognition and clustering[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2015: 815-823.
    [9] SONG H O, XIANG Y, JEGELKA S, et al. Deep metric learning via lifted structured feature embedding[C]//Proceedings of the 2016 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 4004-4012.
    [10] ZHENG D G, LIU G J, MAO Y B, et al. Deep face hashing based on ternary-group loss function[J]. Journal of Terahertz Science and Electronic Information Technology, 2021, 19(2): 313-318 (in Chinese). doi: 10.11805/TKYDA2018108
    [11] DU Y J, LI H S, YAO C L, et al. Monocular image based 3D model retrieval using triplet network[J]. Journal of Beijing University of Aeronautics and Astronautics, 2020, 46(9): 1691-1700 (in Chinese).
    [12] LIU H Y, HUANG H E, ZHENG S B. View consistency triplet loss for vehicle re-identification[J]. Measurement & Control Technology, 2021, 40(8): 47-53, 63 (in Chinese).
    [13] LIAO S C, SHAO L. Graph sampling based deep metric learning for generalizable person re-identification[C]//Proceedings of the 2022 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2022: 7349-7358.
    [14] YANG S, ZHANG Y F, ZHAO Q H, et al. Prototype-based support example miner and triplet loss for deep metric learning[J]. Electronics, 2023, 12(15): 3315. doi: 10.3390/electronics12153315
    [15] LI Z, KO B, CHOI H J. Naive semi-supervised deep learning using pseudo-label[J]. Peer-to-Peer Networking and Applications, 2019, 12(5): 1358-1368. doi: 10.1007/s12083-018-0702-9
    [16] BERTHELOT D, CARLINI N, GOODFELLOW I, et al. MixMatch: a holistic approach to semi-supervised learning[EB/OL]. (2019-10-23)[2023-05-23]. http://doi.org/10.48550/arXiv.1905.02249.
    [17] ZHANG J, PENG Y X. SSDH: Semi-supervised deep hashing for large scale image retrieval[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2017, 29(1): 212-225.
    [18] GUO Z T, HONG C Q, ZHUANG W W, et al. CPQN: central product quantization network for semi-supervised image retrieval[C]//Proceedings of the 2021 IEEE International Conference on Big Data. Piscataway: IEEE Press, 2021: 3183-3190.
    [19] WANG G A, HU Q H, YANG Y, et al. Adversarial binary mutual learning for semi-supervised deep hashing[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(8): 4110-4124. doi: 10.1109/TNNLS.2021.3055834
    [20] WEI X, WANG J J, ZHANG S L, et al. ReLSL: reliable label selection and learning based algorithm for semi-supervised learning[J]. Chinese Journal of Computers, 2022, 45(6): 1147-1160 (in Chinese). doi: 10.11897/SP.J.1016.2022.01147
    [21] ZHANG H Y, CISSE M, DAUPHIN Y N, et al. Mixup: beyond empirical risk minimization[EB/OL]. (2018-04-27)[2023-05-25]. http://doi.org/10.48550/arXiv.1710.09412.
    [22] WANG G A, HU Q H, YANG Y, et al. Adversarial binary mutual learning for semi-supervised deep hashing[J]. IEEE Transactions on Neural Networks and Learning Systems, 2021, 33(8): 4110-4124.
    [23] KRIZHEVSKY A, HINTON G. Convolutional deep belief networks on CIFAR-10[J]. Unpublished manuscript, 2010, 40(7): 1-9.
    [24] CHUA T S, TANG J H, HONG R C, et al. NUS-WIDE: a real-world web image database from National University of Singapore[C]// Proceedings of the ACM International Conference on Image and Video Retrieval. New York: ACM, 2009: 1-9.
    [25] SHEN F M, SHEN C H, LIU W, et al. Supervised discrete hashing[C]//Proceedings of the 2015 IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2015: 37-45.
    [26] GONG Y C, LAZEBNIK S, GORDO A, et al. Iterative quantization: a Procrustean approach to learning binary codes for large-scale image retrieval[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 35(12): 2916-2929.
    [27] LI W J, WANG S, KANG W C. Feature learning based deep supervised hashing with pairwise labels[EB/OL]. (2016-04-21)[2023-05-27]. http://doi.org/10.48550/arXiv.1511.03855.
    [28] ZHANG R M, LIN L, ZHANG R, et al. Bit-scalable deep hashing with regularized similarity learning for image retrieval and person re-identification[J]. IEEE Transactions on Image Processing, 2015, 24(12): 4766-4779. doi: 10.1109/TIP.2015.2467315
    [29] LI Q, SUN Z, HE R, et al. Deep supervised discrete hashing[C]//Proceedings of the 31st International Conference on Neural Information Processing Systems. Long Beach: Curran Associates, 2017: 2479-2488.
    [30] YAN X, ZHANG L, LI W J. Semi-supervised deep hashing with a bipartite graph[C]//Proceedings of the 26th International Joint Conference on Artificial Intelligence. Melbourne: IJCAI, 2017: 3238-3244.
    [31] QIU Z F, PAN Y W, YAO T, et al. Deep semantic hashing with generative adversarial networks[C]// Proceedings of the 40th International ACM SIGIR Conference on Research and Development in Information Retrieval. New York: ACM, 2017: 225-234.
    [32] WANG G A, HU Q H, CHENG J, et al. Semi-supervised generative adversarial hashing for image retrieval[C]// Computer Vision – ECCV 2018. Berlin: Springer, 2018: 491-507.
Figures (8) / Tables (10)
Metrics
  • Article views: 337
  • HTML full-text views: 69
  • PDF downloads: 13
  • Citations: 0
Publication history
  • Received: 2023-07-10
  • Accepted: 2023-09-14
  • Published online: 2023-10-30
  • Issue published: 2025-07-31
