
Cascaded object drift determination network for long-term visual tracking

HOU Zhiqiang, ZHAO Jiaxin, CHEN Yu, MA Sugang, YU Wangsheng, FAN Jiulun

Citation: HOU Z Q, ZHAO J X, CHEN Y, et al. Cascaded object drift determination network for long-term visual tracking[J]. Journal of Beijing University of Aeronautics and Astronautics, 2025, 51(7): 2240-2252 (in Chinese). doi: 10.13700/j.bh.1001-5965.2023.0504

doi: 10.13700/j.bh.1001-5965.2023.0504
Funds: 

National Natural Science Foundation of China (62072370); Natural Science Foundation of Shaanxi Province (2023-JC-YB-598)

Details
    Corresponding author:

    E-mail: hzq@xupt.edu.cn

  • CLC number: TP391.4

  • Abstract:

    To address the problems that existing object drift determination criteria require manually selected thresholds and deliver unsatisfactory determination performance, a cascaded object drift determination network with adaptive threshold selection is proposed. Whether a tracking result has drifted is judged through a cascade of two sub-determination networks. Within the proposed network, a static template, a long-term template, and a short-term template jointly evaluate the tracking result to improve determination accuracy; to adapt the templates to changes in object appearance during determination, a long- and short-term template update strategy is designed to maintain template quality. The proposed cascaded object drift determination network is combined with the short-term tracker TransT and the global re-detection method GlobalTrack to build the long-term visual tracking algorithm TransT_LT. The algorithm is evaluated on four long-term visual tracking datasets: UAV20L, LaSOT, VOT2018-LT, and VOT2020-LT. Experimental results show that it achieves superior long-term tracking performance; in particular, on the UAV20L dataset, the tracking success rate and precision improve by 7.7% and 10.3%, respectively, over the baseline algorithm. The determination speed of the proposed object drift determination network is 100 frames/s, so it has little impact on the speed of the long-term visual tracking algorithm.
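
    The abstract describes a clear control flow for TransT_LT: the short-term tracker produces a result for each frame, the cascaded drift determination network accepts or rejects that result using the static, long-term, and short-term templates, and global re-detection is triggered only when drift is declared. The listing below is a minimal sketch of that loop under assumed interfaces: every name in it (tracker.track, cascade.judge, redetector.redetect, crop) is a hypothetical placeholder rather than the authors' code, and the internals of the two sub-determination networks (the LFFM and MGFFM modules of Figs. 3 and 4) are deliberately abstracted away.

    # Minimal sketch of the long-term tracking pipeline described in the abstract.
    # All object/function names are hypothetical placeholders, not the authors' API.

    LONG_TERM_UPDATE_INTERVAL = 100  # delta, in frames (best value per Table 2)

    def crop(frame, box):
        """Hypothetical helper: cut the template patch for an (x, y, w, h) box."""
        x, y, w, h = (int(v) for v in box)
        return frame[y:y + h, x:x + w]

    def track_sequence(frames, init_box, tracker, cascade, redetector):
        """Short-term tracking + cascaded drift determination + global re-detection."""
        static_template = crop(frames[0], init_box)   # fixed first-frame template
        long_template = static_template               # refreshed every delta frames
        short_template = static_template              # refreshed on reliable frames
        results = [init_box]

        for t, frame in enumerate(frames[1:], start=1):
            box = tracker.track(frame, results[-1])   # e.g. a TransT-style tracker

            # Two-stage cascade: judge whether the current result has drifted,
            # using the static, long-term, and short-term templates jointly.
            drifted = cascade.judge(frame, box, static_template,
                                    long_template, short_template)

            if drifted:
                # Global re-detection (GlobalTrack-style) over the whole frame.
                box = redetector.redetect(frame, static_template)
            else:
                short_template = crop(frame, box)     # short-term template update

            if t % LONG_TERM_UPDATE_INTERVAL == 0 and not drifted:
                long_template = crop(frame, box)      # long-term template update

            results.append(box)
        return results

    Keeping the first-frame template fixed gives the re-detector a drift-free reference, which is consistent with the abstract's separation of the static template from the two updated ones.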

     

  • Figure 1.  Framework of long-term visual tracking algorithm

    Figure 2.  Structure of cascaded object drift determination network

    Figure 3.  Schematic diagram of LFFM module

    Figure 4.  Schematic diagram of MGFFM module

    Figure 5.  Examples of positive and negative samples

    Figure 6.  LaSOT tracking result evaluation curves

    Figure 7.  UAV20L tracking result evaluation curves and average success rate comparison across attributes

    Figure 8.  Qualitative analysis results

    Figure 9.  Average speed comparison of visual tracking algorithms

    Figure 10.  Qualitative analysis of the scale variation attribute after incorporating the second-level object drift determination network

    Table 1.  Impact of different $ \partial $ and $ \beta $ values on long-term visual tracking performance

    $ \left( {\partial ,\beta } \right) $    UAV20L (success rate, precision)    VOT2020-LT (F-score, precision, recall)
    (0.5,0.6) 0.635 0.826 0.648 0.693 0.608
    (0.5,0.7) 0.649 0.839 0.660 0.703 0.623
    (0.5,0.8) 0.654 0.862 0.661 0.711 0.618
    (0.5,0.9) 0.666 0.866 0.664 0.653 0.675
    (0.6,0.7) 0.671 0.870 0.664 0.709 0.625
    (0.6,0.8) 0.672 0.872 0.665 0.653 0.677
    (0.6,0.9) 0.693 0.898 0.671 0.715 0.632
    (0.7,0.8) 0.673 0.870 0.665 0.713 0.624
    (0.7,0.9) 0.672 0.875 0.665 0.711 0.622
     Note: Bold indicates the best values.

    Table 2.  Impact of the long-term template update interval parameter $ \delta $ on long-term visual tracking performance

    $ \delta $/frame    UAV20L (success rate, precision)    VOT2020-LT (F-score, precision, recall)
    50 0.666 0.859 0.665 0.713 0.623
    100 0.693 0.898 0.671 0.715 0.632
    150 0.689 0.893 0.669 0.711 0.631
    200 0.686 0.888 0.664 0.713 0.622
    250 0.685 0.886 0.661 0.712 0.618
    300 0.672 0.866 0.660 0.715 0.613
     Note: Bold indicates the best values.
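
    Read together, Tables 1 and 2 indicate that $ \left( {\partial ,\beta } \right) $ = (0.6, 0.9) with $ \delta $ = 100 frames gives the best overall performance on both datasets. Purely as an illustration of how these settings might be collected in one place (the dictionary keys below are invented for this sketch, not the authors' parameter names):

        # Hypothetical configuration mirroring the best rows of Tables 1 and 2;
        # the key names are illustrative, not taken from the paper.
        DRIFT_NET_CONFIG = {
            "partial": 0.6,                    # best-performing value of the Table 1 parameter written as a partial symbol
            "beta": 0.9,                       # best-performing beta (Table 1)
            "long_term_update_interval": 100,  # best delta, in frames (Table 2)
        }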

    Table 3.  Optimal thresholds for different determination criteria on two datasets

    Determination criterion    UAV20L    VOT2020-LT
    Fmax[4] >0.9 >0.95
    PSR[8] >3 >5
    APCE[9] <17 <19
    RT-MDNet[10] / /
    MDNet[12] / /
    Ours / /
     Note: "/" indicates that no threshold needs to be selected.
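
    For context on Table 3: Fmax simply thresholds the maximum response (confidence) score, while PSR and APCE are the standard response-map confidence measures introduced in [8] and [9], respectively. Their usual definitions are

    $ \mathrm{PSR} = \dfrac{g_{\max} - \mu_{sl}}{\sigma_{sl}} $ and $ \mathrm{APCE} = \dfrac{\left| F_{\max} - F_{\min} \right|^2}{\mathrm{mean}\left( \sum_{w,h} \left( F_{w,h} - F_{\min} \right)^2 \right)} $

    where $ g_{\max} $ is the peak of the response map, $ \mu_{sl} $ and $ \sigma_{sl} $ are the mean and standard deviation of the sidelobe region around the peak, and $ F_{\max} $, $ F_{\min} $, $ F_{w,h} $ are the maximum, minimum, and per-location values of the response map. Each of these measures still depends on the hand-picked thresholds listed above, which is the limitation the proposed network is designed to remove.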

    Table 4.  Performance of the optimal thresholds of different determination criteria on two datasets

    Determination criterion    UAV20L (success rate, precision)    VOT2020-LT (F-score, precision, recall)
    Fmax[4] 0.632 0.817 0.644 0.696 0.599
    PSR[8] 0.639 0.824 0.648 0.704 0.600
    APCE[9] 0.641 0.828 0.651 0.709 0.602
    RT-MDNet[10] 0.624 0.806 0.641 0.712 0.590
    MDNet[12] 0.630 0.813 0.646 0.698 0.601
    Ours 0.693 0.898 0.671 0.715 0.632
     Note: Bold indicates the best values.

    Table 5.  Ablation experiment results

    TransT    TTD    LTUS    CDDN    UAV20L (success rate, precision)    VOT2020-LT (F-score, precision, recall)
    0.616 0.795 0.638 0.682 0.599
    0.658 0.857 0.657 0.669 0.645
    0.688 0.892 0.668 0.660 0.676
    0.693 0.898 0.671 0.715 0.632

    Table 6.  Tracking results of different visual tracking algorithms on VOT2018-LT dataset

    Algorithm    F-score    Precision    Recall
    TANet[4] 0.586 0.649 0.535
    LTMU[5] 0.690 0.710 0.672
    ELGLT[6] 0.638 0.669 0.610
    LGST[7] 0.630 0.637 0.622
    MBMD[11] 0.610 0.634 0.588
    TransT[14] 0.670 0.714 0.631
    GlobalTrack[15] 0.555 0.503 0.528
    SPLT[26] 0.616 0.633 0.600
    Siam R-CNN[27] 0.668 0.667 0.675
    LTST[39] 0.636 0.653 0.620
    MTTNet[40] 0.621 0.634 0.609
    TransT_LT (ours) 0.693 0.736 0.654

    Table 7.  Tracking results of different visual tracking algorithms on VOT2020-LT dataset

    Algorithm    F-score    Precision    Recall
    TANet[4] 0.515 0.568 0.513
    ELGLT[6] 0.607 0.611 0.560
    LGST[7] 0.578 0.607 0.552
    MBMD[11] 0.575 0.623 0.534
    TransT[14] 0.638 0.682 0.599
    GlobalTrack[15] 0.555 0.503 0.528
    SPLT[26] 0.565 0.587 0.544
    Siam R-CNN[27] 0.670 0.658 0.676
    LTST[39] 0.587 0.631 0.548
    RLT-DiMP[41] 0.670 0.657 0.684
    TransT_LT (ours) 0.671 0.715 0.632

    Table 8.  Experimental results of the second-level object drift determination network on the scale variation attribute of the UAV20L dataset

    Configuration    Success rate    Precision
    TransT_LT (without cascade) 0.688 0.886
    TransT_LT (with cascade) 0.693 0.892
  • [1] LI X, ZHA Y F, ZHANG T Z, et al. Survey of visual object tracking algorithms based on deep learning[J]. Journal of Image and Graphics, 2019, 24(12): 2057-2080 (in Chinese). doi: 10.11834/jig.190372
    [2] LIU F, SUN Y N, WANG H J, et al. Adaptive UAV target tracking algorithm based on residual learning[J]. Journal of Beijing University of Aeronautics and Astronautics, 2020, 46(10): 1874-1882 (in Chinese).
    [3] PU L, FENG X X, HOU Z Q, et al. Siamese network visual tracking algorithm based on cascaded attention mechanism[J]. Journal of Beijing University of Aeronautics and Astronautics, 2020, 46(12): 2302-2310 (in Chinese).
    [4] WANG X, TANG J, LUO B, et al. Tracking by joint local and global search: a target-aware attention-based approach[J]. IEEE Transactions on Neural Networks and Learning Systems, 2022, 33(11): 6931-6945. doi: 10.1109/TNNLS.2021.3083933
    [5] DAI K N, ZHANG Y H, WANG D, et al. High-performance long-term tracking with meta-updater[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 6297-6306.
    [6] ZHAO H J, YAN B, WANG D, et al. Effective local and global search for fast long-term tracking[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2023, 45(1): 460-474. doi: 10.1109/TPAMI.2022.3153645
    [7] GAO Z, ZHUANG Y, GU J J, et al. A joint local-global search mechanism for long-term tracking with dynamic memory network[J]. Expert Systems with Applications, 2023, 223: 119890. doi: 10.1016/j.eswa.2023.119890
    [8] BOLME D S, BEVERIDGE J R, DRAPER B A, et al. Visual object tracking using adaptive correlation filters[C]//Proceedings of the IEEE Computer Society Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2010: 2544-2550.
    [9] WANG M M, LIU Y, HUANG Z Y. Large margin object tracking with circulant feature maps[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 4800-4808.
    [10] JUNG I, SON J, BAEK M, et al. Real-time MDNet[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 83-98.
    [11] ZHANG Y, WANG D, WANG L, et al. Learning regression and verification networks for long-term visual tracking[EB/OL]. (2018-11-19)[2023-08-01]. http://arxiv.org/abs/1809.04320v1.
    [12] NAM H, HAN B. Learning multi-domain convolutional neural networks for visual tracking[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 4293-4302.
    [13] HOU Z Q, WANG Z, PU L, et al. Target drift discriminative network based on deep learning in visual tracking[J]. Journal of Electronic Imaging, 2022, 31: 043052.
    [14] CHEN X, YAN B, ZHU J, et al. Transformer tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 8126-8135.
    [15] HUANG L H, ZHAO X, HUANG K Q. GlobalTrack: a simple and strong baseline for long-term tracking[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 2020, 34(7): 11037-11044. doi: 10.1609/aaai.v34i07.6758
    [16] MUELLER M, SMITH N, GHANEM B. A benchmark and simulator for UAV tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2016: 445-461.
    [17] FAN H, LIN L T, YANG F, et al. LaSOT: a high-quality benchmark for large-scale single object tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 5369-5378.
    [18] LUKEŽIČ A, ZAJC L Č, VOJÍŘ T, et al. Now you see me: evaluating performance in long-term visual tracking[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2018: 4991391.
    [19] KRISTAN M, LEONARDIS A, MATAS J, et al. The eighth visual object tracking VOT2020 challenge results[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2020: 547-601.
    [20] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2018: 7132-7141.
    [21] ZHAO H S, SHI J P, QI X J, et al. Pyramid scene parsing network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2017: 6230-6239.
    [22] NI J J, WU J H, TONG J, et al. GC-Net: global context network for medical image segmentation[J]. Computer Methods and Programs in Biomedicine, 2020, 190: 105121. doi: 10.1016/j.cmpb.2019.105121
    [23] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. (2015-04-10)[2023-08-01]. http://arxiv.org/abs/1409.1556.
    [24] DONG Q J, HE X D, GE H Y, et al. Improving model drift for robust object tracking[J]. Multimedia Tools and Applications, 2020, 79(35): 25801-25815.
    [25] WU Y, LIM J, YANG M H. Object tracking benchmark[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2015, 37(9): 1834-1848. doi: 10.1109/TPAMI.2014.2388226
    [26] YAN B, ZHAO H J, WANG D, et al. ‘skimming-perusal’ tracking: a framework for real-time and robust long-term tracking[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 2385-2393.
    [27] VOIGTLAENDER P, LUITEN J, TORR P H S, et al. Siam R-CNN: visual tracking by re-detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 6577-6587.
    [28] MA F, SHOU M Z, ZHU L C, et al. Unified Transformer tracker for object tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2022: 8771-8780.
    [29] FU Z H, LIU Q J, FU Z H, et al. STMTrack: template-free visual tracking with space-time memory networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 13769-13778.
    [30] BHAT G, DANELLJAN M, VAN GOOL L, et al. Learning discriminative model prediction for tracking[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2019: 6181-6190.
    [31] GUO D Y, SHAO Y Y, CUI Y, et al. Graph attention tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 9538-9547.
    [32] LI B, WU W, WANG Q, et al. SiamRPN++: evolution of Siamese visual tracking with very deep networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 4282-4291.
    [33] DANELLJAN M, BHAT G, KHAN F S, et al. ATOM: accurate tracking by overlap maximization[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2019: 4655-4664.
    [34] WANG N, ZHOU W G, WANG J, et al. Transformer meets tracker: exploiting temporal context for robust visual tracking[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2021: 1571-1580.
    [35] DANELLJAN M, HÄGER G, KHAN F S, et al. Learning spatially regularized correlation filters for visual tracking[C]//Proceedings of the IEEE International Conference on Computer Vision. Piscataway: IEEE Press, 2015: 4310-4318.
    [36] HARE S, GOLODETZ S, SAFFARI A, et al. Struck: structured output tracking with kernels[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2016, 38(10): 2096-2109. doi: 10.1109/TPAMI.2015.2509974
    [37] DANELLJAN M, HÄGER G, KHAN F S, et al. Discriminative scale space tracking[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 39(8): 1561-1575. doi: 10.1109/TPAMI.2016.2609928
    [38] KALAL Z, MIKOLAJCZYK K, MATAS J. Tracking-learning-detection[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2012, 34(7): 1409-1422. doi: 10.1109/TPAMI.2011.239
    [39] YU L, QIAO B J, ZHANG H L, et al. LTST: long-term segmentation tracker with memory attention network[J]. Image and Vision Computing, 2022, 119: 104374. doi: 10.1016/j.imavis.2022.104374
    [40] SANG H, LI G, ZHAO Z. Multi-scale global retrieval and temporal-spatial consistency matching based long-term tracking network[J]. Chinese Journal of Electronics, 2022, 32: 1-11.
    [41] CHOI S, LEE J, LEE Y, et al. Robust long-term object tracking via improved discriminative model prediction[C]//Proceedings of the European Conference on Computer Vision. Berlin: Springer, 2020: 602-617.
Publication history
  • Received: 2023-08-02
  • Accepted: 2023-11-10
  • Available online: 2023-11-23
  • Issue date: 2025-07-14
