Volume 46 Issue 9
Sep.  2020
ZHOU Qianli, ZHANG Wenjing, ZHAO Luping, et al. Cross-modal object tracking algorithm based on pedestrian attribute[J]. Journal of Beijing University of Aeronautics and Astronautics, 2020, 46(9): 1635-1642. doi: 10.13700/j.bh.1001-5965.2020.0042 (in Chinese)

Cross-modal object tracking algorithm based on pedestrian attribute

doi: 10.13700/j.bh.1001-5965.2020.0042
Funds:

National Key R&D Program of China (A19808)

The Operating Expenses of Basic Scientific Research Project of the People's Public Security University of China (2019JKF111)

  • Corresponding author: WANG Rong. E-mail: dbdxwangrong@163.com
  • Received Date: 21 Feb 2020
  • Accepted Date: 15 Mar 2020
  • Publish Date: 20 Sep 2020
  • Intra-class interference degrades the accuracy and robustness of object tracking algorithms when tracking pedestrians. In this paper, we analyze the drawbacks of current tracking algorithms and propose a model that combines visual features with language priors to improve tracker performance. A language-guided branch is added to supervise the visual tracking branch by generating attention, which alleviates intra-class interference. We also propose a method that improves the accuracy of cross-modal object tracking by using location confidence instead of classification confidence for Siamese trackers. To validate our method, we build a customized dataset specialized for pedestrian tracking. The experiments show the effectiveness of this model.
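
The two ideas in the abstract can be illustrated with a small sketch. It is not the authors' implementation: this page does not describe the network, so the sigmoid gating, the function names (`channel_attention`, `apply_attention`, `select_box`), the field names (`cls_conf`, `loc_conf`), and all numeric values are assumptions chosen only for illustration. The first part shows a language-derived attention vector rescaling visual feature channels; the second shows how ranking candidate boxes by location confidence can pick a different box than ranking by classification confidence.

```python
import math

def channel_attention(language_logits):
    """Map per-channel language scores to (0, 1) gates with a sigmoid.

    Assumption: one score per visual channel, produced by some
    language branch that is not modelled here.
    """
    return [1.0 / (1.0 + math.exp(-x)) for x in language_logits]

def apply_attention(visual_features, gates):
    """Scale every response in a visual channel by that channel's gate."""
    return [[g * v for v in channel] for channel, g in zip(visual_features, gates)]

def select_box(candidates, key):
    """Return the candidate with the highest value of the chosen confidence."""
    return max(candidates, key=lambda c: c[key])

# Hypothetical candidates: boxes are (x1, y1, x2, y2); confidences are made up.
candidates = [
    {"box": (10, 20, 50, 120), "cls_conf": 0.95, "loc_conf": 0.55},
    {"box": (12, 22, 48, 118), "cls_conf": 0.90, "loc_conf": 0.82},
]

by_cls = select_box(candidates, "cls_conf")  # ranks by classification confidence
by_loc = select_box(candidates, "loc_conf")  # ranks by location confidence

# Language gating example: a neutral score (0.0) halves the channel response.
gates = channel_attention([0.0])
gated = apply_attention([[2.0, 4.0]], gates)
```

In this toy data the two rankings disagree: the box with the highest classification confidence is not the one with the highest location confidence, which is the situation the abstract's location-confidence criterion is meant to address.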

     


    Figures(8)  / Tables(2)
