Omnidirectional image quality assessment based on adaptive multi-viewport fusion

FENG Chenxi, ZHANG Di, LIN Gan, YE Long

Citation: FENG C X, ZHANG D, LIN G, et al. Omnidirectional image quality assessment based on adaptive multi-viewport fusion[J]. Journal of Beijing University of Aeronautics and Astronautics, 2025, 51(7): 2404-2414 (in Chinese). doi: 10.13700/j.bh.1001-5965.2023.0381


doi: 10.13700/j.bh.1001-5965.2023.0381

    Corresponding author: E-mail: yelong@cuc.edu.cn

  • CLC number: TN911.73

Funds: 

National Key Research and Development Program of China (2021YFF0900503); National Natural Science Foundation of China (61971383,62001432); Beijing Natural Science Foundation (4244109)

  • Abstract:

    Existing omnidirectional image quality assessment (OIQA) models extract the local features of each viewport largely independently, which leads to high computational complexity and makes it difficult to characterize the correlations among viewports with an end-to-end fusion model. To address this problem, a quality assessment method based on feature sharing and adaptive multi-viewport fusion is proposed. With a shared backbone network, the mutually independent viewport segmentation and computation tasks of existing methods are moved into the feature domain, so that the local features of the entire image are extracted in a single feed-forward pass. On this basis, a feature-domain viewport segmentation method based on spherical uniform sampling is introduced to keep the pixel density consistent between the viewing space and the representation space, and semantic information is used to guide the adaptive fusion of the local quality features of the viewports. On the compressed virtual reality image quality (CVIQ) dataset and the OIQA dataset, the proposed method achieves a Pearson linear correlation coefficient (PLCC) and a Spearman rank-order correlation coefficient (SRCC) above 0.96, the best results among existing mainstream assessment methods. Compared with the traditional structural similarity index (SSIM), the proposed method improves the average PLCC and average SRCC over the two datasets by 9.52% and 8.7%, respectively; compared with the recent multi-perceptual-feature image quality assessment method (MPFIQA), it improves the average PLCC and average SRCC by 1.71% and 1.44%, respectively.
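    To make the geometry above concrete: the spherical uniform sampling that fixes the viewport centers can be realized, for example, with the generalized spiral of Saff and Kuijlaars [24]. The sketch below is a minimal, self-contained illustration under that assumption; the function name and the choice of 20 viewports are illustrative, not the authors' released code.

    ```python
    import numpy as np

    def spiral_viewport_centers(n: int) -> np.ndarray:
        """Place n points near-uniformly on the unit sphere via the
        Saff-Kuijlaars generalized spiral [24]; returns (latitude,
        longitude) pairs in radians, usable as viewport centers."""
        centers = np.empty((n, 2))
        phi = 0.0
        for k in range(n):
            h = -1.0 + 2.0 * k / (n - 1)      # h runs from -1 to +1
            theta = np.arccos(h)              # colatitude
            if 0 < k < n - 1:
                # longitude increment shrinks with distance from the poles
                phi = (phi + 3.6 / np.sqrt(n * (1.0 - h * h))) % (2.0 * np.pi)
            else:
                phi = 0.0                     # at the poles, longitude is arbitrary
            centers[k] = (np.pi / 2.0 - theta, phi)  # (latitude, longitude)
        return centers

    # Example: 20 centers; mapped back onto the equirectangular image they
    # yield a latitude-dependent, non-uniform viewport grid as in Figure 3.
    centers = spiral_viewport_centers(20)
    ```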


  • Figure 1.  Overall structure of YOFONet model

    Figure 2.  CNN-based multi-scale feature fusion module

    Figure 3.  Non-uniform viewport segmentation based on spherical uniform sampling

    Figure 4.  Illustration of interdependent relationships between quality perception and semantic recognition

    Figure 5.  gMAD competition between YOFONet and baseline models

    Figure 6.  Visualization of Transformer attention weights

    Figure 7.  Influence of projection distortion on recognition model performance

    Figure 8.  Performance of YOFONet at different resolutions

    Table 1.  Information on CVIQ and OIQA datasets

    Dataset  Reference images  Distorted images  Distortion types      Resolution             Subjects  Rating scale levels  HMD
    CVIQ     16                528               JPEG/AVC/HEVC         4096×2048              20        10                   HTC Vive
    OIQA     16                320               JPEG/JPEG2000/GB/WGN  11332×5666/13322×6661  20        10                   HTC Vive

    Table 2.  Performance comparison of models on CVIQ and OIQA datasets

    Type            Method        PLCC (CVIQ)  PLCC (OIQA)  SRCC (CVIQ)  SRCC (OIQA)
    Full-reference  PSNR          0.7320       0.6918       0.7662       0.6802
                    WS-PSNR[11]   0.7498       0.6985       0.7755       0.6932
                    SSIM[34]      0.8857       0.8892       0.8972       0.8798
                    MS-SSIM[35]   0.8762       0.8427       0.8875       0.8332
    No-reference    NIQE[37]      0.5126       0.4701       0.5329       0.3306
                    BRISQUE[36]   0.7448       0.7778       0.7641       0.7794
                    MC360IQA[8]   0.9506       0.9247       0.9139       0.9187
                    MPFIQA[13]    0.9530       0.9583       0.9466       0.9574
                    AHGCN[17]     0.9658       0.9682       0.9617       0.9647
    Proposed        YOFONet       0.9715       0.9724       0.9639       0.9675
    Note: Bold indicates the best-performing model.
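    For reference, the PLCC and SRCC reported above measure, respectively, the linearity and the rank-order consistency between predicted quality scores and subjective mean opinion scores (MOS). The snippet below is a minimal sketch using scipy; the score arrays are made-up placeholders, and the nonlinear logistic mapping that IQA studies typically fit before computing PLCC is omitted for brevity.

    ```python
    import numpy as np
    from scipy.stats import pearsonr, spearmanr

    # Placeholder predicted scores and subjective MOS values (illustrative only).
    pred = np.array([55.2, 61.8, 40.3, 72.9, 66.1])
    mos = np.array([52.0, 63.5, 38.7, 75.2, 64.8])

    plcc, _ = pearsonr(pred, mos)    # Pearson linear correlation coefficient
    srcc, _ = spearmanr(pred, mos)   # Spearman rank-order correlation coefficient
    print(f"PLCC = {plcc:.4f}, SRCC = {srcc:.4f}")
    ```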

    Table 3.  Ablation experiment results (CVIQ dataset)

    Name      Multi-scale feature fusion  Semantic-quality aggregation  Non-uniform viewport segmentation  PLCC    SRCC
    Baseline  ×                           ×                             ×                                  0.9201  0.8849
    Model1    ×                           ×                             ×                                  0.9400  0.9080
    Model2    √                           ×                             ×                                  0.9444  0.9179
    Model3    √                           √                             ×                                  0.9590  0.9354
    Model4    √                           √                             √                                  0.9627  0.9455
    YOFONet   √                           √                             √                                  0.9674  0.9506
    Note: Bold indicates the best performance.
  • [1] YE C Y, LI J W, CHEN S X. Research progress of VR panoramic video transmission[J]. Application Research of Computers, 2022, 39(6): 1601-1607 (in Chinese).
    [2] ZHAI G T, MIN X K. Perceptual image quality assessment: a survey[J]. Science China Information Sciences, 2020, 63(11): 211301. doi: 10.1007/s11432-019-2757-1
    [3] ZENG H Q, DING R, HUANG H L, et al. Immersive video quality assessment: an overview[J]. Journal of Signal Processing, 2022, 38(9): 1853-1862 (in Chinese).
    [4] LIU Y, YU H W, HUANG B Q, et al. Blind omnidirectional image quality assessment based on structure and natural features[J]. IEEE Transactions on Instrumentation and Measurement, 2021, 70: 5014011.
    [5] JIANG H, JIANG G Y, YU M, et al. Cubemap-based perception-driven blind quality assessment for 360-degree images[J]. IEEE Transactions on Image Processing, 2021, 30: 2364-2377. doi: 10.1109/TIP.2021.3052073
    [6] TRUONG T Q, TRAN H T T, THANG T C. Non-reference quality assessment model using deep learning for omnidirectional images[C]//Proceedings of the IEEE 10th International Conference on Awareness Science and Technology. Piscataway: IEEE Press, 2019: 1-5.
    [7] XU J H, ZHOU W, CHEN Z B. Blind omnidirectional image quality assessment with viewport oriented graph convolutional networks[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2021, 31(5): 1724-1737. doi: 10.1109/TCSVT.2020.3015186
    [8] SUN W, MIN X K, ZHAI G T, et al. MC360IQA: a multi-channel CNN for blind 360-degree image quality assessment[J]. IEEE Journal of Selected Topics in Signal Processing, 2020, 14(1): 64-77. doi: 10.1109/JSTSP.2019.2955024
    [9] DOSOVITSKIY A, BEYER L, KOLESNIKOV A, et al. An image is worth 16×16 words: transformers for image recognition at scale[C]//The 9th International Conference on Learning Representations. [S.l.:s.n.], 2021: 1-21.
    [10] YU M, LAKSHMAN H, GIROD B. A framework to evaluate omnidirectional video coding schemes[C]//Proceedings of the IEEE International Symposium on Mixed and Augmented Reality. Piscataway: IEEE Press, 2015: 31-36.
    [11] SUN Y L, LU A, YU L. Weighted-to-spherically-uniform quality evaluation for omnidirectional video[J]. IEEE Signal Processing Letters, 2017, 24(9): 1408-1412.
    [12] ZAKHARCHENKO V, CHOI K P, PARK J H. Quality metric for spherical panoramic video[C]//Proceedings of the Optics and Photonics for Information Processing X. Bellingham: SPIE Press, 2016, 9970: 57-65.
    [13] LIU Y, YIN X H, WAN Z L, et al. Toward a no-reference omnidirectional image quality evaluation by using multi-perceptual features[J]. ACM Transactions on Multimedia Computing, Communications, and Applications, 2023, 19(2): 1-19.
    [14] KIM H G, LIM H T, RO Y M. Deep virtual reality image quality assessment with human perception guider for omnidirectional image[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2020, 30(4): 917-928. doi: 10.1109/TCSVT.2019.2898732
    [15] BOSSE S, MANIRY D, MÜLLER K R, et al. Deep neural networks for no-reference and full-reference image quality assessment[J]. IEEE Transactions on Image Processing, 2018, 27(1): 206-219. doi: 10.1109/TIP.2017.2760518
    [16] MA K D, LIU W T, ZHANG K, et al. End-to-end blind image quality assessment using deep neural networks[J]. IEEE Transactions on Image Processing, 2018, 27(3): 1202-1213. doi: 10.1109/TIP.2017.2774045
    [17] FU J, HOU C, ZHOU W, et al. Adaptive hypergraph convolutional network for no-reference 360-degree image quality assessment[C]//Proceedings of the 30th ACM International Conference on Multimedia. New York: ACM, 2022: 961-969.
    [18] ZHOU W, XU J H, JIANG Q P, et al. No-reference quality assessment for 360-degree images by analysis of multifrequency information and local-global naturalness[J]. IEEE Transactions on Circuits and Systems for Video Technology, 2022, 32(4): 1778-1791. doi: 10.1109/TCSVT.2021.3081182
    [19] HE K M, ZHANG X Y, REN S Q, et al. Spatial pyramid pooling in deep convolutional networks for visual recognition[C]//European Conference on Computer Vision. Berlin: Springer, 2014: 346-361.
    [20] KE J J, WANG Q F, WANG Y L, et al. MUSIQ: multi-scale image quality transformer[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2021: 5128-5137.
    [21] SU S L, YAN Q S, ZHU Y, et al. Blindly assess image quality in the wild guided by a self-adaptive hyper network[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2020: 3664-3673.
    [22] DENG J, DONG W, SOCHER R, et al. ImageNet: a large-scale hierarchical image database[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2009: 248-255.
    [23] HE K, GKIOXARI G, DOLLAR P, et al. Mask R-CNN[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. Piscataway: IEEE Press, 2017: 2961-2969.
    [24] SAFF E B, KUIJLAARS A B J. Distributing many points on a sphere[J]. The Mathematical Intelligencer, 1997, 19(1): 5-11. doi: 10.1007/BF03024331
    [25] LI D Q, JIANG T T, LIN W S, et al. Which has better visual quality: the clear blue sky or a blurry animal?[J]. IEEE Transactions on Multimedia, 2019, 21(5): 1221-1234. doi: 10.1109/TMM.2018.2875354
    [26] MA X Y, WANG Y Q, LIU C, et al. ADGNet: attention discrepancy guided deep neural network for blind image quality assessment[C]//Proceedings of the 30th ACM International Conference on Multimedia. New York: ACM, 2022: 1309-1318.
    [27] GLOROT X, BORDES A, BENGIO Y. Deep sparse rectifier neural networks[C]//Proceedings of the 14th International Conference on Artificial Intelligence and Statistics. Ft. Lauderdale: Journal of Machine Learning Research Workshop and Conference Proceedings, 2011: 315-323.
    [28] HE K M, ZHANG X Y, REN S Q, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 770-778.
    [29] KINGMA D P, BA J. Adam: a method for stochastic optimization[EB/OL]. (2017-01-30)[2023-04-10]. http://arxiv.org/abs/1412.6980.
    [30] SRIVASTAVA N, HINTON G, KRIZHEVSKY A, et al. Dropout: a simple way to prevent neural networks from overfitting[J]. Journal of Machine Learning Research, 2014, 15: 1929-1958.
    [31] SUN W, GU K, MA S W, et al. A large-scale compressed 360-degree spherical image database: from subjective quality evaluation to objective model comparison[C]//Proceedings of the IEEE 20th International Workshop on Multimedia Signal Processing. Piscataway: IEEE Press, 2018: 1-6.
    [32] DUAN H Y, ZHAI G T, MIN X K, et al. Perceptual quality assessment of omnidirectional images[C]//Proceedings of the IEEE International Symposium on Circuits and Systems. Piscataway: IEEE Press, 2018: 1-5.
    [33] ITU-R. Methodology for the subjective assessment of the quality of television pictures: ITU-R BT. 500-11[S]. Geneva: International Telecommunication Union, 2002: 2-30.
    [34] WANG Z, BOVIK A C, SHEIKH H R, et al. Image quality assessment: from error visibility to structural similarity[J]. IEEE Transactions on Image Processing, 2004, 13(4): 600-612. doi: 10.1109/TIP.2003.819861
    [35] WANG Z, SIMONCELLI E P, BOVIK A C. Multiscale structural similarity for image quality assessment[C]//The 37th Asilomar Conference on Signals, Systems & Computers. Piscataway: IEEE Press, 2003: 1398-1402.
    [36] MITTAL A, MOORTHY A K, BOVIK A C. No-reference image quality assessment in the spatial domain[J]. IEEE Transactions on Image Processing, 2012, 21(12): 4695-4708. doi: 10.1109/TIP.2012.2214050
    [37] MITTAL A, SOUNDARARAJAN R, BOVIK A C. Making a “completely blind” image quality analyzer[J]. IEEE Signal Processing Letters, 2013, 20(3): 209-212. doi: 10.1109/LSP.2012.2227726
    [38] MA K D, DUANMU Z F, WANG Z, et al. Group maximum differentiation competition: model comparison with few samples[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2020, 42(4): 851-864. doi: 10.1109/TPAMI.2018.2889948
    [39] MA K D, WU Q B, WANG Z, et al. Group MAD competition? a new methodology to compare objective image quality models[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. Piscataway: IEEE Press, 2016: 1664-1673.
    [40] XUE W F, ZHANG L, MOU X Q, et al. Gradient magnitude similarity deviation: a highly efficient perceptual image quality index[J]. IEEE Transactions on Image Processing, 2014, 23(2): 684-695. doi: 10.1109/TIP.2013.2293423
    [41] REDMON J, FARHADI A. YOLOv3: an incremental improvement[EB/OL]. (2018-04-08)[2023-04-10]. http://arxiv.org/abs/1804.02767.
Publication history
  • Received:  2023-06-16
  • Accepted:  2023-12-15
  • Available online:  2024-02-01
  • Issue date:  2025-07-31
