Volume 51 Issue 4
Apr. 2025
LIU Q,YIN W,LI K. Image preprocessing acceleration method based on RISC-V vector extension[J]. Journal of Beijing University of Aeronautics and Astronautics,2025,51(4):1074-1084 (in Chinese) doi: 10.13700/j.bh.1001-5965.2023.0208

Image preprocessing acceleration method based on RISC-V vector extension

doi: 10.13700/j.bh.1001-5965.2023.0208
Funds:

National Natural Science Foundation of China (U21B2031) 

  • Corresponding author: E-mail: qiangliu@tju.edu.cn
  • Received Date: 24 Apr 2023
  • Accepted Date: 26 May 2023
  • Available Online: 19 Jun 2023
  • Publish Date: 09 Jun 2023
  • Image preprocessing is an indispensable but time-consuming step that precedes convolutional neural network (CNN) computation. To accelerate it, a method based on the RISC-V vector extension is proposed that speeds up eleven image preprocessing algorithms, including grayscale conversion, standardization, and Gaussian filtering. First, the eleven algorithms are classified into four categories according to their computing mode, and acceleration schemes for them are designed using the RISC-V vector extension. To further improve performance, six custom vector instructions are added; they are implemented by modifying the compiler and designing the corresponding hardware module. Finally, the design is tested on a field programmable gate array (FPGA), and the impact of the vector processor configuration on performance and resource consumption is analyzed. The results show that the proposed method achieves a speedup of 3.13 to 9.97 times over scalar processors, which effectively addresses the performance bottleneck of image preprocessing in deep learning.
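
The abstract does not give implementation details, so the following is only a rough sketch of why algorithms such as grayscale conversion and standardization suit a vector extension: each output element depends only on the matching input elements, so a loop can be strip-mined into chunks of one vector length. Plain Python stands in for vector hardware here. The names `VLMAX`, `grayscale_stripmined`, and `standardize` are hypothetical, and the integer BT.601-style weights are an assumption, not taken from the paper.

```python
# Illustrative model only: plain Python standing in for vector hardware.
VLMAX = 8  # hypothetical number of elements processed per vector pass


def grayscale_stripmined(r, g, b, vlmax=VLMAX):
    """Convert planar R, G, B channels to gray, one vector-length chunk at a time."""
    n = len(r)
    out = [0] * n
    i = 0
    while i < n:
        # Elements handled in this pass; in RVV, vsetvli plays roughly this role.
        vl = min(vlmax, n - i)
        # On vector hardware these vl elements would be handled by a few instructions.
        for k in range(i, i + vl):
            # Assumed integer weights scaled by 256 (77 + 150 + 29 = 256).
            out[k] = (77 * r[k] + 150 * g[k] + 29 * b[k]) >> 8
        i += vl
    return out


def standardize(pixels):
    """Element-wise (x - mean) / std, with std guarded against zero."""
    n = len(pixels)
    mean = sum(pixels) / n
    var = sum((p - mean) ** 2 for p in pixels) / n
    std = var ** 0.5 or 1.0  # fall back to 1.0 for a constant image
    return [(p - mean) / std for p in pixels]
```

Because the chunk length is recomputed on every pass, the same loop handles images whose pixel count is not a multiple of the vector length, which is the usual motivation for strip-mining on a vector ISA.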

     


    Figures(11)  / Tables(8)
