Citation: LIU Q, YIN W, LI K. Image preprocessing acceleration method based on RISC-V vector extension[J]. Journal of Beijing University of Aeronautics and Astronautics, 2025, 51(4): 1074-1084 (in Chinese). doi: 10.13700/j.bh.1001-5965.2023.0208
As the preliminary step of convolutional neural network (CNN) computing, image preprocessing is indispensable but time-consuming. To accelerate it, a method based on the RISC-V vector extension is proposed that speeds up eleven image preprocessing algorithms, including grayscale conversion, standardization, and Gaussian filtering. First, the eleven algorithms are classified into four categories according to their computing mode, and an acceleration scheme based on the RISC-V vector extension is designed for each. To further improve performance, six customized vector instructions are added; they are implemented by modifying the compiler and designing a dedicated hardware module. Finally, the design is tested on a field programmable gate array (FPGA), and the impact of the vector processor configuration on performance and resource consumption is analyzed. The results show that the proposed method achieves a 3.13–9.97 times speedup over a scalar processor, effectively addressing the performance bottleneck of image preprocessing in deep learning.
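To make the idea of vectorizing an element-wise preprocessing step more concrete, the following Python sketch models a strip-mined grayscale conversion. It is an illustration under stated assumptions, not the authors' implementation: the names `VL_MAX`, `gray_scalar`, and `gray_strip_mined` are hypothetical, the value 8 for the per-pass element count is arbitrary, and the integer weights 77, 150, and 29 are a common 8-bit approximation of the BT.601 luma coefficients (0.299, 0.587, 0.114) that the abstract does not specify.

```python
# Illustrative model only: VL_MAX and both function names are hypothetical
# and are not taken from the paper.
VL_MAX = 8  # assumed number of elements one "vector" operation can process


def gray_scalar(rgb):
    """Reference scalar loop: one pixel per iteration."""
    return [(77 * r + 150 * g + 29 * b) >> 8 for r, g, b in rgb]


def gray_strip_mined(rgb, vl_max=VL_MAX):
    """Strip-mined loop shaped like a vector-length-agnostic kernel."""
    out = []
    i = 0
    n = len(rgb)
    while i < n:
        # Loosely models a vector-length request: take up to vl_max elements,
        # fewer on the final (tail) pass.
        vl = min(vl_max, n - i)
        chunk = rgb[i:i + vl]
        # Models splitting interleaved RGB data into three per-channel lanes.
        r = [p[0] for p in chunk]
        g = [p[1] for p in chunk]
        b = [p[2] for p in chunk]
        # One conceptual vector multiply-accumulate and shift over all vl lanes.
        out.extend((77 * x + 150 * y + 29 * z) >> 8 for x, y, z in zip(r, g, b))
        i += vl
    return out


pixels = [(255, 255, 255), (0, 0, 0), (255, 0, 0), (0, 255, 0), (0, 0, 255)] * 3
assert gray_strip_mined(pixels) == gray_scalar(pixels)
```

The strip-mined version makes the same arithmetic pass as the scalar loop but handles up to `vl_max` pixels per iteration, so the loop count drops from the pixel count to roughly the pixel count divided by `vl_max`. On real vector hardware each of those iterations would map to a handful of vector instructions, which is the general source of speedup for element-wise algorithms of this kind.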