Abstract
Reconfigurable architectures have great potential in computation-intensive and memory-intensive applications because their configuration information is flexible. In applications such as communication baseband signal processing, data of different granularities do not match the underlying hardware structure, which lowers computing efficiency. To address this problem, a parallel computing method supporting multi-bit data is proposed, and a dynamic granularity configuration structure using this method is designed on a reconfigurable array processor. The structure divides the calculation granularity into 8 bits, 16 bits, and 32 bits, and implements four functions: data-combination, data-splitting, parallel-addition, and parallel-multiplication. These features increase the parallelism and flexibility of the array structure. The experimental results show that the speedup ratio can reach 1.5 within a certain error range, the running time is reduced by about 20%, and the code complexity is also significantly reduced. In addition, after FPGA synthesis and implementation, the maximum operating frequency of the dynamic configuration circuit is 133.5 MHz; the circuit dynamically configures data of different granularities during computation and achieves parallel computing of multi-bit data.
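To make the four functions concrete, the following is a minimal software sketch of lane-wise operations on a 32-bit word at 8-, 16-, or 32-bit granularity. It is not the authors' hardware design: the function names (`combine`, `split`, `parallel_add`, `parallel_mul`), the little-endian lane order, the unsigned wraparound on overflow, and the truncation of each product to the lane width are all assumptions made for illustration.

```python
MASK32 = 0xFFFFFFFF
WIDTHS = (8, 16, 32)  # granularities named in the abstract


def _check(width):
    if width not in WIDTHS:
        raise ValueError("width must be 8, 16, or 32")


def combine(values, width):
    """Data-combination: pack 32 // width unsigned values into one word.

    Lane 0 occupies the least significant bits (assumed ordering).
    """
    _check(width)
    if len(values) != 32 // width:
        raise ValueError("wrong number of lanes for this width")
    mask = (1 << width) - 1
    word = 0
    for i, v in enumerate(values):
        word |= (v & mask) << (i * width)
    return word


def split(word, width):
    """Data-splitting: unpack a 32-bit word into 32 // width lanes."""
    _check(width)
    mask = (1 << width) - 1
    return [(word >> (i * width)) & mask for i in range(32 // width)]


def parallel_add(a, b, width):
    """Parallel-addition: add all lanes at once, carries never cross lanes.

    The top bit of each lane is masked off before the add, so no carry can
    leave a lane; the top bits are then restored with XOR.
    """
    _check(width)
    high = 0
    for i in range(32 // width):
        high |= 1 << (i * width + width - 1)
    low = ~high & MASK32
    partial = (a & low) + (b & low)
    return (partial ^ ((a ^ b) & high)) & MASK32


def parallel_mul(a, b, width):
    """Parallel-multiplication: lane-wise product, truncated to the lane width
    (an assumption; the abstract does not state the result width)."""
    _check(width)
    mask = (1 << width) - 1
    products = [(x * y) & mask for x, y in zip(split(a, width), split(b, width))]
    return combine(products, width)
```

For example, `parallel_add(combine([200, 1, 2, 3], 8), combine([100, 1, 2, 3], 8), 8)` adds four 8-bit pairs in one word-level operation, with lane 0 wrapping around modulo 256 instead of carrying into lane 1.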
Supported by National Key R&D Program of China
Acknowledgements
This work was supported by the National Key R&D Program of China (2022ZD0119001) and the Key Projects of the National Natural Science Foundation of China (61834005).
Copyright information
© 2024 The Author(s), under exclusive license to Springer Nature Singapore Pte Ltd.
About this paper
Cite this paper
Jiang, L., Liu, S., Zhu, J., Shan, R., Li, Y. (2024). Dynamic Multi-bit Parallel Computing Method Based on Reconfigurable Structure. In: Tari, Z., Li, K., Wu, H. (eds) Algorithms and Architectures for Parallel Processing. ICA3PP 2023. Lecture Notes in Computer Science, vol 14488. Springer, Singapore. https://doi.org/10.1007/978-981-97-0801-7_20
DOI: https://doi.org/10.1007/978-981-97-0801-7_20
Published:
Publisher Name: Springer, Singapore
Print ISBN: 978-981-97-0800-0
Online ISBN: 978-981-97-0801-7
eBook Packages: Computer Science, Computer Science (R0)