Abstract
Deep convolutional neural networks (CNNs), widely applied to image tasks, can also achieve excellent performance on acoustic tasks. However, activation data in a CNN are usually represented in floating-point format, which is both time-consuming and power-consuming to compute. Quantization converts activation data to fixed-point, replacing floating-point arithmetic with faster, more energy-efficient fixed-point arithmetic. Based on this method, this article proposes a design-space search method to quantize a binary-weight neural network. A dedicated accelerator is built on an FPGA platform with a layer-by-layer pipeline design, achieving higher throughput and energy efficiency than CPUs and other hardware platforms.
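To make the quantization idea concrete, the following is a minimal sketch (not the paper's actual scheme) of converting floating-point activations to signed fixed-point codes with a chosen number of fractional bits; the function names, bit widths, and saturation behavior are illustrative assumptions only.

```python
import numpy as np

def quantize_fixed_point(x, total_bits=8, frac_bits=4):
    """Quantize a float array to signed fixed-point integer codes with
    `frac_bits` fractional bits, saturating to the representable range."""
    scale = 1 << frac_bits                   # step size is 1/scale
    q_min = -(1 << (total_bits - 1))         # e.g. -128 for 8-bit
    q_max = (1 << (total_bits - 1)) - 1      # e.g. +127 for 8-bit
    q = np.clip(np.round(x * scale), q_min, q_max).astype(np.int32)
    return q  # integer codes; the real value represented is q / scale

def dequantize(q, frac_bits=4):
    """Recover the (approximate) real values from fixed-point codes."""
    return q.astype(np.float32) / (1 << frac_bits)

acts = np.array([0.37, -1.25, 3.9, 100.0], dtype=np.float32)
q = quantize_fixed_point(acts)
print(q)              # the out-of-range 100.0 saturates to 127
print(dequantize(q))
```

On hardware, the multiply-accumulate then operates directly on the integer codes `q`, with a single shift by `frac_bits` to rescale, which is what makes fixed-point datapaths cheaper than floating-point ones.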
Acknowledgements
This work was supported by National Science and Technology Major Project 2018ZX01028101.
Ethics declarations
Conflict of interest
On behalf of all authors, the corresponding author states that there is no conflict of interest.
Cite this article
Wen, D., Jiang, J., Dou, Y. et al. An energy-efficient convolutional neural network accelerator for speech classification based on FPGA and quantization. CCF Trans. HPC 3, 4–16 (2021). https://doi.org/10.1007/s42514-020-00055-4