Abstract
The magnitude of the real-time digital signal processing challenge attached to large radio astronomical antenna arrays motivates the use of high performance computing (HPC) systems. The need for high power efficiency at remote observatory sites parallels that in HPC broadly, where efficiency is a critical metric. We investigate how the performance-per-watt of graphics processing units (GPUs) is affected by temperature, core clock frequency and voltage. Our results highlight how the underlying physical processes that govern transistor operation affect power efficiency. In particular, we show experimentally that GPU power consumption increases non-linearly (quadratically) with both temperature and supply voltage, as predicted by physical transistor models. We show that lowering the GPU supply voltage and increasing the clock frequency, while maintaining a low die temperature, increases the power efficiency of an NVIDIA K20 GPU by 37–48 % over default settings when running xGPU, a compute-bound code used in radio astronomy. We discuss how automatic temperature-aware and application-dependent voltage and frequency scaling (T-DVFS and A-DVFS) may provide a mechanism to achieve better power efficiency for a wider range of compute codes running on GPUs.
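The trade-off described in the abstract can be illustrated with a toy power model. The sketch below is not the authors' model or measurement code: it assumes dynamic power scales as C·V²·f, that leakage scales with V² and grows quadratically with die temperature, and that throughput of a compute-bound code is proportional to clock frequency. Every constant, function name and operating point is hypothetical and chosen only for illustration.

```python
# Toy performance-per-watt model. All constants are hypothetical, not fitted to a K20.

C_EFF = 0.15      # assumed effective switched capacitance, W / (V^2 * MHz)
LEAK_REF = 20.0   # assumed leakage at 1 V and 0 degC, W
K_TEMP = 1e-4     # assumed quadratic temperature coefficient, 1 / degC^2


def gpu_power_watts(voltage, freq_mhz, temp_c):
    """Dynamic power (C * V^2 * f) plus leakage that is quadratic in V and in T."""
    dynamic = C_EFF * voltage ** 2 * freq_mhz
    leakage = LEAK_REF * voltage ** 2 * (1.0 + K_TEMP * temp_c ** 2)
    return dynamic + leakage


def perf_per_watt(voltage, freq_mhz, temp_c):
    """Relative throughput per watt, assuming throughput is proportional to clock."""
    return freq_mhz / gpu_power_watts(voltage, freq_mhz, temp_c)


if __name__ == "__main__":
    # Hypothetical operating points: (supply voltage in V, clock in MHz, die temperature in degC)
    default = (1.00, 700.0, 80.0)
    tuned = (0.90, 750.0, 50.0)

    gain = perf_per_watt(*tuned) / perf_per_watt(*default) - 1.0
    print(f"default power: {gpu_power_watts(*default):.1f} W")
    print(f"tuned power:   {gpu_power_watts(*tuned):.1f} W")
    print(f"relative efficiency gain: {gain:.1%}")
```

In this model a lower voltage reduces both power terms, a cooler die reduces leakage, and a higher clock raises throughput faster than it raises total power. That is the qualitative behaviour the abstract reports. The size of the gain depends entirely on the assumed constants.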
Additional information
The authors acknowledge support from NSF Grants PHYS-080357, AST-1106059, and OIA-1120587. BB thanks the NVIDIA internship program for support.
About this article
Cite this article
Price, D.C., Clark, M.A., Barsdell, B.R. et al. Optimizing performance-per-watt on GPUs in high performance computing. Comput Sci Res Dev 31, 185–193 (2016). https://doi.org/10.1007/s00450-015-0300-5
DOI: https://doi.org/10.1007/s00450-015-0300-5