
A case for integrated processor-cache partitioning in chip multiprocessors

Published: 14 November 2009

Abstract

Existing cache partitioning schemes are designed in a manner oblivious to the implicit processor partitioning enforced by the operating system. This paper examines an operating-system-directed integrated processor-cache partitioning scheme that partitions both the available processors and the shared cache in a chip multiprocessor among different multi-threaded applications. Extensive simulations using a set of multiprogrammed workloads show that our integrated processor-cache partitioning scheme achieves better performance isolation than state-of-the-art hardware/software solutions. Specifically, on the fair speedup metric, our integrated processor-cache partitioning approach performs, on average, 20.83% and 14.14% better than equal partitioning and the implicit partitioning enforced by the underlying operating system, respectively, on an 8-core system. We also compare our approach with processor partitioning alone and with a state-of-the-art cache partitioning scheme; our scheme fares 8.21% and 9.19% better than these schemes, respectively, on a 16-core system.
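
To make the comparison metric concrete, the sketch below (in Python) shows one common way fair speedup is computed, namely the harmonic mean of per-application speedups relative to a reference allocation such as equal partitioning, together with a brute-force search over joint core/cache-way splits for two applications. This is a minimal illustration under stated assumptions: the predict_ipc performance model, the application handles, and the exhaustive enumeration are hypothetical and are not the mechanism described in the paper.

    def fair_speedup(ipc_alloc, ipc_equal):
        # Harmonic mean of per-application speedups relative to the
        # equal-partitioning reference configuration.
        n = len(ipc_alloc)
        return n / sum(ref / got for got, ref in zip(ipc_alloc, ipc_equal))

    def best_integrated_partition(app_a, app_b, n_cores, n_ways, predict_ipc):
        # Exhaustive search over joint (core, cache-way) splits for two
        # co-scheduled applications, returning the split with the highest
        # fair speedup.  predict_ipc(app, cores, ways) -> predicted IPC is an
        # assumed performance model (e.g., built from profiling), not an
        # interface from the paper.
        equal = [predict_ipc(app_a, n_cores // 2, n_ways // 2),
                 predict_ipc(app_b, n_cores // 2, n_ways // 2)]
        best = None
        for cores_a in range(1, n_cores):        # cores given to app_a
            for ways_a in range(1, n_ways):      # cache ways given to app_a
                ipcs = [predict_ipc(app_a, cores_a, ways_a),
                        predict_ipc(app_b, n_cores - cores_a, n_ways - ways_a)]
                fs = fair_speedup(ipcs, equal)
                if best is None or fs > best[0]:
                    best = (fs, (cores_a, n_cores - cores_a),
                            (ways_a, n_ways - ways_a))
        return best

On an 8-core CMP sharing a 16-way cache, this enumeration covers only 7 x 15 candidate splits, but a practical OS-directed scheme would need lighter-weight prediction and periodic re-evaluation rather than exhaustive search; the sketch only illustrates the objective being optimized.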

Published In

SC '09: Proceedings of the Conference on High Performance Computing Networking, Storage and Analysis
Association for Computing Machinery, New York, NY, United States
November 2009, 778 pages
ISBN: 9781605587448
DOI: 10.1145/1654059
