skip to main content
research-article

Automatic speculative parallelization of loops using polyhedral dependence analysis

Published: 24 February 2013 Publication History

Abstract

Speculative Execution (SE) runs loops in parallel even in the presence of a dependence. Using polyhedral dependence analysis, more speculation candidate loops can be discovered than normal OpenMP parallelization. In this research, a framework is implemented that can automatically perform speculative parallelization of loops using Polly's [15] polyhedral dependence analysis. The framework uses two different heuristics to find speculation candidates. The first heuristic allows loops with only may dependences to run speculatively in parallel while the second heuristic filters out cold loops and, using profile information, loops with actual run time dependences. The framework is fully automatic. Running SPEC2006 and the PolyBench/C benchmarks on the IBM BlueGene/Q [16] machine shows that the framework is able to discover more parallelization candidates than OpenMP parallelization and achieve better speedup.

References

[1]
Spec2006 benchmark suite. http://www.spec.org/cpu2006/.
[2]
M. M. Baskaran, U. Bondhugula, S. Krishnamoorthy, J. Ramanujam, A. Rountev, and P. Sadayappan. A compiler framework for optimization of affine loop nests for GPGPUs. In International Conference on Supercomputing (ICS), pages 225--234, 2008.
[3]
P. Berube, A. Preuss, and J. N. Amaral. Combined Profiling: practical collection of feedback information for code optimization. In Proceedings of the second joint WOSP/SIPEW International Conference on Performance engineering, ICPE '11, pages 493--498, Karlshure, Germany, 2011.
[4]
U. Bondhugula, A. Hartono, and J. Ramanujam. A practical automatic polyhedral parallelizer and locality optimizer. In Programming Language Design and Implementation (PLDI), pages 101--113, Tucson, AZ, USA, 2008.
[5]
P. Boulet, A. Darte, G.-A. Silber, and F. Vivien. Loop parallelization algorithms: from parallelism extraction to code generation. In Compiler Optimizations for Scalable Parallel Systems Languages, pages 141--172, 2001.
[6]
M. Cintra, J. F. Martinez, and J. Torrellas. Architectural support for scalable speculative parallelization in shared-memory multiprocessors. In International Symposium on Computer Architecture (ISCA), pages 13--24, Jun 2000.
[7]
A. Darte and F. Vivien. Optimal fine and medium grain parallelism detection in polyhedral reduced dependence graphs. International Journal of Parallel Programming (IJPP), 25(6): 447--496, 1997.
[8]
P. Feautrier. Array expansion. In International Conference on Supercomputing (ICS), July 1988.
[9]
P. Feautrier. Parametric integer programming. In RAIRO Recherche Operationnelle, pages 243--268, 1988.
[10]
P. Feautrier. Some efficient solutions to the affine scheduling problem. part 1. one-dimensional time. International Journal of Parallel Programming (IJPP), 21(5): 313--348, 1992.
[11]
P. Feautrier. Some efficient solutions to the affine scheduling problem. part 2. multidimensional time. International Journal of Parallel Programming (IJPP), 21(6): 389--420, 1992.
[12]
S. Girbal, N. Vasilache, C. Bastoul, A. Cohen, D. Parello, M. Sigler, and O. Temam. Semi-automatic composition of loop transformations for deep parallelism and memory hierarchies. International Journal of Parallel Programming (IJPP), 34(3): 261--317, July 1988.
[13]
S. Gopal, T. Vijaykumar, J. Smith, and G. Sohi. Speculative versioning cache. In International Symposium on High-Performance Computer Architecture, pages 195--205, Feb 1998.
[14]
M. Griebl. Automatic parallelization of loop programs for distributed memory architectures. Habilitation thesis, University of Passau, 2004.
[15]
T. Grosser, H. Zheng, R. A, A. Simbürger, A. Grosslinger, and L.-N. Pouchet. Polly - polyhedral optimization in llvm. In First International Workshop on Polyhedral Compilation Techniques (IMPACT), Chamonix, France, Apr 2011.
[16]
R. Haring, M. Ohmacht, T. Fox, M. Gschwind, D. Satterfield, K. Sugavanam, P. Coteus, P. Heidelberger, M. Blumrich, R. Wisniewski, A. Gara, G.-T. Chiu, P. Boyle, N. Chist, and C. Kim. The IBM Blue Gene/Q compute chip. Micro, IEEE, 32(2): 48--60, March-April 2012.
[17]
A. Jimborean, P. Clauss, B. Pradelle, L. Mastrangelo, and V. Loechner. Adapting the polyhedral model as a framework for efficient speculative parallelization. In Proceedings of the 17th ACM SIGPLAN symposium on Principles and Practice of Parallel Programming, PPoPP '12, pages 295--296, New Orleans, Louisiana, USA, 2012.
[18]
H. Kim, N. P. Johnson, J. W. Lee, S. A. Mahlke, and D. I. August. Automatic speculative doall for clusters. In Code Generation and Optimization (CGO), pages 94--103, 2012.
[19]
X. Kong, D. Klappholz, and K. Psarris. The I Test: A New Test for Subscript Data Dependence. In International Conference on Parallel Processing (ICPP), pages 204--211, 1990.
[20]
C. Lattner. The llvm compiler infrastructure. http://llvm.org.
[21]
J. F. Martínez and J. Torrellas. Speculative synchronization: applying thread-level speculation to explicitly parallel applications. In International Conference on Architectural support for programming languages and operating systems (ASPLOS), pages 18--29, San Jose, California, 2002.
[22]
M. Wolfe. High performance compilers for parallel computing. Addison Wesley, 1996.
[23]
S. Pop, A. Cohen, C. Bastoul, S. Girbal, G. Andre Silber, and N. Vasilache. Graphite: Polyhedral analyses and optimizations for GCC. In Proceedings of the 2006 GCC Developers Summit, 2006.
[24]
L.-N. Pouchet. Polybench/c benchmarks. http://www.cse.ohio-state.edu/pouchet/software/polybench/.
[25]
W. Pugh. The Omega Test: a fast and practical integer programming algorithm for dependence analysis. Communications of the ACM, 8: 4--13, 1992.
[26]
L. Rauchwerger and D. Padua. The LRPD test: Speculative run-time parallelization of loops with privatization and reduction parallelization. In Programming Language Design and Implementation (PLDI), pages 218--232, La Jolla, California, USA, 1995.
[27]
J. G. Steffan, C. B. Colohan, A. Zhai, and T. C. Mowry. A scalable approach to thread-level speculation. In International Symposium on Computer Architecture (ISCA), pages 1--12, Vancouver, British Columbia, Canada, 2000.
[28]
M. walid Benabderrahmane, L. Noel Pouchet, A. Cohen, and C. Bastoul. The Polyhedral Model is more widely applicable than you think. In Proceedings of the 19th joint European conference on Theory and Practice of Software, international conference on Compiler Construction, CC'10/ETAPS'10, pages 283--303, Berlin, Heidelberg, 2010.
[29]
A. Wang, M. Gaudet, P. Wu, J. N. Amaral, M. Ohmacht, C. Barton, R. Silvera, and M. Michael. Evaluation of Blue Gene/Q hardware support for transactional memories. In International conference on Parallel architectures and compilation techniques (PACT), pages 127--136, Minneapolis, Minnesota, USA.

Cited By

View all
  • (2022)An efficient hardware supported and parallelization architecture for intelligent systems to overcome speculative overheadsInternational Journal of Intelligent Systems10.1002/int.2306237:12(11764-11790)Online publication date: 8-Sep-2022
  • (2020)IR-Level Dynamic Data Dependence Using Abstract Interpretation Towards Speculative ParallelizationIEEE Access10.1109/ACCESS.2020.29977158(99910-99921)Online publication date: 2020
  • (2019)A study on popular auto‐parallelization frameworksConcurrency and Computation: Practice and Experience10.1002/cpe.516831:17Online publication date: 11-Feb-2019
  • Show More Cited By

Recommendations

Comments

Information & Contributors

Information

Published In

cover image ACM Other conferences
COSMIC '13: Proceedings of the First International Workshop on Code OptimiSation for MultI and many Cores
February 2013
34 pages
ISBN:9781450319713
DOI:10.1145/2446920
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

Sponsors

  • AMD

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 24 February 2013

Permissions

Request permissions for this article.

Check for updates

Author Tags

  1. auto-parallelization
  2. polyhedral model
  3. thread level speculation

Qualifiers

  • Research-article

Conference

COSMIC '13
Sponsor:

Contributors

Other Metrics

Bibliometrics & Citations

Bibliometrics

Article Metrics

  • Downloads (Last 12 months)12
  • Downloads (Last 6 weeks)8
Reflects downloads up to 21 Oct 2024

Other Metrics

Citations

Cited By

View all
  • (2022)An efficient hardware supported and parallelization architecture for intelligent systems to overcome speculative overheadsInternational Journal of Intelligent Systems10.1002/int.2306237:12(11764-11790)Online publication date: 8-Sep-2022
  • (2020)IR-Level Dynamic Data Dependence Using Abstract Interpretation Towards Speculative ParallelizationIEEE Access10.1109/ACCESS.2020.29977158(99910-99921)Online publication date: 2020
  • (2019)A study on popular auto‐parallelization frameworksConcurrency and Computation: Practice and Experience10.1002/cpe.516831:17Online publication date: 11-Feb-2019
  • (2017)Toward Emotion-Aware Computing: A Loop Selection Approach Based on Machine Learning for Speculative MultithreadingIEEE Access10.1109/ACCESS.2017.26841295(3675-3686)Online publication date: 2017
  • (2017)Automatic CPU/GPU Generation of Multi-versioned OpenCL Kernels for C++ Scientific ApplicationsInternational Journal of Parallel Programming10.1007/s10766-016-0425-645:2(262-282)Online publication date: 1-Apr-2017
  • (2016)A Survey on Thread-Level Speculation TechniquesACM Computing Surveys10.1145/293836949:2(1-39)Online publication date: 30-Jun-2016
  • (2015)Data-dependence profiling to enable safe thread level speculationProceedings of the 25th Annual International Conference on Computer Science and Software Engineering10.5555/2886444.2886459(91-100)Online publication date: 2-Nov-2015

View Options

Get Access

Login options

View options

PDF

View or Download as a PDF file.

PDF

eReader

View online with eReader.

eReader

Media

Figures

Other

Tables

Share

Share

Share this Publication link

Share on social media