An optimization technique for lowering the iteration bound of DSP programs


Abstract

The throughput of a parallel execution of a DSP algorithm is limited by the iteration bound, which is the minimum period between the starts of consecutive iterations. It is given by $T_\infty = \max_i (T_i / D_i)$, where $T_i$ and $D_i$ are the total time of operations and the number of delays in loop $i$, respectively. The execution throughput of a DSP algorithm can be increased by reducing the $T_i$s, and this reduction can be realized by taking as many operations as possible out of loops without changing the semantics of the calculation. Since many DSP algorithms make extensive use of the four basic arithmetic operations, a simple and effective way of achieving this reduction is to apply commutativity, associativity and distributivity to these operations. This paper presents an optimization technique, called Loop Shrinking, which reduces the iteration bound by using the above method. Loop Shrinking is based on a heuristic method which is time-efficient for simple cases but can also tackle complex examples. An implementation of Loop Shrinking is presented in this article. The results show that it can yield a reduction in the iteration bound near or equal to that of careful hand-tuning.
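As a rough illustration of the iteration bound formula in the abstract, the sketch below computes the maximum of T_i / D_i over a set of loops, then compares a recurrence before and after one associativity rewrite. It is not the authors' Loop Shrinking implementation. The function name `iteration_bound`, the unit costs `ADD` and `MUL`, and the example recurrence are all assumptions made for this example.

```python
from fractions import Fraction


def iteration_bound(loops):
    """Return max(T_i / D_i) over loops given as (T_i, D_i) pairs.

    T_i is the total operation time in loop i and D_i is its number of
    delays. Exact fractions avoid floating-point rounding.
    """
    if not loops:
        raise ValueError("at least one loop is required")
    for total_time, delays in loops:
        if delays <= 0:
            raise ValueError("every loop must contain at least one delay")
    return max(Fraction(total_time, delays) for total_time, delays in loops)


# Hypothetical operation costs in arbitrary time units (assumed, not from the paper).
ADD, MUL = 1, 2

# y[n] = (a * y[n-1] + b) + c
# The single loop contains one multiplication and two additions, with one delay.
before = [(MUL + 2 * ADD, 1)]

# y[n] = a * y[n-1] + (b + c)
# Associativity moves the addition b + c out of the loop, leaving one
# multiplication and one addition on the cycle.
after = [(MUL + ADD, 1)]

print(iteration_bound(before))  # 4
print(iteration_bound(after))   # 3
```

With these assumed costs, the rewrite lowers the bound from 4 to 3 time units per iteration. Both forms compute the same value of y[n].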



Author information

Authors and Affiliations

Department of Electronic Engineering, Faculty of Engineering, Hokkaido University, 060, Sapporo, Japan
Frederico Buchholz Maciel, Yoshikazu Miyanaga & Koji Tochinai



About this article

Cite this article

Maciel, F.B., Miyanaga, Y. & Tochinai, K. An optimization technique for lowering the iteration bound of DSP programs. J VLSI Sign Process Syst Sign Image Video Technol 5, 273–282 (1993). https://doi.org/10.1007/BF01581301

Received: 31 December 1991
Revised: 23 July 1992
Published: 01 April 1993
Issue Date: April 1993
DOI: https://doi.org/10.1007/BF01581301
