subscribe to arXiv mailings

doi 10.3390/jlpea8040046

Low-Complexity Loeffler DCT Approximations for Image and Video Coding

Authors: D. F. G. Coelho, R. J. Cintra, F. M. Bayer, S. Kulasekera, A. Madanayake, P. A. C. Martinez, T. L. T. Silveira, R. S. Oliveira, V. S. Dimitrov

Abstract: This paper introduced a matrix parametrization method based on the Loeffler discrete cosine transform (DCT) algorithm. As a result, a new class of eight-point DCT approximations was proposed, capable of unifying the mathematical formalism of several eight-point DCT approximations archived in the literature. Pareto-efficient DCT approximations are obtained through multicriteria optimization, where… ▽ More This paper introduced a matrix parametrization method based on the Loeffler discrete cosine transform (DCT) algorithm. As a result, a new class of eight-point DCT approximations was proposed, capable of unifying the mathematical formalism of several eight-point DCT approximations archived in the literature. Pareto-efficient DCT approximations are obtained through multicriteria optimization, where computational complexity, proximity, and coding performance are considered. Efficient approximations and their scaled 16- and 32-point versions are embedded into image and video encoders, including a JPEG-like codec and H.264/AVC and H.265/HEVC standards. Results are compared to the unmodified standard codecs. Efficient approximations are mapped and implemented on a Xilinx VLX240T FPGA and evaluated for area, speed, and power consumption. △ Less

Submitted 28 July, 2022; originally announced July 2022.

Comments: 25 pages, 11 figures, 7 tables

Journal ref: J. Low Power Electron. Appl. 2018, 8(4), 46

arXiv:2206.01146 [pdf, ps, other]

doi 10.1142/S0129626412500090

Block-Parallel Systolic-Array Architecture for 2-D NTT-based Fragile Watermark Embedding

Authors: H. P. L. Arjuna Madanayake, R. J. Cintra, V. S. Dimitrov, L. Bruton

Abstract: Number-theoretic transforms (NTTs) have been applied in the fragile watermarking of digital images. A block-parallel systolic-array architecture is proposed for watermarking based on the 2-D special Hartley NTT (HNTT). The proposed core employs two 2-D special HNTT hardware cores, each using digital arithmetic over $\mathrm{GF}(3)$, and processes $4\times4$ blocks of pixels in parallel every clock… ▽ More Number-theoretic transforms (NTTs) have been applied in the fragile watermarking of digital images. A block-parallel systolic-array architecture is proposed for watermarking based on the 2-D special Hartley NTT (HNTT). The proposed core employs two 2-D special HNTT hardware cores, each using digital arithmetic over $\mathrm{GF}(3)$, and processes $4\times4$ blocks of pixels in parallel every clock cycle. Prototypes are operational on a Xilinx Sx35-10ff668 FPGA device. The maximum estimated throughput of the FPGA circuit is 100 million $4\times4$ HNTT fragile watermarked blocks per second, when clocked at 100 MHz. Potential applications exist in high-traffic back-end servers dealing with large amounts of protected digital images requiring authentication, in remote-sensing for high-security surveillance applications, in real-time video processing of information of a sensitive nature or matters of national security, in video/photographic content management of corporate clients, in authenticating multimedia for the entertainment industry, in the authentication of electronic evidence material, and in real-time news streaming. △ Less

Submitted 2 June, 2022; originally announced June 2022.

Comments: 11 pages, 4 figures

Journal ref: Parallel Processing Letters, vol. 22, no. 03, 1250009, 2012

arXiv:2105.06440 [pdf, other]

Powers of 3 with few nonzero bits and a conjecture of Erdős

Authors: Vassil S. Dimitrov, Everett W. Howe

Abstract: Using completely elementary methods, we find all powers of 3 that can be written as the sum of at most twenty-two distinct powers of 2, as well as all powers of 2 that can be written as the sum of at most twenty-five distinct powers of 3. The latter result is connected to a conjecture of Erdős, namely, that 1, 4, and 256 are the only powers of 2 that can be written as a sum of distinct powers of 3… ▽ More Using completely elementary methods, we find all powers of 3 that can be written as the sum of at most twenty-two distinct powers of 2, as well as all powers of 2 that can be written as the sum of at most twenty-five distinct powers of 3. The latter result is connected to a conjecture of Erdős, namely, that 1, 4, and 256 are the only powers of 2 that can be written as a sum of distinct powers of 3. We present this work partly as a reminder that for certain exponential Diophantine equations, elementary techniques based on congruences can yield results that would be difficult or impossible to obtain with more advanced techniques involving, for example, linear forms in logarithms. △ Less

Submitted 3 July, 2023; v1 submitted 13 May, 2021; originally announced May 2021.

Comments: Corrected typos. Final version, 18 pages. To appear in the Rocky Mountain Journal of Mathematics

MSC Class: 11D61 (Primary) 11A63; 11D72; 11D79 (Secondary)

arXiv:2008.09633 [pdf, ps, other]

doi 10.1049/el.2019.4030

Low-complexity Architecture for AR(1) Inference

Authors: A. Borges Jr., R. J. Cintra, D. F. G. Coelho, V. S. Dimitrov

Abstract: In this Letter, we propose a low-complexity estimator for the correlation coefficient based on the signed $\operatorname{AR}(1)$ process. The introduced approximation is suitable for implementation in low-power hardware architectures. Monte Carlo simulations reveal that the proposed estimator performs comparably to the competing methods in literature with maximum error in order of $10^{-2}$. Howev… ▽ More In this Letter, we propose a low-complexity estimator for the correlation coefficient based on the signed $\operatorname{AR}(1)$ process. The introduced approximation is suitable for implementation in low-power hardware architectures. Monte Carlo simulations reveal that the proposed estimator performs comparably to the competing methods in literature with maximum error in order of $10^{-2}$. However, the hardware implementation of the introduced method presents considerable advantages in several relevant metrics, offering more than 95% reduction in dynamic power and doubling the maximum operating frequency when compared to the reference method. △ Less

Submitted 21 August, 2020; originally announced August 2020.

Comments: 7 pages, 3 tables, 4 figures

Journal ref: Electronics Letters 56 (14), 732-734, 2020

arXiv:1807.08084 [pdf, ps, other]

doi 10.1016/j.cageo.2018.07.002

Fast Matrix Inversion and Determinant Computation for Polarimetric Synthetic Aperture Radar

Authors: D. F. G. Coelho, R. J. Cintra, A. C. Frery, V. S. Dimitrov

Abstract: This paper introduces a fast algorithm for simultaneous inversion and determinant computation of small sized matrices in the context of fully Polarimetric Synthetic Aperture Radar (PolSAR) image processing and analysis. The proposed fast algorithm is based on the computation of the adjoint matrix and the symmetry of the input matrix. The algorithm is implemented in a general purpose graphical proc… ▽ More This paper introduces a fast algorithm for simultaneous inversion and determinant computation of small sized matrices in the context of fully Polarimetric Synthetic Aperture Radar (PolSAR) image processing and analysis. The proposed fast algorithm is based on the computation of the adjoint matrix and the symmetry of the input matrix. The algorithm is implemented in a general purpose graphical processing unit (GPGPU) and compared to the usual approach based on Cholesky factorization. The assessment with simulated observations and data from an actual PolSAR sensor show a speedup factor of about two when compared to the usual Cholesky factorization. Moreover, the expressions provided here can be implemented in any platform. △ Less

Submitted 21 July, 2018; originally announced July 2018.

Comments: 7 pages, 1 figure

Journal ref: Computers and Geosciences, no. 119 (2018), pages 109-114

arXiv:1801.08589 [pdf, ps, other]

doi 10.1109/ISCAS.2011.5937664

A New Algorithm for Double Scalar Multiplication over Koblitz Curves

Authors: J. Adikari, V. S. Dimitrov, R. J. Cintra

Abstract: Koblitz curves are a special set of elliptic curves and have improved performance in computing scalar multiplication in elliptic curve cryptography due to the Frobenius endomorphism. Double-base number system approach for Frobenius expansion has improved the performance in single scalar multiplication. In this paper, we present a new algorithm to generate a sparse and joint $τ$-adic representation… ▽ More Koblitz curves are a special set of elliptic curves and have improved performance in computing scalar multiplication in elliptic curve cryptography due to the Frobenius endomorphism. Double-base number system approach for Frobenius expansion has improved the performance in single scalar multiplication. In this paper, we present a new algorithm to generate a sparse and joint $τ$-adic representation for a pair of scalars and its application in double scalar multiplication. The new algorithm is inspired from double-base number system. We achieve 12% improvement in speed against state-of-the-art $τ$-adic joint sparse form. △ Less

Submitted 25 January, 2018; originally announced January 2018.

Comments: 5 pages, 2 figures, 1 table

Journal ref: Circuits and Systems (ISCAS), 2011 IEEE International Symposium on

arXiv:1801.05832 [pdf, ps, other]

doi 10.1007/s11265-017-1270-6

Efficient Computation of the 8-point DCT via Summation by Parts

Authors: D. F. G. Coelho, R. J. Cintra, V. S. Dimitrov

Abstract: This paper introduces a new fast algorithm for the 8-point discrete cosine transform (DCT) based on the summation-by-parts formula. The proposed method converts the DCT matrix into an alternative transformation matrix that can be decomposed into sparse matrices of low multiplicative complexity. The method is capable of scaled and exact DCT computation and its associated fast algorithm achieves the… ▽ More This paper introduces a new fast algorithm for the 8-point discrete cosine transform (DCT) based on the summation-by-parts formula. The proposed method converts the DCT matrix into an alternative transformation matrix that can be decomposed into sparse matrices of low multiplicative complexity. The method is capable of scaled and exact DCT computation and its associated fast algorithm achieves the theoretical minimal multiplicative complexity for the 8-point DCT. Depending on the nature of the input signal simplifications can be introduced and the overall complexity of the proposed algorithm can be further reduced. Several types of input signal are analyzed: arbitrary, null mean, accumulated, and null mean/accumulated signal. The proposed tool has potential application in harmonic detection, image enhancement, and feature extraction, where input signal DC level is discarded and/or the signal is required to be integrated. △ Less

Submitted 28 March, 2018; v1 submitted 17 January, 2018; originally announced January 2018.

Comments: Fixed Fig. 1 with the block diagram of the proposed architecture. Manuscript contains 13 pages, 4 figures, 2 tables

Journal ref: J Sign Process Syst (2017)

arXiv:1710.11200 [pdf, ps, other]

doi 10.1109/TC.2014.2366732

VLSI Computational Architectures for the Arithmetic Cosine Transform

Authors: N. Rajapaksha, A. Madanayake, R. J. Cintra, J. Adikari, V. S. Dimitrov

Abstract: The discrete cosine transform (DCT) is a widely-used and important signal processing tool employed in a plethora of applications. Typical fast algorithms for nearly-exact computation of DCT require floating point arithmetic, are multiplier intensive, and accumulate round-off errors. Recently proposed fast algorithm arithmetic cosine transform (ACT) calculates the DCT exactly using only additions a… ▽ More The discrete cosine transform (DCT) is a widely-used and important signal processing tool employed in a plethora of applications. Typical fast algorithms for nearly-exact computation of DCT require floating point arithmetic, are multiplier intensive, and accumulate round-off errors. Recently proposed fast algorithm arithmetic cosine transform (ACT) calculates the DCT exactly using only additions and integer constant multiplications, with very low area complexity, for null mean input sequences. The ACT can also be computed non-exactly for any input sequence, with low area complexity and low power consumption, utilizing the novel architecture described. However, as a trade-off, the ACT algorithm requires 10 non-uniformly sampled data points to calculate the 8-point DCT. This requirement can easily be satisfied for applications dealing with spatial signals such as image sensors and biomedical sensor arrays, by placing sensor elements in a non-uniform grid. In this work, a hardware architecture for the computation of the null mean ACT is proposed, followed by a novel architectures that extend the ACT for non-null mean signals. All circuits are physically implemented and tested using the Xilinx XC6VLX240T FPGA device and synthesized for 45 nm TSMC standard-cell library for performance assessment. △ Less

Submitted 30 October, 2017; originally announced October 2017.

Comments: 8 pages, 2 figures, 6 tables

Journal ref: IEEE Transactions on Computers, vol. 64, no. 9, Sep 2015

arXiv:1710.09975 [pdf, ps, other]

doi 10.1109/TCSVT.2013.2270397

A Single-Channel Architecture for Algebraic Integer Based 8$\times$8 2-D DCT Computation

Authors: A. Edirisuriya, A. Madanayake, R. J. Cintra, V. S. Dimitrov

Abstract: An area efficient row-parallel architecture is proposed for the real-time implementation of bivariate algebraic integer (AI) encoded 2-D discrete cosine transform (DCT) for image and video processing. The proposed architecture computes 8$\times$8 2-D DCT transform based on the Arai DCT algorithm. An improved fast algorithm for AI based 1-D DCT computation is proposed along with a single channel 2-… ▽ More An area efficient row-parallel architecture is proposed for the real-time implementation of bivariate algebraic integer (AI) encoded 2-D discrete cosine transform (DCT) for image and video processing. The proposed architecture computes 8$\times$8 2-D DCT transform based on the Arai DCT algorithm. An improved fast algorithm for AI based 1-D DCT computation is proposed along with a single channel 2-D DCT architecture. The design improves on the 4-channel AI DCT architecture that was published recently by reducing the number of integer channels to one and the number of 8-point 1-D DCT cores from 5 down to 2. The architecture offers exact computation of 8$\times$8 blocks of the 2-D DCT coefficients up to the FRS, which converts the coefficients from the AI representation to fixed-point format using the method of expansion factors. Prototype circuits corresponding to FRS blocks based on two expansion factors are realized, tested, and verified on FPGA-chip, using a Xilinx Virtex-6 XC6VLX240T device. Post place-and-route results show a 20% reduction in terms of area compared to the 2-D DCT architecture requiring five 1-D AI cores. The area-time and area-time${}^2$ complexity metrics are also reduced by 23% and 22% respectively for designs with 8-bit input word length. The digital realizations are simulated up to place and route for ASICs using 45 nm CMOS standard cells. The maximum estimated clock rate is 951 MHz for the CMOS realizations indicating 7.608$\cdot$10$^9$ pixels/seconds and a 8$\times$8 block rate of 118.875 MHz. △ Less

Submitted 26 October, 2017; originally announced October 2017.

Comments: 8 pages, 6 figures, 5 tables

Journal ref: IEEE Transactions on Circuits and Systems for Video Technology, volume 23, number 12, pages 2083-2089, Dec. 2013

arXiv:1601.07225 [pdf, other]

doi 10.1088/0967-3334/25/6/002

Wavelet Analysis in a Canine Model of Gastric Electrical Uncoupling

Authors: R. J. Cintra, I. V. Tchervensky, V. S. Dimitrov, M. P. Mintchev

Abstract: Abnormal gastric motility function could be related to gastric electrical uncoupling, the lack of electrical, and respectively mechanical, synchronization in different regions of the stomach. Therefore, non-invasive detection of the onset of gastric electrical uncoupling can be important for diagnosing associated gastric motility disorders. The aim of this study is to provide a wavelet-based analy… ▽ More Abnormal gastric motility function could be related to gastric electrical uncoupling, the lack of electrical, and respectively mechanical, synchronization in different regions of the stomach. Therefore, non-invasive detection of the onset of gastric electrical uncoupling can be important for diagnosing associated gastric motility disorders. The aim of this study is to provide a wavelet-based analysis of electrogastrograms (EGG, the cutaneous recordings of gastric electric activity), to detect gastric electric uncoupling. Eight-channel EGG recordings were acquired from sixteen dogs in basal state and after each of two circular gastric myotomies. These myotomies simulated mild and severe gastric electrical uncoupling, while keeping the separated gastric sections electrophysiologically active by preserving their blood supply. After visual inspection, manually selected 10-minute EGG segments were submitted to wavelet analysis. Quantitative methodology to choose an optimal wavelet was derived. This "matching" wavelet was determined using the Pollen parameterization for 6-tap wavelet filters and error minimization criteria. After a wavelet-based compression, the distortion of the approximated EGG signals was computed. Statistical analysis on the distortion values allowed to significantly ($p<0.05$) distinguish basal state from mild and severe gastric electrical uncoupling groups in particular EGG channels. △ Less

Submitted 26 January, 2016; originally announced January 2016.

Comments: 18 pages, Fixed equation (6). arXiv admin note: substantial text overlap with arXiv:1502.00239

Journal ref: Physiological Measurement, Volume 25, Number 6, 2004

arXiv:1502.04221 [pdf, ps, other]

doi 10.1109/TCSVT.2011.2181232

A Row-parallel 8$\times$8 2-D DCT Architecture Using Algebraic Integer Based Exact Computation

Authors: A. Madanayake, R. J. Cintra, D. Onen, V. S. Dimitrov, N. T. Rajapaksha, L. T. Bruton, A. Edirisuriya

Abstract: An algebraic integer (AI) based time-multiplexed row-parallel architecture and two final-reconstruction step (FRS) algorithms are proposed for the implementation of bivariate AI-encoded 2-D discrete cosine transform (DCT). The architecture directly realizes an error-free 2-D DCT without using FRSs between row-column transforms, leading to an 8$\times$8 2-D DCT which is entirely free of quantizatio… ▽ More An algebraic integer (AI) based time-multiplexed row-parallel architecture and two final-reconstruction step (FRS) algorithms are proposed for the implementation of bivariate AI-encoded 2-D discrete cosine transform (DCT). The architecture directly realizes an error-free 2-D DCT without using FRSs between row-column transforms, leading to an 8$\times$8 2-D DCT which is entirely free of quantization errors in AI basis. As a result, the user-selectable accuracy for each of the coefficients in the FRS facilitates each of the 64 coefficients to have its precision set independently of others, avoiding the leakage of quantization noise between channels as is the case for published DCT designs. The proposed FRS uses two approaches based on (i) optimized Dempster-Macleod multipliers and (ii) expansion factor scaling. This architecture enables low-noise high-dynamic range applications in digital video processing that requires full control of the finite-precision computation of the 2-D DCT. The proposed architectures and FRS techniques are experimentally verified and validated using hardware implementations that are physically realized and verified on FPGA chip. Six designs, for 4- and 8-bit input word sizes, using the two proposed FRS schemes, have been designed, simulated, physically implemented and measured. The maximum clock rate and block-rate achieved among 8-bit input designs are 307.787 MHz and 38.47 MHz, respectively, implying a pixel rate of 8$\times$307.787$\approx$2.462 GHz if eventually embedded in a real-time video-processing system. The equivalent frame rate is about 1187.35 Hz for the image size of 1920$\times$1080. All implementations are functional on a Xilinx Virtex-6 XC6VLX240T FPGA device. △ Less

Submitted 14 February, 2015; originally announced February 2015.

Comments: 28 pages, 9 figures, 7 tables, corrected typos

Journal ref: IEEE Transactions on Circuits and Systems for Video Technology, vol. 22, no. 6, pp. 915--929, 2012

arXiv:1502.01377 [pdf, ps, other]

doi 10.1109/TSP.2010.2045781

The Arithmetic Cosine Transform: Exact and Approximate Algorithms

Authors: R. J. Cintra, V. S. Dimitrov

Abstract: In this paper, we introduce a new class of transform method --- the arithmetic cosine transform (ACT). We provide the central mathematical properties of the ACT, necessary in designing efficient and accurate implementations of the new transform method. The key mathematical tools used in the paper come from analytic number theory, in particular the properties of the Riemann zeta function. Additiona… ▽ More In this paper, we introduce a new class of transform method --- the arithmetic cosine transform (ACT). We provide the central mathematical properties of the ACT, necessary in designing efficient and accurate implementations of the new transform method. The key mathematical tools used in the paper come from analytic number theory, in particular the properties of the Riemann zeta function. Additionally, we demonstrate that an exact signal interpolation is achievable for any block-length. Approximate calculations were also considered. The numerical examples provided show the potential of the ACT for various digital signal processing applications. △ Less

Submitted 4 February, 2015; originally announced February 2015.

Comments: 17 pages, 3 figures

Journal ref: IEEE Transactions on Signal Processing, vol. 58, no. 6, pp. 3076-3085, June 2010

arXiv:1502.00296 [pdf, other]

doi 10.1016/j.image.2009.04.003

Fragile Watermarking Using Finite Field Trigonometrical Transforms

Authors: R. J. Cintra, V. S. Dimitrov, H. M. de Oliveira, R. M. Campello de Souza

Abstract: Fragile digital watermarking has been applied for authentication and alteration detection in images. Utilizing the cosine and Hartley transforms over finite fields, a new transform domain fragile watermarking scheme is introduced. A watermark is embedded into a host image via a blockwise application of two-dimensional finite field cosine or Hartley transforms. Additionally, the considered finite f… ▽ More Fragile digital watermarking has been applied for authentication and alteration detection in images. Utilizing the cosine and Hartley transforms over finite fields, a new transform domain fragile watermarking scheme is introduced. A watermark is embedded into a host image via a blockwise application of two-dimensional finite field cosine or Hartley transforms. Additionally, the considered finite field transforms are adjusted to be number theoretic transforms, appropriate for error-free calculation. The employed technique can provide invisible fragile watermarking for authentication systems with tamper location capability. It is shown that the choice of the finite field characteristic is pivotal to obtain perceptually invisible watermarked images. It is also shown that the generated watermarked images can be used as publicly available signature data for authentication purposes. △ Less

Submitted 1 February, 2015; originally announced February 2015.

Comments: 9 pages, 7 figures, 2 tables

Journal ref: Image Communication, Volume 24, Issue 7, August, 2009, pp. 587-597

arXiv:1502.00239 [pdf, ps, other]

doi 10.1109/IEMBS.2004.1403159

Optimal Wavelets for Electrogastrography

Authors: R. J. Cintra, I. V. Tchervensky, V. S. Dimitrov, M. P. Mintchev

Abstract: Matching a wavelet to class of signals can be of interest in feature detection and classification based on wavelet representation. The aim of this work is to provide a quantitative approach to the problem of matching a wavelet to electrogastrographic (EGG) signals. Visually inspected EGG recordings from sixteen dogs and six volunteers were submitted to wavelet analysis. Approximated wavelet-based… ▽ More Matching a wavelet to class of signals can be of interest in feature detection and classification based on wavelet representation. The aim of this work is to provide a quantitative approach to the problem of matching a wavelet to electrogastrographic (EGG) signals. Visually inspected EGG recordings from sixteen dogs and six volunteers were submitted to wavelet analysis. Approximated wavelet-based versions of EGG signals were calculated using Pollen parameterization of 6-tap wavelet filters and wavelet compression techniques. Wavelet parameterization values that minimize the approximation error of compressed EGG signals were sought and considered optimal. The wavelets generated from the optimal parameterization values were remarkably similar to the standard Daubechies-3 wavelet. △ Less

Submitted 1 February, 2015; originally announced February 2015.

Comments: 6 pages, 4 figures, 2 tables, corrected Eq. (3)

Journal ref: 26th Annual International Conference of the IEEE Engineering in Medicine and Biology Society (IEMBS), vol. 1, pp. 329--332, 2004

arXiv:1001.4133 [pdf, ps, other]

doi 10.1090/S0002-9939-2011-10764-0

Lower bounds on the lengths of double-base representations

Authors: Vassil S. Dimitrov, Everett W. Howe

Abstract: A double-base representation of an integer n is an expression n = n_1 + ... + n_r, where the n_i are (positive or negative) integers that are divisible by no primes other than 2 or 3; the length of the representation is the number r of terms. It is known that there is a constant a > 0 such that every integer n has a double-base representation of length at most a log n / log log n. We show that the… ▽ More A double-base representation of an integer n is an expression n = n_1 + ... + n_r, where the n_i are (positive or negative) integers that are divisible by no primes other than 2 or 3; the length of the representation is the number r of terms. It is known that there is a constant a > 0 such that every integer n has a double-base representation of length at most a log n / log log n. We show that there is a constant c > 0 such that there are infinitely many integers n whose shortest double-base representations have length greater than c log n / (log log n log log log n). Our methods allow us to find the smallest positive integers with no double-base representations of several lengths. In particular, we show that 103 is the smallest positive integer with no double-base representation of length 2, that 4985 is the smallest positive integer with no double-base representation of length 3, that 641687 is the smallest positive integer with no double-base representation of length 4, and that 326552783 is the smallest positive integer with no double-base representation of length 5. △ Less

Submitted 2 February, 2011; v1 submitted 23 January, 2010; originally announced January 2010.

Comments: 8 pages, LaTeX. Added DOIs for most references; corrected a minor error in arithmetic; made small copy-editing changes. To appear in Proc. Amer. Math. Soc

MSC Class: 11A67; 11A63

Journal ref: Proc. Amer. Math. Soc. 139 (2011), 3423-3430

Showing 1–15 of 15 results for author: Dimitrov, V S