-
Bringing Auto-tuning to HIP: Analysis of Tuning Impact and Difficulty on AMD and Nvidia GPUs
Authors:
Milo Lurati,
Stijn Heldens,
Alessio Sclocco,
Ben van Werkhoven
Abstract:
Many studies have focused on developing and improving auto-tuning algorithms for Nvidia Graphics Processing Units (GPUs), but the effectiveness and efficiency of these approaches on AMD devices have hardly been studied. This paper aims to address this gap by introducing an auto-tuner for AMD's HIP. We do so by extending Kernel Tuner, an open-source Python library for auto-tuning GPU programs. We analyze the performance impact and tuning difficulty for four highly-tunable benchmark kernels on four different GPUs: two from Nvidia and two from AMD. Our results demonstrate that auto-tuning has a significantly higher impact on performance on AMD compared to Nvidia (10x vs 2x). Additionally, we show that applications tuned for Nvidia do not perform optimally on AMD, underscoring the importance of auto-tuning specifically for AMD to achieve high performance on these GPUs.
Submitted 16 July, 2024;
originally announced July 2024.
-
Comprehensive analysis of the Apertif Fast Radio Burst sample: similarities with young, energetic neutron stars
Authors:
Inés Pastor-Marazuela,
Joeri van Leeuwen,
Anna Bilous,
Liam Connor,
Yogesh Maan,
Leon Oostrum,
Emily Petroff,
Dany Vohl,
Kelley M. Hess,
Emanuela Orrù,
Alessio Sclocco,
Yuyang Wang
Abstract:
Understanding the origin of fast radio bursts (FRBs) has become the main science driver of recent dedicated FRB surveys. Between July 2019 and February 2022, we carried out ALERT, an FRB survey at 1370 MHz using the Apertif instrument installed at the Westerbork Synthesis Radio Telescope (WSRT). Here we report the detection of 18 new FRBs, and we study the properties of the entire 24 burst sample detected during the survey. For five bursts, we identify host galaxy candidates with >50% probability association. We observe an average linear polarisation fraction of $\sim$43% and an average circular polarisation fraction consistent with 0%. A third of the FRBs display multiple components. The sample next reveals a population of highly scattered bursts, which is most likely to have been produced in the immediate circumburst environment. Furthermore, two FRBs show evidence for high rotation measures, reaching |RM|>$10^3$ rad m$^{-2}$ in the source reference frames. Together, the scattering and rotation measures ALERT finds prove that a large fraction of FRBs are embedded in complex media such as star forming regions or supernova remnants. Through the discovery of the third most dispersed FRB so far, we show that one-off FRBs can emit at frequencies in excess of 6 GHz. Finally, we determine an FRB all-sky rate of $459^{+208}_{-155}$ sky$^{-1}$ day$^{-1}$ above a fluence limit of 4.1 Jy ms, and a fluence cumulative distribution with a power law index $γ=-1.23\pm0.06\pm0.2$, which is roughly consistent with the Euclidean Universe predictions. Through the high resolution in time, frequency, polarisation and localisation that ALERT featured, we were able to determine the morphological complexity, polarisation, local scattering and magnetic environment, and high-frequency luminosity of FRBs. We find all these strongly resemble those seen in young, energetic, highly magnetised neutron stars.
Submitted 1 June, 2024;
originally announced June 2024.
-
The Apertif Radio Transient System (ARTS): Design, Commissioning, Data Release, and Detection of the first 5 Fast Radio Bursts
Authors:
Joeri van Leeuwen,
Eric Kooistra,
Leon Oostrum,
Liam Connor,
J. E. Hargreaves,
Yogesh Maan,
Inés Pastor-Marazuela,
Emily Petroff,
Daniel van der Schuur,
Alessio Sclocco,
Samayra M. Straal,
Dany Vohl,
Stefan J. Wijnholds,
Elizabeth A. K. Adams,
Björn Adebahr,
Jisk Attema,
Cees Bassa,
Jeanette E. Bast,
Anna Bilous,
W. J. G. de Blok,
Oliver M. Boersma,
Wim A. van Cappellen,
Arthur H. W. M. Coolen,
Sieds Damstra,
Helga Dénes
, et al. (27 additional authors not shown)
Abstract:
Fast Radio Bursts must be powered by uniquely energetic emission mechanisms. This requirement has eliminated a number of possible source types, but several remain. Identifying the physical nature of Fast Radio Burst (FRB) emitters arguably requires good localisation of more detections, and broadband studies enabled by real-time alerting. We here present the Apertif Radio Transient System (ARTS), a supercomputing radio-telescope instrument that performs real-time FRB detection and localisation on the Westerbork Synthesis Radio Telescope (WSRT) interferometer. It reaches coherent-addition sensitivity over the entire field of view of the primary dish beam. After commissioning results verified the system performed as planned, we initiated the Apertif FRB survey (ALERT). Over the first 5 weeks we observed at design sensitivity in 2019, we detected 5 new FRBs, and interferometrically localised each of these to 0.4-10 sq. arcmin. All detections are broad band and very narrow, of order 1 ms duration, and unscattered. Dispersion measures are generally high. Only through the very high time and frequency resolution of ARTS are these hard-to-find FRBs detected, producing an unbiased view of the intrinsic population properties. Most localisation regions are small enough to rule out the presence of associated persistent radio sources. Three FRBs cut through the halos of M31 and M33. We demonstrate that Apertif can localise one-off FRBs with an accuracy that maps magneto-ionic material along well-defined lines of sight. The rate of 1 every ~7 days next ensures a considerable number of new sources are detected for such study. The combination of detection rate and localisation accuracy exemplified by the 5 first ARTS FRBs thus marks a new phase in which a growing number of bursts can be used to probe our Universe.
Submitted 1 February, 2023; v1 submitted 24 May, 2022;
originally announced May 2022.
-
A fast radio burst with sub-millisecond quasi-periodic structure
Authors:
Inés Pastor-Marazuela,
Joeri van Leeuwen,
Anna Bilous,
Liam Connor,
Yogesh Maan,
Leon Oostrum,
Emily Petroff,
Samayra Straal,
Dany Vohl,
E. A. K. Adams,
B. Adebahr,
Jisk Attema,
Oliver M. Boersma,
R. van den Brink,
W. A. van Cappellen,
A. H. W. M. Coolen,
S. Damstra,
H. Dénes,
K. M. Hess,
J. M. van der Hulst,
B. Hut,
A. Kutkin,
G. Marcel Loose,
D. M. Lucero,
Á. Mika
, et al. (9 additional authors not shown)
Abstract:
Fast radio bursts (FRBs) are extragalactic radio transients of extraordinary luminosity. Studying the diverse temporal and spectral behaviour recently observed in a number of FRBs may help determine the nature of the entire class. For example, a fast spinning or highly magnetised neutron star might generate the rotation-powered acceleration required to explain the bright emission. Periodic, sub-second components, suggesting such rotation, were recently reported in one FRB, and potentially in two more. Here we report the discovery of FRB 20201020A with Apertif, an FRB showing five components regularly spaced by 0.415 ms. This sub-millisecond structure in FRB 20201020A carries important clues about the progenitor of this FRB specifically, and potentially about that of FRBs in general. We thus contrast its features to the predictions of the main FRB source models. We perform a timing analysis of the FRB 20201020A components to determine the significance of the periodicity. We compare these against the timing properties of the previously reported CHIME FRBs with sub-second quasi-periodic components, and against two Apertif bursts from repeating FRB 20180916B that show complex time-frequency structure. We find the periodicity of FRB 20201020A to be marginally significant at 2.5$σ$. Its repeating subcomponents cannot be explained as a pulsar rotation since the required spin rate of over 2 kHz exceeds the limits set by typical neutron star equations of state and observations. The fast periodicity is also in conflict with a compact object merger scenario. These quasi-periodic components could, however, be caused by equidistant emitting regions in the magnetosphere of a magnetar. The sub-millisecond spacing of the components in FRB 20201020A, the smallest observed so far in a one-off FRB, may rule out both neutron-star rotation and binary mergers as the direct source of quasi-periodic FRBs.
Submitted 16 February, 2022;
originally announced February 2022.
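As a quick check of the spin-rate argument in the abstract above: if the 0.415 ms component spacing were a full rotation period, the implied spin frequency would simply be its reciprocal. The sketch below does that arithmetic; the function name is illustrative and does not come from the paper.

```python
def spin_frequency_hz(spacing_ms: float) -> float:
    """Spin frequency in Hz if the component spacing were one rotation period."""
    return 1.0 / (spacing_ms * 1e-3)

# Component spacing quoted in the abstract for FRB 20201020A.
frequency = spin_frequency_hz(0.415)
print(f"{frequency:.0f} Hz")
```

The result is a little above 2.4 kHz, consistent with the "over 2 kHz" figure the abstract uses to argue against a rotational origin.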
-
Apertif, Phased Array Feeds for the Westerbork Synthesis Radio Telescope
Authors:
W. A. van Cappellen,
T. A. Oosterloo,
M. A. W. Verheijen,
E. A. K. Adams,
B. Adebahr,
R. Braun,
K. M. Hess,
H. Holties,
J. M. van der Hulst,
B. Hut,
E. Kooistra,
J. van Leeuwen,
G. M. Loose,
R. Morganti,
V. A. Moss,
E. Orrú,
M. Ruiter,
A. P. Schoenmakers,
N. J. Vermaas,
S. J. Wijnholds,
A. S. van Amesfoort,
M. J. Arts,
J. J. Attema,
L. Bakker,
C. G. Bassa
, et al. (65 additional authors not shown)
Abstract:
We describe the APERture Tile In Focus (Apertif) system, a phased array feed (PAF) upgrade of the Westerbork Synthesis Radio Telescope which has transformed this telescope into a high-sensitivity, wide field-of-view L-band imaging and transient survey instrument. Using novel PAF technology, up to 40 partially overlapping beams can be formed on the sky simultaneously, significantly increasing the survey speed of the telescope. With this upgraded instrument, an imaging survey covering an area of 2300 deg$^2$ is being performed which will deliver both continuum and spectral line data sets, of which the first data has been publicly released. In addition, a time domain transient and pulsar survey covering 15,000 deg$^2$ is in progress. An overview of the Apertif science drivers, hardware and software of the upgraded telescope is presented, along with its key performance characteristics.
Submitted 30 September, 2021; v1 submitted 29 September, 2021;
originally announced September 2021.
-
Chromatic periodic activity down to 120 MHz in a Fast Radio Burst
Authors:
Inés Pastor-Marazuela,
Liam Connor,
Joeri van Leeuwen,
Yogesh Maan,
Sander ter Veen,
Anna Bilous,
Leon Oostrum,
Emily Petroff,
Samayra Straal,
Dany Vohl,
Jisk Attema,
Oliver M. Boersma,
Eric Kooistra,
Daniel van der Schuur,
Alessio Sclocco,
Roy Smits,
Elizabeth A. K. Adams,
Björn Adebahr,
Willem J. G. de Blok,
Arthur H. W. M. Coolen,
Sieds Damstra,
Helga Dénes,
Kelley M. Hess,
Thijs van der Hulst,
Boudewijn Hut
, et al. (12 additional authors not shown)
Abstract:
Fast radio bursts (FRBs) are extragalactic astrophysical transients whose brightness requires emitters that are highly energetic, yet compact enough to produce the short, millisecond-duration bursts. FRBs have thus far been detected between 300 MHz and 8 GHz, but lower-frequency emission has remained elusive. A subset of FRBs is known to repeat, and one of those sources, FRB 20180916B, does so with a 16.3 day activity period. Using simultaneous Apertif and LOFAR data, we show that FRB 20180916B emits down to 120 MHz, and that its activity window is both narrower and earlier at higher frequencies. Binary wind interaction models predict a narrower periodic activity window at lower frequencies, which is the opposite of our observations. Our detections establish that low-frequency FRB emission can escape the local medium. For bursts of the same fluence, FRB 20180916B is more active below 200 MHz than at 1.4 GHz. Combining our results with previous upper-limits on the all-sky FRB rate at 150 MHz, we find that there are 3-450 FRBs/sky/day above 50 Jy ms at 90% confidence. We are able to rule out the scenario in which companion winds cause FRB periodicity. We also demonstrate that some FRBs live in clean environments that do not absorb or scatter low-frequency radiation.
Submitted 15 December, 2020;
originally announced December 2020.
-
ESiWACE2 Services: RSE collaborations in Weather and Climate
Authors:
Gijs van den Oord,
Victor Azizi,
Alessio Sclocco,
Georges-Emmanuel Moulard,
David Guibert,
Jisk Attema,
Erwan Raffin,
Ben van Werkhoven
Abstract:
We present the collaborative model of ESiWACE2 Services, where Research Software Engineers (RSEs) from the Netherlands eScience Center (NLeSC) and Atos offer their expertise to climate and earth system modeling groups across Europe. Within 6-month collaborative projects, the RSEs intend to provide guidance and advice regarding the performance, portability to new architectures, and scalability of selected applications. We present the four awarded projects as examples of this funding structure.
Submitted 30 September, 2020;
originally announced September 2020.
-
Lessons learned in a decade of research software engineering GPU applications
Authors:
Ben van Werkhoven,
Willem Jan Palenstijn,
Alessio Sclocco
Abstract:
After years of using Graphics Processing Units (GPUs) to accelerate scientific applications in fields as varied as tomography, computer vision, climate modeling, digital forensics, geospatial databases, particle physics, radio astronomy, and localization microscopy, we noticed a number of technical, socio-technical, and non-technical challenges that Research Software Engineers (RSEs) may run into. While some of these challenges, such as managing different programming languages within a project, or having to deal with different memory spaces, are common to all software projects involving GPUs, others are more typical of scientific software projects. Among these challenges we include changing resolutions or scales, maintaining an application over time and making it sustainable, and evaluating both the obtained results and the achieved performance.
Submitted 27 May, 2020;
originally announced May 2020.
-
A bright, high rotation-measure FRB that skewers the M33 halo
Authors:
Liam Connor,
Joeri van Leeuwen,
L. C. Oostrum,
E. Petroff,
Yogesh Maan,
E. A. K. Adams,
J. J. Attema,
J. E. Bast,
O. M. Boersma,
H. Dénes,
D. W. Gardenier,
J. E. Hargreaves,
E. Kooistra,
I. Pastor-Marazuela,
R. Schulz,
A. Sclocco,
R. Smits,
S. M. Straal,
D. van der Schuur,
Dany Vohl,
B. Adebahr,
W. J. G. de Blok,
W. A. van Cappellen,
A. H. W. M. Coolen,
S. Damstra
, et al. (15 additional authors not shown)
Abstract:
We report the detection of a bright fast radio burst, FRB 191108, with Apertif on the Westerbork Synthesis Radio Telescope (WSRT). The interferometer allows us to localise the FRB to a narrow $5''\times7'$ ellipse by employing both multibeam information within the Apertif phased-array feed (PAF) beam pattern, and across different tied-array beams. The resulting sight line passes close to Local Group galaxy M33, with an impact parameter of only 18 kpc with respect to the core. It also traverses the much larger circumgalactic medium of M31, the Andromeda Galaxy. We find that the shared plasma of the Local Group galaxies could contribute $\sim$10% of its dispersion measure of 588 pc cm$^{-3}$. FRB 191108 has a Faraday rotation measure of +474 $\pm$ 3 rad m$^{-2}$, which is too large to be explained by either the Milky Way or the intergalactic medium. Based on the more moderate RMs of other extragalactic sources that traverse the halo of M33, we conclude that the dense magnetised plasma resides in the host galaxy. The FRB exhibits frequency structure on two scales, one that is consistent with quenched Galactic scintillation and broader spectral structure with $Δν\approx40$ MHz. If the latter is due to scattering in the shared M33/M31 CGM, our results constrain the Local Group plasma environment. We found no accompanying persistent radio sources in the Apertif imaging survey data.
Submitted 22 September, 2020; v1 submitted 4 February, 2020;
originally announced February 2020.
-
Real-Time RFI Mitigation for the Apertif Radio Transient System
Authors:
Alessio Sclocco,
Dany Vohl,
Rob V. van Nieuwpoort
Abstract:
Current and upcoming radio telescopes are being designed with increasing sensitivity to detect new and mysterious radio sources of astrophysical origin. While this increased sensitivity improves the likelihood of discoveries, it also makes these instruments more susceptible to the deleterious effects of Radio Frequency Interference (RFI). The challenge posed by RFI is exacerbated by the high data-rates achieved by modern radio telescopes, which require real-time processing to keep up with the data. Furthermore, the high data-rates do not allow for permanent storage of observations at high resolution. Offline RFI mitigation is therefore not possible anymore. The real-time requirement makes RFI mitigation even more challenging because, on one side, the techniques used for mitigation need to be fast and simple, and on the other side they also need to be robust enough to cope with just a partial view of the data.
The Apertif Radio Transient System (ARTS) is the real-time, time-domain, transient detection instrument of the Westerbork Synthesis Radio Telescope (WSRT), processing 73 Gb of data per second. Even with a deep learning classifier, the ARTS pipeline requires state-of-the-art real-time RFI mitigation to reduce the number of false-positive detections. Our solution to this challenge is RFIm, a high-performance, open-source, tuned, and extensible RFI mitigation library. The goal of this library is to provide users with RFI mitigation routines that are designed to run in real-time on many-core accelerators, such as Graphics Processing Units, and that can be highly-tuned to achieve code and performance portability to different hardware platforms and scientific use-cases. Results on the ARTS show that we can achieve real-time RFI mitigation, with a minimal impact on the total execution time of the search pipeline, and considerably reduce the number of false-positives.
Submitted 16 January, 2020; v1 submitted 10 January, 2020;
originally announced January 2020.
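The abstract above does not spell out which techniques RFIm implements. As an illustration of the kind of "fast and simple" method that can work on just a partial view of the data, here is a minimal sketch of time-domain sigma clipping on one chunk of samples. The function name, the threshold of 3, and the replace-with-mean policy are assumptions made for this example and are not the RFIm API.

```python
from statistics import mean, pstdev

def sigma_clip(samples, threshold=3.0):
    """Replace samples further than `threshold` standard deviations from the
    chunk mean with the chunk mean. Hypothetical helper, not RFIm's API."""
    mu = mean(samples)
    sigma = pstdev(samples)
    if sigma == 0.0:
        return list(samples)  # constant chunk: nothing to flag
    return [mu if abs(x - mu) > threshold * sigma else x for x in samples]

# Twenty quiet samples followed by one strong interference spike.
chunk = [1.0, 1.1, 0.9, 1.0] * 5 + [50.0]
cleaned = sigma_clip(chunk)
print(cleaned[-1])
```

Only statistics of the current chunk are needed, so a routine of this shape can run without access to the full observation. Note that the mean and standard deviation here are themselves biased by the spike; a production routine would likely use more robust statistics.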
-
Repeating fast radio bursts with WSRT/Apertif
Authors:
L. C. Oostrum,
Y. Maan,
J. van Leeuwen,
L. Connor,
E. Petroff,
J. J. Attema,
J. E. Bast,
D. W. Gardenier,
J. E. Hargreaves,
E. Kooistra,
D. van der Schuur,
A. Sclocco,
R. Smits,
S. M. Straal,
S. ter Veen,
D. Vohl,
E. A. K. Adams,
B. Adebahr,
W. J. G. de Blok,
R. H. van den Brink,
W. A. van Cappellen,
A. H. W. M. Coolen,
S. Damstra,
G. N. J. van Diepen,
B. S. Frank
, et al. (18 additional authors not shown)
Abstract:
Repeating fast radio bursts (FRBs) present excellent opportunities to identify FRB progenitors and host environments, as well as decipher the underlying emission mechanism. Detailed studies of repeating FRBs might also hold clues to the origin of FRBs as a population. We aim to detect the first two repeating FRBs: FRB 121102 (R1) and FRB 180814.J0422+73 (R2), and characterise their repeat statistics. We also want to significantly improve the sky localisation of R2. We use the Westerbork Synthesis Radio Telescope to conduct extensive follow-up of these two repeating FRBs. The new phased-array feed system, Apertif, allows covering the entire sky position uncertainty of R2 with fine spatial resolution in one pointing. We characterise the energy distribution and the clustering of detected R1 bursts. We detected 30 bursts from R1. Our measurements indicate a dispersion measure of 563.5(2) pc cm$^{-3}$, suggesting a significant increase in DM over the past few years. We place an upper limit of 8% on the linear polarisation fraction of the brightest burst. We did not detect any bursts from R2. A single power-law might not fit the R1 burst energy distribution across the full energy range or widely separated detections. Our observations provide improved constraints on the clustering of R1 bursts. Our stringent upper limits on the linear polarisation fraction imply a significant depolarisation, either intrinsic to the emission mechanism or caused by the intervening medium, at 1400 MHz that is not observed at higher frequencies. The non-detection of any bursts from R2 implies either a highly clustered nature of the bursts, a steep spectral index, or a combination of both. Alternatively, R2 has turned off completely, either permanently or for an extended period of time.
Submitted 28 January, 2020; v1 submitted 27 December, 2019;
originally announced December 2019.
-
The Apertif Monitor for Bursts Encountered in Real-time (AMBER) auto-tuning optimization with genetic algorithms
Authors:
Klim Mikhailov,
Alessio Sclocco
Abstract:
Real-time searches for faint radio pulses from unknown radio transients are computationally challenging. Detections become further complicated due to continuously increasing technical capabilities of transient surveys: telescope sensitivity, searched area of the sky, number of antennas or dishes, temporal and frequency resolution. The new Apertif transient survey on the Westerbork telescope happens in real-time on GPUs by means of the single-pulse search pipeline AMBER (Sclocco, 2017). AMBER initially carries out auto-tuning: it finds the optimal configuration of user-controlled parameters for each of the four pipeline kernels so that each kernel performs its task as fast as possible. The pipeline uses a brute-force (BF) exhaustive search which in total takes 5-24 hours to run depending on the processing cluster architecture. We apply more heuristic, biologically driven genetic algorithms (GAs) to limit the exploration of the total parameter space, tune all four kernels together and reduce the tuning time to a few hours. Our results show that after only a few hours of tuning, GAs always find similar or even better configurations for all kernels together than the combination of single kernel configurations tuned by the BF approach. At the same time, by means of their genetic operators, GAs converge to better solutions than those obtained by pure random searches. The explored multi-dimensional parameter space is very complex and has multiple local optima, as the evolution of randomly generated configurations does not always guarantee a global solution.
△ Less
Submitted 9 November, 2018;
originally announced November 2018.
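The genetic-algorithm tuning idea described in the abstract above can be sketched in a few lines. This is a toy illustration, not AMBER's code: the parameter names and value lists are hypothetical, and `synthetic_runtime` is a made-up stand-in for compiling and timing a real GPU kernel.

```python
import random

# Hypothetical tunable parameters; real kernels expose different ones.
SEARCH_SPACE = {
    "threads_per_block": [32, 64, 128, 256, 512, 1024],
    "items_per_thread": [1, 2, 4, 8, 16],
    "unroll": [1, 2, 4, 8],
    "tile_size": [1, 2, 4, 8, 16, 32],
}

def synthetic_runtime(config):
    """Stand-in for compiling and timing a kernel (lower is better)."""
    return (abs(config["threads_per_block"] - 256) / 256
            + abs(config["items_per_thread"] - 4) / 4
            + abs(config["unroll"] - 2) / 2
            + abs(config["tile_size"] - 8) / 8)

def random_config(rng):
    return {name: rng.choice(values) for name, values in SEARCH_SPACE.items()}

def crossover(a, b, rng):
    # Uniform crossover: each parameter comes from one of the two parents.
    return {name: rng.choice((a[name], b[name])) for name in SEARCH_SPACE}

def mutate(config, rng, rate=0.2):
    # With probability `rate`, resample a parameter from its allowed values.
    return {name: rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value
            for name, value in config.items()}

def genetic_tune(generations=10, population_size=12, elite=2, seed=0):
    rng = random.Random(seed)
    population = [random_config(rng) for _ in range(population_size)]
    for _ in range(generations):
        population.sort(key=synthetic_runtime)
        parents = population[:population_size // 2]
        # Elitism keeps the best configurations, so the best runtime never worsens.
        children = population[:elite]
        while len(children) < population_size:
            a, b = rng.sample(parents, 2)
            children.append(mutate(crossover(a, b, rng), rng))
        population = children
    return min(population, key=synthetic_runtime)

best = genetic_tune()
print(best, synthetic_runtime(best))
```

With these defaults the loop generates at most 112 candidate configurations (12 initial plus 10 new children in each of 10 generations) out of the 720 in this toy space, which is the sense in which a GA limits exploration compared with an exhaustive search. As the abstract notes, such a search is not guaranteed to reach the global optimum.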
-
Auto-Tuning Dedispersion for Many-Core Accelerators
Authors:
Alessio Sclocco,
Henri E. Bal,
Jason Hessels,
Joeri van Leeuwen,
Rob V. van Nieuwpoort
Abstract:
In this paper, we study the parallelization of the dedispersion algorithm on many-core accelerators, including GPUs from AMD and NVIDIA, and the Intel Xeon Phi. An important contribution is the computational analysis of the algorithm, from which we conclude that dedispersion is inherently memory-bound in any realistic scenario, in contrast to earlier reports. We also provide empirical proof that, even in unrealistic scenarios, hardware limitations keep the arithmetic intensity low, thus limiting performance. We exploit auto-tuning to adapt the algorithm, not only to different accelerators, but also to different observations, and even telescopes. Our experiments show how the algorithm is tuned automatically for different scenarios and how it exploits and highlights the underlying specificities of the hardware: in some observations, the tuner automatically optimizes device occupancy, while in others it optimizes memory bandwidth. We quantitatively analyze the problem space, and by comparing the results of optimal auto-tuned versions against the best performing fixed codes, we show the impact that auto-tuning has on performance, and conclude that it is statistically relevant.
△ Less
Submitted 18 January, 2016;
originally announced January 2016.
-
Real-Time Dedispersion for Fast Radio Transient Surveys, using Auto Tuning on Many-Core Accelerators
Authors:
Alessio Sclocco,
Joeri van Leeuwen,
Henri E. Bal,
Rob V. van Nieuwpoort
Abstract:
Dedispersion, the removal of deleterious smearing of impulsive signals by the interstellar matter, is one of the most intensive processing steps in any radio survey for pulsars and fast transients. We here present a study of the parallelization of this algorithm on many-core accelerators, including GPUs from AMD and NVIDIA, and the Intel Xeon Phi. We find that dedispersion is inherently memory-bound. Even in a perfect scenario, hardware limitations keep the arithmetic intensity low, thus limiting performance. We next exploit auto-tuning to adapt dedispersion to different accelerators, observations, and even telescopes. We demonstrate that the optimal settings differ between observational setups, and that auto-tuning significantly improves performance. This impacts time-domain surveys from Apertif to SKA.
△ Less
Submitted 6 January, 2016;
originally announced January 2016.
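For readers unfamiliar with dedispersion, the following is a minimal pure-Python sketch of the brute-force incoherent form of the algorithm. It relies on the standard cold-plasma dispersion relation, in which the arrival delay scales as DM / ν² with a dispersion constant of about 4.148808 × 10³ MHz² pc⁻¹ cm³ s. The function names and the synthetic three-channel input are assumptions for illustration and are not taken from the papers' code.

```python
K_DM = 4.148808e3  # dispersion constant in MHz^2 pc^-1 cm^3 s

def delay_seconds(dm, freq_mhz, ref_mhz):
    """Arrival delay at freq_mhz relative to ref_mhz for dispersion measure dm."""
    return K_DM * dm * (freq_mhz ** -2 - ref_mhz ** -2)

def dedisperse(data, freqs_mhz, dm, sample_time):
    """Brute-force incoherent dedispersion.

    data[c][t] holds the sample of channel c at time index t. Each channel is
    shifted by its delay relative to the highest frequency, then channels are
    summed. Only output samples covered by every channel are returned.
    """
    ref = max(freqs_mhz)
    shifts = [round(delay_seconds(dm, f, ref) / sample_time) for f in freqs_mhz]
    n_out = len(data[0]) - max(shifts)
    return [sum(channel[t + shift] for channel, shift in zip(data, shifts))
            for t in range(n_out)]

# Synthetic input: one unit pulse per channel, delayed according to DM = 100.
freqs = [1500.0, 1400.0, 1300.0]
dm, dt, n_samples, t0 = 100.0, 1e-3, 128, 10
data = []
for f in freqs:
    channel = [0.0] * n_samples
    channel[t0 + round(delay_seconds(dm, f, max(freqs)) / dt)] = 1.0
    data.append(channel)

series = dedisperse(data, freqs, dm, dt)
peak = series.index(max(series))
print(peak, series[peak])
```

The inner loop performs one addition for every input sample it loads, which gives a rough intuition for why the abstracts above describe the algorithm as having low arithmetic intensity.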