HPC with Enhanced User Separation

Authors: Andrew Prout, Albert Reuther, Michael Houle, Michael Jones, Peter Michaleas, LaToya Anderson, William Arcand, Bill Bergeron, David Bestor, Alex Bonn, Daniel Burrill, Chansup Byun, Vijay Gadepally, Matthew Hubbell, Hayden Jananthan, Piotr Luszczek, Lauren Milechin, Guillermo Morales, Julie Mullen, Antonio Rosa, Charles Yee, Jeremy Kepner

Abstract: HPC systems used for research run a wide variety of software and workflows. This software is often written or modified by users to meet the needs of their research projects, and is rarely built with security in mind. In this paper we explore several of the key techniques that the MIT Lincoln Laboratory Supercomputing Center has deployed on its systems to manage the security implications of these workflows by providing enforced separation for processes, filesystem access, network traffic, and accelerators to make every user feel like they are running on a personal HPC.

Submitted 16 September, 2024; originally announced September 2024.

arXiv:2409.08115 [pdf, other]

Anonymized Network Sensing Graph Challenge

Authors: Hayden Jananthan, Michael Jones, William Arcand, David Bestor, William Bergeron, Daniel Burrill, Aydin Buluc, Chansup Byun, Timothy Davis, Vijay Gadepally, Daniel Grant, Michael Houle, Matthew Hubbell, Piotr Luszczek, Peter Michaleas, Lauren Milechin, Chasen Milner, Guillermo Morales, Andrew Morris, Julie Mullen, Ritesh Patel, Alex Pentland, Sandeep Pisharody, Andrew Prout, Albert Reuther, et al. (4 additional authors not shown)

Abstract: The MIT/IEEE/Amazon GraphChallenge encourages community approaches to developing new solutions for analyzing graphs and sparse data derived from social media, sensor feeds, and scientific data to discover relationships between events as they unfold in the field. The anonymized network sensing Graph Challenge seeks to enable large, open, community-based approaches to protecting networks. Many large-scale networking problems can only be solved with community access to very broad data sets with the highest regard for privacy and strong community buy-in. Such approaches often require community-based data sharing. In the broader networking community (commercial, federal, and academia) anonymized source-to-destination traffic matrices with standard data sharing agreements have emerged as a data product that can meet many of these requirements. This challenge provides an opportunity to highlight novel approaches for optimizing the construction and analysis of anonymized traffic matrices using over 100 billion network packets derived from the largest Internet telescope in the world (CAIDA). This challenge specifies the anonymization, construction, and analysis of these traffic matrices. A GraphBLAS reference implementation is provided, but the use of GraphBLAS is not required in this Graph Challenge. As with prior Graph Challenges the goal is to provide a well-defined context for demonstrating innovation. Graph Challenge participants are free to select (with accompanying explanation) the Graph Challenge elements that are appropriate for highlighting their innovations.

Submitted 12 September, 2024; originally announced September 2024.

Comments: Accepted to IEEE HPEC 2024
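
As a rough illustration of the "anonymization, construction, and analysis" of a source-to-destination traffic matrix mentioned in the abstract above, here is a minimal sketch in plain Python. It is not the challenge's reference implementation: the salted-hash anonymization, the function names (`anonymize`, `traffic_matrix`, `summarize`), and the summary statistics are all assumptions chosen for illustration, and a dictionary of counts stands in for a GraphBLAS hypersparse matrix.

```python
import hashlib
from collections import Counter


def anonymize(ip: str, salt: bytes) -> str:
    """Map an address to a salted, truncated SHA-256 digest (illustrative scheme only)."""
    return hashlib.sha256(salt + ip.encode("ascii")).hexdigest()[:16]


def traffic_matrix(packets, salt: bytes) -> Counter:
    """Count packets per (anonymized source, anonymized destination) pair.

    The Counter acts as a sparse matrix: only observed links are stored.
    """
    matrix = Counter()
    for src, dst in packets:
        matrix[(anonymize(src, salt), anonymize(dst, salt))] += 1
    return matrix


def summarize(matrix: Counter) -> dict:
    """Compute a few aggregate quantities from the sparse matrix."""
    return {
        "valid_packets": sum(matrix.values()),
        "unique_links": len(matrix),
        "unique_sources": len({s for s, _ in matrix}),
        "unique_destinations": len({d for _, d in matrix}),
        "max_link_packets": max(matrix.values(), default=0),
    }


# Hypothetical packet headers reduced to (source, destination) pairs.
packets = [
    ("10.0.0.1", "10.0.0.2"),
    ("10.0.0.1", "10.0.0.2"),
    ("10.0.0.3", "10.0.0.2"),
]
m = traffic_matrix(packets, salt=b"example-salt")
stats = summarize(m)
```

Because the same salt maps a given address to the same digest, link counts survive anonymization while the raw addresses never appear in the matrix keys.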

arXiv:2409.03111 [pdf, other]

What is Normal? A Big Data Observational Science Model of Anonymized Internet Traffic

Authors: Jeremy Kepner, Hayden Jananthan, Michael Jones, William Arcand, David Bestor, William Bergeron, Daniel Burrill, Aydin Buluc, Chansup Byun, Timothy Davis, Vijay Gadepally, Daniel Grant, Michael Houle, Matthew Hubbell, Piotr Luszczek, Lauren Milechin, Chasen Milner, Guillermo Morales, Andrew Morris, Julie Mullen, Ritesh Patel, Alex Pentland, Sandeep Pisharody, Andrew Prout, Albert Reuther, et al. (4 additional authors not shown)

Abstract: Understanding what is normal is a key aspect of protecting a domain. Other domains invest heavily in observational science to develop models of normal behavior to better detect anomalies. Recent advances in high performance graph libraries, such as the GraphBLAS, coupled with supercomputers enable processing of the trillions of observations required. We leverage this approach to synthesize low-parameter observational models of anonymized Internet traffic with a high regard for privacy.

Submitted 4 September, 2024; originally announced September 2024.

Comments: Accepted to IEEE HPEC, 7 pages, 6 figures, 1 table, 41 references

arXiv:2407.20813 [pdf, other]

On the use of field RR Lyrae as Galactic probes VII. light curve templates in the LSST photometric system

Authors: V. F. Braga, M. Monelli, M. Dall'Ora, J. P. Mullen, R. Molinaro, M. Marconi, R. Szabó, C. Gallart

Abstract: The Vera C. Rubin Observatory will start operations in 2025. During the first two years, too few visits per target per band will be available, meaning that mean magnitude measurements of variable stars will not be precise and thus standard candles like RR Lyrae (RRL) will not be usable. Light curve templates (LCTs) can be adopted to estimate the mean magnitude of a variable star with few magnitude measurements, provided that its period (plus amplitude and reference epoch, depending on how the LCT is applied) is known. LSST will provide precise RRL periods within the first six months, allowing RRLs to be exploited if LCTs are available. We aim to build LCTs in the LSST bands to enhance early science with LSST. Using them will provide a 1-2 year advantage over a classical approach for distance measurements. We collected $gri$-band data from the ZTF survey and $z$-band data from DECam to build the LCTs of RRLs. We also adopted synthetic $griz$-band data in the LSST system from pulsation models, plus SDSS, Gaia, and OGLE photometry, inspecting the light amplitude ratios in different photometric systems to provide useful conversions to apply the LCTs. We have built LCTs of RRLs in the $griz$ bands of the LSST photometric system; for the $z$ band, we could build only fundamental-mode RRL LCTs. We quantitatively demonstrated that LCTs built with ZTF and DECam data can be adopted on the LSST photometric system. LCTs will decrease by a factor of at least two the uncertainty on distance estimates of RRLs, with respect to a simple average of the available measurements. Finally, within our tests, we have found a brand new behavior of amplitude ratios in the Large Magellanic Cloud.

Submitted 30 July, 2024; originally announced July 2024.

Comments: Accepted for publication in A&A

arXiv:2407.01481 [pdf, other]

LLload: Simplifying Real-Time Job Monitoring for HPC Users

Authors: Chansup Byun, Julia Mullen, Albert Reuther, William Arcand, William Bergeron, David Bestor, Daniel Burrill, Vijay Gadepally, Michael Houle, Matthew Hubbell, Hayden Jananthan, Michael Jones, Peter Michaleas, Guillermo Morales, Andrew Prout, Antonio Rosa, Charles Yee, Jeremy Kepner, Lauren Milechin

Abstract: One of the more complex tasks for researchers using HPC systems is performance monitoring and tuning of their applications. Developing a practice of continuous performance improvement, both for speed-up and efficient use of resources, is essential to the long-term success of both the HPC practitioner and the research project. Profiling tools provide a nice view of the performance of an application but often have a steep learning curve and rarely provide an easy-to-interpret view of resource utilization. Lower-level tools such as top and htop provide a view of resource utilization for those familiar and comfortable with Linux but pose a barrier for newer HPC practitioners. To expand the existing profiling and job monitoring options, the MIT Lincoln Laboratory Supercomputing Center created LLload, a tool that captures a snapshot of the resources being used by a job on a per-user basis. LLload is a tool built from standard HPC tools that provides an easy way for a researcher to track resource usage of active jobs. We explain how the tool was designed and implemented and provide insight into how it is used to aid new researchers in developing their performance monitoring skills as well as guide researchers in their resource requests.

Submitted 1 July, 2024; originally announced July 2024.

arXiv:2405.04580 [pdf, other]

doi 10.1093/mnras/stae1149

The GALAH survey: Tracing the Milky Way's formation and evolution through RR Lyrae stars

Authors: Valentina D'Orazi, Nicholas Storm, Andrew R. Casey, Vittorio F. Braga, Alice Zocchi, Giuseppe Bono, Michele Fabrizio, Christopher Sneden, Davide Massari, Riano E. Giribaldi, Maria Bergemann, Simon W. Campbell, Luca Casagrande, Richard de Grijs, Gayandhi De Silva, Maria Lugaro, Daniel B. Zucker, Angela Bragaglia, Diane Feuillet, Giuliana Fiorentino, Brian Chaboyer, Massimo Dall'Ora, Massimo Marengo, Clara E. Martínez-Vázquez, Noriyuki Matsunaga, et al. (17 additional authors not shown)

Abstract: Stellar mergers and accretion events have been crucial in shaping the evolution of the Milky Way (MW). These events have been dynamically identified and chemically characterised using red giants and main-sequence stars. RR Lyrae (RRL) variables can play a crucial role in tracing the early formation of the MW since they are ubiquitous, old (t$\ge$10 Gyr) low-mass stars and accurate distance indicators. We exploited Data Release 3 of the GALAH survey to identify 78 field RRLs suitable for chemical analysis. Using synthetic spectra calculations, we determined atmospheric parameters and abundances of Fe, Mg, Ca, Y, and Ba. Most of our stars exhibit halo-like chemical compositions, with an iron peak around [Fe/H]$\approx -$1.40, and enhanced Ca and Mg content. Notably, we discovered a metal-rich tail, with [Fe/H] values ranging from $-$1 to approximately solar metallicity. This sub-group includes almost 1/4 of the sample; it is characterised by thin disc kinematics and displays sub-solar $\alpha$-element abundances, marginally consistent with the majority of the MW stars. Surprisingly, they differ distinctly from typical MW disc stars in terms of the s-process elements Y and Ba. We took advantage of similar data available in the literature and built a total sample of 535 field RRLs for which we estimated kinematical and dynamical properties. We found that metal-rich RRLs (1/3 of the sample) likely represent an old component of the MW thin disc. We also detected RRLs with retrograde orbits and provided preliminary associations with the Gaia-Sausage-Enceladus, Helmi, Sequoia, Sagittarius, and Thamnos stellar streams.

Submitted 7 May, 2024; originally announced May 2024.

Comments: Accepted for publication in MNRAS. 29 pages, 20 figures

arXiv:2404.08827 [pdf, other]

doi 10.1109/LRA.2024.3430129

"Don't forget to put the milk back!" Dataset for Enabling Embodied Agents to Detect Anomalous Situations

Authors: James F. Mullen Jr, Prasoon Goyal, Robinson Piramuthu, Michael Johnston, Dinesh Manocha, Reza Ghanadan

Abstract: Home robots are intended to make their users' lives easier. Our work assists in this goal by enabling robots to inform their users of dangerous or unsanitary anomalies in their home. Some examples of these anomalies include the user leaving their milk out, forgetting to turn off the stove, or leaving poison accessible to children. To move towards enabling home robots with these abilities, we have created a new dataset, which we call SafetyDetect. The SafetyDetect dataset consists of 1000 anomalous home scenes, each of which contains unsafe or unsanitary situations for an agent to detect. Our approach utilizes large language models (LLMs) alongside both a graph representation of the scene and the relationships between the objects in the scene. Our key insight is that this connected scene graph and the object relationships it encodes enable the LLM to better reason about the scene, especially as it relates to detecting dangerous or unsanitary situations. Our most promising approach utilizes GPT-4 and pursues a categorization technique where object relations from the scene graph are classified as normal, dangerous, unsanitary, or dangerous for children. This method is able to correctly identify over 90% of anomalous scenarios in the SafetyDetect Dataset. Additionally, we conduct real-world experiments on a ClearPath TurtleBot where we generate a scene graph from visuals of the real-world scene, and run our approach with no modification. This setup resulted in little performance loss. The SafetyDetect Dataset and code will be released to the public upon this paper's publication.

Submitted 12 April, 2024; originally announced April 2024.

Journal ref: IEEE Robotics and Automation Letters 9.10 (2024) 9087 - 9094

arXiv:2403.13198 [pdf, other]

LAP, Using Action Feasibility for Improved Uncertainty Alignment of Large Language Model Planners

Authors: James F. Mullen Jr., Dinesh Manocha

Abstract: Large language models (LLMs) showcase many desirable traits for intelligent and helpful robots. However, they are also known to hallucinate predictions. This issue is exacerbated in robotics where LLM hallucinations may result in robots confidently executing plans that are contrary to user goals, relying more frequently on human assistance, or preventing the robot from asking for help at all. In this work, we present LAP, a novel approach for utilizing off-the-shelf LLMs, alongside a novel Action feasibility metric, in robotic Planners that minimize harmful hallucinations and human intervention. Our key finding is that calculating and leveraging a new metric, which we call A-Feasibility, a measure of whether a given action is possible and safe in the provided scene, helps to mitigate hallucinations in LLM predictions and better align the LLM's confidence measure with the probability of success. We specifically propose an A-Feasibility metric that combines scene context with prompting an LLM to determine if a given action is possible and safe in the scene, using the LLM's response to compute the score. Through experiments in both simulation and the real world on tasks with a variety of ambiguities, we show that LAP significantly increases success rate and decreases the amount of human intervention required relative to prior art. For example, in our real-world testing paradigm, LAP decreases the human help rate of previous methods by over 33% at a success rate of 70%.

Submitted 15 October, 2024; v1 submitted 19 March, 2024; originally announced March 2024.

arXiv:2310.06955 [pdf]

Electronic properties of c-BN/diamond heterostructures for high-frequency high-power applications

Authors: Jeffrey T. Mullen, James A. Boulton, Minghao Pan, Ki Wook Kim

Abstract: Using first principles calculations, this work investigates the suitability of diamond/c-BN heterojunctions for high frequency, high power device applications. The key quantities of band offsets and interface charge polarization are examined for different crystallographic orientations [(110), (111), or (100)], bond terminations (C-B or C-N), and substrates (diamond or c-BN). The results indicate that both the (111) and (100) structures with polar interfaces are likely to be a type-I alignment with the diamond conduction and valence band extrema nested within the c-BN bandgap, whereas the non-polar (110) counterpart may form type II as the valence band of c-BN is shifted down substantially lower. The (111) and (100) structures also show net charge polarization in a narrow region at the interface. The electron-deficient and electron-rich nature of the C-B and C-N bonding are found to induce charge redistribution leading to an essentially 2D sheet of negative and positive polarization. With the predicted band alignments suitable for carrier confinement as well as the possibility of modulation and polarization doping, the diamond/c-BN heterostructures are a promising candidate for high-performance electronic devices with a highly conductive 2D channel. Both p-type and n-type devices appear possible with a judicious choice of the heterojunction configuration.

Submitted 10 October, 2023; originally announced October 2023.

Comments: 18 pages, 5 figures

arXiv:2310.00522 [pdf, other]

Mapping of Internet "Coastlines" via Large Scale Anonymized Network Source Correlations

Authors: Hayden Jananthan, Jeremy Kepner, Michael Jones, William Arcand, David Bestor, William Bergeron, Chansup Byun, Timothy Davis, Vijay Gadepally, Daniel Grant, Michael Houle, Matthew Hubbell, Anna Klein, Lauren Milechin, Guillermo Morales, Andrew Morris, Julie Mullen, Ritesh Patel, Alex Pentland, Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Tyler Trigg, et al. (3 additional authors not shown)

Abstract: Expanding the scientific tools available to protect computer networks can be aided by a deeper understanding of the underlying statistical distributions of network traffic and their potential geometric interpretations. Analyses of large scale network observations provide a unique window into studying those underlying statistics. Newly developed GraphBLAS hypersparse matrices and D4M associative array technologies enable the efficient anonymized analysis of network traffic on the scale of trillions of events. This work analyzes over 100,000,000,000 anonymized packets from the largest Internet telescope (CAIDA) and over 10,000,000 anonymized sources from the largest commercial honeyfarm (GreyNoise). Neither CAIDA nor GreyNoise actively emits Internet traffic, and each provides distinct observations of unsolicited Internet traffic (primarily botnets and scanners). Analysis of these observations confirms the previously observed Cauchy-like distributions describing temporal correlations between Internet sources. The Gull lighthouse problem is a well-known geometric characterization of the standard Cauchy distribution and motivates a potential geometric interpretation for Internet observations. This work generalizes the Gull lighthouse problem to accommodate larger classes of coastlines, deriving a closed-form solution for the resulting probability distributions, stating and examining the inverse problem of identifying an appropriate coastline given a continuous probability distribution, identifying a geometric heuristic for solving this problem computationally, and applying that heuristic to examine the temporal geometry of different subsets of network observations. Application of this method to the CAIDA and GreyNoise data reveals a difference of several orders of magnitude between known benign and other traffic, which can lead to potentially novel ways to protect networks.

Submitted 30 September, 2023; originally announced October 2023.

Comments: 9 pages, 7 figures, IEEE HPEC 2023 (accepted)
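
For readers unfamiliar with the Gull lighthouse problem named in the abstract above, the classical straight-coastline case can be written out briefly. This is the standard textbook setup, not the paper's generalization; the symbols $x_0$, $y_0$, and $\theta$ are notation chosen here.

```latex
% Lighthouse at position x_0 along a straight coast, a distance y_0 > 0 offshore.
% Each flash leaves at an angle \theta drawn uniformly from (-\pi/2, \pi/2)
% and is recorded at the point x where it hits the coast.
\begin{align}
  x &= x_0 + y_0 \tan\theta,
  \qquad \theta \sim \mathrm{Uniform}\!\left(-\tfrac{\pi}{2}, \tfrac{\pi}{2}\right), \\
  p(x) &= \frac{1}{\pi}\left|\frac{d\theta}{dx}\right|
        = \frac{1}{\pi}\,\frac{y_0}{y_0^{2} + (x - x_0)^{2}} .
\end{align}
% With x_0 = 0 and y_0 = 1 this is the standard Cauchy density 1 / (\pi (1 + x^2)).
```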

arXiv:2309.03931 [pdf]

doi 10.1109/HPEC58863.2023.10363604

pPython Performance Study

Authors: Chansup Byun, William Arcand, David Bestor, Bill Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Hayden Jananthan, Michael Jones, Anna Klein, Peter Michaleas, Lauren Milechin, Guillermo Morales, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Jeremy Kepner

Abstract: pPython seeks to provide a parallel capability that provides good speed-up without sacrificing the ease of programming in Python by implementing partitioned global array semantics (PGAS) on top of a simple file-based messaging library (PythonMPI) in pure Python. pPython follows a SPMD (single program multiple data) model of computation. pPython runs on a single node (e.g., a laptop) running Windows, Linux, or macOS, or on any combination of heterogeneous systems that support Python, including on a cluster through a Slurm scheduler interface so that pPython can be executed in a massively parallel computing environment. Because of its unique file-based messaging implementation, it is interesting to see what performance pPython can achieve compared to traditional socket-based MPI communication. In this paper, we present the point-to-point and collective communication performance of pPython and compare it with that obtained by using mpi4py with OpenMPI. For large messages, pPython demonstrates performance comparable to mpi4py.

Submitted 7 September, 2023; originally announced September 2023.

Comments: arXiv admin note: substantial text overlap with arXiv:2208.14908

arXiv:2309.02464 [pdf, other]

doi 10.1109/HPEC58863.2023.10363581

Deployment of Real-Time Network Traffic Analysis using GraphBLAS Hypersparse Matrices and D4M Associative Arrays

Authors: Michael Jones, Jeremy Kepner, Andrew Prout, Timothy Davis, William Arcand, David Bestor, William Bergeron, Chansup Byun, Vijay Gadepally, Michael Houle, Matthew Hubbell, Hayden Jananthan, Anna Klein, Lauren Milechin, Guillermo Morales, Julie Mullen, Ritesh Patel, Sandeep Pisharody, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Peter Michaleas

Abstract: Matrix/array analysis of networks can provide significant insight into their behavior and aid in their operation and protection. Prior work has demonstrated the analytic, performance, and compression capabilities of GraphBLAS (graphblas.org) hypersparse matrices and D4M (d4m.mit.edu) associative arrays (a mathematical superset of matrices). Obtaining the benefits of these capabilities requires integrating them into operational systems, which comes with its own unique challenges. This paper describes two examples of real-time operational implementations. The first is an operational GraphBLAS implementation that constructs anonymized hypersparse matrices on a high-bandwidth network tap. The second is an operational D4M implementation that analyzes daily cloud gateway logs. The architectures of these implementations are presented. Detailed measurements of the resources and the performance are collected and analyzed. The implementations are capable of meeting their operational requirements using modest computational resources (a couple of processing cores). GraphBLAS is well-suited for low-level analysis of high-bandwidth connections with relatively structured network data. D4M is well-suited for higher-level analysis of more unstructured data. This work demonstrates that these technologies can be implemented in operational settings.

Submitted 8 December, 2023; v1 submitted 4 September, 2023; originally announced September 2023.

Comments: Accepted to IEEE HPEC, 8 pages, 8 figures, 1 table, 69 references. arXiv admin note: text overlap with arXiv:2203.13934. text overlap with arXiv:2309.01806

arXiv:2309.01806 [pdf, other]

doi 10.1109/HPEC58863.2023.10363471

Focusing and Calibration of Large Scale Network Sensors using GraphBLAS Anonymized Hypersparse Matrices

Authors: Jeremy Kepner, Michael Jones, Phil Dykstra, Chansup Byun, Timothy Davis, Hayden Jananthan, William Arcand, David Bestor, William Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Anna Klein, Lauren Milechin, Guillermo Morales, Julie Mullen, Ritesh Patel, Alex Pentland, Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Tyler Trigg, Charles Yee, et al. (1 additional author not shown)

Abstract: Defending community-owned cyberspace requires community-based efforts. Large-scale network observations that uphold the highest regard for privacy are key to protecting our shared cyberspace. Deployment of the necessary network sensors requires careful sensor placement, focusing, and calibration with significant volumes of network observations. This paper demonstrates novel focusing and calibration procedures on a multi-billion packet dataset using high-performance GraphBLAS anonymized hypersparse matrices. The run-time performance on a real-world data set confirms previously observed real-time processing rates for high-bandwidth links while achieving significant data compression. The output of the analysis demonstrates the effectiveness of these procedures at focusing the traffic matrix and revealing the underlying stable heavy-tail statistical distributions that are necessary for anomaly detection. A simple model of the corresponding probability of detection ($p_{\rm d}$) and probability of false alarm ($p_{\rm fa}$) for these distributions highlights the criticality of network sensor focusing and calibration. Once a sensor is properly focused and calibrated it is then in a position to carry out two of the central tenets of good cybersecurity: (1) continuous observation of the network and (2) minimizing unbrokered network connections.

Submitted 4 September, 2023; originally announced September 2023.

Comments: Accepted to IEEE HPEC, 9 pages, 12 figures, 1 table, 63 references, 2 appendices

arXiv:2303.14255 [pdf, other]

doi 10.1109/TVCG.2023.3247054

PACE: Data-Driven Virtual Agent Interaction in Dense and Cluttered Environments

Authors: James Mullen, Dinesh Manocha

Abstract: We present PACE, a novel method for modifying motion-captured virtual agents to interact with and move throughout dense, cluttered 3D scenes. Our approach changes a given motion sequence of a virtual agent as needed to adjust to the obstacles and objects in the environment. We first take the individual frames of the motion sequence most important for modeling interactions with the scene and pair them with the relevant scene geometry, obstacles, and semantics such that interactions in the agent's motion match the affordances of the scene (e.g., standing on a floor or sitting in a chair). We then optimize the motion of the human by directly altering the high-DOF pose at each frame in the motion to better account for the unique geometric constraints of the scene. Our formulation uses novel loss functions that maintain a realistic flow and natural-looking motion. We compare our method with prior motion generating techniques and highlight the benefits of our method with a perceptual study and physical plausibility metrics. Human raters preferred our method over the prior approaches. Specifically, they preferred our method 57.1% of the time versus the state-of-the-art method using existing motions, and 81.0% of the time versus a state-of-the-art motion synthesis method. Additionally, our method scores significantly higher on established physical plausibility and interaction metrics. Specifically, we outperform competing methods by over 1.2% in terms of the non-collision metric and by over 18% in terms of the contact metric. We have integrated our interactive system with Microsoft HoloLens and demonstrate its benefits in real-world indoor scenes. Our project website is available at https://gamma.umd.edu/pace/.

Submitted 24 March, 2023; originally announced March 2023.

Journal ref: IEEE Transactions on Visualization and Computer Graphics 29.5 (2023) 2536-2546

arXiv:2303.04901 [pdf, other]

Towards Driving Policies with Personality: Modeling Behavior and Style in Risky Scenarios via Data Collection in Virtual Reality

Authors: Laura Zheng, Julio Poveda, James Mullen, Shreelekha Revankar, Ming C. Lin

Abstract: Autonomous driving research currently faces data sparsity in representation of risky scenarios. Such data is both difficult to obtain ethically in the real world, and unreliable to obtain via simulation. Recent advances in virtual reality (VR) driving simulators lower barriers to tackling this problem in simulation. We propose the first data collection framework for risky scenario driving data from real humans using VR, as well as accompanying numerical driving personality characterizations. We validate the resulting dataset with statistical analyses and model driving behavior with an eight-factor personality vector based on the Multi-dimensional Driving Style Inventory (MDSI). Our method, dataset, and analyses show that realistic driving personalities can be modeled without deep learning or large datasets to complement autonomous driving research.

Submitted 8 March, 2023; originally announced March 2023.

arXiv:2303.03480 [pdf, other]

doi 10.1109/LRA.2023.3346800

Can an Embodied Agent Find Your "Cat-shaped Mug"? LLM-Guided Exploration for Zero-Shot Object Navigation

Authors: Vishnu Sashank Dorbala, James F. Mullen Jr., Dinesh Manocha

Abstract: We present LGX (Language-guided Exploration), a novel algorithm for Language-Driven Zero-Shot Object Goal Navigation (L-ZSON), where an embodied agent navigates to a uniquely described target object in a previously unseen environment. Our approach makes use of Large Language Models (LLMs) for this task by leveraging the LLM's commonsense reasoning capabilities for making sequential navigational decisions. Simultaneously, we perform generalized target object detection using a pre-trained Vision-Language grounding model. We achieve state-of-the-art zero-shot object navigation results on RoboTHOR with a success rate (SR) improvement of over 27% over the current baseline of the OWL-ViT CLIP on Wheels (OWL CoW). Furthermore, we study the usage of LLMs for robot navigation and present an analysis of various prompting strategies affecting the model output. Finally, we showcase the benefits of our approach via real-world experiments that indicate the superior performance of LGX in detecting and navigating to visually unique objects.

Submitted 5 November, 2023; v1 submitted 6 March, 2023; originally announced March 2023.

Comments: 10 pages

Journal ref: IEEE Robotics and Automation Letters 9.5 (2024) 4083-4090

arXiv:2301.03777 [pdf, other]

doi 10.3847/1538-4357/acb20a

RR Lyrae mid-infrared Period-Luminosity-Metallicity and Period-Wesenheit-Metallicity relations based on Gaia DR3 parallaxes

Authors: Joseph P. Mullen, Massimo Marengo, Clara E. Martínez-Vázquez, Brian Chaboyer, Giuseppe Bono, Vittorio F. Braga, Massimo Dall'Ora, Valentina D'Orazi, Michele Fabrizio, Matteo Monelli, Frédéric Thévenin

Abstract: We present new empirical infrared Period-Luminosity-Metallicity (PLZ) and Period-Wesenheit-Metallicity (PWZ) relations for RR Lyrae based on the latest Gaia EDR3 parallaxes. The relations are provided in the WISE $W1$ and $W2$ bands, as well as in the $W(W1, V - W1)$ and $W(W2, V - W2)$ Wesenheit magnitudes. The relations are calibrated using a very large sample of Galactic halo field RR Lyrae stars with homogeneous spectroscopic [Fe/H] abundances (over 1,000 stars in the $W1$ band), covering a broad range of metallicities ($-2.5 \lesssim \textrm{[Fe/H]} \lesssim 0.0$). We test the performance of our PLZ and PWZ relations by determining the distance moduli of both Galactic and extragalactic stellar associations: the Sculptor dwarf spheroidal galaxy in the Local Group (finding $\bar{\mu}_{0}=19.47 \pm 0.06$), the Galactic globular cluster M4 ($\bar{\mu}_{0}=11.16 \pm 0.05$) and the Reticulum globular cluster in the Large Magellanic Cloud ($\bar{\mu}_{0}=18.23 \pm 0.06$). The distance moduli determined through all our relations are internally self-consistent (within $\lesssim$ 0.05 mag) but are systematically smaller (by $\sim$ 2-3$\sigma$) than previous literature measurements taken from a variety of methods/anchors. However, a comparison with similar recent RR Lyrae empirical relations anchored with EDR3 likewise shows, to varying extents, a systematically smaller distance modulus for PLZ/PWZ RR Lyrae relations.

Submitted 9 January, 2023; originally announced January 2023.

Comments: Accepted by ApJ, 14 pages, 5 Figures, 2 Tables

arXiv:2209.06314 [pdf, other]

doi 10.1109/WACV56688.2023.00038

Placing Human Animations into 3D Scenes by Learning Interaction- and Geometry-Driven Keyframes

Authors: James F. Mullen Jr, Divya Kothandaraman, Aniket Bera, Dinesh Manocha

Abstract: We present a novel method for placing a 3D human animation into a 3D scene while maintaining any human-scene interactions in the animation. We use the notion of computing the most important meshes in the animation for the interaction with the scene, which we call "keyframes." These keyframes allow us to better optimize the placement of the animation into the scene such that interactions in the animations (standing, lying, sitting, etc.) match the affordances of the scene (e.g., standing on the floor or lying in a bed). We compare our method, which we call PAAK, with prior approaches, including POSA, PROX ground truth, and a motion synthesis method, and highlight the benefits of our method with a perceptual study. Human raters preferred our PAAK method over the PROX ground truth data 64.6% of the time. Additionally, in direct comparisons, the raters preferred PAAK over competing methods, including 61.5% of the time compared to POSA.

Submitted 13 September, 2022; originally announced September 2022.

Comments: WACV 2023. Our project website is available at https://gamma.umd.edu/paak/

Journal ref: IEEE/CVF Winter Conference on the Applications of Computer Vision (2023)

arXiv:2209.05725 [pdf, other]

Hypersparse Network Flow Analysis of Packets with GraphBLAS

Authors: Tyler Trigg, Chad Meiners, Sandeep Pisharody, Hayden Jananthan, Michael Jones, Adam Michaleas, Timothy Davis, Erik Welch, William Arcand, David Bestor, William Bergeron, Chansup Byun, Vijay Gadepally, Micheal Houle, Matthew Hubbell, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Doug Stetson, Charles Yee , et al. (1 additional authors not shown)

Abstract: Internet analysis is a major challenge due to the volume and rate of network traffic. In lieu of analyzing traffic as raw packets, network analysts often rely on compressed network flows (netflows) that contain the start time, stop time, source, destination, and number of packets in each direction. However, many traffic analyses benefit from temporal aggregation of multiple simultaneous netflows, which can be computationally challenging. To alleviate this concern, a novel netflow compression and resampling method has been developed leveraging GraphBLAS hypersparse traffic matrices that preserve anonymization while enabling subrange analysis. Standard multitemporal spatial analyses are then performed on each subrange to generate detailed statistical aggregates of the source packets, source fan-out, unique links, destination fan-in, and destination packets of each subrange, which can then be used for background modeling and anomaly detection. A simple file format based on GraphBLAS sparse matrices is developed for storing these statistical aggregates. This method is scale tested on the MIT SuperCloud using a 50 trillion packet netflow corpus from several hundred sites collected over several months. The resulting compression achieved is significant (<0.1 bit per packet), enabling extremely large netflow analyses to be stored and transported. The single node parallel performance is analyzed in terms of both processors and threads, showing that a single node can perform hundreds of simultaneous analyses at over a million packets/sec (roughly equivalent to a 10 Gigabit link).

Submitted 13 September, 2022; originally announced September 2022.

Comments: arXiv admin note: text overlap with arXiv:2203.13934, arXiv:2108.06653, arXiv:2008.00307

arXiv:2209.00602 [pdf, other]

doi 10.1109/HPEC55821.2022.9926316

Python Implementation of the Dynamic Distributed Dimensional Data Model

Authors: Hayden Jananthan, Lauren Milechin, Michael Jones, William Arcand, William Bergeron, David Bestor, Chansup Byun, Michael Houle, Matthew Hubbell, Vijay Gadepally, Anna Klein, Peter Michaleas, Guillermo Morales, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Jeremy Kepner

Abstract: Python has become a standard scientific computing language with fast-growing support of machine learning and data analysis modules, as well as an increasing usage of big data. The Dynamic Distributed Dimensional Data Model (D4M) offers a highly composable, unified data model with strong performance built to handle big data fast and efficiently. In this work we present an implementation of D4M in Python. $D4M.py$ implements all foundational functionality of D4M and includes Accumulo and SQL database support via Graphulo. We describe the mathematical background and motivation, an explanation of the approaches made for its fundamental functions and building blocks, and performance results which compare $D4M.py$'s performance to D4M-MATLAB and D4M.jl.

Submitted 22 November, 2022; v1 submitted 1 September, 2022; originally announced September 2022.

Comments: 8 pages, 7 figures, accepted to HPEC 2022

arXiv:2208.14908 [pdf]

doi 10.1109/HPEC55821.2022.9926365

pPython for Parallel Python Programming

Authors: Chansup Byun, William Arcand, David Bestor, Bill Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Hayden Jananthan, Michael Jones, Kurt Keville, Anna Klein, Peter Michaleas, Lauren Milechin, Guillermo Morales, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Jeremy Kepner

Abstract: pPython seeks to provide a parallel capability that provides good speed-up without sacrificing the ease of programming in Python by implementing partitioned global array semantics (PGAS) on top of a simple file-based messaging library (PythonMPI) in pure Python. The core data structure in pPython is a distributed numerical array whose distribution onto multiple processors is specified with a map construct. Communication operations between distributed arrays are abstracted away from the user and pPython transparently supports redistribution between any block-cyclic-overlapped distributions in up to four dimensions. pPython follows a SPMD (single program multiple data) model of computation. pPython runs on any combination of heterogeneous systems that support Python, including Windows, Linux, and MacOS operating systems. In addition to running transparently on single-node (e.g., a laptop), pPython provides a scheduler interface, so that pPython can be executed in a massively parallel computing environment. The initial implementation uses the Slurm scheduler. Performance of pPython on the HPC Challenge benchmark suite demonstrates both ease of programming and scalability.

Submitted 31 August, 2022; originally announced August 2022.

Comments: arXiv admin note: substantial text overlap with arXiv:astro-ph/0606464

arXiv:2205.15143 [pdf, other]

doi 10.3847/1538-4357/ac7468

On the dwarf irregular galaxy NGC 6822. I. Young, intermediate and old stellar populations

Authors: Maria Tantalo, Massimo Dall'Ora, Giuseppe Bono, Peter B. Stetson, Michele Fabrizio, Ivan Ferraro, Mario Nonino, Vittorio F. Braga, Ronaldo da Silva, Giuliana Fiorentino, Giacinto Iannicola, Massimo Marengo, Matteo Monelli, Joseph P. Mullen, Adriano Pietrinferni, Maurizio Salaris

Abstract: We present accurate and deep multi-band ($g,r,i$) photometry of the Local Group dwarf irregular galaxy NGC 6822. The images were collected with wide field cameras at 2m/4m- (INT, CTIO, CFHT) and 8m-class telescopes (SUBARU) covering a 2 square degree FoV across the center of the galaxy. We performed PSF photometry of $\approx$7,000 CCD images and the final catalog includes more than 1 million objects. We developed a new approach to identify candidate field and galaxy stars, and performed a new estimate of the galaxy center by using old stellar tracers, finding that it differs by 1.15 (RA) and 1.53 (DEC) arcmin from previous estimates. We also found that young (Main Sequence, Red Supergiants), intermediate (Red Clump, Asymptotic Giant Branch [AGB]) and old (Red Giant Branch [RGB]) stars display different radial distributions. The old stellar population is spherically distributed and extends to radial distances larger than previously estimated ($\sim$1 degree). The young population shows a well defined bar and a disk-like distribution, as suggested by radio measurements, that is off-center compared with the old population. We discuss pros and cons of the different diagnostics adopted to identify AGB stars and develop new ones based on optical-NIR-MIR color-color diagrams (CCDs) to characterize Oxygen and Carbon (C) rich stars. We found a mean population ratio between Carbon and M-type (C/M) stars of 0.67$\pm$0.08 (optical/NIR/MIR) and we used the observed C/M ratio with empirical C/M-metallicity relations to estimate a mean iron abundance of [Fe/H]$\sim$-1.25 ($\sigma$=0.04 dex) that agrees quite well with literature estimates.

Submitted 30 May, 2022; originally announced May 2022.

Comments: Accepted for publication in ApJ, 34 pages, 22 figures, 6 tables

arXiv:2204.07627 [pdf, other]

doi 10.3847/1538-4357/ac67ee

Metallicity of Galactic RR Lyrae from Optical and Infrared Light Curves: II. Period-Fourier-Metallicity Relations for First Overtone RR Lyrae

Authors: Joseph P. Mullen, Massimo Marengo, Clara E. Martínez-Vázquez, Giuseppe Bono, Vittorio F. Braga, Brian Chaboyer, Juliana Crestani, Massimo Dall'Ora, Michele Fabrizio, Giuliana Fiorentino, Matteo Monelli, Jillian R. Neeley, Peter B. Stetson, Frédéric Thévenin

Abstract: We present new period-$\phi_{31}$-[Fe/H] relations for first overtone RRL stars (RRc), calibrated over a broad range of metallicities ($-2.5 < \textrm{[Fe/H]}< 0.0$) utilizing the largest currently available set of Galactic halo field RRL with homogeneous spectroscopic metallicities. Our relations are defined in the optical (ASAS-SN $V$-band) and, inaugurally, in the infrared (WISE $W1$ and $W2$ bands). Our $V$-band relation can reproduce individual RRc spectroscopic metallicities with a dispersion of 0.30 dex over the entire metallicity range of our calibrator sample (an RMS smaller than what we found for other relations in literature including non-linear terms). Our infrared relation has a similar dispersion in the low and intermediate metallicity range ($\textrm{[Fe/H]} < -0.5$) but tends to underestimate the [Fe/H] abundance around solar metallicity. We tested our relations by measuring both the metallicity of the Sculptor dSph and a sample of Galactic globular clusters, rich in both RRc and RRab stars. The average metallicity we obtain for the combined RRL sample in each cluster is within $\pm 0.08$ dex of their spectroscopic metallicities. The infrared and optical relations presented in this work will enable deriving reliable photometric RRL metallicities in conditions where spectroscopic measurements are not feasible; e.g., in distant galaxies or reddened regions (observed with upcoming Extremely Large Telescopes and the James Webb Space Telescope), or in the large sample of new RRL that will be discovered in large-area time-domain photometric surveys (such as LSST and the Roman space telescope).

Submitted 15 April, 2022; originally announced April 2022.

Comments: Accepted by ApJ, 22 pages, 9 Figures, 3 Tables

arXiv:2204.05839 [pdf, ps, other]

doi 10.1109/IPDPSW55747.2022.00122

The MIT Supercloud Workload Classification Challenge

Authors: Benny J. Tang, Qiqi Chen, Matthew L. Weiss, Nathan Frey, Joseph McDonald, David Bestor, Charles Yee, William Arcand, Chansup Byun, Daniel Edelman, Matthew Hubbell, Michael Jones, Jeremy Kepner, Anna Klein, Adam Michaleas, Peter Michaleas, Lauren Milechin, Julia Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Andrew Bowne, Lindsey McEvoy, Baolin Li, Devesh Tiwari , et al. (2 additional authors not shown)

Abstract: High-Performance Computing (HPC) centers and cloud providers support an increasingly diverse set of applications on heterogeneous hardware. As Artificial Intelligence (AI) and Machine Learning (ML) workloads have become an increasingly larger share of the compute workloads, new approaches to optimized resource usage, allocation, and deployment of new AI frameworks are needed. By identifying compute workloads and their utilization characteristics, HPC systems may be able to better match available resources with the application demand. By leveraging datacenter instrumentation, it may be possible to develop AI-based approaches that can identify workloads and provide feedback to researchers and datacenter operators for improving operational efficiency. To enable this research, we released the MIT Supercloud Dataset, which provides detailed monitoring logs from the MIT Supercloud cluster. This dataset includes CPU and GPU usage by jobs, memory usage, and file system logs. In this paper, we present a workload classification challenge based on this dataset. We introduce a labelled dataset that can be used to develop new approaches to workload classification and present initial results based on existing approaches. The goal of this challenge is to foster algorithmic innovations in the analysis of compute workloads that can achieve higher accuracy than existing methods. Data and code will be made publicly available via the Datacenter Challenge website: https://dcc.mit.edu.

Submitted 13 April, 2022; v1 submitted 12 April, 2022; originally announced April 2022.

Comments: Accepted at IPDPS ADOPT'22

arXiv:2203.13934 [pdf, other]

doi 10.1109/HPEC55821.2022.9926332

GraphBLAS on the Edge: Anonymized High Performance Streaming of Network Traffic

Authors: Michael Jones, Jeremy Kepner, Daniel Andersen, Aydin Buluc, Chansup Byun, K Claffy, Timothy Davis, William Arcand, Jonathan Bernays, David Bestor, William Bergeron, Vijay Gadepally, Micheal Houle, Matthew Hubbell, Hayden Jananthan, Anna Klein, Chad Meiners, Lauren Milechin, Julie Mullen, Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Jon Sreekanth , et al. (3 additional authors not shown)

Abstract: Long range detection is a cornerstone of defense in many operating domains (land, sea, undersea, air, space, ...). In the cyber domain, long range detection requires the analysis of significant network traffic from a variety of observatories and outposts. Construction of anonymized hypersparse traffic matrices on edge network devices can be a key enabler by providing significant data compression in a rapidly analyzable format that protects privacy. GraphBLAS is ideally suited for both constructing and analyzing anonymized hypersparse traffic matrices. The performance of GraphBLAS on an Accolade Technologies edge network device is demonstrated on a near worst case traffic scenario using a continuous stream of CAIDA Telescope darknet packets. The performance for varying numbers of traffic buffers, threads, and processor cores is explored. Anonymized hypersparse traffic matrices can be constructed at a rate of over 50,000,000 packets per second, exceeding a typical 400 Gigabit network link. This performance demonstrates that anonymized hypersparse traffic matrices are readily computable on edge network devices with minimal compute resources and can be a viable data product for such devices.

Submitted 5 September, 2022; v1 submitted 25 March, 2022; originally announced March 2022.

Comments: Accepted to IEEE HPEC, Outstanding Paper Award, 8 pages, 8 figures, 1 table, 70 references. arXiv admin note: text overlap with arXiv:2108.06653, arXiv:2008.00307, arXiv:2203.10230

arXiv:2203.10230 [pdf, other]

doi 10.1109/IPDPSW55747.2022.00054

Temporal Correlation of Internet Observatories and Outposts

Authors: Jeremy Kepner, Michael Jones, Daniel Andersen, Aydın Buluç, Chansup Byun, K Claffy, Timothy Davis, William Arcand, Jonathan Bernays, David Bestor, William Bergeron, Vijay Gadepally, Daniel Grant, Micheal Houle, Matthew Hubbell, Hayden Jananthan, Anna Klein, Chad Meiners, Lauren Milechin, Andrew Morris, Julie Mullen, Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa , et al. (4 additional authors not shown)

Abstract: The Internet has become a critical component of modern civilization requiring scientific exploration akin to endeavors to understand the land, sea, air, and space environments. Understanding the baseline statistical distributions of traffic is essential to the scientific understanding of the Internet. Correlating data from different Internet observatories and outposts can be a useful tool for gaining insights into these distributions. This work compares observed sources from the largest Internet telescope (the CAIDA darknet telescope) with those from a commercial outpost (the GreyNoise honeyfarm). Neither of these locations actively emits Internet traffic, and both provide distinct observations of unsolicited Internet traffic (primarily botnets and scanners). Newly developed GraphBLAS hypersparse matrices and D4M associative array technologies enable the efficient analysis of these data on significant scales. The CAIDA sources are well approximated by a Zipf-Mandelbrot distribution. Over a 6-month period 70% of the brightest (highest frequency) sources in the CAIDA telescope are consistently detected by coeval observations in the GreyNoise honeyfarm. This overlap drops as the sources dim (reduce frequency) and as the time difference between the observations grows. The probability of seeing a CAIDA source is proportional to the logarithm of the brightness. The temporal correlations are well described by a modified Cauchy distribution. These observations are consistent with a correlated high frequency beam of sources that drifts on a time scale of a month.

Submitted 18 March, 2022; originally announced March 2022.

Comments: 8 pages, 8 figures, 2 tables, 59 references; accepted to GrAPL 2022. arXiv admin note: substantial text overlap with arXiv:2108.06653

arXiv:2109.01747 [pdf, other]

doi 10.1109/LRA.2021.3111055

Communicating Inferred Goals with Passive Augmented Reality and Active Haptic Feedback

Authors: James F. Mullen Jr, Josh Mosier, Sounak Chakrabarti, Anqi Chen, Tyler White, Dylan P. Losey

Abstract: Robots learn as they interact with humans. Consider a human teleoperating an assistive robot arm: as the human guides and corrects the arm's motion, the robot gathers information about the human's desired task. But how does the human know what their robot has inferred? Today's approaches often focus on conveying intent: for instance, relying on legible motions or gestures to indicate what the robot is planning. However, closing the loop on robot inference requires more than just revealing the robot's current policy: the robot should also display the alternatives it thinks are likely, and prompt the human teacher when additional guidance is necessary. In this paper we propose a multimodal approach for communicating robot inference that combines both passive and active feedback. Specifically, we leverage information-rich augmented reality to passively visualize what the robot has inferred, and attention-grabbing haptic wristbands to actively prompt and direct the human's teaching. We apply our system to shared autonomy tasks where the robot must infer the human's goal in real-time. Within this context, we integrate passive and active modalities into a single algorithmic framework that determines when and which type of feedback to provide. Combining both passive and active feedback experimentally outperforms single modality baselines; during an in-person user study, we demonstrate that our integrated approach increases how efficiently humans teach the robot while simultaneously decreasing the amount of time humans spend interacting with the robot. Videos here: https://youtu.be/swq_u4iIP-g

Submitted 3 September, 2021; originally announced September 2021.

Comments: 8 pages, 5 figures

Journal ref: IEEE Robotics and Automation Letters 6.4 (2021) 8522-8529

arXiv:2108.11525 [pdf, other]

doi 10.1109/HPEC49654.2021.9622808

Supercomputing Enabled Deployable Analytics for Disaster Response

Authors: Kaira Samuel, Jeremy Kepner, Michael Jones, Lauren Milechin, Vijay Gadepally, William Arcand, David Bestor, William Bergeron, Chansup Byun, Matthew Hubbell, Michael Houle, Anna Klein, Victor Lopez, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Sid Samsi, Charles Yee, Peter Michaleas

Abstract: First responders and other forward deployed essential workers can benefit from advanced analytics. Limited network access and software security requirements prevent the usage of standard cloud based microservice analytic platforms that are typically used in industry. One solution is to precompute a wide range of analytics as files that can be used with standard preinstalled software that does not require network access or additional software and can run on a wide range of legacy hardware. In response to the COVID-19 pandemic, this approach was tested for providing geo-spatial census data to allow quick analysis of demographic data for better responding to emergencies. These data were processed using the MIT SuperCloud to create several thousand Google Earth and Microsoft Excel files representative of many advanced analytics. The fast mapping of census data using Google Earth and Microsoft Excel has the potential to give emergency responders a powerful tool to improve emergency preparedness. Our approach displays relevant census data (total population, population under 15, population over 65, median age) per census block, sorted by county, through a Microsoft Excel spreadsheet (xlsx file) and Google Earth map (kml file). The spreadsheet interface includes features that allow users to convert between different longitude and latitude coordinate units. For the Google Earth files, a variety of absolute and relative color maps of population density have been explored to provide an intuitive and meaningful interface. Using several hundred cores on the MIT SuperCloud, new analytics can be generated in a few minutes.

Submitted 25 August, 2021; originally announced August 2021.

Comments: 5 pages, 11 figures, 17 references, accepted to IEEE HPEC 2021

arXiv:2108.11359 [pdf]

doi 10.1109/HPEC49654.2021.9622870

Node-Based Job Scheduling for Large Scale Simulations of Short Running Jobs

Authors: Chansup Byun, William Arcand, David Bestor, Bill Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Michael Jones, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Jeremy Kepner

Abstract: Diverse workloads such as interactive supercomputing, big data analysis, and large-scale AI algorithm development require a high-performance scheduler. This paper presents a novel node-based scheduling approach for large scale simulations of short running jobs on MIT SuperCloud systems that allows the resources to be fully utilized for long running batch jobs while simultaneously providing fast launch and release of large-scale short running jobs. The node-based scheduling approach has demonstrated up to 100 times faster scheduler performance than other state-of-the-art systems.

Submitted 25 August, 2021; originally announced August 2021.

Comments: IEEE HPEC 2021

arXiv:2108.06653 [pdf, other]

doi 10.1109/HPEC49654.2021.9622790

Spatial Temporal Analysis of 40,000,000,000,000 Internet Darkspace Packets

Authors: Jeremy Kepner, Michael Jones, Daniel Andersen, Aydin Buluc, Chansup Byun, K Claffy, Timothy Davis, William Arcand, Jonathan Bernays, David Bestor, William Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Anna Klein, Chad Meiners, Lauren Milechin, Julie Mullen, Sandeep Pisharody, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Doug Stetson, Adam Tse, et al. (2 additional authors not shown)

Abstract: The Internet has never been more important to our society, and understanding the behavior of the Internet is essential. The Center for Applied Internet Data Analysis (CAIDA) Telescope observes a continuous stream of packets from an unsolicited darkspace representing 1/256 of the Internet. During 2019 and 2020 over 40,000,000,000,000 unique packets were collected representing the largest ever assembled public corpus of Internet traffic. Using the combined resources of the Supercomputing Centers at UC San Diego, Lawrence Berkeley National Laboratory, and MIT, the spatial temporal structure of anonymized source-destination pairs from the CAIDA Telescope data has been analyzed with GraphBLAS hierarchical hypersparse matrices. These analyses provide unique insight on this unsolicited Internet darkspace traffic with the discovery of many previously unseen scaling relations. The data show a significant sustained increase in unsolicited traffic corresponding to the start of the COVID-19 pandemic, but relatively little change in the underlying scaling relations associated with unique sources, source fan-outs, unique links, destination fan-ins, and unique destinations. This work provides a demonstration of the practical feasibility and benefit of the safe collection and analysis of significant quantities of anonymized Internet traffic.

Submitted 14 August, 2021; originally announced August 2021.

Comments: 8 pages, 9 figures, 2 tables, 43 references, accepted to IEEE HPEC 2021. arXiv admin note: substantial text overlap with arXiv:2008.00307

arXiv:2108.06650 [pdf, other]

doi 10.1109/HPEC49654.2021.9622802

Vertical, Temporal, and Horizontal Scaling of Hierarchical Hypersparse GraphBLAS Matrices

Authors: Jeremy Kepner, Tim Davis, Chansup Byun, William Arcand, David Bestor, William Bergeron, Vijay Gadepally, Matthew Hubbell, Michael Houle, Michael Jones, Anna Klein, Lauren Milechin, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Charles Yee, Peter Michaleas

Abstract: Hypersparse matrices are a powerful enabler for a variety of network, health, finance, and social applications. Hierarchical hypersparse GraphBLAS matrices enable rapid streaming updates while preserving algebraic analytic power and convenience. In many contexts, the rate of these updates sets the bounds on performance. This paper explores hierarchical hypersparse update performance on a variety of hardware with identical software configurations. The high-level language bindings of the GraphBLAS readily enable performance experiments on simultaneous diverse hardware. The best single process performance measured was 4,000,000 updates per second. The best single node performance measured was 170,000,000 updates per second. The hardware used spans nearly a decade and allows a direct comparison of hardware improvements for this computation over this time range, showing a 2x increase in single-core performance, a 3x increase in single process performance, and a 5x increase in single node performance. Running on nearly 2,000 MIT SuperCloud nodes simultaneously achieved a sustained update rate of over 200,000,000,000 updates per second. Hierarchical hypersparse GraphBLAS allows the MIT SuperCloud to analyze extremely large streaming network data sets.

Submitted 14 August, 2021; originally announced August 2021.

Comments: 6 pages, 5 figures, 32 references, accepted to IEEE HPEC 2021. arXiv admin note: text overlap with arXiv:2001.06935

arXiv:2108.02037 [pdf]

The MIT Supercloud Dataset

Authors: Siddharth Samsi, Matthew L Weiss, David Bestor, Baolin Li, Michael Jones, Albert Reuther, Daniel Edelman, William Arcand, Chansup Byun, John Holodnack, Matthew Hubbell, Jeremy Kepner, Anna Klein, Joseph McDonald, Adam Michaleas, Peter Michaleas, Lauren Milechin, Julia Mullen, Charles Yee, Benjamin Price, Andrew Prout, Antonio Rosa, Allan Vanterpool, Lindsey McEvoy, Anson Cheng , et al. (2 additional authors not shown)

Abstract: Artificial intelligence (AI) and Machine learning (ML) workloads are an increasingly large share of the compute workloads in traditional High-Performance Computing (HPC) centers and commercial cloud systems. This has led to changes in deployment approaches of HPC clusters and the commercial cloud, as well as a new focus on approaches to optimized resource usage, allocations and deployment of new AI frameworks, and capabilities such as Jupyter notebooks to enable rapid prototyping and deployment. With these changes, there is a need to better understand cluster/datacenter operations with the goal of developing improved scheduling policies, identifying inefficiencies in resource utilization, energy/power consumption, failure prediction, and identifying policy violations. In this paper we introduce the MIT Supercloud Dataset which aims to foster innovative AI/ML approaches to the analysis of large scale HPC and datacenter/cloud operations. We provide detailed monitoring logs from the MIT Supercloud system, which include CPU and GPU usage by jobs, memory usage, file system logs, and physical monitoring data. This paper discusses the details of the dataset, collection methodology, and data availability, and discusses potential challenge problems being developed using this data. Datasets and future challenge announcements will be available via https://dcc.mit.edu.

Submitted 4 August, 2021; originally announced August 2021.

arXiv:2107.00923 [pdf, other]

doi 10.3847/1538-4357/ac1074

On the Use of Field RR Lyrae as Galactic Probes. V. Optical and radial velocity curve templates

Authors: V. F. Braga, J. Crestani, M. Fabrizio, G. Bono, G. W. Preston, C. Sneden, J. Storm, S. Kamann, M. Latour, H. Lala, B. Lemasle, Z. Prudil, G. Altavilla, B. Chaboyer, M. Dall'Ora, I. Ferraro, C. K. Gilligan, G. Fiorentino, G. Iannicola, L. Inno, S. Kwak, M. Marengo, S. Marinoni, P. M. Marrese, C. E. Martínez-Vázquez , et al. (7 additional authors not shown)

Abstract: We collected the largest spectroscopic catalog of RR Lyrae (RRLs) including $\approx$20,000 high-, medium- and low-resolution spectra for $\approx$10,000 RRLs. We provide the analytical forms of radial velocity curve (RVC) templates. These were built using 36 RRLs (31 fundamental -- split into three period bins -- and 5 first overtone pulsators) with well-sampled RVCs based on three groups of metallic lines (Fe, Mg, Na) and four Balmer lines (H$_α$, H$_β$, H$_γ$, H$_δ$). We tackled the long-standing problem of the reference epoch to anchor light curve and RVC templates. For the $V$-band, we found that the residuals of the templates anchored to the phase of the mean magnitude along the rising branch are $\sim$35\% to $\sim$45\% smaller than those anchored to the phase of maximum light. For the RVC, we used two independent reference epochs for metallic and Balmer lines and we verified that the residuals of the RVC templates anchored to the phase of mean RV are from 30\% (metallic lines) up to 45\% (Balmer lines) smaller than those anchored to the phase of minimum RV. We validated our RVC templates by using both the single- and the three-phase points approach. We found that barycentric velocities based on our RVC templates are two to three times more accurate than those available in the literature. We applied the current RVC templates to Balmer lines RVs of RRLs in the globular NGC 3201 collected with MUSE at VLT. We found the cluster barycentric RV of $V_γ$=496.89$\pm$8.37 (error) $\pm$3.43 (standard deviation) km/s, which agrees well with literature estimates.

Submitted 2 July, 2021; originally announced July 2021.

arXiv:2107.00919 [pdf, other]

doi 10.3847/1538-4357/ac1115

On the use of field RR Lyrae as Galactic probes: IV. New insights into and around the Oosterhoff dichotomy

Authors: M. Fabrizio, V. F. Braga, J. Crestani, G. Bono, I. Ferraro, G. Fiorentino, G. Iannicola, G. W. Preston, C. Sneden, F. Thévenin, G. Altavilla, B. Chaboyer, M. Dall'Ora, R. da Silva, E. K. Grebel, C. K. Gilligan, H. Lala, B. Lemasle, D. Magurno, M. Marengo, S. Marinoni, P. M. Marrese, C. E. Martìnez-Vàzquez, N. Matsunaga, M. Monelli , et al. (8 additional authors not shown)

Abstract: We discuss the largest and most homogeneous spectroscopic dataset of field RR Lyrae variables (RRLs) available to date. We estimated abundances using both high-resolution and low-resolution ({ΔS} method) spectra for fundamental (RRab) and first overtone (RRc) RRLs. The iron abundances for 7,941 RRLs were supplemented with similar literature estimates available, ending up with 9,015 RRLs (6,150 RRab, 2,865 RRc). The metallicity distribution shows a mean value of <[Fe/H]> = -1.51\pm0.01, and σ (standard deviation) = 0.41 dex with a long metal-poor tail approaching [Fe/H] = -3 and a sharp metal-rich tail approaching solar iron abundance. The RRab variables are more metal-rich (<[Fe/H]>ab = -1.48\pm0.01, σ = 0.41 dex) than RRc variables (<[Fe/H]>c = -1.58\pm0.01, σ = 0.40 dex). The relative fractions of RRab variables in the Bailey diagram (visual amplitude vs period) located along the short-period (more metal-rich) and the long-period (more metal-poor) sequences are 80\% and 20\%, while RRc variables display an opposite trend, namely 30\% and 70\%. We found that the pulsation period of both RRab and RRc variables steadily decreases when moving from the metal-poor to the metal-rich regime. The visual amplitude shows the same trend, but RRc amplitudes are almost two times more sensitive than RRab amplitudes to metallicity. We also investigated the dependence of the population ratio (Nc/Ntot) of field RRLs on the metallicity and we found that the distribution is more complex than in globular clusters. The population ratio steadily increases from ~0.25 to ~0.36 in the metal-poor regime, it decreases from ~0.36 to ~0.18 for -1.8 < [Fe/H] < -0.9 and it increases to a value of ~0.3 approaching solar iron abundance.

Submitted 2 July, 2021; originally announced July 2021.

Comments: 22 pages, 13 figures, 3 tables. Accepted for publication in ApJ

arXiv:2104.08113 [pdf, other]

doi 10.3847/1538-4357/abfa23

On the Use of Field RR Lyrae as Galactic Probes. III. The $α$-element abundances

Authors: J. Crestani, V. F. Braga, M. Fabrizio, G. Bono, C. Sneden, G. W. Preston, I. Ferraro, G. Iannicola, M. Nonino, G. Fiorentino, F. Thévenin, B. Lemasle, Z. Prudil, A. Alves-Brito, G. Altavilla, B. Chaboyer, M. Dall'Ora, V. D'Orazi, C. K. Gilligan, E. Grebel, A. J. Koch-Hansen, H. Lala, M. Marengo, S. Marinoni, P. M. Marrese , et al. (11 additional authors not shown)

Abstract: We provide the largest and most homogeneous sample of $α$-element (Mg, Ca, Ti) and iron abundances for field RR Lyrae (RRLs, 162 variables) by using high-resolution spectra. The current measurements were complemented with similar abundances available in the literature for 46 field RRLs brought to our metallicity scale. We ended up with a sample of old (t$\ge$ 10 Gyr), low-mass stellar tracers (208 RRLs: 169 fundamental, 38 first overtone, 1 mixed mode) covering three dex in iron abundance (-3.00$\le$[Fe/H]$\le$0.24). We found that field RRLs are $\sim$0.3 dex more $α$-poor than typical Halo tracers in the metal-rich regime ([Fe/H]$\ge$-1.2), while in the metal-poor regime ([Fe/H]$\le$-2.2) they seem to be on average $\sim$0.1 dex more $α$-enhanced. This is the first time that the depletion in $α$-elements for solar iron abundances is detected on the basis of a large, homogeneous and coeval sample of old stellar tracers. Interestingly, we also detected a close similarity in the [$α$/Fe] trend between $α$-poor, metal-rich RRLs and red giants (RGs) in the Sagittarius dwarf galaxy as well as between $α$-enhanced, metal-poor RRLs and RGs in ultra faint dwarf galaxies. These results are supported by similar elemental abundances for 46 field Horizontal Branch (HB) stars. These stars share with RRLs the same evolutionary phase and the same progenitors. This evidence further supports the key role that old stellar tracers play in constraining the early chemical enrichment of the Halo and, in particular, in investigating the impact that dwarf galaxies have had in the mass assembly of the Galaxy.

Submitted 16 April, 2021; originally announced April 2021.

arXiv:2103.11012 [pdf, other]

doi 10.1093/mnras/stab857

Metallicities from high resolution spectra of 49 RR Lyrae Variables

Authors: Christina K. Gilligan, Brian Chaboyer, Massimo Marengo, Joseph P. Mullen, Giuseppe Bono, Vittorio F. Braga, Juliana Crestani, Massimo Dall'Ora, Giuliana Fiorentino, Matteo Monelli, Jill R. Neeley, Michele Fabrizio, Clara E. Martínez-Vázquez, Frédéric Thévenin, Christopher Sneden

Abstract: Accurate metallicities of RR Lyrae are extremely important in constraining period-luminosity-metallicity relationships (PLZ), particularly in the near-infrared. We analyse 69 high-resolution spectra of Galactic RR Lyrae stars from the Southern African Large Telescope (SALT). We measure metallicities of 58 of these RR Lyrae stars with typical uncertainties of 0.13 dex. All but one RR Lyrae in this sample has accurate (σ_parallax ~ 10%) parallax from Gaia. Combining these new high resolution spectroscopic abundances with similar determinations from the literature for 93 stars, we present new PLZ relationships in WISE W1 and W2 magnitudes, and the Wesenheit magnitudes W(W1,V-W1) and W(W2,V-W2).

Submitted 19 March, 2021; originally announced March 2021.

Comments: 16 pages, 13 figures, accepted in MNRAS

arXiv:2103.09372 [pdf, other]

doi 10.3847/1538-4357/abefd4

Metallicity of Galactic RR Lyrae from Optical and Infrared Light Curves: I. Period-Fourier-Metallicity Relations for Fundamental Mode RR Lyrae

Authors: Joseph P. Mullen, Massimo Marengo, Clara E. Martínez-Vázquez, Jillian R. Neeley, Giuseppe Bono, Massimo Dall'Ora, Brian Chaboyer, Frédéric Thévenin, Vittorio F. Braga, Juliana Crestani, Michele Fabrizio, Giuliana Fiorentino, Christina K. Gilligan, Matteo Monelli, Peter B. Stetson

Abstract: We present newly-calibrated period-$φ_{31}$-[Fe/H] relations for fundamental mode RR Lyrae stars in the optical and, for the first time, mid-infrared. This work's calibration dataset provides the largest and most comprehensive span of parameter space to date with homogeneous metallicities from $-3<\textrm{[Fe/H]}<0.4$ and accurate Fourier parameters derived from 1980 ASAS-SN ($V$-band) and 1083 WISE (NEOWISE extension, $W1$ and $W2$ bands) RR Lyrae stars with well-sampled light curves. We compare our optical period-$φ_{31}$-[Fe/H] with those available in the literature and demonstrate that our relation minimizes systematic trends in the lower and higher metallicity range. Moreover, a direct comparison shows that our optical photometric metallicities are consistent with both those from high-resolution spectroscopy and globular clusters, supporting the good performance of our relation. We found an intrinsic scatter in the photometric metallicities (0.41 dex in the $V$-band and 0.50 dex in the infrared) by utilizing large calibration datasets covering a broad metallicity range. This scatter becomes smaller when optical and infrared bands are used together (0.37 dex). Overall, the relations derived in this work have many potential applications, including large-area photometric surveys with JWST in the infrared and LSST in the optical.

Submitted 16 March, 2021; originally announced March 2021.

Comments: Accepted by ApJ, 29 pages, 14 Figures, 4 Tables

arXiv:2012.02284 [pdf, other]

doi 10.3847/1538-4357/abd183

On the Use of Field RR Lyrae as Galactic Probes. II. A new $Δ$S calibration to estimate their metallicity

Authors: J. Crestani, M. Fabrizio, V. F. Braga, C. Sneden, G. W. Preston, I. Ferraro, G. Iannicola, G. Bono, A. Alves-Brito, M. Nonino, V. D'Orazi, L. Inno, M. Monelli, J. Storm, G. Altavilla, B. Chaboyer, M. Dall'Ora, G. Fiorentino, C. K. Gilligan, E. Grebel, H. Lala, B. Lemasle, M. Marengo, S. Marinoni, P. M. Marrese , et al. (11 additional authors not shown)

Abstract: We performed the largest and most homogeneous spectroscopic survey of field RR Lyraes (RRLs). We secured $\approx$6,300 high resolution (HR, R$\sim$35,000) spectra for 143 RRLs (111 fundamental, RRab; 32 first overtone, RRc). The atmospheric parameters were estimated by using the traditional approach and the iron abundances were measured by using an LTE line analysis. The resulting iron distribution shows a well defined metal-rich tail approaching solar iron abundance. This suggests that field RRLs experienced a complex chemical enrichment in the early halo formation. We used these data to develop a new calibration of the $Δ$S method. This diagnostic, based on the equivalent widths of CaII K and three Balmer (H$_{δ,γ,β}$) lines, traces the metallicity of RRLs. For the first time the new empirical calibration: i) includes spectra collected over the entire pulsation cycle; ii) includes RRc variables; iii) relies on spectroscopic calibrators covering more than three dex in iron abundance; iv) provides independent calibrations based on one/two/three Balmer lines. The new calibrations were applied to both SEGUE-SDSS and degraded HR spectra totalling 6,451 low resolution (LR, R$\sim$2,000) spectra for 5,001 RRLs (3,439 RRab, 1,562 RRc). This resulted in an iron distribution with a median of -1.55$\pm$0.01 and $σ$=0.51 dex, in good agreement with literature values. We also found that RRc are 0.10 dex more metal-poor than RRab variables, and have a distribution with a smoother metal-poor tail. This finding supports theoretical prescriptions suggesting a steady decrease in the RRc number when moving from metal-poor to metal-rich stellar environments.

Submitted 3 December, 2020; originally announced December 2020.

Comments: Accepted by ApJ

arXiv:2008.09037 [pdf]

doi 10.1109/HPEC43674.2020.9286249

Accuracy and Performance Comparison of Video Action Recognition Approaches

Authors: Matthew Hutchinson, Siddharth Samsi, William Arcand, David Bestor, Bill Bergeron, Chansup Byun, Michael Houle, Matthew Hubbell, Michael Jones, Jeremy Kepner, Andrew Kirby, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Antonio Rosa, Albert Reuther, Charles Yee, Vijay Gadepally

Abstract: Over the past few years, there has been significant interest in video action recognition systems and models. However, direct comparison of accuracy and computational performance results remains clouded by differing training environments, hardware specifications, hyperparameters, pipelines, and inference methods. This article provides a direct comparison between fourteen off-the-shelf and state-of-the-art models by ensuring consistency in these training characteristics in order to provide readers with a meaningful comparison across different types of video action recognition algorithms. Accuracy of the models is evaluated using standard Top-1 and Top-5 accuracy metrics in addition to a proposed new accuracy metric. Additionally, we compare computational performance of distributed training from two to sixty-four GPUs on a state-of-the-art HPC system.

Submitted 20 August, 2020; originally announced August 2020.

Comments: Accepted for publication at IEEE HPEC 2020

arXiv:2008.08057 [pdf]

doi 10.1109/HPEC43674.2020.9286232

Benchmarking network fabrics for data distributed training of deep neural networks

Authors: Siddharth Samsi, Andrew Prout, Michael Jones, Andrew Kirby, Bill Arcand, Bill Bergeron, David Bestor, Chansup Byun, Vijay Gadepally, Michael Houle, Matthew Hubbell, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Antonio Rosa, Charles Yee, Albert Reuther, Jeremy Kepner

Abstract: Artificial Intelligence/Machine Learning applications require the training of complex models on large amounts of labelled data. The large computational requirements for training deep models have necessitated the development of new methods for faster training. One such approach is the data parallel approach, where the training data is distributed across multiple compute nodes. This approach is simple to implement and supported by most of the commonly used machine learning frameworks. The data parallel approach leverages MPI for communicating gradients across all nodes. In this paper, we examine the effects of using different physical hardware interconnects and network-related software primitives for enabling data distributed deep learning. We compare the effect of using GPUDirect and NCCL on Ethernet and OmniPath fabrics. Our results show that using Ethernet-based networking in shared HPC systems does not have a significant effect on the training times for commonly used deep neural network architectures or traditional HPC applications such as Computational Fluid Dynamics.

Submitted 18 August, 2020; originally announced August 2020.

Comments: Accepted for publication at IEEE HPEC 2020

arXiv:2008.02223 [pdf]

doi 10.1109/HPEC43674.2020.9286142

Best of Both Worlds: High Performance Interactive and Batch Launching

Authors: Chansup Byun, Jeremy Kepner, William Arcand, David Bestor, Bill Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Michael Jones, Andrew Kirby, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Antonio Rosa, Siddharth Samsi, Charles Yee, Albert Reuther

Abstract: Rapid launch of thousands of jobs is essential for effective interactive supercomputing, big data analysis, and AI algorithm development. Achieving thousands of launches per second has required hardware to be available to receive these jobs. This paper presents a novel preemptive approach to implement spot jobs on MIT SuperCloud systems, allowing the resources to be fully utilized for long running batch jobs while still providing fast launch for interactive jobs. The new approach separates the job preemption and scheduling operations and can achieve 100 times faster performance in the scheduling of a job with preemption when compared to using the standard scheduler-provided automatic preemption-based capability. The results demonstrate that the new approach can schedule interactive jobs preemptively at a performance comparable to when the required computing resources are idle and available. The spot job capability can be deployed without disrupting the interactive user experience while increasing the overall system utilization.

Submitted 5 August, 2020; originally announced August 2020.

arXiv:2008.00307 [pdf, other]

doi 10.1109/HPEC43674.2020.9286235

Multi-Temporal Analysis and Scaling Relations of 100,000,000,000 Network Packets

Authors: Jeremy Kepner, Chad Meiners, Chansup Byun, Sarah McGuire, Timothy Davis, William Arcand, Jonathan Bernays, David Bestor, William Bergeron, Vijay Gadepally, Raul Harnasch, Matthew Hubbell, Michael Houle, Michael Jones, Andrew Kirby, Anna Klein, Lauren Milechin, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Siddharth Samsi, Doug Stetson, Adam Tse, Charles Yee, et al. (1 additional author not shown)

Abstract: Our society has never been more dependent on computer networks. Effective utilization of networks requires a detailed understanding of the normal background behaviors of network traffic. Large-scale measurements of networks are computationally challenging. Building on prior work in interactive supercomputing and GraphBLAS hypersparse hierarchical traffic matrices, we have developed an efficient method for computing a wide variety of streaming network quantities on diverse time scales. Applying these methods to 100,000,000,000 anonymized source-destination pairs collected at a network gateway reveals many previously unobserved scaling relationships. These observations provide new insights into normal network background traffic that could be used for anomaly detection, AI feature engineering, and testing theoretical models of streaming networks.

Submitted 1 August, 2020; originally announced August 2020.

Comments: 6 pages, 6 figures, 3 tables, 49 references, accepted to IEEE HPEC 2020

arXiv:2006.09625 [pdf, other]

Metallicity Distribution of Galactic Halo Field RR Lyrae, and the Effect of Metallicity on their Pulsation Properties

Authors: M. Marengo, J. P. Mullen, J. R. Neeley, M. Fabrizio, P. M. Marrese, G. Bono, V. F. Braga, D. Magurno, J. Crestani, G. Fiorentino, M. Monelli, B. Chaboyer, C. K. Gilligan, M. Dall'Ora, C. E. Martinez-Vazquez, F. Thevenin, N. Matsunaga

Abstract: We present our analysis of a large sample (over 150k) of candidate Galactic RR Lyrae (RRL) stars for which we derived high quality photometry at UV, optical and infrared wavelengths, using data from publicly available surveys. For a sub-sample of these stars (~2,400 fundamental mode field RRLs) we have measured their individual metallicity using the Delta S method, resulting in the largest and most homogeneous spectroscopic data set collected for RRLs. We use this sample to study the metallicity distribution in the Galactic Halo, including the long-standing problem of the Oosterhoff dichotomy among Galactic globular clusters. We also analyze the dependence of their pulsation properties, and in particular the shape of their infrared light curves, on their [Fe/H] abundance.

Submitted 16 June, 2020; originally announced June 2020.

Comments: Proc. of "The RR Lyrae and Cepheid Conference 2019: Frontiers of Classical Pulsators - Theory and Observations", held in Cloudcroft, NM, 13-18 October 2019

arXiv:2005.11566 [pdf, other]

doi 10.3847/2041-8213/ab9538

On the Metamorphosis of the Bailey diagram for RR Lyrae stars

Authors: G. Bono, V. F. Braga, J. Crestani, M. Fabrizio, C. Sneden, M. Marconi, G. W. Preston, J. P. Mullen, C. K. Gilligan, G. Fiorentino, A. Pietrinferni, G. Altavilla, R. Buonanno, B. Chaboyer, R. da Silva, M. Dall'Ora, S. Degl'Innocenti, E. Di Carlo, I. Ferraro, E. Grebel, G. Iannicola, L. Inno, V. Kovtyukh, A. Kunder, B. Lemasle, et al. (15 additional authors not shown)

Abstract: We collected over 6000 high-resolution spectra of four dozen field RR Lyrae (RRL) variables pulsating either in the fundamental (39 RRab) or in the first overtone (9 RRc) mode. We measured radial velocities (RVs) of four strong metallic and four Balmer lines along the entire pulsational cycle and derived RV amplitudes with accuracies better than 1$-$2 km s$^{-1}$. The new amplitudes were combined with literature data for 23 RRab and 3 RRc stars (total sample 74 RRLs) which allowed us to investigate the variation of the Bailey diagram (photometric amplitude versus period) when moving from optical to mid-infrared bands and to re-cast the Bailey diagram in terms of RV amplitudes. We found that RV amplitudes for RRab are minimally affected by nonlinear phenomena (shocks) and multi-periodicity (Blazhko effect). The RV slope ($\log P$--A(V$_r$)) when compared with the visual slope ($\log P$--A($V$)) is shallower and the dispersion, at fixed period, decreases by a factor of two. We constructed homogeneous sets of Horizontal Branch evolutionary models and nonlinear, convective pulsation models of RRLs to constrain the impact of evolutionary effects on their pulsation properties. Evolution causes, on the Bailey diagram based on RV amplitudes, a modest variation in pulsation period and a large dispersion in amplitude. The broad dispersion in period of the Bailey diagram is mainly caused by variation in RRL intrinsic parameters (stellar mass, chemical composition). Empirical evidence indicates that RV amplitudes are an optimal diagnostic for tracing the mean effective temperature across the RRab instability strip.

Submitted 23 May, 2020; originally announced May 2020.

Comments: 14 pages, 4 figures, 1 table, accepted to ApJ Letters

arXiv:2005.03156 [pdf, other]

doi 10.1109/HPEC43674.2020.9286157

Fast Mapping onto Census Blocks

Authors: Jeremy Kepner, Andreas Kipf, Darren Engwirda, Navin Vembar, Michael Jones, Lauren Milechin, Vijay Gadepally, Chris Hill, Tim Kraska, William Arcand, David Bestor, William Bergeron, Chansup Byun, Matthew Hubbell, Michael Houle, Andrew Kirby, Anna Klein, Julie Mullen, Andrew Prout, Albert Reuther, Antonio Rosa, Sid Samsi, Charles Yee, Peter Michaleas

Abstract: Pandemic measures such as social distancing and contact tracing can be enhanced by rapidly integrating dynamic location data and demographic data. Projecting billions of longitude and latitude locations onto hundreds of thousands of highly irregular demographic census block polygons is computationally challenging in both research and deployment contexts. This paper describes two approaches labeled "simple" and "fast". The simple approach can be implemented in any scripting language (Matlab/Octave, Python, Julia, R) and is easily integrated and customized to a variety of research goals. This simple approach uses a novel combination of hierarchy, sparse bounding boxes, polygon crossing-number, vectorization, and parallel processing to achieve 100,000,000+ projections per second on 100 servers. The simple approach is compact, does not increase data storage requirements, and is applicable to any country or region. The fast approach exploits the thread, vector, and memory optimizations that are possible using a low-level language (C++) and achieves similar performance on a single server. This paper details these approaches with the goal of enabling the broader community to quickly integrate location and demographic data.

Submitted 1 August, 2020; v1 submitted 6 May, 2020; originally announced May 2020.

Comments: 8 pages, 7 figures, 55 references; accepted to IEEE HPEC 2020
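
Illustrative note on the entry above: the abstract names bounding boxes and the polygon crossing-number test as ingredients of the "simple" approach. The sketch below shows only that generic textbook combination for a single point and a single polygon. It is not the paper's implementation; the function names and the L-shaped test polygon are hypothetical, and the hierarchy, vectorization, and parallel processing mentioned in the abstract are omitted.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def bounding_box(polygon: List[Point]) -> Tuple[float, float, float, float]:
    """Return (min_x, min_y, max_x, max_y) of a polygon's vertices."""
    xs = [x for x, _ in polygon]
    ys = [y for _, y in polygon]
    return min(xs), min(ys), max(xs), max(ys)


def in_polygon(point: Point, polygon: List[Point]) -> bool:
    """Crossing-number test with a cheap bounding-box rejection first.

    A horizontal ray is cast from the point toward +x; the point is
    inside when the ray crosses the polygon boundary an odd number of times.
    """
    px, py = point
    min_x, min_y, max_x, max_y = bounding_box(polygon)
    if not (min_x <= px <= max_x and min_y <= py <= max_y):
        return False  # outside the box, so certainly outside the polygon

    crossings = 0
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Only edges that straddle the ray's y-coordinate can be crossed.
        if (y1 > py) != (y2 > py):
            x_at_py = x1 + (py - y1) * (x2 - x1) / (y2 - y1)
            if x_at_py > px:
                crossings += 1
    return crossings % 2 == 1


# Hypothetical L-shaped "block" used only for illustration.
l_block = [(0, 0), (4, 0), (4, 2), (2, 2), (2, 4), (0, 4)]
inside_lower = in_polygon((1, 1), l_block)
inside_notch = in_polygon((3, 3), l_block)
```

The bounding-box check is what makes the test cheap for most point and polygon pairs: only points that land inside a polygon's box need the edge-by-edge crossing count.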

arXiv:2001.06935 [pdf, other]

doi 10.1109/IPDPSW50202.2020.00046

75,000,000,000 Streaming Inserts/Second Using Hierarchical Hypersparse GraphBLAS Matrices

Authors: Jeremy Kepner, Tim Davis, Chansup Byun, William Arcand, David Bestor, William Bergeron, Vijay Gadepally, Matthew Hubbell, Michael Houle, Michael Jones, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Andrew Prout, Antonio Rosa, Siddharth Samsi, Charles Yee, Albert Reuther

Abstract: The SuiteSparse GraphBLAS C library implements high performance hypersparse matrices with bindings to a variety of languages (Python, Julia, and Matlab/Octave). GraphBLAS provides a lightweight in-memory database implementation of hypersparse matrices that are ideal for analyzing many types of network data, while providing rigorous mathematical guarantees, such as linearity. Streaming updates of hypersparse matrices put enormous pressure on the memory hierarchy. This work benchmarks an implementation of hierarchical hypersparse matrices that reduces memory pressure and dramatically increases the update rate into a hypersparse matrix. The parameters of hierarchical hypersparse matrices rely on controlling the number of entries in each level in the hierarchy before an update is cascaded. The parameters are easily tunable to achieve optimal performance for a variety of applications. Hierarchical hypersparse matrices achieve over 1,000,000 updates per second in a single instance. Scaling to 31,000 instances of hierarchical hypersparse matrices on 1,100 server nodes on the MIT SuperCloud achieved a sustained update rate of 75,000,000,000 updates per second. This capability allows the MIT SuperCloud to analyze extremely large streaming network data sets.

Submitted 16 March, 2020; v1 submitted 19 January, 2020; originally announced January 2020.

Comments: 4 pages, 4 figures, 28 references, accepted to IPDPS GrAPL 2020. arXiv admin note: substantial text overlap with arXiv:1907.04217
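
Illustrative note on the entry above: the abstract describes a hierarchy in which each level holds a bounded number of entries before an update is cascaded to the next level. The sketch below mimics that idea with plain Python `Counter` objects standing in for hypersparse matrices. It does not use GraphBLAS, and the class name, the `cuts` parameter, and the sample stream are hypothetical.

```python
from collections import Counter


class HierarchicalCounter:
    """Toy stand-in for a hierarchical hypersparse matrix.

    Each level maps a (source, destination) key to a count. cuts[i] is the
    assumed maximum number of entries level i may hold before its contents
    are added into level i + 1 and level i is cleared.
    """

    def __init__(self, cuts):
        self.cuts = list(cuts)
        self.levels = [Counter() for _ in range(len(self.cuts) + 1)]

    def update(self, key, value=1):
        # New data always lands in the smallest (fastest) level.
        self.levels[0][key] += value
        for i, cut in enumerate(self.cuts):
            if len(self.levels[i]) <= cut:
                break  # this level is within its limit; stop cascading
            self.levels[i + 1].update(self.levels[i])  # adds counts
            self.levels[i].clear()

    def total(self):
        """Sum all levels to recover the full accumulated counts."""
        result = Counter()
        for level in self.levels:
            result.update(level)
        return result


h = HierarchicalCounter(cuts=[2, 4])
stream = [(1, 2), (3, 4), (1, 2), (5, 6), (7, 8), (1, 2), (9, 10)]
for pair in stream:
    h.update(pair)
```

Because addition is linear, summing the levels gives the same result as inserting every update into one large structure, while most individual updates touch only the small first level.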

arXiv:1909.01241 [pdf]

doi 10.1109/HPEC.2019.8916221

Large Scale Parallelization Using File-Based Communications

Authors: Chansup Byun, Jeremy Kepner, William Arcand, David Bestor, Bill Bergeron, Vijay Gadepally, Michael Houle, Matthew Hubbell, Michael Jones, Anna Klein, Peter Michaleas, Julie Mullen, Andrew Prout, Antonio Rosa, Siddharth Samsi, Charles Yee, Albert Reuther

Abstract: In this paper, we present a novel file-based communication architecture using the local filesystem for large scale parallelization. This new approach eliminates the issues with filesystem overload and resource contention when using the central filesystem for large parallel jobs. The new approach incurs additional overhead due to inter-node message file transfers when the sending and receiving processes are not on the same node. However, even with this additional overhead cost, its benefits are far greater for the overall cluster operation in addition to the performance enhancement in message communications for large scale parallel jobs. For example, when running a 2048-process parallel job, it achieved about 34 times better performance with MPI_Bcast() when using the local filesystem. Furthermore, since the security for transferring message files is handled entirely by using the secure copy protocol (scp) and the file system permissions, no additional security measures or ports are required other than those that are typically required on an HPC system.

Submitted 3 September, 2019; originally announced September 2019.
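
Illustrative note on the entry above: the abstract describes passing messages between processes as files on a local filesystem. The sketch below shows one generic way to do that for two processes on the same node, writing to a temporary name and renaming so a reader never sees a partial file. It is not the paper's implementation: the function names, file naming scheme, and use of `pickle` are assumptions, and the inter-node transfer via scp mentioned in the abstract is not modeled.

```python
import os
import pickle
import tempfile
import time


def _message_path(msg_dir, src, dst, tag):
    # Hypothetical naming scheme: one file per (sender, receiver, tag).
    return os.path.join(msg_dir, f"msg_{src}_to_{dst}_{tag}.pkl")


def send(msg_dir, src, dst, tag, payload):
    """Write the payload to a temporary file, then rename it into place."""
    final = _message_path(msg_dir, src, dst, tag)
    tmp = final + ".tmp"
    with open(tmp, "wb") as f:
        pickle.dump(payload, f)
    os.replace(tmp, final)  # atomic rename within the same filesystem


def recv(msg_dir, src, dst, tag, timeout=5.0, poll=0.01):
    """Poll for the message file, load it, and delete it once consumed."""
    path = _message_path(msg_dir, src, dst, tag)
    deadline = time.monotonic() + timeout
    while not os.path.exists(path):
        if time.monotonic() > deadline:
            raise TimeoutError(f"no message at {path}")
        time.sleep(poll)
    with open(path, "rb") as f:
        payload = pickle.load(f)
    os.remove(path)
    return payload


# Usage with a throwaway local directory standing in for node-local storage.
msg_dir = tempfile.mkdtemp()
send(msg_dir, src=0, dst=1, tag="demo", payload={"values": [1, 2, 3]})
received = recv(msg_dir, src=0, dst=1, tag="demo")
```

In this sketch access control rests on ordinary file permissions of the message directory, which is in the spirit of the abstract's remark that no extra ports or security measures are needed.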

arXiv:1908.07573 [pdf]

doi 10.1109/HPEC.2019.8916255

Securing HPC using Federated Authentication

Authors: Andrew Prout, William Arcand, David Bestor, Bill Bergeron, Chansup Byun, Vijay Gadepally, Michael Houle, Matthew Hubbell, Michael Jones, Anna Klein, Peter Michaleas, Lauren Milechin, Julie Mullen, Antonio Rosa, Siddharth Samsi, Charles Yee, Albert Reuther, Jeremy Kepner

Abstract: Federated authentication can drastically reduce the overhead of basic account maintenance while simultaneously improving overall system security. Integrating with the user's more frequently used account at their primary organization both provides a better experience to the end user and makes account compromise or changes in affiliation more likely to be noticed and acted upon. Additionally, with many organizations transitioning to multi-factor authentication for all account access, the ability to leverage external federated identity management systems provides the benefit of their efforts without the additional overhead of separately implementing a distinct multi-factor authentication process. This paper describes our experiences and the lessons we learned by enabling federated authentication with the U.S. Government PKI and InCommon Federation, scaling it up to the user base of a production HPC system, and the motivations behind those choices. We have received only positive feedback from our users.

Submitted 20 August, 2019; originally announced August 2019.

arXiv:1907.04217 [pdf, other]

doi 10.1109/HPEC.2019.8916508

Streaming 1.9 Billion Hypersparse Network Updates per Second with D4M

Authors: Jeremy Kepner, Vijay Gadepally, Lauren Milechin, Siddharth Samsi, William Arcand, David Bestor, William Bergeron, Chansup Byun, Matthew Hubbell, Michael Houle, Michael Jones, Anna Klein, Peter Michaleas, Julie Mullen, Andrew Prout, Antonio Rosa, Charles Yee, Albert Reuther

Abstract: The Dynamic Distributed Dimensional Data Model (D4M) library implements associative arrays in a variety of languages (Python, Julia, and Matlab/Octave) and provides a lightweight in-memory database implementation of hypersparse arrays that are ideal for analyzing many types of network data. D4M relies on associative arrays which combine properties of spreadsheets, databases, matrices, graphs, and networks, while providing rigorous mathematical guarantees, such as linearity. Streaming updates of D4M associative arrays put enormous pressure on the memory hierarchy. This work describes the design and performance optimization of an implementation of hierarchical associative arrays that reduces memory pressure and dramatically increases the update rate into an associative array. The parameters of hierarchical associative arrays rely on controlling the number of entries in each level in the hierarchy before an update is cascaded. The parameters are easily tunable to achieve optimal performance for a variety of applications. Hierarchical arrays achieve over 40,000 updates per second in a single instance. Scaling to 34,000 instances of hierarchical D4M associative arrays on 1,100 server nodes on the MIT SuperCloud achieved a sustained update rate of 1,900,000,000 updates per second. This capability allows the MIT SuperCloud to analyze extremely large streaming network data sets.

Submitted 6 July, 2019; originally announced July 2019.

Comments: 6 pages; 6 figures; accepted to IEEE High Performance Extreme Computing (HPEC) Conference 2019. arXiv admin note: text overlap with arXiv:1807.05308, arXiv:1902.00846

arXiv:1907.03195 [pdf, other]

doi 10.1109/HPEC.2019.8916300

Optimizing Xeon Phi for Interactive Data Analysis

Authors: Chansup Byun, Jeremy Kepner, William Arcand, David Bestor, William Bergeron, Matthew Hubbell, Vijay Gadepally, Michael Houle, Michael Jones, Anna Klein, Lauren Milechin, Peter Michaleas, Julie Mullen, Andrew Prout, Antonio Rosa, Siddharth Samsi, Charles Yee, Albert Reuther

Abstract: The Intel Xeon Phi manycore processor is designed to provide high performance matrix computations of the type often performed in data analysis. Common data analysis environments include Matlab, GNU Octave, Julia, Python, and R. Achieving optimal performance of matrix operations within data analysis environments requires tuning the Xeon Phi OpenMP settings, process pinning, and memory modes. This paper describes matrix multiplication performance results for Matlab and GNU Octave over a variety of combinations of process counts and OpenMP threads and Xeon Phi memory modes. These results indicate that using KMP_AFFINITY=granularity=fine, taskset pinning, and all2all cache memory mode allows both Matlab and GNU Octave to achieve 66% of the practical peak performance for process counts ranging from 1 to 64 and OpenMP threads ranging from 1 to 64. These settings have resulted in generally improved performance across a range of applications and have enabled our Xeon Phi system to deliver significant results in a number of real-world applications.

Submitted 6 July, 2019; originally announced July 2019.

Comments: 6 pages, 5 figures, accepted in IEEE High Performance Extreme Computing (HPEC) conference 2019

Showing 1–50 of 77 results for author: Mullen, J