Broken Windows: Exploring the Applicability of a Controversial Theory on Code Quality

Authors: Diomidis Spinellis, Panos Louridas, Maria Kechagia, Tushar Sharma

Abstract: Is the quality of existing code correlated with the quality of subsequent changes? According to the (controversial) broken windows theory, which inspired this study, disorder sets descriptive norms and signals behavior that further increases it. From a large code corpus, we examine whether code history does indeed affect the evolution of code quality. We examine C code quality metrics and Java code smells in specific files, and see whether subsequent commits by developers continue on that path. We check whether developers tailor the quality of their commits based on the quality of the file they commit to. Our results show that history matters, that developers behave differently depending on some aspects of the code quality they encounter, and that programming style inconsistency is not necessarily related to structural qualities. These findings have implications for both software practice and research. Software practitioners can emphasize current quality practices as these influence the code that will be developed in the future. Researchers in the field may replicate and extend the study to improve our understanding of the theory and its practical implications on artifacts, processes, and people.

Submitted 17 October, 2024; originally announced October 2024.

Comments: 15 pages, 5 figures, to be published in the proceedings of ICSME '24: 40th IEEE International Conference on Software Maintenance and Evolution

arXiv:2410.13095 [pdf, other]

Future of Algorithmic Organization: Large-Scale Analysis of Decentralized Autonomous Organizations (DAOs)

Authors: Tanusree Sharma, Yujin Potter, Kornrapat Pongmala, Henry Wang, Andrew Miller, Dawn Song, Yang Wang

Abstract: Decentralized Autonomous Organizations (DAOs) resemble early online communities, particularly those centered around open-source projects, and present a potential empirical framework for complex social-computing systems by encoding governance rules within "smart contracts" on the blockchain. A key function of a DAO is collective decision-making, typically carried out through a series of proposals where members vote on organizational events using governance tokens, signifying relative influence within the DAO. In just a few years, the deployment of DAOs surged with a total treasury of $24.5 billion and 11.1M governance token holders collectively managing decisions across over 13,000 DAOs as of 2024. In this study, we examine the operational dynamics of 100 DAOs, like pleasrdao, lexdao, lootdao, optimism collective, uniswap, etc. With large-scale empirical analysis of a diverse set of DAO categories and smart contracts and by leveraging on-chain (e.g., voting results) and off-chain data, we examine factors such as voting power, participation, and DAO characteristics dictating the level of decentralization and thus the efficiency of management structures. As such, our study highlights that increased grassroots participation correlates with higher decentralization in a DAO, and lower variance in voting power within a DAO correlates with a higher level of decentralization, as consistently measured by Gini metrics. These insights closely align with key topics in political science, such as the allocation of power in decision-making and the effects of various governance models. We conclude by discussing the implications for researchers and practitioners, emphasizing how these factors can inform the design of democratic governance systems in emerging applications that require active engagement from stakeholders in decision-making.
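The abstract reports decentralization "as consistently measured by Gini metrics" but does not say how the metric was computed. As a minimal sketch, assuming the textbook Gini coefficient applied to per-holder voting power, the calculation could look like the following. The function name `gini` and the sample values are hypothetical and are not taken from the paper.

```python
def gini(values):
    """Textbook Gini coefficient of non-negative values.

    0.0 means every holder has equal voting power; values closer to 1.0
    mean voting power is concentrated in few holders.
    """
    xs = sorted(values)
    n = len(xs)
    total = sum(xs)
    if n == 0 or total == 0:
        return 0.0
    # Rank-weighted form: G = 2 * sum(i * x_i) / (n * total) - (n + 1) / n,
    # where i is the 1-based rank of x_i in ascending order.
    weighted = sum(i * x for i, x in enumerate(xs, start=1))
    return 2 * weighted / (n * total) - (n + 1) / n


# Hypothetical voting-power vectors, for illustration only.
equal_power = [1, 1, 1, 1]
concentrated_power = [0, 0, 0, 10]
print(gini(equal_power), gini(concentrated_power))
```

Under this definition a lower coefficient indicates more evenly spread voting power, which is the direction the abstract associates with higher decentralization.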

Submitted 16 October, 2024; originally announced October 2024.

arXiv:2410.01817 [pdf, other]

From Experts to the Public: Governing Multimodal Language Models in Politically Sensitive Video Analysis

Authors: Tanusree Sharma, Yujin Potter, Zachary Kilhoffer, Yun Huang, Dawn Song, Yang Wang

Abstract: This paper examines the governance of multimodal large language models (MM-LLMs) through individual and collective deliberation, focusing on analyses of politically sensitive videos. We conducted a two-step study: first, interviews with 10 journalists established a baseline understanding of expert video interpretation; second, 114 individuals from the general public engaged in deliberation using Inclusive.AI, a platform that facilitates democratic decision-making through decentralized autonomous organization (DAO) mechanisms. Our findings show that while experts emphasized emotion and narrative, the general public prioritized factual clarity, objectivity of the situation, and emotional neutrality. Additionally, we explored the impact of different governance mechanisms: quadratic vs. weighted ranking voting and equal vs. 20-80 power distributions on users' decision-making on how AI should behave. Specifically, quadratic voting enhanced perceptions of liberal democracy and political equality, and participants who were more optimistic about AI perceived the voting process to have a higher level of participatory democracy. Our results suggest the potential of applying DAO mechanisms to help democratize AI governance.
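The abstract names quadratic voting without describing the rule. A minimal sketch of the standard quadratic voting cost rule, in which casting v votes on one option costs v squared credits, is shown below. This is the general mechanism only; how Inclusive.AI implements it is not stated in the abstract, and the function names and sample ballot are hypothetical.

```python
import math


def quadratic_cost(votes):
    """Credits needed to cast `votes` votes on a single option."""
    return votes * votes


def ballot_cost(ballot):
    """Total credits spent by a ballot mapping option -> votes cast."""
    return sum(quadratic_cost(v) for v in ballot.values())


def max_votes(budget):
    """Largest number of votes one option can receive from `budget` credits."""
    return math.isqrt(budget)


# Hypothetical ballot: 3 votes on one option and 4 on another.
ballot = {"option_a": 3, "option_b": 4}
print(ballot_cost(ballot), max_votes(100))
```

Because cost grows with the square of the votes cast, concentrating influence on a single option becomes progressively more expensive, which is the property usually cited when quadratic voting is linked to political equality.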

Submitted 14 September, 2024; originally announced October 2024.

arXiv:2408.10424 [pdf, other]

Exact NNLO corrections vs K-factors in PDF fits

Authors: Tanishq Sharma

Abstract: Parton distribution functions (PDFs) often include datasets corresponding to processes whereby the theoretical predictions at next-to-next-to-leading order (NNLO) in perturbative QCD have to be approximated, and this approximation may be performed using K-factors, which in turn depend on the PDF set used to compute them. In this study, we investigate the impact of K-factors produced with various PDF sets, namely CT18, MSHT20 and NNPDF4.0 on (differential) cross sections of top pair production at the Large Hadron Collider (LHC). Furthermore, we perform a new fit (otherwise analogous to NNPDF4.0 with MHOUs) where the exact NNLO corrections are used in the fitting procedure and compare the K-factors obtained from this fit with those obtained from the above-mentioned PDF sets. We find good agreement amongst K-factors obtained from these different PDF sets.
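The abstract does not define the K-factor. A commonly used definition, given here as an assumption rather than as the paper's exact prescription, is the bin-by-bin ratio of the NNLO to the NLO prediction computed with the same PDF set $f$. The symbols $K_i$, $\sigma_i$, and $f_{\rm fit}$ are introduced for illustration only.

```latex
% Assumed definition: K-factor in bin i, computed with a fixed PDF set f
K_i[f] = \frac{\sigma_i^{\mathrm{NNLO}}[f]}{\sigma_i^{\mathrm{NLO}}[f]}

% Approximate NNLO prediction used during a fit with PDF f_fit
\sigma_i^{\mathrm{NNLO}}[f_{\rm fit}] \approx K_i[f] \, \sigma_i^{\mathrm{NLO}}[f_{\rm fit}]
```

Written this way, the dependence on the PDF set that the abstract mentions is explicit: the approximation is exact only when $f = f_{\rm fit}$.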

Submitted 19 August, 2024; originally announced August 2024.

Comments: 5 pages, submitted to the Proceedings of the 31st International Workshop on Deep Inelastic Scattering (DIS2024)

arXiv:2408.09870 [pdf, other]

1d Conformal Field Theory and Dispersion Relations

Authors: Dean Carmi, Sudip Ghosh, Trakshu Sharma

Abstract: We study conformal field theory in $d=1$ space-time dimensions. We derive a dispersion relation for the 4-point correlation function of identical bosons and fermions, in terms of the double discontinuity. This extends the conformal dispersion relation of arXiv:1910.12123, which holds for CFTs in dimensions $d\geq 2$, to the case of $d=1$. The dispersion relation is obtained by combining the Lorentzian inversion formula with the operator product expansion of the 4-point correlator. We perform checks of the dispersion relation using correlators of generalised free fields and derive an integral relation between the kernel of the dispersion relation and that of the Lorentzian inversion formula. Finally, for $1$-$d$ holographic conformal theories, we analytically compute scalar Witten diagrams in $AdS_2$ at tree-level and $1$-loop.

Submitted 19 August, 2024; originally announced August 2024.

Comments: 36 pages, 2 figures

arXiv:2407.20343 [pdf, other]

Observations of Kappa Distributions in Solar Energetic Protons and Derived Thermodynamic Properties

Authors: M. E. Cuesta, A. T. Cummings, G. Livadiotis, D. J. McComas, C. M. S. Cohen, L. Y. Khoo, T. Sharma, M. M. Shen, R. Bandyopadhyay, J. S. Rankin, J. R. Szalay, H. A. Farooki, Z. Xu, G. D. Muro, M. L. Stevens, S. D. Bale

Abstract: In this paper we model the high-energy tail of observed solar energetic proton energy distributions with a kappa distribution function. We employ a technique for deriving the thermodynamic parameters of solar energetic proton populations measured by the Parker Solar Probe (PSP) Integrated Science Investigation of the Sun (IS$\odot$IS) EPI-Hi high energy telescope (HET), over energies from 10 - 60 MeV. With this technique we explore, for the first time, the characteristic thermodynamic properties of the solar energetic protons associated with an interplanetary coronal mass ejection (ICME) and its driven shock. We find that (1) the spectral index, or equivalently, the thermodynamic parameter kappa of solar energetic protons ($\kappa_{\rm EP}$) gradually increases starting from the pre-ICME region (upstream of the CME-driven shock), reaching a maximum in the CME ejecta ($\kappa_{\rm EP} \approx 3.5$), followed by a gradual decrease throughout the trailing portion of the CME; (2) solar energetic proton temperature and density ($T_{\rm EP}$ and $n_{\rm EP}$) appear anti-correlated, a behavior consistent with sub-isothermal polytropic processes; and (3) values of $T_{\rm EP}$ and $\kappa_{\rm EP}$ appear positively correlated, indicating an increasing entropy with time. Therefore, these proton populations are characterized by a complex and evolving thermodynamic behavior, consisting of multiple sub-isothermal polytropic processes, and a large-scale trend of increasing temperature, kappa, and entropy. This study and its companion study by Livadiotis et al. (2024) open a new set of procedures for investigating the thermodynamic behavior of energetic particles and their shared thermal properties.

Submitted 29 July, 2024; originally announced July 2024.

Comments: 16 pages, 6 figures, 1 table

arXiv:2407.18243 [pdf, other]

BIV-Priv-Seg: Locating Private Content in Images Taken by People With Visual Impairments

Authors: Yu-Yun Tseng, Tanusree Sharma, Lotus Zhang, Abigale Stangl, Leah Findlater, Yang Wang, Danna Gurari

Abstract: Individuals who are blind or have low vision (BLV) are at a heightened risk of sharing private information if they share photographs they have taken. To facilitate developing technologies that can help preserve privacy, we introduce BIV-Priv-Seg, the first localization dataset originating from people with visual impairments that shows private content. It contains 1,028 images with segmentation annotations for 16 private object categories. We first characterize BIV-Priv-Seg and then evaluate modern models' performance for locating private content in the dataset. We find modern models struggle most with locating private objects that are not salient, small, and lack text as well as recognizing when private content is absent from an image. We facilitate future extensions by sharing our new dataset with the evaluation server at https://vizwiz.org/tasks-and-datasets/object-localization.

Submitted 21 August, 2024; v1 submitted 25 July, 2024; originally announced July 2024.

arXiv:2407.04188 [pdf]

Kappa-tail technique: Modeling and application to Solar Energetic Particles observed by Parker Solar Probe

Authors: G. Livadiotis, A. T. Cummings, M. E. Cuesta, R. Bandyopadhyay, H. A. Farooki, L. Y. Khoo, D. J. McComas, J. S. Rankin, T. Sharma, M. M. Shen, C. M. S. Cohen, G. D. Muro, Z. Xu

Abstract: We develop the kappa-tail fitting technique, which analyzes observations of power-law tails of distributions and energy-flux spectra and connects them to theoretical modeling of kappa distributions, to determine the thermodynamics of the examined space plasma. In particular, we (i) construct the associated mathematical formulation, (ii) prove its decisive lead for determining whether the observed power-law is associated with kappa distributions; and (iii) provide a validation of the technique using pseudo-observations of typical input plasma parameters. Then, we apply this technique to a case-study by determining the thermodynamics of solar energetic particle (SEP) protons, for a SEP event observed on April 17, 2021, by the PSP/ISOIS instrument suite onboard PSP. The results show SEP temperatures and densities of the order of $\sim 1$ MeV and $ \sim 5 \cdot 10^{-7} $ cm$^{-3}$, respectively.

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2407.04147 [pdf, other]

ALPINE: An adaptive language-agnostic pruning method for language models for code

Authors: Mootez Saad, José Antonio Hernández López, Boqi Chen, Dániel Varró, Tushar Sharma

Abstract: Language models of code have demonstrated state-of-the-art performance across various software engineering and source code analysis tasks. However, their demanding computational resource requirements and consequential environmental footprint remain significant challenges. This work introduces ALPINE, an adaptive programming language-agnostic pruning technique designed to substantially reduce these models' computational overhead. The proposed method offers a pluggable layer that can be integrated with all Transformer-based models. With ALPINE, input sequences undergo adaptive compression throughout the pipeline, reaching a size up to $3\times$ smaller than their initial size, resulting in significantly reduced computational load. Our experiments on two software engineering tasks, defect prediction and code clone detection across three language models CodeBERT, GraphCodeBERT and UniXCoder show that ALPINE achieves up to a 50% reduction in FLOPs, a 58.1% decrease in memory footprint, and a 28.1% improvement in throughput on average. This led to a reduction in CO2 by up to $44.85$%. Importantly, it achieves the reduction in computation resources while maintaining up to 98.1% of the original predictive performance. These findings highlight the potential of ALPINE in making language models of code more resource-efficient and accessible while preserving their performance, contributing to the overall sustainability of adopting language models in software development. Also, it sheds light on redundant and noisy information in source code analysis corpora, as shown by the substantial sequence compression achieved by ALPINE.

Submitted 4 July, 2024; originally announced July 2024.

arXiv:2406.12308 [pdf, other]

Status of Astronomy Education in India: A Baseline Survey

Authors: Moupiya Maji, Surhud More, Aniket Sule, Vishaak Balasubramanya, Ankit Bhandari, Hum Chand, Kshitij Chavan, Avik Dasgupta, Anindya De, Jayant Gangopadhyay, Mamta Gulati, Priya Hasan, Syed Ishtiyaq, Meraj Madani, Kuntal Misra, Amoghavarsha N, Divya Oberoi, Subhendu Pattnaik, Mayuri Patwardhan, Niruj Mohan Ramanujam, Pritesh Ranadive, Disha Sawant, Paryag Sharma, Twinkle Sharma, Sai Shetye , et al. (6 additional authors not shown)

Abstract: We present the results of a nation-wide baseline survey, conducted by us, for the status of Astronomy education among secondary school students in India. The survey was administered in 10 different languages to over 2000 students from diverse backgrounds, and it explored multiple facets of their perspectives on astronomy. The topics included students' views on the incorporation of astronomy in curricula, their grasp of fundamental astronomical concepts, access to educational resources, cultural connections to astronomy, and their levels of interest and aspirations in the subject. We find notable deficiencies in students' knowledge of basic astronomical principles, with only a minority demonstrating proficiency in key areas such as celestial sizes, distances, and lunar phases. Furthermore, access to resources such as telescopes and planetariums remains limited across the country. Despite these challenges, a significant majority of students expressed a keen interest in astronomy. We further analyze the data along socioeconomic and gender lines. Particularly striking were the socioeconomic disparities, with students from resource-poor backgrounds often having lower levels of access and proficiency. Some differences were observed between genders, although not very pronounced. The insights gleaned from this study hold valuable implications for the development of a more robust astronomy curriculum and the design of effective teacher training programs in the future.

Submitted 18 June, 2024; originally announced June 2024.

Comments: 15 pages, 19 figures

arXiv:2406.10461 [pdf, ps, other]

Exploring Parent-Child Perceptions on Safety in Generative AI: Concerns, Mitigation Strategies, and Design Implications

Authors: Yaman Yu, Tanusree Sharma, Melinda Hu, Justin Wang, Yang Wang

Abstract: The widespread use of Generative Artificial Intelligence (GAI) among teenagers has led to significant misuse and safety concerns. To identify risks and understand parental control challenges, we conducted a content analysis on Reddit and interviewed 20 participants (seven teenagers and 13 parents). Our study reveals a significant gap in parental awareness of the extensive ways children use GAI, such as interacting with character-based chatbots for emotional support or engaging in virtual relationships. Parents and children report differing perceptions of risks associated with GAI. Parents primarily express concerns about data collection, misinformation, and exposure to inappropriate content. In contrast, teenagers are more concerned about becoming addicted to virtual relationships with GAI, the potential misuse of GAI to spread harmful content in social groups, and the invasion of privacy due to unauthorized use of their personal data in GAI applications. The absence of parental control features on GAI platforms forces parents to rely on system-built controls, manually check histories, share accounts, and engage in active mediation. Despite these efforts, parents struggle to grasp the full spectrum of GAI-related risks and to perform effective real-time monitoring, mediation, and education. We provide design recommendations to improve parent-child communication and enhance the safety of GAI use.

Submitted 14 June, 2024; originally announced June 2024.

Comments: 13 pages

arXiv:2405.11138 [pdf, other]

From Point Data to Geographic Boundaries: Regionalizing Crowdsourced Latency Measurements

Authors: Taveesh Sharma, Paul Schmitt, Francesco Bronzino, Nick Feamster, Nicole Marwell

Abstract: Despite significant investments in access network infrastructure, universal access to high-quality Internet connectivity remains a challenge. Policymakers often rely on large-scale, crowdsourced measurement datasets to assess the distribution of access network performance across geographic areas. These decisions typically rest on the assumption that Internet performance is uniformly distributed within predefined social boundaries. However, this assumption may not be valid for two reasons: crowdsourced measurements often exhibit non-uniform sampling densities within geographic areas; and predefined social boundaries may not align with the actual boundaries of Internet infrastructure. In this paper, we present a spatial analysis on crowdsourced datasets for constructing stable boundaries for sampling Internet performance. We hypothesize that greater stability in sampling boundaries will better reflect the true nature of Internet performance disparities, rather than misleading patterns observed as a result of data sampling variations. We apply and evaluate a series of statistical techniques to: aggregate Internet performance over geographic regions; overlay interpolated maps with various sampling unit choices; and spatially cluster boundary units to identify contiguous areas with similar performance characteristics. We assess the effectiveness of the techniques we apply by comparing the similarity of the resulting boundaries for monthly samples drawn from the dataset. Our evaluation shows that the combination of techniques we apply achieves higher similarity compared to directly calculating central measures of network metrics over census tracts or neighborhood boundaries. These findings underscore the important role of spatial modeling in accurately assessing and optimizing the distribution of Internet performance, to inform policy, network operations, and long-term planning decisions.

Submitted 12 August, 2024; v1 submitted 17 May, 2024; originally announced May 2024.

Comments: 23 pages

arXiv:2405.02790 [pdf, other]

Confidential and Protected Disease Classifier using Fully Homomorphic Encryption

Authors: Aditya Malik, Nalini Ratha, Bharat Yalavarthi, Tilak Sharma, Arjun Kaushik, Charanjit Jutla

Abstract: With the rapid surge in the prevalence of Large Language Models (LLMs), individuals are increasingly turning to conversational AI for initial insights across various domains, including health-related inquiries such as disease diagnosis. Many users seek potential causes on platforms like ChatGPT or Bard before consulting a medical professional for their ailment. These platforms offer valuable benefits by streamlining the diagnosis process, alleviating the significant workload of healthcare practitioners, and saving users both time and money by avoiding unnecessary doctor visits. However, despite the convenience of such platforms, sharing personal medical data online poses risks, including the presence of malicious platforms or potential eavesdropping by attackers. To address privacy concerns, we propose a novel framework combining FHE and Deep Learning for a secure and private diagnosis system. Operating on a question-and-answer-based model akin to an interaction with a medical practitioner, this end-to-end secure system employs Fully Homomorphic Encryption (FHE) to handle encrypted input data. Given FHE's computational constraints, we adapt deep neural networks and activation functions to the encrypted domain. Further, we also propose a faster algorithm to compute summation of ciphertext elements. Through rigorous experiments, we demonstrate the efficacy of our approach. The proposed framework achieves strict security and privacy with minimal loss in performance.

Submitted 4 May, 2024; originally announced May 2024.

arXiv:2402.19049 [pdf, other]

Enhancing key rates of QKD protocol by Coincidence Detection

Authors: Tanya Sharma, Rutvij Bhavsar, Jayanth Ramakrishnan, Pooja Chandravanshi, Shashi Prabhakar, Ayan Biswas, R. P. Singh

Abstract: In theory, quantum key distribution (QKD) provides unconditional security; however, its practical implementations are susceptible to exploitable vulnerabilities. This investigation tackles the constraints in practical QKD implementations using weak coherent pulses. We improve on the conventional approach of using decoy pulses by integrating it with the coincidence detection (CD) protocol. Additionally, we introduce an easy-to-implement algorithm to compute asymptotic key rates for the protocol. Furthermore, we have carried out an experimental implementation of the protocol, where we demonstrate that monitoring coincidences in the decoy state protocol leads to enhanced key rates under realistic experimental conditions.

Submitted 10 May, 2024; v1 submitted 29 February, 2024; originally announced February 2024.

Comments: 19 pages, 3 figures. Comments are welcome

arXiv:2402.01841 [pdf, other]

COMET: Generating Commit Messages using Delta Graph Context Representation

Authors: Abhinav Reddy Mandli, Saurabhsingh Rajput, Tushar Sharma

Abstract: Commit messages explain code changes in a commit and facilitate collaboration among developers. Several commit message generation approaches have been proposed; however, they exhibit limited success in capturing the context of code changes. We propose Comet (Context-Aware Commit Message Generation), a novel approach that captures context of code changes using a graph-based representation and leverages a transformer-based model to generate high-quality commit messages. Our proposed method utilizes a delta graph that we developed to effectively represent code differences. We also introduce a customizable quality assurance module to identify optimal messages, mitigating subjectivity in commit messages. Experiments show that Comet outperforms state-of-the-art techniques in terms of bleu-norm and meteor metrics while being comparable in terms of rouge-l. Additionally, we compare the proposed approach with the popular gpt-3.5-turbo model, along with gpt-4-turbo, the most capable GPT model, over zero-shot, one-shot, and multi-shot settings. We found that Comet outperforms the GPT models on five and four metrics, respectively, and provides competitive results on the two other metrics. The study has implications for researchers, tool developers, and software developers. Software developers may utilize Comet to generate context-aware commit messages. Researchers and tool developers can apply the proposed delta graph technique in similar contexts, like code review summarization.

Submitted 2 February, 2024; originally announced February 2024.

Comments: 22 Pages, 7 Figures

arXiv:2402.00210 [pdf, other]

Correlation of Coronal Mass Ejection Shock Temperature with Solar Energetic Particle Intensity

Authors: Manuel Enrique Cuesta, D. J. McComas, L. Y. Khoo, R. Bandyopadhyay, T. Sharma, M. M. Shen, J. S. Rankin, A. T. Cummings, J. R. Szalay, C. M. S. Cohen, N. A. Schwadron, R. Chhiber, F. Pecora, W. H. Matthaeus, R. A. Leske, M. L. Stevens

Abstract: Solar energetic particle (SEP) events have been observed by the Parker Solar Probe (PSP) spacecraft since its launch in 2018. These events include sources from solar flares and coronal mass ejections (CMEs). Onboard PSP is the IS$\odot$IS instrument suite measuring ions over energies from ~ 20 keV/nucleon to 200 MeV/nucleon and electrons from ~ 20 keV to 6 MeV. Previous studies sought to group CME characteristics based on their plasma conditions and arrived at general descriptions with large statistical errors, leaving open questions on how to properly group CMEs based solely on their plasma conditions. To help resolve these open questions, plasma properties of CMEs have been examined in relation to SEPs. Here we reexamine one plasma property, the solar wind proton temperature, and compare it to the proton SEP intensity in a region immediately downstream of a CME-driven shock for seven CMEs observed at radial distances within 1 au. We find a statistically strong correlation between proton SEP intensity and bulk proton temperature, indicating a clear relationship between SEPs and the conditions in the solar wind. Furthermore, we propose that an indirect coupling of SEP intensity to the level of turbulence and the amount of energy dissipation that results is mainly responsible for the observed correlation between SEP intensity and proton temperature. These results are key to understanding the interaction of SEPs with the bulk solar wind in CME-driven shocks and will improve our ability to model the interplay of shock evolution and particle acceleration.

Submitted 31 January, 2024; originally announced February 2024.

Comments: 12 pages, 4 figures, and 2 tables

arXiv:2401.17967 [pdf, other]

CONCORD: Towards a DSL for Configurable Graph Code Representation

Authors: Mootez Saad, Tushar Sharma

Abstract: Deep learning is widely used to uncover hidden patterns in large code corpora. To achieve this, constructing a format that captures the relevant characteristics and features of source code is essential. Graph-based representations have gained attention for their ability to model structural and semantic information. However, existing tools lack flexibility in constructing graphs across different programming languages, limiting their use. Additionally, the output of these tools often lacks interoperability and results in excessively large graphs, making the training of graph-based neural networks slower and less scalable. We introduce CONCORD, a domain-specific language to build customizable graph representations. It implements reduction heuristics to reduce the size of the graphs. We demonstrate its effectiveness in code smell detection as an illustrative use case and show that: first, CONCORD can produce code representations automatically per the specified configuration, and second, our heuristics can achieve comparable performance with significantly reduced size. CONCORD will help researchers a) create and experiment with customizable graph-based code representations for different software engineering tasks involving DL, b) reduce the engineering work to generate graph representations, c) address the issue of scalability in GNN models, and d) enhance the reproducibility of experiments in research through a standardized approach to code representation and analysis.

Submitted 31 January, 2024; originally announced January 2024.

arXiv:2401.07930 [pdf, other]

On Inter-dataset Code Duplication and Data Leakage in Large Language Models

Authors: José Antonio Hernández López, Boqi Chen, Mootez Saad, Tushar Sharma, Dániel Varró

Abstract: Motivation. Large language models (LLMs) have exhibited remarkable proficiency in diverse software engineering (SE) tasks. Handling such tasks typically involves acquiring foundational coding knowledge on large, general-purpose datasets during a pre-training phase, and subsequently refining on smaller, task-specific datasets as part of a fine-tuning phase. Problem statement. While intra-dataset code duplication examines the intersection between the training and test splits within a given dataset and has been addressed in prior research, inter-dataset code duplication, which gauges the overlap between different datasets, remains largely unexplored. If this phenomenon exists, it could compromise the integrity of LLM evaluations because of the inclusion of fine-tuning test samples that were already encountered during pre-training, resulting in inflated performance metrics. Contribution. This paper explores the phenomenon of inter-dataset code duplication and its impact on evaluating LLMs across diverse SE tasks. Study design. We conduct an empirical study using the CodeSearchNet dataset (CSN), a widely adopted pre-training dataset, and five fine-tuning datasets used for various SE tasks. We first identify the intersection between the pre-training and fine-tuning datasets using a deduplication process. Next, we pre-train two versions of LLMs using a subset of CSN: one leaky LLM and one non-leaky LLM. Finally, we fine-tune both models and compare their performances using leaky fine-tuning test samples. Results. Our findings reveal a potential threat to the evaluation of LLMs across multiple SE tasks, stemming from the inter-dataset code duplication phenomenon. We also demonstrate that this threat is accentuated by the chosen fine-tuning technique. Furthermore, we provide evidence that open-source models could be affected by inter-dataset duplication.

Submitted 1 August, 2024; v1 submitted 15 January, 2024; originally announced January 2024.

arXiv:2312.15896 [pdf, other]

WWW: What, When, Where to Compute-in-Memory

Authors: Tanvi Sharma, Mustafa Ali, Indranil Chakraborty, Kaushik Roy

Abstract: Compute-in-memory (CiM) has emerged as a highly energy-efficient solution for performing matrix multiplication during Machine Learning (ML) inference. However, integrating compute in memory poses key questions, such as 1) What type of CiM to use: Given a multitude of CiM design characteristics, determining their suitability from an architecture perspective is needed. 2) When to use CiM: ML inference includes workloads with a variety of memory and compute requirements, making it difficult to identify when CiM is more beneficial. 3) Where to integrate CiM: Each memory level has different bandwidth and capacity, creating different data reuse opportunities for CiM integration. To answer such questions regarding on-chip CiM integration for accelerating ML workloads, we use an analytical architecture evaluation methodology where we tailor the dataflow mapping. The mapping algorithm aims to achieve the highest weight reuse and reduced data movements for a given CiM prototype and workload. Our experiments show that CiM integrated memory improves energy efficiency by up to 3.4x and throughput by up to 15.6x compared to a tensor-core-like baseline architecture, with INT-8 precision under iso-area constraints. We believe the proposed work provides insights into what type of CiM to use, and when and where to optimally integrate it in the cache hierarchy for efficient matrix multiplication.

Submitted 20 June, 2024; v1 submitted 26 December, 2023; originally announced December 2023.

Comments: updated methodology

arXiv:2311.13508 [pdf, other]

Naturalness of Attention: Revisiting Attention in Code Language Models

Authors: Mootez Saad, Tushar Sharma

Abstract: Language models for code such as CodeBERT offer the capability to learn advanced source code representation, but their opacity poses barriers to understanding of captured properties. Recent attention analysis studies provide initial interpretability insights by focusing solely on attention weights rather than considering the wider context modeling of Transformers. This study aims to shed some light on the previously ignored factors of the attention mechanism beyond the attention weights. We conduct an initial empirical study analyzing both attention distributions and transformed representations in CodeBERT. Across two programming languages, Java and Python, we find that the scaled transformation norms of the input better capture syntactic structure compared to attention weights alone. Our analysis reveals characterization of how CodeBERT embeds syntactic code properties. The findings demonstrate the importance of incorporating factors beyond just attention weights for rigorously understanding neural code models. This lays the groundwork for developing more interpretable models and effective uses of attention mechanisms in program analysis.

Submitted 22 November, 2023; originally announced November 2023.

Comments: Accepted at ICSE-NIER (2024) track

arXiv:2310.02859 [pdf, other]

Tight Sampling in Unbounded Networks

Authors: Kshitijaa Jaglan, Meher Chaitanya, Triansh Sharma, Abhijeeth Singam, Nidhi Goyal, Ponnurangam Kumaraguru, Ulrik Brandes

Abstract: The default approach to deal with the enormous size and limited accessibility of many Web and social media networks is to sample one or more subnetworks from a conceptually unbounded unknown network. Clearly, the extracted subnetworks will crucially depend on the sampling scheme. Motivated by studies of homophily and opinion formation, we propose a variant of snowball sampling designed to prioritize inclusion of entire cohesive communities rather than any kind of representativeness, breadth, or depth of coverage. The method is illustrated on a concrete example, and experiments on synthetic networks suggest that it behaves as desired.

Submitted 5 October, 2023; v1 submitted 4 October, 2023; originally announced October 2023.

Comments: The first two authors contributed equally

arXiv:2309.12583 [pdf]

doi 10.1145/3571884.3603755

Using ChatGPT in HCI Research -- A Trioethnography

Authors: Smit Desai, Tanusree Sharma, Pratyasha Saha

Abstract: This paper explores the lived experience of using ChatGPT in HCI research through a month-long trioethnography. Our approach combines the expertise of three HCI researchers with diverse research interests to reflect on our daily experience of living and working with ChatGPT. Our findings are presented as three provocations grounded in our collective experiences and HCI theories. Specifically, we examine (1) the emotional impact of using ChatGPT, with a focus on frustration and embarrassment, (2) the absence of accountability and consideration of future implications in design, and raise (3) questions around bias from a Global South perspective. Our work aims to inspire critical discussions about utilizing ChatGPT in HCI research and advance equitable and inclusive technological development.

Submitted 21 September, 2023; originally announced September 2023.

arXiv:2308.16247 [pdf, other]

doi 10.1007/JHEP02(2024)025

Monotonicity conjecture for multi-party entanglement I

Authors: Abhijit Gadde, Shraiyance Jain, Vineeth Krishna, Harshal Kulkarni, Trakshu Sharma

Abstract: In this paper, we conjecture a monotonicity property that we call monotonicity under coarse-graining for a class of multi-partite entanglement measures. We check these properties by computing the measures for various types of states using different methods.

Submitted 30 August, 2023; originally announced August 2023.

Comments: 40 Pages, 10 Figures

arXiv:2308.14402 [pdf, other]

doi 10.1109/JLT.2024.3361079

Mitigating the source-side channel vulnerability by characterization of photon statistics

Authors: Tanya Sharma, Ayan Biswas, Jayanth Ramakrishnan, Pooja Chandravanshi, Ravindra P. Singh

Abstract: Quantum key distribution (QKD) theoretically offers unconditional security. Unfortunately, the gap between theory and practice exposes practical QKD systems to side-channel attacks. Many well-known QKD protocols use weak coherent laser pulses to encode the quantum information. These sources differ from ideal single photon sources and follow Poisson statistics. Many protocols, such as decoy state and coincidence detection protocols, rely on monitoring the photon statistics to detect any information leakage. The accurate measurement and characterization of photon statistics enable the detection of adversarial attacks and the estimation of secure key rates, strengthening the overall security of the QKD system. We have rigorously characterized our source to estimate the mean photon number employing multiple detectors for comparison against measurements made with a single detector. Furthermore, we have also studied intensity fluctuations to help identify and mitigate any potential information leakage due to state preparation flaws. We aim to bridge the gap between theory and practice to achieve information-theoretic security.

Submitted 28 August, 2023; originally announced August 2023.

Comments: Comments and suggestions are welcomed

arXiv:2308.12264 [pdf, other]

Enhancing Energy-Awareness in Deep Learning through Fine-Grained Energy Measurement

Authors: Saurabhsingh Rajput, Tim Widmayer, Ziyuan Shang, Maria Kechagia, Federica Sarro, Tushar Sharma

Abstract: With the increasing usage, scale, and complexity of Deep Learning (DL) models, their rapidly growing energy consumption has become a critical concern. Promoting green development and energy awareness at different granularities is the need of the hour to limit carbon emissions of DL systems. However, the lack of standard and repeatable tools to accurately measure and optimize energy consumption at a fine granularity (e.g., at method level) hinders progress in this area. This paper introduces FECoM (Fine-grained Energy Consumption Meter), a framework for fine-grained DL energy consumption measurement. FECoM enables researchers and developers to profile DL APIs from an energy perspective. FECoM addresses the challenges of measuring energy consumption at a fine-grained level by using static instrumentation and considering various factors, including computational load and temperature stability. We assess FECoM's capability to measure fine-grained energy consumption for one of the most popular open-source DL frameworks, namely TensorFlow. Using FECoM, we also investigate the impact of parameter size and execution time on energy consumption, enriching our understanding of TensorFlow APIs' energy profiles. Furthermore, we elaborate on the considerations, issues, and challenges that one needs to consider while designing and implementing a fine-grained energy consumption measurement tool. This work will facilitate further advances in DL energy measurement and the development of energy-aware practices for DL systems.

Submitted 1 February, 2024; v1 submitted 23 August, 2023; originally announced August 2023.

arXiv:2308.12199 [pdf, other]

Towards Real-Time Analysis of Broadcast Badminton Videos

Authors: Nitin Nilesh, Tushar Sharma, Anurag Ghosh, C. V. Jawahar

Abstract: Analysis of player movements is a crucial subset of sports analysis. Existing player movement analysis methods use recorded videos after the match is over. In this work, we propose an end-to-end framework for player movement analysis for badminton matches on live broadcast match videos. Unlike other approaches, which use multi-modal sensor data, our approach uses only the visual inputs from the match. We propose a method to calculate the on-court distance covered by both players from the video feed of a live broadcast badminton match. To perform this analysis, we focus on the gameplay by removing replays and other redundant parts of the broadcast match. We then perform player tracking to identify and track the movements of both players in each frame. Finally, we calculate the distance covered by each player and the average speed with which they move on the court. We further show a heatmap of the areas covered by the player on the court, which is useful for analyzing the gameplay of the player. Our proposed framework was successfully used to analyze live broadcast matches in real time during the Premier Badminton League 2019 (PBL 2019), with commentators and broadcasters appreciating the utility.

Submitted 23 August, 2023; originally announced August 2023.

arXiv:2308.06882 [pdf, other]

Quantifying Outlierness of Funds from their Categories using Supervised Similarity

Authors: Dhruv Desai, Ashmita Dhiman, Tushar Sharma, Deepika Sharma, Dhagash Mehta, Stefano Pasquali

Abstract: Mutual fund categorization has become a standard tool for the investment management industry and is extensively used by allocators for portfolio construction and manager selection, as well as by fund managers for peer analysis and competitive positioning. As a result, an (unintended) miscategorization or lack of precision can significantly impact allocation decisions and investment fund managers. Here, we aim to quantify the effect of miscategorization of funds utilizing a machine learning based approach. We formulate the problem of miscategorization of funds as a distance-based outlier detection problem, where the outliers are the data-points that are far from the rest of the data-points in the given feature space. We implement and employ a Random Forest (RF) based method of distance metric learning, and compute the so-called class-wise outlier measures for each data-point to identify outliers in the data. We test our implementation on various publicly available data sets, and then apply it to mutual fund data. We show that there is a strong relationship between the outlier measures of the funds and their future returns and discuss the implications of our findings.

Submitted 13 August, 2023; originally announced August 2023.

Comments: 8 pages, 5 tables, 8 figures

arXiv:2307.08652 [pdf, other]

Search Me Knot, Render Me Knot: Embedding Search and Differentiable Rendering of Knots in 3D

Authors: Aalok Gangopadhyay, Paras Gupta, Tarun Sharma, Prajwal Singh, Shanmuganathan Raman

Abstract: We introduce the problem of knot-based inverse perceptual art. Given multiple target images and their corresponding viewing configurations, the objective is to find a 3D knot-based tubular structure whose appearance resembles the target images when viewed from the specified viewing configurations. To solve this problem, we first design a differentiable rendering algorithm for rendering tubular knots embedded in 3D for arbitrary perspective camera configurations. Utilizing this differentiable rendering algorithm, we search over the space of knot configurations to find the ideal knot embedding. We represent the knot embeddings via homeomorphisms of the desired template knot, where the homeomorphisms are parametrized by the weights of an invertible neural network. Our approach is fully differentiable, making it possible to find the ideal 3D tubular structure for the desired perceptual art using gradient-based optimization. We propose several loss functions that impose additional physical constraints, enforcing that the tube is free of self-intersection, lies within a predefined region in space, satisfies the physical bending limits of the tube material, and that the material cost is within a specified budget. We demonstrate through results that our knot representation is highly expressive and gives impressive results even for challenging target images under both single-view and multiple-view constraints. Through an extensive ablation study, we show that each of the proposed loss functions is effective in ensuring physical realizability. We construct a real world 3D-printed object to demonstrate the practical utility of our approach. To the best of our knowledge, we are the first to propose a fully differentiable optimization framework for knot-based inverse perceptual art.

Submitted 19 August, 2023; v1 submitted 17 July, 2023; originally announced July 2023.

arXiv:2306.06261 [pdf, other]

Iterative Design of An Accessible Crypto Wallet for Blind Users

Authors: Zhixuan Zhou, Tanusree Sharma, Luke Emano, Sauvik Das, Yang Wang

Abstract: Crypto wallets are a key touch-point for cryptocurrency use. People use crypto wallets to make transactions, manage crypto assets, and interact with decentralized apps (dApps). However, as is often the case with emergent technologies, little attention has been paid to understanding and improving accessibility barriers in crypto wallet software. We present a series of user studies that explored how both blind and sighted individuals use MetaMask, one of the most popular non-custodial crypto wallets. We uncovered inter-related accessibility, learnability, and security issues with MetaMask. We also report on an iterative redesign of MetaMask to make it more accessible for blind users. This process involved multiple evaluations with 44 novice crypto wallet users, including 20 sighted users, 23 blind users, and one user with low vision. Our study results show notable improvements for accessibility after two rounds of design iterations. Based on the results, we discuss design implications for creating more accessible and secure crypto wallets for blind users.

Submitted 9 June, 2023; originally announced June 2023.

Comments: 19th Symposium on Usable Privacy and Security

arXiv:2306.01194 [pdf, other]

doi 10.1145/3618257.3624828

Estimating WebRTC Video QoE Metrics Without Using Application Headers

Authors: Taveesh Sharma, Tarun Mangla, Arpit Gupta, Junchen Jiang, Nick Feamster

Abstract: The increased use of video conferencing applications (VCAs) has made it critical to understand and support end-user quality of experience (QoE) by all stakeholders in the VCA ecosystem, especially network operators, who typically do not have direct access to client software. Existing VCA QoE estimation methods use passive measurements of application-level Real-time Transport Protocol (RTP) headers. However, a network operator does not always have access to RTP headers, particularly when VCAs use custom RTP protocols (e.g., Zoom) or due to system constraints (e.g., legacy measurement systems). Given this challenge, this paper considers the use of more standard features in the network traffic, namely, IP and UDP headers, to provide per-second estimates of key VCA QoE metrics such as frame rate and video resolution. We develop a method that uses machine learning with a combination of flow statistics (e.g., throughput) and features derived based on the mechanisms used by the VCAs to fragment video frames into packets. We evaluate our method for three prevalent VCAs running over WebRTC: Google Meet, Microsoft Teams, and Cisco Webex. Our evaluation consists of 54,696 seconds of VCA data collected from both (1) controlled in-lab network conditions and (2) real-world networks from 15 households. We show that the ML-based approach yields similar accuracy compared to the RTP-based methods, despite using only IP/UDP data. For instance, we can estimate FPS within 2 FPS for up to 83.05% of one-second intervals in the real-world data, which is only 1.76% lower than using the application-level RTP headers.

Submitted 9 November, 2023; v1 submitted 1 June, 2023; originally announced June 2023.

Comments: 16 pages

arXiv:2305.13126 [pdf, other]

Free Space Continuous Variable Quantum Key Distribution with Discrete Phases

Authors: Anju Rani, Pooja Chandravanshi, Jayanth Ramakrishnan, Pravin Vaity, P. Madhusudhan, Tanya Sharma, Pranav Bhardwaj, Ayan Biswas, R. P. Singh

Abstract: Quantum Key Distribution (QKD) offers unconditional security in principle. Many QKD protocols have been proposed and demonstrated to ensure secure communication between two authenticated users. Continuous variable (CV) QKD offers many advantages over discrete variable (DV) QKD since it is cost-effective, compatible with current classical communication technologies, efficient even in daylight, and gives a higher secure key rate. Keeping this in view, we demonstrate a discrete modulated CVQKD protocol in free space which is robust against polarization drift. We also present the simulation results with a noise model to account for the channel noise and the effects of various parameter changes on the secure key rate. These simulation results help us to verify the experimental values obtained for the implemented CVQKD.

Submitted 22 May, 2023; originally announced May 2023.

Comments: 9 pages, 7 figures. Comments are welcome

arXiv:2304.09822 [pdf, other]

Unpacking How Decentralized Autonomous Organizations (DAOs) Work in Practice

Authors: Tanusree Sharma, Yujin Kwon, Kornrapat Pongmala, Henry Wang, Andrew Miller, Dawn Song, Yang Wang

Abstract: Decentralized Autonomous Organizations (DAOs) have emerged as a novel way to coordinate a group of (pseudonymous) entities towards a shared vision (e.g., promoting sustainability), utilizing self-executing smart contracts on blockchains to support decentralized governance and decision-making. In just a few years, over 4,000 DAOs have been launched in various domains, such as investment, education, health, and research. Despite such rapid growth and diversity, it is unclear how these DAOs actually work in practice and to what extent they are effective in achieving their goals. Given this, we aim to unpack how (well) DAOs work in practice. We conducted an in-depth analysis of a diverse set of 10 DAOs of various categories and smart contracts, leveraging on-chain (e.g., voting results) and off-chain data (e.g., community discussions) as well as our interviews with DAO organizers/members. Specifically, we defined metrics to characterize key aspects of DAOs, such as the degrees of decentralization and autonomy. We observed CompoundDAO, AssangeDAO, Bankless, and Krausehouse having poor decentralization in voting, while decentralization has improved over time for one-person-one-vote DAOs (e.g., Proof of Humanity). Moreover, the degree of autonomy varies among DAOs, with some (e.g., Compound and Krausehouse) relying more on third parties than others. Lastly, we offer a set of design implications for future DAO systems based on our findings.

Submitted 16 April, 2023; originally announced April 2023.

arXiv:2304.07598 [pdf, other]

Understanding Rug Pulls: An In-Depth Behavioral Analysis of Fraudulent NFT Creators

Authors: Trishie Sharma, Rachit Agarwal, Sandeep Kumar Shukla

Abstract: The explosive growth of non-fungible tokens (NFTs) on Web3 has created a new frontier for digital art and collectibles, but also an emerging space for fraudulent activities. This study provides an in-depth analysis of NFT rug pulls, which are fraudulent schemes aimed at stealing investors' funds. Using data from 758 rug pulls across 10 NFT marketplaces, we examine the structural and behavioral properties of these schemes, identify the characteristics and motivations of rug-pullers, and classify NFT projects into groups based on creators' association with their accounts. Our findings reveal that repeated rug pulls account for a significant proportion of the rise in NFT-related cryptocurrency crimes, with one NFT collection attempting 37 rug pulls within three months. Additionally, we identify the largest group of creators influencing the majority of rug pulls, and demonstrate the connection between rug-pullers of different NFT projects through the use of the same wallets to store and move money. Our study contributes to the understanding of NFT market risks and provides insights for designing preventative strategies to mitigate future losses.

Submitted 15 April, 2023; originally announced April 2023.

arXiv:2304.06082 [pdf, other]

doi 10.1007/JHEP08(2023)202

Towards classification of holographic multi-partite entanglement measures

Authors: Abhijit Gadde, Vineeth Krishna, Trakshu Sharma

Abstract: In this paper, we systematically study the measures of multi-partite entanglement with the aim of constructing those measures that can be computed in probe approximation in the holographic dual. We classify and count general measures as invariants of local unitary transformations. After formulating these measures in terms of permutation group elements, we derive conditions that a probe measure should satisfy and find a large class of solutions. These solutions are generalizations of the multi-entropy introduced in arXiv:2206.09723. We derive their holographic dual with the assumption that the replica symmetry is unbroken in the bulk and check our prescription with explicit computations in $2d$ CFTs. Analogous to the multi-entropy, the holographic dual of these measures is given by the weighted area of the minimal brane-web but with branes having differing tensions. We discuss the replica symmetry assumption and also how the already known entanglement measures, such as entanglement negativity and reflected entropy fit in our framework.

Submitted 28 August, 2023; v1 submitted 12 April, 2023; originally announced April 2023.

Comments: 55 pages, 24 figures; modified the discussion in sections 1, 3, 4 and 7; added references and two figures

arXiv:2303.17424 [pdf, other]

doi 10.3847/1538-4357/acd115

A novel survey for young substellar objects with the W-band filter VI: Spectroscopic census of sub-stellar members and the IMF of $σ$ Orionis cluster

Authors: Belinda Damian, Jessy Jose, Beth Biller, Gregory J. Herczeg, Loic Albert, Katelyn Allers, Zhoujian Zhang, Michael C. Liu, Sophie Dubber, KT Paul, Wen-Ping Chen, Bhavana Lalchand, Tanvi Sharma, Yumiko Oasa

Abstract: Low-mass stars and sub-stellar objects are essential in tracing the initial mass function (IMF). We study the nearby young $σ$ Orionis cluster (d$\sim$408 pc; age$\sim$1.8 Myr) using deep NIR photometric data in J, W and H-bands from WIRCam on the Canada-France-Hawaii Telescope. We use the water absorption feature to photometrically select the brown dwarfs and confirm their nature spectroscopically with the IRTF-SpeX. Additionally we select candidate low-mass stars for spectroscopy and analyze their membership and that of literature sources using astrometry from Gaia DR3. We obtain the near-IR spectra for 28 very low-mass stars and brown dwarfs and estimate their spectral type between M3-M8.5 (mass ranging between 0.3-0.01 M$_{\odot}$). Apart from these, we also identify 5 new planetary mass candidates which require further spectroscopic confirmation of youth. We compile the comprehensive catalog of 170 spectroscopically confirmed members in the central region of the cluster, for a wide mass range of $\sim$19-0.004 M$_{\odot}$. We estimate the star/BD ratio to be $\sim$4, within the range reported for other nearby star forming regions. With the updated catalog of members we trace the IMF down to 4 M$_\mathrm{Jup}$ and we find that a two-segment power-law fits the sub-stellar IMF better than the log-normal distribution.

Submitted 30 March, 2023; originally announced March 2023.

Comments: Accepted for publication in The Astrophysical Journal (ApJ). 27 pages, 9 figures, 2 tables

arXiv:2303.08729 [pdf, other]

DACOS-A Manually Annotated Dataset of Code Smells

Authors: Himesh Nandani, Mootez Saad, Tushar Sharma

Abstract: Researchers apply machine-learning techniques for code smell detection to counter the subjectivity of many code smells. Such approaches need a large, manually annotated dataset for training and benchmarking. Existing literature offers a few datasets; however, they are small in size and, more importantly, do not focus on the subjective code snippets. In this paper, we present DACOS, a manually annotated dataset containing 10,267 annotations for 5,192 code snippets. The dataset targets three kinds of code smells at different granularity: multifaceted abstraction, complex method, and long parameter list. The dataset is created in two phases. The first phase helps us identify the code snippets that are potentially subjective by determining the thresholds of metrics used to detect a smell. The second phase collects annotations for potentially subjective snippets. We also offer an extended dataset DACOSX that includes definitely benign and definitely smelly snippets by using the thresholds identified in the first phase. We have developed TagMan, a web application to help annotators view and mark the snippets one-by-one and record the provided annotations. We make the datasets and the web application accessible publicly. This dataset will help researchers working on smell detection techniques to build relevant and context-aware machine-learning models.

Submitted 15 March, 2023; originally announced March 2023.

Comments: 4 pages

arXiv:2301.04980 [pdf, other]

doi 10.1007/JHEP05(2023)146

Bound on the central charge of CFTs in large dimension

Authors: Abhijit Gadde, Mrunmay Jagadale, Shraiyance Jain, Trakshu Sharma

Abstract: In this paper, we use crossing symmetry and unitarity constraints to put a lower bound on the central charge of conformal field theories in large space-time dimensions $D$. Specifically, we work with the four-point function of identical scalars $φ$ with scaling dimension $Δ_φ$, and use a certain class of analytic functionals to show that the OPE coefficient squared $c^2_{φφT^{μν}}$ must be exponentially small in $D$. For this to hold, we need to make a mild assumption about the nature of the spectrum below $2Δ_φ$. Our argument is robust and can be applied to any OPE coefficient squared $c^2_{φφO}$ with $Δ_O< 2Δ_φ$. This suggests that conformal field theories in large dimensions (if they exist) must be exponentially close to generalized free field theories.

Submitted 12 January, 2023; originally announced January 2023.

Comments: 23 pages, 8 figures

arXiv:2301.02211 [pdf, other]

Teaching Computer Vision for Ecology

Authors: Elijah Cole, Suzanne Stathatos, Björn Lütjens, Tarun Sharma, Justin Kay, Jason Parham, Benjamin Kellenberger, Sara Beery

Abstract: Computer vision can accelerate ecology research by automating the analysis of raw imagery from sensors like camera traps, drones, and satellites. However, computer vision is an emerging discipline that is rarely taught to ecologists. This work discusses our experience teaching a diverse group of ecologists to prototype and evaluate computer vision systems in the context of an intensive hands-on summer workshop. We explain the workshop structure, discuss common challenges, and propose best practices. This document is intended for computer scientists who teach computer vision across disciplines, but it may also be useful to ecologists or other domain experts who are learning to use computer vision themselves.

Submitted 5 January, 2023; originally announced January 2023.

arXiv:2211.06254 [pdf, other]

Re-visiting Reservoir Computing architectures optimized by Evolutionary Algorithms

Authors: Sebastián Basterrech, Tarun Kumar Sharma

Abstract: For many years, Evolutionary Algorithms (EAs) have been applied to improve Neural Network (NN) architectures. They have been used for solving different problems, such as training the networks (adjusting the weights), designing network topology, optimizing global parameters, and selecting features. Here, we provide a brief systematic survey of applications of EAs in the specific domain of the recurrent NNs named Reservoir Computing (RC). At the beginning of the 2000s, the RC paradigm appeared as a good option for employing recurrent NNs without dealing with the inconveniences of the training algorithms. RC models use a nonlinear dynamic system, with a fixed recurrent neural network named the reservoir, and the learning process is restricted to adjusting a linear parametric function. However, an RC model has several hyper-parameters, therefore EAs are helpful tools to figure out optimal RC architectures. We provide an overview of the results in the area, discuss novel advances, and present our vision regarding the new trends and still open questions.

Submitted 11 November, 2022; originally announced November 2022.

Comments: Accepted manuscript to the 14th World Congress on Nature and Biologically Inspired Computing (NaBIC), Seattle, WA, United States, December 14-16, 2022. A revised manuscript will be published in the conference proceedings by Springer in the Lecture Notes in Networks and Systems

arXiv:2209.13128 [pdf, other]

Report of the Topical Group on Physics Beyond the Standard Model at Energy Frontier for Snowmass 2021

Authors: Tulika Bose, Antonio Boveia, Caterina Doglioni, Simone Pagan Griso, James Hirschauer, Elliot Lipeles, Zhen Liu, Nausheen R. Shah, Lian-Tao Wang, Kaustubh Agashe, Juliette Alimena, Sebastian Baum, Mohamed Berkat, Kevin Black, Gwen Gardner, Tony Gherghetta, Josh Greaves, Maxx Haehn, Phil C. Harris, Robert Harris, Julie Hogan, Suneth Jayawardana, Abraham Kahn, Jan Kalinowski, Simon Knapen, et al. (297 additional authors not shown)

Abstract: This is the Snowmass2021 Energy Frontier (EF) Beyond the Standard Model (BSM) report. It combines the EF topical group reports of EF08 (Model-specific explorations), EF09 (More general explorations), and EF10 (Dark Matter at Colliders). The report includes a general introduction to BSM motivations and the comparative prospects for proposed future experiments for a broad range of potential BSM models and signatures, including compositeness, SUSY, leptoquarks, more general new bosons and fermions, long-lived particles, dark matter, charged-lepton flavor violation, and anomaly detection.

Submitted 18 October, 2022; v1 submitted 26 September, 2022; originally announced September 2022.

Comments: 108 pages + 38 pages references and appendix, 37 figures, Report of the Topical Group on Beyond the Standard Model Physics at Energy Frontier for Snowmass 2021. The first nine authors are the Conveners, with Contributions from the other authors

arXiv:2209.02438 [pdf]

Threat Detection In Self-Driving Vehicles Using Computer Vision

Authors: Umang Goenka, Aaryan Jagetia, Param Patil, Akshay Singh, Taresh Sharma, Poonam Saini

Abstract: On-road obstacle detection is an important field of research that falls in the scope of intelligent transportation infrastructure systems. The use of vision-based approaches results in an accurate and cost-effective solution to such systems. In this research paper, we propose a threat detection mechanism for autonomous self-driving cars that uses dashcam videos to detect any unwanted obstacle on the road that falls within the vehicle's visual range. This information can help the vehicle's program stay en route safely. There are four major components, namely, YOLO to identify the objects, an advanced lane detection algorithm, a multi regression model to measure the distance of the object from the camera, the two-second rule for measuring the safety, and limiting speed. In addition, we have used the Car Crash Dataset (CCD) for calculating the accuracy of the model. The YOLO algorithm gives an accuracy of around 93%. The final accuracy of our proposed Threat Detection Model (TDM) is 82.65%.

Submitted 6 September, 2022; originally announced September 2022.

Comments: Presented in 3rd International Conference on Machine Learning, Image Processing, Network Security and Data Sciences MIND-2021

arXiv:2208.08637 [pdf, other]

doi 10.3847/1538-3881/ac8547

A Novel Survey for Young Substellar Objects with the W band Filter. V. IC 348 and Barnard 5 in the Perseus Cloud

Authors: Bhavana Lalchand, Wen-Ping Chen, Beth A. Biller, Loic Albert, Katelyn Allers, Sophie Dubber, Zhoujian Zhang, Michael C. Liu, Jessy Jose, Belinda Damian, Tanvi Sharma, Mickael Bonnefoy, Yumiko Oasa

Abstract: We report the discovery of substellar objects in the young star cluster IC 348 and the neighboring Barnard 5 dark cloud, both at the eastern end of the Perseus star-forming complex. The substellar candidates are selected using narrowband imaging, i.e., on and off photometric technique with a filter centered around the water absorption feature at 1.45 microns, a technique proven to be efficient in detecting water-bearing substellar objects. Our spectroscopic observations confirm three brown dwarfs in IC 348. In addition, the source WBIS 03492858+3258064, reported in this work, is the first confirmed brown dwarf discovered toward Barnard 5. Together with the young stellar population selected via near- and mid-infrared colors using the Two Micron All Sky Survey and the Wide-field Infrared Survey Explorer, we diagnose the relation between stellar versus substellar objects with the associated molecular clouds. Analyzed by Gaia EDR3 parallaxes and kinematics of the cloud members across the Perseus region, we propose the star formation scenario of the complex under influence of the nearby OB association.

Submitted 18 August, 2022; originally announced August 2022.

Comments: Accepted for publication in the Astrophysical Journal. 24 pages, 14 figures, 5 tables

arXiv:2206.13910 [pdf, other]

Epidemic Control Modeling using Parsimonious Models and Markov Decision Processes

Authors: Edilson F. Arruda, Tarun Sharma, Rodrigo e A. Alexandre, Sinnu Susan Thomas

Abstract: Many countries have experienced at least two waves of the COVID-19 pandemic. The second wave is far more dangerous as distinct strains appear more harmful to human health, but it stems from the complacency about the first wave. This paper introduces a parsimonious yet representative stochastic epidemic model that simulates the uncertain spread of the disease regardless of the latency and recovery time distributions. We also propose a Markov decision process to seek an optimal trade-off between the usage of the healthcare system and the economic costs of an epidemic. We apply the model to COVID-19 data from New Delhi, India and simulate the epidemic spread with different policy review times. The results show that the optimal policy acts swiftly to curb the epidemic in the first wave, thus avoiding the collapse of the healthcare system and the future costs of posterior outbreaks. An analysis of the recent collapse of the healthcare system of India during the second COVID-19 wave suggests that many lives could have been preserved if swift mitigation was promoted after the first wave.

Submitted 23 June, 2022; originally announced June 2022.

arXiv:2206.13535 [pdf, other]

doi 10.3847/1538-3881/ac7bea

India's first robotic eye for time domain astrophysics: the GROWTH-India telescope

Authors: Harsh Kumar, Varun Bhalerao, G. C. Anupama, Sudhanshu Barway, Judhajeet Basu, Kunal Deshmukh, Kishalay De, Anirban Dutta, Christoffer Fremling, Hrishikesh Iyer, Adeem Jassani, Simran Joharle, Viraj Karambelkar, Maitreya Khandagale, K Adithya Krishna, Sumeet Kulkarni, Sujay Mate, Atharva Patil, DVS Phanindra, Subham Samantaray, Kritti Sharma, Yashvi Sharma, Vedant Shenoy, Avinash Singh, Shubham Srivastava, et al. (13 additional authors not shown)

Abstract: We present the design and performance of the GROWTH-India telescope, a 0.7 m robotic telescope dedicated to time-domain astronomy. The telescope is equipped with a 4k back-illuminated camera giving a 0.82-degree field of view and sensitivity of m_g ~20.5 in 5-min exposures. Custom software handles observatory operations: attaining high on-sky observing efficiencies (>~ 80%) and allowing rapid response to targets of opportunity. The data processing pipelines are capable of performing PSF photometry as well as image subtraction for transient searches. We also present an overview of the GROWTH-India telescope's contributions to the studies of Gamma-ray Bursts, the electromagnetic counterparts to gravitational wave sources, supernovae, novae and solar system objects.

Submitted 27 June, 2022; originally announced June 2022.

Comments: 17 pages, 8 figures, Accepted for publication in The Astronomical Journal

arXiv:2206.09723 [pdf, other]

doi 10.1103/PhysRevD.106.126001

A new multi-partite entanglement measure and its holographic dual

Authors: Abhijit Gadde, Vineeth Krishna, Trakshu Sharma

Abstract: In this letter we define a natural generalization of the von Neumann entropy to multiple parties that is symmetric with respect to all the parties. We call this measure multi-entropy. We show that for conformal field theories with holographic duals, the multi-entropy is computed by the area of an appropriate "soap-film" anchored on the boundary. We conjecture the quantum version of this prescription that takes into account the sub-leading corrections in G_N.

Submitted 11 January, 2023; v1 submitted 20 June, 2022; originally announced June 2022.

Comments: 7 pages, 3 figures, added comments about the junction contributions and covariant generalization, added a discussion on analytic continuation

Report number: TIFR/TH/22-34

arXiv:2204.11193 [pdf, other]

Exploring Security Practices of Smart Contract Developers

Authors: Tanusree Sharma, Zhixuan Zhou, Andrew Miller, Yang Wang

Abstract: Smart contracts are self-executing programs that run on blockchains (e.g., Ethereum). 680 million US dollars worth of digital assets controlled by smart contracts have been hacked or stolen due to various security vulnerabilities in 2021. Although security is a fundamental concern for smart contracts, it is unclear how smart contract developers approach security. To help fill this research gap, we conducted an exploratory qualitative study consisting of a semi-structured interview and a code review task with 29 smart contract developers with diverse backgrounds, including 10 early stage (less than one year of experience) and 19 experienced (2-5 years of experience) smart contract developers. Our findings show a wide range of smart contract security perceptions and practices including various tools and resources they used. Our early-stage developer participants had a much lower success rate (15%) of identifying security vulnerabilities in the code review task than their experienced counterparts (55%). Our hierarchical task analysis of their code reviews implies that merely accessing standard documentation, reference implementations and security tools is not sufficient. Many developers checked those materials or used a security tool but still failed to identify the security issues. In addition, several participants pointed out shortcomings of current smart contract security tooling such as its usability. We discuss how future education and tools could better support developers in ensuring smart contract security.

Submitted 24 April, 2022; originally announced April 2022.

arXiv:2204.06462 [pdf, other]

doi 10.1007/JHEP09(2022)157

A Scattering Amplitude for Massive Particles in AdS

Authors: Abhijit Gadde, Trakshu Sharma

Abstract: In this paper, we propose a conformally covariant momentum space representation of CFT correlation functions. We call it the AdS S-matrix. This representation has the property that it reduces to the S-matrix in the flat space limit. The flat space limit in question is taken by keeping all the particle masses fixed as the operator conformal dimensions go to infinity along with the AdS radius $\mathtt{R}$. We give Feynman-like rules to compute the AdS S-matrix in $1/ \mathtt{R}$ perturbation theory. Moreover, we relate it to the Mellin space representation of the conformal correlators in $1/ \mathtt{R}$ perturbation theory.

Submitted 6 May, 2022; v1 submitted 13 April, 2022; originally announced April 2022.

Comments: 42 pages, 5 figures, version 2, minor changes have been made

arXiv:2204.02438 [pdf]

Revealing the Charge Transfer Dynamics Between Singlet Fission Molecule and Hybrid Perovskite Nanocrystals

Authors: Tejasvini Sharma, Saurav Saini, Naveen Kumar Tailor, Mahesh Kumar, Soumitra Satapathi

Abstract: The singlet fission process has gained considerable attention because of its potential to enhance photovoltaic efficiency and break the Shockley-Queisser limit. In photovoltaic devices, perovskite materials have shown tremendous progress in the last decade. Therefore, combining singlet fission materials with perovskite devices can lead to a drastic enhancement in their performance. To reveal the applicability of singlet fission processes in perovskite materials, we have investigated the charge transfer dynamics from an SF-active material, 9,10-bis(phenylethynyl)anthracene (BPEA), to CH3NH3PbBr3 perovskite nanocrystals using transient absorption spectroscopy. We observed a significant charge transfer from the coupled triplet state of BPEA to the conduction band of CH3NH3PbBr3 on a picosecond timescale. The observation of a shortened lifetime in a mixture of BPEA and CH3NH3PbBr3 nanocrystals confirms the significant charge transfer between these systems. Our study reveals the charge transfer mechanism in a singlet fission-perovskite composite, which will help to develop an advanced photovoltaic system.

Submitted 5 April, 2022; originally announced April 2022.

arXiv:2203.15950 [pdf, other]

doi 10.1145/3524842.3528032

Empirical Standards for Repository Mining

Authors: Preetha Chatterjee, Tushar Sharma, Paul Ralph

Abstract: The purpose of scholarly peer review is to evaluate the quality of scientific manuscripts. However, study after study demonstrates that peer review neither effectively nor reliably assesses research quality. Empirical standards attempt to address this problem by modelling a scientific community's expectations for each kind of empirical study conducted in that community. This should enhance not only the quality of research but also the reliability and predictability of peer review, as scientists adopt the standards in both their researcher and reviewer roles. However, these improvements depend on the quality and adoption of the standards. This tutorial will therefore present the empirical standard for mining software repositories, both to communicate its contents and to get feedback from the attendees. The tutorial will be organized into three parts: (1) brief overview of the empirical standards project; (2) detailed presentation of the repository mining standard; (3) discussion and suggestions for improvement.

Submitted 29 March, 2022; originally announced March 2022.

arXiv:2203.10243 [pdf, other]

doi 10.3847/1538-4357/ac510b

Diagnosing Triggered Star Formation in the Galactic H II region Sh 2-142

Authors: Tanvi Sharma, Wen Ping Chen, Neelam Panwar, Yan Sun, Yu Gao

Abstract: Stars are formed by gravitational collapse, spontaneously or, in some cases under the constructive influence of nearby massive stars, out of molecular cloud cores. Here we present an observational diagnosis of such triggered formation processes in the prominent H II region Sh 2-142, which is associated with the young star cluster NGC 7380, and with some bright-rimmed clouds as the signpost of photoionization of molecular cloud surfaces. Using near- (2MASS) and mid-infrared (WISE) colors, we identified candidate young stars at different evolutionary stages, including embedded infrared sources having spectral energy distributions indicative of active accretion. We have also used data from our optical observations to be used in SEDs, and from Gaia EDR3 to study the kinematics of young objects. With this young stellar sample, together with the latest CO line emission data (spectral resolution $\sim 0.16$ km s$^{-1}$, sensitivity $\sim 0.5$ K), a positional and ageing sequence relative to the neighboring cloud complex, and to the bright-rimmed clouds, is inferred. The propagating stellar birth may be responsible, at least partially, for the formation of the cluster a few million years ago, and for the ongoing activity now witnessed in the cloud complex.

Submitted 19 March, 2022; originally announced March 2022.

Comments: 15 figures, 2 tables, Accepted for publication in ApJ
