-
Agile Effort Estimation: Have We Solved the Problem Yet? Insights From A Second Replication Study (GPT2SP Replication Report)
Authors:
Vali Tawosi,
Rebecca Moussa,
Federica Sarro
Abstract:
Fu and Tantithamthavorn have recently proposed GPT2SP, a Transformer-based deep learning model for story point (SP) estimation of user stories. They empirically evaluated the performance of GPT2SP on a dataset shared by Choetkiertikul et al., including 16 projects with a total of 23,313 issues. They benchmarked GPT2SP against two baselines (namely the naive Mean and Median estimators) and the method previously proposed by Choetkiertikul et al. (which we will refer to as DL2SP from now on) for both within- and cross-project estimation scenarios, and evaluated the extent to which each component of GPT2SP contributes towards the accuracy of the SP estimates. Their results show that GPT2SP outperforms DL2SP with a 6%-47% improvement in MAE for the within-project scenario and a 3%-46% improvement for the cross-project scenarios. However, when we attempted to use the GPT2SP source code made available by Fu and Tantithamthavorn to reproduce their experiments, we found a bug in the computation of the Mean Absolute Error (MAE), which may have inflated the accuracy of GPT2SP reported in their work. Therefore, we issued a pull request to fix this bug, which has been accepted and merged into their repository at https://github.com/awsm-research/gpt2sp/pull/2.
In this report, we describe the results we achieved by using the fixed version of GPT2SP to replicate the experiments conducted in the original paper for RQ1 and RQ2. Following the original study, we analyse the results considering the Mean Absolute Error (MAE) of the estimation methods over all issues in each project, but we also report the Median Absolute Error (MdAE) and the Standard Accuracy (SA) for completeness.
Submitted 1 September, 2022;
originally announced September 2022.
-
Do Not Take It for Granted: Comparing Open-Source Libraries for Software Development Effort Estimation
Authors:
Rebecca Moussa,
Federica Sarro
Abstract:
In the past two decades, several Machine Learning (ML) libraries have become freely available. Many studies have used such libraries to carry out empirical investigations on predictive Software Engineering (SE) tasks. However, the differences stemming from using one library over another have been overlooked, implicitly assuming that using any of these libraries would provide the user with the same or very similar results. This paper aims at raising awareness of the differences incurred when using different ML libraries for software development effort estimation (SEE), one of the most widely studied SE prediction tasks. To this end, we investigate 4 deterministic machine learners as provided by 3 of the most popular ML open-source libraries written in different languages (namely, Scikit-Learn, Caret and Weka). We carry out a thorough empirical study comparing the performance of the machine learners on 5 SEE datasets in the two most common SEE scenarios (i.e., out-of-the-box-ml and tuned-ml) as well as an in-depth analysis of the documentation and code of their APIs. The results of our study reveal that the predictions provided by the 3 libraries differ in 95% of the cases on average across a total of 105 cases studied. These differences are significantly large in most cases and yield misestimations of up to approx. 3,000 hours per project. Moreover, our API analysis reveals that these libraries provide the user with different levels of control over the parameters one can manipulate, and an overall lack of clarity and consistency, which might mislead users. Our findings highlight that the ML library is an important design choice for SEE studies, which can lead to a difference in performance. However, such a difference is under-documented. We conclude by highlighting open challenges with suggestions for the developers of libraries as well as for the researchers and practitioners using them.
Submitted 4 July, 2022;
originally announced July 2022.
-
On The Effectiveness of One-Class Support Vector Machine in Different Defect Prediction Scenarios
Authors:
Rebecca Moussa,
Danielle Azar,
Federica Sarro
Abstract:
Defect prediction aims at identifying software components that are likely to cause faults before software is made available to the end-user. To date, this task has been modeled as a two-class classification problem; however, its nature also allows it to be formulated as a one-class classification task. Previous studies show that One-Class Support Vector Machine (OCSVM) can outperform two-class classifiers for within-project defect prediction; however, it is not effective when employed at a finer granularity (i.e., commit-level defect prediction). In this paper, we further investigate whether learning from one class only is sufficient to produce effective defect prediction models in two other scenarios (i.e., granularities), namely cross-version and cross-project defect prediction, and we replicate the previous work at within-project granularity for completeness. Our empirical results confirm that OCSVM performance remains low at different granularity levels, that is, it is outperformed by the two-class Random Forest (RF) classifier for both cross-version and cross-project defect prediction. While we cannot conclude that OCSVM is the best classifier, our results still show interesting findings. Although OCSVM does not outperform RF, it still achieves performance superior to its two-class counterpart (i.e., SVM) as well as other two-class classifiers studied herein. We also observe that OCSVM is more suitable for both cross-version and cross-project defect prediction than for within-project defect prediction, thus suggesting it performs better with heterogeneous data. We encourage further research on one-class classifiers for defect prediction as these techniques may serve as an alternative when data about defective modules is scarce or not available.
Submitted 23 March, 2024; v1 submitted 24 February, 2022;
originally announced February 2022.
-
A Versatile Dataset of Agile Open Source Software Projects
Authors:
Vali Tawosi,
Afnan Al-Subaihin,
Rebecca Moussa,
Federica Sarro
Abstract:
Agile software development is nowadays a widely adopted practice in both open-source and industrial software projects. Agile teams typically rely heavily on issue management tools to document new issues and keep track of outstanding ones, in addition to storing their technical details, effort estimates, assignment to developers, and more. Previous work utilised the historical information stored in issue management systems for various purposes; however, when researchers make their empirical data public, it is usually relevant solely to the study's objective. In this paper, we present a more holistic and versatile dataset containing a wealth of information on more than 500,000 issues from 44 open-source Agile software projects, making it well-suited to several research avenues, and cross-analyses therein, including effort estimation, issue prioritization, issue assignment and many more. We make this data publicly available on GitHub to facilitate ease of use, maintenance, and extensibility.
Submitted 2 February, 2022;
originally announced February 2022.
-
Agile Effort Estimation: Have We Solved the Problem Yet? Insights From A Replication Study
Authors:
Vali Tawosi,
Rebecca Moussa,
Federica Sarro
Abstract:
In the last decade, several studies have explored automated techniques to estimate the effort of agile software development. We perform a close replication and extension of a seminal work proposing the use of Deep Learning for Agile Effort Estimation (namely Deep-SE), which has set the state-of-the-art since. Specifically, we replicate three of the original research questions aiming at investigating the effectiveness of Deep-SE for both within-project and cross-project effort estimation. We benchmark Deep-SE against three baselines (i.e., Random, Mean and Median effort estimators) and a previously proposed method to estimate agile software project development effort (dubbed TF/IDF-SVM), as done in the original study. To this end, we use the data from the original study and an additional dataset of 31,960 issues mined from TAWOS, as using more data allows us to strengthen the confidence in the results, and to further mitigate external validity threats. The results of our replication show that Deep-SE outperforms the Median baseline estimator and TF/IDF-SVM in only very few cases with statistical significance (8/42 and 9/32 cases, respectively), thus confounding previous findings on the efficacy of Deep-SE. The two additional RQs revealed that neither augmenting the training set nor pre-training Deep-SE leads to an improvement of its accuracy and convergence speed. These results suggest that using semantic similarity is not enough to differentiate user stories with respect to their story points; thus, future work has yet to explore and find new techniques and features that obtain accurate agile software development estimates.
Submitted 17 December, 2022; v1 submitted 14 January, 2022;
originally announced January 2022.
-
A Unified Approach to Computing the Zeros of Classical Orthogonal Polynomials
Authors:
Ridha Moussa,
James Tipton
Abstract:
The authors present a unified method for calculating the zeros of the classical orthogonal polynomials based upon the electrostatic interpretation and its connection to the energy minimization problem. Examples are given with error estimates for three cases of the Jacobi polynomials, three cases of the Laguerre polynomials, and the Hermite polynomials. In the case of the Chebyshev polynomials, exact errors are given.
Submitted 19 September, 2021;
originally announced September 2021.
-
FinQA: A Dataset of Numerical Reasoning over Financial Data
Authors:
Zhiyu Chen,
Wenhu Chen,
Charese Smiley,
Sameena Shah,
Iana Borova,
Dylan Langdon,
Reema Moussa,
Matt Beane,
Ting-Hao Huang,
Bryan Routledge,
William Yang Wang
Abstract:
The sheer volume of financial statements makes it difficult for humans to access and analyze a business's financials. Robust numerical reasoning likewise faces unique challenges in this domain. In this work, we focus on answering deep questions over financial data, aiming to automate the analysis of a large corpus of financial documents. In contrast to existing tasks in the general domain, the finance domain includes complex numerical reasoning and understanding of heterogeneous representations. To facilitate analytical progress, we propose a new large-scale dataset, FinQA, with Question-Answering pairs over Financial reports, written by financial experts. We also annotate the gold reasoning programs to ensure full explainability. We further introduce baselines and conduct comprehensive experiments on our dataset. The results demonstrate that popular, large, pre-trained models fall far short of expert humans in acquiring finance knowledge and in complex multi-step numerical reasoning on that knowledge. Our dataset -- the first of its kind -- should therefore enable significant, new community research into complex application domains. The dataset and code are publicly available at https://github.com/czyssrs/FinQA.
Submitted 7 May, 2022; v1 submitted 31 August, 2021;
originally announced September 2021.
-
Decomposition formulae for Dirichlet forms and their corollaries
Authors:
Ali BenAmor,
Rafed Moussa
Abstract:
We provide decompositions of Dirichlet forms into recurrent and transient parts as well as into conservative and dissipative parts, in the framework of Hausdorff state spaces. Combining both formulae we write every Dirichlet form as the sum of a recurrent, dissipative and transient conservative Dirichlet forms. Besides, we prove that Mosco convergence preserves invariant sets and that a Dirichlet form shares the same invariant sets with its approximating Dirichlet forms E(t) and E(?). Finally, we show the equivalence between conservativeness (resp. dissipativity) of a Dirichlet form and the conservativeness (resp. dissipativity) of E(t) and E(?). The results are illustrated by some examples.
Submitted 1 July, 2019;
originally announced July 2019.
-
Computations and global properties for traces of Bessel's Dirichlet form
Authors:
Ali BenAmor,
Rafed Moussa
Abstract:
We compute explicitly traces of the Dirichlet form related to the Bessel process with respect to discrete measures as well as measures of mixed type. Then some global properties of the obtained Dirichlet forms, such as conservativeness, irreducibility and compact embedding for their domains are discussed.
Submitted 22 January, 2019;
originally announced January 2019.
-
Limits on the amplification of evanescent waves of left-handed materials
Authors:
Th. Koschny,
R. Moussa,
C. M. Soukoulis
Abstract:
We investigate the transfer function of the discretized perfect lens in finite-difference time-domain (FDTD) and transfer matrix (TMM) simulations; the latter allow us to eliminate the problems associated with the explicit time dependence in FDTD simulations. We argue that the peak observed in the FDTD transfer function near the maximum parallel momentum $k_{\|,\mathrm{max}}$ is due to finite time artifacts. We also find that the finite discretization mesh acts like imaginary deviations from $μ=ε=-1$ and leads to a cross-over in the transfer function from a constant value to exponential decay around $k_{\|,\mathrm{max}}$, limiting the attainable super-resolution. We propose a simple qualitative model to describe the impact of the discretization. $k_{\|,\mathrm{max}}$ is found to depend logarithmically on the mesh constant, in qualitative agreement with the TMM simulations.
Submitted 13 April, 2005;
originally announced April 2005.
-
Negative refraction and superlensing in a 2D photonic crystal structure
Authors:
R. Moussa,
S. Foteinopoulou,
Lei Zhang,
G. Tuttle,
K. Guven,
E. Ozbay,
C. M. Soukoulis
Abstract:
We experimentally and theoretically studied a new left-handed (LH) structure based on a photonic crystal (PC) with a negative refractive index. The structure consists of a triangular array of rectangular dielectric bars with dielectric constant 9.61. Experimental and theoretical results demonstrate the negative refraction and the superlensing phenomena in the microwave regime. The results show high transmission for our structure for a wide range of incident angles. Furthermore, surface termination within a specific cut of the structure excites surface waves at the interface between air and PC and allows the reconstruction of evanescent waves for a better focus and better transmission. The normalized average field intensity calculated in both the source and image planes shows almost the same full width at half maximum for the source and the focused beam.
Submitted 27 September, 2004;
originally announced September 2004.