Preliminary Causal Discovery Results with Software Effort Estimation Data

Authors:

Robert Stoddard,

Michael Konrad

ISEC '18: Proceedings of the 11th Innovations in Software Engineering Conference

Article No.: 6, Pages 1 - 11

https://doi.org/10.1145/3172871.3172876

Published: 09 February 2018

Abstract

Correlation does not imply causation. Though this is a well-known fact, most analyses rely on correlation as evidence of relationships that are then treated as causal. Causal discovery, also referred to as causal model search, applies statistical methods to identify causal relationships from conditional independences (and/or other statistical relationships) in the data. Though software cost estimation models use both domain knowledge and statistics, no published report to date describes the evaluation of a software dataset using causal discovery. Two of the authors have previously used regression analysis to evaluate the effectiveness of the International Function Points User Group (IFPUG)'s and the Common Software Measurement International Consortium (COSMIC)'s functional size measurement methods for analyzing the Unified Code Count (UCC)'s dataset of maintenance tasks. Using the same dataset, this paper reports what types of information causal discovery provides and how they differ from correlation tests. The paper introduces causal discovery to software engineering research, and its future use may affect how software effort models are built.
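The distinction the abstract draws between correlation and conditional independence can be illustrated with a small simulation. The sketch below is not the authors' method and does not use the UCC dataset: the variable names (`size`, `effort`, `defects`), the coefficients, and the linear-Gaussian setup are all assumptions made for illustration, and partial correlation stands in here for the conditional independence tests that causal discovery algorithms typically rely on.

```python
import math
import random


def pearson(a, b):
    """Plain Pearson correlation between two equal-length lists."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def partial_corr(a, b, c):
    """Correlation of a and b after linearly controlling for c."""
    r_ab = pearson(a, b)
    r_ac = pearson(a, c)
    r_bc = pearson(b, c)
    return (r_ab - r_ac * r_bc) / math.sqrt((1 - r_ac ** 2) * (1 - r_bc ** 2))


def simulate(n=5000, seed=0):
    """Hypothetical data: size is a common cause of effort and defects."""
    rng = random.Random(seed)
    size = [rng.gauss(0, 1) for _ in range(n)]
    # Neither outcome influences the other; both depend only on size.
    effort = [0.8 * s + rng.gauss(0, 1) for s in size]
    defects = [0.8 * s + rng.gauss(0, 1) for s in size]
    return size, effort, defects


size, effort, defects = simulate()
marginal = pearson(effort, defects)                # clearly non-zero
conditional = partial_corr(effort, defects, size)  # close to zero
print(f"corr(effort, defects)        = {marginal:.3f}")
print(f"corr(effort, defects | size) = {conditional:.3f}")
```

In this simulated setting, `effort` and `defects` are correlated even though neither causes the other, and the association largely disappears once `size` is conditioned on. A correlation test alone would report only the first number; a search over conditional independences uses the second to rule out a direct edge between the two variables.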

Cited By

Siebert J. (2023). Applications of statistical causal inference in software engineering. Information and Software Technology 159:C. Online publication date: 10-May-2023.
https://dl.acm.org/doi/10.1016/j.infsof.2023.107198
Hu Y., Luo W., Hu Z. (2023). A practical approach to explaining defect proneness of code commits by causal discovery. Engineering Applications of Artificial Intelligence 123 (106187). Online publication date: Aug-2023.
https://doi.org/10.1016/j.engappai.2023.106187
Rao K., Rao G. (2020). RETRACTED ARTICLE: Ensemble learning with recursive feature elimination integrated software effort estimation: a novel approach. Evolutionary Intelligence 14:1 (151-162). Online publication date: 17-Feb-2020.
https://doi.org/10.1007/s12065-020-00360-5

Published In

ISEC '18: Proceedings of the 11th Innovations in Software Engineering Conference

February 2018

154 pages

ISBN:9781450363983

DOI:10.1145/3172871

General Chairs: Y. Raghu Reddy (IIIT Hyderabad), Vasudeva Varma (IIIT Hyderabad)

Program Chairs: Jane Cleland-Huang (University of Notre Dame), Umesh Bellur (IIT Bombay), Shubhashis Sengupta (Accenture, India), Naveen Sharma (RIT, New York/Quantiply Corp.), Ramesh Loganathan (IIIT Hyderabad)

Publications Chairs: Richa Sharma (BML Munjal University, India), Santonu Sarkar (BITS Pilani, Goa)

Copyright © 2018 ACM.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from [email protected]

In-Cooperation

iSOFT: iSOFT

Publisher

Association for Computing Machinery

New York, NY, United States

Qualifiers

Research-article
Research
Refereed limited

Conference

ISEC '18

ISEC '18: Innovations in Software Engineering Conference

February 9 - 11, 2018

Hyderabad, India

Acceptance Rates

Overall Acceptance Rate 76 of 315 submissions, 24%
