Abstract
Stack Overflow provides a means for developers to exchange knowledge. While much previous research on Stack Overflow has focused on questions and answers (Q&A), recent work has shown that discussions in comments also contain rich information. On Stack Overflow, discussions through comments and chat rooms can be tied to questions or answers. In this paper, we conduct an empirical study that focuses on the nature of question discussions. We observe that: (1) Question discussions occur at all phases of the Q&A process, with most beginning before the first answer is received. (2) Both askers and answerers actively participate in question discussions; the likelihood of their participation increases as the number of comments increases. (3) There is a strong positive correlation between the number of question comments and the question answering time (i.e., more discussed questions receive answers more slowly). Our findings suggest that question discussions contain a rich trove of data that is integral to the Q&A process on Stack Overflow. We further suggest how future research can leverage the information in question discussions, along with the commonly studied Q&A information.
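As a minimal sketch of how an association like the one in observation (3) could be measured, the snippet below computes Spearman's rank correlation between the number of comments on a question and the time until it is answered. The abstract does not name the statistic, so the choice of Spearman's rho is an assumption here, and the function names and the toy numbers are hypothetical illustrations, not data from the study.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        # Extend j to cover the run of values tied with the one at position i.
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data, NOT taken from the study.
comment_counts = [0, 1, 2, 3, 5, 8]
hours_to_first_answer = [0.2, 0.5, 0.4, 1.5, 3.0, 12.0]

rho = spearman_rho(comment_counts, hours_to_first_answer)
print(f"Spearman rho = {rho:.3f}")
```

A rank-based measure is a natural fit for this kind of data because comment counts and answering times are typically heavily skewed; a value of rho near 1 would indicate that more-discussed questions tend to wait longer for an answer. The sketch does not guard against a constant input, for which the correlation is undefined.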
Notes
On Stack Overflow, comments are “hidden” (i.e., elided from view) by default when six or more are attached to the same question.
We’ve made our dataset open access on Zenodo: https://zenodo.org/record/5516190
General chat rooms are standard chat rooms on Stack Overflow that are not associated with a question or an answer.
Comments may be deleted by their author, but they may not be edited in place.
Acknowledgments
We would like to thank the anonymous reviewers for their insightful comments. The findings and opinions in this paper belong solely to the authors, and are not necessarily those of Huawei. Moreover, our results do not in any way reflect the quality of Huawei software products.
Ethics declarations
Conflict of Interests
The authors declare that they have no conflict of interest.
Additional information
Communicated by: Nicole Novielli
Cite this article
Zhu, W., Zhang, H., Hassan, A.E. et al. An empirical study of question discussions on Stack Overflow. Empir Software Eng 27, 148 (2022). https://doi.org/10.1007/s10664-022-10180-z