An empirical study of question discussions on Stack Overflow

Empirical Software Engineering

Abstract

Stack Overflow provides a means for developers to exchange knowledge. While much previous research on Stack Overflow has focused on questions and answers (Q&A), recent work has shown that discussions in comments also contain rich information. On Stack Overflow, discussions through comments and chat rooms can be tied to questions or answers. In this paper, we conduct an empirical study that focuses on the nature of question discussions. We observe that: (1) Question discussions occur at all phases of the Q&A process, with most beginning before the first answer is received. (2) Both askers and answerers actively participate in question discussions; the likelihood of their participation increases as the number of comments increases. (3) There is a strong correlation between the number of question comments and the question answering time (i.e., more discussed questions receive answers more slowly). Our findings suggest that question discussions contain a rich trove of data that is integral to the Q&A processes on Stack Overflow. We further suggest how future research can leverage the information in question discussions, along with the commonly studied Q&A information.
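As an illustration of observation (3), the following is a minimal sketch of how a rank correlation between the number of question comments and the answering time could be computed. The abstract does not name the correlation measure, so Spearman's rank correlation is an assumption here, chosen because it is a common option for skewed counts and durations. The function names and the six data points are hypothetical toy values, not data from the study.

```python
from statistics import mean


def average_ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = shared
        i = j + 1
    return ranks


def spearman_rho(xs, ys):
    """Spearman's rho: the Pearson correlation of the two rank vectors."""
    rx, ry = average_ranks(xs), average_ranks(ys)
    mx, my = mean(rx), mean(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    var_x = sum((a - mx) ** 2 for a in rx)
    var_y = sum((b - my) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical toy data (NOT from the paper): one entry per question.
comment_counts = [0, 1, 2, 3, 5, 8]
minutes_to_first_answer = [4, 9, 7, 30, 45, 120]

rho = spearman_rho(comment_counts, minutes_to_first_answer)
print(f"Spearman rho = {rho:.3f}")  # prints "Spearman rho = 0.943"
```

A value of rho close to 1 means that questions with more comments tend to rank later in answering time; it describes association, not causation. The sketch raises a division-by-zero error if either input is constant, which a real analysis would need to handle.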

Notes

  1. https://stackoverflow.com/questions/10801537/

  2. In Stack Overflow, comments are “hidden” (i.e., elided from view) by default when there are six or more attached to the same question.

  3. https://stackoverflow.com/help/privileges/comment

  4. https://chat.stackoverflow.com/faq

  5. We’ve made our dataset open access on Zenodo: https://zenodo.org/record/5516190

  6. General chat rooms are standard chat rooms on Stack Overflow that are not associated with a question or an answer.

  7. https://stackoverflow.com/questions/48956597/

  8. https://stackoverflow.com/questions/25869533/

  9. https://stackoverflow.com/questions/17690956/

  10. https://stackoverflow.blog/2019/11/13/were-rewarding-the-question-askers/

  11. Comments may be deleted by their author, but they may not be edited in place.

  12. https://meta.stackoverflow.com/questions/416059/

  13. https://meta.stackexchange.com/questions/17364/

  14. https://www.codeproject.com/

  15. https://coderanch.com/

  16. https://meta.stackoverflow.com/questions/326494/

  17. https://mail.gnome.org/mailman/listinfo

  18. https://www.atlassian.com/software/jira

Acknowledgments

We would like to thank the anonymous reviewers for their insightful comments. The findings and opinions in this paper belong solely to the authors, and are not necessarily those of Huawei. Moreover, our results do not in any way reflect the quality of Huawei software products.

Author information

Corresponding author

Correspondence to Haoxiang Zhang.

Ethics declarations

Conflict of Interests

The authors declare that they have no conflict of interest.

Additional information

Communicated by: Nicole Novielli

Publisher’s note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

About this article

Cite this article

Zhu, W., Zhang, H., Hassan, A.E. et al. An empirical study of question discussions on Stack Overflow. Empir Software Eng 27, 148 (2022). https://doi.org/10.1007/s10664-022-10180-z

  • DOI: https://doi.org/10.1007/s10664-022-10180-z
