A quantitative linguistic analysis of a cancer online health community with a smooth latent space model. (English) Zbl 07832641

Summary: Online health communities (OHCs) provide free, open, and well-resourced platforms for patients, family members, and others to discuss illnesses, express feelings, and connect with others. Linguistic analysis of OHC posts can help in better understanding disease conditions and in monitoring the emotional and mental status of patients and those close to them. Many existing OHC linguistic analyses are limited by their focus on individual words. The few existing cooccurrence network analyses have multiple methodological limitations. In this article, we analyze posts that are publicly available at the LUNGevity Foundation’s Lung Cancer Support Community (LCSC). The analyzed data contain 21,028 posts published between April 2018 and February 2022. For word cooccurrence network analysis, we develop a two-part latent space model, which advances beyond existing ones by accommodating network weights. Further, we consider the scenario where there are change points in time: networks remain the same between two adjacent change points but differ on the two sides of a change point, and the number and locations of the change points are unknown. A penalized fusion approach is developed to determine the change points and estimate the networks in a data-dependent manner. In the data analysis, multiple change points are identified, which reflect significant changes in the emotional/mental status of lung cancer patients and their close affiliates and mostly align with changes in the COVID-19 pandemic. The obtained network structures and other findings are also sensible.
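
To make the notion of a weighted word cooccurrence network concrete, the following is a minimal Python sketch and not the authors' implementation. It assumes that an edge weight is the number of posts in which two words appear together, and that "two-part" can be read as separating edge presence from positive edge weight; the summary confirms neither assumption. The function names and the toy posts are hypothetical.

```python
from collections import Counter
from itertools import combinations


def cooccurrence_weights(posts):
    """Count, for each unordered word pair, the number of posts containing both words."""
    weights = Counter()
    for post in posts:
        # Naive whitespace tokenization (an assumption); real posts need proper cleaning.
        words = sorted(set(post.lower().split()))
        for pair in combinations(words, 2):
            weights[pair] += 1
    return weights


def split_two_parts(weights, vocabulary):
    """Split a weighted network into edge indicators and positive weights.

    This presence/weight split is an assumed reading of "two-part",
    used here for illustration only.
    """
    present, positive = {}, {}
    for pair in combinations(sorted(vocabulary), 2):
        w = weights.get(pair, 0)
        present[pair] = int(w > 0)
        if w > 0:
            positive[pair] = w
    return present, positive


# Hypothetical toy posts, not taken from the LCSC data.
toy_posts = ["scan results today", "scan anxiety today", "results pending"]
toy_weights = cooccurrence_weights(toy_posts)
toy_present, toy_positive = split_two_parts(toy_weights, {"scan", "today", "pending"})
```

Computing such a network separately for each time window would give the kind of network sequence in which change points could then be sought; the latent space model and the penalized fusion step themselves are not sketched here.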

MSC:

62Pxx Applications of statistics