Background: Synovial sarcoma is a rare soft tissue tumor that constitutes 5-10% of all soft tissue sarcomas. Early diagnosis and risk stratification are essential for effective management. This study leverages machine learning models to improve the risk stratification of mortality in synovial sarcoma patients.
Methods: This retrospective cohort study utilized data from the Surveillance, Epidemiology, and End Results (SEER) database to analyze patients diagnosed with synovial sarcoma between 2004 and 2015, as well as a validation cohort diagnosed from 2018 onward. The dataset encompassed demographic data, clinical characteristics, staging, treatment modalities, and outcomes. Four machine learning models (support vector classifier [SVC], k-nearest neighbors [KNN], Gaussian naive Bayes, and gradient boosting) were trained and evaluated. Model performance was assessed using sensitivity, specificity, AUC-ROC, and Brier score. SHAP analysis was performed to determine the most influential features impacting model predictions. The top-performing model was validated using the 2018+ cohort.
Results: The study included a total of 762 patients, with an average age of 39.4 years. The support vector classifier (SVC) outperformed the other models, achieving an AUC-ROC of 0.8153 (95% CI: 0.7578 to 0.8677) and a Brier score of 0.1715. Upon external validation with the 2018+ cohort, the SVC model yielded an AUC-ROC of 0.8179 (95% CI: 0.7632 to 0.8666). Key prognostic factors identified through SHAP analysis included tumor size, patient age, presence of metastasis, tumor differentiation, and cancer stage.
Conclusion: Machine learning models can effectively stratify the risk of death in patients with synovial sarcoma. The support vector classifier showed the most promise, with similar discrimination in the development data and in the 2018+ validation cohort.
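The evaluation metrics named in the Methods (sensitivity, specificity, AUC-ROC, and Brier score) can be illustrated with a minimal Python sketch. This is not the study's code: the labels and predicted probabilities are made-up toy values, the 0.5 decision threshold is an assumption, and the function names are hypothetical.

```python
# Toy illustration of the four metrics; all data below is invented, not SEER data.

def sensitivity_specificity(y_true, y_pred):
    """Return (sensitivity, specificity) for binary labels coded 0/1."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def auc_roc(y_true, y_score):
    """AUC as the probability that a random positive case outranks a
    random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


def brier_score(y_true, y_prob):
    """Mean squared difference between predicted probability and outcome;
    lower is better, 0 is a perfect forecast."""
    return sum((p - t) ** 2 for t, p in zip(y_true, y_prob)) / len(y_true)


# Hypothetical outcomes (1 = died, 0 = survived) and predicted probabilities.
y_true = [1, 1, 1, 0, 0, 0, 0, 1]
y_prob = [0.9, 0.8, 0.4, 0.3, 0.2, 0.6, 0.1, 0.7]

# Assumed decision threshold of 0.5 for the class-based metrics.
y_pred = [1 if p >= 0.5 else 0 for p in y_prob]

sens, spec = sensitivity_specificity(y_true, y_pred)
print("sensitivity:", round(sens, 4))
print("specificity:", round(spec, 4))
print("AUC-ROC:", round(auc_roc(y_true, y_prob), 4))
print("Brier score:", round(brier_score(y_true, y_prob), 4))
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC-ROC and the Brier score are computed from the predicted probabilities directly. That is why discrimination (AUC-ROC) and overall probabilistic accuracy (Brier score) are commonly reported together.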