The Role of Customer Segmentation in Churn Prediction
info:eu-repo/semantics/openAccess
English
1525885519
From OAIster®, provided by the OCLC Cooperative.
Introduction
This thesis explores how clustering-based customer segmentation can improve both the predictive performance and the interpretability of churn prediction models in the online gaming industry. Because acquiring new players costs more than retaining existing ones, understanding and anticipating customer churn is vital for long-term profitability in this highly competitive, non-contractual environment.

Research Question
The central research question is: How can different customer segmentation methods, combined with daily behavioral and financial data, be used to develop and evaluate churn prediction models in the online gaming industry, and how do these segmentation approaches affect both model performance and interpretability?

Method
Using a design science research approach, I built a modular churn prediction pipeline that incorporated five segmentation methods: threshold-based segmentation, k-means, agglomerative clustering, Gaussian Mixture Models and HDBSCAN. Each method was used either as a feature in a single XGBoost model or to train separate models per segment, and was compared to a baseline with no segmentation. The pipeline included behavioral, financial and trend-based features, SMOTE to address class imbalance and SHAP for segment-level interpretability. All modeling was performed in Python using libraries such as scikit-learn, XGBoost and SHAP.

Results
Segmentation led to only minor improvements in predictive performance, with AUC and F1-score differences typically under 0.01 and no statistically significant gains in the single-model setup. However, segment-specific models revealed meaningful differences in churn behavior. HDBSCAN reached the highest accuracy (0.8856) and AUC (0.9366), although its interpretability was reduced by a large noise cluster. K-means and agglomerative clustering provided a more balanced trade-off between interpretability and performance, while threshold-based segmentation delivered intuitive, business-aligned clusters. Feature importance analysis confi
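
As an illustration of the two setups named under Method (segment as a feature in a single model versus separate models per segment), the following is a minimal sketch of how threshold-based segmentation could prepare the data for each. It is not the thesis pipeline: the spend thresholds, the segment labels and the field names (`player_id`, `total_spend`, `active_days`) are hypothetical, and no model is trained here.

```python
from collections import defaultdict

# Hypothetical spend thresholds (not taken from the thesis):
# each pair is (exclusive upper bound, segment label).
THRESHOLDS = [(10.0, "low"), (100.0, "mid")]
TOP_SEGMENT = "high"  # label for anything at or above the last bound


def assign_segment(total_spend):
    """Return the label of the first segment whose upper bound exceeds total_spend."""
    for upper, label in THRESHOLDS:
        if total_spend < upper:
            return label
    return TOP_SEGMENT


def segment_as_feature(players):
    """Setup 1: keep one dataset and append the segment label as an extra feature."""
    return [{**p, "segment": assign_segment(p["total_spend"])} for p in players]


def split_by_segment(players):
    """Setup 2: partition players so a separate model can be trained per segment."""
    groups = defaultdict(list)
    for p in players:
        groups[assign_segment(p["total_spend"])].append(p)
    return dict(groups)


# Made-up example records, for illustration only.
players = [
    {"player_id": 1, "total_spend": 2.5, "active_days": 3},
    {"player_id": 2, "total_spend": 45.0, "active_days": 12},
    {"player_id": 3, "total_spend": 350.0, "active_days": 27},
    {"player_id": 4, "total_spend": 8.0, "active_days": 1},
]

single_model_rows = segment_as_feature(players)  # input for one shared classifier
per_segment_rows = split_by_segment(players)     # one training set per segment
```

In the first setup a single classifier would see the segment label alongside the other features. In the second, each partition would be passed to its own classifier, which is what allows feature importance to be compared segment by segment.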