Interpretability of automated machine learning methods in psychological research: A tutorial with AutoGluon in Python.
Original Publication: Austin, Tex. : Psychonomic Society, c2005-
Further Information
Integrating artificial intelligence into psychological research represents a significant direction in contemporary psychology. Supervised and unsupervised machine learning techniques can further aid in understanding nonlinear relationships among psychological concepts. In machine learning, variables, referred to as features, can encompass data from psychological scales, text, audio, and images. Current psychological research predominantly relies on frequentist approaches, in which relationships between variables are typically modeled with regression, which often falls short in handling the nonlinear relationships of psychological characteristics. Therefore, we outline a semi-automated workflow that enables psychology researchers to use machine learning algorithms for model selection, supporting the construction of more precise and insightful theoretical frameworks. This approach pursues three primary research objectives: (1) automated hyperparameter tuning to attain optimal models; (2) identification of important features through interpretability techniques, facilitating feature selection based on calculated importance; (3) data-driven insights for theory building based on important features, obtained by integrating exploratory factor analysis with machine learning interpretability. In this paper, we provide an introduction to the basics of machine learning, describe the benefits that automated machine learning offers researchers, and, using psychological resilience research as an example, offer a detailed annotated code workflow along with raw data. This low-code approach, designed with psychological research methodologies in mind, is highly accessible to psychological researchers.
(© 2025. The Psychonomic Society, Inc.)
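Objective (2) in the abstract, ranking features by a calculated importance, can be illustrated with a small sketch. The example below uses permutation importance, one common interpretability technique, which is not necessarily the technique the tutorial itself applies. Everything in it is an assumption made for illustration: the data are synthetic, a hand-written k-nearest-neighbour regressor stands in for the models an AutoML tool would select, and the names `knn_predict`, `permutation_importance`, `relevant_item`, and `irrelevant_item` are hypothetical. The outcome depends on the first feature only through its square, which is the kind of nonlinear relationship a purely linear regression term would miss.

```python
import random


def knn_predict(train_X, train_y, row, k=5):
    """Predict an outcome as the mean outcome of the k nearest training rows."""
    dists = sorted(
        (sum((a - b) ** 2 for a, b in zip(row, tr)), y)
        for tr, y in zip(train_X, train_y)
    )
    return sum(y for _, y in dists[:k]) / k


def mse(train_X, train_y, X, y):
    """Mean squared error of the k-NN model on the rows in X."""
    preds = [knn_predict(train_X, train_y, row) for row in X]
    return sum((p - t) ** 2 for p, t in zip(preds, y)) / len(y)


def permutation_importance(train_X, train_y, X, y, n_repeats=5, seed=0):
    """Average increase in held-out MSE after shuffling each feature column."""
    rng = random.Random(seed)
    baseline = mse(train_X, train_y, X, y)
    importances = []
    for j in range(len(X[0])):
        increases = []
        for _ in range(n_repeats):
            column = [row[j] for row in X]
            rng.shuffle(column)  # break the link between feature j and the outcome
            shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(X, column)]
            increases.append(mse(train_X, train_y, shuffled, y) - baseline)
        importances.append(sum(increases) / n_repeats)
    return importances


# Synthetic data (hypothetical): the outcome is a nonlinear (quadratic)
# function of the first feature; the second feature is pure noise.
data_rng = random.Random(42)


def make_rows(n):
    X = [[data_rng.uniform(-2, 2), data_rng.uniform(-2, 2)] for _ in range(n)]
    y = [x0 ** 2 + data_rng.gauss(0, 0.1) for x0, _ in X]
    return X, y


train_X, train_y = make_rows(200)
test_X, test_y = make_rows(100)

importances = permutation_importance(train_X, train_y, test_X, test_y)
for name, score in zip(["relevant_item", "irrelevant_item"], importances):
    print(f"{name}: {score:.3f}")
```

Under these assumptions, a feature whose shuffling raises the held-out error substantially is a candidate to retain for later steps such as factor analysis, while a feature with an importance near zero is a candidate to drop.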
Declarations. Conflicts of interest: The authors have no relevant financial or non-financial interests to disclose. Ethics approval: The study protocol and informed consent form were approved by the Institutional Review Board of the Institute of Shanghai Pudong New Area Mental Health Center [PDJWLL2023013]. Informed consent was obtained from all participants included in the study. Consent to participate: Informed consent was obtained from all individual participants included in the study. Consent for publication: All participants provided informed consent regarding publishing their data. Declaration of Generative AI and AI-assisted Technologies in the Writing Process: During the preparation of this work, the authors used OpenAI’s ChatGPT to assist with language translation and code checking. After using this tool, the authors reviewed and edited the content as needed and take full responsibility for the content of the publication.