Replication Package for: Smart Tags, Smarter Learning: Improving and Combining AI Tools for a Software Engineering Course
This replication package provides the artifacts needed to support the findings of our study on the use of a QT transformer-based text classification model in a Software Engineering course to detect quality-related tags or labels for issues from GitHub or other issue tracking systems. The study was conducted over two years (2024 and 2025). The primary goal was to explore the educational benefits of the tool and assess the impact on its perceived usefulness. This package contains the anonymized raw data collected from student teams and the Python script used to perform the statistical analyses, including metric calculations.

Contents

The package is organized as follows:

Data_analysis.py: The Python script used for all data processing, analysis, and generation of the figures and summary tables presented in the paper.
issue_details_2024.csv: Contains anonymized data, including user-provided quality tags and model-generated tags, for each student team in the 2024 cohort.
issue_details_2025.csv: Contains the same data structure as the 2024 file, but for the 2025 cohort.

Data Anonymization

To ensure the anonymity of the student participants and to adhere to the requirements of the blind review process, the issue_text column, which contained the original descriptions of the software issues written by the students, has been removed from both datasets (issue_details_2024.csv and issue_details_2025.csv). All other data fields, including team identifiers, user-provided tags, and model-generated tags, have been retained. This allows complete replication of the quantitative analyses, such as the calculation of the performance metrics (e.g., Jaccard score, Hamming loss) presented in the paper.

Full Data Availability

Upon acceptance of the associated research paper, a complete, non-anonymized version of the dataset will be permanently archived and made publicly available on Zenodo.
This full version will include the original issue_text and other metadata, providing a richer resource for the research community and ensuring full ...
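
As an illustration of the performance metrics mentioned above, the following is a minimal sketch of how a Jaccard score and a Hamming loss can be computed for multi-label tag assignments. It is not the code in Data_analysis.py. The label names, the example tag sets, and the function names `mean_jaccard` and `hamming_loss` are hypothetical, and the treatment of two empty tag sets as a perfect match is an assumed convention that the paper's analysis may not share.

```python
# Minimal sketch: multi-label agreement between user-provided and
# model-generated tags. All names and values below are hypothetical and do
# not come from issue_details_2024.csv or issue_details_2025.csv.

# Hypothetical set of quality labels.
LABELS = ["performance", "security", "usability"]


def mean_jaccard(user_tags, model_tags):
    """Average per-issue Jaccard similarity between two lists of tag sets."""
    scores = []
    for user, model in zip(user_tags, model_tags):
        union = user | model
        # Assumed convention: two empty tag sets count as a perfect match.
        scores.append(len(user & model) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


def hamming_loss(user_tags, model_tags, labels):
    """Fraction of (issue, label) decisions on which the two sources disagree."""
    mismatches = 0
    for user, model in zip(user_tags, model_tags):
        for label in labels:
            # A mismatch occurs when exactly one source assigned the label.
            if (label in user) != (label in model):
                mismatches += 1
    return mismatches / (len(user_tags) * len(labels))


# Three hypothetical issues, one tag set per issue.
user_tags = [{"performance"}, {"security", "usability"}, set()]
model_tags = [{"performance", "security"}, {"security"}, set()]

print(f"Jaccard score: {mean_jaccard(user_tags, model_tags):.3f}")
print(f"Hamming loss:  {hamming_loss(user_tags, model_tags, LABELS):.3f}")
```

A higher Jaccard score means the two tag sets overlap more, while a lower Hamming loss means fewer individual label decisions differ. Library implementations of these metrics may handle empty tag sets differently from this sketch.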