GenerativeCI: Automating GitHub Actions Workflow Generation using Large Language Models
Continuous Integration (CI) pipelines are critical for modern software development, yet defining and maintaining GitHub Actions workflows remains a highly manual and error-prone process. This thesis presents GenerativeCI, a framework that leverages large language models (LLMs), runtime trace analysis, and retrieval-augmented generation (RAG) to automatically generate robust GitHub Actions workflows. GenerativeCI integrates static artifacts with dynamic execution traces to provide the LLM with semantically rich context. It further employs job-level scoping, vector similarity retrieval, and a self-healing validation step to iteratively refine the generated workflows. We evaluated GenerativeCI across six incrementally enhanced system variants. The baseline version, which relied solely on static shell scripts, achieved an average F1 score of 0.40 on the test set. By progressively integrating static dependencies, heuristic priors, runtime trace retrieval, and job-level scoping, the final GenerativeCI variant improved the F1 score to 0.59, a relative increase of 47.5% over the baseline. This variant also achieved the highest BLEU score (0.51) and the lowest normalized edit distance (62.65) on the test set, demonstrating substantial improvements in both syntactic and semantic accuracy. This work shows that grounding LLM-based generation in runtime-aware context significantly enhances the quality of automatically generated CI workflows. GenerativeCI reduces the manual effort required for CI authoring and lays the groundwork for integrating advanced validation and adaptation strategies in future work.
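The self-healing validation step described in the abstract can be sketched as a generate-validate-refine loop. The sketch below is an illustrative assumption, not the thesis's actual implementation: the function names (`validate_workflow`, `self_heal`) and the minimal validity checks are hypothetical, and a real refiner would re-prompt the LLM with the validation errors rather than use a stub.

```python
# Hypothetical sketch of a self-healing validation loop for generated
# GitHub Actions workflows: validate a candidate, feed errors back to a
# refinement step, and repeat until the workflow passes or rounds run out.
import yaml


def validate_workflow(text):
    """Return a list of error messages for a candidate workflow YAML."""
    try:
        doc = yaml.safe_load(text)
    except yaml.YAMLError as e:
        return [f"YAML parse error: {e}"]
    if not isinstance(doc, dict):
        return ["workflow must be a top-level mapping"]
    errors = []
    # GitHub Actions workflows need an 'on' trigger and at least one job.
    # Note: PyYAML parses a bare 'on' key as the boolean True.
    if "on" not in doc and True not in doc:
        errors.append("missing 'on' trigger")
    if not doc.get("jobs"):
        errors.append("missing 'jobs' section")
    return errors


def self_heal(generate, refine, max_rounds=3):
    """Iteratively regenerate until the workflow validates or rounds are exhausted.

    `generate` produces an initial candidate; `refine` takes (candidate,
    errors) and returns an improved candidate (e.g. via a new LLM call).
    """
    candidate = generate()
    for _ in range(max_rounds):
        errors = validate_workflow(candidate)
        if not errors:
            return candidate, []
        candidate = refine(candidate, errors)
    return candidate, validate_workflow(candidate)
```

In this sketch the validation errors act as structured feedback for the next generation round, which mirrors the iterative refinement the abstract describes; a production system would also run a schema-aware linter rather than these two ad hoc checks.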