Large Language Models for Build System Maintenance: An Empirical Study of CodeGen’s Next-Line Prediction

dc.contributor.author: Akin-Taylor, Akinbowale
dc.date.accessioned: 2025-01-31T20:10:21Z
dc.date.available: 2025-01-31T20:10:21Z
dc.date.issued: 2025-01-31
dc.date.submitted: 2025-01-23
dc.description.abstract: Build systems play a crucial role in software development: they are responsible for compiling source code into executable programs. Despite their importance, build systems often receive limited attention because their impact is not directly visible to end users. This oversight can lead to inadequate maintenance, frequent build failures, and disruptions that require additional resources. Recognising and addressing the maintenance needs of build systems is therefore essential to preventing costly disruptions and ensuring efficient software production. In this thesis, I explore whether applying a Large Language Model (LLM) can reduce the burden of maintaining build systems. I aim to determine whether the prior content in a build specification provides sufficient context for an LLM to accurately generate subsequent lines. I conduct an empirical study on CodeGen, a state-of-the-art LLM, using a dataset of 13,343 Maven build files. The dataset consists of the Expert dataset from the Apache Software Foundation (ASF) for fine-tuning (9,426 build files) and the Generalised dataset from GitHub for testing (3,917 build files). I observe that (i) fine-tuning on a small portion of the data (i.e., 11% of the fine-tuning dataset) provides the largest improvement in performance, at 13.93%; and (ii) when applied to the Generalised dataset, the fine-tuned model retains 83.86% of its performance, indicating that it is not overfitted. Upon further investigation, I classify build-code content into functional and metadata subgroups based on enclosing tags. The fine-tuned model performs substantially better at suggesting functional build-code than metadata build-code. These findings highlight the potential of leveraging LLMs like CodeGen to alleviate the maintenance challenges associated with build systems, particularly for functional content. My thesis also highlights the limitations of large language models in suggesting the metadata components of build code.
Future research should focus on developing approaches to enhance the accuracy and effectiveness of metadata generation.
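The evaluation setup the abstract describes, next-line prediction over Maven build files with target lines split into functional and metadata subgroups by enclosing tag, can be sketched as follows. This is a minimal illustration, not the thesis's actual pipeline: the tag lists and the naive XML tracking are assumptions made here for the example, and the thesis's real taxonomy of functional versus metadata tags may differ.

```python
import re

# Illustrative tag groupings (assumed, not the thesis's actual taxonomy):
# tags that configure the build vs. tags that describe the project.
FUNCTIONAL_TAGS = {"build", "plugins", "plugin", "dependencies", "dependency",
                   "executions", "profiles", "modules"}
METADATA_TAGS = {"name", "description", "url", "developers", "licenses",
                 "organization", "scm", "issueManagement"}

def next_line_examples(build_file_text):
    """Split a build file into (prefix, next_line) pairs: the model sees all
    prior non-empty lines and must predict the next one."""
    lines = [ln for ln in build_file_text.splitlines() if ln.strip()]
    return [("\n".join(lines[:i]), lines[i]) for i in range(1, len(lines))]

def _open_tags(prefix):
    """Tags opened in the prefix but not yet closed (naive, regex-based)."""
    stack = []
    for closing, tag, selfclose in re.findall(
            r"<(/?)([A-Za-z][\w.-]*)[^>]*?(/?)>", prefix):
        if closing:
            if stack and stack[-1] == tag:
                stack.pop()
        elif not selfclose:
            stack.append(tag)
    return stack

def classify_line(prefix, line):
    """Label the target line by its own tag, falling back to the innermost
    still-open enclosing tag in the prefix."""
    own = re.match(r"\s*<([A-Za-z][\w.-]*)", line)
    candidates = ([own.group(1)] if own else []) \
        + list(reversed(_open_tags(prefix)))
    for tag in candidates:
        if tag in FUNCTIONAL_TAGS:
            return "functional"
        if tag in METADATA_TAGS:
            return "metadata"
    return "other"

# Tiny pom.xml-style example to exercise the two helpers.
POM = """<project>
  <name>demo</name>
  <dependencies>
    <dependency>
      <groupId>junit</groupId>
    </dependency>
  </dependencies>
</project>"""

pairs = next_line_examples(POM)
```

On this toy file, the `<name>` target line is labelled metadata while the `<groupId>` line, nested inside `<dependency>`, is labelled functional, mirroring the functional/metadata split the abstract reports.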
dc.identifier.uri: https://hdl.handle.net/10012/21450
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo
dc.relation.uri: https://zenodo.org/records/14052990
dc.subject: Build systems
dc.subject: Build code generation
dc.subject: CodeGen
dc.subject: Large Language Model
dc.title: Large Language Models for Build System Maintenance: An Empirical Study of CodeGen’s Next-Line Prediction
dc.type: Master Thesis
uws-etd.degree: Master of Mathematics
uws-etd.degree.department: David R. Cheriton School of Computer Science
uws-etd.degree.discipline: Computer Science
uws-etd.degree.grantor: University of Waterloo
uws-etd.embargo.terms: 4 months
uws.contributor.advisor: McIntosh, Shane
uws.contributor.advisor: Nagappan, Meiyappan
uws.contributor.affiliation1: Faculty of Mathematics
uws.peerReviewStatus: Unreviewed
uws.published.city: Waterloo
uws.published.country: Canada
uws.published.province: Ontario
uws.scholarLevel: Graduate
uws.typeOfResource: Text

Files

Original bundle

Name: Akin-Taylor_Akinbowale.pdf
Size: 1.29 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon submission
Description: