Spatial-Temporal Computer Vision Methods for Automated Vision-Based Visual Inspection

dc.contributor.author: Midwinter, Max Xuhao Xue
dc.date.accessioned: 2026-06-08T20:18:04Z
dc.date.available: 2026-06-08T20:18:04Z
dc.date.issued: 2026-06-08
dc.date.submitted: 2026-05-25
dc.description.abstract: The objective of this thesis is to investigate how spatial and temporal context can be leveraged to enhance automated vision-based visual inspection (AVVI). The prevailing paradigm in AVVI is the single-shot supervised deep semantic inference model, in which each image is processed independently and the resulting semantic prediction is compared against labeled data to generate a supervision signal. While these methods have demonstrated strong performance on defect detection tasks, they often neglect the spatial and temporal context in which inspection data are collected. In practice, engineers rarely make decisions based on a single observation in isolation; instead, they rely on contextual information such as multiple viewpoints of a region of interest, geometric cues for estimating defect scale, and comparisons with previous inspection records. This thesis therefore explores how the contextual information inherent in inspection workflows can be incorporated directly into the inference process. Three research challenges are investigated: leveraging multi-view imagery to improve defect segmentation, developing and evaluating spatial inference models for defect quantification in civil infrastructure, and enabling visual change detection between unordered sets of inspection data. In Chapter 3, multi-view spatial relationships between inspection images are used to refine segmentations from an unsupervised feature-clustering semantic segmentation model through a novel iterative stochastic consensus algorithm. In Chapter 4, a civil infrastructure RGB-D dataset covering five short- to medium-span overpass bridges is created using a custom handheld Light Detection and Ranging (LiDAR) scanner and is used to benchmark monocular metric depth estimation methods for defect measurement. In Chapter 5, synchronized pairs of novel view synthesis models are constructed to generate pixel-aligned renders of the same structure across inspection events, enabling visual change detection. Finally, Chapter 6 discusses the implications of this research for industrial inspection workflows and possible directions for future work.
dc.identifier.uri: https://hdl.handle.net/10012/23572
dc.language.iso: en
dc.pending: false
dc.publisher: University of Waterloo (en)
dc.subject: visual inspection
dc.subject: AI
dc.subject: deep learning
dc.subject: computer vision
dc.title: Spatial-Temporal Computer Vision Methods for Automated Vision-Based Visual Inspection
dc.type: Doctoral Thesis
uws-etd.degree: Doctor of Philosophy
uws-etd.degree.department: Civil and Environmental Engineering
uws-etd.degree.discipline: Civil Engineering
uws-etd.degree.grantor: University of Waterloo (en)
uws-etd.embargo.terms: 0
uws.contributor.advisor: Yeum, Chul Min
uws.contributor.affiliation1: Faculty of Engineering
uws.peerReviewStatus: Unreviewed (en)
uws.published.city: Waterloo (en)
uws.published.country: Canada (en)
uws.published.province: Ontario (en)
uws.scholarLevel: Graduate (en)
uws.typeOfResource: Text (en)

Files

Original bundle

Name: Midwinter_MaxXuHaoXue.pdf
Size: 110.58 MB
Format: Adobe Portable Document Format

License bundle

Name: license.txt
Size: 6.4 KB
Format: Item-specific license agreed upon to submission