Teaching a robot to assemble a bolt and nut with a handful of demonstrations

Advisor

Tripp, Bryan

Publisher

University of Waterloo

Abstract

This thesis investigates data-efficient methods for learning and executing complex, multi-step robotic manipulation tasks in unstructured environments. A two-level hierarchical framework is first proposed, in which high-level symbolic action planning is performed using Vector Symbolic Architectures (VSA), and low-level 6D gripper trajectories are modeled using Task-parameterized Probabilistic Movement Primitives (TP-ProMPs). This approach enables both interpretable planning and motion generalization from limited human demonstrations. Building on this foundation, the thesis introduces the Task-parameterized Transformer (TP-TF), a unified model that jointly predicts gripper pose trajectories, gripper states, and subtask labels conditioned on object-centric task parameters. Inspired by the parameterization strategy of Task-parameterized Gaussian Mixture Models (TP-GMMs), the TP-TF retains the data efficiency of classical Programming by Demonstration (PbD) methods while leveraging the expressiveness and flexibility of transformer-based architectures. The model is evaluated on a real-world bolt–nut assembly task and achieves a 70% success rate with only 20 demonstrations when combined with visual servoing for precision-critical phases. The results highlight the potential of combining structured representations with deep sequence modeling to bridge symbolic reasoning and continuous control. This work contributes a step toward more scalable, interpretable, and data-efficient learning frameworks for autonomous robotic manipulation.
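To make the abstract's description of the TP-TF concrete, the following is a minimal sketch of a transformer conditioned on object-centric task parameters that jointly outputs pose trajectories, gripper states, and subtask labels. It is an illustrative assumption, not the thesis's actual architecture: the module names, the dimensions (7-D pose as position plus quaternion, a fixed prediction horizon, flattened 4x4 object frames as task parameters), and the use of learned per-timestep queries with a standard PyTorch transformer decoder are all hypothetical choices made for this sketch.

    import torch
    import torch.nn as nn

    class TPTransformer(nn.Module):
        """Hypothetical task-parameterized transformer sketch.

        Task parameters are given as object frames (one flattened 4x4
        homogeneous transform per object, e.g. bolt and nut poses); the
        model cross-attends to them and emits a fixed-horizon trajectory.
        """

        def __init__(self, d_model=128, n_heads=4, n_layers=4,
                     horizon=50, n_subtasks=5):
            super().__init__()
            # Embed each flattened 4x4 object frame into the model width.
            self.frame_embed = nn.Linear(16, d_model)
            # One learned query per predicted timestep.
            self.time_queries = nn.Parameter(torch.randn(horizon, d_model))
            layer = nn.TransformerDecoderLayer(d_model, n_heads,
                                               batch_first=True)
            self.decoder = nn.TransformerDecoder(layer, n_layers)
            # Output heads: 7-D pose (position + unnormalized quaternion),
            # scalar gripper open/close, and per-timestep subtask logits.
            self.pose_head = nn.Linear(d_model, 7)
            self.gripper_head = nn.Linear(d_model, 1)
            self.subtask_head = nn.Linear(d_model, n_subtasks)

        def forward(self, frames):
            # frames: (batch, n_frames, 4, 4) object-centric task parameters.
            b = frames.shape[0]
            memory = self.frame_embed(frames.flatten(2))   # (b, n_frames, d)
            queries = self.time_queries.expand(b, -1, -1)  # (b, horizon, d)
            h = self.decoder(queries, memory)              # attend to frames
            return (self.pose_head(h),                     # (b, horizon, 7)
                    torch.sigmoid(self.gripper_head(h)),   # (b, horizon, 1)
                    self.subtask_head(h))                  # (b, horizon, K)

    # Usage: two identity object frames for a batch of one demonstration.
    model = TPTransformer()
    frames = torch.eye(4).repeat(1, 2, 1, 1)               # (1, 2, 4, 4)
    poses, grip, subtask_logits = model(frames)
    print(poses.shape, grip.shape, subtask_logits.shape)

Conditioning only on the object frames mirrors the TP-GMM idea the abstract cites: once trajectories are expressed relative to task frames, the same handful of demonstrations can generalize to new object placements.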
