Verifying Explanations for Neural Networks
Date
2024-09-05
Authors
Advisor
Gurfinkel, Arie
Publisher
University of Waterloo
Abstract
Deep neural networks (DNNs) have been applied to many complex tasks across multiple domains, many of which directly affect our daily lives: generative models are replacing traditional search engines for answering questions, cars are being driven by neural networks, doctors and radiologists are using neural nets to diagnose patients more efficiently, financial systems are run by automated trading bots, etc. Coupled with the ever-increasing complexity of DNNs, the need for explaining their predictions and verifying their safety is clear.
Generally speaking, verifying a DNN involves checking whether it behaves as expected for unseen inputs in a particular region, while explaining a DNN involves interpreting the network's prediction on a given input. Both approaches have their own pros and cons. The output for any input in a verified region is proven to be correct (with respect to a spec), but such regions are minuscule compared to the whole input space -- not just because of the performance of the tools, but because of the inherent limits of ε-robustness, the most commonly used class of verification specifications. Explanation methods, on the other hand, can be applied to explain the output for any input, but they are post-hoc and hard to judge: does an explanation make sense because the DNN works close to how a human being processes the same input, or because the explanation visualizes the input itself without taking the model into consideration?
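For concreteness, ε-robustness is commonly stated as follows (a standard formulation, not necessarily the exact one used in this thesis): a classifier f is ε-robust around an input x if, for every x' with \|x' - x\| \le \epsilon, \arg\max f(x') = \arg\max f(x); that is, every point in the ε-ball around x must receive the same label as x.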
Our main insight is that we can combine verification and explanation, resulting in novel verification problems aimed at robust explanations for neural networks. However, a verification problem (or specification) cannot exist in isolation; it lives in a symbiotic relationship with the tools that solve it. When we propose a new spec, it is expected that existing tools cannot solve it effectively, or may not work at all. Interesting problems push developers to improve the tools, and better tools widen the design space for researchers to come up with even more interesting specs. Thus, in this thesis, we introduce not just novel specifications, but also how to solve them by building better tools.
This thesis presents a series of results and research ideas based on that insight.
First, we show that by extending ε-robustness with an explanation function (the activation pattern of the DNN), we can verify a larger region of the input space using existing verification tools.
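One plausible formalization of this extension (a sketch only, not necessarily the thesis' exact definition): let A(x) denote the activation pattern of a ReLU network f on input x. Instead of quantifying over an ε-ball, the specification quantifies over the region \{x' : A(x') = A(x)\} -- on which the ReLU network behaves as a single affine function -- and requires \arg\max f(x') = \arg\max f(x) throughout that region.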
Second, by verifying the explanation functions themselves, we provide a robust way to compare different explanation methods. Finally, even though the combination of existing DNN verification specs and explanation functions is friendlier to existing verification tools, we still run into scalability issues as the networks grow larger. Thus, in this thesis we also present our results on building a distributed SMT solver, since SMT solving lies at the heart of many neural network verification tools.
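As a rough illustration of the divide-and-conquer idea behind distributed SMT solving (an illustration only, not the solver built in this thesis; it assumes the z3-solver Python package), a query can be split into cubes that are checked in parallel and whose answers are then combined:

# A minimal divide-and-conquer sketch of distributed SMT solving (illustration
# only, not the solver described in the thesis). The base formula is split into
# two cubes on the sign of x; each cube is checked by Z3 in a separate worker
# process, and the answers are combined.
from multiprocessing import Pool
from z3 import Int, Solver

def check_cube(assume_x_positive: bool) -> str:
    # Each worker rebuilds the base formula in its own Z3 context.
    x, y = Int('x'), Int('y')
    s = Solver()
    s.add(x + y == 10, x - y == 4)                 # toy base constraints
    s.add(x > 0 if assume_x_positive else x <= 0)  # the cube for this worker
    return str(s.check())                          # 'sat', 'unsat', or 'unknown'

if __name__ == '__main__':
    with Pool(2) as pool:
        answers = pool.map(check_cube, [True, False])
    # The original formula is satisfiable iff at least one cube is satisfiable.
    if 'sat' in answers:
        print('sat')
    elif all(a == 'unsat' for a in answers):
        print('unsat')
    else:
        print('unknown')

In a real distributed solver the cubes are generated dynamically and the workers run on separate machines, but the combination rule is the same: the formula is satisfiable iff some cube is.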
Keywords
Machine Learning, Software Verification, Large Language Models, Explainable AI, Formal Methods, Model Checking