The field of Explainable AI (XAI) aims to explain the decisions made by machine learning models. Recently, there have been calls for more rigorous and theoretically grounded approaches to explainability. In my thesis, I respond to this call by investigating the properties of explainability methods both theoretically and empirically. As an introduction, I provide a brief overview of the history of XAI.

The first contribution is a novel, theoretically motivated attribution method that estimates the importance of each input feature in bits per pixel, an absolute frame of reference. The method is evaluated against 11 baselines on several benchmarks. In my next publication, the limitations of modified backward propagation methods are examined: many methods fail the weight-randomization sanity check, and the reasons for these failures are analyzed in detail. A follow-up publication further analyzes the limitations of one particular method, Deep Taylor Decomposition (DTD), which has been cited as the theoretical basis for many other attribution methods. I find, however, that DTD is either under-constrained or reduces to the simpler gradient × input method. The next contribution is a user study design for evaluating how helpful XAI methods are to users in practice; in the study, users working with explanation methods show only a partial improvement over a baseline method. As a final contribution, a simple and explainable extension of k-nearest neighbors regression is proposed. I integrate higher-order information into this classical model and show that it outperforms classical k-nearest neighbors while retaining much of the simplicity and explainability of the original model.

In conclusion, this thesis contributes to the field of XAI by exploring explainability methods, identifying their limitations, and suggesting novel, theoretically motivated approaches.
My work seeks to improve both the performance and the interpretability of explainable models, moving toward more transparent, reliable, and comprehensible machine learning systems.
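For readers unfamiliar with the gradient × input baseline mentioned above, a minimal sketch may help. This is my own illustration, not code from the thesis: for a toy linear model the gradient with respect to each input feature is simply its weight, so the attribution of feature i is w_i · x_i, and the attributions sum exactly to the model output (a completeness property that deep models only satisfy approximately).

```python
# Gradient × input attribution for a toy linear model f(x) = sum(w_i * x_i).
# Illustrative sketch only; the weights and inputs below are arbitrary.

def predict(w, x):
    # Linear model output.
    return sum(wi * xi for wi, xi in zip(w, x))

def gradient(w, x):
    # For a linear model, df/dx_i = w_i regardless of x.
    return list(w)

def gradient_x_input(w, x):
    # Attribution of feature i is its gradient times its value.
    return [g * xi for g, xi in zip(gradient(w, x), x)]

w = [0.5, -1.0, 2.0]
x = [2.0, 1.0, 3.0]
attr = gradient_x_input(w, x)
# For this linear model the attributions sum to the model output.
assert abs(sum(attr) - predict(w, x)) < 1e-9
```

For nonlinear networks the gradient is taken at the input point (e.g. via automatic differentiation), and the same element-wise product is used as the attribution map.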