dc.contributor.author
Sixt, Leon
dc.date.accessioned
2024-03-28T13:39:08Z
dc.date.available
2024-03-28T13:39:08Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/42771
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-42487
dc.description.abstract
The field of Explainable AI (XAI) aims to explain the decisions made by machine learning models. Recently, there have been calls for more rigorous and theoretically grounded approaches to explainability. In my thesis, I respond to this call by investigating the properties of explainability methods both theoretically and empirically. As an introduction, I provide a brief overview of the history of XAI. The first contribution is a novel, theoretically motivated attribution method that estimates the importance of each input feature in bits per pixel, an absolute frame of reference. The method is evaluated against 11 baselines on several benchmarks. In my next publication, the limitations of modified backward propagation methods are examined: many of these methods fail the weight-randomization sanity check, and the reasons for these failures are analyzed in detail. A follow-up publication further analyzes the limitations of one particular method, Deep Taylor Decomposition (DTD), which has been cited as the theoretical basis for many other attribution methods. However, DTD turns out to be either under-constrained or reduced to the simpler gradient x input method. The next contribution presents a user study design for evaluating how helpful XAI is to users in practice. In the user study, only a partial improvement is observed when users work with explanation methods compared to a baseline method. As a final contribution, a simple and explainable extension of k-nearest neighbors regression is proposed: we integrate higher-order information into this classical model and show that it outperforms classical k-nearest neighbors while retaining much of the simplicity and explainability of the original model. In conclusion, this thesis contributes to the field of XAI by exploring explainability methods, identifying their limitations, and proposing novel, theoretically motivated approaches. My work seeks to improve the performance and interpretability of explainable models toward more transparent, reliable, and comprehensible machine learning systems.
en
dc.format.extent
xi, 141 pages
dc.rights.uri
http://www.fu-berlin.de/sites/refubium/rechtliches/Nutzungsbedingungen
dc.subject
Interpretability
en
dc.subject
Artificial Intelligence
en
dc.subject
Explainability
en
dc.subject
Deep Neural Networks
en
dc.subject.ddc
000 Computer science, information, and general works::000 Computer science, knowledge, systems::000 Computer science, information, and general works
dc.title
Enhancing And Evaluating Interpretability In Machine Learning Through Theory And Practice
dc.contributor.gender
male
dc.contributor.firstReferee
Landgraf, Tim
dc.contributor.furtherReferee
Mac Aodha, Oisin
dc.date.accepted
2024-02-16
dc.identifier.urn
urn:nbn:de:kobv:188-refubium-42771-9
refubium.affiliation
Mathematik und Informatik
dcterms.accessRights.dnb
free
dcterms.accessRights.openaire
open access
dcterms.accessRights.proquest
accept