Computer Science > Computation and Language

arXiv:2410.18541 (cs)

[Submitted on 24 Oct 2024]

Title: On Explaining with Attention Matrices

Abstract: This paper explores the much-discussed possible explanatory link between attention weights (AW) in transformer models and predicted output. Contrary to intuition and early research on attention, more recent prior research has provided formal arguments and empirical evidence that AW are not explanatorily relevant. We show that the formal arguments are incorrect. We introduce and effectively compute efficient attention, which isolates the effective components of attention matrices in tasks and models in which AW play an explanatory role. We show that efficient attention has a causal role (provides minimally necessary and sufficient conditions) for predicting model output in NLP tasks requiring contextual information, and we show, contrary to [7], that efficient attention matrices are probability distributions and are effectively calculable. Thus, they should play an important part in the explanation of attention-based model behavior. We offer empirical experiments in support of our method illustrating various properties of efficient attention with various metrics on four datasets.

Subjects:	Computation and Language (cs.CL); Artificial Intelligence (cs.AI)
MSC classes:	46-04
ACM classes:	I.2.7; I.7.0
Cite as:	arXiv:2410.18541 [cs.CL]
	(or arXiv:2410.18541v1 [cs.CL] for this version)
	https://doi.org/10.48550/arXiv.2410.18541
Journal reference:	Proceedings of ECAI 2024, Frontiers in Artificial Intelligence and Applications, pp. 1035-1042
Related DOI:	https://doi.org/10.3233/FAIA240594

Submission history

From: Nicholas Asher
[v1] Thu, 24 Oct 2024 08:43:33 UTC (64 KB)
