dc.contributor.author: Doerig, Adrien
dc.contributor.author: Kietzmann, Tim C.
dc.contributor.author: Allen, Emily
dc.contributor.author: Wu, Yihan
dc.contributor.author: Naselaris, Thomas
dc.contributor.author: Kay, Kendrick
dc.contributor.author: Charest, Ian
dc.date.accessioned: 2025-09-25T12:05:03Z
dc.date.available: 2025-09-25T12:05:03Z
dc.identifier.uri: https://refubium.fu-berlin.de/handle/fub188/49579
dc.identifier.uri: http://dx.doi.org/10.17169/refubium-49301
dc.description.abstract (en): The human brain extracts complex information from visual inputs, including objects, their spatial and semantic interrelations, and their interactions with the environment. However, a quantitative approach for studying this information remains elusive. Here we test whether the contextual information encoded in large language models (LLMs) is beneficial for modelling the complex visual information extracted by the brain from natural scenes. We show that LLM embeddings of scene captions successfully characterize brain activity evoked by viewing the natural scenes. This mapping captures selectivities of different brain areas and is sufficiently robust that accurate scene captions can be reconstructed from brain activity. Using carefully controlled model comparisons, we then proceed to show that the accuracy with which LLM representations match brain representations derives from the ability of LLMs to integrate complex information contained in scene captions beyond that conveyed by individual words. Finally, we train deep neural network models to transform image inputs into LLM representations. Remarkably, these networks learn representations that are better aligned with brain representations than a large number of state-of-the-art alternative models, despite being trained on orders-of-magnitude less data. Overall, our results suggest that LLM embeddings of scene captions provide a representational format that accounts for complex information extracted by the brain from visual inputs.
dc.format.extent: 18 pages
dc.rights.uri: https://creativecommons.org/licenses/by/4.0/
dc.subject (en): Cognitive neuroscience
dc.subject (en): Neural encoding
dc.subject.ddc: 100 Philosophie und Psychologie::150 Psychologie::150 Psychologie
dc.title: High-level visual representations in the human brain are aligned with large language models
dc.type: Scientific article
dcterms.bibliographicCitation.doi: 10.1038/s42256-025-01072-0
dcterms.bibliographicCitation.journaltitle: Nature Machine Intelligence
dcterms.bibliographicCitation.number: 8
dcterms.bibliographicCitation.pagestart: 1220
dcterms.bibliographicCitation.pageend: 1234
dcterms.bibliographicCitation.volume: 7
dcterms.bibliographicCitation.url: https://doi.org/10.1038/s42256-025-01072-0
refubium.affiliation: Erziehungswissenschaft und Psychologie
refubium.affiliation.other: Arbeitsbereich Allgemeine und Neurokognitive Psychologie
refubium.resourceType.isindependentpub: no
dcterms.accessRights.openaire: open access
dcterms.isPartOf.eissn: 2522-5839
refubium.resourceType.provider: WoS-Alert