dc.description.abstract
Humans are remarkably fast at processing scenes and making decisions based on
the information they contain. Within a few hundred milliseconds of viewing a
scene, the brain can extract the most important information through a hierarchical
cascade starting with perceptual attributes (color, edges, etc.) and ending
with abstract properties (category, relationship between objects, etc.), eventually
supporting decision-making. Despite the central role of scene processing, many
aspects of how it unfolds in the brain remain poorly understood. In particular,
the intermediate stages linking perceptual and abstract scene understanding, i.e.,
mid-level feature processing, are largely unresolved. Moreover, the link between
neural activity and behavior, i.e., when, where, and what kind of scene information
arising in the brain influences decision-making, remains unclear. This thesis
addresses these gaps through three studies that combine empirical and computational
methods. In Study 1, we used a novel stimulus set to reveal that various
mid-level features of scenes are processed in humans between ∼100 ms and
∼250 ms after stimulus onset, bridging low- and high-level feature representations,
and with a temporal hierarchy that is mirrored by convolutional neural networks
(CNNs). In Study 2, we showed that neural representations of scenes are suitably
formatted for behavioral readout of scene naturalness between ∼100 ms and
∼200 ms, i.e., in the intermediate processing stages, and that intermediate CNN
layers correlated best with the neural representations in this time window, suggesting
that mid-level features underlie behaviorally relevant representations. In
Study 3, we showed that neural representations of scenes are suitably formatted
for behavioral readout of scene naturalness in the early visual cortex and in the
object-selective high-level cortex, and that intermediate CNN layers best explain
this brain-behavior relationship, indicating that behaviorally relevant representations
in these areas are driven by mid-level features. Taken together, the studies
included in this thesis revealed the timing, spatial localization, and behavioral
relevance of mid-level feature representations in scene processing, contributing to
a better understanding of how the human brain extracts information from the
surrounding world.
en