The integration of information across multiple senses is a flexible process shaped by stimulus-driven and contextual influences. Understanding how these influences interact to shape crossmodal processing and its perceptual or behavioral outcome is a major goal of multisensory research. Over the last decades, neuroscientific work has promoted a hierarchical framework of multisensory processing, involving a dynamic interplay between primary sensory and association cortical areas that unfolds at multiple stages. In parallel, behavioral research has characterized the extent to which various multisensory phenomena are affected by contextual influences. Despite the accumulating evidence informing these frameworks, important knowledge gaps remain. To address critical aspects of multisensory perception that are, to date, poorly understood, I used two multisensory phenomena: (i) an established multisensory illusion paradigm, the sound-induced flash illusion (SIFI), in which the integration of a flash with two rapid beeps can induce the illusory perception of two flashes, and (ii) crossmodal response speed facilitation, manifested as the speeding of simple visual responses by concurrent task-irrelevant auditory information. In the first study, we show that susceptibility to the SIFI is altered when cognitive resources are depleted through a secondary working memory task. This finding suggests that the multisensory integration producing the SIFI, previously considered a stimulus-driven process, is subject to cognitive resource limitations. The second study, using EEG and a similar design, replicated this finding and extended it by demonstrating a pronounced effect of working memory load on the oscillatory power related to the SIFI. Specifically, the SIFI under high load was associated with low-frequency oscillations in the theta and beta range unfolding at multiple stages of crossmodal processing.
This finding suggests that the SIFI, previously linked to gamma oscillations, is an adaptive process that depends on the availability of cognitive resources. Critically, the observed pattern of oscillatory responses is remarkably similar to that reported in the literature on an audiovisual speech illusion (the McGurk effect), suggesting that low-frequency oscillations might reflect general integrative mechanisms. The last study used EEG and ECoG recordings to explore the oscillatory signatures of crossmodal response speed facilitation, for which the existing evidence is sparse and inconclusive. We found that crossmodal response speed facilitation is associated with reduced beta power in association areas at early processing stages. Taken together, these studies provide strong evidence supporting the adaptive nature of multisensory integration in the SIFI and the functional relevance of low-frequency oscillations at multiple stages of crossmodal processing.