We present a publicly available, expert-segmented, representative dataset of 158 3.0 Tesla biparametric MRI examinations [1]. A growing number of studies investigate prostate and prostate carcinoma segmentation using deep learning (DL) with 3D architectures [2], [3], [4], [5], [6], [7]. However, the development of robust, data-driven DL models for prostate segmentation and assessment is currently limited by the scarcity of openly accessible, expert-annotated datasets [8], [9], [10].
The dataset contains 3.0 Tesla MRI images of the prostate of patients with suspected prostate cancer. Patients over 50 years of age who had a 3.0 Tesla MRI scan of the prostate that met PI-RADS version 2.1 technical standards were included. All patients subsequently underwent biopsy or surgery, so that the MRI diagnosis could be matched with the histopathologic diagnosis. For patients who had undergone multiple MRI examinations, the most recent examination performed less than six months before biopsy or surgery was included. All patients were examined at a German university hospital (Charité Universitätsmedizin Berlin) between February 2016 and January 2020. All MRI examinations were acquired on one of two 3.0 Tesla MRI scanners (Siemens VIDA and Skyra, Siemens Healthineers, Erlangen, Germany). Axial T2-weighted (T2W) sequences and axial diffusion-weighted imaging (DWI) sequences with apparent diffusion coefficient (ADC) maps were included in the dataset.
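The rule for patients with multiple examinations can be illustrated with a minimal sketch. It is not the authors' code: the function name `select_exam`, the approximation of "less than six months" as fewer than 183 days, and the inclusion of same-day examinations are all assumptions made for illustration.

```python
from datetime import date, timedelta

# Assumption: "less than six months" is approximated as fewer than 183 days.
MAX_GAP = timedelta(days=183)


def select_exam(exam_dates, reference_date, max_gap=MAX_GAP):
    """Return the most recent exam date that lies on or before
    `reference_date` (biopsy or surgery) and precedes it by less than
    `max_gap`. Return None if no examination qualifies."""
    eligible = [
        d for d in exam_dates
        if timedelta(0) <= reference_date - d < max_gap
    ]
    return max(eligible, default=None)


# Hypothetical patient with three MRI examinations and one biopsy date.
exams = [date(2019, 1, 10), date(2019, 6, 1), date(2019, 9, 15)]
biopsy = date(2019, 10, 1)
print(select_exam(exams, biopsy))  # 2019-09-15
```

Examinations performed after the reference date, or more than the allowed interval before it, are excluded before the most recent remaining examination is chosen.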
T2W sequences and ADC maps were annotated by two board-certified radiologists with 6 and 8 years of experience, respectively. For T2W sequences, the central gland (central zone and transitional zone) and the peripheral zone were segmented. If areas of suspected prostate cancer (PI-RADS score ≥ 4) were identified on examination, they were segmented in both the T2W sequences and the ADC maps.
Because restricted diffusion is best seen on diffusion-weighted images with high b-values, only these images were retained and all images with low b-values were discarded. The data were then anonymized and converted to NIfTI (Neuroimaging Informatics Technology Initiative) format.
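The b-value filtering step can be sketched as follows. This is only an illustration under stated assumptions: the text does not specify a cutoff or a record layout, so the function `keep_high_b_value`, the `(b_value, identifier)` pairs, and the example values are hypothetical.

```python
def keep_high_b_value(series, cutoff):
    """Keep only DWI series whose b-value is at or above `cutoff` (s/mm^2).

    `series` is a list of (b_value, identifier) pairs. Both the record
    layout and the cutoff are hypothetical; neither is taken from the
    dataset description.
    """
    return [(b, name) for b, name in series if b >= cutoff]


# Illustrative values only; the dataset's actual b-values are not stated here.
acquired = [(50, "dwi_b50"), (500, "dwi_b500"), (1400, "dwi_b1400")]
print(keep_high_b_value(acquired, cutoff=1000))
```

The cutoff is passed explicitly rather than hard-coded, since the appropriate value depends on the acquisition protocol.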