dc.contributor.author
Iarkaeva, Anastasiia
dc.contributor.author
Nachev, Vladislav
dc.contributor.author
Bobrov, Evgeny
dc.date.accessioned
2025-07-29T15:50:39Z
dc.date.available
2025-07-29T15:50:39Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/48490
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-48212
dc.description.abstract
Monitoring the sharing of research data through repositories is of increasing interest to institutions and funders, as well as from a meta-research perspective. Automated screening tools exist, but they are based on either narrow or vague definitions of open data. Where manual validation has been performed, it was based on a small article sample. At our biomedical research institution, we developed detailed criteria for such screening, as well as a workflow that combines an automated and a manual step and considers both fully open and restricted-access data. We use the results for an internal incentivization scheme and for monitoring in a dashboard. Here, we describe our screening procedure and its validation in detail, based on automated screening of 11035 biomedical research articles, of which 1381 articles with potential data sharing were subsequently screened manually. The screening results were highly reliable, as shown by inter-rater reliability values of >= 0.8 (Krippendorff's alpha) in two different validation samples. We also report the results of the screening, both for our institution and for an independent sample from a meta-research study. In the largest of the three samples, the 2021 institutional sample, underlying data had been openly shared for 7.8% of research articles. For an additional 1.0% of articles, restricted-access data had been shared, resulting in 8.3% of articles overall having open and/or restricted-access data. The extraction workflow is then discussed with regard to its applicability in different contexts, limitations, possible variations, and future developments. In summary, we present a comprehensive, validated, semi-automated workflow for the detection of shared research data underlying biomedical article publications.
en
dc.rights.uri
https://creativecommons.org/licenses/by/4.0/
dc.subject
access to information
en
dc.subject
biomedical research
en
dc.subject
information dissemination
en
dc.subject
reproducibility of results
en
dc.subject.ddc
600 Technology, Medicine, Applied Sciences::610 Medicine and Health::610 Medicine and Health
dc.title
Workflow for detecting biomedical articles with underlying open and restricted-access datasets
dc.type
Scientific article
dcterms.bibliographicCitation.articlenumber
e0302787
dcterms.bibliographicCitation.doi
10.1371/journal.pone.0302787
dcterms.bibliographicCitation.journaltitle
PLOS ONE
dcterms.bibliographicCitation.number
5
dcterms.bibliographicCitation.originalpublishername
Public Library of Science (PLoS)
dcterms.bibliographicCitation.volume
19
refubium.affiliation
Charité - Universitätsmedizin Berlin
refubium.resourceType.isindependentpub
no
dcterms.accessRights.openaire
open access
dcterms.bibliographicCitation.pmid
38718077
dcterms.isPartOf.eissn
1932-6203