dc.contributor.author
Kruppa, Jochen
dc.contributor.author
Sieg, Miriam
dc.contributor.author
Richter, Gesa
dc.contributor.author
Pohrt, Anne
dc.date.accessioned
2023-03-14T11:50:20Z
dc.date.available
2023-03-14T11:50:20Z
dc.identifier.uri
https://refubium.fu-berlin.de/handle/fub188/38358
dc.identifier.uri
http://dx.doi.org/10.17169/refubium-38077
dc.description.abstract
Background: In DNA methylation analyses like epigenome-wide association studies, effects in differentially methylated CpG sites are assessed. Two kinds of outcomes can be used for statistical analysis: Beta-values and M-values. M-values follow a normal distribution and help to detect differentially methylated CpG sites. As biological effect measures, differences of M-values are more or less meaningless. Beta-values are of more interest since they can be interpreted directly as differences in percentage of DNA methylation at a given CpG site, but they have poor statistical properties. Different frameworks are proposed for reporting estimands in DNA methylation analysis, relying on Beta-values, M-values, or both.
Results: We present and discuss four possible approaches to obtaining estimands in DNA methylation analysis. In addition, we present the usage of M-values or Beta-values in the context of bioinformatics pipelines, which often demand a predefined outcome. We show the dependency between differences in M-values and differences in Beta-values in two data simulations: one analysis with and one without a confounder effect. In the absence of confounder effects, M-values can be used for the statistical analysis and Beta-value statistics for the reporting. If confounder effects exist, we demonstrate the deviations and correct the effects by the intercept method. Finally, we demonstrate the theoretical problem on two large human genome-wide DNA methylation datasets to verify the results.
Conclusions: The usage of M-values in the analysis of DNA methylation data will produce effect estimates that cannot be biologically interpreted. The parallel usage of Beta-value statistics ignores possible confounder effects and therefore cannot be recommended. Hence, if the differences in Beta-values are the focus of the study, the intercept method is recommended. Hyper- or hypomethylated CpG sites must then be carefully evaluated. If an exploratory analysis of possible CpG sites is the aim of the study, M-values can be used for inference.
en
dc.rights.uri
https://creativecommons.org/licenses/by/4.0/
dc.subject
DNA methylation
en
dc.subject
Epigenome-wide association study (EWAS)
en
dc.subject
Multiple testing
en
dc.subject
Reproducible research
en
dc.subject.ddc
600 Technology, medicine, applied sciences::610 Medicine and health::610 Medicine and health
dc.title
Estimands in epigenome-wide association studies
dc.type
Scientific article
dcterms.bibliographicCitation.articlenumber
98
dcterms.bibliographicCitation.doi
10.1186/s13148-021-01083-9
dcterms.bibliographicCitation.journaltitle
Clinical Epigenetics
dcterms.bibliographicCitation.originalpublishername
Springer Nature
dcterms.bibliographicCitation.volume
13
refubium.affiliation
Charité - Universitätsmedizin Berlin
refubium.funding
Springer Nature DEAL
refubium.resourceType.isindependentpub
no
dcterms.accessRights.openaire
open access
dcterms.bibliographicCitation.pmid
33926513
dcterms.isPartOf.issn
1868-7075
dcterms.isPartOf.eissn
1868-7083