Research on metacognition-thinking about thinking-has grown rapidly and fostered our understanding of human cognition in healthy individuals and clinical populations. Of central importance is the concept of metacognitive performance, which characterizes the capacity of an individual to estimate and report the accuracy of primary (type 1) cognitive processes or actions ensuing from these processes. Arguably one of the biggest challenges for measures of metacognitive performance is their dependency on objective type 1 performance, although more recent methods aim to address this issue. The present work scrutinizes the most popular metacognitive performance measures in terms of two critical characteristics: independence of type 1 performance and test-retest reliability. Analyses of data from the Confidence Database (total N = 6912) indicate that no current metacognitive performance measure is independent of type 1 performance. The shape of this dependency is largely reproduced by extending current models of metacognition with a source of metacognitive noise. Moreover, the reliability of metacognitive performance measures is highly sensitive to the combination of type 1 performance and trial number. Importantly, trial numbers frequently employed in metacognition research are too low to achieve an acceptable level of test-retest reliability. Among common task characteristics, simultaneous choice and confidence reports most strongly improved reliability. Finally, general recommendations about design choices and analytical remedies for studies investigating metacognitive performance are provided.