Abstract. Estimates of terrestrial water storage (TWS) variations from the Gravity Recovery and Climate Experiment (GRACE) satellite mission are used to assess the accuracy of four global numerical model realizations that simulate the continental branch of the global water cycle. Based on four different validation metrics, we demonstrate that, for the 31 largest discharge basins worldwide, all model runs agree with the observations only to a very limited degree, and that the spread among the models themselves is large. Since we apply a common atmospheric forcing data set to all hydrological models considered, we conclude that these discrepancies are not solely attributable to uncertainties in the meteorological input, but stem largely from model structure and parametrization, and in particular from the representation of individual storage components with different spatial characteristics in each of the models. TWS as monitored by the GRACE mission is therefore a valuable validation data set for global numerical simulations of terrestrial water storage: it is sensitive to very different model physics in individual basins and thus offers helpful insight to modellers for the future improvement of large-scale numerical models of the global terrestrial water cycle.