Dynamical Stability Indicator based on Autoregressive Moving-Average Models: Critical Transitions and the Atlantic Meridional Overturning Circulation

A statistical indicator for dynamic stability known as the $\Upsilon$ indicator is used to gauge the stability and hence detect approaching tipping points of simulation data from a reduced 5-box model of the North-Atlantic Meridional Overturning Circulation (AMOC) exposed to a time dependent hosing function. The hosing function simulates the influx of fresh water due to the melting of the Greenland ice sheet and increased precipitation in the North Atlantic. The $\Upsilon$ indicator is designed to detect changes in the memory properties of the dynamics, and is based on fitting ARMA (auto-regressive moving-average) models in a sliding window approach to time series data. An increase in memory properties is interpreted as a sign of dynamical instability. The performance of the indicator is tested on time series subject to different types of tipping, namely bifurcation-induced, noise-induced and rate-induced tipping. The numerical analysis show that the indicator indeed responds to the different types of induced instabilities. Finally, the indicator is applied to two AMOC time series from a full complexity Earth systems model (CESM2). Compared with the doubling CO$_2$ scenario, the quadrupling CO$_2$ scenario results in stronger dynamical instability of the AMOC during its weakening phase.


I. INTRODUCTION
Tipping points, or critical transitions, are sudden, drastic changes in a system resulting from initial small perturbations.The study of tipping points is of particular interest to climate scientists and ecologists, as several theoretical studies highlight such tipping for an assortment of climatic and ecological systems, and observations also indicate that abrupt changes are, indeed, common in nature 1 .
Ashwin et al. 2 classified tipping points according to the causing mechanism, yielding three classes of tipping points.Bifurcation-induced tipping, or B-tipping, occurs when a steady change in a parameter past a threshold induces a sudden qualitative change in the system's behaviour.Noiseinduced tipping, or N-tipping, occurs when short-timescale internal variability causes the system to transition between different co-existing attracting states.Finally, rate-induced tipping, or Rtipping, occurs when the system fails to track a continuously changing attractor and hence abruptly leaves the attractor.
Of these three, rate-induced tipping is certainly the least studied, however as demonstrated by Scheffer et al. 3 , Wieczorek et al. 4 and more recently O'Keeffe and Wieczorek 5 , it is an important tipping mechanism that cannot be explained through classical bifurcation theory.Indeed, when the system is unable to track a continuously available quasi-stable state due to the system parameters changing too quickly, it might shift to another available equilibrium state without crossing a bifurcation boundary.There are a few methods available for estimating what exactly "too quickly" means, see Wieczorek and Perrymann 6 , Ashwin, Perrymann, and Wieczorek 7 , Vanselow, Wieczorek, and Feudel 8 and O'Keeffe and Wieczorek 5 , but they depend strongly on the time-dependent parameter function; in particular its asymptotic properties.Finding generalizable methods for determining the rate of the parameter drift that induces tipping, will be of great The ϒ indicator for Early Warning simplified Atlantic Meridional Overturning Circulation (AMOC), which transports warm surface water from the tropics to North America and Europe, resulting in a milder climate in these regions than what would otherwise be expected.Since the current is density driven, a large influx of freshwater due to the melting of land ice or increased precipitation in the North Atlantic, would be expected to result in a reduction in the AMOC flow strength.The question of whether the AMOC could undergo a sudden transition from a high flow strength state (the "on" state) to a state with weak or no overturning (the "off" state), is still debated.The latest assessment report of the International Panel for Climate Change (IPCC AR6) concludes that the AMOC strength will very likely decline in the future, but states with medium confidence that an abrupt collapse will not occur in the next century 17 .Simple box models, like the one presented in this paper, show bistability, while more realistic models like the global atmosphere-ocean general circulation models (AOGCMs) are largely mono-stable, implying that they do not exhibit the abrupt transition to an "off"-state so characteristic of the simpler models.However, there is limited evidence that the more complex models may be too stable (Weijer et al. 18 , Hofmann and Rahmsdorf 19 and Liu et al. 20 ), in particular that they mis-represent the direction of AMOC-induced freshwater transport across the southern boundary of the Atlantic (Liu et al. 20 , Huisman et al. 21, Liu, Liu, and Brady 22 , Hawkins et al. 23 ).Liu et al. 20 demonstrated that by introducing a flux-correction term into the National Center for Atmospheric Research (NCAR) Community Climate System Model version 3 (CCSM3), they could make the formerly mono-stable system bi-stable.
In addition, it has been suggested that paleoclimate data is consistent with abrupt changes in the surface temperature in the North Atlantic region in the past, as might be expected with a collapse of the AMOC.Boers 24 applied a statistical early warning indicator on Earth System Model (ESM) outputs, and found significant early-warning signals in eight independent AMOC indices.This was interpreted as a sign that the AMOC is not only a bistable system, but one approaching a critical transition.
Previously, the potential collapse of the AMOC has largely been attributed to the crossing of a bifurcation boundary in the bi-stable system.However, more recent analysis, see in particular Lohman and Ditlevsen 25 , demonstrate the possibility of tipping before the bifurcation boundary is reached through the mechanism of rate-induced tipping.In addition, Lohman and Ditlevsen 25 demonstrate that due to the chaotic nature of complex systems a well-defined critical rate, i.e., the rate of parameter change at which the system tips, cannot be obtained, which in turn severely The ϒ indicator for Early Warning limits our ability to predict the long-term behavior of the system.They conclude that due to this added level of uncertainty, it is possible that the safe operating space with regard to future emissions of CO 2 might be smaller than previously thought.This suggests that proper evaluation of the probability of rate-induced tipping in the different tipping elements of the Earth System is of utmost importance in assessing the likelihood of dramatic future changes.
Regardless of whether the AMOC in actuality is bi-stable or mono-stable, the reduced 5-box model of Alkhayuon et al. 15 is the perfect test case for the ϒ indicator as it exhibits both bifurcationinduced and rate-induced tipping, provided a time dependent hosing function is applied.The hosing function represents the influx of fresh water into the ocean due to increased precipitation and melting of land and sea ice in the North Atlantic region.Alkhayuon et al. 15 provide an extensive analysis of the tipping mechanisms present in the model.Armed with such a well studied theoretical model, we will be able to systematically study the indicator's ability to not only detect bifurcation-induced and noise-induced, but also rate-induced tipping.We will additionally assess the indicator's ability to deal with colored noise, something that is known to cause issues for other early warning indicators, like the increase in variance and auto-correlation 24 .
In reality, the ocean system has many more degrees of freedom than those included in the box models, and ultimately a mixture of different processes is likely to trigger tipping, if occurring.The Coupled Model Intercomparison Project (CMIP6), with the Community Earth System Model (CESM2) 26 , provides an alternative AMOC model with many more degrees of freedom.
Two scenarios where the atmospheric CO 2 concentration is abruptly increased will be considered, providing monthly outputs of geographical density differences on which the ϒ indicator will be applied.In these model scenarios, the abrupt change in CO 2 is followed by a response of the Earth system, and after 2-3 decades, freshwater eventually circulates in the sub-polar gyre 27 .This response hence offers similarities with the hosing experiments done in the box models.While the two scenarios are insufficient to assess the potential bistability of the AMOC, the ϒ indicator will be used to assess the dynamical stability of the AMOC during its weakening phase.

II. THE ϒ-INDICATOR FOR EARLY-WARNING SIGNALS
In what follows, we will briefly outline the method used to determine the stability of the time series data.Further details can be found in Faranda et al. 10 , Faranda and Defrance 14 , Nevo The ϒ indicator for Early Warning et al. 12 and Kaiser et al. 13 The method relies on an accurate representation of a complex dynamical system close to a metastable state by a random walk-like behavior with a tendency to be attracted to the metastable state.Changes in the system's stability are then characterized as statistically significant deviations from that local behavior, indicating that the system currently does not reside close to a metastable state.Indeed, the local dynamics of a continuous-time random dynamical system (i.e., a stochastic differential equation) near a metastable state come close to the dynamics of a stochastic spring (i.e., an Ornstein-Uhlenbeck process), whose discrete-time observations are well approximated by an ARMA (1,0) process.Here, ARMA denotes the space of autoregressive moving-average models, with the numbers in parentheses denoting the order of the model.A time series x(t), t ∈ Z, is an ARMA(p,q) process if it is stationary and can be written as with constant ν, coefficients φ i , θ j and {w t } being white noise with positive variance σ 2 (see Brockwell and Davis 28 for an introductory text).In addition, constraints are imposed on the coefficients φ i and θ j to ensure that the process in ( 1) is stationary and satisfies the invertibility condition.Intuitively, the variables p and q say something about the memory lag of the process, while the prefactors φ i and θ j relate to the persistence of said memory lag.One expects that the higher the values for q and p, the longer the system, once perturbed from its equilibrium state, would need to return to equilibrium.It is this intuitive notion that the statistical indicator denoted ϒ takes advantage of.Indeed, when approaching a critical transition the response of the system to perturbations can become increasingly long (referred to as a critical slow down), and this translates into diverging memory properties of the statistical signal.Hence, an ARMA(p,q) model will require higher orders to incorporate the memory effects.By fitting the model (1) repeatedly to a time series data set for varying values of p and q, one can, through application of an appropriate information criterion, obtain the values of p and q that best represent the time series data.For this purpose, we choose the Bayesian information criterion, BIC: where β denotes the maximum likelihood estimator of β = (ν, φ 1 , . . ., φ p , θ 1 , . . ., θ q ), which is obtained by maximising the likelihood function L associated with the ARMA(p,q) model ( 1) for a given time series; see Brockwell and Davis 28 for details.The best fitting ARMA(p,q) model The ϒ indicator for Early Warning is then determined as the one that minimizes the BIC.The second term in equation ( 2) punishes complex models with high p and q values, and is the reason why we prefer to use the BIC over other criteria, such as the perhaps more familiar Akaike Information Criterion.Here, τ denotes the number of discrete points in the time series to which the ARMA model is fitted.We refer to τ as the window length.
Finally, the stability indicator is defined as where p and q indicate the order of what we refer to as the theorized base model.This is the ARMA(p,q) model, characterized by a specific value of q = q and p = p, to which the chosen best fit is compared.The ϒ-indicator takes on values between 0 and 1, where lower values imply a higher degree of stability.The intuition behind using the difference in BIC values between the chosen "best" model and a base model is that this quantity assesses just how much better the model with the lower BIC value approximates the fitted data compared to the other.The significance threshold for deviations in the BIC values between an ARMA(p,q) and the base model, simply denoted as |∆BIC|, is |∆BIC| > 2. The differences in BIC values can be directly related to the Bayes Factor, see Preacher and Merkle 29 , which is another way of quantifying the likelihood of one model over another.
For the data sets analysed by Faranda et al. 10 , it was determined that the appropriate base model is the ARMA(1,0) model, i.e., p = 1 and q = 0, which can be viewed as a time discretized Langevin process.In later work by Nevo et al. 12 and Kaiser et al. 13 the authors continued to rely on ARMA(1,0) as the base model.While Faranda et al. 10 used a statistical argument to justify the choice of the base model, Nevo et al. 12 and Kaiser et al. 13 argued, as already noted above, that the dynamics near a stable state can be approximated as that of a stochastic spring, further strengthening the case for ARMA(1,0) as the general choice of base model.However, due to the additional well-posedness constraints on the autoregressive and moving-average coefficients φ i and θ j in (1), depending on the treatment of constraints by the fitting routine one can have cases where the BIC value of the ARMA(1,0) process is smaller than the corresponding value for the chosen ARMA(p,q) model.In these cases the ARMA(1,0) process is rejected as the best fit, despite having the lowest BIC value, due to violating the stationarity or invertibility conditions required for a numerically well behaved fit.Thus, in this scenario it becomes unclear how to determine the 'distance' between the states.To overcome this issue we have chosen to modify the ϒ indicator The ϒ indicator for Early Warning to allow for a second base state, namely the ARMA(0,0) model.This model is just white noise, possibly with a drift, and is guaranteed to satisfy all the auxiliary conditions for the obvious reasons that there are no coefficients available to violate them.We consider ARMA(0,0) as a special case of ARMA(1,0) in which φ 1 = 0.The use of the ARMA(1,0) process as a base model was partly justified by the image of a particle trapped in a potential well, where a restoring force keeps the particle oscillating around the equilibrium.The justification for including ARMA(0,0) as a potential base model follows a similar argument, except that in this case the noise amplitude is too low compared to the width of the potential well to feel the restoring force.To use both base models, we first introduce and With this, the modified ϒ-Indicator for the extended base model class can be written as In addition, it must be specified that in the cases where the constrained fitting failed for the ARMA(1,0) model so that ∆BIC 1 (p, q) may be negative, ∆BIC 0 (p, q) is automatically chosen in practise.For obvious reasons, there cannot be a case where ∆BIC 0 (p, q) is itself negative.
Furthermore, following Faranda, Dubrulle, and Pons 11 , we define the order, O, and persistence, R, of an ARMA(p, q) process as where φ i and θ j denote the autoregressive and moving-average coefficients, respectively.While the order relates to the memory lag of the process, the persistence relates to the persistence of said memory lag, hence the name.When approaching a tipping point, one would expect one out of two things to happen: either both the persistence and the order increase significantly, due to the increased memory of the process, or the order remains constant, and the persistence approaches the value of the order O, indicating a loss of stationarity.According to Faranda, Dubrulle, and The ϒ indicator for Early Warning Pons 11 , the latter alternative corresponds to a case in which the potential landscape of the system does not change considerably when approaching the transition.
This observation strengthens the case for the modified ϒ indicator in contrast to excluding windows of the time series where ∆BIC 1 (p, q) is negative, as these periods are indicative of an instability resulting from the loss of stationarity of the ARMA(1,0) process.
To apply the method to a time series data set, one first has to ensure stationarity of the data.This can be done in two ways, depending on the nature of the time series.In some cases, it is sufficient to split the time series into small enough intervals, so that within each interval the time series is approximately stationary.To check for stationarity one runs a Kwiatkowski-Phillips-Schmidt-Shin (KPSS) tests on the intervals.This way, one also obtains an upper bound on the length of the intervals; see Kaiser et al. 13 .The other option is to not assume stationarity from the outset, and instead allow for application of a differencing routine to the separate intervals, achieving stationarity that way.In that case, a KPSS test is run on each interval, and if the interval is found to not be stationary, differencing is applied.This process is then repeated until stationarity is achieved.
The KPSS test is to be preferred over the unit root test due to the danger of over-differencing (Hyndman and Khandakar 30 ).As we wish to study rate induced tipping phenomena, which yields highly non-stationary time series even for very small interval lengths, the latter method is to be preferred.By this choice we go from an ARMA to an ARIMA model, in which the I stands for "integrated" in reference to the differencing routine used to ensure the stationarity of the time series.
Provided one can select sufficiently long time series intervals where the process is approximately stationary, one can fit ARMA(p,q) models to available observations during these intervals, and through the ϒ indicator obtain an estimate for how close any given interval is to an equilibrium state.To determine the best fit, we use the auto.arimafunction found in the FORECAST R package, setting BIC as the information criterion used for model selection.Since we will not assume stationarity of the time series, auto.arimafirst determines the correct differencing order before continuing with the fitting procedure; the details of said procedure can be found in Hyndman and Khandakar 30 .
It is clear that the method is strongly dependent upon the size of the intervals, which we will refer to as the window length, τ.This is not only due to the inclusion of the 1/τ factor in the exponential, but also due to the inherent τ-dependence of BIC(p, q) and BIC(1, 0).In fact, the The ϒ indicator for Early Warning rationale for including the 1/τ factor in the definition of ϒ is to attempt to remove or reduce this dependence.From equation ( 2) one might conclude that the correct scaling would be 1/ ln(τ), as opposed to 1/τ.However, we do not only want to remove the dependence on τ, but also include the significance threshold for ∆BIC, such that the ϒ value of any point where ∆BIC is below 2 is suppressed relative to other points.

III. APPLICATION TO THE GLOBAL OCEANIC 3-BOX MODEL
To determine the validity of the ϒ-indicator as a measure of stability, as well as its ability to detect different types of tipping points, we start by applying the method to the global oceanic 3-box model discussed by Alkhayuon et al. 15 .The 3-box model of Alkhayuon et al. 15 is a simplification We denote salinity by S i , the volume by V i and the fluxes by F i , where i ∈ {N, T, S, IP, B} denotes the respective boxes.
Let Γ denote the AMOC flow defined by The model approximates a buoyancy-driven flow, with a transport proportional to the density difference between the boxes, assuming a linearized equation of state.The evolution equations for the salinities S N and S T are for Γ ≥ 0, and for Γ < 0, where S B and S S are regarded as fixed parameters and Y = 3.15 × 10 7 , which converts the time unit from seconds to years.S 0 is a reference salinity, and K i are coefficients associated with the gyre strengths.We note that all the salinity values are given as perturbations from a background state, see Appendix A of Alkhayuon et al. 15 for details on the transformation.Since the total salinity is assumed to be conserved, the salinity of the Indo-Pacific (IP) box, S IP , can be computed from S N and S T .
The values of the assorted parameters can be found in Table 1 and Table 2.
The fluxes, F N and F T , are linear functions of the hosing function H(t) which simulates the influx of fresh water.In the case of 2×CO 2 the fluxes are (see Wood et al. 16 ) F T = −0.997× 10 6 + H(t) 0.6961 × 10 6 (15)   where all fluxes are given in units of Sverdrup (Sv).
The values for the case of 1×CO 2 can be found in Table 5 of Alkhayuon et al. 15 .
Figure 3 shows the bifurcation diagram for S N ; for S T we refer to Alkhayuon et al. 15 The bifurcation diagram for the flow strength Γ is qualitatively similar, since all other parameters in Eq. 9 are kept constant.The diagram clearly shows that this is a bi-stable system with two stable equilibrium branches connected by an unstable branch.The upper equilibrium branch looses stability, not at the saddle-node bifurcation, but rather due to a Hopf-bifurcation, indicated by a red diamond in the diagram.Thus, part of the upper equilibrium branch, denoted in black, is in fact unstable.
To simulate the influx of fresh water we apply a time dependent, piece-wise linear hosing function, 2), to equations ( 10)-( 13).Here where α(t) and β (t) are linear functions ensuring continuity of H(t).If we define the rise and fall rates, as then As demonstrated by Alkhayuon et al. 15 , whether the system undergoes a transition from one stable state to the other, is dependent not only on the value of H pert , but on the rise and fall rates, r rise and r f all , as well as the perturbation time T pert .In particular, they demonstrate that even when H pert is above the bifurcation value that destabilizes the upper equilibrium branch, the system may still return to this equilibrium, provided T f all is short enough; a process which they termed avoided The ϒ indicator for Early Warning B-tipping.In addition, they showed that if T pert is too short, the system will not tip, but return to the initial equilibrium branch.
In what follows, we will apply the ϒ indicator as described in the previous section to time series data generated by the 3-box model.We will separately study time series undergoing rate-, noiseand bifurcation-induced tipping, while attempting to assess the indicator's ability to gauge the stability of the time series as it approaches the tipping point.such a transition has occurred, i.e., when the unstable equilibrium branch has been crossed and the system is approaching a different equilibrium.The objective should then be to develop an indicator that is able to identify this induced instability as soon as possible after the transition.
Finally, we note that, while it is possible to extend ARMA fitting to multivalued time series data, we have chosen to not go down that route, and instead only apply the indicator to a single time series for the salinity values from the North Atlantic basin, S N .The reason for choosing S N over S T is that within the 3-box model, the equilibrium branches of S N are that much further apart, making the transitions easier to see.Such a simplification might at first glance seem rather contrived, however we argue that, as the goal of any indicator is to be used on real-world time series data in which the connection to other time series is largely unknown, it is reasonable to only concentrate on one time series, despite the underlying system being multidimensional.

A. Bifurcation-induced Tipping
To induce B-tipping in the 3-box model, we gradually change H(t) according to equation (16), with H 0 = 0, H pert = 0.5, T rise = 1000.This corresponds to an increase in the freshwater fluxes F T and F N , corresponding to the flux into the tropical and North Atlantic boxes, by approximately 34% and 13%, respectively.This, in turn, corresponds to roughly a 0.1-0.2Sv increase, in line with freshwater "hosing" experiments of the North Atlantic 32 .We let T pert go to infinity, such that H(t) never returns to its initial value.As H(t) changes, S N follows the upper equilibrium branch as sketched in Figure 3, until it reaches the hopf-bifurcation (around H = 0.4), at which point the upper equilibrium branch becomes unstable, and S N starts approaching the lower equilibrium branch.We choose a window length of 350 points corresponding to about 70 years.
Figure 4 shows the time series of S N color coded according to the value of ϒ, with brighter colors corresponding to higher values of ϒ and hence a greater degree of instability.Figure 5 shows ϒ as a function of time, with clear peaks corresponding to brightly colored points in Figure 4.
It should be noted that low amplitude white noise is also applied to facilitate ARIMA model The ϒ indicator for Early Warning The ϒ indicator for Early Warning fitting.The noise intensity is kept small enough to avoid noise-induced tipping.
Figures 4 and 5 clearly indicate that there are several points on the time series as it approaches the transition, which are deemed to have a high degree of instability.We further note that, although the result is not shown here, the high ϒ values in Figures 4 and 5 correspond to intervals for which ∆BIC 1 (p, q) is negative, indicating that, as discussed previously, the ARMA(1,0) model would, when only considering BIC values, be the better fit, but it violates the auxiliary conditions, indicating a loss of stationarity.Hence, at these points ARMA(1,0) is excluded as a possible model, implying that ARMA(0,0) is the chosen base model.
In addition, we look at the order of the best-fit ARMA model, namely the q and p values, as well as the persistence, to gain further insight into the stability properties of the time series.Figure 6 shows the time series of S N color coded according to the values of q and p.When comparing with Figure 4, this seems to indicate that the high values of ϒ appearing before the transition are primarily associated with an increase in the q-values.This is not unexpected, as it is primarily the change in the properties of the noise which is expected to give an indication of an approaching transition.Figure 7 shows the persistence plotted as a function of time t.We see a clear increase in the persistence directly preceding the tipping point around t = 1000.
We make a final comment regarding Figure 6 and its relation to our choice of ARMA(1,0) and ARMA(0,0) as base models.In Faranda et al. 10 this choice was guided by the fact that for the time series under consideration the order, i.e. p + q, of the intervals was clustered around 1, and as the authors explicitly excluded pure moving-average processes, they concluded that ARMA(1,0) was the appropriate base model.However, from Figure 6 we see that for the time series currently under consideration, the order is clustered around 0. This observation further strengthens the case for using ARMA(0,0) as an additional base model.We hypothesize that the dominance of ARMA(0,0) is related to the low degree of noise in the system, which makes the restoring force that returns the system to equilibrium less prominent, hence obscuring tendency of the randomwalk to be attracted to a metastable state.

B. Noise-induced Tipping
To induce N-tipping, we fix the hosing parameter H and apply additive white noise to all the equations equally.The noise term is added equally to (10)-( 13), with the same noise amplitude in all cases.We look at transitions from the upper branch to the lower branch and vice versa.In The ϒ indicator for Early Warning either case, it is convenient to choose a value for H that is close to the bifurcation point, as the probability of transitioning is much higher in these regions, and hence one does not need high amplitude noise to induce transitions between the branches.
Figures 8 and 9 show two time series undergoing noise induced tipping, one going from the lower to the upper branch, while the other going the other way around.In the first case H = −0.25,while in the second H = 0.24.The amplitude of the additive white noise is the same in both cases.
For the window length τ, we have chosen a length of 350 and 200 points, corresponding to about 70 and 41 years, respectively.The window length is chosen so that it is at most half as long as the transition time, which is taken to be the time for the system to arrive at the other equilibrium once it has crossed the unstable branch.Of course, when dealing with simulation data such as The ϒ indicator for Early Warning this, we have the advantage of knowing where the stable and unstable branches are, which is an advantage that anyone dealing with real-world data does not have.In principle one could use the clustering methods proposed by Kaiser et al. 13 to approximate the window length, although this method also requires that one knows how many clusters, i.e., equilibrium states, one should look for.The clustering method works particularly well for noise induced transitions, as one can repeatedly induce transitions back and forth, to gain an ensemble of transitions, yielding a higher degree of accuracy.
In previous works, the choice of τ has largely been guided by a desire to ensure the stationarity of the time series intervals.However, as we are not requiring the individual time series segments to be stationary a priori, we are permitted to use much longer time series intervals.In the world of The ϒ indicator for Early Warning ARIMA fitting a time series of length above 200 points would generally be considered a very long series, however, we should keep in mind that the sampling frequency of our simulated data is quite high; in fact, there are 5 points per time unit (i.e., year), yielding a total of 10000 points for the 2000 years of simulations.An interval consisting of 200 points corresponds to around 40 years, which is not an unreasonably long time interval for the dynamics of the AMOC.When fitting an ARIMA model to a time series, one wishes to avoid too long time series to avoid including events from the past that no longer have any relevance for the future.This, and not the inherent inaccuracy of the fit itself, is the primary reason for limiting the length of a time series.
Returning to Figures 8 and 9, we note that there are a few brightly colored points indicating a high degree of instability.There are for example, in both cases, several points in the middle of the gap between the two stable branches, indicated by solid gray lines in the figure.This is consistent The ϒ indicator for Early Warning with the results of Kaiser et al. 13 .In addition, for the transition from the lower to the upper branch, Figure 8, there are several brightly colored points just after the system has reached the upper equilibrium branch.Although it is not so clear in the figure due to the presence of noise, any time S N returns to the upper equilibrium branch it initially overshoots and then oscillates around the equilibrium value with continuously decreasing amplitude (see Figure 13 for a clearer example of this behavior).This is probably due to the presence of an unstable limit cycle, and the aforementioned sub-critical hopf bifurcation.Hence, we see it as an encouraging sign that the indicator seems to be able to identify these points as well.We further note that, although the result is not shown, the high ϒ value points in figure 8 and 9 correspond to points where ∆ 1 BIC(p, q) is negative, as was the case for the B-tipping example in the previous section.
Looking at the p and q values in Figures 10 and 11, it is clear that high values of ϒ correspond to high values of q, while the connection between p and ϒ remains uncertain.However, we note that shows the same time series, but color coded according to the value of q.
FIG. 13. S N as a function of time, color coded according to the value of ϒ for T f all = 280.With these parameter values, the system does not tip, but returns to the upper equilibrium branch after some time.Note that the system initially overshoots the stable branch upon return.This is probably due to the presence of the unstable limit cycle.The equilibrium branches are plotted in gray, with the dashed line denoting the unstable branch.
the high ϒ values appearing around the transition correspond to high values of both p and q, and consequently also of persistence (result not shown).
The ϒ indicator for Early Warning

C. Rate-induced Tipping
To induce R-tipping we fix H pert below the bifurcation value, ensuring that both equilibria still exist and are stable, and vary T f all .We set T rise = 100 and T pert = 400, while H pert = 0.37.This corresponds to an increase in the freshwater fluxes F T and F N , corresponding to the flux into the tropical and North Atlantic boxes, by approximately 25% and 10%, respectively.Next, we observe that for T f all = 280 the system returns to the upper equilibrium branch, while for T f all = 320, the system transitions to the lower branch.The transition happens even though the bifurcation boundary has not been crossed.Again, we note that some additive white noise has been applied to allow for ARIMA fitting.see several brightly colored points, indicating a high degree of instability, before the system transitions.These points occur initially as the system approaches the unstable branch (between approximately t = 350 and t = 500).These points do not appear for the time series that does not tip, Figure 13, despite the fact that within this time interval, the two time series are virtually identical, and could therefore be an indication of an approaching tipping point.However, again looking at Figure 13 we see some brightly colored points, corresponding to large ϒ, in the interval t = 600 to t = 750, and it is unclear what approaching instability these points would be indicative of, and thus might be regarded as false signals.
Looking at Figure 14, it becomes clear that the high values of ϒ found in Figure 12 correspond to high values of q, while a comparison with Figure 16, gives the same indication for the persistence.
The ϒ indicator for Early Warning FIG.16.Persistence of a time series undergoing rate-induced tipping, plotted as a function of time.The underlying series is the time series shown in Figure 12.We see several high persistence values, corresponding with a high value for the order, q + p (compare with Figure 14), appearing before the potential tipping point around t = 500.
In other words, high values of ϒ primarily correspond to high values of persistence and q.
From Figure 13, we can also see how the indicator correctly identifies the unstable limit cycle, which we have argued causes the overshoot when returning to the upper equilibrium branch.Fig- ure 15 shows the same time series as in Figure 13, color coded according to the values of q and p.While high values of q seem to be associated with increased instability, the high values of p primarily occur as the system returns to the equilibrium.We would therefore suggest that high values of the autoregressive order, p, should be interpreted as an indication that the system is following a moving equilibrium branch.Comparing Figures 16 and ?? it becomes clear that the points with high q value around t = 1000, correspond to particularly high values of persistence, even when compared to other points of similar order.We also note that, as in the previous two tipping scenarios, the high ϒ values, or equivalently high p values, We end this section with a brief comment on the rate-induced tipping example presented in this section.In this example the system is, as it undergoes rate-induced tipping, approaching a bifurcation boundary.It would be instructive to study a case in which this is not the case to ensure that the detected instability is not merely due to the approaching bifurcation boundary.However, as one would need to look at different model examples than those presented here, this is outside the scope of the current work.
The ϒ indicator for Early Warning

IV. COMPARISON WITH OTHER EARLY WARNING INDICATORS
As briefly alluded to in the introduction, it is well established that bifurcation-induced tipping is generally preceded by an increase in lag 1 autocorrelation and variance (Lenton et al. 33 , Dakos et al. 34 , Boers 24 ).The intuition behind this is that as the system approaches a bifurcation point, the potential well flattens out, reducing the speed at which the system recovers from a perturbation, so called "critical slowing down", which should manifest as an increase in the variance and autocorrelation of the time series.However, the variance and autocorrelation might also increase for other reasons, in particular if the properties of the noise changes.What happens to the autocorrelation and variance when the system approaches a rate-induced tipping point is thus far unclear, although it is conceivable that the "critical slowing down" hypothesis still holds for this type of tipping, see Ritchie and Sieber 9 .Obviously, it does not hold true for time series undergoing purely noise induced tipping, as there is no change in the potential well.However, the autocorrelation and variance of the time series will dramatically change as the system crosses the unstable equilibrium branch and enters a different potential well.
In what follows, we will compare these classical indicators to the ϒ indicator for rate-induced and bifurcation-induced tipping in the AMOC 3-box model.It is instructive to just look at the part of the time series prior to the transition, as in general one wishes to be able to detect early signs of the transition before it happens.For the time series undergoing bifurcation-induced tipping (Figure 4) we chose a segment consisting of the points between approximately t = 200 and t = 1100.For the time series undergoing rate-induced tipping (Figure 13), we choose a segment consisting of the points between t = 200 and t = 700.This segment is in all probability too long, meaning that it also contains the transition itself, as opposed to only points prior to the transition.However, this is the inherent difficulty with rate induced tipping; there is currently no way to analytically determine when the transition happens, and one largely has to guess.Based on Figures 12 and 13, one could potentially conclude that the tipping point is found somewhere between t = 400 and t = 600, but this is pure guess work.For this reason we have included points up until t = 700.
Given a set of measurements Y 1 ,Y 2 , • • • ,Y N the sample variance is defined as The ϒ indicator for Early Warning while the lag k autocorrelation is given by where Y denotes the sample mean of the series Jenkins, and Reinsel 35 ).Although time does not enter explicitly in the formulas, it is assumed that the measurements are taken at regular intervals.
When computing the variance and autocorrelation it is essential that the signal is properly detrended; otherwise any trend will immediately obscure the relevant dynamics.As for the ϒ indicator, one generally employs a rolling window approach, with an appropriately chosen window length τ.Lenton et al. 33 demonstrated that detrending can be done within each time window, as opposed to on the whole time series at once, without significantly changing the result.We have chosen this same approach, using linear detrending, as opposed to quadratic or higher order detrending methods, to remove the trend.The window length τ was set to 350 points, corresponding to 70 years.The autocorrelation and variance of a time series can increase for reasons that have nothing to do with an approaching tipping point.Hence, we wish to see how the ϒ indicator responds to colored noise, whose variance and autocorrelation increases with time t.To this end, we construct an artificial time series of the form where ξ (t) is autocorrelated colored noise.ξ (t) is in effect modelled as an ARMA(1,0) process whose AR1 coefficient increases linearly in time.In addition, the variance of this process also increases linearly in time.This is equivalent to the example presented in Boers 24 .Applying the ϒ indicator to this time series yields the result shown in Figure 19. Figure 20 shows a comparison between the autocorrelation, variance and value of ϒ for the same time series.All three indicators show a dramatic increase, despite there being no approaching tipping point.However, looking at the plot of the time series when color coded according to the values of p and q, Figure 21, a curious pattern emerges: the increase in ϒ is largely associated with increased p value.Looking at Figure 22 the trend becomes even clearer: here we have computed the rolling average of the p and q values with a window length of 50 points corresponding to 25 non-dimensional time units.
We see that while the average value of q goes towards zero for large t, the average value of p settles around one.The general trend is independent of the choice of window length, provided the window length is between 30 and 300 points.
This behavior is unlike what was observed for the 3-box model.The high values of ϒ were associated with a high value of q.We thus argue that high values of q were associated with increased The ϒ indicator for Early Warning The ϒ indicator for Early Warning instability, while high values of p were more indicative of the system following a moving equilibrium.
Thus, one would, through the distinction between q and p values, potentially have a way of distinguishing the effect of colored noise from real early warning signals.However, it is conceivable that the result for the artificial colored noise time series is a consequence of how we have constructed the colored noise, so further studies on this are warranted.
Finally, we note that the constructed colored noise time series is a very artificial example of colored noise, as the noise amplitude increases by a probably unrealistic amount, and when applied to any reasonable time series it would obscure the dynamics altogether.This is to say that although we can likely assume that the noise in real-world data is autocorrelated, it will be much more subtle, and not result in equally high values of ϒ.

V. APPLICATION TO SIMULATION DATA FROM CESM2
So far, we have only applied the dynamic stability indicator to data from a very simplified model.The actual ocean has many more degrees of freedom and the response could be quite different.Nevertheless, it is of interest to see how the indicator responds when applied to such a system.To this end, we employ data from the earth systems model CESM2, under two climate scenarios: one in which the atmospheric CO 2 concentration is abruptly doubled and another in which it is abruptly quadrupled.Both simulations were initialized using a pre-industrial control run (piControl) and then run for 500 years.The CO 2 was then increased, at t = 6000 months.
The data was saved at monthly intervals and the seasonal cycle was removed prior to the analysis.
Such an abrupt change in CO 2 represents an extreme forcing, and contrasts with the ramped-up hosing employed with the idealized model.However, the oceanic response is not instantaneous, but requires 2-3 decades for freshwater to circulate in the model's sub-polar gyre 27 .We consider this more hereafter.

A. Abrupt 4 × CO 2
The time series of a monthly-mean density difference, δ ρ, and AMOC strength, ψ AMOC , are shown in Figure 23  linked to the AMOC strength (Madan et al. 27 ), is calculated from the difference in surface densities averaged in boxes to the north and south of the North Atlantic Current.The surface density is calculated using the thermodynamic equation of state of seawater as per UNESCO 1983 Report 36 .
The AMOC strength is calculated as the monthly maxima of meridional overturning stream function between 20 o N-60 o N and below 450 m depth.
Shortly after the quadrupling of CO 2 , there is an abrupt transition followed by a dramatic increase in the variance.We will apply the indicator to the density difference time series, although one could of course apply the same analysis to the AMOC strength.
We choose a window length of 250 data points, corresponding to exactly 20 years of monthly data.
Figure 24 shows the density difference, δ ρ, color coded according to the values of ϒ.We only display the part of the time series close to the transition, as this is of primary interest.The point at which the CO2 concentration is abruptly increased, at t = 6000 months, is indicated by a dashed line.
The increase in ϒ during the early part of the AMOC weakening process is apparent.Note in particular the three sharp peaks shortly after time t=6000.Figure 25 again shows the time series, now color coded according to the values of q and p.The latter are also plotted for further clarification.
From this plot, it becomes clear that the most common fit prior to the transition is the ARMA(1,0) process, which aligns with the observations of Faranda et al. 10 .After the weakening phase, the value of p is generally an order higher, presumably related to the dramatic increase in the variance.The three sharp peaks in the plot of ϒ appearing around time t = 6300 correspond to high The ϒ indicator for Early Warning values of q.The gradual increase in ϒ preceding these peaks is presumably due to the increase in the persistence (not shown).The q component exhibits peaks prior to t = 6000, when the forcing is applied and these are reflected in small peaks in ϒ.These are obviously not connected to the AMOC weakening.Following the initial weakening phase, the value for ϒ remains high, probably a result of the increase in the p value.However, the values of ϒ do not go above 0.4 which is considerably smaller than the values found for the 3-box model.In addition, from our previous discussion on the response of the ϒ indicator to colored noise, it is conceivable that the increase in ϒ observed from in the CESM2 data is primarily caused by changes in the noise amplitude, and not as a consequence of inherent instability of the underlying dynamics.
Furthermore we note that, although the result is not explicitly shown, for the CESM2 data ∆BIC 1 is always smaller than ∆BIC 0 , and the ∆BIC 1 values are at no point negative, implying that the autoregressive coefficient in the ARMA(1,0) model always satisfy the stationarity constraints.This differs from what was observed in the 3-box model and is presumably related to the difference in the observed ϒ values.
However, we emphasize that it is not clear if one in actuality can compare values of ϒ between datasets.For the autocorrelation and the variance it is typically assumed that it is the change within the dataset that is significant, rather than the absolute numerical values.
For completeness, we have included a comparison between ϒ and two other statistical early warning indicators, namely autocorrelation and variance.This is shown in Figure 26.In all cases, the window length is 250 points, corresponding to approximately 20 years.All three indicators show a clear increase shortly after time t = 6000.

B. Abrupt 2 × CO 2
The time series of the monthly density difference, δ ρ, and AMOC strength, ψ AMOC , in the case of abrupt 2 × CO 2 is shown in Figure 27.Again, we only apply the indicator to the density difference data, and choose the same window length as in the case of abrupt 4 × CO 2 .Figure 28 shows an excerpt of the density difference time series close to the initial weakening, as well as a plot of the ϒ values.A weakening is clearly seen in the model's own AMOC measure, and is also accurately captured with the measure based on the density difference across the Gulf Stream (Fig. 27).
The first thing to note is how small the ϒ values are compared to what we have seen previously; on The ϒ indicator for Early Warning the order of 10 −2 .It should, however, be noted that the ∆BIC values are well above the significance threshold 29 .Figure 29 shows the density difference time series color coded according to the value of q and p. From this, we again see that prior to the increase in CO 2 , the most common fit is the ARMA(1,0) process, while after the initial weakening phase the p values show a clear increase.
The q value, on the other hand, does not exceed 2, indicating a very low degree of memory in the noise term.Since we have by now clearly demonstrated a correlation with the value of ϒ and the value of q, this should provide an explanation as to why we see such low values of ϒ.
From this analysis, one would conclude the system does not appear to be approaching a tipping point.Indeed, the measure suggests that the weakening in the overturning in this case with reduced forcing is not associated with a loss of dynamical stability.Once more we have, as shown in Figure 30, included a comparison with other early warning indicators.The autocorrelation and variance show a dramatic increase around time t=6000, which corresponds to the appearance of the cluster of sharp peaks in the time series plot for ϒ.

VI. DISCUSSION
In summary, we analysed an indicator for dynamical stability based on ARMA modelling as a way to detect transitions in complex systems.A detected need for higher order terms in the ARMA model fitted to moving windows of a timeseries is related to diverging memory properties, The ϒ indicator for Early Warning which are expected to arise when approaching a transition to a new equilibrium state.The rationale behind this indicator is that it uses a broad family of linear statistical models that can be fitted even on short time series and which have proven their utility in many contexts (see Brockwell and Davis 28 ).That the underlying models do not require long time series is an advantage when employing a sliding window approach on limited data sets.The method generalizes classical metrics of instability, and allows one to extract more global dynamical information from the time series data.
The indicator was tested on time series data from a 3-box model of the AMOC, where three categories of critical transitions, namely B-, N-, and R-tipping, were explored.In all cases the transition is identified by the indicator, albeit it is not always easy to interpret the signal.In the rate-induced tipping scenario a comparison between the avoided tipping and the tipping cases shows a response of the indicator prior to the transition only in the tipping case although the time series are nearly identical at this stage.The indicator also successfully identifies the unstable limit cycle when returning to the upper equilibrium branch.We similarly see fairly clear signals in the bifurcation-induced tipping scenario prior to the transition.For the case of noise-induced tipping, the signal is less clear, obscured by the high amplitude noise.However, when going from the lower to the upper equilibrium branch the indicator signals an increased degree of instability in accordance with the presence of the unstable limit cycle.
The primary drawback of the ϒ indicator is that it is computationally quite expensive, at least compared to the autocorrelation and variance, and that, due to its complexity, the results can be harder to interpret.We therefore suggest that the indicator should be applied with care, and prefer- ably in combinations with other measures of instability, like the increase in the order, p + q, and the persistence.Although the current scaling with τ, see equation ( 3), seems to yield reasonable results, it is certainly possible that another scaling would be preferred.It is also possible that this is problem-dependent.This uncertainty regarding the correct scaling is certainly a drawback, but we argue that this problem can largely be circumvented by including an examination of the persistence and order values.However, it would still be advantageous to have an indicator whose values were to have a clear meaning in terms of the stability of the system, and it is not clear if the ϒ indicator as it stands achieves this, partly due to the aforementioned issue with the choice of the correct scaling.Although we have attempted to make some comparison to other early warning indicators, like the increase in autocorrelation and variance, we are not claiming that the ϒ indicator is in any way better than these other indicators, rather that it can act as a complementary approach, as it can allow one to extract more information from time series data.For example, we have suggested, that it might be helpful in identifying the effects of colored noise, something the other indicators struggle with.
Furthermore, we note that it is conceivable that one would wish to exclude white noise and pure moving-average, MA(1), processes when doing the fitting, as was done in the earlier studies by Faranda et al. 10 .In such a scenario the modified definition of the ϒ indicator would of course no longer be valid, as the ARMA(0,0) process is excluded, and thus cannot be used as a base model.In this case one might argue that the points where ∆ 1 BIC are negative should either be ignored completely, or one should assume that the best fit is in fact the ARMA(1,0) process and the algorithm is being too strict it its enforcement of the auxiliary conditions on the fitting parameters.
This would of course lead to different results than what has been presented here, and is an option The ϒ indicator for Early Warning worth considering.
When considering a full complexity AMOC model as arising from a global climate model (CESM2) many more degrees of freedom are involved.This has two consequences: firstly, the pure categories of tipping cannot really be expected anymore and secondly, the tipping behaviour might disappear altogether as the added degrees of freedom may stabilize the system.
When applied to the CESM2 data, the results were mixed.The measure exhibited a significant increase in ϒ under the more severe 4xCO 2 forcing but much less variability with the weaker 2xCO 2 forcing.Hence the measure only registers larger changes in AMOC as associated with dynamically unstable behavior.Indeed, it is possible that the model AMOC experiences a continuously shifting steady state, rather than making a transition between two distinct states as in low dimensional models.The results from the doubling CO 2 experiment seems to support this hypothesis.Other members of the CMIP6 ensemble exhibiting very different AMOC weakening from the same forcing, with some declining by only 15% and others falling by 80% 27 , and this suggests a continuum of different responses.
While the results for 4×CO 2 suggest a loss of dynamical stability during the AMOC weakening phase, concluding on the tipping behaviour would require a more in depth analysis along the lines done in Hawkins et al. 23 ; in this paper the bi-stability is clearly demonstrated by exploring a range of hosing experiments.Although we are confident that the ϒ indicator can be used to assess the stability of such complex systems, as was already demonstrated in previous works by Nevo et al. 12 , concluding on the ability to detect critical transitions would require a full analysis of the hysteresis behaviour of the system.the development of the code used for the numerical analysis.

FIG. 1 .
FIG. 1. Sketch of the 5-box model for the Atlantic Meridional Overturning Circulation (AMOC).Here, a light gray coloring is used to denote the two boxes whose salinities do not change, as well as all the arrows indicating terms which do not appear in the equations describing the dynamics of the 3-box model.Adapted from Alkhayuon et al. 15 .

FIG. 2 .
FIG. 2. Schematic illustration of the piece-wise linear hosing function used to simulate the influx of fresh water.Adapted from Alkhayuon et al. 15 .
FIG. 5. ϒ as a function of time for a time series of S N undergoing B-tipping.

FIG. 6 .
FIG. 6. Bifurcation induced tipping of S N (t), color coded according to the value of the best-fit ARMA model orders (a) q and (b) p (scatter plot).The line plots additionally show the same values for q and p as functions of time in (a) and (b), respectively.

FIG. 8 .
FIG. 8. (a) Noise-induced tipping, color coded according to the value of ϒ.The gray lines denote the equilibria, with the dashed line denoting the unstable equilibrium branch.Transition from the lower to the upper equilibrium branch for H = −0.25,τ = 350.(b) Plot of ϒ as a function of time.Note how the peaks correspond to the brightly colored points in (a).

FIG. 9 .
FIG. 9. (a) Noise-induced tipping, color coded according to the value of ϒ.The gray lines denote the equilibria, with the dashed line denoting the unstable equilibrium branch.Transition from the upper to the lower equilibrium branch for H = 0.24, τ = 200.(b) Plot of ϒ as a function of time.Note how the peaks correspond to the brightly colored points in (a).

FIG. 10 .
FIG. 10.Noise-induced tipping of S N (t) for H = −0.25,τ = 350, color coded according to the value of (a) p and (b) q.For clarity we have also plotted is p and q as functions of time in (a) and (b), respectively.

FIG. 11 .
FIG. 11.Noise-induced tipping of S N (t) for H = 0.24, τ = 200, color coded according to the value of (a) p and (b) q.For clarity we have also plotted is p and q as functions of time in (a) and (b), respectively.

FIG. 12 .
FIG. 12. Rate-induced tipping of S N , color coded according to the value of ϒ.The moving equilibria are plotted in gray, with the dashed line denoting the unstable branch.Compare this figure to Figure ??, which

FIG. 14 .
FIG.14.Rate-induced tipping of S N (t), color coded according to the value of (a) q and (b) p.The value for q and p are also plotted as functions of time in (a) and (b), respectively.

Figure 12 FIG. 15 .
Figure12shows a time series undergoing rate-induced tipping, with the color coding corresponding to the values of ϒ. Again, we have chosen τ = 350 points, corresponding to 70 years.We

FIG. 17 .
FIG. 17. Autocorrelation, Variance and ϒ plotted as functions of time for a time series undergoing Btipping.The increase in the variance as one approaches the tipping point is clear, while the increase in autocorrelation is less clear.

Figures 17 and 18
Figures 17 and 18 show the autocorrelation, variance and ϒ plotted as functions of time.The peaks in ϒ preceding the transition are clear, as is the increase in variance and autocorrelation, at least in the case of R-tipping, provided the tipping point is approximately at t = 450.For Btipping, there appears to be a clear increase in the variance preceding the tipping point, provided the tipping point happens around t = 850 (see Figure4for comparison).The expected increase in

FIG. 21 .
FIG. 21.Time series with colored noise and no tipping points, corresponding to equation (21), color coded according to the value of (a) q and (b) p.

for the case of abrupt 4 ×
FIG.23.CESM2 model with abrupt 4×CO 2 , where the monthly density difference (blue) is plotted together with the maximum AMOC flow strength (red).Note that the CO 2 was increased at t=6000 months.

FIG. 24 .
FIG. 24.Time series of monthly density changes for abrupt 4 × CO 2 , color coded according to the value of ϒ.The window length is 250 points, corresponding to exactly 20 years.The dashed line indicates the point when the CO2 concentration abruptly changes.

FIG. 25 .
FIG.25.Time series of monthly density changes for abrupt 4 × CO 2 , color coded according to the value of (a) q and (b) p.The value for q and p are also plotted as functions of time in (a) and (b), respectively.

The
FIG. 26.Autocorrelation, variance and ϒ plotted as functions of time for the case of abrupt 4 × CO 2 .

FIG. 29 .
FIG.29.Time series of monthly density changes for abrupt 2 × CO 2 , color coded according to the value of (a) q and (b) p.The value for q and p are also plotted as functions of time in (a) and (b), respectively.
0.3683 × 10 7 m 3 S N = 0.034912 F N = 0.486 Sv Tropical Atlantic V T = 0.5418 × 10 7 m 3 S T = 0.035435 F T = −0.997Sv Southern Ocean V S = 0.6097 × 10 7 m 3 S S = 0.034427 F S = 1.265Sv Indo-Pacific V IP = 1.4860 × 10 7 m 3 S IP = 0.034668 F IP = −0.754Sv Bottom Ocean V B = 9.9250 × 10 7 m 3 S B = 0.034538 Before proceeding, we should clarify one point regarding noise-induced tipping, and what is meant by an early warning indicator in this context.Noise-induced tipping is inherently unpredictable, and hence one might conclude that any attempt at predicting such transitions is doomed to fail based on a single time series.In contrast, assuming the underlying model is known, one could use ensembles of realizations to estimate the likelihood of noise-induced transitions.Examples of these statistical approaches are discussed in Thompson and Sieber 31 .Although one cannot expect to develop an early warning indicator for these types of transitions, one should at the very least be able to tell, from time series data, once