Ocean, atmosphere and climate sciences now face a deluge of data pouring in from space, in situ monitoring and numerical simulations. The availability of these different data sources offers new opportunities, still largely underexploited, to improve the understanding, modeling and reconstruction of geophysical dynamics. In a classical filtering or smoothing framework, data assimilation methods address the reconstruction of space-time dynamics from observations. They require multiple runs of an explicit dynamical model and may suffer from severe limitations, including computational cost, a lack of consistency between the model and the observed data, and modeling uncertainties. Here, we explore an alternative strategy by developing a fully data-driven assimilation. No explicit knowledge of the dynamical model is required; only a representative catalog of trajectories of the system is assumed to be available. Based on this catalog, we introduce the Analog Data Assimilation (AnDA), which combines machine learning and stochastic assimilation techniques. It relies on non-parametric sampling of the dynamical model using different analog forecasting methods, so that the physical model is never evaluated online.
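To make the idea of sampling the dynamics from a catalog concrete, the following is a minimal sketch of one possible analog forecasting operator, and not necessarily the one used in AnDA. It assumes the catalog is stored as pairs of states and their successors one time step later. The function names, the Gaussian kernel, the median-distance scaling and the number of analogs k are illustrative assumptions.

```python
import numpy as np

def analog_forecast(x, catalog_analogs, catalog_successors, k=5, rng=None):
    """Sample a one-step forecast of state x from its k nearest analogs.

    Illustrative locally-constant operator: the forecast is drawn from a
    Gaussian whose mean and covariance are the kernel-weighted mean and
    covariance of the successors of the k nearest catalog states.
    """
    rng = np.random.default_rng() if rng is None else rng
    # Euclidean distance from x to every state in the catalog
    dist = np.linalg.norm(catalog_analogs - x, axis=1)
    idx = np.argsort(dist)[:k]
    # Gaussian kernel weights, scaled by the median distance (an assumption)
    scale = np.median(dist[idx]) + 1e-12
    w = np.exp(-(dist[idx] / scale) ** 2)
    w /= w.sum()
    # Weighted mean and covariance of the analogs' successors
    mean = w @ catalog_successors[idx]
    dev = catalog_successors[idx] - mean
    cov = (w[:, None] * dev).T @ dev
    sample = rng.multivariate_normal(mean, cov)
    return sample, mean, cov

def toy_catalog(n=1000):
    """Hypothetical toy catalog: points on the unit circle, rotated by one step."""
    theta = np.linspace(0.0, 2.0 * np.pi, n + 1)
    states = np.column_stack([np.cos(theta), np.sin(theta)])
    return states[:-1], states[1:], theta[1] - theta[0]
```

Calling `analog_forecast` once per ensemble member or particle gives a stochastic forecast step that uses the catalog alone, with no call to a physical model.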
In this presentation, we explore different analog forecasting strategies and derive both ensemble Kalman and particle filtering versions of the proposed AnDA approach. We test the methodology on the Lorenz-63 and Lorenz-96 chaotic systems and compare it with classical model-driven assimilation. Finally, we present a Matlab toolbox and a Python library implementing the AnDA procedure.
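As an illustration of what a catalog of trajectories might look like for the Lorenz-63 test case, the sketch below integrates the system with a fourth-order Runge-Kutta scheme and stores consecutive states as (analog, successor) pairs. It is not the interface of the toolbox or library mentioned above. The parameter values (sigma = 10, rho = 28, beta = 8/3) are the standard chaotic setting, while the time step, spin-up length, initial condition and function names are assumptions.

```python
import numpy as np

def lorenz63(state, sigma=10.0, rho=28.0, beta=8.0 / 3.0):
    """Right-hand side of the Lorenz-63 equations (standard parameters assumed)."""
    x, y, z = state
    return np.array([sigma * (y - x), x * (rho - z) - y, x * y - beta * z])

def rk4_step(f, state, dt):
    """One fourth-order Runge-Kutta step of size dt."""
    k1 = f(state)
    k2 = f(state + 0.5 * dt * k1)
    k3 = f(state + 0.5 * dt * k2)
    k4 = f(state + dt * k3)
    return state + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

def build_catalog(n_steps=5000, dt=0.01, spinup=1000, x0=(8.0, 0.0, 30.0)):
    """Return (analogs, successors): states and their images one step later."""
    state = np.array(x0, dtype=float)
    # Discard a transient so that the catalog lies on the attractor
    for _ in range(spinup):
        state = rk4_step(lorenz63, state, dt)
    traj = np.empty((n_steps + 1, 3))
    traj[0] = state
    for t in range(n_steps):
        traj[t + 1] = rk4_step(lorenz63, traj[t], dt)
    return traj[:-1], traj[1:]
```

In a fully data-driven setting the catalog would come from observations or archived simulations. The model is integrated here only to produce a synthetic catalog for testing.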