The international scientific context highlights the role played by models and observation systems in evaluating and forecasting environmental risks.
The complexity of environmental phenomena, as well as the operational objectives of risk mitigation, requires a tight interweaving of physical models, data processing, simulation, visualization and database tools.
This situation is met for instance in atmospheric pollution, an environmental domain whose modeling is gaining ever-increasing significance and impact, whether at the local (air quality), regional (transboundary pollution) or global scale (greenhouse effect). In this domain, modeling systems are used for operational forecasts (short or long term), detailed case studies, impact studies for industrial sites, as well as coupled modeling (e.g., pollution and health, pollution and economy). These scientific subjects strongly require linking the models with all available data, whether of physical origin (e.g., model outputs), coming from raw observations (satellite acquisitions and/or information measured in situ by an observation network), or obtained by processing and analysis of these observations (e.g., chemical concentrations retrieved by inversion of a radiative transfer model).
Clime was jointly created by INRIA and École des Ponts ParisTech to study these questions with researchers in data assimilation, image processing, and modeling.
Clime carries out research activities in three main areas:
Data assimilation methods: inverse modeling, network design, ensemble methods, uncertainty estimation and propagation.
Image assimilation: assimilation of structures in environmental forecasting models, study of ill-posed image processing problems with data assimilation techniques, definition of dynamic models from images, model reduction.
Development of integrated chains for data/models/outputs (system architecture, workflow, database, visualization).
This activity is currently one of the major concerns of the environmental sciences. It involves the design and use of data assimilation methods, for instance variational methods (such as 4D-Var). An emerging issue lies in the propagation of uncertainties by models, notably through ensemble forecasting methods.
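As a reminder, in its standard generic form (not specific to the team's implementations), the strong-constraint 4D-Var mentioned above estimates the initial state $x_0$ by minimizing the cost function:

```latex
J(x_0) = \tfrac{1}{2}\,(x_0 - x_b)^{\mathrm T}\mathbf{B}^{-1}(x_0 - x_b)
       + \tfrac{1}{2}\sum_{i=0}^{N}\big(y_i - H_i(x_i)\big)^{\mathrm T}\mathbf{R}_i^{-1}\big(y_i - H_i(x_i)\big),
\qquad x_i = M_{0 \to i}(x_0),
```

where $x_b$ is the background state, $\mathbf{B}$ and $\mathbf{R}_i$ the background and observation error covariance matrices, $H_i$ the observation operators, $y_i$ the observations, and $M_{0\to i}$ the model integration from the initial time to time $i$.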
Although modeling is not part of the scientific objectives of Clime, the project-team has complete access to models developed by CEREA: the models from Polyphemus (pollution forecasting from local to regional scales) and Code_Saturne (urban scale). In regard to other modeling domains, Clime accesses models through co-operation initiatives either directly (for instance, the ocean model developed at MHI, Ukraine, has been provided to the team), or indirectly (for instance, issues on image assimilation in meteorology are studied in collaboration with operational centers).
The research activities tackle scientific issues such as:
Within a family of models (differing by their physical formulations and numerical approximations), which is the optimal model for a given set of observations?
How to produce a forecast (and a better forecast!) by using several models corresponding to different physical formulations? This also raises the question: how should data be assimilated in this context?
Which observational network should be set up to perform a better forecast, while taking into account additional criteria such as observation cost? What are the optimal location, type and mode of deployment of sensors? How should the trajectories of mobile sensors be managed while the studied phenomenon evolves in time? This issue is usually referred to as “network design”.
How to assess the quality of a forecast? How do data quality, missing data, and data obtained from sub-optimal locations affect the forecast? How to better include information on uncertainties (of data, of models) within the data assimilation system?
In geosciences, the issue of coupling data, in particular satellite acquisitions, and models is extensively studied for meteorology, oceanography, chemistry-transport models, land surface models. However, satellite images are mainly assimilated on a point-wise basis, without taking into account their spatial structures. To better understand our research orientation, a classification of image assimilation methods is proposed:
Image approach. Image assimilation allows the extraction of features from image sequences, for instance motion fields. A model of the dynamics is considered (often obtained by simplification of a physical model such as the Navier-Stokes equations). An observation operator is defined to express the links between the model state and the pixel value. In the simplest case, the pixel value corresponds to one coordinate of the model state and the observation operator is a projection. However, in most cases, the operator is highly complex, implicit and non-linear. Data assimilation techniques are developed to control the initial state or the whole assimilation window. Image assimilation is also applied to learn reduced models from image data and estimate a reliable and small-size reconstruction of the dynamics.
Model approach. Image assimilation is used to control an environmental model and obtain improved forecasts. In order to take into account the spatial and temporal coherency of structures, specific image characteristics are considered, and dedicated norms and observation error covariances are defined.
Correcting a model. Another topic, mainly described for meteorology in the literature, concerns the location of structures. How to force the existence, and correct the location, of structures in the model state using image information? Most operational meteorological forecasting institutes, such as Météo-France, the UK Met Office, KNMI (in the Netherlands), ZAMG (in Austria) and Met-No (in Norway), study this issue, because operational forecasters often modify their forecasts based on comparisons between the model outputs and the structures displayed on satellite images.
An objective of Clime is to participate in the design and creation of software chains for impact assessment and environmental crisis management. Such software chains bring together static or dynamic databases, data assimilation systems, forecast models, processing methods for environmental data and images, complex visualization tools, scientific workflows, etc.
Clime is currently building, in partnership with École des Ponts ParisTech and EDF R&D, such a system for air pollution modeling: Polyphemus (see the web site http://
The central application domain of the project-team is atmospheric chemistry. We develop and maintain the air quality modeling system Polyphemus, which includes several numerical models (Gaussian models, a Lagrangian model, two 3D Eulerian models including Polair3D) and their adjoints, as well as different high-level methods: sequential and variational data assimilation algorithms. Advanced data assimilation methods, network design, inverse modeling and ensemble forecasting are studied in the context of air chemistry; note that addressing these high-level issues requires controlling the full software chain (models and data assimilation algorithms).
The activity on assimilation of satellite data is mainly carried out for meteorology and oceanography. This is addressed in cooperation with external partners who provide the numerical models. Concerning oceanography, the aim is to improve the forecast of ocean circulation, by assimilation of fronts and vortices displayed on the image data. Concerning meteorology, the focus is on correcting the model location of structures related to high-impact weather events (cyclones, convective storms, etc.) by assimilating the images.
Air quality modeling implies studying the interactions between meteorology and atmospheric chemistry in the various phases of matter, which leads to the development of highly complex models. The different usages of these models comprise operational forecasting, case studies, impact studies, etc., with both societal (e.g., public information on pollution forecasts) and economic impacts (e.g., impact studies for dangerous industrial sites). Models lack some appropriate data, for instance better emissions, to perform accurate forecasts, and data assimilation techniques are recognized as a key point for improving forecast quality.
In this context, Clime is interested in various problems, the following being the crucial ones:
The development of ensemble forecast methods for estimating the quality of the prediction, in relation to the quality of the model and the observations. Sensitivity analysis with respect to the model's parameters, so as to identify the physical and chemical processes whose modeling must be improved.
The development of methodologies for the sequential aggregation of ensemble simulations. Which ensembles should be generated for that purpose? How can spatialized forecasts be generated with aggregation? How can the different approaches be coupled with data assimilation?
The definition of second-order data assimilation methods for the design of optimal observation networks. Management of combinations of sensor types and deployment modes. Dynamic management of mobile sensors' trajectories.
How to estimate the emission rate of an accidental release of a pollutant, using observations and a dispersion model (from the near-field to the continental scale)? How to optimally predict the evolution of a plume? Hence, how to help people in charge of risk evaluation for the population?
The definition of non-Gaussian approaches for data assimilation.
The assimilation of satellite measurements of troposphere chemistry.
The activities of Clime in air quality are supported by the development of the Polyphemus air quality modeling system. This system has a modular design, which makes it easier to manage high level applications such as inverse modeling, data assimilation and ensemble forecast.
The capacity to perform high-quality forecasts of the state of the ocean, from the regional to the global scale, is of major interest. Such forecasts can only be obtained by systematically coupling numerical models and observations (in situ and satellite data). In this context, being able to assimilate image structures becomes fundamental. Examples of such image structures are:
apparent motion linked to surface velocity;
trajectories, obtained either from tracking of features or from integration of the velocity field;
spatial structures, such as fronts, eddies or filaments.
Image Models for these structures are developed taking into account the underlying physical processes. Image data are assimilated within the Image Models to derive pseudo-observations of the state variables, which are further assimilated within the numerical ocean forecast model.
Meteorological forecasting constitutes a major applicative challenge for Image Assimilation. Although satellite data are operationally assimilated within models, this is mainly done on an independent pixel basis: the observed radiance is linked to the state variable via a radiative transfer model that plays the role of an observation operator. Indeed, because of their limited spatial and temporal resolutions, numerical weather forecast models fail to exploit image structures, such as precursors of high impact weather:
cyclogenesis related to the intrusion of dry stratospheric air in the troposphere (a precursor of cyclones);
convective systems (supercells) leading to heavy wintertime storms;
low-level temperature inversion leading to fog and ice formation, etc.
To date, there is no available method for assimilating such data, which are characterized by strong coherence in space and time. Meteorologists have developed qualitative Conceptual Models (CMs) to describe high-impact weather events and their signature on images, as well as tools to detect CMs in image data. The result of this detection is used to correct the numerical models, for instance by modifying the initialization. The aim is therefore to develop a methodological framework allowing the detected CMs to be assimilated within numerical forecast models. This is a challenging issue given the considerable impact of the related meteorological events.
“Urban Air Quality Analysis” carries out data assimilation at urban scale. It merges the outputs of a numerical model (maps of pollutant concentrations) with observations from an air quality monitoring network, in order to produce the so-called analyses, that is, corrected concentration maps. The data assimilation computes the Best Linear Unbiased Estimator (BLUE), with a call to the data assimilation library Verdandi. The error covariance matrices are parameterized for both model simulations and observations. For the model state error covariances, the parameterization primarily relies on the road network. The software handles ADMS output files, for a posteriori analyses or in an operational context.
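The BLUE analysis referred to above follows the classical linear estimation formula. The following is a minimal NumPy sketch with generic matrices; it illustrates the computation, not the actual Verdandi interface:

```python
import numpy as np

def blue_analysis(xb, y, H, B, R):
    """Best Linear Unbiased Estimator: xa = xb + K (y - H xb),
    with gain K = B H^T (H B H^T + R)^{-1}.
    xb: background state; y: observations; H: observation operator;
    B, R: background and observation error covariance matrices."""
    S = H @ B @ H.T + R                # innovation covariance
    K = B @ H.T @ np.linalg.inv(S)     # gain matrix
    xa = xb + K @ (y - H @ xb)         # analysis state (corrected map)
    A = (np.eye(len(xb)) - K @ H) @ B  # analysis error covariance
    return xa, A
```

In the urban application, `xb` would be the hourly model concentration map flattened into a vector, and `H` the selection of the monitored locations.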
Polyphemus (see the web site http://
libraries that gather data processing tools (SeldonData), physical parameterizations (AtmoData) and postprocessing abilities (AtmoPy);
programs for physical preprocessing and chemistry-transport models (Polair3D, Castor, two Gaussian models, a Lagrangian model);
drivers on top of the models in order to implement advanced simulation methods such as data assimilation algorithms.
Figure depicts a typical result produced by Polyphemus. Clime is involved in the overall design of the system and in the development of advanced methods in model coupling, data assimilation and ensemble forecast (through drivers and post-processing).
In 2011, Polyphemus was extended for a better integration with the data assimilation library Verdandi. A first (unstable) version of Polyphemus with a complete overhaul of the input/output operations and of the configuration files was provided to the developers. The derivative of Polyphemus that is used at IRSN was used for the first time in a crisis context in order to simulate the transport of radionuclides during the Fukushima nuclear disaster.
The leading idea is to develop a data assimilation library intended to be generic, at least for high-dimensional systems. Data assimilation methods, developed and used by several teams at INRIA, are generic enough to be coded independently of the system to which they are applied. Therefore these methods can be put together in a library aiming at:
easing the application of these methods to a great number of problems,
making the developments perennial and allowing them to be shared,
improving the dissemination of data assimilation work.
An object-oriented language (C++) has been chosen for the core of the library. A high-level interface to Python is automatically built. The design raised many questions related to high-dimensional scientific computing, the limits of the object contents and their interfaces. The chosen object-oriented design is mainly based on three class hierarchies: the methods, the observation managers and the models. Several base facilities have also been included, for message exchanges between the objects, saving outputs, logging, and computing with sparse matrices.
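To illustrate how the three class hierarchies interact, here is a deliberately simplified Python sketch of a method driving a model and an observation manager. All class and method names are illustrative; they do not reproduce the actual Verdandi API:

```python
import numpy as np

class LinearModel:
    """Toy model x_{k+1} = a * x_k, standing in for the model hierarchy."""
    def __init__(self, x0, a=0.9):
        self.state = np.asarray(x0, float)
        self.a = a
    def forward(self):
        self.state = self.a * self.state

class ObservationManager:
    """Provides observations and the observation operator (here, identity)."""
    def __init__(self, obs):
        self.obs = obs
    def get_observation(self, k):
        return self.obs[k]
    def apply_operator(self, x):
        return x

class OptimalInterpolation:
    """Assimilation method: couples a model with an observation manager."""
    def __init__(self, model, obsman, gain=0.5):
        self.model, self.obsman, self.gain = model, obsman, gain
    def step(self, k):
        self.model.forward()
        innovation = (self.obsman.get_observation(k)
                      - self.obsman.apply_operator(self.model.state))
        self.model.state = self.model.state + self.gain * innovation
```

The separation mirrors the library's design goal: the method only manipulates the model and the observation manager through a narrow interface, so the same method applies to any compliant model.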
In 2011, versions 0.9, 1.0 and 1.1 of Verdandi were released. These versions are advanced enough to be used by the data assimilation community. Compared to previous versions, the additions are: 4D-Var, ensemble Kalman filter, redesigned perturbation managers, sequential aggregation, improvements in the documentation and an improved support of Windows.
Since its beginning, Clime has focused on new techniques for data assimilation. Last year, the methodological development focused on the use of non-Gaussian approaches for inverse modeling and on the construction of a multiscale data assimilation methodology. Several methodological papers have now been published on these topics. This year, the applications of these methodologies are put forward in the inverse modeling section, although new theoretical developments have been added to these approaches. In addition, new topics have been addressed, such as an ensemble Kalman filter with a theory that puts the EnKF on firmer ground, the use of 4D-Var for the estimation of parameter fields in dispersion models, and real-time data assimilation at urban scale.
The main intrinsic source of error in the ensemble Kalman filter (EnKF) is sampling error. External sources of error, such as model error or deviations from Gaussianity, depend on the dynamical properties of the model. Sampling errors can lead to instability of the filter, which, as a consequence, often requires inflation and localization. The goal of this study is to derive an ensemble Kalman filter which is less sensitive to sampling errors. A prior probability density function conditional on the forecast ensemble is derived using Bayesian principles. Even though this prior is built upon the assumption that the ensemble is Gaussian-distributed, it is different from the Gaussian probability density function defined by the empirical mean and the empirical error covariance matrix of the ensemble, which is implicitly used in traditional EnKFs. This new prior generates a new class of ensemble Kalman filters, called finite-size ensemble Kalman filters (EnKF-N). One deterministic variant, the finite-size ensemble transform Kalman filter (ETKF-N), is derived. It is tested on the Lorenz '63 and Lorenz '95 models. In this context, ETKF-N is shown to be stable without inflation for ensemble sizes greater than the dimension of the model's unstable subspace, at the same numerical cost as the ensemble transform Kalman filter (ETKF). One variant of ETKF-N seems to systematically outperform the ETKF with optimally tuned inflation. However, it is shown that ETKF-N does not account for all sampling errors and requires localization, like any EnKF, whenever the ensemble size is too small. In order to explore the need for inflation in this small-ensemble regime, a local version of the new class of filters is defined (LETKF-N) and tested on the Lorenz '95 toy model. Whatever the size of the ensemble, the filter is stable.
Its performance without inflation is slightly inferior to that of the LETKF with optimally tuned inflation for small time intervals between updates, and superior for large time intervals between updates.
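For reference, the analysis step of the standard ETKF, which ETKF-N modifies through its choice of prior, can be sketched as follows. This is a minimal NumPy implementation without inflation or localization, assuming for simplicity a linear observation operator:

```python
import numpy as np

def etkf_analysis(E, y, H, R):
    """Standard ETKF analysis step (no inflation, no localization).
    E: (n, m) forecast ensemble; y: (p,) observation vector;
    H: (p, n) linear observation operator; R: (p, p) obs error covariance."""
    n, m = E.shape
    xb = E.mean(axis=1)
    X = E - xb[:, None]                      # state anomalies
    Y = H @ X                                # observation-space anomalies
    Rinv = np.linalg.inv(R)
    # Analysis covariance in ensemble space: [(m-1) I + Y^T R^-1 Y]^-1
    C = (m - 1) * np.eye(m) + Y.T @ Rinv @ Y
    w, V = np.linalg.eigh(C)
    Cinv = V @ np.diag(1.0 / w) @ V.T
    wa = Cinv @ Y.T @ Rinv @ (y - H @ xb)    # weights updating the mean
    Wa = V @ np.diag(np.sqrt((m - 1) / w)) @ V.T  # symmetric transform
    return xb[:, None] + X @ (wa[:, None] + Wa)   # analysis ensemble
```

In this one-dimensional check the analysis mean and variance coincide with the exact Kalman filter values, as they should for a linear Gaussian problem.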
Atmospheric chemistry and air quality numerical models are driven by uncertain forcing fields: emissions, boundary conditions, wind fields, vertical turbulent diffusivity, kinetic chemical rates, etc. Data assimilation can help to assess these parameters or parameter fields. Because those parameters are often much more uncertain than the fields diagnosed in meteorology and oceanography, data assimilation is much more an inverse modeling challenge in this context. In this study, we experiment with these ideas by revisiting the Chernobyl accident dispersion event over Europe. We develop a fast four-dimensional variational scheme (4D-Var), which seems appropriate for the retrieval of large parameter fields from large observation sets, and for the retrieval of parameters that are non-linearly related to concentrations. The 4D-Var, and especially an approximate adjoint of the transport model, is tested and validated using several advection schemes that influence the forward simulation as well as the data assimilation results. Firstly, the inverse modeling system is applied to the assessment of the dry and wet deposition parameters. It is then applied to the retrieval of the emission field alone, to the joint optimization of removal process parameters and source parameters, and to the optimization of larger parameter fields, such as the horizontal and vertical diffusivities or the dry deposition velocity field. The physical parameters used so far in the literature for the Chernobyl dispersion simulation are partly supported by the study. The crucial question of deciding whether such an inversion is merely a tuning of parameters, or a retrieval of physically meaningful quantities, is discussed. Even though the inversion of parameter fields may fail to determine physical values for the parameters, it achieves a statistical adaptation that partially corrects for model errors and, using the inverted parameter fields, leads to considerable improvement in the simulation scores.
Based on Verdandi, Polyphemus and the “Urban Air Quality Analysis” software, real-time data assimilation was carried out at urban scale. The Best Linear Unbiased Estimator (BLUE) was computed for every hourly concentration map computed by the ADMS model. A posteriori tests were conducted over Clermont-Ferrand and Paris. We addressed the key issue of the covariance of the state error. The form of the error covariance between two points was determined based on the road network, considering the distance between points along the road and the distance of each point to the road. A few parameters (primarily two decorrelation lengths) were determined thanks to cross validation with several months of simulations and observations. The results showed strong improvements even at locations where no data was assimilated. The assimilation was carried out in the prototype “Votre Air” (http://
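The exact parameterization used in this work is given in the corresponding publications; the following is only a hypothetical sketch of a covariance model combining a distance along the road network with the distances of each point to the road, governed by two decorrelation lengths. All names and default values are illustrative:

```python
import numpy as np

def state_error_covariance(d_along, d_to_road, variance=1.0,
                           l_along=2000.0, l_perp=500.0):
    """Hypothetical two-length-scale covariance model (distances in meters).
    d_along[i, j]: distance between points i and j along the road network;
    d_to_road[i]: distance from point i to the nearest road;
    l_along, l_perp: decorrelation lengths along and away from roads."""
    off_road = d_to_road[:, None] + d_to_road[None, :]
    return (variance
            * np.exp(-d_along / l_along)     # decay along the road network
            * np.exp(-off_road / l_perp))    # decay away from the roads
```

Such a form makes errors at two points on the same road strongly correlated, while points far from any road decorrelate quickly; the two lengths are the kind of parameters that the cross validation mentioned above would tune.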
At larger scale, the data assimilation library Verdandi was used to apply data assimilation (optimal interpolation) with the air quality model Chimere. This preliminary work will help INERIS to apply optimal interpolation in the operational platform Prev'air.
Many of this year's studies have focused on inverse modeling, including the reconstruction of the Fukushima radionuclide source term. All were targeted at a particular application; however, most of them include new methodological developments.
The aim of this research activity is the implementation of data assimilation methods, particularly inverse modeling methods, in the context of an accidental radiological release from a nuclear power plant, and their application in the specific case of the Fukushima Daiichi accident. The particular methodological focus is the a posteriori estimation of the prior error statistics. In the case of the Fukushima Daiichi accident, the number of available observations is small compared to the number of source parameters to retrieve, and the reconstructed source is highly sensitive to the prior errors. That is why they need to be well established and justified.
To this end, three methods have been proposed: one relies on an L-curve estimation technique, another on Desroziers' iterative scheme, and the last one, assumed to be the most robust, relies on the maximum likelihood principle (generalized to a non-Gaussian context).
These three methods have been applied to the reconstruction of the cesium-137 and iodine-131 source terms from the Fukushima Daiichi accident. Because of the poor observability of the Fukushima Daiichi emissions, these methods provide lower bounds for the reconstructed cesium-137 and iodine-131 activities. Nevertheless, with the new method based on semi-Gaussian statistics for the background errors, these lower-bound estimates,
This study applies our previous theoretical developments on a consistent Bayesian multiscale formalism to the optimal design of the control space (in which the control variables are to be estimated). We construct optimal adaptive representations of the surface fluxes for mesoscale carbon dioxide inversions. Such representations are taken from a large dictionary of adaptive multiscale grids. These optimal representations are obtained by maximizing the number of Degrees of Freedom for the Signal (DFS), which measures the information gain from observations to resolve the unknown fluxes. Consequently, information from observations can be better propagated within the domain through these optimal representations.
The optimal representations are constructed using synthetic continuous hourly carbon dioxide concentration data in the context of the Ring 2 experiment in support of the North American Carbon Program Mid Continent Intensive (MCI). Compared with the regular grid at the finest scale, the optimal representations can achieve similar inversion performance with far fewer grid cells. For the Ring 2 network of eight towers, in most cases, the DFS value is relatively small compared to the number of observations.
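For reference, in the linear Gaussian setting the DFS criterion can be written with standard data assimilation notation (the notation below is generic, not the report's):

```latex
% B: prior (background) error covariance, R: observation error covariance,
% H: observation operator, K: BLUE gain, P^a: posterior error covariance.
\mathrm{DFS} = \operatorname{tr}(\mathbf{K}\mathbf{H})
             = \operatorname{tr}\!\left(\mathbf{I} - \mathbf{P}^{a}\mathbf{B}^{-1}\right),
\qquad
\mathbf{K} = \mathbf{B}\mathbf{H}^{\mathrm T}\!\left(\mathbf{H}\mathbf{B}\mathbf{H}^{\mathrm T} + \mathbf{R}\right)^{-1}.
```

The DFS ranges from 0 (the observations bring no information on the control variables) to the number of observations, which explains why a small DFS relative to the observation count indicates redundancy in the data.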
In this section, we report studies related to the evaluation of monitoring networks and to new monitoring strategies. As opposed to last year's report, they may not strictly address optimal network design.
Following the eruption of the Icelandic volcano Eyjafjallajökull on April 14, 2010, ground-based N2-Raman lidar (GBL) measurements were used to trace the temporal evolution of the ash plume from April 16 to April 20, 2010, above the southwestern suburb of Paris. The nighttime overpass on April 17, 2010 of the Cloud-Aerosol LIdar with Orthogonal Polarization onboard the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observation satellite (CALIPSO/CALIOP) was an opportunity to complement the GBL observations. The plume shape retrieved from the GBL has been used to assess the size range of the particles. The lidar-derived aerosol mass concentrations (PM) have been compared with PM concentrations from the Eulerian transport model Polair3D, driven by a source term inferred from the SEVIRI sensor on board the satellite Meteosat. The consistency between the model, the ground-based wind lidar and the CALIOP observations has been checked. The spatial and temporal structures of the ash plume as estimated by each instrument and by the Polair3D simulations are in good agreement.
The International Monitoring System (IMS) radionuclide network enforces the Comprehensive Nuclear-Test-Ban Treaty, which bans nuclear explosions. We have evaluated the potential of the IMS radionuclide network for inverse modeling of the source, whereas it is usually assessed through its detection capability. To do so, we have chosen the Degrees of Freedom for the Signal (DFS), a well-established criterion in remote sensing, to assess the performance of an inverse modeling system. Using a recent multiscale data assimilation technique, we have computed optimal adaptive grids of the source parameter space by maximizing the DFS. This optimization takes into account the monitoring network, the meteorology over one year (2009) and the relationship between the source parameters and the observations, derived from the FLEXPART Lagrangian transport model. Areas of the domain where the grid-cells of the optimal adaptive grid are large emphasize zones where the retrieval is more uncertain, whereas areas where the grid-cells are smaller and denser stress regions where more source variables can be resolved.
The observability of the globe through inverse modeling is studied in strong, realistic and small model error cases. The strong and realistic error cases yield heterogeneous adaptive grids, indicating that information does not propagate far from the monitoring stations, whereas in the small error case the grid is much more homogeneous. In all cases, several specific continental regions remain poorly observed, such as Africa as well as the tropics, because of the trade winds. The northern hemisphere is better observed through inverse modeling (more than 60% of the total DFS), mostly because it contains more IMS stations. This imbalance leads to better performance of inverse modeling during the northern hemisphere winter. The methodology is also applied to the subnetwork composed of the stations of the IMS network that measure noble gases.
In this study, the BDQA background stations are partially redistributed over France under a set of design objectives defined on a regular grid covering France. Spatial interpolation is used to extrapolate simulated concentrations (from chemistry-transport models or assimilation results) to these grid nodes. Three types of criteria are considered: geostatistical, geometrical and physical ones. Simulated annealing is employed to optimally select the stations. Significant improvement with all the proposed criteria has been found for the optimally redistributed network compared with the original background BDQA network. For complex objectives, e.g., one addressing the heterogeneity of the ozone field, the physical criteria are more appropriate.
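The selection procedure can be sketched as a generic simulated annealing loop over candidate stations. The cost function stands in for the geostatistical, geometrical or physical criteria of the study; all names, the cooling schedule and the default values are illustrative:

```python
import numpy as np

def anneal_network(cost, candidates, n_select, n_iter=5000, t0=1.0, seed=0):
    """Simulated annealing for station selection: at each iteration, swap one
    selected station with an unselected candidate and accept the move with
    the Metropolis rule. cost(selection) must return a value to minimize."""
    rng = np.random.default_rng(seed)
    current = list(rng.choice(len(candidates), n_select, replace=False))
    cur_cost = cost(current)
    best, best_cost = current[:], cur_cost
    for k in range(n_iter):
        t = t0 * (1.0 - k / n_iter)          # linear cooling schedule
        proposal = current[:]
        i = rng.integers(n_select)
        outside = [j for j in range(len(candidates)) if j not in current]
        proposal[i] = outside[rng.integers(len(outside))]
        c = cost(proposal)
        # Accept improvements always, degradations with probability exp(-dc/t).
        if c < cur_cost or rng.random() < np.exp(-(c - cur_cost) / max(t, 1e-9)):
            current, cur_cost = proposal, c
            if c < best_cost:
                best, best_cost = proposal[:], c
    return best, best_cost
```

With a cost rewarding spatial spread, for instance, the loop converges to the most distant pair of candidate sites on a toy one-dimensional domain.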
Due to the great uncertainties that arise in air quality modeling, relying on a single model may not be sufficient. Therefore, ensembles of simulations are now considered in a wide range of applications, from uncertainty estimation to operational forecasting.
Air quality forecasts are limited by strong uncertainties especially in the input data and in the physical formulation of the models. There is a need to estimate these uncertainties for the evaluation of the forecasts, the production of probabilistic forecasts, and a more accurate estimation of the error covariance matrices required by data assimilation.
Because a large part of the uncertainty in the forecast originates from uncertainties in the model formulation (primarily the physical parameterizations), a multimodel ensemble seems to be the adequate tool for uncertainty estimation. Several 100-member ensembles were generated and calibrated (based on observations – see the example in Figure ) over the year 2007 in order to study the impact of EDF thermal plants on air quality. The ensemble simulations were carried out at the European scale and at the scale of French regions, so as to study the local impact of two EDF plants. This approach allowed us to estimate the uncertainty on the simulated impact of the plants. Based on the calibrated ensembles, we also computed probabilistic forecasts for threshold exceedances.
Specific work has been carried out to decompose the discrepancy between observations and model simulations into three errors: the modeling error, the representativeness error and the observational error. It was shown that the representativeness error may account for a third of the discrepancy.
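Schematically, in generic notation (not taken from the report itself), the decomposition reads:

```latex
% y: observation; H(x): model counterpart of the observation
y - H(x) \;=\; \varepsilon_{\text{model}} \;+\; \varepsilon_{\text{repr}} \;+\; \varepsilon_{\text{obs}},
```

where the representativeness term accounts for the mismatch between a point-wise measurement and the grid-cell average that the model actually simulates.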
After the Fukushima nuclear disaster, we worked on uncertainty estimation of the transport simulations at Japan scale. We carried out Monte Carlo simulations with perturbations on most input parameters. The first results showed that the variability of the ensemble simulations is great enough to reasonably sample the uncertainties after a calibration.
Nowadays it is standard procedure to generate an ensemble of simulations for a meteorological forecast. Usually, meteorological centers produce a single forecast out of the ensemble by computing the ensemble mean (where every model receives an equal weight). The generation of a single forecast with a weighted linear combination is called sequential aggregation. Each time new observations become available, the weights of the linear combination are updated and applied to the next forecasts. We applied the discounted ridge regression algorithm, which we previously introduced for the sequential aggregation of air quality forecasts, to forecast wind and temperature at given observation stations. The ensemble was generated with forecasts at different ranges from two models. The aggregation proved to be efficient for one-day forecasts at least.
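A simplified version of such an aggregation procedure can be sketched as plain online ridge regression, i.e., without the discount factors of the actual discounted algorithm. At each step, the weights fitted on all past (forecast, observation) pairs are applied to the next forecasts:

```python
import numpy as np

def ridge_aggregate(forecasts, observations, penalty=1.0):
    """Online ridge-regression aggregation (no discounting).
    forecasts: (T, m) array of m ensemble members over T steps;
    observations: (T,) array. Returns the aggregated forecasts and the
    final weight vector."""
    T, m = forecasts.shape
    A = penalty * np.eye(m)        # regularized Gram matrix
    b = np.zeros(m)
    w = np.zeros(m)                # current aggregation weights
    aggregated = np.empty(T)
    for t in range(T):
        aggregated[t] = forecasts[t] @ w     # forecast before seeing obs t
        A += np.outer(forecasts[t], forecasts[t])
        b += observations[t] * forecasts[t]
        w = np.linalg.solve(A, b)            # weights for the next step
    return aggregated, w
```

When the observations are an exact linear combination of the members, the weights converge to that combination, which is the idealized behavior the aggregation exploits.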
Sequences of images, such as satellite acquisitions, display structures evolving in time. This information is recognized as being of major interest by forecasters (meteorologists, oceanographers, etc.) for improving the information provided by numerical models. However, these satellite images are mostly assimilated in geophysical models on a point-wise basis, discarding the space-time coherence visualized by the evolution of structures such as clouds. Assimilating image data in an optimal way is of major interest, and this issue should be considered in two ways:
from the model's viewpoint, the problem is to control the location of structures using the observations,
from the image's viewpoint, a model of the dynamics and structures has to be built from the observations.
This research addresses the issue of divergence-free motion estimation on an image sequence acquired over a given temporal window. Unlike most state-of-the-art techniques, which only constrain the divergence to be small through Tikhonov regularisation terms, a method is defined that imposes a strictly null divergence of the estimated motion.
Motion is characterized by its vorticity value and assumed to satisfy the Lagrangian constancy hypothesis. An image model is then defined: the state vector includes the vorticity, whose evolution equation is derived from that of motion, and a pseudo-image that is transported by motion. An image assimilation method, based on the 4D-Var technique, is defined and developed that estimates motion as a compromise between the evolution equations of vorticity and pseudo-image and the observed sequence of images: the pseudo-images have to be similar to the acquisitions.
As the evolution equations of vorticity and pseudo-image involve the motion value, the motion field has to be retrieved at each time step of the studied temporal window. An algebraic method, based on the projection of vorticity on a subspace of eigenvectors of the Laplace operator, is defined in order to allow Dirichlet boundary conditions for the vorticity field.
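A minimal numerical sketch of the link between vorticity and divergence-free motion: on a periodic grid, the streamfunction is obtained from the vorticity by an FFT-based Poisson solve, and the resulting velocity is divergence-free by construction. The actual method instead projects on eigenvectors of the Laplace operator with Dirichlet boundary conditions; the periodic setting below is a simplifying assumption.

```python
import numpy as np

def velocity_from_vorticity(omega):
    """Divergence-free velocity from vorticity on a periodic grid (sketch).

    Solve Laplacian(psi) = -omega for the streamfunction psi with FFTs,
    then set u = d(psi)/dy and v = -d(psi)/dx, which gives a
    divergence-free field by construction.
    """
    ny, nx = omega.shape
    kx = 2j * np.pi * np.fft.fftfreq(nx)
    ky = 2j * np.pi * np.fft.fftfreq(ny)
    KX, KY = np.meshgrid(kx, ky)
    lap = KX**2 + KY**2
    lap[0, 0] = 1.0  # avoid dividing by zero on the mean mode
    psi_hat = -np.fft.fft2(omega) / lap
    psi_hat[0, 0] = 0.0
    u = np.real(np.fft.ifft2(KY * psi_hat))
    v = np.real(np.fft.ifft2(-KX * psi_hat))
    return u, v
```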
The divergence-free motion estimation method is tested and quantified on synthetic data. This shows that it computes a quasi-exact solution and outperforms the state-of-the-art methods that were applied on the same data.
The method is also applied to Sea Surface Temperature (SST) images acquired over the Black Sea by NOAA-AVHRR sensors. The divergence-free assumption is roughly valid for these acquisitions, due to the small values of the vertical velocity at the surface. Fig. displays data and results. As no ground truth of motion is available, the method is quantified by the value of the correlation between the pseudo-images and the real acquisitions. Again, the method provides the best results compared to other state-of-the-art algorithms.
Data assimilation techniques are used to retrieve motion from image sequences. These methods require a model of the underlying dynamics displayed by the evolution of image data. In order to quantify the approximation linked to the chosen dynamic model, we consider adding a model error term in the evolution equation of motion and design a weak formulation of 4D-Var data assimilation. The cost function to be minimized simultaneously depends on the initial motion field, at the beginning of the studied temporal window, and on the error value at each time step. The result makes it possible to assess the model error and analyze its impact on motion estimation.
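In the weak formulation, the cost function takes the generic form below (our notation: $\mathbf{x}_b$ the background state, $\mathbb{M}$ the dynamic model, $\mathbb{H}$ the observation operator, and $\mathbf{B}$, $\mathbf{Q}$, $\mathbf{R}$ the background, model-error and observation-error covariances):

```latex
J(\mathbf{x}_0, \boldsymbol{\varepsilon}) =
\frac{1}{2}\,\|\mathbf{x}_0 - \mathbf{x}_b\|^2_{\mathbf{B}^{-1}}
+ \frac{1}{2}\sum_{t} \|\boldsymbol{\varepsilon}_t\|^2_{\mathbf{Q}^{-1}}
+ \frac{1}{2}\sum_{t} \|\mathbf{y}_t - \mathbb{H}(\mathbf{x}_t)\|^2_{\mathbf{R}^{-1}},
\qquad
\mathbf{x}_{t+1} = \mathbb{M}(\mathbf{x}_t) + \boldsymbol{\varepsilon}_t .
```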
This error assessment method is evaluated and quantified on twin experiments, as no ground truth is available for real image data. Fig. shows four frames of a series of observations obtained by integrating the evolution model from an initial condition on image and velocity field (the ground truth).
We performed two data assimilation experiments. The first one considers the evolution model as perfect, with no error in the evolution equation; it is denoted PM (for Perfect Model). The second one, denoted IM (for Imperfect Model), involves an error in the motion evolution equation. Fig. displays the motion fields retrieved by PM and IM at the beginning of the temporal window.
As can be seen, IM computes a correct velocity field while PM completely fails.
The results of this error assessment method are still preliminary. Perspectives are being considered in order to correctly retrieve the error on the dynamics by constraining its shape. An important application is, for instance, the detection of changes of dynamics in long temporal sequences.
In the image processing literature, the optical flow equation is usually chosen to assess motion from an image sequence. However, it corresponds to an approximation that is no longer valid in case of large displacements. We evaluate the improvements obtained when using the nonlinear transport equation of the image brightness by the velocity field. A 4D-Var data assimilation method is designed that simultaneously solves the evolution equation and the observation equation, in its nonlinear and linearized forms. The comparison of results obtained with both observation equations is quantified on synthetic data and discussed on oceanographic Sea Surface Temperature (SST) images. We show that the nonlinear model outperforms the linear one, which underestimates the motion norm. Fig. illustrates this on SST images (motion vectors are displayed by arrows).
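The contrast between the two observation equations can be illustrated on a 1D toy example (numpy only, illustrative values): for a pure translation, the nonlinear transport residual vanishes at the true displacement, while the linearized optical-flow residual does not once the displacement is large.

```python
import numpy as np

# 1D "image": a smooth bump sampled on a fine grid.
x = np.linspace(0.0, 100.0, 1001)
I0 = np.exp(-((x - 40.0) ** 2) / 20.0)
w_true = 15.0                                     # large displacement
I1 = np.exp(-((x - 40.0 - w_true) ** 2) / 20.0)   # translated image

def residual_nonlinear(w):
    # Nonlinear transport: I1(x) should equal I0(x - w).
    return I1 - np.interp(x - w, x, I0, left=0.0, right=0.0)

def residual_linear(w):
    # Linearized optical-flow constraint: I1 - I0 + w * dI0/dx.
    return I1 - I0 + w * np.gradient(I0, x)

nl = np.linalg.norm(residual_nonlinear(w_true))
li = np.linalg.norm(residual_linear(w_true))
# nl is numerically zero at the true displacement; li is not, because
# the first-order expansion breaks down for large displacements.
```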
The aim of this research is to achieve a correct estimation of motion when the object displacement is greater than its size. In this case, coarse-to-fine incremental methods as well as the nonlinear data assimilation method fail to retrieve a correct value. The perspective is then to include, in the state vector, a variable describing the trajectory of pixels. The observation operator will then measure the effective displacement of pixels, according to their trajectories, and allow a better estimation of the motion value.
A data assimilation method was designed to recover missing data and reduce noise in satellite acquisitions. The state vector includes motion and image fields. Its evolution equation is based on assumptions about the underlying dynamics displayed by the sequence of images and considers the passive transport of images by the velocity field. The observation equation compares the image component of the state vector with the real observations. Missing and noisy data regions are characterized by a Gaussian observation error, whose covariance matrix assigns a low confidence to these regions.
The recovery method was applied to synthetically noised SST images in order to quantify the quality of the recovery (see Fig. ).
The method is a promising alternative to approaches such as space-time interpolation. In the experiments, the Lagrangian constancy of the state vector is used as the evolution equation. The perspectives concern the use of more advanced dynamic equations, such as the shallow-water equations that link the motion field to the thickness of the ocean surface layer, and an improved modeling of illumination changes over the sequence, due to varying acquisition times.
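A minimal sketch of how such an observation-error covariance can be built; the diagonal form and the variance values are illustrative assumptions, not those of the actual system:

```python
import numpy as np

def observation_covariance(mask, var_valid=0.01, var_missing=1e6):
    """Diagonal observation-error covariance for a masked image (sketch).

    mask: boolean array, True where pixels are missing or heavily noised.
    Valid pixels get a small variance (trusted observations); masked
    pixels get a huge one, so the assimilation effectively ignores them
    and lets the dynamics fill the gap. Values are illustrative.
    """
    variances = np.where(mask.ravel(), var_missing, var_valid)
    return np.diag(variances)

# Usage: a 4x4 image with a missing 2x2 block.
mask = np.zeros((4, 4), dtype=bool)
mask[1:3, 1:3] = True
R = observation_covariance(mask)
```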
This study is carried out in collaboration with the Marine Hydrophysical Institute (MHI) of Sevastopol. The aim is to estimate, and further validate the estimation of, the Black Sea surface velocity from sequences of satellite images in order to allow an optimal assimilation of these pseudo-observations in 3D ocean circulation models. Several Image Models were designed that express the dynamics of velocity and the temporal evolution of image data. An image assimilation method, based on the 4D-Var formalism, was developed that estimates motion as a compromise between the Image Model, the image acquisitions and regularity heuristics on the velocity field. Two Image Models were qualitatively and quantitatively compared: the Stationary Image Model (SIM), based on the heuristics of stationary motion, which is valid at short temporal scale, and the Shallow Water Image Model (SWIM), based on the shallow-water equations.
The comparison between SIM and SWIM results confirms that SIM provides correct results only on short temporal windows, while SWIM makes it possible to process longer image sequences.
The validation of motion estimation by image assimilation requires additional observation data, as no measure of motion is available from satellite sensors. The Sea Level Anomaly, measured by satellite altimeters, is then compared to the thickness of the surface layer as estimated by the Shallow Water Image Model. This comparison shows a good agreement in shape and values. As the velocity field is strongly related to this thickness value through the physical evolution laws, these results further validate the estimation of the velocity and the image assimilation approach.
The surface motion of the Black Sea approximately satisfies the geostrophic equilibrium property. As the surface velocity can be directly derived from the surface layer thickness under this equilibrium, a Geostrophic Shallow Water Image Model (GSWIM) was designed.
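Under the geostrophic approximation, this derivation is standard ($f$: Coriolis parameter, $g$: gravity, possibly replaced by the reduced gravity in a layer model, $h$: surface layer thickness):

```latex
u = -\frac{g}{f}\,\frac{\partial h}{\partial y},
\qquad
v = \frac{g}{f}\,\frac{\partial h}{\partial x}.
```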
This method was first tested and quantified on twin experiments with satellite data. Figure simultaneously displays the result of the velocity estimation by GSWIM and the ground truth.
This study concerns the estimation of motion fields from satellite images on long temporal sequences. The huge computational cost and memory required by data assimilation methods on the pixel grid make it impossible to use these techniques on long temporal intervals. For a given dynamic model (named full model, on the pixel grid), the Galerkin projection on a subspace provides a reduced model that allows image assimilation at low cost. The definition of this reduced model however requires defining the optimal subspace for motion. A sliding windows method is thus designed:
The long image sequence is split into small temporal windows that half overlap in time.
Data assimilation in the full model is applied on the first window to retrieve the motion field.
The estimate of motion field at the beginning of the second window makes it possible to define the subspace for motion and a reduced model is obtained by Galerkin projection.
Data assimilation in the reduced model is applied for this second window.
The process is then iterated for the next window until the end of the whole image sequence.
Figure summarizes the described methodology.
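The reduction step can be sketched in a generic linear setting: a motion subspace is built from snapshots (here by truncated SVD, one common choice) and the full dynamics operator is projected on it by Galerkin projection. All names and dimensions below are illustrative, not those of the actual image model.

```python
import numpy as np

def reduce_model(snapshots, A, k):
    """Galerkin projection of a linear dynamics x' = A x (sketch).

    snapshots: array (n, m), m motion snapshots of dimension n obtained
    from the full model; A: (n, n) full dynamics operator; k: reduced
    dimension. Returns the basis Phi (n, k) and the reduced operator
    Phi^T A Phi of size (k, k).
    """
    # Truncated SVD of the snapshots: Phi spans the dominant subspace.
    Phi, _, _ = np.linalg.svd(snapshots, full_matrices=False)
    Phi = Phi[:, :k]
    A_reduced = Phi.T @ A @ Phi   # Galerkin projection
    return Phi, A_reduced

# Usage: reduce a random linear system from dimension 100 to 5.
rng = np.random.default_rng(0)
A = -np.eye(100) + 0.01 * rng.standard_normal((100, 100))
snapshots = rng.standard_normal((100, 20))
Phi, Ar = reduce_model(snapshots, A, k=5)
```

Integrating the reduced dynamics costs O(k^2) per step instead of O(n^2), which is what makes assimilation on the later windows fast.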
Twin experiments were designed to quantify the results of this sliding windows method. Results on motion estimation are given in Figure and compared with the ground truth. The NRMSE (in percentage) ranges from 1.1 to 4.0% from the first to the sixth window. On the first window, 3 hours are required to estimate the motion fields with the full model. For the next 5 windows, less than 1 minute is required to compute motion.
In air quality modeling, the model error is supposed to account for the uncertainty in the meteorological fields (winds and vertical diffusivities), the segregation and mixing in emission areas that affect the effective kinetic rates of reactions, the boundary condition fields, all physical parameterizations (dry deposition, wet scavenging), etc. All the above sources of error have bounded energy and are typically neither normally distributed nor independent.
In order to take this into account in the data assimilation process, we applied the Minimax State Estimation (MSE) approach. It is well known that a bottleneck of minimax estimation algorithms, as well as of the family of Kalman-type filters, is the dimensionality issue. To solve it, we applied a powerful version of the minimax filter developed for the so-called differential-algebraic equations. This filter works for any linear ordinary differential equation with time-dependent coefficients on any linear manifold, which can also change in time. Based on this novel approach, we derived a computationally tractable reduced version of the minimax filter. The derivation was made in a new and rigorous framework. In addition to the reduction, the new filter shows all the interesting properties inherited from the minimax setting, especially the description of the model and observational errors, which only need to have bounded energy. The latter is important in the context of applications because the errors are always bounded. In contrast, most high-dimensional statistical filters are designed for unbounded random errors with a specific distribution function.
The algorithm, already implemented in the data assimilation library Verdandi, was further developed to compute a better reduction base.
The algorithm was in addition applied for ensemble sequential aggregation. The minimax filter computes weights for each model in the ensemble and a forecast is generated as the weighted linear combination of the ensemble members. In this case, the dimension is small so that no reduction is needed. The approach shows two noteworthy advantages: the observational errors can be taken into account and a dynamics can be given for the weights.
Data assimilation algorithms based on the 4D-Var formulation look for the so-called conditional mode estimate. The latter maximizes the conditional probability density function, provided the initial condition, model error and observation noise are realizations of independent Gaussian random variables. However, this Gaussian assumption is often not satisfied for geophysical flows. Moreover, the estimation error of the conditional mode estimate is not a first-hand result of these methods. These issues can be addressed by means of the Minimax State Estimation (MSE) approach, which makes it possible to filter out any random noise (with bounded correlation operator) or deterministic noise (with bounded energy) and to assess the worst-case estimation error.
The iterative MSE algorithm was developed for the problem of optical flow estimation from a sequence of 2D images. The main idea of the algorithm is to use the "bi-linear" structure of the Navier-Stokes equations and of the optical flow constraint in order to iteratively estimate the optical flow. The algorithm consists of the following parts:
1) we construct the pseudo-observations, that is, the estimate of the image brightness function;
2) we plug the estimate of the image gradient, obtained from the pseudo-observations, into the optical flow constraint and compute the minimax estimate of the velocity field;
3) we use the minimax estimate of the velocity field obtained in 2) in order to start 1) again.
Numerical experiments are currently being carried out in order to study the convergence rate of the algorithm.
In the field of forest fire risk management, important challenges exist in terms of the protection of people and goods. In answer to strong needs from different actors (firefighters, foresters), researchers focus their efforts on developing operational decision support tools that may forecast wildfire behavior. This requires evaluating the performance of the models, but currently simulation errors are not sufficiently qualified and quantified. As the main objective is to build a decision support system, robust forecast evaluations must be established. In the context of the ANR project IDEA, the evaluation of model simulations has been started with a bibliographical review, the implementation of a series of forecast scores and the definition of a series of ideal cases where some classical scores may fail (especially in taking the dynamics into account).
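One classical example of such a score is the critical success index (threat score) between observed and simulated burned areas; the binary-map formulation below is the standard definition, not necessarily the exact set of scores retained in IDEA.

```python
import numpy as np

def threat_score(simulated, observed):
    """Critical success index between two binary burned-area maps.

    Returns hits / (hits + misses + false alarms), in [0, 1]; 1 means a
    perfect overlap. The score is purely geometric: it ignores the fire
    dynamics, which is precisely the kind of limitation studied on the
    ideal cases mentioned above.
    """
    hits = np.logical_and(simulated, observed).sum()
    misses = np.logical_and(~simulated, observed).sum()
    false_alarms = np.logical_and(simulated, ~observed).sum()
    denominator = hits + misses + false_alarms
    return hits / denominator if denominator else 1.0

# Two overlapping square "fires" on a 10x10 grid.
sim = np.zeros((10, 10), dtype=bool); sim[2:6, 2:6] = True  # 16 cells
obs = np.zeros((10, 10), dtype=bool); obs[3:7, 3:7] = True  # 16 cells
score = threat_score(sim, obs)  # 9 hits, 7 misses, 7 false alarms
```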
In addition, we consider that the proper evaluation of a model requires applying it to a large number of fires, instead of carrying out a fine tuning on just one fire. We implemented a software tool to simulate a large number of fires (from the Prométhée database, http:// ).
Clime is partner with INERIS (National Institute for Environmental and Industrial Risks) in a joint cooperation devoted to air quality forecast. This includes research topics in uncertainty estimation, data assimilation and ensemble modeling.
Clime also provides support to INERIS in order to operate the Polyphemus system for ensemble forecasting, uncertainty estimations and operational data assimilation at continental scale.
Clime is partner with IRSN, the French national institute for radioprotection and nuclear safety, for inverse modeling of emission sources and uncertainty estimation of dispersion simulations. The collaboration aims at better estimating emission sources, at improving operational forecasts for crisis situations and at estimating the reliability of the forecasts. The work is carried out at large scale (continental scale) and at small scale (a few kilometers around a nuclear power plant).
Clime takes part in a joint ILab with the group SETH (Numtech). The objective is to (1) transfer Clime's work in data assimilation, ensemble forecasting and uncertainty estimation, with application to urban air quality, and (2) identify the specific problems encountered at urban scale in order to determine new research directions. The first study addresses the application of data assimilation at urban scale.
Clime took part with Numtech and AirParif to the project “Votre Air”, from the call “Futur en Seine” organized by Cap Digital and notably supported by Île-de-France.
Clime is in charge of providing data assimilation methods in order to generate analyses out of ADMS simulations and AirParif ground observations. The corresponding prototype is running operationally and the results are available at http://
Clime is involved in the starting project PREQUALIF–IZNOGOUD–BARC, with many partners including the leading partner LSCE (“Laboratoire des Sciences du Climat et l'Environnement”), which aims at designing methods for the evaluation of the measures to be taken in the ZAPA areas (“Priority Areas for Air Quality Measures”). Clime will focus on the assimilation of observations to better evaluate the actual air quality.
The MSDAG project (Multiscale Data Assimilation in Geophysics) is an ANR SYSCOMM project. Four partners are involved: CEREA (Clime project-team, Marc Bocquet, PI of the whole project), Fluminance (Étienne Mémin), the Moise project-team (Laurent Debreu) and LSCE (Frédéric Chevallier). It has been extended to the end of September 2012.
Clime is running the project MIDAR “Inverse modelling of deposition measurements in case of a radiological release”, under the framework of the LEFE-ASSIM program of INSU. This includes a cooperation with the Institute for Safety Problems of Nuclear Power Plants (National Academy of Sciences of Ukraine).
Clime is part of the INSU/LEFE project ADOMOCA-2, which unites about ten French teams working in atmospheric chemistry data assimilation.
Clime is one partner of the ANR SYSCOMM GeoFluids project. It focuses on the specification of tools to analyse geophysical fluid flows from image sequences. Clime objectives concern the definition of reduced models from image data.
Clime takes part in the ANR project IDEA, which addresses the propagation of wildland fires. Clime is in charge of the estimation of the uncertainties, based on sensitivity studies and ensemble simulations.
Program: COST Action ES1004.
Project acronym: EuMetChem.
Project title: European framework for online integrated air quality and meteorology modelling.
Duration: January 2011 - December 2014.
Coordinator: Alexander Baklanov, Danish Meteorological Institute (DMI), Denmark.
Other partners: around 14 European laboratories, experts from the United States, ECMWF.
Abstract: the European framework for online integrated air quality and meteorology modelling (EuMetChem) will focus on a new generation of online integrated Atmospheric Chemical Transport (ACT) and Meteorology (Numerical Weather Prediction and Climate) modelling with two-way interactions between different atmospheric processes including chemistry (both gases and aerosols), clouds, radiation, boundary layer, emissions, meteorology and climate. At least two application areas of the integrated modelling are planned to be considered: (i) improved numerical weather prediction (NWP) and chemical weather forecasting (CWF) with short-term feedbacks of aerosols and chemistry on meteorological variables, and (ii) two-way interactions between atmospheric pollution/composition and climate variability/change. The framework will consist of four working groups, namely: 1) strategy and framework for online integrated modelling; 2) interactions, parameterisations and feedback mechanisms; 3) chemical data assimilation in integrated models; and 4) evaluation, validation and applications. The establishment of such a European framework (also involving key American experts) will enable the EU to develop world-class capabilities in integrated ACT/NWP-Climate modelling systems, including research, forecasting and education.
Partner: ERCIM working group “Environmental Modeling”.
The working group gathers laboratories working on developing models, processing environmental data or data assimilation.
Partner: CWI, Amsterdam (the Netherlands)
The collaboration deals with data assimilation based on minimax filtering. The objective is to apply a reduced form of a minimax filter to a high-dimensional air quality model and to process satellite images.
Partner: Chilean meteorological office (Dirección Meteorológica de Chile)
The partner produces its operational air quality forecasts with Polyphemus. The 3-day forecasts essentially cover Santiago. The forecasts are accessible online in the form of maps, time series and video (http:// ).
Partner: Marine Hydrophysical Institute, Ukraine.
The collaboration concerns the study of the Black Sea surface circulation.
Partner: Institute of Numerical Mathematics, Russia
The collaboration concerns the estimation of uncertainty of the motion field derived from image data.
Clime is running a two-year project under the PHC-DNIPRO program with the Taras Shevchenko National University of Kyiv. The subject concerns a posteriori minimax motion estimation from images.
Marc Bocquet was co-chair of the INSU/LEFE ASSIM scientific committee until September 2011. He is now co-chair of the INSU/LEFE MANU scientific committee (since October 2011), which replaces LEFE ASSIM. Marc Bocquet is a member of the Scientific Council of the CERFACS institute in Toulouse, France. He is Associate Editor of the Quarterly Journal of the Royal Meteorological Society.
Isabelle Herlin is a member of the Scientific Council of CSFRS (High Council for Strategic Education and Research in France). She was a member of the Program Committee of the Workshop on Aerial Video Processing, joint with IEEE CVPR 2011. She is organising a session on "Analysis of data of different scales and sources for mesoscale environmental models" for the International Congress on Environmental Modelling and Software (iEMSs 2012).
Introduction to Data Assimilation for Geophysics (Master 2 OACOS (ocean, atmosphere, climate and space observation), UPMC, X, ENS, ENSTA ParisTech, École des Ponts ParisTech): 30 hours (Marc Bocquet, Vivien Mallet).
Master2 "Nuclear Energy": 9 hours (Marc Bocquet, Vivien Mallet, Victor Winiarek, Anne Tilloy).
Practical sessions on air pollution, École des Ponts ParisTech, third year (master 2): 3 hours (Vivien Mallet).
“Introduction to chemistry-transport models”, master 2 “Sciences et le Management de l'Environnement” (Paris Diderot, École des Ponts ParisTech): 3.5 hours (Vivien Mallet).
PhD : Damien Garaud, “Estimation des incertitudes et prévision des risques en qualité de l'air”, University Paris-Est, December 14th, 2011, Isabelle Herlin.
PhD in progress : Karim Drifi, “Assimilation of image structures”, University Paris Centre, December 1st, 2009, Isabelle Herlin.
PhD in progress : Mohammad Koohkan, “Modélisation inverse et assimilation de données en qualité de l'air”, University Paris-Est, December 1st, 2009, Marc Bocquet.
PhD in progress : Victor Winiarek, “Dispersion atmosphérique en milieu urbain et modélisation inverse pour la reconstruction de sources”, University Paris-Est, October 1st, 2009, Marc Bocquet.
Marc Bocquet:
MCI workshop, Fort Collins, Colorado, USA, January 2011.
Invited seminar at NCAR, Boulder, Colorado, January 2011.
Invited seminar at Maryland University, Maryland, USA, January 2011.
Oral presentation at GLOREAM-EURASAP Workshop 2011, Copenhagen, Denmark.
Two oral presentations and three poster presentations at EGU meeting, Vienna, Austria, April 2011.
Invited lecturer at GdR Mascotnum meeting, "Journée Environnement et Exploration Numérique", June 2011.
Invited lecturer at the "2nd Summer School on Data Assimilation and its applications", Iasi, Romania, July 2011.
Invited lecturer at the LMS Durham Research Symposium, "Mathematics of Data Assimilation", Durham, UK, August 2011.
Invited lecturer at the TTM 2011 workshop/summer school, "Tracer and Timescale Methods for Understanding Complex Geophysical and Environmental Processes", Louvain-La-Neuve, Belgium, August 2011.
Participant at the ECMWF Seminar 2011 on data assimilation, Reading, UK, September 2011.
Invited lecturer at the statistics workshop JSTAR 2011, Rennes, France, October 2011.
Invited lecturer and chairman at the OHOLO Conference, Eilat, Israel, November 2011.
Invited lecturer at Journée "Fouille de données", Jacques-Louis Lions laboratory, Paris, France, November 2011.
Isabelle Herlin and Etienne Huot: invited at the conference "Hydrodynamic modeling of the Black Sea Dynamics" Sevastopol, Ukraine, September 20-24th, 2011.
Vivien Mallet, Damien Garaud, Anne Tilloy: “Estimation des incertitudes et couplage entre les modèles et les données”; invited at the “1ère Journée Technique autour de la Modélisation de la Qualité de l'Air en milieu Urbain”, Lyon, January 2011.
Vivien Mallet:
“Uncertainty Estimation and Ensemble Forecasting in Air Quality Modeling”; invited at the workshop on data assimilation in air quality models (Katholieke Universiteit Leuven, VITO), March 2011.
“Uncertainty estimation and data assimilation in air quality modeling”; seminar at CWI, November 2011.
Mohammad Reza Koohkan: Oral presentation at the second ADOMOCA-2 workshop, Saint-Cyr Les Lecques, France, November 2011.
Victor Winiarek: Oral presentation at the GLOREAM-EURASAP Workshop 2011, Copenhagen, Denmark; poster presentation at the EGU meeting, Vienna, Austria, April 2011.
Lin Wu: Oral presentation at the MCI workshop, Fort Collins, Colorado, USA, January 2011; poster presentation at the EGU meeting, Vienna, Austria, April 2011.