Section: Research Program
Generating synthetic turbulence
A crucial point for any multi-scale simulation able to locally switch (in space or time) from a coarse level of turbulence description to a more refined one is the enrichment of the solution with fluctuations that are as physically meaningful as possible. Basically, this issue is an extension of the problem of generating realistic inlet boundary conditions in DNS or LES of subsonic turbulent flows. In that respect, the method of anisotropic linear forcing (ALF) we have developed in collaboration with EDF proved very encouraging through its efficiency, its generality and its simplicity of implementation. It therefore seems natural, on the one hand, to extend this approach to the compressible framework and then implement it in AeroSol. On the other hand, we shall concentrate (in cooperation with EDF R&D in Chatou via a CIFRE PhD to be started next year) on the theoretical link between local variations in the description of turbulence scales (e.g. a sudden variation in the size of the time filter) and the intensity of the ALF forcing transiently applied to help the missing scales of fluctuation develop.
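As a hedged illustration of the linear-forcing idea (the actual ALF formulation is more elaborate), the toy below relaxes the variances of synthetic fluctuations toward an anisotropic target through a forcing term that is linear in the velocity; the relaxation form, time scale and target values are illustrative assumptions:

```python
import numpy as np

# Toy sketch of anisotropic linear forcing: fluctuating components are
# forced by f_i = A_i * u_i', with the diagonal coefficients A_i chosen
# to relax each variance <u_i'^2> toward an anisotropic target value.
# (Illustrative only; not the actual ALF formulation.)

rng = np.random.default_rng(0)
n = 10000
dt = 1e-3
tau = 0.05                                 # relaxation time scale of the forcing
target_var = np.array([1.0, 0.5, 0.25])    # anisotropic target variances

u = rng.standard_normal((n, 3)) * 0.1      # initially weak fluctuations
for _ in range(2000):
    var = u.var(axis=0)
    # coefficient is positive where fluctuations are deficient,
    # negative where they exceed the target
    A = (target_var - var) / (2.0 * tau * np.maximum(var, 1e-12))
    u += dt * A * u                        # linear-in-velocity forcing step

print(np.round(u.var(axis=0), 3))
```

The variances relax monotonically toward the prescribed anisotropic targets, which is the behavior the transient ALF forcing is meant to provide when missing fluctuation scales must be developed.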
Stable and non-reflecting boundary conditions
In aerodynamics, and especially for subsonic computations, handling inlet and outlet boundary conditions is a difficult issue. A lot of work has already been done for second-order schemes for the Navier-Stokes equations, see ,  and the huge number of papers citing it. We believe that strong improvements are necessary for higher-order schemes: indeed, the less dissipative the scheme, the worse the impact of spurious reflections. For this, we will first concentrate on the linearized Navier-Stokes system and analyze the imposition of boundary conditions in a discontinuous Galerkin framework with an approach similar to that of . We will also try to extend the work of , which deals with the Euler equations, to the Navier-Stokes equations.
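As an illustration of the characteristic viewpoint used by such boundary treatments, the sketch below evaluates the wave amplitudes of the 1-D Euler equations at a subsonic outflow, replacing the single incoming amplitude by a pressure-relaxation model; the function name, the relaxation constant K and the sign conventions are illustrative assumptions:

```python
import numpy as np

# Minimal 1-D sketch of a characteristic (NSCBC-type) subsonic outflow:
# outgoing wave amplitudes are computed from interior one-sided
# derivatives, while the single incoming amplitude is modeled as a
# relaxation toward a target pressure instead of being taken from
# outside the domain. (Names and constants are illustrative.)

gamma = 1.4

def nscbc_outflow_amplitudes(rho, u, p, drho_dx, du_dx, dp_dx,
                             p_target, K=0.25):
    c = np.sqrt(gamma * p / rho)
    # outgoing amplitudes (u > 0 subsonic: waves at speeds u and u + c leave)
    L2 = u * (drho_dx - dp_dx / c**2)          # entropy wave, speed u
    L3 = (u + c) * (dp_dx + rho * c * du_dx)   # right acoustic wave, speed u + c
    # incoming left acoustic wave (speed u - c < 0): modeled, not measured
    L1 = K * (p - p_target)
    return L1, L2, L3
```

Setting K to zero gives a perfectly non-reflecting (but pressure-drifting) outlet; a small positive K pins the mean pressure at the cost of a weak reflection, which is exactly the trade-off a discontinuous Galerkin boundary treatment would have to reproduce.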
Turbulence models and model agility
Extension of zero Mach models to the compressible system
We shall develop in parallel our multi-scale turbulence modeling and the related adaptive numerical methods of AeroSol. Without prejudging which methods will prevail in the future, a first step in this direction will be to extend to a compressible framework the continuous hybrid temporal RANS/LES models we have developed up to now in a zero-Mach context.
Study of wall flows with and without mass or heat transfer at the wall: determination and validation of relevant criteria for hybrid turbulence models
In the targeted application domains, the turbulence/wall interaction and the heat transfer at fluid-solid interfaces are physical phenomena whose numerical prediction is at the heart of the concerns of our industrial partners. For instance, for a jet engine manufacturer, properly designing the cooling configuration of the walls of an engine combustion chamber in the presence of thermoacoustic instabilities relies on the proper identification and a thorough understanding of the major mechanisms that drive the dynamics of the wall transfers. For our part, we will gradually use all our analysis and experimentation tools to actively participate in the improvement of the collective knowledge of this kind of transfer. The flow configurations dealt with at the beginning of the project will be those of subsonic single-phase impinging jets or jets in cross-flow (JICF), with the possible presence of an interacting acoustic wave. Conjugate heat transfer at the wall will also be progressively tackled. The existing switching criteria of the hybrid RANS/LES model will be tested on these flow configurations in order to determine their domain of validity. In parallel, the hydrodynamic instability modes of the JICF will be studied experimentally and theoretically (in cooperation with the SIAME laboratory) in order to determine whether it is possible to drive a change of instability regime (e.g. from absolute to convective) and so propose challenging flow conditions that would be relevant for setting up a hybrid LES/DNS approach aimed at supplementing the hybrid RANS/LES one.
Improvement of turbulence models
The production and subsequent use of DNS (AeroSol library) and experimental (MAVERIC bench) databases dedicated to the improvement of the physical models will be an important part of our activity. In that respect, our present capability of producing in-situ experimental data for simulation validation and flow analysis is clearly a strongly differentiating mark of our project. Most of our initial efforts in analyzing the DNS and experimental data, as soon as they become available, will focus on the improvement of the hybrid RANS/LES approach. This method has a decisive advantage over all other hybrid RANS/LES approaches since it relies on a well-defined time-filtering formalism. This greatly facilitates the proper extraction from the databases of the various terms appearing in the relevant flux balances obtained at the different scales involved (e.g. from RANS to LES). But we would not be comprehensive in this matter if we did not question the relevance of any simulation-experiment comparison. In other words, a central issue will also be to answer positively the following question: will we be comparing the same quantities between simulations and experiments? From an experimental point of view, the questions to be raised will include the possible difference in resolution between the experiment and the simulations, the matching of measurement points and simulation points, and the acceptable level of random error associated with the necessarily finite number of samples. In that respect, the use of uncertainty quantification techniques will be considered.
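Since the formalism is based on causal temporal filtering, a useful minimal picture is the exponential time filter and its equivalent one-step recursion; the discretization and the test signal below are illustrative, not the project's actual filter:

```python
import numpy as np

# Sketch of a causal exponential time filter, the kind of operator that
# underlies temporal hybrid RANS/LES formalisms:
#   <u>_T(t) = (1/T) * integral_{-inf}^{t} exp(-(t - t') / T) u(t') dt'
# discretized here (backward-Euler style) as a one-step recursion.

def exp_time_filter(signal, dt, T):
    alpha = dt / (T + dt)            # weight of the newest sample
    out = np.empty_like(signal)
    out[0] = signal[0]
    for n in range(1, len(signal)):
        out[n] = (1 - alpha) * out[n - 1] + alpha * signal[n]
    return out

t = np.linspace(0.0, 10.0, 2001)
u = np.sin(2 * np.pi * t) + 0.1 * np.sin(40 * np.pi * t)  # slow + fast modes
u_f = exp_time_filter(u, dt=t[1] - t[0], T=0.05)
# u_f retains the slow mode while the fast mode is strongly damped
```

Varying the filter width T in space or time is precisely what produces the continuous transition between resolution levels, and the terms extracted from the databases are the flux balances of such filtered quantities.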
Development of an efficient implicit high-order compressible solver scalable on new architectures
As the flows we wish to simulate may be very computationally demanding, we will maintain our efforts in the development of AeroSol in the following directions:
Efficient implementation of the discontinuous Galerkin method
In high-order discontinuous Galerkin methods, the unknown vector is a concatenation of the unknowns in the cells of the mesh. An explicit residual computation is composed of three loops: an integration loop over the cells, in which computations in two different cells are independent; an integration loop over the boundary faces, in which computations depend on the data of one cell and on the boundary conditions; and an integration loop over the interior faces, in which computations depend on the data of the two neighboring cells. Each of these loops is composed of three steps: the first interpolates the data at the quadrature points, the second computes a nonlinear flux at the quadrature points (the physical flux for the cell loop, an upwind flux for interior faces, or a flux adapted to the kind of boundary condition for boundary faces), and the third projects the nonlinear flux onto the degrees of freedom.
In this research direction, we propose to exploit the strong memory locality of the method (i.e., the fact that all the unknowns of a cell are stored contiguously). This formulation reduces the linear steps of the method (interpolation at the quadrature points and projection onto the degrees of freedom) to simple matrix-matrix products, which can be heavily optimized. For the nonlinear steps, composed of the computation of the physical flux in the cells and of the numerical flux on the faces, we will try to exploit vectorization.
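A minimal sketch of these three steps for a single cell follows (sizes, basis values and the Burgers-like flux are illustrative; a real DG volume term would also contract the flux with the basis gradients). Storing the cell unknowns contiguously turns the two linear steps into dense matrix products:

```python
import numpy as np

# Sketch of the interpolate / flux / project structure of one DG cell
# integration, with per-cell unknowns stored contiguously so that the
# linear steps are dense matrix-matrix products. (Toy sizes and flux.)

rng = np.random.default_rng(1)
n_dof, n_quad, n_var = 10, 16, 4      # per-cell sizes (illustrative)
phi = rng.random((n_quad, n_dof))     # basis values at quadrature points
w = rng.random(n_quad)                # quadrature weights x Jacobian

def cell_residual(U):                 # U: (n_dof, n_var) for one cell
    Uq = phi @ U                      # 1) interpolate to quadrature points
    Fq = 0.5 * Uq**2                  # 2) nonlinear flux (Burgers-like toy)
    return phi.T @ (w[:, None] * Fq)  # 3) project back onto the dofs

U = rng.random((n_dof, n_var))
R = cell_residual(U)
print(R.shape)                        # (10, 4)
```

Only step 2 is nonlinear and pointwise, which is why it is the natural target for vectorization, while steps 1 and 3 map directly onto optimized BLAS-like kernels.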
Implicit methods based on Jacobian-Free-Newton-Krylov methods and multigrid
For our computations in the IMPACT-AE project, we use explicit time stepping. The time step is limited by the CFL condition and, in our flows, by the acoustic wave speed. As the Mach number of the flows we simulate in IMPACT-AE is low, the acoustic time-step restriction is much more severe than the one induced by the turbulent time scale, which is driven by the flow velocity. We expect better efficiency from time-implicit methods, which allow a time step driven by the flow velocity.
Using implicit time stepping for compressible flows is particularly difficult because the system is fully nonlinear, so that the nonlinear solve theoretically requires building the Jacobian many times. Our experience with implicit methods is that building a Jacobian is very costly, especially in three dimensions and in a high-order framework, because optimizing the memory usage is very difficult. That is why we propose to use a Jacobian-free implementation, based on . This method consists in solving the linear steps of the Newton method with a Krylov method, which requires only Jacobian-vector products. The key idea of this method is to replace this product by an approximation based on a difference of residuals, thereby avoiding any Jacobian computation. Nevertheless, Krylov methods are known to converge slowly, especially for the compressible system at low Mach number, because the system is ill-conditioned. As a preconditioner, we propose to use an aggregation-based multigrid method, which consists in applying the same numerical method on coarser meshes obtained by aggregation of the initial mesh. This choice is driven by the fact that multigrid methods are the only ones that scale linearly ,  with the number of unknowns in terms of the number of operations, and that this preconditioning does not require any Jacobian computation.
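The Jacobian-free idea can be sketched on a toy nonlinear system (the residual, the tolerances and the use of SciPy's GMRES are illustrative; the actual solver would apply this to the DG residual with a multigrid preconditioner):

```python
import numpy as np
from scipy.sparse.linalg import LinearOperator, gmres

# Minimal Jacobian-free Newton-Krylov sketch: the Jacobian-vector
# product needed by GMRES is approximated by a finite difference of
# residuals, so the Jacobian matrix is never formed. (Toy residual.)

def residual(u):                       # toy nonlinear system F(u) = 0
    return u**3 + u - 1.0

def jfnk_solve(u0, tol=1e-10, max_newton=20):
    u = u0.copy()
    for _ in range(max_newton):
        F = residual(u)
        if np.linalg.norm(F) < tol:
            break
        eps = 1e-7 * (1.0 + np.linalg.norm(u))
        # matrix-free Jacobian action: J v ~ (F(u + eps v) - F(u)) / eps
        Jv = LinearOperator(
            (u.size, u.size),
            matvec=lambda v: (residual(u + eps * v) - F) / eps)
        du, _ = gmres(Jv, -F)          # inner Krylov solve of J du = -F
        u = u + du                     # Newton update
    return u

u = jfnk_solve(np.zeros(5))
print(np.round(u, 6))
```

The only operations required are residual evaluations, which is exactly what makes the approach attractive in a high-order three-dimensional setting where storing a Jacobian is prohibitive.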
Beyond the technical aspects of the multigrid approach, which will be challenging to implement, we are also interested in the design of an efficient aggregation. This often means performing the aggregation according to criteria such as the anisotropy of the problem . For this, we propose to extend the scalar analysis of  to a linearized version of the Euler and Navier-Stokes equations, and to try to deduce an optimal strategy for anisotropic aggregation based on the local characteristics of the flow. Note that discontinuous Galerkin methods are particularly well suited to h-p aggregation, as these methods can be defined on cells of any shape .
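A minimal two-level illustration of aggregation-based multigrid on a 1-D Laplacian follows (aggregation in pairs, Galerkin coarse operator, damped Jacobi smoothing; all sizes and parameters are illustrative assumptions):

```python
import numpy as np

# Two-level aggregation sketch: unknowns are aggregated in pairs via a
# piecewise-constant prolongation P, the coarse operator is the
# Galerkin product P^T A P, and each cycle combines damped Jacobi
# smoothing with a coarse-grid correction. (Toy problem and sizes.)

def two_level_solve(A, b, n_cycles=100, omega=0.7):
    n = A.shape[0]
    P = np.zeros((n, n // 2))
    for i in range(n // 2):              # aggregate unknowns {2i, 2i+1}
        P[2 * i, i] = P[2 * i + 1, i] = 1.0
    Ac = P.T @ A @ P                     # Galerkin coarse operator
    D = np.diag(A)
    x = np.zeros(n)
    for _ in range(n_cycles):
        x += omega * (b - A @ x) / D                # pre-smoothing (Jacobi)
        r = b - A @ x
        x += P @ np.linalg.solve(Ac, P.T @ r)       # coarse correction
        x += omega * (b - A @ x) / D                # post-smoothing
    return x

n = 64
A = 2 * np.eye(n) - np.eye(n, k=1) - np.eye(n, k=-1)  # 1-D Laplacian
b = np.ones(n)
x = two_level_solve(A, b)
print(np.linalg.norm(b - A @ x) / np.linalg.norm(b))  # small relative residual
```

Note that neither the prolongation nor the coarse operator requires an assembled Jacobian: only the action of the operator is needed, which is consistent with the Jacobian-free setting above; an anisotropic aggregation strategy would replace the fixed pairing by one informed by the local flow characteristics.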
Porting on heterogeneous architectures
Until the beginning of the 2000s, computing capacities were improved by interconnecting an increasing number of more and more powerful computing nodes. The computing capacity of each node was increased by improving the clock speed, the number of cores per processor, the instruction-level parallelism and the size of the memory cache, and by introducing a separate and dedicated memory bus per processor. Even if the number of transistors has kept growing, clock speed improvement has flattened since the mid-2000s . Already in 2003,  pointed out the difficulties of efficiently using the biggest clusters: "While these super-clusters have theoretical peak performance in the Teraflops range, sustained performance with real applications is far from the peak. Salinas, one of the 2002 Gordon Bell Awards was able to sustain 1.16 Tflops on ASCI White (less than 10% of peak)." Beyond the current multi-core architectures, the trend is now to use many-core accelerators. The idea behind many-core is to use an accelerator composed of many relatively slow and simplified cores for executing the simplest parts of the algorithm. The larger the part of the code executed on the accelerator, the faster the code may become. In this task, we will work on the heterogeneous aspects of the computation. These heterogeneities are intrinsic to our computations and have two sources. The first one is the use of hybrid meshes, which are necessary for using a locally structured mesh in a boundary layer. As the different cell shapes (pyramids, hexahedra, prisms and tetrahedra) do not have the same number of degrees of freedom, nor the same number of quadrature points, the execution time for one face or one cell depends on its shape. The second source of heterogeneity is the boundary conditions: depending on the kind of boundary condition, user-defined boundary values may be needed, which induces a different computational cost.
Heterogeneities are typically what may decrease parallel efficiency if the workload is not well balanced between the cores. Note that heterogeneities were not dealt with in what we consider one of the most advanced works on discontinuous Galerkin on GPU , as only straight simplicial cell shapes were addressed. For managing our heterogeneous computations on heterogeneous architectures as well as possible, we propose to use the StarPU execution runtime . For this, the discontinuous Galerkin algorithm will be reformulated in terms of a graph of tasks; the previous work on memory management will be useful for that. The linear steps of the discontinuous Galerkin method also require memory transfers, and one task of the project will consist in determining the optimal task granularity for this step, i.e. the number of cell or face integrations to be sent in parallel to the accelerator. On top of that, the question of which device is the most appropriate for this kind of task will be discussed.
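The load-balancing issue raised by shape-dependent costs can be illustrated with a toy greedy scheduler (the costs, batch sizes and scheduling policy below are made up for illustration; StarPU itself makes such decisions dynamically from calibrated performance models):

```python
import heapq

# Toy sketch of heterogeneity-aware scheduling: batches of cells
# ("tasks") whose cost depends on the cell shape are assigned greedily
# to the least-loaded worker. (Costs and batches are illustrative.)

cost_per_cell = {"tet": 1.0, "pyr": 1.6, "pri": 2.1, "hex": 3.0}

def schedule(batches, n_workers):
    loads = [(0.0, w) for w in range(n_workers)]   # min-heap keyed on load
    heapq.heapify(loads)
    assignment = []                                # follows sorted batch order
    # longest-processing-time-first: hand out expensive batches first
    for batch in sorted(batches,
                        key=lambda b: -sum(cost_per_cell[s] for s in b)):
        cost = sum(cost_per_cell[s] for s in batch)
        load, w = heapq.heappop(loads)
        assignment.append(w)
        heapq.heappush(loads, (load + cost, w))
    return assignment, max(l for l, _ in loads)    # makespan = max load

batches = [["hex"] * 8, ["tet"] * 20, ["pri"] * 10, ["pyr"] * 12, ["tet"] * 16]
assign, makespan = schedule(batches, 2)
print(round(makespan, 1))
```

The batch size plays the role of the task granularity discussed above: larger batches amortize transfer overheads but make imbalance like the one visible here harder to correct.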
Last, we point out that combining shared-memory and distributed-memory parallel programming models is better suited to multigrid than a purely distributed-memory model, because in a hybrid version a wider part of the mesh shares the same memory, therefore allowing a coarser aggregation.
The consortium will benefit from a particularly stimulating environment at the Inria Bordeaux Sud-Ouest center around high-performance computing, which is one of the strategic axes of the center.
Implementation of turbulence models in AeroSol and validation
We will gradually insert the models developed in the research directions above into the AeroSol library, in which we develop methods for the DNS of compressible turbulent flows at low Mach number. Indeed, thanks to its temporal filtering formalism, the HTLES approach offers a theoretical framework characterized by a continuous transition from RANS to DNS, even for complex flow configurations (e.g. without directions of spatial homogeneity). As for the discontinuous Galerkin method presently available in AeroSol, it is the best-suited and most versatile method able to meet the requirements of accuracy, stability and cost related to the local (varying) level of resolution of the turbulent flow at hand, regardless of the complexity of its configuration. This task is part of the European project iHybrid, coordinated by TU Berlin, which we are currently writing in collaboration with two of our industrial partners, EDF and PSA.
Validation of the simulations: test flow configurations
To supplement, whenever necessary, the test flow configurations of MAVERIC, and apart from configurations that could emerge in the course of the project, the following configurations, for which either experimental data, simulation data or both have been published, will be used whenever relevant for benchmarking the quality of our agile computations: