Section: New Results

Modeling the dynamic behavior of executable programs

Modeling the dynamic behavior of a given program is useful for several reasons. First, it is a particular way of profiling the program, targeting non-trivial characteristics of the execution. As such, it helps programmers understand the behavior of the program and, hopefully, optimize it. Second, the results can be used in a compiler, where run-time information drives static optimizations. This path has seen considerable development, for example in ordering basic blocks so as to leverage branch predictors and/or optimize the usage of instruction caches. We aim to take it one step further, using run-time information to help a compiler in the task of auto-parallelization. Third, modeling can be used on-line, during the execution of the program, to drive dynamic optimizations.

Our first achievement in 2011 was the presentation of our paper at the International Symposium on Performance Analysis of Software and Systems [17]. The paper describes an approach that we have called program skeletonization. The basic idea is to perform a static analysis of the code under scrutiny and to locate a small number of register assignments that completely determine the set of memory addresses the program will access. By instrumenting only these elementary value assignments and extracting the ensuing address computations, the amount of instrumentation can be dramatically reduced, at the cost of offloading some computations to the profiler. On average, this provides a significant gain in the time needed to obtain a memory trace. The gain is independent of the particular application that consumes the trace, e.g., cache simulation, data race detection, and so on.
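The trade-off described above can be illustrated on a toy example. The following is a minimal sketch, not the implementation described in [17]: it assumes a single loop that loads a[i] and stores b[i], and all names (ELEM, full_trace, skeleton_events, replay) are hypothetical. Naive instrumentation logs one event per access, whereas the reduced version logs only the values that determine every address and lets the profiler recompute the trace.

```python
ELEM = 8  # assumed element size in bytes (hypothetical)

def full_trace(base_a, base_b, n):
    """Naive instrumentation: one logged event per memory access."""
    trace = []
    for i in range(n):
        trace.append(base_a + ELEM * i)  # load of a[i]
        trace.append(base_b + ELEM * i)  # store to b[i]
    return trace

def skeleton_events(base_a, base_b, n):
    """Reduced instrumentation: log only the values that determine all addresses."""
    return {"base_a": base_a, "base_b": base_b, "n": n}

def replay(events):
    """Profiler side: redo the address computations from the logged values."""
    trace = []
    for i in range(events["n"]):
        trace.append(events["base_a"] + ELEM * i)
        trace.append(events["base_b"] + ELEM * i)
    return trace
```

In this toy case the instrumented program emits three values regardless of the trip count, and replay rebuilds the same address sequence that full_trace would have logged. Real programs need a static analysis to find which assignments play the role of base_a, base_b and n here.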

Our second main research direction on this topic has been the characterization of semi-regular memory accesses. Semi-regular accesses are caused by the traversal of a data structure that links successive memory cells in no particular memory order, e.g., a linked list or a tree. We have developed a modeling algorithm that is able to detect that a set of instructions performs irregular accesses that are actually highly correlated, differing only by an affine function of the enclosing loop indices. This has several implications in terms of potential optimizations. First, it exposes a kind of abstract iterator that can later be handled, e.g., by inspector/executor techniques. Second, it reduces the number of potential dependencies that have to be tested.
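To make the notion of correlated irregular accesses concrete, here is a minimal sketch under simplifying assumptions; it is not the modeling algorithm itself. It assumes a list traversal in which one instruction loads a node header and a second instruction loads payload[j] inside an inner loop with a single index j. The node layout (offset 16, stride 8) and the function names are hypothetical. Node addresses are arbitrary, yet the difference between the two instructions' addresses is an affine function of j.

```python
def list_traversal_samples(node_addrs, inner_trip_count):
    """Simulated trace: (j, header address, payload address) for each access pair."""
    samples = []
    for node in node_addrs:  # nodes visited in no particular memory order
        for j in range(inner_trip_count):
            header = node                 # instruction A: load of the node header
            payload = node + 16 + 8 * j   # instruction B: load of payload[j]
            samples.append((j, header, payload))
    return samples

def fit_affine_difference(samples):
    """Return (c0, c1) such that other - ref == c0 + c1 * j for every sample, else None."""
    if not samples:
        return None
    diffs = [(j, other - ref) for j, ref, other in samples]
    j0, d0 = diffs[0]
    # A second sample with a different loop index determines the slope.
    second = next(((j, d) for j, d in diffs if j != j0), None)
    if second is None:
        c1 = 0
    else:
        j1, d1 = second
        if (d1 - d0) % (j1 - j0) != 0:
            return None  # slope would not be an integer
        c1 = (d1 - d0) // (j1 - j0)
    c0 = d0 - c1 * j0
    # Accept the model only if it explains every observed difference.
    if all(d == c0 + c1 * j for j, d in diffs):
        return c0, c1
    return None
```

For example, fit_affine_difference(list_traversal_samples([0x5000, 0x1200, 0x9F40], 3)) recovers the pair (16, 8) even though the node addresses follow no pattern. Once such a relation is known, only the header access remains irregular, which is the sense in which the traversal behaves like an abstract iterator.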

Our third and most recent research project, started this year, is the analysis of traces of parallel programs, currently MPI programs. Parallel traces offer new research challenges because they contain events that are only partially ordered. We have extended our loop nest recognition algorithm [7] to handle parallel traces. Our preliminary results show that this algorithm is highly effective in extracting communication patterns. This has a number of potential applications that we plan to study in the coming months.
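As a rough illustration of what extracting a communication pattern can mean, the sketch below uses a much simpler technique than the loop nest recognition algorithm of [7]: it folds a sequence into its shortest repeating body. It assumes that events are totally ordered within each process, so each rank's trace is processed separately, and that peers are rewritten relative to the observing rank. The ring exchange, the process count P and all function names are hypothetical.

```python
P = 4  # assumed number of MPI processes (hypothetical)

def ring_trace(rank, iterations):
    """Simulated per-rank trace of a ring exchange: send right, receive from left."""
    trace = []
    for _ in range(iterations):
        trace.append(("send", (rank + 1) % P))
        trace.append(("recv", (rank - 1) % P))
    return trace

def fold_repetition(events):
    """Return (body, count) with the shortest body such that body * count == events."""
    n = len(events)
    for period in range(1, n + 1):
        if n % period == 0 and events == events[:period] * (n // period):
            return events[:period], n // period
    return events, 1  # only reached for an empty trace

def summarize(rank, trace):
    """Rewrite peers as offsets from the observing rank, then fold the repetition."""
    relative = [(kind, (peer - rank) % P) for kind, peer in trace]
    body, count = fold_repetition(relative)
    return tuple(body), count
```

With these definitions, every rank's trace summarizes to the same result, a two-event body (send at offset 1, receive at offset 3) repeated three times for ring_trace(rank, 3). A single description therefore covers all processes in this toy case. Handling nested loops and the partial order between processes is beyond what this sketch attempts.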