Aoste is a joint team with UNS (University of Nice/Sophia-Antipolis) and CNRS UMR I3S. It is also co-located between Sophia-Antipolis and Rocquencourt. Project members originate from the former INRIA Tick and Ostre teams, together with the I3S Sports team.

Modern embedded systems combine complexity and heterogeneity both at the level of applications (with a mix of control-flow modes and multimedia data-flow streaming), and at the level of execution platforms (with increasing parallelism and multicore architectures). Compilation of the application onto the platform then takes the form of an allocation mapping involving spatial distribution as well as temporal scheduling. Formal models and methods may help to establish the correctness and the efficiency of such transformations. Static and dynamic (model-checking) analyses are also used to provide insights regarding prescribed formal semantics.

The main objective of the Aoste team is thus to promote the formal design of embedded systems, with their intrinsic concurrent, distributed and real-time aspects. To this end, we develop a model-based approach in which models have sound and precise operational semantics. We build upon previous experience of team members with synchronous reactive formalisms such as Esterel and its graphical SyncCharts version, various GALS or polychronous extensions owing to Concurrency Theory (like Process Networks), and the *Algorithm-Architecture Adequation* (AAA) methodology embodied in the SynDEx environment.

Because of their formal semantics, the various Models of Computation and Communication (MoCCs) considered in our team can be used in a true effective design flow methodology based on model transformation to represent compilation, synthesis, analysis and optimization of concurrent embedded applications onto parallel and multicore embedded architectures. Allocation seen in that sense comprises a physical distribution/placement as well as a temporal scheduling aspect. Timing constraints and requirements may be expressed and have to be checked and preserved in the process.

This type of incremental design flow may be applied to represent a number of existing, theoretical or practical approaches to the design of embedded systems.

In synchronous reactive models the various concurrent processes all run at the speed of a common global logical clock, which sets up the instantaneous reaction step. Synchronous formalisms provide an accurate representation of both hardware and scheduled embedded concurrent software; in both cases, simultaneous behaviors in a single global instant are allowed, and even often required.

Examples include Esterel/SyncCharts, Lustre/Scade, and Signal/PolyChrony. Esterel and SyncCharts are control-oriented state-based formalisms, while *Lustre* and *Signal* are declarative data-flow based formalisms. Synchronous formalisms have been discussed in many articles and book chapters.

The INRIA spin-off Esterel-EDA now develops and markets the industrial versions Esterel Studio and SCADE together with their programming environments.

Purely single-clock synchronous formalisms often prove excessively demanding when users must describe large systems with different clock domains. Independent logical clocks may be used to represent (total or partial) asynchrony amongst concurrent processes. Globally-Asynchronous/Locally-Synchronous (GALS) and polychronous/multiclock models are handy extensions that provide flexibility and modularity in system design. The recently proposed theory of latency-insensitive design (LID) with elastic time is a good example of such an approach: specific protocol elements may be inserted between existing “black-box” IP block components, at a later design stage, to make them comply with mandatory latencies on the global communications.

In any case the synchronous model remains the basic semantic level for behaviors, where the reaction step is defined. But natural properties (such as endochrony/asynchrony) allow one to view the GALS and multiclock descriptions as higher-level versions with a natural synchronous interpretation provided by simple scheduling. The monoclock version is then obtained by dedicated scheduling techniques known as *clock calculus*.

The previous model extensions often called for general results from branches of Theoretical Computer Science and Concurrency Theory such as Process Networks and Process Calculi. Process Networks comprise Petri Nets and Kahn Networks, as well as various specializations and generalizations, such as Event/Marked Graphs and Data-Flow Domains (Synchronous, Boolean, CycloStatic, CycloDynamic, ...), while Process Algebras such as CCS and CSP gave rise to extensions with simultaneous events in SCCS and Meije. We had prior background in this field, and more generally in the use of formal operational semantics in the design of verification and analysis techniques for such systems. We bridged the gap by using such models to study techniques for optimized placement and (static) scheduling. The specific features of hardware targets let us phrase these questions in a specific context, where ad-hoc ultimately periodic regimes can be established.

Following our *time refinement* approach, early untimed causal models may be transformed into multiclock or GALS ones, then precisely scheduled to a uniform single time. This type of approach is used for instance in the static, k-periodic scheduling of dataflow process networks such as Event Graphs or Synchronous DataFlow graphs, and various extensions in UC Berkeley's Ptolemy. We extended the approach by providing means for designers to supply their own time constraints on a given modeling framework, and to express the actual refinement from a given (abstract) time frame to another, more concrete one.
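
As an illustration of what static periodic scheduling starts from, the sketch below solves the classical balance equations of a Synchronous DataFlow graph to obtain its repetition vector (how many times each actor fires in one period). It is a minimal sketch, assuming a connected graph given as `(producer, consumer, production rate, consumption rate)` edges; the function name `repetition_vector` and the example graph are illustrative assumptions, not part of our tools.

```python
from fractions import Fraction
from math import lcm  # Python 3.9+


def repetition_vector(actors, edges):
    """Smallest positive integer firing counts q such that
    q[src] * prod == q[dst] * cons on every edge (src, dst, prod, cons).
    Returns None when the rates are inconsistent.
    Assumes the graph is connected (hypothetical input format)."""
    rate = {actors[0]: Fraction(1)}
    pending = [actors[0]]
    while pending:
        a = pending.pop()
        for src, dst, prod, cons in edges:
            if src == a:
                other, value = dst, rate[a] * prod / cons
            elif dst == a:
                other, value = src, rate[a] * cons / prod
            else:
                continue
            if other not in rate:
                rate[other] = value
                pending.append(other)
            elif rate[other] != value:
                return None  # no periodic schedule with bounded buffers
    # Scale the rational solution to the smallest integer one.
    scale = lcm(*(r.denominator for r in rate.values()))
    return {a: int(r * scale) for a, r in rate.items()}


# A produces 2 tokens per firing, B consumes 3; B produces 1, C consumes 2.
print(repetition_vector(["A", "B", "C"], [("A", "B", 2, 3), ("B", "C", 1, 2)]))
# {'A': 3, 'B': 2, 'C': 1}
```

A `None` result signals inconsistent rates, in which case no periodic schedule keeps the buffers bounded.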

This theory of modulo and k-periodic static scheduling for process networks (mostly Marked/Event Graphs) has recently attracted renewed interest due to its application in the context of Latency-Insensitive Design of SoCs. The nature of communication channels, used there for the interconnect fabric, demands optimal buffer/place sizing, with corresponding flow control. We contributed several results in this direction, with a fine characterization of optimal algorithmic techniques to provide such ultimately k-periodic schedules. They are progressively implemented in the K-Passa prototype software, described later in this report.

The AAA (*Algorithm-Architecture Adequation*) methodology, which is intended for optimizing distributed real-time embedded systems, relies on three models.

The Architecture model is a directed graph, whose vertices are sequential machines of two types: “processor” (computation resource or sequencer of operations) and “medium” (communication resource or sequencer of communications), and whose edges are directed connections.

The implementation model is also a directed graph, obtained through an external compositional law, where an architecture graph operates on an algorithm graph in order to give, as a result, a new algorithm graph, which corresponds to the initial algorithm graph, distributed and scheduled according to the architecture graph.

We address two main issues: monoprocessor real-time scheduling and multiprocessor real-time scheduling, where constraints must be met, otherwise dramatic consequences may occur (hard real-time), and where resources must be minimized because of embedded features.

In our monoprocessor real-time scheduling work, besides the classical deadline constraint, often equal to a period, we take into consideration dependences between tasks and several, possibly related, latencies. A latency is a generalization of the typical “end-to-end” constraint. Dealing with multiple real-time constraints raises the complexity of the problem. Moreover, because preemption leads to a waste of resources when its cost is approximated within the WCET (Worst-Case Execution Time) of every task, as proposed by Liu and Layland, we first studied non-preemptive real-time scheduling with dependence, periodicity, and latency constraints. Although a bad approximation may have dramatic consequences on real-time scheduling, there is little research on this topic. We have been investigating preemptive real-time scheduling for a few years, seeking the exact cost of preemption so that it can be integrated into schedulability conditions and into the corresponding scheduling algorithms. More generally, we are interested in integrating into the schedulability analyses the cost of the RTOS (Real-Time Operating System), for which the exact cost of preemption is the most difficult part because it varies according to the instance of each task. Finally, we also investigate the problem of mixing hard real-time and soft real-time constraints, which arises in the most complex applications.

The second research area is devoted to distributed real-time scheduling with embedding constraints. We use the results obtained in the monoprocessor case to derive solutions for the problem of multiprocessor (distributed) real-time scheduling. In addition to satisfying the multiple real-time constraints mentioned in the monoprocessor case, we have to minimize the total execution time (makespan) since we deal with automatic control applications involving feedback. Furthermore, the domain of embedded systems leads us to solve resource minimization problems. Since these optimization problems are NP-hard, we develop exact algorithms (branch and bound, branch and cut), which give optimal solutions for simple problems, and heuristics, which give sub-optimal solutions for realistic problems corresponding to industrial needs. We proposed a family of very fast “greedy” heuristics whose results are improved with local neighborhood heuristics, or used as initial solutions for metaheuristics such as variants of “simulated annealing”.
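
To give a feel for what a “greedy” distribution and scheduling heuristic looks like, here is a generic list-scheduling sketch. It is not the heuristic implemented in SynDEx: it ignores communication costs, periods and latencies, and the tie-breaking rule (longest ready operation first) is an arbitrary assumption. The name `greedy_schedule` and the data layout are hypothetical.

```python
def greedy_schedule(durations, deps, n_procs):
    """durations: {op: time}; deps: {op: [predecessors]}.
    Returns (makespan, placement) with placement[op] = (processor, start, end).
    Generic list scheduling, communication costs ignored (assumption)."""
    placement = {}
    free_at = [0] * n_procs          # date at which each processor becomes idle
    remaining = dict(durations)
    while remaining:
        # Operations whose predecessors have all been scheduled.
        ready = [op for op in remaining
                 if all(p in placement for p in deps.get(op, []))]
        if not ready:
            raise ValueError("dependence cycle")
        op = max(ready, key=lambda o: (durations[o], o))
        # Earliest date at which all input data are available.
        earliest = max((placement[p][2] for p in deps.get(op, [])), default=0)
        # Pick the processor on which the operation can start first.
        proc = min(range(n_procs), key=lambda k: max(free_at[k], earliest))
        start = max(free_at[proc], earliest)
        placement[op] = (proc, start, start + durations[op])
        free_at[proc] = start + durations[op]
        del remaining[op]
    return max(free_at), placement


makespan, placement = greedy_schedule(
    {"a": 2, "b": 3, "c": 1, "d": 2}, {"c": ["a", "b"], "d": ["c"]}, n_procs=2)
print(makespan)  # 6
```

A solution of this kind is typically what a neighborhood search or a metaheuristic would then take as its starting point.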

We promote a model-driven engineering approach for embedded system design based on formal semantics, models and methods. The range of models concerned was outlined above. They all consist mainly of hierarchical state diagrams and dataflow/activity diagrams for behavior, and component diagrams with connection ports for compositional structure. This brought to light the idea of using the OMG UML formalism for graphical representation of models, as it contains all these modeling views.

Still, our considered range of models differs mainly by their timing semantics, ranging from fully asynchronous to monoclock synchronous through intermediate multiclock/polychronous versions. The original UML semantics only considers asynchronous event-based simulation behaviors, mostly because the formalism does not consider *time* as an essential feature. Whenever time is (scarcely) introduced in the original standard, it is as a non-functional aspect irrelevant to the described semantics.

As part of the Marte profile for *Modeling and Analysis of Real-Time and Embedded systems*, we introduced a *time model*, which makes it possible to describe precisely the various time threads (or TimeBase, producing logical clocks). These clocks are used to define the precise temporal semantics of the global model. Thus, time becomes a functional aspect, essential to comprehend the proper dynamics of the system.

We have thus defined a specification language, CCSL, for expressing Clock Constraints. It can be used as input to the TimeSquare prototype tool, described later in this report, to extract feasible simulations, and applied to UML representation models as part of the Marte OMG profile.

Then common Models of Computation and Communication (MoCCs) can be built on top of these constructs, to be used directly by the final user. The advanced profile features are aimed primarily at advanced designers and semanticians willing to devise accurate time patterns.
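
The kind of relation a clock constraint expresses can be sketched with plain predicates over a finite trace, where each step of the trace is the set of logical clocks that tick together. This is neither CCSL syntax nor the TimeSquare API: the function names below are hypothetical, and the three relations (sub-clocking, strict precedence, alternation) are given in a simplified finite-trace reading.

```python
def is_subclock(trace, a, b):
    """Every step where clock `a` ticks is also a step where `b` ticks."""
    return all(b in step for step in trace if a in step)


def strictly_precedes(trace, a, b):
    """The k-th tick of `b` occurs strictly after the k-th tick of `a`."""
    ahead = 0  # ticks of `a` not yet matched by a tick of `b`
    for step in trace:
        if b in step:
            if ahead == 0:
                return False
            ahead -= 1
        if a in step:
            ahead += 1
    return True


def alternates(trace, a, b):
    """Ticks occur strictly in the order a, b, a, b, ..."""
    expect = a
    for step in trace:
        if a in step and b in step:
            return False
        if a in step or b in step:
            if expect not in step:
                return False
            expect = b if expect == a else a
    return True


trace = [{"a"}, {"b"}, {"a", "c"}, {"b"}]
print(is_subclock(trace, "c", "a"),
      strictly_precedes(trace, "a", "b"),
      alternates(trace, "a", "b"))
# True True True
```

A simulator works in the opposite direction: instead of checking a given trace, it builds at each step a set of ticking clocks that violates none of the constraints.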

Following the AAA (*Algorithm-Architecture Adequation*) methodology, MARTE promotes, in a first step, independent modeling of *applications* (called *algorithm*) and *embedded platforms* (called *architecture*). The mapping (spatial and temporal) of applications onto embedded platforms is realized only in a subsequent step, through distributed and real-time scheduling analysis and optimizations, relative to the timing constraints and resource costs involved.

Marte was started as a joint action of Thales, CEA-List and INRIA in their CARROLL collaborative program. The profile RFP (Request For Proposals) was voted in early 2005, the initial submission was made in June 2007, and the (first complete) revised version in mid-2008. The profile is currently undergoing final revision in the Ad-Hoc Finalization Task Force at OMG.

The field of model-driven engineering of hardware/software embedded systems is hosting a number of ad-hoc standards dedicated to specific domains. These standards are instrumental in shaping up the technical and economic activities of model exchange between various industrial and academic partners. The main ones considered in our team are:

for avionic systems;

for automotive systems;

for System-on-Chip design;

as Electronic System-Level modeling and programming languages.

These standards may be helpful in performing a number of analyses, such as early component integration, performance/schedulability analysis, and so forth. We conducted a number of comparative studies establishing how generic and specific concepts embodied in these standards could be reflected in Marte, thereby allowing model transformations and exchanges, in a domain-agnostic fashion.

Synchronous formalisms and GALS or multiclock extensions are natural model representations of hardware circuits at various abstraction levels. They may compete with HDLs (Hardware Description Languages) at RTL and even TLM levels. The main originality of languages built upon these models is to be based on formal *synthesis* semantics, rather than mere simulation forms.

The flexibility of formal Models of Computation and Communication makes it possible to specify modular Latency-Insensitive Designs, where the interconnect structure is built up and optimized around existing IP components, respecting some mandatory computation and communication latencies prescribed by the system architect. This allows a real platform-view development, with component reuse and timing-closure analysis. The design and optimization of the interconnect fabric around IP blocks transforms, at modeling level, an (untimed) asynchronous version into a (scheduled) multiclock timed one.

Also, Network-on-Chip design may call for computable switching patterns, just like computable scheduling patterns were used in (predictable) Latency-Insensitive Design. Here again formal models, such as Cyclo-static dataflow graphs and extended Kahn networks with explicit routing schemes, are modeling elements of choice for a real synthesis/optimization approach to the design of systems.

Multicore embedded architecture platforms may be represented as Marte UML component diagrams. The semantics of concurrent applications may also be represented as Marte behavior diagrams embodying precise MoCCs. Optimized compilation/synthesis relies on specific algorithms, and is represented as model transformations and allocation (of application onto architecture).

Our current work aims thus primarily at providing Theoretical Computer Science foundations to this domain of multicore embedded SoCs, with possibly efficient application in modeling, analysis and compilation wherever possible due to some natural assumptions. We also deal with a comparative view of Esterel and SystemC TLM for more practical modeling, and the relation between the Spirit IP-Xact interface standard in SoC domain with its Marte counterpart.

Model-Driven Engineering is progressively reaching these fields. The formalisms AADL (for avionics) and AutoSar provide support for this, unfortunately not always with a clean and formal semantics. Yet some interesting issues arise there in the mix of event-triggered and time-triggered processing means, the various related protocols, and the coexistence of periodic and aperiodic tasks, possibly with distinct periods. The process of scheduling and allocating multiple heterogeneous and communicating applications onto complex embedded architectural platforms requires adequate modeling and synthesis/analysis/verification techniques to help designers converge to acceptable solutions.

TimeSquare is a software environment for modeling and analyzing timed systems. It supports an implementation of the Time Model introduced in the Marte UML profile (presented earlier in this report), and its companion Clock Constraint Specification Language (CCSL).

TimeSquare has four main functionalities:

interactive clock-related specifications, through dialog boxes,

clock constraint checking,

generation of a consistent temporal structure, using a Boolean solver,

displaying and exploring waveforms, written in the IEEE standard VCD format.

TimeSquare is a plug-in developed with Ganymede Eclipse Modeling Tools. It also uses ANTLR for constraint parsing and JavaBDD for the solver. It is integrated in the
OpenEmbeDD platform and can be downloaded from the team site
(http://

K-Passa is dedicated to the simulation, analysis, and effective regular scheduling of Event/Marked Graphs, SDF and KRG extensions. A graphical interface makes it possible to edit the Process Networks and their time annotations (*latency, ...*). Linear programming as well as simulations and other analytic methods are used to compute the necessary addition of integer and fractional extra latencies/registers. Critical cycles are displayed. In the case of KRG the (ultimately k-periodic) routing patterns can also be input and transformed.
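
The notion of critical cycle can be illustrated with the classical result on timed marked graphs: the sustainable throughput is bounded by the smallest ratio of tokens to latency over all directed cycles, and a cycle reaching this minimum is critical. The sketch below enumerates simple cycles by brute force, which is only reasonable for small graphs. It does not reflect how K-Passa computes them; the function name and the edge format `(src, dst, tokens, latency)`, with positive latencies and comparable node names, are assumptions.

```python
from fractions import Fraction


def critical_cycle(edges):
    """edges: list of (src, dst, tokens, latency), latency > 0.
    Returns (ratio, cycle) for a simple cycle minimizing tokens/latency,
    or None if the graph has no cycle. Brute-force enumeration."""
    succ = {}
    for src, dst, tokens, latency in edges:
        succ.setdefault(src, []).append((dst, tokens, latency))
    nodes = sorted(set(succ) | {dst for _, dst, _, _ in edges})
    best = None

    def explore(start, node, path, tokens, latency):
        nonlocal best
        for nxt, t, l in succ.get(node, []):
            if nxt == start:
                ratio = Fraction(tokens + t, latency + l)
                if best is None or ratio < best[0]:
                    best = (ratio, path[:])
            # Each cycle is enumerated once, from its smallest node.
            elif nxt > start and nxt not in path:
                path.append(nxt)
                explore(start, nxt, path, tokens + t, latency + l)
                path.pop()

    for start in nodes:
        explore(start, start, [start], 0, 0)
    return best


edges = [("A", "B", 1, 1), ("B", "A", 0, 1), ("B", "C", 1, 1), ("C", "A", 0, 3)]
print(critical_cycle(edges))
# (Fraction(2, 5), ['A', 'B', 'C'])
```

In this example the cycle A, B, C carries 2 tokens for a total latency of 5, so it limits the throughput to 2/5, below the 1/2 allowed by the shorter cycle A, B.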

K-Passa currently relies in part on the MascOpt library for graph algorithms developed in the Mascotte EPI. It also uses Ilog's Cplex as its underlying linear constraint solver.

This software was developed as a result of research on Latency-Insensitive Design conducted in the context of the CIM PACA initiative, with the support of industrial partners providing motivations.

SynDEx is a system level CAD software implementing the AAA methodology for rapid prototyping and for optimizing distributed real-time embedded applications. It can be downloaded free of
charge, under copyright, at the url:
http://

The AAA methodology requires the specification of three main ingredients: an application algorithm, an architectural platform, and real-time features or requirements regarding their combination. Given these, SynDEx will explore the space of possible allocations (distribution and scheduling) from application elements to architecture resources and services to match the real-time requirements, using heuristic techniques. It will automatically generate distributed real-time code running on the embedded platform.

Application algorithms can be edited graphically as directed acyclic task graphs (DAG), or they may be obtained by translation from various sources, such as (formal) synchronous reactive languages, Scilab/Scicos and UML/MARTE.

Architectures are represented as graphical block diagrams composed of programmable (processors) and non-programmable (ASIC, FPGA) computing components, interconnected by communication media (shared memories, links and busses for message passing).

Application real-time features for elementary operations, relative to each hardware component, include *execution and transfer time, period, memory, etc*. Requirements are generally constraints on latency, throughput, etc.

Exploration of alternative allocations of the algorithm onto the architecture may be performed manually or with the help of optimization heuristics; results are visualized as timing diagrams simulating the distributed real-time implementation.

Implementation deployments use dedicated distributed real-time executives, or general purpose real-time operating systems such as Linux/RTAI or Osek for instance. These executives are deadlock-free, based on off-line scheduling policies. Dedicated executives induce minimal overhead, and are built from processor-dependent executive kernels. Presently, executive kernels are provided for: TMS320C40, PIC18F2680, i80386, MC68332, MPC555, i80C196 and Unix/Linux workstations. Executive kernels for other processors can be ported at reasonable cost following these patterns.

The SAS (Simulation and Analysis of Scheduling) software allows the user to perform the schedulability analysis of periodic task systems in the monoprocessor case. The parameters of the system can be set either using the graphical interface, or from a file. The main contribution, compared to other commercial and academic tools of the same kind, is that SAS takes into account the exact cost of preemption during the schedulability analysis. Besides the usual deadline constraints, the user may also set other constraints such as precedence, strict periodicity, and latency. Several scheduling policies can be selected: Deadline Monotonic, Rate Monotonic, particular priorities set by the user, and a specific version of Audsley's algorithm. The classical Audsley algorithm is no longer applicable nor optimal when the cost of preemption is considered. The specific version is optimal and shows all the possible solutions.

Once the parameters are loaded, the user launches the simulation and the analysis either step by step, or for the whole set of tasks according to the constraints. The resulting schedule is displayed as a typical Gantt chart with a transient and a permanent phase, or as a disk named "dameid" that clearly shows the idle slots of the processor in the permanent phase. For systems with a large least common multiple of periods it is possible to zoom in on parts of the results.

When the system is schedulable, the following results are displayed: classical utilization factor, permanent exact utilization factor, preemption cost in the permanent phase, and the worst response time for each task. An additional graph can be displayed showing the response time as a function of time. All the graphics can be converted into PostScript for printing.
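
For reference, the classical quantities among these results can be computed as in the sketch below: the utilization factor and the worst-case response time obtained by the standard fixed-point iteration for fixed-priority preemptive scheduling. This sketch deliberately ignores the cost of preemption, which is precisely what SAS adds, so it does not reproduce the SAS analysis. The function names and the `(C, T)` task format (execution time, period, deadline equal to period, simultaneous first release) are assumptions.

```python
from fractions import Fraction


def utilization(tasks):
    """Classical utilization factor: sum of C / T."""
    return sum(Fraction(c, t) for c, t in tasks)


def response_times(tasks):
    """tasks: list of (C, T), sorted by decreasing priority.
    Returns the worst-case response time of each task, or None for a
    task that misses its deadline (taken equal to its period).
    Preemption cost is NOT accounted for (classical analysis)."""
    results = []
    for i, (c, t) in enumerate(tasks):
        r = c
        while True:
            # Interference from higher-priority tasks released in [0, r).
            nxt = c + sum(-(-r // tj) * cj for cj, tj in tasks[:i])
            if nxt > t:
                results.append(None)
                break
            if nxt == r:
                results.append(r)
                break
            r = nxt
    return results


tasks = [(1, 4), (2, 6), (3, 12)]   # Rate Monotonic priority order
print(utilization(tasks))           # 5/6
print(response_times(tasks))        # [1, 3, 10]
```

The expression `-(-r // tj)` is an integer ceiling, which avoids floating-point rounding in the iteration.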

The software has been written in OCaml, using Camlp5 (syntactic preprocessor) and Olibrt (a graphic toolkit under X), both written by Daniel de Rauglaudre.

Our former work on Latency-Insensitive Design of SoC led us to consider formal models of interconnected IP components, as Process Networks and Marked/Event Graphs with latency information. We studied dynamic, then static off-line scheduling techniques for such models, owing to ultimately k-periodic regular regimes.

This year we established that the naive computation of schedules by ASAP execution from *any* initial data allocation could suffer from several malfunctions. First, the global throughput could be impaired by back-pressure flow control mechanisms; second, the interconnect channel utilization could sometimes be suboptimal. Even though these phenomena can be considered extremely rare, we wanted to obtain a theoretically optimal schedule allocation.

We then studied the theory of so-called *balanced* periodic binary words, originally due to Christoffel (1885) and Bernoulli (1776). We introduced two specific operations on such words: *rotations*, representing integral delays between activations of successive computation blocks, and *transpositions*, representing (residual) fractional delays. We established deep structure theorems linking both. With this mathematical apparatus we were able to prove that balanced schedules could be imposed and correctly propagated across the process network in such a way that every computation node executes exactly when input data occur at its entry ports. The schedule produced also requires exactly the same number of data as originally present in the model, so that the full system is completely consistent. These results were presented in Jean-Vivien Millo's PhD thesis, defended in December.
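
A balanced periodic binary word with a given rate can be generated with the classical mechanical-word formula, in which letter i is the difference of two consecutive integer parts. The sketch below builds such a word, rotates it, and checks balance by brute force (in the infinite repetition of the word, any two factors of equal length differ by at most one in their number of ones). The function names are illustrative assumptions, and the rotation shown is a plain cyclic shift; transpositions are not covered.

```python
def balanced_word(ones, length):
    """Period of a binary word with `ones` ones every `length` letters,
    spread as evenly as possible (mechanical word)."""
    return [(i + 1) * ones // length - i * ones // length
            for i in range(length)]


def rotate(word, k):
    """Cyclic shift of the period by k positions, 0 <= k < len(word)."""
    return word[k:] + word[:k]


def is_balanced(word):
    """Brute-force balance check on the infinite repetition of `word`."""
    n = len(word)
    for size in range(1, n + 1):
        counts = {sum(word[(s + j) % n] for j in range(size))
                  for s in range(n)}
        if max(counts) - min(counts) > 1:
            return False
    return True


w = balanced_word(3, 8)
print(w)                                        # [0, 0, 1, 0, 0, 1, 0, 1]
print(is_balanced(w), is_balanced(rotate(w, 3)))  # True True
print(is_balanced([1, 1, 0, 0, 0, 1, 0, 0]))    # False
```

Read as a schedule, the word `w` activates a computation block 3 times every 8 instants, with activations as evenly spaced as integer instants allow.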

The last remaining problem now consists in finding a “short” asynchronous sequence of execution steps by individual IP components reaching a balanced data marking from any original initial one.

The Process Network models based on Event Graphs, under varying synchronous or asynchronous interpretations, allow powerful results in static scheduling and distribution allocations. Nevertheless they always postulate a uniform data flow. We tried to relax this strong assumption, while preserving the determinism/confluence of computations, as in Kahn Process Networks. Several models were inspirational (such as Boolean and CycloStatic DataFlow Graphs). But the main originality of our KRGs is to rely on two *Select/Merge* operator nodes only (for *mux/demux* effects on data streams). And, *most importantly*, the switching patterns for these conditional nodes have to be infinite periodic binary words, thus using exactly the same description formalism as our previous *schedule words*.
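
The behavior of the two routing nodes can be sketched on finite streams as follows. This is a simplified illustration, assuming that a 1 in the switching pattern designates one branch and a 0 the other, and that the pattern is purely periodic; the function names are hypothetical and the sketch says nothing about the timing of tokens.

```python
from itertools import cycle

_END = object()  # sentinel marking an exhausted input


def select(stream, pattern):
    """Demux: send each token to branch 1 or branch 0 following the
    periodic binary pattern. Returns (branch0, branch1)."""
    out0, out1 = [], []
    for token, bit in zip(stream, cycle(pattern)):
        (out1 if bit else out0).append(token)
    return out0, out1


def merge(in0, in1, pattern):
    """Mux: take the next token from in1 on a 1, from in0 on a 0;
    stop as soon as the designated input is exhausted."""
    it0, it1 = iter(in0), iter(in1)
    out = []
    for bit in cycle(pattern):
        token = next(it1 if bit else it0, _END)
        if token is _END:
            break
        out.append(token)
    return out


pattern = [1, 0, 0]
branch0, branch1 = select(list(range(10)), pattern)
print(branch1)                            # [0, 3, 6, 9]
print(merge(branch0, branch1, pattern))   # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9]
```

Merging with the same pattern that was used for selecting restores the original stream, a small instance of the determinism that routing by fixed binary words preserves.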

We have proven a number of powerful algebraic results on such models. They may best be understood by analogy with Boolean Algebra, and the existence of normal forms (such as sum of products, or product of sums). Here, the expansion and factorization of expressions and variables amounts to sharing and unsharing of links and channels in the interconnect fabric representation of the communications across the networks.

Our main results during this year have been submitted for publication. We believe this model of KRGs to be an important step for the Theoretical Computer Science modeling of modern Networks-on-Chip (*NoCs*).

The Marte time model, part of the OMG standards, is now in its Finalization Task Force (FTF) phase. The latest Marte specification was released in August 2008. Frédéric Mallet attends all the OMG technical meetings, actively contributing to the Marte and SysML standardizations, and is in charge of the convergence between the two specifications. He also participates with Charles André in the issue resolutions related to the Time and Allocation chapters of Marte.

The main concepts of the Time model and their UML representations were presented at MoDELS'07. The applicability of Marte has been demonstrated by a study in collaboration with Thales and the CEA.

Besides this standardization effort, Aoste promotes a *timed causality model for UML* addressing both semantic and pragmatic points of view. A formal semantics for CCSL has been proposed. The TimeSquare environment, described earlier in this report, has been developed. It fully supports the Marte Time Model, CCSL, and timed simulations. We plan to use this simulation capability in the ANR project RT-SIMEX that will start on January 1st, 2009.

The Marte Time Model together with its constraint language (CCSL) provides a way to specify timed causality in a UML model. More generally, this broadly expressive time model is devised to define models of computations and communications (MoCC) by extending the UML. The expected benefit of relying on the UML is to reuse existing graphical editors and integrate various MoCC within the same framework. Classical UML models like state machines or activities can then be executed according to the specific semantics of a given MoCC. For instance, UML state machines can be given a synchronous semantics to behave like SyncCharts, and UML activities can behave like Scade dataflow diagrams.

Towards this goal, equivalent Marte/CCSL constructs have been defined in Signal and in Time Petri nets. These two well-known languages have been chosen as representatives of synchronous and asynchronous languages commonly used in the domain for formal analysis.

Besides, the SOS semantics of CCSL enables the creation of new specific tools. For instance, we have built TimeSquare, a Java-based Eclipse plug-in that provides assistance to construct Marte/CCSL models and run simulations. Following a Model-Driven Engineering process, we have also provided a metamodel of CCSL whose formal semantics has been woven directly into the models by using KerMeta. This metamodeling work widens the scope of application of CCSL and should help to link a specific model with its MoCC.

CCSL has been used as a pivot language to give a timed semantics to several formalisms from different application domains:

In the **automotive** domain, AutoSar is a broad-scope emerging standard. EAST-ADL2 (Electronic Architecture and Software Tools) is an architecture description language that provides a structure for the engineering information involved in automotive software development. Aoste contributes to this effort through the MeMVaTEx project. By defining transformations from EAST-ADL timing requirements to CCSL relations, we gave an operational semantics to EAST-ADL, which makes the requirements executable.

In the **avionic** domain, we considered the SAE standard AADL. Indeed, one of the Marte requirements was to integrate UML and AADL models. Aoste has studied the AADL communication model, which mixes time-triggered and event-based messages, and has defined systematic transformation rules to CCSL relations, thus allowing Marte models to be partially analyzed by schedulability analysis tools often used with AADL (like Cheddar).

In the **Electronic Design Automation** (EDA) domain, IP-Xact is *the* standard for describing and handling *intellectual properties* (IP). IP-Xact describes static properties of IP including type of ports, usage of the address space, interconnections, etc., but it relies entirely on programming languages such as Verilog-HDL or SystemC for the description of the behavior. Early analysis of systems of IPs requires an abstract description of timing and behavioral aspects. UML can provide both structural and behavioral aspects. Following an MDE approach, Aoste has proposed a metamodel for IP-Xact, defined a dedicated UML profile, and provided an automated model transformation from UML (profiled with Marte and our IP-Xact profile) to IP-Xact.

On a methodological plane, F. Lagarde in his thesis has studied UML profiles as possible candidates for Domain Specific Language (DSL) design. He has defined *metrics* for profile assessment and proposed a set of dedicated high-level constructs that embody OCL expressions to constrain models on which several profiles are applied. Another contribution, in collaboration with F. Mallet and C. André, is the use of a *multilevel paradigm* (i.e., not restricted to the Class-Instance relationship). This approach has brought a fresh view to the Marte Time profile, a posteriori justifying the existence of two distinct stereotypes (ClockType and Clock) for the Clock concept.

Last year, we proposed a necessary and sufficient schedulability condition for a system of periodic tasks, all released at the same time (simultaneous) and scheduled according to *RMA*, while taking into account the exact cost due to preemption. Then, we extended this result to the scheduling problem consisting of periodic tasks with precedence and strict periodicity constraints, and with periods not necessarily forming a harmonic sequence.

This year, we have introduced a new model in order to solve the general scheduling problem of hard real-time systems with various kinds of constraints such as precedence, strict
periodicity, latency and jitter, while taking into account the exact cost due to preemption for any scenario of first release for all tasks (simultaneous or not). The schedulability
analysis is based on the definition of a binary scheduling operation whose operands are called
*otasks*. We have shown the correspondence between an otask and a periodic task. By convention in the definition of this operation, the left-hand otask is always assigned the highest priority and the right-hand otask the lowest; the operation is
thus not commutative. Each otask is an ordered multiset which consists of either a finite or an infinite sequence of symbols belonging to a finite set called the
*generator*. For the generator {a, e}, symbol “a” always corresponds to an
*available time unit*, and symbol “e” corresponds to an
*already executed time unit* when it belongs to the left-hand otask and to an
*executable time unit* when it belongs to the right-hand otask of the operation. For a given set of otasks, the operation
is used as many times as there are tasks in the system, thanks to its associativity. We have considered two approaches. In the first, the priority of each otask belonging to the system is known,
which leads to a decreasing order of priorities of the otasks. In the second, the priority of each otask is not known; in this case we have first derived an
*optimal* assignment of priorities to the otasks. This model has been used to handle the scheduling problem for hard real-time systems with precedence, strict periodicity and latency
constraints. We have closely studied the impact of the scenario of first releases of all tasks on the schedulability analysis. We have proved, on the one hand, that the scenario where the first
releases of all tasks are simultaneous does not correspond to the worst-case scenario and, on the other hand, that there is no such worst-case scenario when the cost of preemption is considered.
Then, we have proved that the assignment of priorities to tasks based on Audsley's policy is no longer applicable nor optimal. Here, optimal means that if there exists an assignment which
leads to a valid schedule for the given set of tasks, then the proposed assignment of priorities will also lead to a valid schedule. In order to take into account the global cost of the RTOS
(Real-Time Operating System), of which the preemption cost is the variable part, we have simulated many sets of tasks with the
*Linux/RTAI* RTOS in order to evaluate its impact on the schedulability analysis.
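
To give an intuition of why such an operation is not commutative, here is a deliberately simplified sketch. It is a hypothetical reading of the model, not its actual definition: otasks are finite strings over {a, e} covering one period, first releases are simultaneous, preemption cost is zero, and deadlines equal periods. The function name `compose` and the string encoding are assumptions made for illustration only.

```python
from math import gcd


def compose(left, right):
    """Schedule the right-hand otask in the free units of the left-hand one.

    left, right: strings over {'a', 'e'}, one period long.
    In `left`, 'e' marks already executed units and 'a' available units.
    In `right`, the number of 'e' symbols is the execution demand per period.
    Returns the resulting otask over the common hyperperiod, or None when
    the right-hand otask cannot finish within one of its periods.
    """
    horizon = len(left) * len(right) // gcd(len(left), len(right))
    timeline = list(left * (horizon // len(left)))
    demand = right.count("e")
    period = len(right)
    for start in range(0, horizon, period):
        # Units still available to the lower-priority otask in this period.
        free = [t for t in range(start, start + period) if timeline[t] == "a"]
        if len(free) < demand:
            return None
        for t in free[:demand]:
            timeline[t] = "e"
    return "".join(timeline)


print(compose("eaa", "eeeaaa"))   # higher priority given to the period-3 otask
print(compose("eeeaaa", "eaa"))   # operands swapped
```

In this toy version, swapping the operands turns a valid schedule into a failure, because the otask with period 3 finds no available unit in its first period once the other otask has taken priority.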

Currently, we are introducing jitter constraints into our model; the preliminary results obtained so far concern non-schedulability conditions.

Over the last two years, we performed a schedulability study which led to a schedulability condition that considers only two tasks at a time. From this condition we proposed, first, a greedy heuristic for non-preemptive multiprocessor scheduling of systems with precedence and strict periodicity constraints. Second, in order to improve on this first algorithm, we introduced a backtracking procedure leading to a better schedulability ratio.
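
A pairwise test and a greedy placement of this kind can be sketched as follows. The test used here is the classical gcd-based non-overlap condition for two strictly periodic, non-preemptive tasks on one processor; whether it coincides exactly with the condition mentioned above is an assumption. Precedence constraints and the backtracking step are left out, and the names `compatible` and `greedy_partition` are illustrative.

```python
from math import gcd


def compatible(s_i, c_i, t_i, s_j, c_j, t_j):
    """True if two strictly periodic non-preemptive tasks never overlap.

    Task i starts at s_i, runs c_i units and repeats every t_i units
    (likewise for task j). With g = gcd(t_i, t_j), the instances never
    overlap iff c_i <= (s_j - s_i) mod g <= g - c_j.
    """
    g = gcd(t_i, t_j)
    return c_i <= (s_j - s_i) % g <= g - c_j


def greedy_partition(tasks, n_procs):
    """First-fit placement; tasks is a list of (name, wcet, period).

    Returns {name: (processor, start_time)} or None if some task fits nowhere.
    """
    procs = [[] for _ in range(n_procs)]
    placement = {}
    for name, c, t in sorted(tasks, key=lambda task: task[2]):
        for p, assigned in enumerate(procs):
            # Earliest integer start time compatible with every task on p.
            start = next(
                (s for s in range(t)
                 if all(compatible(s_k, c_k, t_k, s, c, t)
                        for s_k, c_k, t_k in assigned)),
                None,
            )
            if start is not None:
                assigned.append((start, c, t))
                placement[name] = (p, start)
                break
        else:
            return None  # no processor can host this task
    return placement


print(greedy_partition([("A", 1, 4), ("B", 1, 6), ("C", 2, 12)], 1))
```

Because two tasks overlap only if some pair of their instances overlaps, checking every pair on a processor is enough to validate the whole processor; the greedy order (shortest period first) is a design choice of this sketch.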

This year, we continued the schedulability study by proposing a more general schedulability condition which checks whether or not a task satisfies its periodicity and precedence constraints on a processor where some tasks have already been scheduled. Since the problem of deciding whether such systems are schedulable is NP-hard in the strong sense, this condition constitutes an important outcome. It can be used in partitioned scheduling approaches for solving the multiprocessor scheduling problem. In such approaches, tasks are partitioned among the processors, which transforms the multiprocessor problem into several monoprocessor problems. We proposed a new heuristic using this condition. It achieves a much better schedulability ratio than the previous versions, with a satisfactory execution time.

The novelty this year is the introduction of latency constraints in the schedulability study. The latency constraint considered corresponds to a delay between any pair of tasks in the system. Most of the time it is imposed between an input event consumed by the system and an output event produced by the system. As the scheduling problem under latency constraints is also NP-hard, we chose to first study schedulability under latency and precedence constraints, and then to deal with schedulability under all constraints. As a result, we proposed a schedulability condition and a heuristic for multiprocessor scheduling of tasks under precedence, strict periodicity and latency constraints.

All previous scheduling heuristics aim at minimizing the makespan, which is the total execution time of all the tasks, in addition to satisfying all the constraints. However, the makespan can be
further reduced with a load balancing heuristic. Moreover, since memory is limited in embedded systems, we need a heuristic that uses memory efficiently. Thus, we proposed a heuristic
for load balancing and efficient memory usage of homogeneous distributed real-time embedded systems. Basically, it is achieved by grouping the tasks into blocks and moving each block to a processor such that the block
start time decreases and the processor has enough memory capacity to execute the tasks of the block. We have shown that the proposed heuristic has polynomial complexity, which ensures a fast
execution time. We also performed a theoretical performance study which bounds the decrease in total execution time and shows that our heuristic is an approximation algorithm for the memory usage, with a ratio expressed in terms of M, the number of processors.

In 2008 we continued our research effort on the deterministic globally asynchronous implementation of synchronous specifications. Our approach is to consider multi-clock synchronous specifications, and then encode the absence and non-execution of the synchronous model with actual absence of communication and non-execution in the globally asynchronous implementation. This raises complex correctness issues that must be solved. The primary practical application is the extension of the class of implementations AAA/SynDEx can support, through the definition of new synchronization schemes and associated implementation algorithms. Here, the objective is to allow operations and communication lines to be inactive at certain logical instants (repetitions of the graph pattern representing the algorithm), depending on the state and input data.

We focused on the semantics-preserving execution of a single synchronous program/core in an asynchronous environment. More precisely, we need to preserve the function of the synchronous program/core (as an I/O stream mapping) while allowing for elastic timing.

Our first contribution here has been the characterization of the largest class of synchronous programs/cores that produce deterministic implementations using a very general execution machine
based on: (1) the chosen signal absence encoding, and (2) an ASAP (as soon as possible) reaction triggering policy. The characterization is a form of
*confluence* and
*determinism*. We also characterized the largest sub-class of such programs that is closed under synchronous composition (thus offering the basis for realistic techniques for incremental
development). These results can be found in
.

The second contribution has been the definition of algorithms allowing us to check whether a general synchronous program (specified in the Signal language developed by the Espresso Team-Project) is weakly endochronous. For a weakly endochronous program, the algorithms also produce the set of atomic behaviors, allowing the simple construction of the GALS wrapper. When the program is not weakly endochronous, the output of the algorithm allows the transformation of the specification into a weakly endochronous one through the addition of a minimal number of messages carrying signal absence information over the existing communication lines.

Recent evolutions in the classes of specifications and desired implementations of SynDEx have created a need to revisit and improve the formal support of the AAA/SynDEx methodology. A preliminary analysis of AAA/SynDEx has revealed that the high abstraction level of the hardware architecture models of SynDEx results in:

Very large time overheads in the implementation, mainly due to over-synchronization in a distributed environment.

Difficult construction of correct and efficient communication libraries and difficult synthesis of correct and efficient synchronization code.

It is therefore imperative to invest more in the definition of lower-level models of the implementation architecture, in order to ensure the correctness and efficiency of the generated code.

We started by considering the problem of optimizing existing static schedules of conditional communications over a broadcast asynchronous bus. Preliminary results in this direction are presented in .

In previous years, we developed with SynDEx a “visual control of autonomous CyCabs for platooning” application. This year we improved the image processing algorithm which detects the followed CyCab from a webcam, in order to achieve longitudinal as well as lateral control. The latter improves detection when the followed CyCab is turning. In addition, the CyCab detection algorithm was integrated into the control algorithm to obtain a single algorithm specified with SynDEx, whereas previously the detection algorithm did not appear in the control algorithm. Complex problems have arisen due to the mix of two types of real-time executives. Indeed, the detection algorithm is executed under Linux/RTAI and the control algorithm is executed under a real-time executive completely synthesized by SynDEx. Moreover, the webcam uses a Linux driver which must communicate the images to Linux/RTAI through a shared memory. Finally, we obtained a first version of the complete application specified and implemented with SynDEx on the distributed architecture of the CyCab (several MPC555 microcontrollers and an embedded PC communicating through a CAN bus).

In previous years, two main improvements were achieved in SynDEx: a new data structure associated with a better GUI (Graphical User Interface), and a multi-periodic version of the optimized distribution and scheduling heuristic, called adequation in SynDEx, which was mono-periodic before. However, these two developments were carried out separately, mainly because the new multi-periodic heuristic was developed by the PhD student Omar Kermia whereas the new structure and GUI were developed by a software engineer. Consequently, this year was mainly devoted to merging both software developments in a consistent way, to testing the new GUI and the new adequation, and to updating the documentation accordingly. This important work should lead to a new major release of SynDEx, called V7, at the end of the year or the beginning of the next one.

In addition, for the code generator of SynDEx, we studied the possible improvements in terms of code length and synchronization optimizations.

This collaboration takes the form of a series of one-year grants (4 so far). We explore Latency-Insensitive Design (LID), based on the original work of Luca Carloni (now at Columbia University), with whom we share an associated-team programme supported by INRIA.

This year we worked on the extension of our modeling framework to encompass alternative routing and signal redirection. In the case of ultimately k-periodic schedules of LID systems, we also consider k-periodic switching schemes. The results are of particular interest in that we prove that an algebraic transformation holds, by which we can either share fewer channels with more interleaving and more demanding scheduling, or progress communications faster with more channeling resources, where easier schedules may be feasible. The combination of scheduling time periodicity and routing space periodicity allows us to compute predictable buffering needs to accommodate the throughput. One can then hope in the future to devise techniques for re-allocating and re-routing, similar to today's retiming/recycling approaches.
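
The link between periodic schedules and predictable buffering can be illustrated on a single channel. The sketch below is an assumption-laden simplification, not the team's framework: an ultimately k-periodic schedule is encoded as a pair of binary words (prefix, period), a "1" meaning that the node fires at that instant, and a token may be consumed one instant after it was produced at the earliest. The names `rate`, `unroll` and `buffer_need` are hypothetical, and routing is not modeled.

```python
from fractions import Fraction
from math import gcd


def rate(period):
    """Throughput of a periodic binary word: fraction of active instants."""
    return Fraction(period.count("1"), len(period))


def unroll(prefix, period, n):
    """First n letters of the ultimately periodic word prefix.(period)^omega."""
    repeats = max(0, n - len(prefix)) // len(period) + 1
    return (prefix + period * repeats)[:n]


def buffer_need(producer, consumer):
    """Peak number of tokens stored between a producer and a consumer.

    Each argument is a (prefix, period) pair of binary words. Returns None
    when the rates differ or when the consumer fires on an empty buffer.
    """
    (p_prefix, p_period), (c_prefix, c_period) = producer, consumer
    if rate(p_period) != rate(c_period):
        return None  # unequal rates: unbounded backlog or starvation
    joint = len(p_period) * len(c_period) // gcd(len(p_period), len(c_period))
    # Once both prefixes are over, the backlog repeats every `joint` instants.
    horizon = max(len(p_prefix), len(c_prefix)) + 2 * joint
    backlog = peak = 0
    words = zip(unroll(p_prefix, p_period, horizon),
                unroll(c_prefix, c_period, horizon))
    for p, c in words:
        if c == "1":
            if backlog == 0:
                return None  # consumer fires with no token available
            backlog -= 1
        if p == "1":
            backlog += 1
        peak = max(peak, backlog)
    return peak


print(buffer_need(("", "110"), ("0", "011")))
```

Since both words are ultimately periodic with equal rates, a bounded simulation horizon is enough to obtain the exact peak, which is what makes the buffer size predictable at design time.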

We also considered the relevance of these models in the light of existing tools for high-level synthesis currently being introduced at Texas Instruments, mainly Synfora's Pico Express and Esterel-EDA's Esterel Studio.

This contract started at the end of the year, in the context of the Nano2012 programme and the overall INRIA-ST partnership agreement. The Kick-off meeting was held on December 5th. The DaRT EPI from Lille is also taking part in this collaboration.

The main topic of ID/TL-M is to study the introduction of model-driven engineering (MDE) techniques at the transaction-level modeling (TLM) level of SoC design. Bridges between the OMG profile Marte and the dedicated standard IP-Xact for Electronic System-Level (ESL) design shall be established and realized. Formal Models of Computation and Communication (MoCCs) will be represented, and model transformations across levels shall also be established and realized. The Papyrus UML modeler by CEA and Eclipse environments shall be considered to support these implementations.

We started a collaboration with the Thales Research and Technology group, which designs the new distributed architectures for embedded systems that are used in the different business units of Thales. They are interested in multi-core architectures dedicated to critical embedded systems. A first study focused on the predictable RTOS that will be used on each core.

This ambitious regional initiative is intended to foster collaborations between local PACA industry and academia partners on the topics of microelectronic design, through the pooling of equipment, resources and R&D concerns. We are actively participating in the Design Platform (one of the three platforms launched in this context). Other participants are UNS, CNRS (I3S and LEAT laboratories), L2MP Marseille, CMP-ENSE Gardanne on the academic side, and Texas Instruments, Philips, ST Microelectronics, ATMEL, and Esterel-EDA on the industrial side.

Inside this platform we are coordinating a dedicated project, named Spec2RTL, on methodological flows for high-level SoC synthesis. Participants are Texas Instruments, NXP, ST Microelectronics, Synopsys, Esterel-EDA, and Scaleo Chip as industrial partners, and INRIA, I3S (CNRS/UNSA) and ENST on the academic side. A pool of PhD students is funded on a parity basis by industrial partners and local PACA PhD grants under the BDI programme. There are currently 7 such students, two of them hosted by the Aoste team.

Jean-Vivien Millo, supported in part by ST Microelectronics, defended his PhD thesis on December 17th. His main research topic was static balanced scheduling of data traffic in the LID design of GALS systems, with applications to modular SoC design.

Jean-François Le Tallec started his PhD thesis in connection with Scaleo Chip, a local SME developing SoC platform simulators. The PhD topic is to investigate new
*virtual* platform environments at ESL TLM level, and their relation to formal modeling in multiclock Esterel.

OpenEmbeDD is a large platform project aimed at connecting several formalisms with model-driven engineering tools in the embedded domain. The project partners are: INRIA, CEA-List, Thales, Airbus, France Telecom, CS, LAAS, and VERIMAG. Four INRIA teams are involved (ATLAS, Triskell, Aoste and DaRT).

The focus is on the use of model-driven approaches to combine various specification formalisms, analysis and modeling techniques, into an interoperable framework. We contribute to this in several directions:

first, we provide the definition and implementation of the Marte Time subprofile, as described in , with the support of our TimeSquare tool for resolution and animation of Clock Constraints;

second, we contribute our work on compilation-by-transformation of asynchronous to multiclock to plain synchronous programs;

third, we developed an effective coupling of SynDEx to Marte models. The editor of SynDEx models, based on TopCased under Eclipse with the UML/MARTE profile, was improved in order to represent application algorithms with conditioning and hierarchical descriptions. We also improved the translator from XMI to the .sdx format to support these new features.

The various partner contributions in this project are assembled together by a dedicated engineer team of two people located at IRISA, as part of an INRIA forge.

The Lambda project is headed by Thales, with ST Microelectronics, Airbus, Esterel-EDA, CoFluent, CEA-List, and several other partners.

Our contribution is initially rather light. We bring expertise to help with the definition of a model transformation between SyncCharts and UML State Diagrams. We then contribute to the combination of the SysML and Marte paradigms, in the context of SoC design as well as the SPIRIT IP-Xact standard.

The partners of the MeMVaTEx project are: Continental, INRIA, CEA-List, CNRS-UTC, and Sherpa Engineering. The project focuses on developing a design methodology centred on the requirements, their traceability and their validation. The application domain is the design of complex real-time automotive systems. The methodology is based on the standards EAST-ADL2, Autosar, SysML and Marte. The project is currently centred on the heterogeneous phase. This phase integrates the Simulink and SynDEx tools in order to provide the validation of requirements and models.

During this year, we worked on the solution http://www.artist-embedded.org/artist/model and we integrated in the methodology the joint modeling of the hardware architecture models and the timing modeling at the different levels of the development cycle (i.e. Design Level and Autosar Level). We established the link between the solution functional model and the requirement model for these particular model elements (hardware and timing).

We also worked on the MeMVaTEx demonstrator to illustrate the previous results, by developing the different models and by implementing the Autosar profile and the time package of the Marte profile in the Artisan software tool.

We reported all these results in the deliverables JSP1T2-I-b, JSP1T2-II-a, JSP1T5-II-a, JSP1T6-I-b of the project.

This new project is dedicated to the reverse engineering of analysis traces of simulation and execution back up to the source code, or in our case most likely into the original models in a Marte profile representation. The prime contractor is the Obeo company. A kick-off meeting is due soon.

We conducted a comparative work in this context on Process Networks and MoCCs, submitted for publication.

Frédéric Mallet attended the UML&FM workshop held in Japan with the strong support of Artist.

Robert de Simone was a programme committee member for MemoCode'08. He was on the Selection Board of experts for the ANR programme ARPEGE 2008. He was a member of the
*Commission de Spécialiste 27^{e} section* of the University of Nice-Sophia Antipolis, and INRIA representative to the UNS Doctoral School. He represents INRIA on the CA board of ARCSIS, the governing
association for the CIM PACA, as well as being a member of its Strategic Council. He holds the same positions in the Design Platform branch of this organization. He was a reviewer for the PhD
theses of Ludovic Samper and Jerôme Cornet (Verimag).

Charles André was reviewer of the PhD theses of Sébastien Revol (INPG), Huafeng Yu (LIFL), Cécile Hardebolle (LRI).

Yves Sorel is a programme committee member for the following conferences: DASIP, EUSIPCO, GRETSI. He is a member of the OCDS/SYSTEM@TIC Paris-Region Cluster Steering Committee. He participated in the Habilitation Thesis jury of Laurent George.

Marie-Agnès Peraldi-Frati is a member of the UNS
*Commission de Spécialiste 27^{e} section*. She is a member of the CNRS/I3S conseil de laboratoire and a member of the CERTEC (conseil d'études et de la recherche technologique) of the IUT of Nice-Sophia
Antipolis.

Charles André is a professor at the university of Nice-Sophia Antipolis, department of Electrical Engineering. He taught Computer Architecture, Hardware Description Languages and Real-time programming (1 term).

Julien DeAntoni gives courses at different levels of the polytechnic school of the UNS: a course and labs on micro-controller and real-time operating system
programming in the final year of the engineering curriculum. He also teaches object-oriented programming in C++ in the 4th year of the mathematical and modeling curriculum, as well as Linux shell programming in the second year of the engineering curriculum.

Robert de Simone taught courses on Formal Methods and Models for Embedded Systems in the STIC Research Master program of the university of Nice-Sophia Antipolis (UNS), for approximately 15h.

Marie-Agnès Peraldi-Frati gives courses at different levels at UNS: a course and labs on UML for real-time in the TSM master (Telecommunication, System and Microelectronics) at the University of Nice. She also gives several courses (Systems and networks, Programming, Web development, Computer architecture) at the L1 level of the IUT Informatique.

Yves Sorel gives courses in the final year at ESIEE (engineering school located in Noisy-le-Grand), in the SETI Research Master at the University of Orsay Paris 11, and in the final year at ENSTA (engineering school located in Paris), on topics comprising the AAA methodology and the optimization of distributed real-time embedded systems.