Section: New Results

Fault Handling, compensations and transactions

Participants : Mila Dalla Preda, Maurizio Gabbrielli, Ivan Lanese, Jacopo Mauro, Gianluigi Zavattaro.

A defining property of CBUS is the loose coupling among components: they can connect and disconnect dynamically, and can be modified or updated at run time. It is thus important to handle unexpected events, called faults.

In [30] we have studied the problem of fault handling in object-oriented languages of the kind developed in the EU Hats project; notably, these languages have asynchronous method calls whose results are returned inside futures. We have proposed an extension of those languages in which futures are also used to return fault notifications and to coordinate error recovery between the caller and the callee. This can be exploited to ensure that invariants involving many objects are restored after faults.
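The idea of a future carrying a fault notification back to the caller can be illustrated with a minimal Python sketch. It is not the language extension of [30]: the `Account` class, the `transfer` function and the invariant (the sum of the two balances stays constant) are hypothetical names chosen for illustration, and Python's standard `concurrent.futures` stands in for the asynchronous calls of the Hats languages.

```python
from concurrent.futures import ThreadPoolExecutor


class Account:
    """Hypothetical object taking part in a multi-object invariant."""

    def __init__(self, balance, closed=False):
        self.balance = balance
        self.closed = closed

    def deposit(self, amount):
        # The callee signals a fault by raising; the future records it.
        if self.closed:
            raise RuntimeError("account closed")
        self.balance += amount


def transfer(executor, src, dst, amount):
    """Move `amount` from src to dst, keeping the total balance invariant."""
    src.balance -= amount
    future = executor.submit(dst.deposit, amount)  # asynchronous method call
    fault = future.exception()  # waits for completion; None if no fault
    if fault is not None:
        # The fault notification arrived through the future:
        # the caller undoes its own half of the operation.
        src.balance += amount
        return False
    return True
```

Here the caller learns about the callee's fault only through the future and reacts by restoring its own state, so that the invariant spanning both objects holds again after the failed call.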

Traditional fault handling mechanisms, including those based on try-catch operators, do not seem sufficient to deal with the non-local errors and failures of distributed systems. At the application level, more advanced transactional models and primitives are needed to guarantee the integrity and continuity of the whole system. We study approaches based on long running transactions and compensations. A long running transaction is a computation that either terminates successfully or aborts. In case of abort, a compensation is executed to bring the system back to a consistent state. In [53], extending work started last year, we make a thorough comparison among different approaches to the specification of compensations, in particular static forms of recovery, where the compensation is statically defined together with the transaction, and dynamic forms, where the compensation is progressively built along with the computation.
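The difference between static and dynamic compensations can be sketched as follows. This is only an illustration in Python, not one of the calculi compared in [53]: the names `Abort`, `run_static` and `run_dynamic` are hypothetical, and running the dynamically collected compensations in reverse order of installation is an assumption made for the example.

```python
class Abort(Exception):
    """Raised by a step to abort the enclosing transaction."""


def run_static(steps, compensation):
    """Static recovery: a single compensation is fixed with the transaction."""
    try:
        for step in steps:
            step()
    except Abort:
        compensation()
        return "compensated"
    return "committed"


def run_dynamic(steps):
    """Dynamic recovery: the compensation grows as the computation proceeds.

    `steps` is a list of (action, undo) pairs; an undo is installed only
    after its action has completed.
    """
    undo_stack = []
    try:
        for action, undo in steps:
            action()
            undo_stack.append(undo)
    except Abort:
        # Assumption: undo completed steps in reverse order.
        while undo_stack:
            undo_stack.pop()()
        return "compensated"
    return "committed"
```

In the dynamic variant the compensation executed on abort depends on how far the transaction has progressed, whereas the static variant always runs the same recovery code regardless of which steps completed.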

We have also continued our study of faults and compensations in Service Oriented Computing. The approach to the interplay between bi-directional request-response interaction and faults proposed in our past work on the Jolie language supported the idea that the bi-directional pattern should not be interrupted in case of faults. However, this may cause long delays or even deadlocks if the communicating partner disappears. By contrast, the approach of WS-BPEL causes no delay, but it does not allow the remote activity to be compensated. In [38] we have investigated an intermediate approach in which the fault handler does not need to wait for the response, but it is still possible, when the response arrives, to gracefully close the conversation with the remote service.
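The intermediate approach can be pictured with a small Python sketch. It is not Jolie or WS-BPEL code and does not reproduce the semantics of [38]: the `Conversation` class and the `close_remote` callback are hypothetical, and the sketch only shows the two properties described above, namely that the fault handler runs without waiting for the response, and that a late response is still used to close the conversation.

```python
class Conversation:
    """Hypothetical client side of one request-response exchange."""

    def __init__(self, close_remote):
        # close_remote: callable invoked with a late response, e.g. to ask
        # the remote service to compensate its activity (assumption).
        self.close_remote = close_remote
        self.faulted = False
        self.log = []

    def on_fault(self, reason):
        # The handler runs immediately: it does not block on the response.
        self.faulted = True
        self.log.append("fault handled: " + reason)

    def on_response(self, value):
        if self.faulted:
            # The response arrived after the fault: instead of dropping it,
            # use it to close the conversation gracefully.
            self.log.append("late response: closing conversation")
            self.close_remote(value)
            return None
        self.log.append("response accepted")
        return value
```

If the partner disappears, `on_response` is simply never called and the client is not blocked; if the response does arrive, the remote activity can still be compensated.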

A related work, mainly developed in 2010, is [21].