WAM - 2011 - Annual activity report

WAM

WAM - 2011

Project Team Wam

Members

Overall Objectives

Objectives

Scientific Foundations

Application Domains

Application Domains

Software

New Results

Partnerships and Cooperations

Dissemination

Bibliography

Publications of the year

Previous |

Home | Next next

Section: New Results

XML Processing

Mature results about XML processing were obtained along three main directions: the formalization and implementation for checking the impact of schema evolution on validation and queries; logical extensions supporting a notion of counting and the shuffle operator in trees; and the decision of a subtyping relation for a very expressive type algebra supporting a notion of polymorphism.

In addition, preliminary results were obtained on the definition of a rigorous logical framework for the static analysis of semantic web languages, on the static analysis of cascading style sheets, and on the equipment of an IDE with new static analysis features for XQuery.

Impact of XML schema evolution

In the ever-changing context of the web, XML schemas continuously change in order to cope with the natural evolution of entities they describe. Schema changes have important consequences. First, existing documents valid with respect to the original schema are no longer guaranteed to fulfill the constraints described by the evolved schema. Second, the evolution also impacts programs manipulating documents whose structure is described by the original schema.

We have proposed a unifying framework for determining the effects of XML Schema evolution both on the validity of documents and on queries [2] . The system is very powerful in analyzing various scenarios in which forward/backward compatibility of schemas is broken, and in which the result of a query may not be anymore what was expected. Specifically, the system offers a predicate language which allows one to formulate properties related to schema evolution. The system then relies on exact reasoning techniques to perform a fine-grained analysis. This yields either a formal proof of the property or a counter-example that can be used for debugging purposes. The system has been fully implemented and tested with real-world use cases, in particular with the main standard document formats used on the web, as defined by W3C. The system identifies precisely compatibility relations between document formats. In case these relations do not hold, the system can identify queries that must be reformulated in order to produce the expected results across successive schema versions.

Counting in trees

A major challenge of query language design is the combination of expressivity with effective static analyses such as query containment. In the setting of XML, documents are seen as finite trees, whose structure may additionally be constrained by type constraints such as those described by an XML schema. We have considered the problem of query containment in the presence of type constraints for a class of regular path queries extended with counting and interleaving operators [1] . The counting operator restricts the number of occurrences of children nodes satisfying a given logical property. The interleaving operator provides a succinct notation for describing the absence of order between nodes satisfying a logical property. We have proposed a logic supporting these operators, which can be used to solve common query reasoning problems such as satisfiability and containment of queries in exponential time [4] .

Typing higher-order programs

We have considered a type algebra equipped with recursive, product, function, intersection, union, and complement types together with type variables and universal quantification over them. We have defined the subtyping relation between such type expressions, and have proved its decidability[9] .

This has solved an open problem that was attracting a considerable research effort. The novelty, originality and strength of our solution reside in introducing a logical modeling for the semantic subtyping framework. We have modeled semantic subtyping in a tree logic and use a satisfiability-testing algorithm in order to decide subtyping. We have shown how the subtyping relation can be decided in EXPTIME. We have reported on practical experiments made with a full implementation of the system. This has provided a powerful polymorphic type system aiming at maintaining full static type-safety of functional programs that manipulate trees, even with higher-order functions, which is particularly useful in the context of XML.

Detection of inconsistent paths and dead code in XML IDEs

One of the challenges in web software development is to help achieving a good level of quality in terms of code size and runtime performance, for increasingly popular domain specific languages such as XQuery. We have presented an IDE equipped with static analysis features for assisting the programmer [8] . These features are capable of identifying and eliminating dead code automatically. The tool is based on newly developed formal programming language verification techniques, which are now mature enough to be introduced in the process of software development.

Static analysis of semantic web languages

We work with the Exmo project-team on the static analysis of semantic web languages such as RDF, OWL and SPARQL by investigating modal logics over graphs. We seek to build a rigorous logical reasoning framework based on $μ$ -calculus adapted for the web semantic languages [7] [11] . In particular, we studied the containment problem for SPARQL queries: determining whether, for any graph, the answers to a query are contained in those of another query. Our approach consists in encoding RDF graphs as transition systems and queries as $μ$ -calculus formulas and then reducing the containment problem to testing satisfiability in the logic.

Static analysis of style sheets

Developing and maintaining cascading style sheets (CSS) is an important issue to web developers as they suffer from the lack of rigorous methods. Most existing means rely on validators that check syntactic rules, and on runtime debuggers that check the behavior of a CSS style sheet on a particular document instance. However, the aim of most style sheets is to be applied to an entire set of documents, usually defined by some schema. To this end, a CSS style sheet is usually written w.r.t. a given schema. While usual debugging tools help reducing the number of bugs, they do not ultimately allow to prove properties over the whole set of documents to which the style sheet is intended to be applied.

We have proposed a novel approach to fill this lack [14] by analyzing CSS style sheets using the same logic and compile-time verification technique we use for other XML problems. We have developed an original tool based on our XML Reasoning Solver (see section 5.2 ). The tool is capable of statically detecting a wide range of errors (such as empty CSS selectors and semantically equivalent selectors), as well as proving properties related to sets of documents (such as coverage of styling information), in the presence or absence of schema information. This new tool can be used in addition to existing runtime debuggers to ensure a higher level of quality of CSS style sheets.

Previous |

Home | Next next