Web Usage Mining
AWLH for Pre-processing Web Logs
Participants : Yves Lechevallier [co-correspondent], Brigitte Trousse [co-correspondent].
AWLH (AxIS Web Log House) for Web Usage Mining (WUM) is derived from the AxISlogMiner software, which implements the multi-site log preprocessing methodology and the extraction of sequential patterns with low support developed by D. Tanasa in his thesis. In the context of the Eiffel project (2008-2009), we isolated and redesigned the core of the AxISlogMiner preprocessing tool (which we called AWLH), composed of a set of tools for pre-processing web log files. The web log files are cleaned before being used by data mining methods, as they contain many noisy entries (for example, robot requests). The data are stored in a database whose model has been improved.
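The cleaning step described above can be illustrated with a minimal sketch. This is not the AWLH implementation: the log format (Apache-style combined format), the robot hints, the list of resource suffixes and the function names `parse_line`, `is_noise` and `clean_log` are all assumptions chosen for illustration.

```python
import re

# Assumed input: Apache-style "combined" log lines (referrer and agent optional).
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)

# Hypothetical heuristics; a real tool would rely on curated robot lists.
ROBOT_HINTS = ("bot", "crawler", "spider", "slurp")
RESOURCE_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js", ".ico")


def parse_line(line):
    """Return the fields of one log line as a dict, or None if it is malformed."""
    match = LOG_PATTERN.match(line)
    return match.groupdict() if match else None


def is_noise(entry):
    """Flag robot requests and requests for embedded resources."""
    agent = (entry.get("agent") or "").lower()
    path = entry["url"].lower().split("?", 1)[0]
    return (
        path == "/robots.txt"
        or any(hint in agent for hint in ROBOT_HINTS)
        or path.endswith(RESOURCE_SUFFIXES)
    )


def clean_log(lines):
    """Keep only well-formed entries that are not considered noise."""
    entries = (parse_line(line) for line in lines)
    return [e for e in entries if e is not None and not is_noise(e)]
```

In this sketch, malformed lines are silently dropped; a preprocessing tool that stores its results in a database would more likely record them for later inspection.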
So AWLH offers:
An additional tool has been developed for capturing user actions in real time, based on an open source project called "OpenSymphony ClickStream". An extended version of AWLH, called AWLH-Debate, has been developed for recording and structuring data from annotated documents inside discussion forums.
ATWUEDA for Analysing Evolving Web Usage Data
Participants : Yves Lechevallier [correspondent], Brigitte Trousse, Mohamed Gaieb.
ATWUEDA (for Web Usage Evolving Data Analysis) was developed by A. Da Silva in her thesis under the supervision of Y. Lechevallier. The tool is written in Java and uses the JRI library to apply functions of R, a programming language and software environment for statistical computing, from within the Java environment.
ATWUEDA is able to read data from a cross table in a MySQL database. It splits the data according to the user's specifications (into logical or temporal windows) and then applies the approach proposed in Da Silva's thesis in order to detect changes in a dynamic environment. The proposed approach characterizes the changes undergone by the usage groups (e.g. appearance, disappearance, fusion and split) at each timestamp. Graphics are generated for each analyzed window, exhibiting statistics that characterize the change points over time.
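To make the notions of temporal windows and group changes more concrete, here is a simplified sketch. It does not reproduce the approach of Da Silva's thesis: the overlap criterion, the `threshold` parameter and the function names `temporal_windows` and `describe_changes` are assumptions made for illustration, and the usage groups of each window are assumed to have been computed already.

```python
from collections import defaultdict


def temporal_windows(rows, span):
    """Group (timestamp, payload) rows into consecutive windows of `span` time units.

    Windows containing no row are skipped.
    """
    buckets = defaultdict(list)
    for timestamp, payload in rows:
        buckets[timestamp // span].append((timestamp, payload))
    return [buckets[key] for key in sorted(buckets)]


def describe_changes(previous, current, threshold=0.5):
    """Compare the groups of two consecutive windows.

    `previous` and `current` map a group label to a non-empty set of member ids.
    Two groups are linked when their overlap, relative to the smaller group,
    reaches `threshold` (an assumed criterion).
    """
    links = [
        (p, c)
        for p, p_members in previous.items()
        for c, c_members in current.items()
        if p_members and c_members
        and len(p_members & c_members) / min(len(p_members), len(c_members)) >= threshold
    ]
    successors = {p: sorted(c for q, c in links if q == p) for p in previous}
    predecessors = {c: sorted(p for p, d in links if d == c) for c in current}

    changes = []
    for p, succ in successors.items():
        if not succ:
            changes.append(("disappearance", p))
        elif len(succ) > 1:
            changes.append(("split", p, succ))
    for c, pred in predecessors.items():
        if not pred:
            changes.append(("appearance", c))
        elif len(pred) > 1:
            changes.append(("fusion", pred, c))
    return changes


# Toy example with hypothetical groups of user ids.
previous = {"A": {1, 2, 3, 4}, "B": {5, 6}, "C": {7, 8}, "D": {11, 12}}
current = {"X": {1, 2}, "Y": {3, 4}, "Z": {5, 6, 7, 8}, "W": {9, 10}}
changes = describe_changes(previous, current)
```

In this toy example, group A is reported as split into X and Y, groups B and C as fused into Z, group D as disappeared and group W as newly appeared. Groups linked one-to-one are treated as unchanged and are not reported.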
Version 2.of ATWUEDA (September 2009) is available on Inria's GForge website.
In 2011 we demonstrated the efficiency of ATWUEDA by applying it to another real case study: condition-monitoring data streams from an electric power plant, provided by EDF.
ATWUEDA is used by Telecom ParisTech and EDF.
This year we studied how to turn the code of ATWUEDA into a web service for version 1.2 of FocusLab. In the end we gave up this objective, as it would require more resources than we have.