CQFD - 2019 - Annual Activity Report

Project-Team Cqfd

Section: New Results

Combining clustering of variables and feature selection using random forests: the CoV/VSURF procedure

Abstract: Standard approaches to high-dimensional supervised classification problems often include variable selection and dimension reduction procedures. The novel methodology proposed in this paper combines clustering of variables and feature selection. More precisely, a hierarchical clustering of variables builds groups of correlated variables in order to reduce the redundancy of information, and summarizes each group by a synthetic numerical variable. The originality of the approach is that the groups of variables (and the number of groups) are unknown a priori. Moreover, the clustering approach can deal with both numerical and categorical variables (i.e. mixed datasets). Among all the possible partitions resulting from dendrogram cuts, the most relevant synthetic variables (i.e. groups of variables) are selected with a variable selection procedure using random forests. The numerical performance of the proposed approach is compared with direct applications of random forests and of variable selection using random forests on the original p variables. The improvements obtained with the proposed methodology are illustrated on two simulated mixed datasets (cases n>p and n<p, where n is the sample size) and on a real proteomic dataset. Because selection operates on groups of variables (through the synthetic variables), the results become easier to interpret.

Authors: Marie Chavent (CQFD), Robin Genuer (SISTM), Jerome Saracco (CQFD)
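The first stage described in the abstract (grouping correlated variables, then summarizing each group by one synthetic variable) can be illustrated with a minimal sketch. This is not the authors' implementation and it rests on several assumptions: it handles numerical variables only, it uses average-linkage clustering on the dissimilarity 1 - r² as a stand-in for the paper's clustering criterion, and it builds each synthetic variable as a sign-aligned mean of standardized members. The toy data and all function names (`pearson`, `standardize`, `cluster_variables`, `synthetic_variable`) are hypothetical. The second stage, selecting synthetic variables with random forests, is not shown.

```python
import math
import random


def pearson(a, b):
    """Pearson correlation between two equal-length lists of numbers."""
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / math.sqrt(va * vb)


def standardize(a):
    """Center a variable and scale it to unit variance."""
    n = len(a)
    m = sum(a) / n
    s = math.sqrt(sum((x - m) ** 2 for x in a) / n)
    return [(x - m) / s for x in a]


def cluster_variables(columns, n_groups):
    """Agglomerative clustering of variables, cut at n_groups groups.

    Assumption: average linkage on the dissimilarity 1 - r^2, used here
    as a simple substitute for the clustering criterion of the paper.
    """
    p = len(columns)
    d = [[1.0 - pearson(columns[i], columns[j]) ** 2 for j in range(p)]
         for i in range(p)]
    groups = [[j] for j in range(p)]
    while len(groups) > n_groups:
        best = None
        for a in range(len(groups)):
            for b in range(a + 1, len(groups)):
                pairs = len(groups[a]) * len(groups[b])
                link = sum(d[i][j] for i in groups[a] for j in groups[b]) / pairs
                if best is None or link < best[0]:
                    best = (link, a, b)
        _, a, b = best
        groups[a] = sorted(groups[a] + groups[b])  # merge the closest pair
        del groups[b]
    return sorted(groups)


def synthetic_variable(columns, group):
    """Summarize a group by the mean of its standardized members.

    Each member is sign-aligned with the first member of the group so
    that negatively correlated variables do not cancel out.
    """
    ref = columns[group[0]]
    total = [0.0] * len(ref)
    for j in group:
        sign = 1.0 if pearson(ref, columns[j]) >= 0 else -1.0
        for i, z in enumerate(standardize(columns[j])):
            total[i] += sign * z
    return [t / len(group) for t in total]


# Hypothetical toy data: six variables driven by two independent latent factors.
rng = random.Random(0)
n = 200
z1 = [rng.gauss(0, 1) for _ in range(n)]
z2 = [rng.gauss(0, 1) for _ in range(n)]
columns = [[z + 0.3 * rng.gauss(0, 1) for z in latent]
           for latent in (z1, z1, z1, z2, z2, z2)]

groups = cluster_variables(columns, n_groups=2)
synthetic = [synthetic_variable(columns, g) for g in groups]
print(groups)
```

In the procedure summarized above, the number of groups is not fixed in advance: each dendrogram cut yields a candidate set of synthetic variables, and the random forest selection step decides which ones to keep. The fixed `n_groups=2` here is only for illustration.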
