Agnès Lagnoux
Institut de Mathématiques de Toulouse, Université Toulouse 2 Jean Jaurès, Toulouse, France
Global Sensitivity Analysis: a novel generation of mighty estimators based on rank statistics
Abstract: In this talk, I present a new statistical estimation framework for a large family of global sensitivity analysis indices that we have proposed in a recent paper published in 2021. Our approach is based on rank statistics and uses an empirical correlation coefficient recently introduced by Chatterjee. We show how to apply this approach to compute not only the Cramér-von-Mises indices, directly related to Chatterjee's notion of correlation, but also first-order Sobol' indices, general metric space indices and higher-order moment indices. We establish the consistency of the resulting estimators and demonstrate their numerical efficiency, especially for small sample sizes. In addition, we prove a central limit theorem for the estimators of the first-order Sobol' indices.
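The rank-based idea for first-order Sobol' indices can be sketched in a few lines of plain Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the toy model Y = X1 + 2 X2 with independent uniform inputs, the function names, and the sample size are all hypothetical choices. The estimator sorts the sample by the input of interest and averages products of outputs whose inputs are neighbours in rank, so a single input/output sample serves every input.

```python
import random

def rank_sobol_first_order(x, y):
    """Rank-based estimate of the first-order Sobol' index of input x for output y."""
    n = len(y)
    # Reorder the outputs by increasing value of the input of interest.
    order = sorted(range(n), key=lambda j: x[j])
    y_sorted = [y[j] for j in order]
    mean_y = sum(y) / n
    # Average product of outputs whose inputs are neighbours in rank
    # (the largest input is paired with the smallest one).
    cross = sum(y_sorted[j] * y_sorted[(j + 1) % n] for j in range(n)) / n
    second_moment = sum(v * v for v in y) / n
    return (cross - mean_y ** 2) / (second_moment - mean_y ** 2)

def demo(n=20000, seed=0):
    # Hypothetical toy model: Y = X1 + 2 * X2, inputs independent uniform on [0, 1].
    rng = random.Random(seed)
    x1 = [rng.random() for _ in range(n)]
    x2 = [rng.random() for _ in range(n)]
    y = [a + 2.0 * b for a, b in zip(x1, x2)]
    return rank_sobol_first_order(x1, y), rank_sobol_first_order(x2, y)

s1, s2 = demo()
print(f"S1 estimate: {s1:.3f} (exact value 0.2)")
print(f"S2 estimate: {s2:.3f} (exact value 0.8)")
```

For this toy model the exact indices are 1/5 and 4/5, since Var(X1) = 1/12 and Var(2 X2) = 4/12, which gives a quick sanity check on the estimates.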
Juliane Mai
Department of Civil and Environmental Engineering, University of Waterloo, Waterloo, ON, Canada
Towards more general sensitivity estimates: Applications considering model structural uncertainties, grouping of parameters, and large-scale analyses
Abstract: Sensitivities of model outputs are traditionally evaluated for the parameters specific to a given model of interest simulating a specific output, for example, streamflow. This presentation will focus on attempts to obtain more general sensitivity estimates that hold for more than one specific model through (1) the inclusion of model structural uncertainties as parameters in the analysis, (2) grouping parameters such that sensitivities are not parameter specific but process specific, and (3) the deployment of these methods to large regions such that underlying patterns can be identified and transferred to locations that might not have been analysed before.
These approaches have recently been applied to hydrologic models across North America to evaluate the sensitivity of simulated streamflow. This presentation will describe the underlying methods and present results derived from analysing a blended hydrologic model structure, which includes not only parametric but also structural uncertainties, over more than 3000 basins across North America. Furthermore, it will describe how the results for the 3000 basins were used to derive an approximation of sensitivities based on physiographic and climatologic data, such that sensitivities can be estimated without the expensive analysis. The interactive website sharing detailed spatio-temporal inputs and results of this study will be shown.
Art Owen
Department of Statistics, Stanford University, Stanford, CA, U.S.A.
Variable importance and explainable AI
Abstract: In order to explain what a black box algorithm does we can start by studying which variables are important for its decisions. Variable importance is studied by making hypothetical changes to predictor variables. Changing parameters one at a time can produce input combinations that are outliers or very unlikely. They can be physically impossible, or even logically impossible. It is problematic to base an explanation on outputs corresponding to impossible inputs. We introduced the cohort Shapley (CS) measure to avoid this problem, based on Shapley value from cooperative game theory. There are many tradeoffs in picking a variable importance measure, so CS is not the unique reasonable choice. One interesting property of CS is that it can detect 'redlining', meaning the impact of a protected variable on an algorithm's output when that algorithm was trained without the protected variable.
This talk is based on recent joint work with Masayoshi Mase and Ben Seiler. The opinions expressed are my own, and not those of Stanford, the National Science Foundation, or Hitachi, Ltd.
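The cohort idea can be illustrated with a small plain-Python sketch. It assumes exact equality as the similarity rule and uses an invented ten-subject data set; the function names, the data, and the prediction rule are hypothetical and are not taken from the joint work. The value of a feature subset is the mean prediction over the cohort of observed subjects that match the target subject on those features, so only inputs that actually occur in the data are ever used.

```python
from itertools import combinations
from math import factorial

def cohort_mean(rows, preds, target, subset):
    """Mean prediction over subjects matching the target on every feature in subset."""
    cohort = [p for r, p in zip(rows, preds)
              if all(r[j] == rows[target][j] for j in subset)]
    return sum(cohort) / len(cohort)  # never empty: the target matches itself

def cohort_shapley(rows, preds, target):
    """Exact Shapley values of the cohort-mean value function for one target subject."""
    d = len(rows[0])
    phi = [0.0] * d
    for j in range(d):
        others = [k for k in range(d) if k != j]
        for size in range(d):
            weight = factorial(size) * factorial(d - size - 1) / factorial(d)
            for u in combinations(others, size):
                gain = (cohort_mean(rows, preds, target, u + (j,))
                        - cohort_mean(rows, preds, target, u))
                phi[j] += weight * gain
    return phi

# Hypothetical data: features are (neighbourhood, group). The prediction rule
# below uses the neighbourhood only, but group membership is correlated with it.
rows = [("A", "g1")] * 4 + [("A", "g2")] + [("B", "g1")] + [("B", "g2")] * 4
preds = [1.0 if nbhd == "A" else 0.0 for nbhd, _ in rows]

phi = cohort_shapley(rows, preds, target=9)  # target subject is ("B", "g2")
print("neighbourhood:", phi[0], "group:", phi[1])
```

In this toy setting the group feature receives a nonzero share even though the prediction rule never reads it, which is the kind of effect the abstract calls redlining. The two values sum to the difference between the target's prediction and the overall mean prediction, as Shapley values must.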
Samuele Lo Piano
University of Reading, Reading, UK
Understanding the modelling process and model use
Abstract: Models are used to represent systems and their possible evolutions, yielding insights that are translated into decisions on the real system they aim to represent. The steps of model development and use are especially critical when models are used at the policy-making interface. It is not infrequent that the model replaces the modelled system as the locus of attention, or that the use of a given model is extrapolated well beyond the function it was initially conceived for. In these settings, uncertainty and sensitivity analysis, however useful for drawing inferences on a model’s robustness and stability, may be insufficient to acknowledge these kinds of issues, potentially leading to regrettable decisions. In this contribution, I will discuss the approaches proposed for thorough scrutiny of modelling activities and their use at the science-policy interface. I will conclude by examining practical examples in the context of recent initiatives where efforts have been put forth to mainstream these practices and approaches.
Clémentine Prieur
Université Grenoble Alpes, Grenoble, France
(Non)linear dimension reduction of input parameter space using gradient information
Abstract: Many problems that arise in uncertainty quantification, e.g., integrating or approximating multivariate functions, suffer from the curse of dimensionality. The cost of computing a sufficiently accurate approximation indeed grows dramatically with the dimension of the input parameter space. It thus seems important to identify and exploit some notion of low-dimensional structure such as the intrinsic dimension of the model. A function varying primarily along a low-dimensional manifold embedded in the high-dimensional input parameter space is said to have low intrinsic dimension. In that setting, algorithms for quantifying uncertainty that focus on the most relevant features of the input parameter space are expected to reduce the overall cost. Our presentation goes from global sensitivity analysis to (non)linear gradient-based dimension reduction, generalizing the active subspace methodology.
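The linear case that the talk generalizes, the active subspace method, can be sketched as follows. This is a minimal plain-Python illustration under stated assumptions: the toy model, the direction W, the uniform input distribution, and the function names are all hypothetical. The sketch estimates the matrix C = E[grad f(X) grad f(X)^T] by Monte Carlo and extracts its leading eigenvector by power iteration; that eigenvector spans a one-dimensional active subspace.

```python
import math
import random

W = (1.0, 0.5, 0.0)  # hypothetical direction along which the toy model varies most

def grad_f(x):
    """Gradient of the toy model f(x) = sin(w . x) + 0.05 * x[2]**2."""
    c = math.cos(sum(wi * xi for wi, xi in zip(W, x)))
    g = [c * wi for wi in W]
    g[2] += 0.1 * x[2]
    return g

def gradient_covariance(n=5000, seed=0):
    """Monte Carlo estimate of C = E[grad f(X) grad f(X)^T], X uniform on [-1, 1]^3."""
    rng = random.Random(seed)
    d = len(W)
    c = [[0.0] * d for _ in range(d)]
    for _ in range(n):
        g = grad_f([rng.uniform(-1.0, 1.0) for _ in range(d)])
        for i in range(d):
            for j in range(d):
                c[i][j] += g[i] * g[j] / n
    return c

def leading_eigenvector(c, iters=200):
    """Power iteration for the dominant eigenvector of a symmetric PSD matrix."""
    d = len(c)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(c[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(wi * wi for wi in w))
        v = [wi / norm for wi in w]
    return v

direction = leading_eigenvector(gradient_covariance())
print("estimated active direction:", [round(vi, 3) for vi in direction])
```

Because the toy model varies mostly through the scalar w . x, the recovered direction should be close to W after normalization. Projecting the inputs onto it reduces the three-dimensional problem to a nearly one-dimensional one.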
Sébastien Da Veiga
Safran Tech, Paris, France
A kernel-based ANOVA decomposition: extending sensitivity indices and Shapley effects with kernels
Abstract: Global sensitivity analysis is the main quantitative technique for identifying the most influential input variables in a numerical model.
In particular, when the inputs are independent, Sobol’ sensitivity indices attribute a portion of the output variance to each input and all possible interactions in the model, thanks to a functional ANOVA decomposition.
On the other hand, moment-independent sensitivity indices focus on the impact of inputs on the whole output distribution instead of the variance only, thus providing complementary insight on the inputs/output relationship. But they do not enjoy the nice decomposition property of Sobol’ indices and are consequently harder to analyze.
In this talk, we introduce two moment-independent indices based on kernel-embeddings of probability distributions and show that the RKHS framework makes it possible to exhibit a kernel-based ANOVA decomposition.
This is the first time such a desirable property is proved for sensitivity indices apart from Sobol’ ones. With dependent inputs, we also use these new sensitivity indices as building blocks to design kernel-embedding Shapley effects which generalize the traditional ones.
Several estimation procedures are discussed and illustrated on test cases with various output types such as categorical variables and probability distributions. All these examples show their potential for enhancing sensitivity analysis with a kernel viewpoint.
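As a rough illustration of a kernel-based, moment-independent index, the sketch below computes a normalized HSIC (Hilbert-Schmidt independence criterion) between one input and the output in plain Python. It is only a building block under stated assumptions and does not reproduce the ANOVA decomposition or the Shapley effects from the talk: the Gaussian kernel, the fixed bandwidth, the toy model Y = X1^2, and the function names are hypothetical choices.

```python
import math
import random

def centered_gram(values, bandwidth):
    """Doubly centred Gaussian Gram matrix of a list of scalars."""
    n = len(values)
    k = [[math.exp(-(a - b) ** 2 / (2.0 * bandwidth ** 2)) for b in values]
         for a in values]
    row = [sum(r) / n for r in k]  # row means equal column means (k is symmetric)
    total = sum(row) / n
    return [[k[i][j] - row[i] - row[j] + total for j in range(n)] for i in range(n)]

def hsic(kc, lc):
    """Biased (V-statistic) HSIC estimate from two centred Gram matrices."""
    n = len(kc)
    return sum(kc[i][j] * lc[i][j] for i in range(n) for j in range(n)) / n ** 2

def hsic_index(x, y, bandwidth=0.5):
    """Normalized HSIC between input sample x and output sample y, in [0, 1]."""
    kc, lc = centered_gram(x, bandwidth), centered_gram(y, bandwidth)
    return hsic(kc, lc) / math.sqrt(hsic(kc, kc) * hsic(lc, lc))

# Hypothetical toy model: the output depends on x1 only, x2 is inert.
rng = random.Random(0)
n = 200
x1 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
x2 = [rng.uniform(-1.0, 1.0) for _ in range(n)]
y = [a * a for a in x1]

r1, r2 = hsic_index(x1, y), hsic_index(x2, y)
print(f"index of x1: {r1:.3f}, index of x2: {r2:.3f}")
```

The index compares whole distributions through their kernel embeddings rather than variances alone, so the same code applies to any output type for which a kernel is available.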