A generic Bayesian approach for the calibration of advanced snow avalanche models with application to real-time risk assessment conditional to snow

PhD student : Maria Belen HEREDIA, Irstea/CEN, oct. 2017 - oct. 2020
Supervision : Nicolas Eckert (Irstea
Funding : 50% Labex
Doctoral School : Terre Univers Environnement


Real snow avalanche flows are complex phenomena because of the changing nature of the fluid involved and, more broadly, the highly nonlinear response of snow avalanche activity to snow and weather drivers (Schweizer et al., 2003). Evidence obtained on full-scale experimental slopes (Sovilla et al., 2008 ; 2010a ; 2010b ; Vriend et al., 2013 ; Thibert et al., 2015) and in small-scale experiments (Casassa et al., 1989 ; Rognon et al., 2008a) continuously refines our understanding of the processes at play, but all existing avalanche models, even the most advanced ones, still rely on ad hoc formulations to a certain degree. This especially concerns the mechanical behavior of snow in motion, which more often than not remains modeled with Voellmy’s historical proposal (1955), notably in state-of-the-art shallow-water tools for hazard mapping (Bartelt et al., 1999 ; Naaim et al., 2004). Things are not really different for full three-dimensional models, in which some free parameters still need to be fixed on the basis of the modeler’s skill so that simulations look realistic. A direct consequence is that on-site calibration using available archival records remains unavoidable today to predict high-return-period avalanches in long-term forecasting. Also, real-time and/or short-term forecasting remains limited to avalanche activity indices that do not consider the potential avalanche extent and dynamic characteristics conditional on snow conditions. Hence, for basic understanding, model comparison and validation, and risk assessment improvement, there is an urgent need to better relate, on a sound mathematical basis, increasingly comprehensive datasets to increasingly advanced models. To this end, the probability of the data can be maximised with respect to the model’s parameters (Fisher, 1925 ; Fisher, 1934).
Asymptotic likelihood theory then provides approximately normally distributed estimators, with standard errors and confidence intervals that can be used in hypothesis testing and probabilistic forecasts (Fisher, 1922 ; Neyman & Pearson, 1933).
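As a purely illustrative sketch of the likelihood approach described above, the following Python snippet fits a one-parameter exponential model by maximum likelihood and builds an approximate 95% confidence interval from the asymptotic normal approximation. The exponential model, the function name `exponential_mle` and the sample values are assumptions made for illustration; they are not avalanche data and not the model used in this project.

```python
import math

def exponential_mle(sample):
    """Return (rate_hat, std_error, (lower, upper)) for an exponential model.

    The log-likelihood is n*log(rate) - rate*sum(sample), maximised at n/sum(sample).
    """
    n = len(sample)
    rate_hat = n / sum(sample)
    # Asymptotic standard error from the Fisher information n / rate^2
    std_error = rate_hat / math.sqrt(n)
    # Approximate 95% interval using the normal quantile 1.96
    half_width = 1.96 * std_error
    return rate_hat, std_error, (rate_hat - half_width, rate_hat + half_width)

# Hypothetical sample, invented for illustration only
sample = [0.5, 1.5, 1.0, 2.0, 0.25, 0.75]
rate_hat, std_error, interval = exponential_mle(sample)
print(rate_hat, std_error, interval)
```

With only six observations the normal approximation is rough, which is precisely the kind of small-sample limitation that motivates a non-asymptotic alternative.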
Bayesian statistics is an alternative avenue of thought, leading to credible intervals, the Bayesian counterpart of confidence intervals, which are based on the posterior pdf instead of the likelihood function and provide a fair (non-asymptotic) quantification of the actual state of knowledge given the data, prior and modelling assumptions. For most models, Bayesian and frequentist analyses lead asymptotically to the same inferential results (Berger, 1985). Besides, uninformative priors can be used to let the data speak for themselves (Bernardo and Smith, 1994). These two arguments have more or less closed the debate, with Bayesian statistics now being generally accepted as a reasonable option in environmental and social sciences (Krzysztofowicz, 1983 ; Gelman et al., 1995 ; Berliner, 2003 ; Clark, 2005). From a more practical point of view, the question of how to compute the normalising constant in Bayes’ theorem long limited Bayesian analysis to toy models, for which the posterior could be obtained explicitly using conjugate distributions.
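For reference, Bayes' theorem and the normalising constant mentioned above can be written as follows. The notation is an assumption introduced here and does not come from the source: \theta denotes the model parameters and y the data.

```latex
\[
p(\theta \mid y) = \frac{p(y \mid \theta)\, p(\theta)}{p(y)},
\qquad
p(y) = \int p(y \mid \theta)\, p(\theta)\, \mathrm{d}\theta .
\]
% A standard conjugate case, given only as an illustration:
% a Beta(a, b) prior with k successes in n binomial trials.
\[
\theta \sim \mathrm{Beta}(a, b), \quad k \mid \theta \sim \mathrm{Binomial}(n, \theta)
\;\Longrightarrow\;
\theta \mid k \sim \mathrm{Beta}(a + k,\; b + n - k).
\]
```

In the conjugate case the integral p(y) never has to be computed numerically, which is why such models were long the only tractable ones.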
Nowadays, several algorithms are well suited to overcoming these computational difficulties (Brooks, 2003) and, among these, Markov Chain Monte Carlo (MCMC) methods are the most popular (Robert and Casella, 1998 ; Gelman and Rubin, 1992 ; Gilks et al., 2001). In practice, writing MCMC algorithms that converge reasonably quickly requires some practical skill in addition to theoretical knowledge. On the other hand, MCMC methods make it possible to overcome the inferential challenge, even with highly complex numerical simulation codes (Oakley & O’Hagan, 2004), notably using tricks such as data-augmentation techniques (Tanner, 1992). Additional tools are available within the same framework, providing a complete toolbox for model selection, model checking and hypothesis testing : information criteria such as the Bayesian information criterion (BIC), dedicated model selection scores (Bayes factors, Kass & Raftery, 1995), goodness-of-fit diagnostics (posterior predictive p-values, Gelman et al., 1996), etc. Also, the link to decision theory is direct (Von Neumann & Morgenstern, 1954 ; Pratt et al., 1964), which is valuable for risk problems (Eckert et al., 2012). After deterministic inversion methods had shown their limits (Ancey et al., 2003), the Bayesian framework started to be considered as an appealing alternative for model calibration in snow science (Ancey, 2005 ; Gauer et al., 2009 ; Eckert et al., 2007 ; 2008 ; 2009 ; 2010 ; Fischer et al., 2014 ; Schläppy et al., 2014). Yet, existing implementations have to date generally remained limited to rather simple avalanche models and coarse field data (e.g., samples of runout distances supplemented by input conditions). And when more comprehensive data sets have been considered, improper likelihood formulations have been used (Fischer et al., 2015).
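To make the MCMC idea mentioned above concrete, here is a minimal random-walk Metropolis sketch in Python. It is a toy example under stated assumptions, not the calibration scheme of this project: the target is the posterior of a single mean parameter `mu` under a Normal(mu, 1) likelihood and a wide Normal(0, 10) prior, and the function names, tuning values and data are all hypothetical. It shows that only an unnormalised posterior is needed, since the normalising constant cancels in the acceptance ratio.

```python
import math
import random

def log_posterior(mu, data, prior_sd=10.0):
    """Unnormalised log posterior: Normal(mu, 1) likelihood, Normal(0, prior_sd) prior."""
    log_lik = -0.5 * sum((y - mu) ** 2 for y in data)
    log_prior = -0.5 * (mu / prior_sd) ** 2
    return log_lik + log_prior

def metropolis(data, n_iter=20000, step=0.5, start=0.0, seed=0):
    """Random-walk Metropolis sampler for the single parameter mu."""
    rng = random.Random(seed)
    current = start
    current_lp = log_posterior(current, data)
    chain = []
    for _ in range(n_iter):
        proposal = current + rng.gauss(0.0, step)
        proposal_lp = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio); the normalising constant cancels
        if rng.random() < math.exp(min(0.0, proposal_lp - current_lp)):
            current, current_lp = proposal, proposal_lp
        chain.append(current)
    return chain

# Hypothetical observations, invented for illustration only
data = [1.8, 2.3, 2.1, 1.9, 2.4, 1.7, 2.0, 2.2]
chain = metropolis(data)
draws = sorted(chain[2000:])  # discard the first 2000 iterations as burn-in
post_mean = sum(draws) / len(draws)
# Empirical 95% credible interval from the sorted posterior draws
ci = (draws[int(0.025 * len(draws))], draws[int(0.975 * len(draws))])
print(post_mean, ci)
```

The proposal step size (`step`) and the burn-in length are the kind of tuning choices that, as noted above, require practical skill: too small a step gives slow exploration, too large a step gives many rejections.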
Regarding snow conditions, although empirical links have been documented using large data sets (Naaim et al., 2013), the weakness of these links and the limitations of the calibration method used keep real-time avalanche dynamics forecasting conditional on snow conditions out of reach.