Model fitting

In model \(H_0\), we inferred a single parameter (\(\phi\)). In model \(H_1\), we estimated 9,679 parameters with a Bayesian inference method using the Stan language (Carpenter et al. 2017) and the R package cmdstanr (Gabry et al. 2025). Posterior evaluation consisted of a preliminary warm-up phase using a Laplace approximation algorithm (10,000 iterations), followed by estimation with a No-U-Turn Sampler run for 2,000 warm-up iterations and 2,000 sampling iterations.

Code
path_fitH0 <- here_rel("results", "models", "H0")
fitH0_chains <- brms::read_csv_as_stanfit(
    list.files(path_fitH0, "\\.csv$", full.names = TRUE)
)
fitH0 <- readRDS(here_rel("results", "models", "H0", "fitH0.rds"))

path_fitH1 <- here_rel("results", "models", "H1")
fitH1_chains <- brms::read_csv_as_stanfit(
    list.files(path_fitH1, "\\.csv$", full.names = TRUE)
)
fitH1 <- readRDS(here_rel("results", "models", "H1", "fitH1.rds"))

Model diagnostics

Convergence

Convergence was assessed using the \(\hat R\) statistic.
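As a minimal illustration of what \(\hat R\) measures, the classic split-\(\hat R\) can be computed in a few lines of base R. This is a sketch, not the rank-normalized version that cmdstanr's summary() actually reports; the simulated draws below stand in for real chains.

```r
# Classic split-R-hat sketch: each chain is split in half, then the
# between-split variance is compared with the within-split variance.
split_rhat <- function(chains) {
    # chains: iterations x chains matrix of draws for one parameter
    half <- floor(nrow(chains) / 2)
    splits <- cbind(chains[1:half, ], chains[(half + 1):(2 * half), ])
    n <- nrow(splits)
    chain_means <- colMeans(splits)
    B <- n * var(chain_means)        # between-split variance
    W <- mean(apply(splits, 2, var)) # within-split variance
    sqrt(((n - 1) / n * W + B / n) / W)
}

set.seed(1)
draws <- matrix(rnorm(4000), nrow = 1000, ncol = 4) # 4 well-mixed mock chains
split_rhat(draws) # close to 1 for stationary, well-mixed chains
```

For chains that have not mixed (e.g. shifted means), the between-split variance inflates the ratio and \(\hat R\) rises above 1.01, which is the threshold used below.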

Code
if (!file.exists(here_rel("notebook", "figs", "rhat_H1.png"))) {
    if (file.exists(here_rel("results", "models", "H1", "Summary_H1.rds"))) {
        summaryH1 <- readRDS(here_rel("results", "models", "H1", "Summary_H1.rds"))
    } else {
        summaryH1 <- fitH1$summary()
        saveRDS(summaryH1, here_rel("results", "models", "H1", "Summary_H1.rds"))
    }
    RhatH1 <- summaryH1$rhat
    names(RhatH1) <- summaryH1$variable

    g <- RhatH1 |>
        as_tibble() |>
        mutate(par = names(RhatH1)) |>
        rename(rhat = value) |>
        ggplot(aes(x = rhat)) +
        geom_density() +
        theme_public() +
        ggtitle(
            paste(length(RhatH1), "parameters"),
            paste(sum(RhatH1 > 1.01, na.rm = TRUE), "with rhat > 1.01")
        )
    ggsave(
        plot = g, filename = here_rel("notebook", "figs", "rhat_H1.png"),
        bg = "white", width = 5, height = 5
    )
}

knitr::include_graphics(here_rel("notebook", "figs","rhat_H1.png"))
Figure 1: Distribution of convergence metrics (R hat) for model H1

The model converged for all parameters (all \(\hat R < 1.01\)).

For the sake of illustration, we show the log-posterior lp__ and five parameters selected at random.

Code
plot_trace(
    fitH1,
    c("lp__", sample(variables(fitH1), 5))
)
Figure 2: Traceplots for the log posterior and five randomly selected parameters from model H1

These plots show that the chains are stable, well-mixed, and converged.

We performed NUTS energy diagnostics, examining the distribution of the sampler’s Hamiltonian “energy” (energy__) across post–warm-up iterations. In Hamiltonian Monte Carlo, the energy is the sum of the potential energy (linked to the log-posterior) and the kinetic energy (from the momentum variables), and its behaviour provides a direct check of whether the sampler is moving efficiently through the posterior. We expect overlapping, unimodal energy distributions across chains, indicating consistent exploration of the same posterior geometry.

Code
bayesplot::mcmc_nuts_energy(nuts_params(fitH1)) + theme_public()
## `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
## `stat_bin()` using `bins = 30`. Pick better value `binwidth`.
Figure 3: NUTS energy diagnostics of model H1

Across the four NUTS chains, we observe the expected pattern. The centered, approximately symmetric shapes suggest no strong pathology in energy transitions and no obvious chain-specific deviations. Minor differences in tail mass are limited, supporting stable sampling behaviour across chains.
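A related numeric summary of the same diagnostic is the energy Bayesian fraction of missing information (E-BFMI): the mean squared change in energy between iterations relative to the marginal variance of the energy. The sketch below uses a simulated energy trace; in the notebook, the per-chain energy__ values would come from nuts_params(fitH1). The 0.3 threshold mentioned in the comments is the value commonly used to flag problematic chains.

```r
# E-BFMI sketch: how large are energy transitions relative to the
# marginal energy distribution? Small values mean the momentum
# resampling cannot move the sampler across energy levels efficiently.
ebfmi <- function(energy) {
    mean(diff(energy)^2) / var(energy)
}

set.seed(2)
energy <- rnorm(2000) # idealized, uncorrelated mock energy trace
ebfmi(energy) # near 2 for an uncorrelated trace; values below ~0.3 flag problems
```

An E-BFMI well above the warning threshold on every chain is the numeric counterpart of the overlapping histograms in Figure 3.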

Posterior predictive adequacy (pseudo-residuals)

We use the pseudo-residual approach implemented in estimate_residuals(), following the method of Gerber and Craig (2024).

Code
if (!all(file.exists(
    here_rel("results", "models", "H0", "Y_H0_fitted.rds"),
    here_rel("results", "models", "H1", "Y_H1_fitted.rds")
))) {
    if (!file.exists(here_rel("data", "raw_data", "Y_raw.rds"))) {
        stop(paste0("missing ", here_rel("data", "raw_data", "Y_raw.rds")))
    } else {
        Y_raw <- readRDS(here_rel("data", "raw_data", "Y_raw.rds"))
        Y_fitted_H0 <- estimate_residuals(fitH0,
            Y_raw |> filter(type == "train"),
            Y_raw |> filter(type == "validation"),
            chunk_size = 100, ncores = 1
        )
        saveRDS(Y_fitted_H0, here_rel("results", "models", "H0", "Y_H0_fitted.rds"))

        Y_fitted_H1 <- estimate_residuals(fitH1,
            Y_raw |> filter(type == "train"),
            Y_raw |> filter(type == "validation"),
            chunk_size = 100, ncores = 1
        )
        saveRDS(Y_fitted_H1, here_rel("results", "models", "H1", "Y_H1_fitted.rds"))
    }
}

Y_fitted_H0 <- readRDS(here_rel("results", "models", "H0", "Y_H0_fitted.rds"))
Y_fitted_H1 <- readRDS(here_rel("results", "models", "H1", "Y_H1_fitted.rds"))

Gerber & Craig pseudo-residuals for multinomial count models are constructed to behave like standard normal residuals when the fitted model is correct. For each observation (count vector):
* (i) generate many replicate count vectors from the fitted multinomial model;
* (ii) map observed and simulated compositions to log-odds using an additive log-ratio transform, after applying a correction that removes zeros (so log-ratios are defined);
* (iii) quantify how unusual the observed log-odds vector is relative to its model-based sampling distribution using a squared Mahalanobis distance (which accounts for covariance among categories);
* (iv) convert the observed distance into a randomised percentile under the empirical distance distribution, then apply a probit transform to obtain a single residual per observation.

Under correct specification, these residuals are approximately (and, aside from parameter-estimation uncertainty, conceptually) standard normal, enabling familiar residual-vs-covariate plots to diagnose general lack of fit in multinomial models.
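Steps (i)–(iv) can be sketched for a single observation in base R. This is an illustrative toy, not the notebook's estimate_residuals(): the fitted probabilities p_hat, the total count, and the +0.5 zero correction are all assumptions made for the example.

```r
# One Gerber & Craig-style pseudo-residual, following steps (i)-(iv) above.
set.seed(3)
p_hat <- c(0.5, 0.3, 0.2) # assumed fitted multinomial probabilities
N <- 50                   # assumed total count for this observation
y_obs <- as.vector(rmultinom(1, N, p_hat))

# (i) many replicate count vectors from the fitted model
reps <- t(rmultinom(2000, N, p_hat))

# (ii) additive log-ratio transform with a simple zero-removing correction
alr <- function(y) { y <- y + 0.5; log(y[-length(y)] / y[length(y)]) }
Z <- t(apply(reps, 1, alr))
z_obs <- alr(y_obs)

# (iii) squared Mahalanobis distances, observed and replicated
d2_obs <- mahalanobis(rbind(z_obs), colMeans(Z), cov(Z))
d2_reps <- mahalanobis(Z, colMeans(Z), cov(Z))

# (iv) randomised percentile of the observed distance, then probit transform
u <- (sum(d2_reps < d2_obs) + runif(1) * (1 + sum(d2_reps == d2_obs))) /
    (length(d2_reps) + 1)
res <- qnorm(u)
res # approximately standard N(0, 1) when the fitted model is correct
```

Repeating this over all observations and plotting the residuals against covariates gives the diagnostics shown in Figures 4 and 5.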

Code
tibble(res_H0 = Y_fitted_H0$res, res_H1 = Y_fitted_H1$res) |>
    ggplot() +
    geom_histogram(aes(x = res_H0), fill = "darkred", color = "darkred",
        bins = 50, alpha = 0.5) +
    geom_histogram(aes(x = res_H1), fill = "darkblue", color = "darkblue",
        bins = 50, alpha = 0.5) +
    geom_vline(xintercept = 0, linetype = "dashed") +
    labs(x = "Pseudo-residual", y = "Count",
        title = "Distribution of pseudo-residuals",
        subtitle = expression(Red: H[0] - Blue: H[1])) +
    theme_public()
Figure 4: Distribution of pseudo-residuals for H0 and H1

Analysis of the pseudo-residuals (Figure 4) shows an unbiased, zero-centred distribution for model \(H_1\), whereas model \(H_0\) presents a severely biased distribution.

Code

if (!file.exists(here_rel("notebook", "figs", "fig-res_env.png"))) {
    res_env <- Y_fitted_H1 |>
        sf::st_drop_geometry() |>
        select(type, Clim1, Clim2, GeomSub, SWI, res)

    # Helper: pseudo-residuals against one (possibly transformed) predictor
    plot_res <- function(data, x_expr, x_lab) {
        ggplot(data, aes(color = type, fill = type, x = {{ x_expr }}, y = res)) +
            geom_point(size = 0.5) +
            geom_smooth(method = "lm") +
            geom_vline(aes(xintercept = mean({{ x_expr }})),
                linetype = "dashed", color = "black"
            ) +
            ylab("Gerber & Craig pseudo-residuals") +
            xlab(x_lab) +
            scale_color_manual("Split datasets", values = c("train" = "darkblue", "validation" = "darkred")) +
            scale_fill_manual("Split datasets", values = c("train" = "darkblue", "validation" = "darkred"))
    }

    p1 <- plot_res(res_env, Clim1, "Clim1")
    p2 <- plot_res(res_env, Clim2, "Clim2")
    p3 <- plot_res(res_env, log(SWI + 1), expression(log(SWI + 1)))

    g <- patchwork::wrap_plots(p1 | p2, p3, ncol = 1) +
        patchwork::plot_layout(guides = "collect") &
        theme(legend.position = "bottom")
    ggsave(plot = g, filename = here_rel("notebook", "figs", "fig-res_env.png"), bg = "white", width = 7, height = 10)
}
knitr::include_graphics(here_rel("notebook", "figs", "fig-res_env.png"))
Figure 5: Distribution of residuals across environmental predictors from model H1

Analysis of the pseudo-residuals (Figure 5) shows an unbiased, zero-centred distribution across the quantitative variables \(Clim_1\), \(Clim_2\) and \(SWI\) in the validation dataset.

Predictive performance

We quantify predictive improvement relative to the null model \(H_0\) from the pointwise log-likelihoods, and report a McFadden pseudo-R²: \(\rho^{2} = 1-\dfrac{\log(\mathcal{L}(H_1))}{\log(\mathcal{L}(H_0))}\)
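As a minimal numeric illustration of the statistic (with made-up log-likelihood totals, not the notebook's values):

```r
# Toy McFadden pseudo-R2: one minus the ratio of the fitted model's
# total log-likelihood to the null model's total log-likelihood.
mcfadden_rho2 <- function(logL_H1, logL_H0) 1 - logL_H1 / logL_H0

mcfadden_rho2(logL_H1 = -7900, logL_H0 = -10000) # 0.21, i.e. 21 %
```

In the notebook this ratio is computed per posterior draw from the summed pointwise log-likelihoods, and the 2.5%, 50% and 97.5% quantiles of the resulting draws give the credible intervals reported in the table.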

Code

if (!all(file.exists(c(
    here_rel("results", "models", "H0", "LL_H0.rds"),
    here_rel("results", "models", "H1", "LL_H1.rds")
)))) {

    if (!file.exists(here_rel("data", "raw_data", "Y_raw.rds"))) {
        stop(paste0("missing ", here_rel("data", "raw_data", "Y_raw.rds")))
    } else {
        Y_raw <- readRDS(here_rel("data", "raw_data", "Y_raw.rds"))

        Y_train <- Y_raw |> filter(type == "train")
        Y_val <- Y_raw |> filter(type == "validation")
        fitH0$data$y <- fitH1$data$y <- Y_train$y

        LL_H0 <- estimate_LL(
            fit = fitH0,
            Y_train = Y_train,
            Y_val = Y_val,
            chunk_size = 100, ncores = 1
        )

        LL_H1 <- estimate_LL(
            fit = fitH1,
            Y_train = Y_train,
            Y_val = Y_val,
            chunk_size = 100, ncores = 1
        )

        saveRDS(LL_H0, here_rel("results", "models", "H0", "LL_H0.rds"))
        saveRDS(LL_H1, here_rel("results", "models", "H1", "LL_H1.rds"))
    }
}


LL_H0 <- readRDS(here_rel("results", "models", "H0", "LL_H0.rds"))
LL_H1 <- readRDS(here_rel("results", "models", "H1", "LL_H1.rds"))

rho_train <- 100 * quantile(1 - (rowSums(t(LL_H1$LL_train_pw)) / rowSums(t(LL_H0$LL_train_pw))), c(0.5, 0.025, 0.975))
rho_val <- 100 * quantile(1 - (rowSums(t(LL_H1$LL_val_pw)) / rowSums(t(LL_H0$LL_val_pw))), c(0.5, 0.025, 0.975))

rbind(rho_train, rho_val) |>
    as_tibble() |>
    mutate(Dataset = c("Train", "Validation")) |>
    select("Dataset", `50%`, `2.5%`, `97.5%`) |>
    gt() |>
    fmt_number(columns = c(`50%`, `2.5%`, `97.5%`), decimals = 2) |>
    opts_theme()
Pseudo-R² (McFadden, %) of model H1 in train and validation datasets

Dataset      50%     2.5%    97.5%
Train        20.86   20.80   20.91
Validation   19.26   19.07   19.44

We found a median \(\rho^{2} \approx 20\,\%\), which indicates a good fit for a multinomial model (Domencich and McFadden 1975).

References

Carpenter, Bob, Andrew Gelman, Matthew D. Hoffman, Daniel Lee, Ben Goodrich, Michael Betancourt, Marcus Brubaker, Jiqiang Guo, Peter Li, and Allen Riddell. 2017. “Stan: A Probabilistic Programming Language.” Journal of Statistical Software 76 (January): 1–32. https://doi.org/10.18637/jss.v076.i01.
Domencich, Thomas A., and Daniel McFadden. 1975. Urban Travel Demand: A Behavioral Analysis: A Charles River Associates Research Study. Contributions to Economic Analysis 93. Amsterdam : New York: North-Holland Pub. Co. ; American Elsevier.
Gabry, Jonah, Rok Češnovar, Andrew Johnson, and Steve Bronder. 2025. “Cmdstanr: R Interface to ’CmdStan’.”
Gerber, Eric A. E., and Bruce A. Craig. 2024. “Residuals and Diagnostics for Multinomial Regression Models.” Statistical Analysis and Data Mining: The ASA Data Science Journal 17 (1): e11645. https://doi.org/10.1002/sam.11645.