14 Validation

14.1 Residuals

The underlying framework is the same for all cases listed in this section. [ Description of general framework FIXME ]

For models that does not include random effects the calculations can be simplified greatly.

14.1.1 Models without random effects

14.1.1.1 Normal distribution (Pearson residuals) .

This example shows how standardized residuals can be calculated within the template code and reported back to R using the REPORT function in TMB.

// linear regression with reporting of residuals
#include <TMB.hpp>
template<class Type>
Type objective_function<Type>::operator() ()
{
  DATA_VECTOR(Y);
  DATA_VECTOR(x);
  PARAMETER(a);
  PARAMETER(b);
  PARAMETER(logSigma);
  Type sigma = exp(logSigma);
  Vector<Type> pred = a + b*x;
  Type nll = -sum(dnorm(Y, a+b*x, sigma, true));
  Vector<Type> residuals = (Y - pred)/sigma;  
  REPORT(residuals);  
  return nll;
}

Assuming that the model parameters have been fitted, and the model object is called obj, the standardized residuals can now be extracted from the model object usinig the report() function and inspected for normality as follows:

... 
rep <- obj$report()
qqnorm(rep$residuals)
abline(0,1)

14.1.1.2 Non-normal distributions

14.1.1.2.1 Continuous

We now consider situations where the error distribution is continuous but not Gaussian.
Residuals that are standard normal distributed given that the model is correct, can be obtained be using the “transformation trick,” here illustrated using a model that fits a gamma distribution.


#include <TMB.hpp>
template<class Type>
Type objective_function<Type>::operator() ()
{
  DATA_VECTOR(Y);
  PARAMETER(shape);
  PARAMETER(scale);

  Type nll=-dgamma(Y,shape,scale,true).sum();
  vector<Type> residuals = qnorm( pgamma(Y,shape,scale) );
  REPORT(residuals);
  return nll;
}
14.1.1.2.2 Discrete

For discrete probability distributions the transformation trick can also be used, but an element of randomization must be added in order to obtain residuals that are truly Gaussian.

Assume that you have a series of observed counts y and you have fitted some TMB model using a Poisson likelihood, and the predicted values from that model have been reported and saved in a vector called mu.

... 
a <- ppois(y - 1, mu)
b <- ppois(y, mu)
u <- runif(n = length(y), min = a, max = b)
residuals <- qnorm(u)

14.1.2 Models with random effects

Model validation using residuals is considerably more complicated for random effect models. Further information can be found in (Thygesen et al. 2017) FIXME: not generating reference.

14.1.2.1 One-step-ahead residuals

Other names are one step prediction errors, forecast pseudo-residuals, and recursive residuals. These residuals can be computed using the oneStepPredict function. There are several methods available within this function, and it is the responsibility of the user to ensure that an appropriate method is chosen for a given model.

The following examples of its use are availabe in the tmb_examples/validation folder.

Example Description
validation/MVRandomWalkValidation.cpp Estimate and validate a multivariate random walk model with correlated increments and correlated observations.
validation/randomwalkvalidation.cpp Estimate and validate a random walk model with and without drift
validation/rickervalidation.cpp Estimate and validate a Ricker model based on data simulated from the logistic map

14.1.2.2 One sample from the posterior

An alternative (and faster) method is based on a single sample of the random effects from the their posterior distribution given the data. For state space models we can derive both process- and observation errors from the single sample and the observations, and compare these with the assumptions in the model.

An example can be found at the end of the randomwalkvalidation.R file in the tmb_examples/validation folder

14.2 Checking the Laplace approximation

FIXME:

References

Thygesen, Uffe Høgsbro, Christoffer Moesgaard Albertsen, Casper Willestofte Berg, Kasper Kristensen, and Anders Nielsen. 2017. “Validation of Ecological State Space Models Using the Laplace Approximation.” Environmental and Ecological Statistics 24 (2): 317–39.