TMB Documentation
v1.9.11
|
Custom functions and derivatives can be added to the TMB library. This may be necessary for the following reasons:
TMB uses CppAD as its engine for reverse mode derivatives. In order to add a new primitive function
\[f: R^n \rightarrow R^m\]
we must inform CppAD how to calculate derivatives of this function in reverse mode. That is, for any range space vector \(w \in R^m\) we must calculate the gradient of the function \(R^n \rightarrow R\) given by
\[ x \rightarrow \text{sum}( f(x) \odot w ) \]
where '$$' is pointwise multiplication.
As an example consider the Lambert W function defined implicitly by
\[y = W(y e^y)\]
Here, we only consider \(W\) as defined on the positive reals. It follows, by differentiating the above identity, that
\[ W'(x) = \frac{1}{ \exp\left(W(x)\right) \left(1 + W(x)\right) } \]
When coding reverse-mode derivatives we can assume that the function value \(W(x)\) has already been computed during a forward pass. For efficiency reasons we should use this intermediate calculation rather than re-calculating \(W(x)\) in the reverse pass.
We'll assume that a plain C++ function (taking double types as input/output) is available to calculate \(W(x)\). It doesn't matter whether you have the source code of an implementation or just the header with linkage to an external library:
The macro TMB_ATOMIC_VECTOR_FUNCTION()
is used to declare our new primitive Lambert \(W\) function:
Let's explain in detail what is going on. The macro takes four arguments:
ATOMIC_NAME
: Name of new primitive function taking CppAD::vector
as input and output.OUTPUT_DIM
: Dimension of the CppAD::vector
which is the function output.ATOMIC_DOUBLE
: Specifies how to evaluate the primitive function for the ordinary double type. tx
denotes the input vector and ty
the output vector of the function \(f: R^n \rightarrow R^m\). In this case both have dimension one.ATOMIC_REVERSE
: How to calculate the reverse mode derivatives for a general Type
. Again tx
and ty
denote function input and output but now ty
has been computed and is available as an intermediate value. The vectors px
and py
denote partial derivatives of the end result with respect to \(x\) and \(y\) respectively. py
is given and we must calculate px
using the chain rule. This first order derivative rule is automatically expanded up to higher orders required when using TMB's random effects calculations.To make the function work like other TMB functions it is convenient to define scalar and a vectorized versions that call the atomic function:
Here is a complete example using Newton's method to calculate the Lambert \(W\) function (there are more sophisticated algorithms such as the one by Fukushima (2013), but that doesn't matter for this example):
And from R
Check definition of the function:
Check derivatives using the numDeriv
package:
Also try second order derivatives:
For the Lambert \(W\) function we know how to calculate the derivatives. There are cases for which the derivatives are impossible (or difficult) to write down. If you're in this situation you may want to try using forward mode AD to help in defining an atomic function. A full worked out example is available here: adaptive_integration.cpp. Derivatives are calculated automatically and if-else branching is allowed. The main downside with this approach is that it is limited to functions with very few inputs.
Checkpointing is another useful technique. It is demonstrated in the example register_atomic.cpp. It does not work for adaptive algorithms but is otherwise automatic. It is useful to reduce AD memory usage in cases where the same sequence of operations is being applied many times.