TMB Documentation
v1.9.11
|
Split a computational graph using a simple heuristic. More...
#include <graph_transform.hpp>
Public Member Functions | |
void | extract () |
Complete the parallel split. | |
size_t | input_size () const |
Get info. | |
std::vector< size_t > | max_tree_depth () |
Give an estimate (maximum tree depth) of the size of each reverse sub tree. | |
size_t | output_size () const |
Get info. | |
Public Attributes | |
std::vector< std::vector< Index > > | dep_idx |
Result: Pointers into original dependent variables. | |
bool | do_aggregate |
Sum up result for each thread? | |
std::vector< std::vector< Index > > | inv_idx |
Result: Pointers into original independent variables. | |
bool | keep_all_inv |
Keep all independent variables for each thread? | |
std::vector< std::vector< Index > > | node_split |
Complete subgraphs by thread. | |
std::vector< global > | vglob |
Result: Vector of computational graphs. | |
Split a computational graph using a simple heuristic.
Given a function object representing a mapping f:R^n->R^m
. Assign each of the m
outputs to threads and split the function f
into m
new functions that can be evaluated in parallel.
Let dep(1), ...,dep(m)
denote the dependendent variables and let T(dep(i))
be the reverse tree starting from output node dep(i)
.
For some given ordering of the dependent variables dep(1),...,dep(n)
we define the work
Work(dep(i)) = |T(dep(i))|
as the size of the i'th sub tree and
dWork(dep(i)) = | T(dep(i)) \ Union_(j<i) T(dep(j)) |
i.e. the amount of extra work to calculate dep(i)
assuming that all previous dep(j)
have been calculated.
The idea beind the algorithm is to sort the work in decreasing order and assign dependent variables to threads as follows:
dWork(i)
is 'small' assign dep(i)
to current threadHowever, we approximate |Work(dep(i))|
by the maximum tree depth of T(dep(i))
(because it's easier to compute). The assigned work by thread is also replaced by a crude approximation.
Definition at line 718 of file graph_transform.hpp.