ranger requires C++14 support. On an up-to-date system with a recent R version, it should compile fine. If it is not working, try these things:
- Update R to the newest version.
- Set "CXX = g++ -std=gnu++14" or similar in your local Makevars file. On Linux and Mac, the file should be ~/.R/Makevars, see here for other locations.
- Update your C++ compiler.
If all that does not help, please open an issue and report your operating system, R version and (if possible) contents of your Makevars file.
One of the tied classes is selected randomly (in terminal nodes and forest aggregation).
Splitting starts at the root (nodeID 0) and continues non-recursively by always splitting the smallest available nodeID. Example:
          0
         / \
        /   \
       1     2
      / \   / \
     3   4 5   6
    / \
   7   8
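The numbering rule can be sketched in a few lines of base R. This is only an illustration of the rule stated above, not ranger's actual C++ code; `assign_node_ids()` and its `splits` argument are hypothetical names, and the example assumes a tree in which nodes 0 to 3 are split and all other nodes are terminal.

```r
# Always process the smallest unprocessed node ID next; when a node is split,
# its children receive the next two free IDs. Terminal nodes get child ID 0.
# `splits` (hypothetical) is the set of node IDs that are split.
assign_node_ids <- function(splits) {
  left <- integer(0)
  right <- integer(0)
  n_nodes <- 1L  # only the root (ID 0) exists at the start
  node <- 0L
  while (node < n_nodes) {
    if (node %in% splits) {
      left[node + 1L] <- n_nodes
      right[node + 1L] <- n_nodes + 1L
      n_nodes <- n_nodes + 2L
    } else {
      left[node + 1L] <- 0L
      right[node + 1L] <- 0L
    }
    node <- node + 1L
  }
  data.frame(nodeID = seq_len(n_nodes) - 1L, leftChild = left, rightChild = right)
}

tree <- assign_node_ids(c(0, 1, 2, 3))
print(tree)
```

Because nodes are processed in ID order, the children of a node always have larger IDs than every node created before that split.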
Consider reducing the number of time points for estimation of the cumulative hazards. With ranger version >=0.15.4, you can just use the time.interest argument for this.
I recommend using saveRDS(). The ranger forest objects are already serialized.
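A minimal sketch of the save and restore round trip, assuming a small forest on the iris data and a temporary file path (both chosen only for illustration):

```r
library(ranger)

# Fit a small forest; num.trees is kept low just to make the example fast
rf <- ranger(Species ~ ., data = iris, num.trees = 10)

# Save to disk and read it back
path <- tempfile(fileext = ".rds")
saveRDS(rf, path)
rf_restored <- readRDS(path)

# The restored object can be used for prediction like the original
pred <- predict(rf_restored, data = iris)$predictions
```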
Since version 0.11.5, use the x/y interface with a matrix for x. In older versions, use dependent.variable.name and data. For GWAS data, use the GenABEL package.
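As a rough sketch of the x/y interface, assuming the iris data as a stand-in for a larger numeric matrix:

```r
library(ranger)

# Predictors as a numeric matrix, outcome as a separate vector
x <- as.matrix(iris[, 1:4])
y <- iris$Species

# x/y interface (ranger >= 0.11.5), no formula or data frame needed
rf <- ranger(x = x, y = y, num.trees = 10)

# New data for prediction is passed as a matrix with the same columns
pred <- predict(rf, data = x)$predictions
```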
The trees are saved in rf$forest$child.nodeIDs, rf$forest$split.varIDs and rf$forest$split.values:
child.nodeIDs is a list containing two vectors for each tree. These are the child node IDs for each node in the tree: the left children are in the first vector, the right children in the second. The nodeIDs are 0-indexed. However, since the root of a tree (ID 0) cannot be a child, the 0 is used for terminal nodes.
In split.varIDs there is a vector for each tree with the splitting variables for each node in the tree. These IDs are 0-indexed, too. If you use the formula interface, the 0-variable should be the outcome. Terminal nodes also have a 0 here.
Update: Since version 0.11.5, the outcome is not included in the variable IDs anymore.
In split.values the splitting values for all nodes are saved. In terminal nodes, the outcome for this node is saved in split.values.
Note: A 0 in split.varIDs is not a sufficient condition for a terminal node. Check child.nodeIDs to be sure.
Update: The treeInfo() function is the recommended interface to access trees of ranger objects.
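The following sketch shows both ways of looking at a tree, assuming a small forest on the iris data. It reads the first tree through treeInfo() and, for comparison, identifies terminal nodes from the raw child.nodeIDs vectors as described above.

```r
library(ranger)

rf <- ranger(Species ~ ., data = iris, num.trees = 5)

# Recommended: one data frame per tree, one row per node
info <- treeInfo(rf, tree = 1)
head(info)

# Raw structure of the first tree: left and right child IDs per node
left  <- rf$forest$child.nodeIDs[[1]][[1]]
right <- rf$forest$child.nodeIDs[[1]][[2]]

# A node is terminal if it has no children (child ID 0 on both sides)
terminal <- left == 0 & right == 0
```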
If you want to predict probabilities use probability = TRUE, e.g.
ranger(Species ~ ., data = iris, probability = TRUE)
If you still want to get the votes out of hard classification, use predict.all = TRUE in predict(). See #288 for out-of-bag predictions.
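Both options can be sketched as follows, again assuming the iris data and a small number of trees for illustration:

```r
library(ranger)

# Probability forest: predictions are a matrix with one column per class
rf_prob <- ranger(Species ~ ., data = iris, probability = TRUE, num.trees = 10)
probs <- predict(rf_prob, data = iris)$predictions

# Classification forest: predict.all = TRUE returns the prediction of
# every single tree, one row per observation and one column per tree
rf_class <- ranger(Species ~ ., data = iris, num.trees = 10)
votes <- predict(rf_class, data = iris, predict.all = TRUE)$predictions
```

With predict.all = TRUE the per-tree votes can be tabulated per row to obtain vote fractions, which are not the same as the estimates from a probability forest.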
Use edarf::extract_proximity(). See #234 for out-of-bag predictions.
See #356 for a combination function. We will probably add a similar function to the package in the future.