* master:
Add some development tips to documentation. (ray-project#1426)
Add link to github from documentation. (ray-project#1425)
[rllib] Update docs with api and components overview figures (ray-project#1443)
Multiagent model using concatenated observations (ray-project#1416)
Load evaluation configuration from checkpoint (ray-project#1392)
[autoscaling] increase connect timeout, boto retries, and check subnet conf (ray-project#1422)
Update wheel in autoscaler example. (ray-project#1408)
[autoscaler] Fix ValueError: Missing required config key `availability_zone` of type str
[tune][minor] Fixes (ray-project#1383)
[rllib] Expose PPO evaluator resource requirements (ray-project#1391)
fix autoscaler test (ray-project#1411)
[rllib] Fix incorrect documentation on how to use custom models ray-project#1405
Added option for availability zone (ray-project#1393)
Adding all DataFrame methods with NotImplementedErrors (ray-project#1403)
Remove pyarrow version check. (ray-project#1394)
# Conflicts:
# python/ray/rllib/eval.py
doc/source/autoscaling.rst (1 addition, 1 deletion)
@@ -41,7 +41,7 @@ Autoscaling
 Ray clusters come with a load-based auto-scaler. When cluster resource usage exceeds a configurable threshold (80% by default), new nodes will be launched up to the specified ``max_workers`` limit. When nodes are idle for more than a timeout, they will be removed, down to the ``min_workers`` limit. The head node is never removed.
 
-The default idle timeout is 5 minutes. This is because in AWS there is a minimum billing charge of 5 minutes per instance, after which usage is billed by the second.
+The default idle timeout is 5 minutes. This is to prevent excessive node churn, which could impact performance and increase costs (in AWS there is a minimum billing charge of 1 minute per instance, after which usage is billed by the second).
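The autoscaling parameters discussed in this hunk (utilization threshold, ``min_workers``/``max_workers``, idle timeout) are set in the cluster YAML file. A hypothetical excerpt might look like the following; the key names are assumptions based on the autoscaler YAML of this era and should be verified against the docs for your Ray version:

```yaml
# Illustrative cluster config excerpt -- key names are assumptions
# and should be checked against your Ray version's autoscaler docs.
cluster_name: default
min_workers: 0                      # never scale below this many workers
max_workers: 10                     # never scale above this many workers
target_utilization_fraction: 0.8    # scale up once usage exceeds 80%
idle_timeout_minutes: 5             # remove workers idle longer than this
provider:
    type: aws
    region: us-west-2
    availability_zone: us-west-2a
```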
doc/source/rllib-dev.rst (7 additions, 33 deletions)
@@ -10,49 +10,23 @@ Recipe for an RLlib algorithm
 Here are the steps for implementing a new algorithm in RLlib:
 
-1. Define an algorithm-specific `Evaluator class <#evaluators-and-optimizers>`__ (the core of the algorithm). Evaluators encapsulate framework-specific components such as the policy and loss functions. For an example, see the `A3C Evaluator implementation <https://github.com/ray-project/ray/blob/master/python/ray/rllib/a3c/a3c_evaluator.py>`__.
+1. Define an algorithm-specific `Policy evaluator class <#policy-evaluators-and-optimizers>`__ (the core of the algorithm). Evaluators encapsulate framework-specific components such as the policy and loss functions. For an example, see the `A3C Evaluator implementation <https://github.com/ray-project/ray/blob/master/python/ray/rllib/a3c/a3c_evaluator.py>`__.
 
-2. Pick an appropriate `RLlib optimizer class <#evaluators-and-optimizers>`__. Optimizers manage the parallel execution of the algorithm. RLlib provides several built-in optimizers for gradient-based algorithms. Advanced algorithms may find it beneficial to implement their own optimizers.
+2. Pick an appropriate `Policy optimizer class <#policy-evaluators-and-optimizers>`__. Optimizers manage the parallel execution of the algorithm. RLlib provides several built-in optimizers for gradient-based algorithms. Advanced algorithms may find it beneficial to implement their own optimizers.
 
 3. Wrap the two up in an `Agent class <#agents>`__. Agents are the user-facing API of RLlib. They provide the necessary "glue" and implement accessory functionality such as statistics reporting and checkpointing.
 
 To help with implementation, RLlib provides common action distributions, preprocessors, and neural network models, found in `catalog.py <https://github.com/ray-project/ray/blob/master/python/ray/rllib/models/catalog.py>`__, which are shared by all algorithms. Note that most of these utilities are currently Tensorflow specific.
 
-Defining a custom model
------------------------
-
-Often you will want to plug in your own neural network into an existing RLlib algorithm.
-This can be easily done by defining your own `Model class <#models-and-preprocessors>`__ and registering it in the RLlib catalog, after which it will be available for use by all RLlib algorithms.
-
-An example usage of a custom model looks like this:
-Note that if you need to reference large data objects as part of the computation, e.g. weights, you can put them into the Ray object store with ``ray.put`` and then retrieve them from inside your model class.
+
+.. image:: rllib-api.svg
 
 The Developer API
 -----------------
 
-The following APIs are the building blocks of RLlib algorithms. Note that they are not yet considered stable.
+The following APIs are the building blocks of RLlib algorithms (also take a look at the `user components overview <rllib.html#components-user-customizable-and-internal>`__).
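The three-step recipe in this hunk (evaluator as the algorithm core, optimizer managing execution, agent as user-facing glue) can be sketched in a framework-free way. All class and method names below are illustrative assumptions for exposition, not RLlib's actual developer API:

```python
# Toy sketch of the evaluator/optimizer/agent decomposition described above.
# Names and signatures are illustrative only; they do not match RLlib's API.

class ToyEvaluator:
    """The 'algorithm core': encapsulates sampling and gradient logic."""

    def __init__(self):
        self.weights = 0.0

    def sample(self):
        # Stand-in for collecting experience from an environment.
        return [1.0, 2.0, 3.0]

    def compute_gradient(self, batch):
        # Stand-in for a real loss/gradient computation.
        return sum(batch) / len(batch)

    def apply_gradient(self, grad):
        self.weights += 0.1 * grad


class ToyOptimizer:
    """Manages execution of evaluator steps (serially here; RLlib's
    optimizers do this in parallel across remote evaluators)."""

    def __init__(self, local_evaluator, remote_evaluators):
        self.local = local_evaluator
        self.remotes = remote_evaluators

    def step(self):
        # Gather a gradient from each worker and apply it locally.
        for ev in self.remotes:
            grad = ev.compute_gradient(ev.sample())
            self.local.apply_gradient(grad)


class ToyAgent:
    """User-facing glue: owns evaluators and optimizer, reports stats."""

    def __init__(self, num_workers=2):
        self.local = ToyEvaluator()
        self.remotes = [ToyEvaluator() for _ in range(num_workers)]
        self.optimizer = ToyOptimizer(self.local, self.remotes)

    def train(self):
        self.optimizer.step()
        return {"weights": self.local.weights}


agent = ToyAgent()
result = agent.train()
print(result)
```

The point of the split is that the evaluator can be swapped per-algorithm while the optimizer (the parallelism strategy) is reused across algorithms.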
@@ -123,7 +97,7 @@ Currently we support the following action distributions:
 The Model Catalog
 ~~~~~~~~~~~~~~~~~
 
-The Model Catalog is the mechanism for algorithms to get preprocessors, models, and action distributions for varying gym environments. It enables sharing of these components across different algorithms.
+The Model Catalog is the mechanism for algorithms to get canonical preprocessors, models, and action distributions for varying gym environments. It enables easy reuse of these components across different algorithms.
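The catalog pattern described here is, at its core, a registry mapping names to component constructors so that any algorithm can look shared components up by name. A minimal sketch of that pattern follows; the function and component names are hypothetical and do not reflect RLlib's actual catalog API:

```python
# Minimal sketch of a component registry (the "catalog" pattern above).
# Names here are illustrative assumptions, not RLlib's real API.

_registry = {}


def register(name, constructor):
    """Register a component constructor under a string name."""
    _registry[name] = constructor


def get_component(name, *args, **kwargs):
    """Construct a registered component; any algorithm can call this."""
    if name not in _registry:
        raise KeyError("No component registered under %r" % name)
    return _registry[name](*args, **kwargs)


# Example: register a (fake) preprocessor constructor, then look it up.
register("one_hot_preprocessor", lambda n: ("one_hot", n))
component = get_component("one_hot_preprocessor", 4)
print(component)  # ('one_hot', 4)
```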