Commit 097d0fe

Merge pull request #3862 from wangkuiyi/update_graph_construction_design_doc
Update graph construction design doc
2 parents e888175 + a266a22 commit 097d0fe

5 files changed: +25 −2 lines


doc/design/graph.md

Lines changed: 21 additions & 2 deletions
@@ -1,4 +1,4 @@
-# Design Doc: Computations as Graphs
+# Design Doc: Computations as a Graph

A primary goal of the refactorization of PaddlePaddle is a more flexible representation of deep learning computation, in particular, a graph of operators and variables, instead of sequences of layers as before.

@@ -8,6 +8,8 @@ This document explains the construction of a graph as three steps:
- construct the backward part
- construct the optimization part

+## The Construction of a Graph
+
Let us take the problem of image classification as a simple example. The application program that trains the model looks like:

```python

@@ -25,7 +27,9 @@ The first four lines of the above program build the forward part of the graph.

![](images/graph_construction_example_forward_only.png)

-In particular, the first line `x = layer.data("images")` creates variable x and a Feed operator that copies a column from the minibatch to x. `y = layer.fc(x)` creates not only the FC operator and output variable y, but also two parameters, W and b.
+In particular, the first line `x = layer.data("images")` creates variable x and a Feed operator that copies a column from the minibatch to x. `y = layer.fc(x)` creates not only the FC operator and output variable y, but also two parameters, W and b, and their initialization operators.
+
+Initialization operators are "run-once" operators -- the `Run` method increments a class data member counter so that it runs at most once. This way, a parameter is not re-initialized in every minibatch (see the sketch after this hunk).

In this example, all operators are created as `OpDesc` protobuf messages, and all variables are `VarDesc`. These protobuf messages are saved in a `BlockDesc` protobuf message.
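To make the run-once behavior concrete, here is a minimal Python sketch of the idea; `InitOp`, the dict-based scope, and the counter name are illustrative stand-ins for this note, not PaddlePaddle's actual operator API:

```python
# Minimal sketch of a "run-once" initialization operator. InitOp and the
# dict-based scope are illustrative stand-ins, not PaddlePaddle's actual API.
import numpy as np


class InitOp:
    """Fills a parameter on the first Run(); later calls are no-ops."""

    def __init__(self, param_name, shape):
        self.param_name = param_name
        self.shape = shape
        self.run_count = 0  # the class data member counter mentioned above

    def Run(self, scope):
        if self.run_count > 0:
            return  # already initialized; skip in subsequent minibatches
        scope[self.param_name] = np.random.normal(size=self.shape)
        self.run_count += 1


scope = {}
init_w = InitOp("W", shape=(784, 10))
for _ in range(3):     # simulate three minibatches
    init_w.Run(scope)  # W is written only on the first call
```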

@@ -49,3 +53,18 @@ According to the chain rule of gradient computation, `ConstructBackwardGraph` wo
For each parameter, like W and b created by `layer.fc`, marked as double circles in the above graphs, `ConstructOptimizationGraph` creates an optimization operator to apply its gradient. This results in the complete graph:

![](images/graph_construction_example_all.png)
+
+## Block and Graph
+
+The words block and graph are interchangeable in the design of PaddlePaddle. A [Block](https://github.com/PaddlePaddle/Paddle/pull/3708) is a metaphor for the code and local variables in a pair of curly braces in programming languages, where operators are like statements or instructions. A graph of operators and variables is a representation of the block.
+
+A Block keeps operators in an array `BlockDesc::ops`
+
+```protobuf
+message BlockDesc {
+  repeated OpDesc ops = 1;
+  repeated VarDesc vars = 2;
+}
+```
+
+in the order in which they appear in user programs, like the Python program at the beginning of this article. We can imagine that in `ops`, we have some forward operators, followed by some gradient operators, and then some optimization operators.
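To illustrate this ordering, and how `ConstructOptimizationGraph` appends one update operator per parameter, here is a small Python sketch; the plain dicts stand in for the `OpDesc`/`BlockDesc` protobuf messages, and the op, variable, and gradient names are assumptions for this example, not PaddlePaddle's actual identifiers:

```python
# Illustrative only: plain dicts stand in for BlockDesc/OpDesc protobuf
# messages; op and gradient names are assumptions, not PaddlePaddle's own.
block = {"vars": ["x", "l", "W", "b", "y", "cost"], "ops": []}

# Forward operators, in the order they appear in the user program.
block["ops"] += [
    {"type": "feed", "outputs": ["x", "l"]},
    {"type": "fc",   "inputs": ["x", "W", "b"], "outputs": ["y"]},
    {"type": "mse",  "inputs": ["y", "l"],      "outputs": ["cost"]},
]

# Gradient operators appended next by ConstructBackwardGraph (chain rule).
block["ops"] += [
    {"type": "mse_grad", "inputs": ["y", "l"],      "outputs": ["y_grad"]},
    {"type": "fc_grad",  "inputs": ["x", "y_grad"], "outputs": ["W_grad", "b_grad"]},
]

# Optimization operators appended last by ConstructOptimizationGraph:
# one update op per parameter, applying that parameter's gradient.
for param in ["W", "b"]:
    block["ops"].append(
        {"type": "sgd", "inputs": [param, param + "_grad"], "outputs": [param]}
    )
```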

doc/design/images/graph_construction_example.dot

Lines changed: 4 additions & 0 deletions
@@ -2,6 +2,8 @@ digraph ImageClassificationGraph {
///////// The forward part /////////
FeedX [label="Feed", color=blue, shape=box];
FeedY [label="Feed", color=blue, shape=box];
+InitW [label="Init", color=blue, shape=diamond];
+Initb [label="Init", color=blue, shape=diamond];
FC [label="FC", color=blue, shape=box];
MSE [label="MSE", color=blue, shape=box];

@@ -14,6 +16,8 @@ digraph ImageClassificationGraph {

FeedX -> x -> FC -> y -> MSE -> cost [color=blue];
FeedY -> l [color=blue];
+InitW -> W [color=blue];
+Initb -> b [color=blue];
W -> FC [color=blue];
b -> FC [color=blue];
l -> MSE [color=blue];
The remaining three changed files are updated binary images (4.16 KB, 4.06 KB, and 3.01 KB); no text diff is shown for them.
