Create feed_op_and_fetch_op Design Doc #4599
Conversation
```python
def feed_value(variable, np_variable):
    """Overwrite feed_result[variable.name] with a numpy.array
```
feed_result and fetch_value will be serialized and passed from Python to C++; where are they serialized to? (i.e., are they part of ProgramDesc?)
Good point. This part must be redesigned.
I had an offline discussion with @helinwang.
- Each Paddle trainer will have a global Scope and two global Variables (maybe static variables in C++), feed_result and fetch_result. Python cannot create a C++ Variable.
- What the Feed Operator does is take LoDTensors from the global Variables and copy them to its output Variable.

I think that:
- In distributed training, the training data is saved in a distributed file system. The C++ feed_value method will load data from file and set it into the global Variable. This method must be called before Executor::Run.
- In local machine training, we can expose feed_value to Python, and the numpy array will be set into the global Variable.
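A minimal Python sketch of this flow, using a plain dict as a stand-in for the C++ global `feed_result` Variable (the names `feed_value` and the toy `feed_op_run` are illustrative stand-ins, not the actual Paddle implementation):

```python
import numpy as np

# Stand-in for the per-trainer global Variable `feed_result` (a C++
# static in the real design); maps variable names to fed arrays.
feed_result = {}

def feed_value(name, np_array):
    """Overwrite feed_result[name] with a numpy array (called before Run)."""
    feed_result[name] = np.asarray(np_array)

def feed_op_run(output_name, scope):
    """Toy FeedOp: copy the tensor from the global Variable into its
    output Variable inside the given scope."""
    scope[output_name] = feed_result[output_name].copy()

scope = {}
feed_value("image", [[1.0, 2.0], [3.0, 4.0]])
feed_op_run("image", scope)
```

The key property is that Python only writes into the global Variable before `Executor::Run`; the FeedOp itself runs entirely on the C++ side.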
> In distributed training, the training data is saved in a distributed file system. C++ feed_value method will load data from file.

feed_value is only for the argument feed_dict in session.eval, which will come from the network, not from the disk. We will have an OP that reads data from the disk, but that is irrelevant to feed_value.

> In local machine training, we can expose feed_value to Python, and the numpy array will be set into the global Variable.

Everything needs to be serialized; "numpy array will be set into the global Variable" means the Python code is involved in the runtime (which conflicts with the current design).
I think the compile-time/runtime separation is very important. Let's have more discussion if you have other ideas.
@helinwang
Currently, we only take the local machine as our execution environment, so all the data can come from disk.
Let's make the whole training process work first. Having Python code involved in the runtime breaks our design, but it only happens when feeding data to or fetching data from the training process. Most of the other training logic is fine with the compile-time design.
@QiJune sure, thanks for explaining! Could you create an issue for it and put it into the TODO in the GitHub project?
@helinwang Yes, I have created issue #4613. But we don't have a Project for the distributed training feature yet. Let's do it later.
reyoung left a comment
LGTM, but since @helinwang has comments, maybe he will approve this PR.
```
@@ -0,0 +1,120 @@
# FeedOp and FetchOp Design Doc

### Motivation
```
This is a second-level caption, so it should be prefixed by ## instead of ###.
```
### Challenge

1. During the runtime of a particular Op, it only knows which `Variable` to be read from and written to. It doesn't have a direct access to python object.
```
python => Python
```
### Motivation

Python programer needs an interface to feed the data to PaddlePaddle, run the model, and fetch the result from it. Since PaddlePaddle runtime only goes through a graph of ops, we need to design corresponding Ops and add them to the graph.
```
programmer needs => programmers need, or
A Python programmer needs
the data => data
```python
# Run -------------------
while not converge:
    # user loads data
    np_data, np_label = load_input_data()
```
We are going to use the Python Reader API to load the data. This API doesn't split columns; instead, it returns the mini-batch as a sequence of Python tuples.
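As a hedged sketch of that shape of reader (the names `example_reader` and `batch` are hypothetical, not the actual Reader API): each example is a tuple, and a mini-batch is a list of such tuples, with no per-column splitting.

```python
# Hypothetical reader: yields each example as a tuple of columns.
def example_reader():
    for i in range(4):
        image = [float(i), float(i + 1)]  # toy "image" column
        label = i % 2                     # toy "label" column
        yield (image, label)

# Group examples into mini-batches: each mini-batch is a list of tuples.
def batch(reader, batch_size):
    buf = []
    for example in reader():
        buf.append(example)
        if len(buf) == batch_size:
            yield buf
            buf = []

mini_batches = list(batch(example_reader, 2))
```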
```python
np_data, np_label = load_input_data()

# user defines the maping
my_feed_dict = {data: np_data, label: np_label}
```
This constant dict should be moved out of the loop. And it could be of the form `dict = {"image": 0, "label": 1}`.
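A sketch of what that looks like (the variable names and the mini-batch layout here are assumptions for illustration): the name-to-column mapping is constant and lives outside the training loop, and inside the loop the mini-batch is split by column index.

```python
# Constant mapping, defined once outside the training loop.
columns = {"image": 0, "label": 1}

# A mini-batch is a list of example tuples (per the Reader convention).
mini_batch = [([0.0, 1.0], 0), ([1.0, 2.0], 1)]

# Inside the loop: split the mini-batch by column index.
images = [example[columns["image"]] for example in mini_batch]
labels = [example[columns["label"]] for example in mini_batch]
```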
```python
# Build the model -------------------
data = Variable(dim)
```
I am afraid a Variable class is not enough to express the idea here.

```python
image = layer.data(column=dict["image"])
label = layer.data(column=dict["label"])
```

```cpp
// Get Tensor reference in feed_result
string name = ctx.Output<Tensor>("Output")->name();
auto& var = GetScope()->GetVar("feed_result");
auto& input_tensor = var->Get<map<string, LoDTensor>>[name];
```
I think we will need to access an Attribute named "column" here, so that we know which column of the mini-batch should be copied into the image variable.
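A toy Python sketch of that idea (all names here are illustrative stand-ins, not Paddle's operator API): the FeedOp carries a `column` attribute, and at run time it selects that column of the mini-batch from the global feed variable and copies it into its output.

```python
# Stand-in for the global feed variable: the current mini-batch's
# columns, stored by position.
feed_columns = [
    [[0.0, 1.0], [1.0, 2.0]],  # column 0: images
    [0, 1],                    # column 1: labels
]

def feed_op_run(scope, output_name, column):
    """Toy FeedOp: the `column` attribute picks which column of the
    mini-batch is copied into the output variable."""
    scope[output_name] = list(feed_columns[column])

scope = {}
feed_op_run(scope, "image", column=0)
feed_op_run(scope, "label", column=1)
```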
```cpp
// Get Tensor reference in feed_result
string name = ctx.Output<Tensor>("Output")->name();
auto& var = GetScope()->GetVar("feed_result");
auto& input_tensor = var->Get<map<string, LoDTensor>>[name];
```
map => vector
An implementation of feed and fetch is merged at #4815.

Moved this branch to tonyyang-svail/feed-op-desgin.
No description provided.