Add sequence concat op #4508
Conversation
@@ -0,0 +1,25 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Please rename all the source files to "sequence_concat_op.xx"; currently all of our operator source files use lower-case names.
Done.
const size_t level = static_cast<size_t>(ctx->Attrs().Get<int>("level"));
const size_t axis = static_cast<size_t>(ctx->Attrs().Get<int>("axis"));
PADDLE_ENFORCE(level == 0UL || level == 1UL,
               "Sequence Concat Op only support one or two sequence now.");
The sequence_concat operator only accepts a sequence or a nested sequence as its input.
Done.
auto ins_dims = ctx->GetInputsDim("X");
framework::DDim out_dims = ins_dims[0];
const size_t n = ins_dims.size();
for (size_t i = 1; i < n; i++) {
I prefer ++i in a loop (actually there is no difference for an integer value); you can check this: https://google.github.io/styleguide/cppguide.html#Preincrement_and_Predecrement
Done.
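For what it's worth, the style-guide rationale is about non-trivial iterator types, where post-increment has to copy the old value; a tiny standalone illustration (not code from this PR):

```cpp
#include <vector>

int main() {
  std::vector<int> v{1, 2, 3};
  int sum = 0;
  // For a plain integer loop variable, ++i and i++ generate the same code,
  // but for iterators post-increment must return a copy of the old value,
  // so pre-increment is the safer default style.
  for (auto it = v.begin(); it != v.end(); ++it) {
    sum += *it;
  }
  return sum == 6 ? 0 : 1;
}
```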
    : OpProtoAndCheckerMaker(proto, op_checker) {
  AddInput("X",
           "Multip LodTensors, the variable-length inputs of "
           "SequenceConcatOp")
Please follow the (type, default value) comment style recommended in our doc: https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/name_convention.md
I think fc_op has a very good comment style.
Done.
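For reference, the (type, default value) style from name_convention.md reads roughly like the sketch below; the names and wording are illustrative, not the exact lines that landed in this PR:

```cpp
// A sketch of the "(type, default value)" comment style; illustrative only.
AddInput("X",
         "(vector<LoDTensor>) Variable-length input sequences of "
         "SequenceConcatOp.")
    .AsDuplicable();
AddAttr<int>("axis",
             "(int, default 0) The axis along which the inputs will be "
             "concatenated.")
    .SetDefault(0);
```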
| "If level is 1, the inputs will be joined with sentence.") | ||
| .SetDefault(0); | ||
| AddComment(R"DOC( | ||
| SequenceConcatOp concat multip LodTensors and only supports one or two levels. |
- The sequence_concat operator concatenates multiple LoDTensors. It only supports sequences (LoDTensor with level=1) or nested sequences (LoDTensor with level=0) as its inputs.
Done.
    axis is 1, level is 1, the Lod of Inputs are the same,
    LoD(x0) = {{0,2,4},{0,1,2,3,4}}; Dims(x0) = (2,3,4)
    LoD(x1) = {{0,2,4},{0,1,2,3,4}}; Dims(x1) = (2,4,4)
    LoD(Out) = {{0,2,4},{01,2,3,4}}; Dims(Out) = (2,7,4)
{01,2,3,4}} --> {0,1,2,3,4}}
Is there a comma missing?
Yep, sorry for the careless mistake. Done.
    LoD(x1) = {{0,2,4},{0,1,2,3,4}}; Dims(x1) = (2,4,4)
    LoD(Out) = {{0,2,4},{01,2,3,4}}; Dims(Out) = (2,7,4)
- Case2:
    If axis is 0, level is 1, the Lod of inputs are different,
"If the axis": add an article before the noun.
Done.
| AddComment(R"DOC( | ||
| SequenceConcatOp concat multip LodTensors and only supports one or two levels. | ||
| - Case1: | ||
| axis is 1, level is 1, the Lod of Inputs are the same, |
"The axis" here as well.
Done.
    If axis is 0, level is 1, the Lod of inputs are different,
    LoD(x0) = {{0,2,4}, {0,1,2,3,4}}; Dims(x0) = (2,3,4)
    LoD(x1) = {{0,3,5}, {0,1,3,4,5}}; Dims(x1) = (3,3,4)
    LoD(Out) = {{0,5,9}, {0,1,2,4,5,6,7,8,9}}; Dims(Out) = (5,3,4)
LoD or Lod? I prefer to keep it consistent in the comment.
Add a NOTE to the doc; as I understand it, the level of all the inputs should be the same (if I am right).
Done.
const size_t n = ins.size();
if (axis == 0UL) {
  if (level == 0) {
    for (size_t i = 1; i < n; i++) {
I prefer ++i, though there is no difference for an integer value...
Done.
 protected:
  void InferShape(framework::InferShapeContextBase* ctx) const override {
    PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Out")),
                   "Gradient of Out should not be null.");
The gradient of Out should not be null.
Done.
                       framework::OpAttrChecker* op_checker)
    : OpProtoAndCheckerMaker(proto, op_checker) {
  AddInput("X",
           "Multip LodTensors, the variable-length inputs of "
- Set X with AddInput(……).NotInGradient().
- "The input multiple LoDTensors, which are variable-length sequences or nested sequences."
Regarding setting X with AddInput(……).NotInGradient():
Maybe we also need the LoD of X in the gradient kernel to calculate the LoD of Grad(Out).
for (size_t i = 1; i < n; i++) {
  PADDLE_ENFORCE_EQ(ins[i]->NumLevels(), 2UL,
                    "All the LoDTensors of Inputs(X) should "
                    "have two level.");
I am not sure, but it seems there is no check to guarantee that, in the concatenation, except for the dimension of the axis along which to concatenate, the dimensions along all the other axes must be the same?
Yes, that's right, and I added some dimension checks.
Done.
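A minimal standalone sketch of the added dimension check, in plain C++ with a hypothetical helper name (the real check uses PADDLE_ENFORCE_EQ inside the framework): every dimension except the concatenation axis must match across all inputs.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical standalone version of the check: all inputs must have the
// same rank, and every dimension except `axis` must match exactly.
void CheckConcatDims(const std::vector<std::vector<int>>& dims, size_t axis) {
  for (size_t i = 1; i < dims.size(); ++i) {
    assert(dims[i].size() == dims[0].size());
    for (size_t j = 0; j < dims[0].size(); ++j) {
      if (j == axis) continue;  // only the concat axis may differ
      assert(dims[i][j] == dims[0][j]);
    }
  }
}

int main() {
  // Case1 from the doc comment: (2,3,4) and (2,4,4) concatenated along axis 1.
  CheckConcatDims({{2, 3, 4}, {2, 4, 4}}, 1);
  return 0;
}
```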
}

template <typename Place, typename T>
class SequenceConcatOpKernel : public framework::OpKernel {
It seems that in the latest develop branch, all OpKernels should inherit from framework::OpKernel<T> to support different computational accuracies.
Done.
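The requested change is roughly the following sketch, written against the 2017-era API quoted in this thread rather than the final diff:

```cpp
// Before (as in the diff above): the kernel inherits from the
// non-templated base class.
//   class SequenceConcatOpKernel : public framework::OpKernel { ... };

// After: the kernel is parameterized on the element type T, so one
// operator definition can be registered for float, double, and so on.
template <typename Place, typename T>
class SequenceConcatOpKernel : public framework::OpKernel<T> {
 public:
  void Compute(const framework::ExecutionContext& ctx) const override {
    // ... concatenation logic ...
  }
};
```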
};

template <typename Place, typename T>
class SequenceConcatGradOpKernel : public framework::OpKernel {
It seems that in the latest develop branch, all OpKernels should inherit from framework::OpKernel<T> to support different computational accuracies.
Done.
    : OpProtoAndCheckerMaker(proto, op_checker) {
  AddInput("X",
           "The input Multip LoDTensors, which are variable-length "
           "sequence or nested sequence.")
AddInput("X",
"(A vector of LoDTensor), the input is a vector of LoDTensor, "
"each of which is a variable-length sequence or nested sequence.")There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
      .AsDuplicable();
  AddOutput("Out",
            "A LoDTensor, the variable-length output of "
            "sequence_concat Op.");
AddOutput("Out",
"(A LoDTensor), the variable-length output of "
"sequence_concat Op.");There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done.
      .SetDefault(0);
  AddAttr<int>("level",
               "(int, default 0)"
               "The level which the inputs will be joined with."
"The level at which the inputs will be joined."
Done.
It only supports sequences ( LoD Tensor with level=1)
or nested sequences (LoD tensor with level=0) as its inputs.
- Case1:
    If the axis is 1, level is 1, the LoD of Inputs are the same,
- Inputs --> input
- If the axis is other than 0 (here, axis is 1 and level is 1), each input should have the same LoD information, and the LoD information of the output stays the same as the input.
Done.
| "(int, default 0)" | ||
| "The level which the inputs will be joined with." | ||
| "If level is 0, the inputs will be joined with " | ||
| "nested sequences." |
If the level is 0, the inputs will be joined at the nested sequence level, which
Done.
    LoD(x1) = {{0,2,4},{0,1,2,3,4}}; Dims(x1) = (2,4,4)
    LoD(Out) = {{0,2,4},{0,1,2,3,4}}; Dims(Out) = (2,7,4)
- Case2:
    If the axis is 0, level is 1, the LoD of inputs are different,
If axis is 0 (here, level is 1), the inputs are concatenated along the time steps, and the LoD information of the output needs to be recomputed.
Done.
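To make Case2 concrete: when the inputs are concatenated along axis 0, the matching offsets of the top LoD level are simply summed, while the lower levels are rebuilt by stitching the per-sequence pieces together. A minimal standalone sketch of the top-level merge (plain C++ with a hypothetical helper, not the operator code):

```cpp
#include <cstdio>
#include <vector>

// Hypothetical helper: merge the top LoD level of two inputs concatenated
// along the time-step axis by summing the matching offsets.
std::vector<size_t> MergeTopLevelLoD(const std::vector<size_t>& a,
                                     const std::vector<size_t>& b) {
  std::vector<size_t> out(a.size());
  for (size_t i = 0; i < a.size(); ++i) out[i] = a[i] + b[i];
  return out;
}

int main() {
  // Case2 above: {0,2,4} + {0,3,5} -> {0,5,9}
  for (size_t v : MergeTopLevelLoD({0, 2, 4}, {0, 3, 5}))
    std::printf("%zu ", v);  // prints: 0 5 9
  std::printf("\n");
  return 0;
}
```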
    PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Out")),
                   "The gradient of Out should not be null.");
    PADDLE_ENFORCE(ctx->HasOutputs(framework::GradVarName("X")),
                   "The gradient of X should not be empty.");
- " should not be empty." --> " should not be null.". Keep the output information the same as check in line 95, since I think these two checks are for almost for the same purpose.
Done.
    // the Output LoD will be modified as followed:
    // LoD(x0) = {{0,2,4}, {0,1,2,3,4}}
    // LoD(x1) = {{0,3,5}, {0,1,3,4,5}}
    // LoD(Out) = {{0,5,9}, {0,1,2,4,5,6,7,8,9}}
I think lines 26 through 38 can be deleted, because the examples in the operator comment already explain the same logic well.
Done.
    PADDLE_ENFORCE_EQ(ins[0]->dims()[j], ins[i]->dims()[j],
                      "The dimensions of all the input LoDTensors "
                      "except for the specify axis should be "
                      "matched exactly.");
Except for the dimension of the specified axis along which all the inputs are concatenated, the dimensions of all the other axes of the input LoDTensors should be the same.
Done.
x1 = np.random.random((11, 6, 3)).astype('float32')
lod1 = [[0, 2, 5, 11], [0, 1, 2, 5, 7, 11]]
axis = 0
level = 1
- Just a personal question: I found that in the comment and the unit test, only level=1 is tested. It is still hard for others to understand how the attribute level works.
- Is it necessary to test level=0?
Done. Added a unit test with level=0.
Sorry, my understanding is not correct. The attribute …
auto out_lod = ins[0]->lod();
const size_t n = ins.size();
if (axis == 0UL) {
  if (level == 0) {
level == 0UL
Done.
        out_lod[0][j] += ins[i]->lod()[0][j];
      }
    }
  } else if (level == 1) {
level == 1UL
Done.
  } else if (level == 1) {
    PADDLE_ENFORCE_EQ(ins[0]->NumLevels(), 2UL,
                      "If the level is 1, all of the inputs "
                      "should be the the nested sequence.");
- "the the" --> there is an extra "the".
- nested sequence --> nested sequences.
Done.
  for (size_t i = 1; i < n; ++i) {
    out_dims[axis] += ins_dims[i][axis];
  }
  ctx->SetOutputDim("Out", out_dims);
When we discussed the framework design earlier, we agreed that InferShape should be able to infer the complete shape information, so the LoD check, set_lod, and concatLoD implementations below may need to be moved here. @reyoung
I agree that all the shape information should be inferred in InferShape, but the current interface does not yet seem to provide a way to access the LoD:
https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/shape_inference.h#L25
    information of the output keeps the same as the input.
    LoD(x0) = {{0,2,4}, {0,1,2,3,4}}; Dims(x0) = (4,3,4)
    LoD(x1) = {{0,2,4}, {0,1,2,3,4}}; Dims(x1) = (4,4,4)
    LoD(Out) = {{0,2,4}, {0,1,2,3,4}}; Dims(Out) = (4,7,4)
Please add a blank line before line 78 and another after line 80; same below.
Done.
lcy-seso left a comment
LGTM.
Fixed #4226