Conversation

@Yancey0623 (Contributor):

Fixed #4226

@@ -0,0 +1,25 @@
/* Copyright (c) 2016 PaddlePaddle Authors. All Rights Reserve.
Contributor:

please rename all the source files into "sequence_concat_op.xx".
Currently all of our operator source files use lower case.

Contributor Author:

Done.

const size_t level = static_cast<size_t>(ctx->Attrs().Get<int>("level"));
const size_t axis = static_cast<size_t>(ctx->Attrs().Get<int>("axis"));
PADDLE_ENFORCE(level == 0UL || level == 1UL,
"Sequence Concat Op only support one or two sequence now.");
Contributor:

The sequence_concat operator only accepts sequence or a nested sequence as its input.

Contributor Author:

Done.

auto ins_dims = ctx->GetInputsDim("X");
framework::DDim out_dims = ins_dims[0];
const size_t n = ins_dims.size();
for (size_t i = 1; i < n; i++) {
Contributor:

I prefer ++i in a loop (actually there is no difference for integer values); you can check this: https://google.github.io/styleguide/cppguide.html#Preincrement_and_Predecrement

Contributor Author:

Done.

: OpProtoAndCheckerMaker(proto, op_checker) {
AddInput("X",
"Multip LodTensors, the variable-length inputs of "
"SequenceConcatOp")
Contributor:

Please follow the: (type, default value) usage style, this is recommended in our doc https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/operators/name_convention.md

I think fc_op has a very good comment style.

Contributor Author:

Done.

"If level is 1, the inputs will be joined with sentence.")
.SetDefault(0);
AddComment(R"DOC(
SequenceConcatOp concat multip LodTensors and only supports one or two levels.
Contributor:

  • The sequence_concat operator concatenates multiple LoDTensors. It only supports sequences (LoDTensor with level=1) or nested sequences (LoDTensor with level=0) as its inputs.

Contributor Author:

Done.

axis is 1, level is 1, the Lod of Inputs are the same,
LoD(x0) = {{0,2,4},{0,1,2,3,4}}; Dims(x0) = (2,3,4)
LoD(x1) = {{0,2,4},{0,1,2,3,4}}; Dims(x1) = (2,4,4)
LoD(Out) = {{0,2,4},{01,2,3,4}}; Dims(Out) = (2,7,4)
Contributor:

{01,2,3,4}} --> {0,1,2,3,4}}
Is there a comma missing?

Contributor Author:

Yep, sorry for the low-level mistake, done.

LoD(x1) = {{0,2,4},{0,1,2,3,4}}; Dims(x1) = (2,4,4)
LoD(Out) = {{0,2,4},{01,2,3,4}}; Dims(Out) = (2,7,4)
- Case2:
If axis is 0, level is 1, the Lod of inputs are different,
Contributor:

"If the axis": add an article before the noun.

Contributor Author:

Done.

AddComment(R"DOC(
SequenceConcatOp concat multip LodTensors and only supports one or two levels.
- Case1:
axis is 1, level is 1, the Lod of Inputs are the same,
Contributor:

"The axis"

Contributor Author:

Done.

If axis is 0, level is 1, the Lod of inputs are different,
LoD(x0) = {{0,2,4}, {0,1,2,3,4}}; Dims(x0) = (2,3,4)
LoD(x1) = {{0,3,5}, {0,1,3,4,5}}; Dims(x1) = (3,3,4)
LoD(Out) = {{0,5,9}, {0,1,2,4,5,6,7,8,9}}; Dims(Out) = (5,3,4)
Contributor:

LoD or Lod?
I prefer to keep it consistent in the comment.

Contributor:

Add a NOTE to the doc; as I understand it, the level of all the inputs should be the same (if I am right).

Contributor Author:

Done.
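For concatenation along axis 0, the output's top-level LoD offsets are the elementwise sum of the inputs' top-level offsets, mirroring the kernel's `out_lod[0][j] += ins[i]->lod()[0][j]` accumulation quoted in this review. A minimal Python sketch using a hypothetical `merge_top_level_lod` helper (not Paddle code):

```python
def merge_top_level_lod(lods):
    """Elementwise-sum the top-level LoD offsets of all inputs.

    Mirrors the kernel's out_lod[0][j] += ins[i]->lod()[0][j]
    accumulation for concatenation along axis 0.
    """
    out = list(lods[0])
    for lod in lods[1:]:
        assert len(lod) == len(out), "inputs must have the same number of sequences"
        for j in range(len(out)):
            out[j] += lod[j]
    return out

# Using the top-level offsets from Case2 above:
print(merge_top_level_lod([[0, 2, 4], [0, 3, 5]]))  # [0, 5, 9]
```

This reproduces the {0,5,9} top level of LoD(Out) in Case2.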

const size_t n = ins.size();
if (axis == 0UL) {
if (level == 0) {
for (size_t i = 1; i < n; i++) {
Contributor:

I prefer ++i, though there is no difference for integer values ...

Contributor Author:

Done.

protected:
void InferShape(framework::InferShapeContextBase* ctx) const override {
PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Out")),
"Gradient of Out should not be null.");
Contributor:

The gradient of Out should not be null.

Contributor Author:

Done.

framework::OpAttrChecker* op_checker)
: OpProtoAndCheckerMaker(proto, op_checker) {
AddInput("X",
"Multip LodTensors, the variable-length inputs of "
Contributor:

  1. Set X as AddInput(……).NotInGradient()
  2. The input: multiple LoDTensors, which are variable-length sequences or nested sequences.

Contributor Author:

set X AddInput(……).NotInGradient()

Maybe we also need the LoD of X in the Gradient Kernel to calculate the LoD of Grad(Out).

for (size_t i = 1; i < n; i++) {
PADDLE_ENFORCE_EQ(ins[i]->NumLevels(), 2UL,
"All the LoDTensors of Inputs(X) should "
"have two level.");
Contributor:

I am not sure, but it seems there is no check to guarantee that, in the concatenation, except for the dimension of the axis along which to concatenate, the dimensions along all the other axes are the same?

Contributor Author:

Yes, that's right, and I added some dimension checks.
Done.
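The dimension check agreed on here can be sketched in Python, as an illustrative stand-in for the kernel's PADDLE_ENFORCE_EQ calls (hypothetical helper, not the actual C++):

```python
def check_concat_dims(dims_list, axis):
    """All dimensions except the concatenation axis must match exactly
    (an illustrative stand-in for the kernel's PADDLE_ENFORCE_EQ checks)."""
    ref = dims_list[0]
    for dims in dims_list[1:]:
        assert len(dims) == len(ref), "all inputs must have the same rank"
        for j, (a, b) in enumerate(zip(ref, dims)):
            if j != axis:
                assert a == b, "dim %d mismatch: %d vs %d" % (j, a, b)

# Shapes from Case1: only the concat axis (axis=1) may differ.
check_concat_dims([(2, 3, 4), (2, 4, 4)], axis=1)
```

The same call with axis=0 would fail, since the second dimensions (3 vs 4) differ.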

}

template <typename Place, typename T>
class SequenceConcatOpKernel : public framework::OpKernel {
Contributor:

It seems that in the latest dev branch, all the OpKernel should inherit from framework::OpKernel<T> to support different computational accuracies.

Contributor Author:

Done.

};

template <typename Place, typename T>
class SequenceConcatGradOpKernel : public framework::OpKernel {
Contributor:

It seems that in the latest dev branch, all the OpKernel should inherit from framework::OpKernel<T> to support different computational accuracies.

Contributor Author:

Done.

: OpProtoAndCheckerMaker(proto, op_checker) {
AddInput("X",
"The input Multip LoDTensors, which are variable-length "
"sequence or nested sequence.")
Contributor:

AddInput("X",
    "(A vector of LoDTensor), the input is a vector of LoDTensor, "
    "each of which is a variable-length sequence or nested sequence.")

Contributor Author:

Done.

.AsDuplicable();
AddOutput("Out",
"A LoDTensor, the variable-length output of "
"sequence_concat Op.");
Contributor:

AddOutput("Out",
     "(A LoDTensor), the variable-length output of "
     "sequence_concat Op.");

Contributor Author:

Done.

.SetDefault(0);
AddAttr<int>("level",
"(int, default 0)"
"The level which the inputs will be joined with."
Contributor:

"The level at which the inputs will be joined."

Contributor Author:

Done.

It only supports sequences ( LoD Tensor with level=1)
or nested sequences (LoD tensor with level=0) as its inputs.
- Case1:
If the axis is 1, level is 1, the LoD of Inputs are the same,
Contributor:

  • Inputs --> input
  • If the axis is other than 0 (here, axis is 1 and level is 1), each input should have the same LoD information, and the LoD information of the output stays the same as the input.

Contributor Author:

Done.
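The constraint described above (an axis other than 0 requires identical LoD on every input) can be sketched with a hypothetical Python helper:

```python
def check_same_lod(lods):
    """When axis != 0, every input must carry identical LoD information;
    the output then reuses it unchanged (illustrative sketch)."""
    first = lods[0]
    for lod in lods[1:]:
        assert lod == first, "inputs must share the same LoD when axis != 0"
    return first

# The LoDs from Case1 above:
out_lod = check_same_lod([
    [[0, 2, 4], [0, 1, 2, 3, 4]],
    [[0, 2, 4], [0, 1, 2, 3, 4]],
])
```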

"(int, default 0)"
"The level which the inputs will be joined with."
"If level is 0, the inputs will be joined with "
"nested sequences."
Contributor:

If the level is 0, the inputs will be joined at the nested sequence level, which

Contributor Author:

Done.

LoD(x1) = {{0,2,4},{0,1,2,3,4}}; Dims(x1) = (2,4,4)
LoD(Out) = {{0,2,4},{0,1,2,3,4}}; Dims(Out) = (2,7,4)
- Case2:
If the axis is 0, level is 1, the LoD of inputs are different,
Contributor:

If axis is 0 (here, level is 1), the inputs are concatenated along time steps, and the LoD information of the output needs to be re-computed.

Contributor Author:

Done.

PADDLE_ENFORCE(ctx->HasInput(framework::GradVarName("Out")),
"The gradient of Out should not be null.");
PADDLE_ENFORCE(ctx->HasOutputs(framework::GradVarName("X")),
"The gradient of X should not be empty.");
Contributor:

  • " should not be empty." --> " should not be null." Keep the output information the same as the check in line 95, since I think these two checks serve almost the same purpose.

Contributor Author:

Done.

// the Output LoD will be modified as followed:
// LoD(x0) = {{0,2,4}, {0,1,2,3,4}}
// LoD(x1) = {{0,3,5}, {0,1,3,4,5}}
// LoD(Out) = {{0,5,9}, {0,1,2,4,5,6,7,8,9}}
Contributor:

I think line 26 ~ line 38 can be deleted because the examples in the operator comments explain the same logic well too.

Contributor Author:

Done.

PADDLE_ENFORCE_EQ(ins[0]->dims()[j], ins[i]->dims()[j],
"The dimensions of all the input LoDTensors "
"except for the specify axis should be "
"matched exactly.");
Contributor:

Except for the dimension of the specified axis along which all the inputs are concatenated, the dimensions of all the other axes of the input LoDTensors should be the same.

Contributor Author:

Done.

x1 = np.random.random((11, 6, 3)).astype('float32')
lod1 = [[0, 2, 5, 11], [0, 1, 2, 5, 7, 11]]
axis = 0
level = 1
Contributor:

  • Just a personal question: I found that in the comment and the unit test, only level=1 is tested. It is still hard for others to understand how the attribute level works.
  • Is it necessary to test level=0?

Contributor Author:

Done. Added a unit test with level=0.
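For the axis=1 case, the data concatenation itself behaves like an ordinary `np.concatenate` along the second dimension; a minimal sketch using the shapes from Case1 (illustrative, not the actual unit test):

```python
import numpy as np

x0 = np.random.random((2, 3, 4)).astype('float32')
x1 = np.random.random((2, 4, 4)).astype('float32')

# With axis=1 the LoD stays the same and only the second dimension grows.
out = np.concatenate([x0, x1], axis=1)
print(out.shape)  # (2, 7, 4)
```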

@lcy-seso (Contributor) commented Oct 9, 2017:

I still have a question about the attribute level.
- Is it necessary to set it as an attribute?
- Is it possible to let the operator automatically decide the sequence type of all its inputs (a sequence, level=0, or a nested sequence, level=1) and check that the sequence types of all its inputs are the same?


Sorry, my understanding is not correct. The attribute level is not used to specify the sequence type of the inputs.

auto out_lod = ins[0]->lod();
const size_t n = ins.size();
if (axis == 0UL) {
if (level == 0) {
Contributor:

level == 0UL

Contributor Author:

Done.

out_lod[0][j] += ins[i]->lod()[0][j];
}
}
} else if (level == 1) {
Contributor:

level == 1UL

Contributor Author:

Done.

} else if (level == 1) {
PADDLE_ENFORCE_EQ(ins[0]->NumLevels(), 2UL,
"If the level is 1, all of the inputs "
"should be the the nested sequence.");
Contributor:

  • the the --> an extra the
  • nested sequence --> nested sequences.

Contributor Author:

Done.

for (size_t i = 1; i < n; ++i) {
out_dims[axis] += ins_dims[i][axis];
}
ctx->SetOutputDim("Out", out_dims);
@qingqing01 (Contributor) commented Oct 10, 2017:

This was discussed when we designed the framework: InferShape should be able to infer the complete shape information, so the LoD check, set_lod, and concatLoD implementations below may need to be moved here. @reyoung

@Yancey0623 (Contributor Author) commented Oct 11, 2017:

I agree that all the shape information should be inferred in InferShape, but the current interface does not seem to provide a way to get the LoD yet?
https://github.com/PaddlePaddle/Paddle/blob/develop/paddle/framework/shape_inference.h#L25

information of the output keeps the same as the input.
LoD(x0) = {{0,2,4}, {0,1,2,3,4}}; Dims(x0) = (4,3,4)
LoD(x1) = {{0,2,4}, {0,1,2,3,4}}; Dims(x1) = (4,4,4)
LoD(Out) = {{0,2,4}, {0,1,2,3,4}}; Dims(Out) = (4,7,4)
@qingqing01 (Contributor) commented Oct 10, 2017:

Please leave a blank line before line 78 and another after line 80; the same applies below.

Contributor Author:

Done.

@lcy-seso (Contributor) left a comment:

LGTM.

@lcy-seso lcy-seso merged commit e9495e7 into PaddlePaddle:develop Oct 11, 2017
@Yancey0623 Yancey0623 deleted the seqconcat_op branch October 11, 2017 13:15