
Refine backward doc #7127

@JiayiFeng

Description


This part of backward.md needs refinement:

When implementing a specific op, the developer is also asked to implement its backward version, called grad_op. A grad_op takes the gradients of its corresponding op's outputs and calculates the gradients of the op's inputs. While building a model's backward part, the framework creates each forward op's grad_op and strings them together in reverse order of the forward part. In this way, gradients propagate from the end of the model to its beginning, in other words, from the loss to the parameters.
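
To make the paragraph concrete, here is a minimal Python sketch of the process it describes; `Op`, `make_grad_op`, and the `@GRAD` naming are illustrative stand-ins, not the framework's real API:

```python
# Illustrative sketch only: Op, make_grad_op, and the "@GRAD" suffix
# are hypothetical stand-ins, not PaddlePaddle's actual API.

class Op:
    def __init__(self, type_, inputs, outputs):
        self.type, self.inputs, self.outputs = type_, inputs, outputs

def make_grad_op(op):
    # A grad_op consumes the gradients of op's outputs and produces
    # the gradients of op's inputs.
    grad_inputs = [name + "@GRAD" for name in op.outputs]
    grad_outputs = [name + "@GRAD" for name in op.inputs]
    return Op(op.type + "_grad", grad_inputs, grad_outputs)

def build_backward(forward_ops):
    # Create each forward op's grad_op and string them together in
    # reverse order, so gradients flow from the loss back to parameters.
    return [make_grad_op(op) for op in reversed(forward_ops)]

forward = [Op("mul", ["X", "W"], ["XW"]), Op("add", ["XW", "b"], ["Out"])]
for g in build_backward(forward):
    print(g.type, g.inputs, "->", g.outputs)
```

Running the sketch prints `add_grad` before `mul_grad`, i.e. gradients spread from the loss back toward the parameters, which is exactly the behavior the paragraph tries to describe.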

@reyoung's commit:

grad_op is not the backward version of the forward operator.
We maintain a mapping between an operator and the operators that will produce its gradient. It is not a one-to-one mapping. Operators in the backward stage (I do not think there should be a backward stage; however, to make it easier to understand, I just assume the operators in a Block can be split into two stages) can also be used in the forward stage.
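
A minimal sketch of this one-to-many mapping, assuming a hypothetical `register_grad` registry (not the real registration mechanism); note that the resulting `scale` ops are ordinary operators that could just as well appear in the forward stage:

```python
from collections import namedtuple

# Illustrative sketch only: GRAD_OP_BUILDERS and register_grad are
# hypothetical names, not PaddlePaddle's actual registration API.
Op = namedtuple("Op", ["type", "inputs", "outputs"])

GRAD_OP_BUILDERS = {}  # op type -> builder returning a LIST of ordinary ops

def register_grad(op_type):
    def deco(builder):
        GRAD_OP_BUILDERS[op_type] = builder
        return builder
    return deco

@register_grad("minus")
def minus_grad(op):
    # One forward op maps to two ordinary ops: for Out = X - Y,
    # dX = dOut * 1 and dY = dOut * -1, both expressible as "scale".
    dout = op.outputs[0] + "@GRAD"
    return [
        Op("scale", [dout], [op.inputs[0] + "@GRAD"]),  # factor = 1
        Op("scale", [dout], [op.inputs[1] + "@GRAD"]),  # factor = -1
    ]

fwd = Op("minus", ["X", "Y"], ["Out"])
print(GRAD_OP_BUILDERS[fwd.type](fwd))
```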

See #7123 (comment)
