What is the purpose of SmearGate cross-document 'leak fix'? #1988

@ClassicLarry

Why are PRs, such as the current record #1855, applying this 'leak fix'?

Looking at prior documents does not break causality. Any LLM that doesn't use intra-document masking already looks at prior documents through attention. There is no 'cheating' involved here, unless the maintainers have made an arbitrary ruling on this. When these one-position techniques, such as smear gate and bigram hash, were created, both the masked and unmasked versions were tested, and the unmasked version was intentionally selected because it ran faster, didn't hurt loss, and obeyed the causal mask.
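To make the comparison concrete, here is a minimal sketch of the two variants. It assumes the smear has the form `out[t] = x[t] + gate[t] * x[t-1]`, uses one scalar per token instead of an embedding vector, and uses a fixed gate; the names `smear`, `gate`, and `doc_ids` are hypothetical and not taken from any PR. It only illustrates that both versions read positions at or before `t`, and that they differ solely at the first token of each new document.

```python
def smear(x, gate, doc_ids=None):
    """Mix each position with its predecessor: out[t] = x[t] + gate[t] * x[t-1].

    x       : list of floats (one scalar "embedding" per token, for brevity)
    gate    : list of floats, same length as x (assumed fixed here)
    doc_ids : optional list of document ids; when given, the gate is zeroed
              wherever the predecessor belongs to a different document
    """
    out = [x[0]]  # position 0 has no predecessor
    for t in range(1, len(x)):
        g = gate[t]
        if doc_ids is not None and doc_ids[t] != doc_ids[t - 1]:
            g = 0.0  # masked variant: do not look across a document boundary
        out.append(x[t] + g * x[t - 1])
    return out


x = [1.0, 2.0, 3.0, 4.0, 5.0]
gate = [0.5] * 5
doc_ids = [0, 0, 0, 1, 1]  # a new document starts at position 3

unmasked = smear(x, gate)
masked = smear(x, gate, doc_ids)

# Positions where the two variants disagree.
diff_positions = [t for t in range(len(x)) if unmasked[t] != masked[t]]

# Causality check: perturbing the last token leaves earlier outputs unchanged.
x_perturbed = x[:-1] + [100.0]
causal_ok = smear(x_perturbed, gate)[:-1] == unmasked[:-1]
```

In this toy setup the unmasked variant passes the same causality check as the masked one; the masked variant only adds a per-position boundary test.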

I am concerned that every record is going to copy-paste this, and the final record is going to carry this janky inefficiency for no good reason.
