Author: gadwant
This repository contains the implementation for the paper "Subspace Collisions in Knowledge Editing: Orthogonal Low-Rank Updates for Scalable, Stable Model Edits".
This research addresses the instability of existing low-rank knowledge editing methods (like ROME and MEMIT) when scaled to hundreds or thousands of edits. We identify "subspace collisions"—overlapping update directions in the model's representation space—as a primary cause of this instability.
The code provided here implements Orthogonal Low-Rank Editing, a novel approach that:
- Enforces Orthogonality: Ensures new knowledge updates are geometrically separated from existing ones.
- Preserves Stability: Maintains low condition numbers and high effective rank even as the number of edits scales.
- Scales Effectively: Demonstrates robust performance from 1 to 50+ edits where naive methods fail.
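The orthogonality idea above can be sketched in a few lines. This is an illustrative example (not the repo's API): it uses a QR decomposition to replace a set of raw update directions with an orthonormal set spanning the same subspace, so that no two edits share a direction.

```python
# Hypothetical sketch of orthogonalizing edit update directions via QR.
# Function and variable names are illustrative, not the repository's API.
import numpy as np

def orthogonalize_updates(directions):
    """Return an orthonormal set of update directions spanning the
    same subspace as the inputs, eliminating pairwise overlap."""
    D = np.stack(directions, axis=1)   # d x k matrix, one direction per column
    Q, _ = np.linalg.qr(D)             # orthonormal basis for the span
    return [Q[:, i] for i in range(Q.shape[1])]

rng = np.random.default_rng(0)
raw = [rng.normal(size=64) for _ in range(5)]
ortho = orthogonalize_updates(raw)

# Pairwise inner products are now ~0 off the diagonal: no subspace collisions.
gram = np.array([[abs(u @ v) for v in ortho] for u in ortho])
print(np.allclose(gram, np.eye(5), atol=1e-8))
```

The actual implementation applies the same principle to the update directions produced by the base editor before writing them into the weights.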
```
pip install -r requirements.txt
```

```
code/
├── utils/
│   ├── orthogonal_editing.py   # Core orthogonal editing implementation
│   └── evaluation.py           # Evaluation metrics
├── scripts/
│   └── run_experiments.py      # Main experiment script
├── data/                       # Dataset storage
├── models/                     # Model checkpoints
├── notebooks/                  # Jupyter notebooks for analysis
└── requirements.txt            # Python dependencies
```
```
python scripts/run_experiments.py \
    --model_name "EleutherAI/pythia-70m" \
    --dataset counterfact \
    --dataset_path data/counterfact.json \
    --output_dir results \
    --scales 1 3 5 10 25 50 \
    --use_orthogonal \
    --device cpu
```

The main class for applying orthogonal edits:
```python
from utils.orthogonal_editing import OrthogonalLowRankEditor, Edit

editor = OrthogonalLowRankEditor(model, tokenizer, use_qr=True, device="cpu")

# Apply a single edit
edit = Edit(
    subject="Paris",
    relation="capital of",
    old_object="France",
    new_object="Germany",
    layer_idx=6,
)

# Returns u (update direction) and v (projection)
u, v = editor.apply_edit(edit)

# Apply multiple edits (orthogonalization is handled automatically)
edits = [edit1, edit2, ...]
updates = editor.apply_edits_batch(edits)

# Apply to model weights
editor.apply_updates_to_model(updates)
```

Evaluating an edit:

```python
from utils.evaluation import KnowledgeEditingEvaluator

evaluator = KnowledgeEditingEvaluator(model, tokenizer, device="cpu")
result = evaluator.evaluate_edit(
    subject="Paris",
    relation="capital of",
    old_object="France",
    new_object="Germany",
    unrelated_facts=[...],
    paraphrases=[...],
)
```

Download from: CounterFact Dataset (ROME website)
Download from: zsRE Dataset (ROME website)
The paper experiments include:
- Scaling Analysis: Testing edit performance from 1 to 50 edits.
- Baseline Comparison: Comparing against ROME, MEMIT, and naive sequential editing.
- Geometric Analysis: Measuring condition number, interference index, and effective rank.
- Robustness Testing: Testing order invariance and noise robustness.
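The geometric quantities listed above can be computed from a matrix whose columns are the edit update directions. The sketch below is illustrative (not the repo's exact evaluation code) and uses standard definitions: condition number from the singular values, an entropy-based effective rank, and mean absolute pairwise cosine similarity as an interference index.

```python
# Illustrative computation of the geometric diagnostics measured in the
# paper. Names and exact definitions are assumptions, not the repo's API.
import numpy as np

def geometry_stats(U):
    """U: d x k matrix whose columns are edit update directions."""
    s = np.linalg.svd(U, compute_uv=False)
    cond = s[0] / s[-1]                        # condition number
    p = s / s.sum()
    eff_rank = np.exp(-(p * np.log(p)).sum())  # entropy-based effective rank
    # Interference index: mean |cosine| between distinct directions
    Un = U / np.linalg.norm(U, axis=0, keepdims=True)
    G = np.abs(Un.T @ Un)
    k = U.shape[1]
    interference = (G.sum() - k) / (k * (k - 1))
    return cond, eff_rank, interference

rng = np.random.default_rng(1)
U = rng.normal(size=(128, 10))
cond, eff_rank, interference = geometry_stats(U)
```

Orthogonalized updates drive the interference index toward zero and keep the condition number and effective rank stable as edits accumulate.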
- This implementation focuses on the geometric analysis of edit interactions.
- Designed for use with Pythia and GPT-style models.
- Uses `SimpleROME` (gradient-based rank-1 updates) as the base editor signal.
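For reference, a ROME-style rank-1 weight update can be sketched as follows. This is a minimal illustration of the general mechanism, not the repo's `SimpleROME` code: it chooses a rank-1 correction so that a given key vector maps exactly to a desired new value.

```python
# Minimal sketch of a rank-1 knowledge edit: W is treated as a key-to-value
# map and receives a rank-1 correction. Illustrative only.
import numpy as np

d_in, d_out = 32, 48
rng = np.random.default_rng(2)
W = rng.normal(size=(d_out, d_in))

k = rng.normal(size=d_in)        # key: representation of the edited subject
v_new = rng.normal(size=d_out)   # desired new value for that key

# Choose u so that (W + u k^T / (k . k)) k = v_new exactly.
u = (v_new - W @ k) / (k @ k)
W_edited = W + np.outer(u, k)

print(np.allclose(W_edited @ k, v_new))  # True: the key now maps to v_new
```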
gadwant
- Initial implementation and experiments.
This project is licensed under the MIT License - see the LICENSE file for details.