A proof-of-concept implementation of head-wise adaptive RoPE (Rotary Position Embedding), in which each attention head learns its own frequency and phase scaling to improve long-context recall and copy-style in-context learning.
This project implements and evaluates head-wise adaptive RoPE variants in which each attention head learns:
- A per-head frequency scale (how fast the rotation angle advances with position)
- A per-head phase offset (an initial rotation phase)
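As a concrete illustration, here is a minimal sketch of how per-head frequency and phase parameters can enter the rotation. The class and parameter names (`AdaptiveRoPE`, `log_freq_scale`, `phase`) are illustrative assumptions, not the actual API in `rope.py`, and the half-dim rotation pairing used here is one common RoPE convention among several:

```python
import torch
import torch.nn as nn

class AdaptiveRoPE(nn.Module):
    """Sketch of head-wise adaptive RoPE (names are assumptions, not the repo's API)."""

    def __init__(self, n_heads: int, head_dim: int, base: float = 10000.0):
        super().__init__()
        # Standard RoPE inverse frequencies, shared across heads.
        inv_freq = base ** (-torch.arange(0, head_dim, 2).float() / head_dim)
        self.register_buffer("inv_freq", inv_freq)
        # Per-head learnables: log-parameterized frequency scale
        # (exp(0) = 1.0 at init, i.e. baseline RoPE) and additive phase offset.
        self.log_freq_scale = nn.Parameter(torch.zeros(n_heads))
        self.phase = nn.Parameter(torch.zeros(n_heads))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, n_heads, seq_len, head_dim)
        b, h, t, d = x.shape
        pos = torch.arange(t, device=x.device, dtype=torch.float32)
        # Rotation angles per head: scale * position * inv_freq + phase
        # -> shape (n_heads, seq_len, head_dim // 2)
        angles = (
            torch.exp(self.log_freq_scale)[:, None, None]
            * pos[None, :, None]
            * self.inv_freq[None, None, :]
            + self.phase[:, None, None]
        )
        cos, sin = angles.cos()[None], angles.sin()[None]
        # Rotate the two halves of each head dimension as (x1, x2) pairs.
        x1, x2 = x[..., : d // 2], x[..., d // 2 :]
        return torch.cat([x1 * cos - x2 * sin, x1 * sin + x2 * cos], dim=-1)
```

In use, the module would be applied to both queries and keys (shaped `(batch, heads, seq, head_dim)`) before the attention score computation; a frequency scale below 1.0 slows a head's rotation, which is the kind of per-head behavior the long-context experiments are designed to surface.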
The repository provides:
- Baseline RoPE implementation
- Head-wise adaptive RoPE with learnable scales and phases
- Synthetic copy and associative recall tasks
- Mechanistic analysis tools (attention patching, head scale analysis)
- Evaluation on context lengths from 2k to 16k
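To make the task setup concrete, here is a hedged sketch of an associative recall batch generator in the spirit of `tasks.py`. The function name, token layout (interleaved key-value pairs followed by a query key), and vocabulary split are illustrative assumptions, not the repo's actual data format:

```python
import torch

def associative_recall_batch(batch_size: int, n_pairs: int, vocab_size: int,
                             device: str = "cpu"):
    """Build a batch of key-value recall sequences: k1 v1 ... kN vN q.

    Keys come from the lower half of the vocabulary, values from the
    upper half; the target is the value paired with the query key.
    """
    half = vocab_size // 2
    # Draw unique keys per row so the queried key has a single answer.
    keys = torch.stack(
        [torch.randperm(half, device=device)[:n_pairs] for _ in range(batch_size)]
    )
    vals = torch.randint(half, vocab_size, (batch_size, n_pairs), device=device)
    # Interleave into k1 v1 k2 v2 ... kN vN.
    seq = torch.stack([keys, vals], dim=-1).reshape(batch_size, 2 * n_pairs)
    # Append a query key drawn from the stored pairs; target is its value.
    idx = torch.randint(0, n_pairs, (batch_size,), device=device)
    rows = torch.arange(batch_size, device=device)
    inputs = torch.cat([seq, keys[rows, idx].unsqueeze(1)], dim=1)
    targets = vals[rows, idx]
    return inputs, targets
```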
Install dependencies:

```bash
pip install -r requirements.txt
```

Example training and evaluation run:

```bash
python train.py --model_size 4 --context_length 2048 --test_length 16384
```

Files:

- `rope.py` - RoPE implementations (baseline and adaptive)
- `model.py` - Transformer model with adaptive RoPE
- `tasks.py` - Synthetic copy and associative recall tasks
- `train.py` - Training and evaluation script
- `analysis.py` - Head analysis and attention patching tools
- `experiments.py` - Experiment configurations and comparisons
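As one example of the head scale analysis listed above, here is a minimal sketch that dumps each layer's learned per-head scales, assuming the `log_freq_scale`/`phase` parameterization from the earlier sketch; the actual inspection utilities in `analysis.py` may differ:

```python
import torch

def dump_head_scales(model: torch.nn.Module) -> None:
    """Print learned per-head frequency scales and phases for every
    adaptive-RoPE module found in the model (parameter names assumed)."""
    for name, module in model.named_modules():
        if hasattr(module, "log_freq_scale") and hasattr(module, "phase"):
            freq_scales = torch.exp(module.log_freq_scale.detach())
            phases = module.phase.detach()
            for h, (s, p) in enumerate(zip(freq_scales.tolist(), phases.tolist())):
                print(f"{name}: head {h} freq_scale={s:.3f} phase={p:.3f}")
```

Heads whose frequency scale drifts well below 1.0 rotate slowly and are plausible candidates for long-range copy behavior, which the attention-patching tools can then probe directly.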