DINOv2's output should have N features, so I think the feature shape should be [B, N, C, H // 14, W // 14]?
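
To make the shape I'm asking about concrete, here is a minimal sketch of the arithmetic in plain Python. The helper name `expected_feature_shape` is hypothetical, and I'm assuming N is the number of feature maps per sample and C is the embedding dimension; the values 384 and 224 are only illustrative. The 14 is the DINOv2 patch size.

```python
PATCH_SIZE = 14  # DINOv2 ViT backbones use 14x14 patches


def expected_feature_shape(batch, num_features, channels, height, width,
                           patch=PATCH_SIZE):
    """Return the (B, N, C, H // patch, W // patch) shape proposed above.

    Hypothetical helper for illustration only; it does no tensor work,
    it just computes the shape from the input sizes.
    """
    return (batch, num_features, channels, height // patch, width // patch)


# Illustrative values (assumptions): B=2, N=4, C=384, 224x224 input.
shape = expected_feature_shape(2, 4, 384, 224, 224)
print(shape)  # (2, 4, 384, 16, 16)
```

So for a 224x224 input the spatial grid would be 16x16, with the N axis sitting right after the batch axis.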