Commit 2eecf81
Alexei Starovoitov
Merge branch 'bpf: Fix to preserve reg parent/live fields when copying range info'
Eduard Zingerman says:
====================
Struct bpf_reg_state is copied directly in several places including:
- check_stack_write_fixed_off() (via save_register_state());
- check_stack_read_fixed_off();
- find_equal_scalars().
However, a literal copy of this struct also copies the following fields:
struct bpf_reg_state {
...
struct bpf_reg_state *parent;
...
enum bpf_reg_liveness live;
...
};
This breaks register parentage chain and liveness marking logic.
The commit message for the first patch has a detailed example.
This patch-set replaces direct copies with a call to a function
copy_register_state(dst,src), which preserves 'parent' and 'live'
fields of the 'dst'.
The fix comes with a significant verifier runtime penalty for some
selftest binaries listed in tools/testing/selftests/bpf/veristat.cfg
and cilium BPF binaries (see [1]):
$ ./veristat -e file,prog,states -C -f 'states_diff>10' master-baseline.log current.log
File Program States (A) States (B) States (DIFF)
-------------------------- -------------------------------- ---------- ---------- ---------------
bpf_host.o tail_handle_ipv4_from_host 231 299 +68 (+29.44%)
bpf_host.o tail_handle_nat_fwd_ipv4 1088 1320 +232 (+21.32%)
bpf_host.o tail_handle_nat_fwd_ipv6 716 729 +13 (+1.82%)
bpf_host.o tail_nodeport_nat_ingress_ipv4 281 314 +33 (+11.74%)
bpf_host.o tail_nodeport_nat_ingress_ipv6 245 256 +11 (+4.49%)
bpf_lxc.o tail_handle_nat_fwd_ipv4 1088 1320 +232 (+21.32%)
bpf_lxc.o tail_handle_nat_fwd_ipv6 716 729 +13 (+1.82%)
bpf_lxc.o tail_ipv4_ct_egress 239 262 +23 (+9.62%)
bpf_lxc.o tail_ipv4_ct_ingress 239 262 +23 (+9.62%)
bpf_lxc.o tail_ipv4_ct_ingress_policy_only 239 262 +23 (+9.62%)
bpf_lxc.o tail_ipv6_ct_egress 181 195 +14 (+7.73%)
bpf_lxc.o tail_ipv6_ct_ingress 181 195 +14 (+7.73%)
bpf_lxc.o tail_ipv6_ct_ingress_policy_only 181 195 +14 (+7.73%)
bpf_lxc.o tail_nodeport_nat_ingress_ipv4 281 314 +33 (+11.74%)
bpf_lxc.o tail_nodeport_nat_ingress_ipv6 245 256 +11 (+4.49%)
bpf_overlay.o tail_handle_nat_fwd_ipv4 799 829 +30 (+3.75%)
bpf_overlay.o tail_nodeport_nat_ingress_ipv4 281 314 +33 (+11.74%)
bpf_overlay.o tail_nodeport_nat_ingress_ipv6 245 256 +11 (+4.49%)
bpf_sock.o cil_sock4_connect 47 70 +23 (+48.94%)
bpf_sock.o cil_sock4_sendmsg 45 68 +23 (+51.11%)
bpf_sock.o cil_sock6_post_bind 31 42 +11 (+35.48%)
bpf_xdp.o tail_lb_ipv4 4413 6457 +2044 (+46.32%)
bpf_xdp.o tail_lb_ipv6 6876 7249 +373 (+5.42%)
test_cls_redirect.bpf.o cls_redirect 4704 4799 +95 (+2.02%)
test_tcp_hdr_options.bpf.o estab 180 206 +26 (+14.44%)
xdp_synproxy_kern.bpf.o syncookie_tc 21059 21485 +426 (+2.02%)
xdp_synproxy_kern.bpf.o syncookie_xdp 21857 23122 +1265 (+5.79%)
-------------------------- -------------------------------- ---------- ---------- ---------------
I looked through verification log for bpf_xdp.o tail_lb_ipv4 program in
order to identify the reason for ~50% visited states increase.
The slowdown is triggered by a difference in handling of three stack slots:
fp-56, fp-72 and fp-80, with the main difference coming from fp-72.
In fact the following change removes all the difference:
@@ -3256,7 +3256,10 @@ static void save_register_state(struct bpf_func_state *state,
{
int i;
- copy_register_state(&state->stack[spi].spilled_ptr, reg);
+ if ((spi == 6 /*56*/ || spi == 8 /*72*/ || spi == 9 /*80*/) && size != BPF_REG_SIZE)
+ state->stack[spi].spilled_ptr = *reg;
+ else
+ copy_register_state(&state->stack[spi].spilled_ptr, reg);
For fp-56 I found the following pattern for divergences between
verification logs with and w/o this patch:
- At some point insn 1862 is reached and checkpoint is created;
- At some other point insn 1862 is reached again:
- with this patch:
- the current state is considered *not* equivalent to the old checkpoint;
- the reason for mismatch is the state of fp-56:
- current state: fp-56=????mmmm
- checkpoint: fp-56_rD=mmmmmmmm
- without this patch the current state is considered equivalent to the
checkpoint, the fp-56 is not present in the checkpoint.
Here is a fragment of the verification log for when the checkpoint in
question created at insn 1862:
checkpoint 1862: ... fp-56=mmmmmmmm ...
1862: ...
1863: ...
1864: (61) r1 = *(u32 *)(r0 +0)
1865: ...
1866: (63) *(u32 *)(r10 -56) = r1 ; R1_w=scalar(...) R10=fp0 fp-56=
1867: (bf) r2 = r10 ; R2_w=fp0 R10=fp0
1868: (07) r2 += -56 ; R2_w=fp-56
; return map_lookup_elem(&LB4_BACKEND_MAP_V2, &backend_id);
1869: (18) r1 = 0xffff888100286000 ; R1_w=map_ptr(off=0,ks=4,vs=8,imm=0)
1871: (85) call bpf_map_lookup_elem#1
- Without this patch:
- at insn 1864 r1 liveness is set to REG_LIVE_WRITTEN;
- at insn 1866 fp-56 liveness is set REG_LIVE_WRITTEN mark because
of the direct r1 copy in save_register_state();
- at insn 1871 REG_LIVE_READ is not propagated to fp-56 at
checkpoint 1862 because of the REG_LIVE_WRITTEN mark;
- eventually fp-56 is pruned from checkpoint at 1862 in
clean_func_state().
- With this patch:
- at insn 1864 r1 liveness is set to REG_LIVE_WRITTEN;
- at insn 1866 fp-56 liveness is *not* set to REG_LIVE_WRITTEN mark
because write size is not equal to BPF_REG_SIZE;
- at insn 1871 REG_LIVE_READ is propagated to fp-56 at checkpoint 1862.
Hence more states have to be visited by verifier with this patch compared
to current master.
Similar patterns could be found for both fp-72 and fp-80, although these
are harder to track trough the log because of a big number of insns between
slot write and bpf_map_lookup_elem() call triggering read mark, boils down
to the following C code:
struct ipv4_frag_id frag_id = {
.daddr = ip4->daddr,
.saddr = ip4->saddr,
.id = ip4->id,
.proto = ip4->protocol,
.pad = 0,
};
...
map_lookup_elem(..., &frag_id);
Where:
- .id is mapped to fp-72, write of size u16;
- .saddr is mapped to fp-80, write of size u32.
This patch-set is a continuation of discussion from [2].
Changes v1 -> v2 (no changes in the code itself):
- added analysis for the tail_lb_ipv4 verification slowdown;
- rebase against fresh master branch.
[1] [email protected]:anakryiko/cilium.git
[2] https://lore.kernel.org/bpf/[email protected]/
====================
Acked-by: Andrii Nakryiko <[email protected]>
Signed-off-by: Alexei Starovoitov <[email protected]>File tree
2 files changed
+54
-7
lines changed- kernel/bpf
- tools/testing/selftests/bpf/verifier
2 files changed
+54
-7
lines changed| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
3243 | 3243 | | |
3244 | 3244 | | |
3245 | 3245 | | |
| 3246 | + | |
| 3247 | + | |
| 3248 | + | |
| 3249 | + | |
| 3250 | + | |
| 3251 | + | |
| 3252 | + | |
| 3253 | + | |
| 3254 | + | |
| 3255 | + | |
| 3256 | + | |
3246 | 3257 | | |
3247 | 3258 | | |
3248 | 3259 | | |
3249 | 3260 | | |
3250 | 3261 | | |
3251 | 3262 | | |
3252 | | - | |
| 3263 | + | |
3253 | 3264 | | |
3254 | 3265 | | |
3255 | 3266 | | |
| |||
3577 | 3588 | | |
3578 | 3589 | | |
3579 | 3590 | | |
3580 | | - | |
| 3591 | + | |
3581 | 3592 | | |
3582 | 3593 | | |
3583 | 3594 | | |
| |||
3598 | 3609 | | |
3599 | 3610 | | |
3600 | 3611 | | |
3601 | | - | |
| 3612 | + | |
3602 | 3613 | | |
3603 | 3614 | | |
3604 | 3615 | | |
| |||
9592 | 9603 | | |
9593 | 9604 | | |
9594 | 9605 | | |
9595 | | - | |
| 9606 | + | |
9596 | 9607 | | |
9597 | 9608 | | |
9598 | 9609 | | |
| |||
10845 | 10856 | | |
10846 | 10857 | | |
10847 | 10858 | | |
10848 | | - | |
| 10859 | + | |
10849 | 10860 | | |
10850 | 10861 | | |
10851 | 10862 | | |
| |||
10856 | 10867 | | |
10857 | 10868 | | |
10858 | 10869 | | |
10859 | | - | |
| 10870 | + | |
10860 | 10871 | | |
10861 | 10872 | | |
10862 | 10873 | | |
| |||
11655 | 11666 | | |
11656 | 11667 | | |
11657 | 11668 | | |
11658 | | - | |
| 11669 | + | |
11659 | 11670 | | |
11660 | 11671 | | |
11661 | 11672 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
225 | 225 | | |
226 | 226 | | |
227 | 227 | | |
| 228 | + | |
| 229 | + | |
| 230 | + | |
| 231 | + | |
| 232 | + | |
| 233 | + | |
| 234 | + | |
| 235 | + | |
| 236 | + | |
| 237 | + | |
| 238 | + | |
| 239 | + | |
| 240 | + | |
| 241 | + | |
| 242 | + | |
| 243 | + | |
| 244 | + | |
| 245 | + | |
| 246 | + | |
| 247 | + | |
| 248 | + | |
| 249 | + | |
| 250 | + | |
| 251 | + | |
| 252 | + | |
| 253 | + | |
| 254 | + | |
| 255 | + | |
| 256 | + | |
| 257 | + | |
| 258 | + | |
| 259 | + | |
| 260 | + | |
| 261 | + | |
| 262 | + | |
| 263 | + | |
0 commit comments