Message ID | 167482734243.892262.18210955230092032606.stgit@firesoul (mailing list archive) |
---|---|
State | RFC |
Delegated to: | BPF |
Headers | show |
Series | [bpf-next,RFC,V1] selftests/bpf: xdp_hw_metadata clear metadata when -EOPNOTSUPP | expand |
Jesper Dangaard Brouer <brouer@redhat.com> writes: > The AF_XDP userspace part of xdp_hw_metadata see non-zero as a signal of > the availability of rx_timestamp and rx_hash in data_meta area. The > kernel-side BPF-prog code doesn't initialize these members when kernel > returns an error e.g. -EOPNOTSUPP. This memory area is not guaranteed to > be zeroed, and can contain garbage/previous values, which will be read > and interpreted by AF_XDP userspace side. > > Tested this on different drivers. The experiences are that for most > packets they will have zeroed this data_meta area, but occasionally it > will contain garbage data. > > Example of failure tested on ixgbe: > poll: 1 (0) > xsk_ring_cons__peek: 1 > 0x18ec788: rx_desc[0]->addr=100000000008000 addr=8100 comp_addr=8000 > rx_hash: 3697961069 > rx_timestamp: 9024981991734834796 (sec:9024981991.7348) > 0x18ec788: complete idx=8 addr=8000 > > Converting to date: > date -d @9024981991 > 2255-12-28T20:26:31 CET > > I choose a simple fix in this patch. When kfunc fails or isn't supported > assign zero to the corresponding struct meta value. > > It's up to the individual BPF-programmer to do something smarter e.g. > that fits their use-case, like getting a software timestamp and marking > a flag that gives the type of timestamp. > > Another possibility is for the behavior of kfunc's > bpf_xdp_metadata_rx_timestamp and bpf_xdp_metadata_rx_hash to require > clearing return value pointer. I definitely think we should leave it up to the BPF programmer to react to failures; that's what the return code is there for, after all :) -Toke
On Fri, Jan 27, 2023 at 5:58 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > > Jesper Dangaard Brouer <brouer@redhat.com> writes: > > > The AF_XDP userspace part of xdp_hw_metadata see non-zero as a signal of > > the availability of rx_timestamp and rx_hash in data_meta area. The > > kernel-side BPF-prog code doesn't initialize these members when kernel > > returns an error e.g. -EOPNOTSUPP. This memory area is not guaranteed to > > be zeroed, and can contain garbage/previous values, which will be read > > and interpreted by AF_XDP userspace side. > > > > Tested this on different drivers. The experiences are that for most > > packets they will have zeroed this data_meta area, but occasionally it > > will contain garbage data. > > > > Example of failure tested on ixgbe: > > poll: 1 (0) > > xsk_ring_cons__peek: 1 > > 0x18ec788: rx_desc[0]->addr=100000000008000 addr=8100 comp_addr=8000 > > rx_hash: 3697961069 > > rx_timestamp: 9024981991734834796 (sec:9024981991.7348) > > 0x18ec788: complete idx=8 addr=8000 > > > > Converting to date: > > date -d @9024981991 > > 2255-12-28T20:26:31 CET > > > > I choose a simple fix in this patch. When kfunc fails or isn't supported > > assign zero to the corresponding struct meta value. > > > > It's up to the individual BPF-programmer to do something smarter e.g. > > that fits their use-case, like getting a software timestamp and marking > > a flag that gives the type of timestamp. > > > > Another possibility is for the behavior of kfunc's > > bpf_xdp_metadata_rx_timestamp and bpf_xdp_metadata_rx_hash to require > > clearing return value pointer. > > I definitely think we should leave it up to the BPF programmer to react > to failures; that's what the return code is there for, after all :) +1 Maybe we can unconditionally memset(meta, sizeof(*meta), 0) in tools/testing/selftests/bpf/progs/xdp_hw_metadata.c? Since it's not a performance tool, it should be ok functionality-wise. > -Toke >
On 27/01/2023 18.18, Stanislav Fomichev wrote: > On Fri, Jan 27, 2023 at 5:58 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: >> >> Jesper Dangaard Brouer <brouer@redhat.com> writes: >> >>> The AF_XDP userspace part of xdp_hw_metadata see non-zero as a signal of >>> the availability of rx_timestamp and rx_hash in data_meta area. The >>> kernel-side BPF-prog code doesn't initialize these members when kernel >>> returns an error e.g. -EOPNOTSUPP. This memory area is not guaranteed to >>> be zeroed, and can contain garbage/previous values, which will be read >>> and interpreted by AF_XDP userspace side. >>> >>> Tested this on different drivers. The experiences are that for most >>> packets they will have zeroed this data_meta area, but occasionally it >>> will contain garbage data. >>> >>> Example of failure tested on ixgbe: >>> poll: 1 (0) >>> xsk_ring_cons__peek: 1 >>> 0x18ec788: rx_desc[0]->addr=100000000008000 addr=8100 comp_addr=8000 >>> rx_hash: 3697961069 >>> rx_timestamp: 9024981991734834796 (sec:9024981991.7348) >>> 0x18ec788: complete idx=8 addr=8000 >>> >>> Converting to date: >>> date -d @9024981991 >>> 2255-12-28T20:26:31 CET >>> >>> I choose a simple fix in this patch. When kfunc fails or isn't supported >>> assign zero to the corresponding struct meta value. >>> >>> It's up to the individual BPF-programmer to do something smarter e.g. >>> that fits their use-case, like getting a software timestamp and marking >>> a flag that gives the type of timestamp. >>> >>> Another possibility is for the behavior of kfunc's >>> bpf_xdp_metadata_rx_timestamp and bpf_xdp_metadata_rx_hash to require >>> clearing return value pointer. >> >> I definitely think we should leave it up to the BPF programmer to react >> to failures; that's what the return code is there for, after all :) > > +1 +1 I agree. We should keep this default functions as simple as possible, for future "unroll" of BPF-bytecode. I the -EOPNOTSUPP case (default functions for drivers not implementing kfunc), will likely be used runtime by BPF-prog to determine if the hardware have this offload hint, but it comes with the overhead of a function pointer call. I hope we can somehow BPF-bytecode "unroll" these (default functions) at BPF-load time, to remove this overhead, and perhaps even let BPF bytecode do const propagation and code elimination? > Maybe we can unconditionally memset(meta, sizeof(*meta), 0) in > tools/testing/selftests/bpf/progs/xdp_hw_metadata.c? > Since it's not a performance tool, it should be ok functionality-wise. I know this isn't a performance test, but IMHO always memsetting metadata area is a misleading example. We know from experience that developer simply copy-paste code examples, even quick-n-dirty testing example code. The specific issue in this example can lead to hard-to-find bugs, as my testing shows it is only occasionally that data_meta area contains garbage. We could do a memset, but it deserves a large code comment, why this is needed, so people copy-pasting understand. I choose current approach to keep code close to code people will copy-paste. --Jesper
On Tue, Jan 31, 2023 at 5:00 AM Jesper Dangaard Brouer <jbrouer@redhat.com> wrote: > > > > On 27/01/2023 18.18, Stanislav Fomichev wrote: > > On Fri, Jan 27, 2023 at 5:58 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote: > >> > >> Jesper Dangaard Brouer <brouer@redhat.com> writes: > >> > >>> The AF_XDP userspace part of xdp_hw_metadata see non-zero as a signal of > >>> the availability of rx_timestamp and rx_hash in data_meta area. The > >>> kernel-side BPF-prog code doesn't initialize these members when kernel > >>> returns an error e.g. -EOPNOTSUPP. This memory area is not guaranteed to > >>> be zeroed, and can contain garbage/previous values, which will be read > >>> and interpreted by AF_XDP userspace side. > >>> > >>> Tested this on different drivers. The experiences are that for most > >>> packets they will have zeroed this data_meta area, but occasionally it > >>> will contain garbage data. > >>> > >>> Example of failure tested on ixgbe: > >>> poll: 1 (0) > >>> xsk_ring_cons__peek: 1 > >>> 0x18ec788: rx_desc[0]->addr=100000000008000 addr=8100 comp_addr=8000 > >>> rx_hash: 3697961069 > >>> rx_timestamp: 9024981991734834796 (sec:9024981991.7348) > >>> 0x18ec788: complete idx=8 addr=8000 > >>> > >>> Converting to date: > >>> date -d @9024981991 > >>> 2255-12-28T20:26:31 CET > >>> > >>> I choose a simple fix in this patch. When kfunc fails or isn't supported > >>> assign zero to the corresponding struct meta value. > >>> > >>> It's up to the individual BPF-programmer to do something smarter e.g. > >>> that fits their use-case, like getting a software timestamp and marking > >>> a flag that gives the type of timestamp. > >>> > >>> Another possibility is for the behavior of kfunc's > >>> bpf_xdp_metadata_rx_timestamp and bpf_xdp_metadata_rx_hash to require > >>> clearing return value pointer. > >> > >> I definitely think we should leave it up to the BPF programmer to react > >> to failures; that's what the return code is there for, after all :) > > > > +1 > > +1 I agree. > We should keep this default functions as simple as possible, for future > "unroll" of BPF-bytecode. > > I the -EOPNOTSUPP case (default functions for drivers not implementing > kfunc), will likely be used runtime by BPF-prog to determine if the > hardware have this offload hint, but it comes with the overhead of a > function pointer call. > > I hope we can somehow BPF-bytecode "unroll" these (default functions) at > BPF-load time, to remove this overhead, and perhaps even let BPF > bytecode do const propagation and code elimination? > > > > Maybe we can unconditionally memset(meta, sizeof(*meta), 0) in > > tools/testing/selftests/bpf/progs/xdp_hw_metadata.c? > > Since it's not a performance tool, it should be ok functionality-wise. > > I know this isn't a performance test, but IMHO always memsetting > metadata area is a misleading example. We know from experience that > developer simply copy-paste code examples, even quick-n-dirty testing > example code. > > The specific issue in this example can lead to hard-to-find bugs, as my > testing shows it is only occasionally that data_meta area contains > garbage. We could do a memset, but it deserves a large code comment, why > this is needed, so people copy-pasting understand. I choose current > approach to keep code close to code people will copy-paste. SG, I don't think it matters, but agreed that having this stated explicitly could help with a blind copy-paste :-) Then maybe repost with the TODO's removed from the kfucs? We seem to agree that it's the user's job to manage the final buffer.. > --Jesper >
diff --git a/net/core/xdp.c b/net/core/xdp.c index a5a7ecf6391c..5ea13554c080 100644 --- a/net/core/xdp.c +++ b/net/core/xdp.c @@ -724,6 +724,7 @@ __diag_ignore_all("-Wmissing-prototypes", */ int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp) { + // XXX: Question: Should we clear mem pointed to by @timestamp ? return -EOPNOTSUPP; } @@ -736,6 +737,7 @@ int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp) */ int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash) { + // XXX: Question: Should we clear mem pointed to by @hash ? return -EOPNOTSUPP; } diff --git a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c index 25b8178735ee..4c55b4d79d3d 100644 --- a/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c +++ b/tools/testing/selftests/bpf/progs/xdp_hw_metadata.c @@ -70,10 +70,14 @@ int rx(struct xdp_md *ctx) } if (!bpf_xdp_metadata_rx_timestamp(ctx, &meta->rx_timestamp)) - bpf_printk("populated rx_timestamp with %u", meta->rx_timestamp); + bpf_printk("populated rx_timestamp with %llu", meta->rx_timestamp); + else + meta->rx_timestamp = 0; /* Used by AF_XDP as not avail signal */ if (!bpf_xdp_metadata_rx_hash(ctx, &meta->rx_hash)) bpf_printk("populated rx_hash with %u", meta->rx_hash); + else + meta->rx_hash = 0; /* Used by AF_XDP as not avail signal */ return bpf_redirect_map(&xsk, ctx->rx_queue_index, XDP_PASS); }
The AF_XDP userspace part of xdp_hw_metadata see non-zero as a signal of the availability of rx_timestamp and rx_hash in data_meta area. The kernel-side BPF-prog code doesn't initialize these members when kernel returns an error e.g. -EOPNOTSUPP. This memory area is not guaranteed to be zeroed, and can contain garbage/previous values, which will be read and interpreted by AF_XDP userspace side. Tested this on different drivers. The experiences are that for most packets they will have zeroed this data_meta area, but occasionally it will contain garbage data. Example of failure tested on ixgbe: poll: 1 (0) xsk_ring_cons__peek: 1 0x18ec788: rx_desc[0]->addr=100000000008000 addr=8100 comp_addr=8000 rx_hash: 3697961069 rx_timestamp: 9024981991734834796 (sec:9024981991.7348) 0x18ec788: complete idx=8 addr=8000 Converting to date: date -d @9024981991 2255-12-28T20:26:31 CET I choose a simple fix in this patch. When kfunc fails or isn't supported assign zero to the corresponding struct meta value. It's up to the individual BPF-programmer to do something smarter e.g. that fits their use-case, like getting a software timestamp and marking a flag that gives the type of timestamp. Another possibility is for the behavior of kfunc's bpf_xdp_metadata_rx_timestamp and bpf_xdp_metadata_rx_hash to require clearing return value pointer. Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com> --- net/core/xdp.c | 2 ++ .../testing/selftests/bpf/progs/xdp_hw_metadata.c | 6 +++++- 2 files changed, 7 insertions(+), 1 deletion(-)