
[bpf-next,V3] xdp: bpf_xdp_metadata use EOPNOTSUPP for no driver support

Message ID 167673444093.2179692.14745621008776172374.stgit@firesoul (mailing list archive)
State Changes Requested
Delegated to: BPF
Series [bpf-next,V3] xdp: bpf_xdp_metadata use EOPNOTSUPP for no driver support

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 4 this patch: 4
netdev/cc_maintainers warning 12 maintainers not CCed: leon@kernel.org john.fastabend@gmail.com pabeni@redhat.com linux-rdma@vger.kernel.org corbet@lwn.net tariqt@nvidia.com linux-doc@vger.kernel.org kuba@kernel.org edumazet@google.com saeedm@nvidia.com hawk@kernel.org davem@davemloft.net
netdev/build_clang success Errors and warnings before: 1 this patch: 1
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 4 this patch: 4
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 85 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-14 success Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-15 fail Logs for test_progs on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32 on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for test_progs_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-32 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-33 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-34 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-35 success Logs for test_verifier on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-37 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-38 success Logs for test_verifier on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_progs_no_alu32_parallel on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-36 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-31 success Logs for test_progs_parallel on s390x with gcc
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-21 fail Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ${{ matrix.test }} on ${{ matrix.arch }} with ${{ matrix.toolchain }}
bpf/vmtest-bpf-next-VM_Test-2 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-3 fail Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-4 fail Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-5 fail Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 success Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-8 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-9 success Logs for set-matrix

Commit Message

Jesper Dangaard Brouer Feb. 18, 2023, 3:34 p.m. UTC
When a driver doesn't implement a bpf_xdp_metadata kfunc, the default
implementation returns EOPNOTSUPP, which indicates that the device driver
doesn't implement this kfunc.

Currently many drivers also return EOPNOTSUPP when the hint isn't
available. Instead, change drivers to return ENODATA in these cases.
There can be natural reasons why a driver doesn't provide any hardware
info for a specific hint, even on a frame-to-frame basis (e.g. PTP).
Let's keep these cases as separate return codes.

When describing the return values, adjust the function kernel-doc layout
to get proper rendering for the return values.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 Documentation/networking/xdp-rx-metadata.rst     |    7 +++++--
 drivers/net/ethernet/mellanox/mlx4/en_rx.c       |    4 ++--
 drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c |    4 ++--
 drivers/net/veth.c                               |    4 ++--
 net/core/xdp.c                                   |   10 ++++++++--
 5 files changed, 19 insertions(+), 10 deletions(-)
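
The convention described in the commit message can be sketched as a small userspace model. This is only an illustration under stated assumptions: `struct frame`, `rx_timestamp_fn`, `drv_rx_timestamp` and `metadata_rx_timestamp` are hypothetical names, not kernel types or functions, and the NULL-hook dispatch merely stands in for the kernel falling back to its default kfunc implementation.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical per-frame state; a real driver reads this from its RX descriptor. */
struct frame {
	bool has_hw_ts;
	uint64_t hw_ts;
};

/* Signature of a timestamp hook in this model (not the kernel's type). */
typedef int (*rx_timestamp_fn)(const struct frame *f, uint64_t *ts);

/* Driver hook: the kfunc is implemented, but this frame may lack a timestamp. */
int drv_rx_timestamp(const struct frame *f, uint64_t *ts)
{
	if (!f->has_hw_ts)
		return -ENODATA;	/* no hint available for this frame */
	*ts = f->hw_ts;
	return 0;
}

/* Dispatch: report -EOPNOTSUPP when the driver provides no hook at all. */
int metadata_rx_timestamp(rx_timestamp_fn hook, const struct frame *f,
			  uint64_t *ts)
{
	if (!hook)
		return -EOPNOTSUPP;	/* driver does not implement the kfunc */
	return hook(f, ts);
}
```

In this model a caller passing `NULL` as the hook always sees `-EOPNOTSUPP`, while a caller passing `drv_rx_timestamp` sees either `0` or `-ENODATA` depending on the frame.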

Comments

Stanislav Fomichev Feb. 21, 2023, 5:13 p.m. UTC | #1
On Sat, Feb 18, 2023 at 7:34 AM Jesper Dangaard Brouer
<brouer@redhat.com> wrote:
>
> When driver doesn't implement a bpf_xdp_metadata kfunc the default
> implementation returns EOPNOTSUPP, which indicate device driver doesn't
> implement this kfunc.
>
> Currently many drivers also return EOPNOTSUPP when the hint isn't
> available. Instead change drivers to return ENODATA in these cases.
> There can be natural cases why a driver doesn't provide any hardware
> info for a specific hint, even on a frame to frame basis (e.g. PTP).
> Lets keep these cases as separate return codes.
>
> When describing the return values, adjust the function kernel-doc layout
> to get proper rendering for the return values.

Acked-by: Stanislav Fomichev <sdf@google.com>

Thanks! ENODATA seems like a better fit for the actual implementation.
Long term probably still makes sense to export this info via xdp-features?
Not sure how long we can 100% ensure EOPNOTSUPP vs ENODATA convention :-)

Martin KaFai Lau Feb. 21, 2023, 7:03 p.m. UTC | #2
On 2/21/23 9:13 AM, Stanislav Fomichev wrote:
> On Sat, Feb 18, 2023 at 7:34 AM Jesper Dangaard Brouer
> <brouer@redhat.com> wrote:
>>
>> When driver doesn't implement a bpf_xdp_metadata kfunc the default
>> implementation returns EOPNOTSUPP, which indicate device driver doesn't
>> implement this kfunc.
>>
>> Currently many drivers also return EOPNOTSUPP when the hint isn't
>> available. Instead change drivers to return ENODATA in these cases.
>> There can be natural cases why a driver doesn't provide any hardware
>> info for a specific hint, even on a frame to frame basis (e.g. PTP).
>> Lets keep these cases as separate return codes.

> Long term probably still makes sense to export this info via xdp-features?
> Not sure how long we can 100% ensure EOPNOTSUPP vs ENODATA convention :-)

I am also not sure that it makes xdp-hints adoption easier for other drivers
to enforce ENODATA, or to dictate which other return values a driver should or
should not return, while EOPNOTSUPP is the more common errno to use. Maybe the
driver experts can prove me wrong here.

iiuc, it is for debugging whether the bpf prog has been patched with the driver's
xdp kfunc. Others have suggested methods like dumping the bpf prog insn. One
could also trace the driver xdp kfunc and see if it is actually called. Why
won't these work?

Besides, it is more of a load-time decision, which should not need a runtime
return error value to decide. E.g. with xdp-features, the bpf prog can check a
global const which can be set based on the query result from xdp-features. The
unused branch will then be removed as dead code by the verifier. This could also
handle an older kernel that does not have xdp-metadata support (i.e. missing
bpf_xdp_metadata_rx_{timestamp,hash}).
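
The load-time approach suggested above can be modelled in plain C as follows. This is a sketch, not BPF code: `struct prog_config`, `model_rx_timestamp` and `count_frame` are hypothetical names, and the config struct stands in for a read-only global that a loader would set before loading the program (in a real BPF program this would typically be a `const volatile` global). How the loader queries xdp-features is left out.

```c
#include <errno.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stats slots. */
enum { HAVE_TS, NO_TS_DATA, STATS_MAX };

/* Stands in for a read-only global that a loader sets before program load. */
struct prog_config {
	bool rx_timestamp_supported;
};

/* Stub for the metadata call: a non-zero frame_ts means a timestamp exists. */
int model_rx_timestamp(uint64_t frame_ts, uint64_t *ts)
{
	if (!frame_ts)
		return -ENODATA;
	*ts = frame_ts;
	return 0;
}

/* Per-packet path: skipped entirely when the feature is known to be absent. */
void count_frame(const struct prog_config *cfg, uint64_t frame_ts,
		 unsigned long stats[STATS_MAX])
{
	uint64_t ts;

	if (!cfg->rx_timestamp_supported)
		return;	/* decided at load time, no per-packet errno check */

	if (model_rx_timestamp(frame_ts, &ts) == 0)
		stats[HAVE_TS]++;
	else
		stats[NO_TS_DATA]++;
}
```

With `rx_timestamp_supported` set to false, no counter is ever touched, so the stats cannot be mistaken for "no packets had timestamps".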

Jesper Dangaard Brouer Feb. 21, 2023, 8:39 p.m. UTC | #3
On 21/02/2023 20.03, Martin KaFai Lau wrote:
> On 2/21/23 9:13 AM, Stanislav Fomichev wrote:
>> On Sat, Feb 18, 2023 at 7:34 AM Jesper Dangaard Brouer
>>>
>>> When driver doesn't implement a bpf_xdp_metadata kfunc the default
>>> implementation returns EOPNOTSUPP, which indicate device driver doesn't
>>> implement this kfunc.
>>>
>>> Currently many drivers also return EOPNOTSUPP when the hint isn't
>>> available. Instead change drivers to return ENODATA in these cases.
>>> There can be natural cases why a driver doesn't provide any hardware
>>> info for a specific hint, even on a frame to frame basis (e.g. PTP).
>>> Lets keep these cases as separate return codes.
> 
>> Long term probably still makes sense to export this info via xdp-features?
>> Not sure how long we can 100% ensure EOPNOTSUPP vs ENODATA convention :-)
> 
> I am also not sure if it makes the xdp-hints adoption easier for other 
> drivers by enforcing ENODATA or what other return values a driver should 
> or should not return while EOPNOTSUPP is a more common errno to use. May 
> be the driver experts can prove me wrong here.

Which is why I suggested an errno (ENODEV) that drivers will not want to
use by accident.

> iiuc, it is for debugging if the bpf prog has been patched with the 
> driver's xdp kfunc. Others have suggested method like dumping the bpf 
> prog insn. It could also trace the driver xdp kfunc and see if it is 
> actually called. Why these won't work?

I regret talking about this as a debugging tool.  IMHO it has steered
the conversation in the wrong direction, sorry.  There are (obviously)
other methods for debugging this.

For me this is more about the API we are giving the BPF-programmer.

There can be natural cases why a driver doesn't provide any hardware
info for a specific hint.  The RX-timestamp is a good practical example,
as often only PTP packets will be timestamped by hardware.

I can write a BPF-prog that creates a stats-map for counting
RX-timestamps, expecting to catch any PTP packets with timestamps.  The
problem is that my stats-map cannot record the difference between EOPNOTSUPP
and ENODATA.  Thus, the user of my RX-timestamps stats program can draw the
wrong conclusion, that there are no packets with (PTP) timestamps, when
this was actually a case of the driver not implementing this.

I hope this simple stats example makes it clearer that the BPF-prog can
make use of this info at runtime.  It is simply a question of keeping these
cases as separate return codes. Is that too much to ask for from an API?

--Jesper

Martin KaFai Lau Feb. 21, 2023, 9:58 p.m. UTC | #4
On 2/21/23 12:39 PM, Jesper Dangaard Brouer wrote:
> For me this is more about the API we are giving the BPF-programmer.
> 
> There can be natural cases why a driver doesn't provide any hardware
> info for a specific hint.  The RX-timestamp is a good practical example,
> as often only PTP packets will be timestamped by hardware.
> 
> I can write a BPF-prog that create a stats-map for counting
> RX-timestamps, expecting to catch any PTP packets with timestamps.  The
> problem is my stats-map cannot record the difference of EOPNOTSUPP vs
> ENODATA.  Thus, the user of my RX-timestamps stats program can draw the
> wrong conclusion, that there are no packets with (PTP) timestamps, when
> this was actually a case of driver not implementing this.
> 
> I hope this simple stats example make is clearer that the BPF-prog can
> make use of this info runtime.  It is simply a question of keeping these
> cases as separate return codes. Is that too much to ask for from an API?

Instead of reserving an errno for this purpose, this can be decided at load
time, rather than repeatedly calling a kfunc that always returns the same
dedicated errno. I still haven't heard why xdp-features + bpf global const
won't work.

Jesper Dangaard Brouer Feb. 22, 2023, 9:49 p.m. UTC | #5
On 21/02/2023 22.58, Martin KaFai Lau wrote:
> On 2/21/23 12:39 PM, Jesper Dangaard Brouer wrote:
>> For me this is more about the API we are giving the BPF-programmer.
>>
>> There can be natural cases why a driver doesn't provide any hardware
>> info for a specific hint.  The RX-timestamp is a good practical example,
>> as often only PTP packets will be timestamped by hardware.
>>
>> I can write a BPF-prog that create a stats-map for counting
>> RX-timestamps, expecting to catch any PTP packets with timestamps.  The
>> problem is my stats-map cannot record the difference of EOPNOTSUPP vs
>> ENODATA.  Thus, the user of my RX-timestamps stats program can draw the
>> wrong conclusion, that there are no packets with (PTP) timestamps, when
>> this was actually a case of driver not implementing this.
>>
>> I hope this simple stats example make is clearer that the BPF-prog can
>> make use of this info runtime.  It is simply a question of keeping these
>> cases as separate return codes. Is that too much to ask for from an API?
> 
> Instead of reserving an errno for this purpose, it can be decided at 
> load time instead of keep calling a kfunc always returning the same 
> dedicated errno. I still don't hear why xdp-features + bpf global const 
> won't work.
> 

Sure, exposing this to xdp-features and combining this with a bpf global
const is a cool idea. It is slightly more work for the BPF-programmer,
but sure, BPF is all about giving the BPF programmer flexibility.

I do feel it is orthogonal to whether the API should return a consistent
errno when the driver doesn't implement the kfunc.

I'm actually hoping that in the future we can achieve dead code
elimination automatically without having to special-case this.
When we do Stanislav's BPF unroll tricks we get a constant, e.g.
EOPNOTSUPP, when the driver doesn't implement the kfunc.  This should allow
the verifier to do dead code elimination, right?

Take my stats example, where I want to count packets both with and
without timestamps, but not miscount packets that actually had a
timestamp when my driver just doesn't support querying it.

Consider program-A:

  int err = bpf_xdp_metadata_rx_timestamp(ctx, &ts);
  if (!err) {
      ts_stats[HAVE_TS]++;
  } else {
      ts_stats[NO_TS_DATA]++;
  }

Program-A clearly has the miscount issue. Const propagation and
dead code elimination would work, but it still miscounts.
Yes, program-A could be extended with the cool idea of xdp-feature
detection that updates a prog const, to solve the issue.

Consider program-B:

  int err = bpf_xdp_metadata_rx_timestamp(ctx, &ts);
  if (!err) {
      ts_stats[HAVE_TS]++;
  } else if (err == -ENODATA) {
      ts_stats[NO_TS_DATA]++;
  }

If I had a separate return code, then I could avoid the miscount, as demonstrated
in program-B.  In this program the const propagation and dead code
elimination would *also* work and still avoid the miscounts.  It should
eliminate any updates to the ts_stats map.
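
The difference between the two programs can be checked with a small userspace model. It is only a sketch: `unsupported_rx_timestamp`, `program_a` and `program_b` are hypothetical stand-ins, with the stub playing the role of a driver that does not implement the kfunc and therefore always reports `-EOPNOTSUPP`.

```c
#include <errno.h>
#include <stdint.h>

/* Hypothetical stats slots, mirroring the ts_stats indexes above. */
enum { HAVE_TS, NO_TS_DATA, STATS_MAX };

/* Stub for a driver without the kfunc: every call reports -EOPNOTSUPP. */
int unsupported_rx_timestamp(uint64_t *ts)
{
	(void)ts;
	return -EOPNOTSUPP;
}

/* Program-A: any error is counted as "no timestamp data". */
void program_a(int err, unsigned long stats[STATS_MAX])
{
	if (!err)
		stats[HAVE_TS]++;
	else
		stats[NO_TS_DATA]++;
}

/* Program-B: only -ENODATA is counted as "no timestamp data". */
void program_b(int err, unsigned long stats[STATS_MAX])
{
	if (!err)
		stats[HAVE_TS]++;
	else if (err == -ENODATA)
		stats[NO_TS_DATA]++;
}
```

Feeding both functions the stub's return value for a few frames leaves program-B's counters untouched, while program-A reports every frame as lacking timestamp data.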

I do get the cool idea of bpf global const, but we will hopefully get
this automatically when we can do BPF unroll.

I hope this makes it clearer why I think it is valuable to "reserve"
an errno for the case when the kfunc isn't implemented by the driver.

Thanks for reading this far,
--Jesper

Martin KaFai Lau Feb. 24, 2023, 7:44 a.m. UTC | #6
On 2/22/23 1:49 PM, Jesper Dangaard Brouer wrote:
> 
> On 21/02/2023 22.58, Martin KaFai Lau wrote:
>> On 2/21/23 12:39 PM, Jesper Dangaard Brouer wrote:
>>> For me this is more about the API we are giving the BPF-programmer.
>>>
>>> There can be natural cases why a driver doesn't provide any hardware
>>> info for a specific hint.  The RX-timestamp is a good practical example,
>>> as often only PTP packets will be timestamped by hardware.
>>>
>>> I can write a BPF-prog that create a stats-map for counting
>>> RX-timestamps, expecting to catch any PTP packets with timestamps.  The
>>> problem is my stats-map cannot record the difference of EOPNOTSUPP vs
>>> ENODATA.  Thus, the user of my RX-timestamps stats program can draw the
>>> wrong conclusion, that there are no packets with (PTP) timestamps, when
>>> this was actually a case of driver not implementing this.
>>>
>>> I hope this simple stats example make is clearer that the BPF-prog can
>>> make use of this info runtime.  It is simply a question of keeping these
>>> cases as separate return codes. Is that too much to ask for from an API?
>>
>> Instead of reserving an errno for this purpose, it can be decided at load time 
>> instead of keep calling a kfunc always returning the same dedicated errno. I 
>> still don't hear why xdp-features + bpf global const won't work.
>>
> 
> Sure, exposing this to xdp-features and combining this with a bpf global
> const is a cool idea, slightly extensive work for the BPF-programmer,
> but sure BPF is all about giving the BPF programmer flexibility.
> 
> I do feel it is orthogonal whether the API should return a consistent
> errno when the driver doesn't implement the kfunc.
> 
> I'm actually hoping in the future that we can achieve dead code
> elimination automatically without having to special case this.
> When we do Stanislav's BPF unroll tricks we get a constant e.g.
> EOPNOTSUPP when driver doesn't implement the kfunc.  This should allow
> the verifier to do deadcode elimination right?
> 
> For my stats example, where I want to count both packets with and
> without timestamps, but not miscount packets that actually had a
> timestamp, but my driver just doesn't support querying this.
> 
> Consider program-A:
> 
>   int err = bpf_xdp_metadata_rx_timestamp(ctx, &ts);
>   if (!err) {
>      ts_stats[HAVE_TS]++;
>   } else {
>      ts_stats[NO_TS_DATA]++;
>   }
> 
> Program-A clearly does the miscount issue. The const propagation and
> deadcode code elimination would work, but is still miscounts.
> Yes, program-A could be extended with the cool idea of xdp-feature
> detection that updates a prog const, for solving the issue.
> 
> Consider program-B:
> 
>   int err = bpf_xdp_metadata_rx_timestamp(ctx, &ts);
>   if (!err) {
>      ts_stats[HAVE_TS]++;
>   } else if (err == -ENODATA) {
>      ts_stats[NO_TS_DATA]++;
>   }
> 
> If I had a separate return, then I can avoid the miscount as demonstrate
> in program-B.  In this program the const propagation and deadcode
> elimination would *also* work and still avoid the miscounts.  It should
> elimination any updates to ts_stats map.
> 
> I do get the cool idea of bpf global const, but we will hopefully get
> this automatically when we can do BPF unroll.

I think the direction is to dual compile a kfunc to native code and bpf code and
to get away from the manual unroll or hand-written bpf insn. I am not sure if the
verifier can (and should) further check whether a compiled bpf subprog always
returns a const scalar to optimize this particular case.

I think enough words have been exchanged on this subject. A few ways (e.g. at
load time) have been suggested to detect it without reserving an errno for an
empty function. Besides, it is hard to miss when the stats are all one-sided
because the driver does not implement an xdp-hint. A quick query of xdp-features
will confirm it. I assume ethtool will be able to check that soon also. That is
what xdp-features is for, instead of reserving a runtime value to detect whether
a driver has implemented each individual xdp feature.

Maybe a tie-break vote is needed.

Patch

diff --git a/Documentation/networking/xdp-rx-metadata.rst b/Documentation/networking/xdp-rx-metadata.rst
index aac63fc2d08b..25ce72af81c2 100644
--- a/Documentation/networking/xdp-rx-metadata.rst
+++ b/Documentation/networking/xdp-rx-metadata.rst
@@ -23,10 +23,13 @@  metadata is supported, this set will grow:
 An XDP program can use these kfuncs to read the metadata into stack
 variables for its own consumption. Or, to pass the metadata on to other
 consumers, an XDP program can store it into the metadata area carried
-ahead of the packet.
+ahead of the packet. Not all packets will necessarily have the requested
+metadata available, in which case the driver returns ``-ENODATA``.
 
 Not all kfuncs have to be implemented by the device driver; when not
-implemented, the default ones that return ``-EOPNOTSUPP`` will be used.
+implemented, the default ones that return ``-EOPNOTSUPP`` will be used
+to indicate the device driver has not implemented this kfunc.
+
 
 Within an XDP frame, the metadata layout (accessed via ``xdp_buff``) is
 as follows::
diff --git a/drivers/net/ethernet/mellanox/mlx4/en_rx.c b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
index 0869d4fff17b..4b5e459b6d49 100644
--- a/drivers/net/ethernet/mellanox/mlx4/en_rx.c
+++ b/drivers/net/ethernet/mellanox/mlx4/en_rx.c
@@ -674,7 +674,7 @@  int mlx4_en_xdp_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp)
 	struct mlx4_en_xdp_buff *_ctx = (void *)ctx;
 
 	if (unlikely(_ctx->ring->hwtstamp_rx_filter != HWTSTAMP_FILTER_ALL))
-		return -EOPNOTSUPP;
+		return -ENODATA;
 
 	*timestamp = mlx4_en_get_hwtstamp(_ctx->mdev,
 					  mlx4_en_get_cqe_ts(_ctx->cqe));
@@ -686,7 +686,7 @@  int mlx4_en_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash)
 	struct mlx4_en_xdp_buff *_ctx = (void *)ctx;
 
 	if (unlikely(!(_ctx->dev->features & NETIF_F_RXHASH)))
-		return -EOPNOTSUPP;
+		return -ENODATA;
 
 	*hash = be32_to_cpu(_ctx->cqe->immed_rss_invalid);
 	return 0;
diff --git a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
index f7d52b1d293b..32c444c01906 100644
--- a/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
+++ b/drivers/net/ethernet/mellanox/mlx5/core/en/xdp.c
@@ -161,7 +161,7 @@  static int mlx5e_xdp_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp)
 	const struct mlx5e_xdp_buff *_ctx = (void *)ctx;
 
 	if (unlikely(!mlx5e_rx_hw_stamp(_ctx->rq->tstamp)))
-		return -EOPNOTSUPP;
+		return -ENODATA;
 
 	*timestamp =  mlx5e_cqe_ts_to_ns(_ctx->rq->ptp_cyc2time,
 					 _ctx->rq->clock, get_cqe_ts(_ctx->cqe));
@@ -173,7 +173,7 @@  static int mlx5e_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash)
 	const struct mlx5e_xdp_buff *_ctx = (void *)ctx;
 
 	if (unlikely(!(_ctx->xdp.rxq->dev->features & NETIF_F_RXHASH)))
-		return -EOPNOTSUPP;
+		return -ENODATA;
 
 	*hash = be32_to_cpu(_ctx->cqe->rss_hash_result);
 	return 0;
diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index 1bb54de7124d..046461ee42ea 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -1610,7 +1610,7 @@  static int veth_xdp_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp)
 	struct veth_xdp_buff *_ctx = (void *)ctx;
 
 	if (!_ctx->skb)
-		return -EOPNOTSUPP;
+		return -ENODATA;
 
 	*timestamp = skb_hwtstamps(_ctx->skb)->hwtstamp;
 	return 0;
@@ -1621,7 +1621,7 @@  static int veth_xdp_rx_hash(const struct xdp_md *ctx, u32 *hash)
 	struct veth_xdp_buff *_ctx = (void *)ctx;
 
 	if (!_ctx->skb)
-		return -EOPNOTSUPP;
+		return -ENODATA;
 
 	*hash = skb_get_hash(_ctx->skb);
 	return 0;
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 26483935b7a4..b71fe21b5c3e 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -721,7 +721,10 @@  __diag_ignore_all("-Wmissing-prototypes",
  * @ctx: XDP context pointer.
  * @timestamp: Return value pointer.
  *
- * Returns 0 on success or ``-errno`` on error.
+ * Return:
+ * * Returns 0 on success or ``-errno`` on error.
+ * * ``-EOPNOTSUPP`` : means device driver does not implement kfunc
+ * * ``-ENODATA``    : means no RX-timestamp available for this frame
  */
 __bpf_kfunc int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp)
 {
@@ -733,7 +736,10 @@  __bpf_kfunc int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, u64 *tim
  * @ctx: XDP context pointer.
  * @hash: Return value pointer.
  *
- * Returns 0 on success or ``-errno`` on error.
+ * Return:
+ *  * Returns 0 on success or ``-errno`` on error.
+ *  * ``-EOPNOTSUPP`` : means device driver doesn't implement kfunc
+ *  * ``-ENODATA``    : means no RX-hash available for this frame
  */
 __bpf_kfunc int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
 {