diff mbox series

[bpf-next,V1] xdp: bpf_xdp_metadata use NODEV for no device support

Message ID 167645577609.1860229.12489295285473044895.stgit@firesoul (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series [bpf-next,V1] xdp: bpf_xdp_metadata use NODEV for no device support | expand

Checks

Context Check Description
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next-VM_Test-4 success Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-9 success Logs for test_maps on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-10 success Logs for test_maps on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-12 success Logs for test_maps on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-13 success Logs for test_maps on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-14 fail Logs for test_progs on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-15 success Logs for test_progs on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-17 success Logs for test_progs on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-18 success Logs for test_progs on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-19 success Logs for test_progs_no_alu32 on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-20 success Logs for test_progs_no_alu32 on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-22 success Logs for test_progs_no_alu32 on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-23 success Logs for test_progs_no_alu32 on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-24 success Logs for test_progs_no_alu32_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-25 success Logs for test_progs_no_alu32_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-27 success Logs for test_progs_no_alu32_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-28 success Logs for test_progs_no_alu32_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-29 success Logs for test_progs_parallel on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-30 success Logs for test_progs_parallel on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-32 success Logs for test_progs_parallel on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-33 success Logs for test_progs_parallel on x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-34 success Logs for test_verifier on aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-35 success Logs for test_verifier on aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-36 success Logs for test_verifier on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-37 success Logs for test_verifier on x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-38 success Logs for test_verifier on x86_64 with llvm-16
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Single patches do not need cover letters
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 4 this patch: 4
netdev/cc_maintainers warning 6 maintainers not CCed: john.fastabend@gmail.com pabeni@redhat.com kuba@kernel.org edumazet@google.com hawk@kernel.org davem@davemloft.net
netdev/build_clang success Errors and warnings before: 1 this patch: 1
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 4 this patch: 4
netdev/checkpatch success total: 0 errors, 0 warnings, 0 checks, 26 lines checked
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-VM_Test-21 success Logs for test_progs_no_alu32 on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-26 success Logs for test_progs_no_alu32_parallel on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-31 success Logs for test_progs_parallel on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-16 success Logs for test_progs on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-11 success Logs for test_maps on s390x with gcc
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-7 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-8 success Logs for set-matrix
bpf/vmtest-bpf-next-VM_Test-2 success Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-3 success Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-5 success Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-6 success Logs for build for x86_64 with llvm-16

Commit Message

Jesper Dangaard Brouer Feb. 15, 2023, 10:09 a.m. UTC
With our XDP-hints kfunc approach, where individual drivers overload the
default implementation, it can be hard for API users to determine
whether or not the current device driver have this kfunc available.

Change the default implementations to use an errno (ENODEV), that
drivers shouldn't return, to make it possible for BPF runtime to
determine if bpf kfunc for xdp metadata isn't implemented by driver.

This is intended to ease supporting and troubleshooting setups. E.g.
when users on mailing list report -19 (ENODEV) as an error, then we can
immediately tell them their kernel is too old.

Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
---
 net/core/xdp.c |    8 ++++++--
 1 file changed, 6 insertions(+), 2 deletions(-)

Comments

Larysa Zaremba Feb. 15, 2023, 3:45 p.m. UTC | #1
On Wed, Feb 15, 2023 at 11:09:36AM +0100, Jesper Dangaard Brouer wrote:
> With our XDP-hints kfunc approach, where individual drivers overload the
> default implementation, it can be hard for API users to determine
> whether or not the current device driver have this kfunc available.
> 
> Change the default implementations to use an errno (ENODEV), that
> drivers shouldn't return, to make it possible for BPF runtime to
> determine if bpf kfunc for xdp metadata isn't implemented by driver.

I think it diverts ENODEV usage from its original purpose too much. Maybe 
providing information in dmesg would be a better solution?

> 
> This is intended to ease supporting and troubleshooting setups. E.g.
> when users on mailing list report -19 (ENODEV) as an error, then we can
> immediately tell them their kernel is too old.

Do you mean driver being too old, not kernel?

> 
> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> ---
>  net/core/xdp.c |    8 ++++++--
>  1 file changed, 6 insertions(+), 2 deletions(-)
> 
> diff --git a/net/core/xdp.c b/net/core/xdp.c
> index 26483935b7a4..7bb5984ae4f7 100644
> --- a/net/core/xdp.c
> +++ b/net/core/xdp.c
> @@ -722,10 +722,12 @@ __diag_ignore_all("-Wmissing-prototypes",
>   * @timestamp: Return value pointer.
>   *
>   * Returns 0 on success or ``-errno`` on error.
> + *
> + *  -ENODEV (19): means device driver doesn't implement kfunc
>   */
>  __bpf_kfunc int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp)
>  {
> -	return -EOPNOTSUPP;
> +	return -ENODEV;
>  }
>  
>  /**
> @@ -734,10 +736,12 @@ __bpf_kfunc int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, u64 *tim
>   * @hash: Return value pointer.
>   *
>   * Returns 0 on success or ``-errno`` on error.
> + *
> + *  -ENODEV (19): means device driver doesn't implement kfunc
>   */
>  __bpf_kfunc int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
>  {
> -	return -EOPNOTSUPP;
> +	return -ENODEV;
>  }
>  
>  __diag_pop();

Documentation contains the following lines:

  Not all kfuncs have to be implemented by the device driver; when not
  implemented, the default ones that return ``-EOPNOTSUPP`` will be used.

If you decide to proceed with current implementation, you'd need to update them 
in v2.

> 
>
Alexander Lobakin Feb. 15, 2023, 5:11 p.m. UTC | #2
From: Zaremba, Larysa <larysa.zaremba@intel.com>
Date: Wed, 15 Feb 2023 16:45:18 +0100

> On Wed, Feb 15, 2023 at 11:09:36AM +0100, Jesper Dangaard Brouer wrote:
>> With our XDP-hints kfunc approach, where individual drivers overload the
>> default implementation, it can be hard for API users to determine
>> whether or not the current device driver have this kfunc available.
>>
>> Change the default implementations to use an errno (ENODEV), that
>> drivers shouldn't return, to make it possible for BPF runtime to
>> determine if bpf kfunc for xdp metadata isn't implemented by driver.
> 
> I think it diverts ENODEV usage from its original purpose too much. Maybe 
> providing information in dmesg would be a better solution?

+1, -%ENODEV shouldn't be used here. It stands for "no device", for
example the driver probing core doesn't treat it as an error or that
something is not supported (rather than there's no device installed
in a slot / on a bus etc.).

> 
>>
>> This is intended to ease supporting and troubleshooting setups. E.g.
>> when users on mailing list report -19 (ENODEV) as an error, then we can
>> immediately tell them their kernel is too old.
> 
> Do you mean driver being too old, not kernel?
> 
>>
>> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
>> ---
[...]

Thanks,
Olek
Jesper Dangaard Brouer Feb. 15, 2023, 5:50 p.m. UTC | #3
On 15/02/2023 18.11, Alexander Lobakin wrote:
> From: Zaremba, Larysa <larysa.zaremba@intel.com>
> Date: Wed, 15 Feb 2023 16:45:18 +0100
> 
>> On Wed, Feb 15, 2023 at 11:09:36AM +0100, Jesper Dangaard Brouer wrote:
>>> With our XDP-hints kfunc approach, where individual drivers overload the
>>> default implementation, it can be hard for API users to determine
>>> whether or not the current device driver have this kfunc available.
>>>
>>> Change the default implementations to use an errno (ENODEV), that
>>> drivers shouldn't return, to make it possible for BPF runtime to
>>> determine if bpf kfunc for xdp metadata isn't implemented by driver.
>>
>> I think it diverts ENODEV usage from its original purpose too much. 

Can you suggest a errno that is a better fit?

>> Maybe providing information in dmesg would be a better solution?

IMHO we really don't want to print any information in this code path, as
this is being executed as part of the BPF-prog. This will lead to
unfortunate latency issues.  Also considering the packet rates this need
to operate at.

> 
> +1, -%ENODEV shouldn't be used here. It stands for "no device", for
> example the driver probing core doesn't treat it as an error or that
> something is not supported (rather than there's no device installed
> in a slot / on a bus etc.).
> 

I wanted to choose something that isn't natural for a device driver
developer to choose as a return code.  I choose the "no device", because
the "device" driver doesn't implement this.

The important part is help ourselves (and support) to reliably determine
if a device driver implements this kfunc or not. I'm not married to the
specific errno.

I hit this issue myself, when developing these kfuncs for igc.  I was
constantly loading and unloading the driver while developing this. And
my kfunc would return -EOPNOTSUPP in some cases, and I couldn't
understand why my code changes was not working, but in reality I was
hitting the default kfunc implementation as it wasn't the correct
version of the driver I had loaded.  It would in practice have save me
time while developing...

Please suggest a better errno if the color is important to you.

>>
>>>
>>> This is intended to ease supporting and troubleshooting setups. E.g.
>>> when users on mailing list report -19 (ENODEV) as an error, then we can
>>> immediately tell them their kernel is too old.
>>
>> Do you mean driver being too old, not kernel?

Sure I guess, I do mean the driver version.

I guess you are thinking in the lines of Intel customers compiling Intel
out-of-tree kernel modules, this will also be practical and ease
troubleshooting for Intel support teams.

>>>
>>> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
>>> ---
> [...]
> 
> Thanks,
> Olek
>
Alexander Lobakin Feb. 16, 2023, 11:33 a.m. UTC | #4
From: Jesper Dangaard Brouer <jbrouer@redhat.com>
Date: Wed, 15 Feb 2023 18:50:10 +0100

> 
> On 15/02/2023 18.11, Alexander Lobakin wrote:
>> From: Zaremba, Larysa <larysa.zaremba@intel.com>
>> Date: Wed, 15 Feb 2023 16:45:18 +0100

[...]

>>> I think it diverts ENODEV usage from its original purpose too much. 
> 
> Can you suggest a errno that is a better fit?
> 
>>> Maybe providing information in dmesg would be a better solution?
> 
> IMHO we really don't want to print any information in this code path, as
> this is being executed as part of the BPF-prog. This will lead to
> unfortunate latency issues.  Also considering the packet rates this need
> to operate at.
> 
>>
>> +1, -%ENODEV shouldn't be used here. It stands for "no device", for
>> example the driver probing core doesn't treat it as an error or that
>> something is not supported (rather than there's no device installed
>> in a slot / on a bus etc.).
>>
> 
> I wanted to choose something that isn't natural for a device driver
> developer to choose as a return code.  I choose the "no device", because
> the "device" driver doesn't implement this.
> 
> The important part is help ourselves (and support) to reliably determine
> if a device driver implements this kfunc or not. I'm not married to the
> specific errno.
> 
> I hit this issue myself, when developing these kfuncs for igc.  I was
> constantly loading and unloading the driver while developing this. And
> my kfunc would return -EOPNOTSUPP in some cases, and I couldn't
> understand why my code changes was not working, but in reality I was
> hitting the default kfunc implementation as it wasn't the correct
> version of the driver I had loaded.  It would in practice have save me
> time while developing...

So you suggest to pick the properly wrong errno only to make the life of
developers who messed up something in their code a bit easier? I see no
issues with using -%EOPNOTSUPP in every case when the driver can't
provide BPF prog with the hints requested by it.
What you suggest is basically something that we usually do locally to
test WIP stuff at early stages.

> 
> Please suggest a better errno if the color is important to you.
> 
>>>
>>>>
>>>> This is intended to ease supporting and troubleshooting setups. E.g.
>>>> when users on mailing list report -19 (ENODEV) as an error, then we can
>>>> immediately tell them their kernel is too old.
>>>
>>> Do you mean driver being too old, not kernel?
> 
> Sure I guess, I do mean the driver version.
> 
> I guess you are thinking in the lines of Intel customers compiling Intel
> out-of-tree kernel modules, this will also be practical and ease
> troubleshooting for Intel support teams.

The last thing our team thinks of is the Intel customers using
out-of-tree drivers xD

> 
>>>>
>>>> Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
>>>> ---
>> [...]
>>
>> Thanks,
>> Olek
>>
>

Thanks,
Olek
Larysa Zaremba Feb. 16, 2023, 12:16 p.m. UTC | #5
On Wed, Feb 15, 2023 at 06:50:10PM +0100, Jesper Dangaard Brouer wrote:
> 
> On 15/02/2023 18.11, Alexander Lobakin wrote:
> > From: Zaremba, Larysa <larysa.zaremba@intel.com>
> > Date: Wed, 15 Feb 2023 16:45:18 +0100
> > 
> > > On Wed, Feb 15, 2023 at 11:09:36AM +0100, Jesper Dangaard Brouer wrote:
> > > > With our XDP-hints kfunc approach, where individual drivers overload the
> > > > default implementation, it can be hard for API users to determine
> > > > whether or not the current device driver have this kfunc available.
> > > > 
> > > > Change the default implementations to use an errno (ENODEV), that
> > > > drivers shouldn't return, to make it possible for BPF runtime to
> > > > determine if bpf kfunc for xdp metadata isn't implemented by driver.
> > > 
> > > I think it diverts ENODEV usage from its original purpose too much.
> 
> Can you suggest a errno that is a better fit?

EOPNOTSUPP fits just fine.

> 
> > > Maybe providing information in dmesg would be a better solution?
> 
> IMHO we really don't want to print any information in this code path, as
> this is being executed as part of the BPF-prog. This will lead to
> unfortunate latency issues.  Also considering the packet rates this need
> to operate at.

I meant printing messages at bpf program load time...
When driver functions are patched-in, you have all the information you may need 
to inform user, if the default implementation for a particular function is used 
instead.

> 
> > 
> > +1, -%ENODEV shouldn't be used here. It stands for "no device", for
> > example the driver probing core doesn't treat it as an error or that
> > something is not supported (rather than there's no device installed
> > in a slot / on a bus etc.).
> > 
> 
> I wanted to choose something that isn't natural for a device driver
> developer to choose as a return code.  I choose the "no device", because
> the "device" driver doesn't implement this.
> 
> The important part is help ourselves (and support) to reliably determine
> if a device driver implements this kfunc or not. I'm not married to the
> specific errno.
> 
> I hit this issue myself, when developing these kfuncs for igc.  I was
> constantly loading and unloading the driver while developing this. And
> my kfunc would return -EOPNOTSUPP in some cases, and I couldn't
> understand why my code changes was not working, but in reality I was
> hitting the default kfunc implementation as it wasn't the correct
> version of the driver I had loaded.  It would in practice have save me
> time while developing...
> 
> Please suggest a better errno if the color is important to you.
> 
> > > 
> > > > 
> > > > This is intended to ease supporting and troubleshooting setups. E.g.
> > > > when users on mailing list report -19 (ENODEV) as an error, then we can
> > > > immediately tell them their kernel is too old.
> > > 
> > > Do you mean driver being too old, not kernel?
> 
> Sure I guess, I do mean the driver version.
> 
> I guess you are thinking in the lines of Intel customers compiling Intel
> out-of-tree kernel modules, this will also be practical and ease
> troubleshooting for Intel support teams.
> 
> > > > 
> > > > Signed-off-by: Jesper Dangaard Brouer <brouer@redhat.com>
> > > > ---
> > [...]
> > 
> > Thanks,
> > Olek
> > 
>
Toke Høiland-Jørgensen Feb. 16, 2023, 2:13 p.m. UTC | #6
Larysa Zaremba <larysa.zaremba@intel.com> writes:

> On Wed, Feb 15, 2023 at 06:50:10PM +0100, Jesper Dangaard Brouer wrote:
>> 
>> On 15/02/2023 18.11, Alexander Lobakin wrote:
>> > From: Zaremba, Larysa <larysa.zaremba@intel.com>
>> > Date: Wed, 15 Feb 2023 16:45:18 +0100
>> > 
>> > > On Wed, Feb 15, 2023 at 11:09:36AM +0100, Jesper Dangaard Brouer wrote:
>> > > > With our XDP-hints kfunc approach, where individual drivers overload the
>> > > > default implementation, it can be hard for API users to determine
>> > > > whether or not the current device driver have this kfunc available.
>> > > > 
>> > > > Change the default implementations to use an errno (ENODEV), that
>> > > > drivers shouldn't return, to make it possible for BPF runtime to
>> > > > determine if bpf kfunc for xdp metadata isn't implemented by driver.
>> > > 
>> > > I think it diverts ENODEV usage from its original purpose too much.
>> 
>> Can you suggest a errno that is a better fit?
>
> EOPNOTSUPP fits just fine.

An alternative to changing the return code of the default kfuncs is also
to just not have the driver functions themselves use that error code? :)

>> > > Maybe providing information in dmesg would be a better solution?
>> 
>> IMHO we really don't want to print any information in this code path, as
>> this is being executed as part of the BPF-prog. This will lead to
>> unfortunate latency issues.  Also considering the packet rates this need
>> to operate at.
>
> I meant printing messages at bpf program load time...
> When driver functions are patched-in, you have all the information you may need 
> to inform user, if the default implementation for a particular function is used 
> instead.

If you dump the byte code with bpftool (using `bpftool prog dump xlated`), the
name of the function being called will be in the output, which is also a
way to detect if the driver kfunc is being called...

-Toke
diff mbox series

Patch

diff --git a/net/core/xdp.c b/net/core/xdp.c
index 26483935b7a4..7bb5984ae4f7 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -722,10 +722,12 @@  __diag_ignore_all("-Wmissing-prototypes",
  * @timestamp: Return value pointer.
  *
  * Returns 0 on success or ``-errno`` on error.
+ *
+ *  -ENODEV (19): means device driver doesn't implement kfunc
  */
 __bpf_kfunc int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, u64 *timestamp)
 {
-	return -EOPNOTSUPP;
+	return -ENODEV;
 }
 
 /**
@@ -734,10 +736,12 @@  __bpf_kfunc int bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx, u64 *tim
  * @hash: Return value pointer.
  *
  * Returns 0 on success or ``-errno`` on error.
+ *
+ *  -ENODEV (19): means device driver doesn't implement kfunc
  */
 __bpf_kfunc int bpf_xdp_metadata_rx_hash(const struct xdp_md *ctx, u32 *hash)
 {
-	return -EOPNOTSUPP;
+	return -ENODEV;
 }
 
 __diag_pop();