diff mbox series

[bpf-next,06/11] xdp: Carry over xdp metadata into skb context

Message ID 20221115030210.3159213-7-sdf@google.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series xdp: hints via kfuncs | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf-next, async
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline fail Detected static functions without inline keyword in header files: 1
netdev/build_32bit fail Errors and warnings before: 5362 this patch: 868
netdev/cc_maintainers warning 5 maintainers not CCed: hawk@kernel.org pabeni@redhat.com davem@davemloft.net edumazet@google.com imagedong@tencent.com
netdev/build_clang fail Errors and warnings before: 1119 this patch: 417
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/check_selftest success No net selftest shell script
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn fail Errors and warnings before: 5526 this patch: 697
netdev/checkpatch warning CHECK: Alignment should match open parenthesis CHECK: Please use a blank line after function/struct/union/enum declarations WARNING: line length of 81 exceeds 80 columns WARNING: line length of 82 exceeds 80 columns WARNING: line length of 83 exceeds 80 columns WARNING: line length of 85 exceeds 80 columns WARNING: line length of 86 exceeds 80 columns WARNING: line length of 92 exceeds 80 columns WARNING: line length of 97 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR fail PR summary
bpf/vmtest-bpf-next-VM_Test-1 success Logs for ${{ matrix.test }} on ${{ matrix.arch }} with ${{ matrix.toolchain }}
bpf/vmtest-bpf-next-VM_Test-2 success Logs for ShellCheck
bpf/vmtest-bpf-next-VM_Test-3 fail Logs for build for aarch64 with gcc
bpf/vmtest-bpf-next-VM_Test-4 fail Logs for build for aarch64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-5 fail Logs for build for s390x with gcc
bpf/vmtest-bpf-next-VM_Test-6 fail Logs for build for x86_64 with gcc
bpf/vmtest-bpf-next-VM_Test-7 fail Logs for build for x86_64 with llvm-16
bpf/vmtest-bpf-next-VM_Test-8 success Logs for llvm-toolchain
bpf/vmtest-bpf-next-VM_Test-9 success Logs for set-matrix

Commit Message

Stanislav Fomichev Nov. 15, 2022, 3:02 a.m. UTC
Implement new bpf_xdp_metadata_export_to_skb kfunc which
prepares compatible xdp metadata for kernel consumption.
This kfunc should be called prior to bpf_redirect
or when XDP_PASS'ing the frame into the kernel (note, the drivers
have to be updated to enable consuming XDP_PASS'ed metadata).

veth driver is amended to consume this metadata when converting to skb.

Internally, XDP_FLAGS_HAS_SKB_METADATA flag is used to indicate
whether the frame has skb metadata. The metadata is currently
stored prior to xdp->data_meta. bpf_xdp_adjust_meta refuses
to work after a call to bpf_xdp_metadata_export_to_skb (can lift
this requirement later on if needed, we'd have to memmove
xdp_skb_metadata).

Cc: John Fastabend <john.fastabend@gmail.com>
Cc: David Ahern <dsahern@gmail.com>
Cc: Martin KaFai Lau <martin.lau@linux.dev>
Cc: Jakub Kicinski <kuba@kernel.org>
Cc: Willem de Bruijn <willemb@google.com>
Cc: Jesper Dangaard Brouer <brouer@redhat.com>
Cc: Anatoly Burakov <anatoly.burakov@intel.com>
Cc: Alexander Lobakin <alexandr.lobakin@intel.com>
Cc: Magnus Karlsson <magnus.karlsson@gmail.com>
Cc: Maryam Tahhan <mtahhan@redhat.com>
Cc: xdp-hints@xdp-project.net
Cc: netdev@vger.kernel.org
Signed-off-by: Stanislav Fomichev <sdf@google.com>
---
 drivers/net/veth.c             |  10 +--
 include/linux/skbuff.h         |   4 +
 include/net/xdp.h              |  17 ++++
 include/uapi/linux/bpf.h       |   7 ++
 kernel/bpf/verifier.c          |  15 ++++
 net/core/filter.c              |  40 +++++++++
 net/core/skbuff.c              |  20 +++++
 net/core/xdp.c                 | 145 +++++++++++++++++++++++++++++++--
 tools/include/uapi/linux/bpf.h |   7 ++
 9 files changed, 255 insertions(+), 10 deletions(-)

Comments

Toke Høiland-Jørgensen Nov. 15, 2022, 11:20 p.m. UTC | #1
> diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> index b444b1118c4f..71e3bc7ad839 100644
> --- a/include/uapi/linux/bpf.h
> +++ b/include/uapi/linux/bpf.h
> @@ -6116,6 +6116,12 @@ enum xdp_action {
>  	XDP_REDIRECT,
>  };
>  
> +/* Subset of XDP metadata exported to skb context.
> + */
> +struct xdp_skb_metadata {
> +	__u64 rx_timestamp;
> +};

Okay, so given Alexei's comment about __randomize_struct not actually
working, I think we need to come up with something else for this. Just
sticking this in a regular UAPI header seems like a bad idea; we'd just
be inviting people to use it as-is.

Do we actually need the full definition here? It's just a pointer
declaration below, so is an opaque forward-definition enough? Then we
could have the full definition in an internal header, moving the full
definition back to being in vmlinux.h only?

-Toke
Stanislav Fomichev Nov. 16, 2022, 3:49 a.m. UTC | #2
On Tue, Nov 15, 2022 at 3:20 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
> > index b444b1118c4f..71e3bc7ad839 100644
> > --- a/include/uapi/linux/bpf.h
> > +++ b/include/uapi/linux/bpf.h
> > @@ -6116,6 +6116,12 @@ enum xdp_action {
> >       XDP_REDIRECT,
> >  };
> >
> > +/* Subset of XDP metadata exported to skb context.
> > + */
> > +struct xdp_skb_metadata {
> > +     __u64 rx_timestamp;
> > +};
>
> Okay, so given Alexei's comment about __randomize_struct not actually
> working, I think we need to come up with something else for this. Just
> sticking this in a regular UAPI header seems like a bad idea; we'd just
> be inviting people to use it as-is.
>
> Do we actually need the full definition here? It's just a pointer
> declaration below, so is an opaque forward-definition enough? Then we
> could have the full definition in an internal header, moving the full
> definition back to being in vmlinux.h only?

Looks like having a uapi-declaration only (and moving the definition
into the kernel headers) might work. At least it does in my limited
testing :-) So let's go with that for now. Alexei, thanks for the
context on __randomize_struct!
kernel test robot Nov. 16, 2022, 4:40 a.m. UTC | #3
Hi Stanislav,

I love your patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Stanislav-Fomichev/xdp-hints-via-kfuncs/20221115-110547
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20221115030210.3159213-7-sdf%40google.com
patch subject: [PATCH bpf-next 06/11] xdp: Carry over xdp metadata into skb context
config: hexagon-randconfig-r031-20221115
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 463da45892e2d2a262277b91b96f5f8c05dc25d0)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/5c7e4e60d13af8491563dd4d2ad5fc684420e540
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Stanislav-Fomichev/xdp-hints-via-kfuncs/20221115-110547
        git checkout 5c7e4e60d13af8491563dd4d2ad5fc684420e540
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon SHELL=/bin/bash net/sched/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from net/sched/cls_flow.c:14:
   In file included from include/linux/skbuff.h:17:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:334:
   include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __raw_readb(PCI_IOBASE + addr);
                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
                                                     ^
   In file included from net/sched/cls_flow.c:14:
   In file included from include/linux/skbuff.h:17:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:334:
   include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
                                                     ^
   In file included from net/sched/cls_flow.c:14:
   In file included from include/linux/skbuff.h:17:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:334:
   include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writeb(value, PCI_IOBASE + addr);
                               ~~~~~~~~~~ ^
   include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   In file included from net/sched/cls_flow.c:17:
   In file included from include/linux/ipv6.h:93:
   In file included from include/linux/tcp.h:19:
   In file included from include/net/sock.h:46:
   In file included from include/linux/netdevice.h:43:
   include/net/xdp.h:449:2: error: void function 'xdp_metadata_export_to_skb' should not return a value [-Wreturn-type]
           return 0;
           ^      ~
>> net/sched/cls_flow.c:63:52: warning: shift count >= width of type [-Wshift-count-overflow]
           return (a & 0xFFFFFFFF) ^ (BITS_PER_LONG > 32 ? a >> 32 : 0);
                                                             ^  ~~
   7 warnings and 1 error generated.


vim +63 net/sched/cls_flow.c

e5dfb815181fcb Patrick McHardy 2008-01-31  58  
e5dfb815181fcb Patrick McHardy 2008-01-31  59  static inline u32 addr_fold(void *addr)
e5dfb815181fcb Patrick McHardy 2008-01-31  60  {
e5dfb815181fcb Patrick McHardy 2008-01-31  61  	unsigned long a = (unsigned long)addr;
e5dfb815181fcb Patrick McHardy 2008-01-31  62  
e5dfb815181fcb Patrick McHardy 2008-01-31 @63  	return (a & 0xFFFFFFFF) ^ (BITS_PER_LONG > 32 ? a >> 32 : 0);
e5dfb815181fcb Patrick McHardy 2008-01-31  64  }
e5dfb815181fcb Patrick McHardy 2008-01-31  65
Martin KaFai Lau Nov. 16, 2022, 7:04 a.m. UTC | #4
On 11/14/22 7:02 PM, Stanislav Fomichev wrote:
> Implement new bpf_xdp_metadata_export_to_skb kfunc which
> prepares compatible xdp metadata for kernel consumption.
> This kfunc should be called prior to bpf_redirect
> or when XDP_PASS'ing the frame into the kernel (note, the drivers
> have to be updated to enable consuming XDP_PASS'ed metadata).
> 
> veth driver is amended to consume this metadata when converting to skb.
> 
> Internally, XDP_FLAGS_HAS_SKB_METADATA flag is used to indicate
> whether the frame has skb metadata. The metadata is currently
> stored prior to xdp->data_meta. bpf_xdp_adjust_meta refuses
> to work after a call to bpf_xdp_metadata_export_to_skb (can lift
> this requirement later on if needed, we'd have to memmove
> xdp_skb_metadata).

It is ok to refuse bpf_xdp_adjust_meta() after bpf_xdp_metadata_export_to_skb() 
for now.  However, it will also need to refuse bpf_xdp_adjust_head().

[ ... ]

> +/* For the packets directed to the kernel, this kfunc exports XDP metadata
> + * into skb context.
> + */
> +noinline int bpf_xdp_metadata_export_to_skb(const struct xdp_md *ctx)
> +{
> +	return 0;
> +}
> +

I think it is still better to return 'struct xdp_skb_metata *' instead of 
true/false.  Like:

noinline struct xdp_skb_metata *bpf_xdp_metadata_export_to_skb(const struct 
xdp_md *ctx)
{
	return 0;
}

The KF_RET_NULL has already been set in 
BTF_SET8_START_GLOBAL(xdp_metadata_kfunc_ids).  There is 
"xdp_btf_struct_access()" that can allow write access to 'struct xdp_skb_metata' 
What else is missing? We can try to solve it.

Then there is no need for this double check in patch 8 selftest which is not 
easy to use:

+               if (bpf_xdp_metadata_export_to_skb(ctx) < 0) {
+                       bpf_printk("bpf_xdp_metadata_export_to_skb failed");
+                       return XDP_DROP;
+               }

[ ... ]

+               skb_metadata = ctx->skb_metadata;
+               if (!skb_metadata) {
+                       bpf_printk("no ctx->skb_metadata");
+                       return XDP_DROP;
+               }

[ ... ]


> diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> index b444b1118c4f..71e3bc7ad839 100644
> --- a/tools/include/uapi/linux/bpf.h
> +++ b/tools/include/uapi/linux/bpf.h
> @@ -6116,6 +6116,12 @@ enum xdp_action {
>   	XDP_REDIRECT,
>   };
>   
> +/* Subset of XDP metadata exported to skb context.
> + */
> +struct xdp_skb_metadata {
> +	__u64 rx_timestamp;
> +};
> +
>   /* user accessible metadata for XDP packet hook
>    * new fields must be added to the end of this structure
>    */
> @@ -6128,6 +6134,7 @@ struct xdp_md {
>   	__u32 rx_queue_index;  /* rxq->queue_index  */
>   
>   	__u32 egress_ifindex;  /* txq->dev->ifindex */
> +	__bpf_md_ptr(struct xdp_skb_metadata *, skb_metadata);

Once the above bpf_xdp_metadata_export_to_skb() returning a pointer works, then 
it can be another kfunc 'struct xdp_skb_metata * bpf_xdp_get_skb_metadata(const 
struct xdp_md *ctx)' to return the skb_metadata which was a similar point 
discussed in the previous RFC.
kernel test robot Nov. 16, 2022, 8:22 a.m. UTC | #5
Hi Stanislav,

I love your patch! Yet something to improve:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Stanislav-Fomichev/xdp-hints-via-kfuncs/20221115-110547
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20221115030210.3159213-7-sdf%40google.com
patch subject: [PATCH bpf-next 06/11] xdp: Carry over xdp metadata into skb context
config: parisc-randconfig-r006-20221115
compiler: hppa-linux-gcc (GCC) 12.1.0
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/5c7e4e60d13af8491563dd4d2ad5fc684420e540
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Stanislav-Fomichev/xdp-hints-via-kfuncs/20221115-110547
        git checkout 5c7e4e60d13af8491563dd4d2ad5fc684420e540
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=gcc-12.1.0 make.cross W=1 O=build_dir ARCH=parisc SHELL=/bin/bash net/core/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from include/linux/netdevice.h:43,
                    from include/linux/if_vlan.h:10,
                    from include/linux/filter.h:20,
                    from net/core/xdp.c:9:
   include/net/xdp.h: In function 'xdp_metadata_export_to_skb':
   include/net/xdp.h:449:16: error: 'return' with a value, in function returning void [-Werror=return-type]
     449 |         return 0;
         |                ^
   include/net/xdp.h:447:13: note: declared here
     447 | static void xdp_metadata_export_to_skb(const struct bpf_prog *prog, struct bpf_patch *patch)
         |             ^~~~~~~~~~~~~~~~~~~~~~~~~~
   net/core/xdp.c: At top level:
   net/core/xdp.c:734:14: warning: no previous prototype for 'bpf_xdp_metadata_export_to_skb' [-Wmissing-prototypes]
     734 | noinline int bpf_xdp_metadata_export_to_skb(const struct xdp_md *ctx)
         |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   net/core/xdp.c:743:14: warning: no previous prototype for 'bpf_xdp_metadata_rx_timestamp_supported' [-Wmissing-prototypes]
     743 | noinline int bpf_xdp_metadata_rx_timestamp_supported(const struct xdp_md *ctx)
         |              ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   net/core/xdp.c:751:10: warning: type qualifiers ignored on function return type [-Wignored-qualifiers]
     751 | noinline const __u64 bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx)
         |          ^~~~~
   net/core/xdp.c:751:22: warning: no previous prototype for 'bpf_xdp_metadata_rx_timestamp' [-Wmissing-prototypes]
     751 | noinline const __u64 bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx)
         |                      ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~
>> net/core/xdp.c:881:6: error: redefinition of 'xdp_metadata_export_to_skb'
     881 | void xdp_metadata_export_to_skb(const struct bpf_prog *prog, struct bpf_patch *patch)
         |      ^~~~~~~~~~~~~~~~~~~~~~~~~~
   include/net/xdp.h:447:13: note: previous definition of 'xdp_metadata_export_to_skb' with type 'void(const struct bpf_prog *, struct bpf_patch *)'
     447 | static void xdp_metadata_export_to_skb(const struct bpf_prog *prog, struct bpf_patch *patch)
         |             ^~~~~~~~~~~~~~~~~~~~~~~~~~
   include/net/xdp.h:447:13: warning: 'xdp_metadata_export_to_skb' defined but not used [-Wunused-function]
   cc1: some warnings being treated as errors


vim +/xdp_metadata_export_to_skb +881 net/core/xdp.c

   872	
   873	static int __init xdp_metadata_init(void)
   874	{
   875		return register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &xdp_metadata_kfunc_set);
   876	}
   877	late_initcall(xdp_metadata_init);
   878	#else
   879	struct btf_id_set8 xdp_metadata_kfunc_ids = {};
   880	EXPORT_SYMBOL(xdp_metadata_kfunc_ids);
 > 881	void xdp_metadata_export_to_skb(const struct bpf_prog *prog, struct bpf_patch *patch)
kernel test robot Nov. 16, 2022, 9:03 a.m. UTC | #6
Hi Stanislav,

I love your patch! Perhaps something to improve:

[auto build test WARNING on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Stanislav-Fomichev/xdp-hints-via-kfuncs/20221115-110547
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20221115030210.3159213-7-sdf%40google.com
patch subject: [PATCH bpf-next 06/11] xdp: Carry over xdp metadata into skb context
config: arm64-randconfig-r001-20221115
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 463da45892e2d2a262277b91b96f5f8c05dc25d0)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # install arm64 cross compiling tool for clang build
        # apt-get install binutils-aarch64-linux-gnu
        # https://github.com/intel-lab-lkp/linux/commit/5c7e4e60d13af8491563dd4d2ad5fc684420e540
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Stanislav-Fomichev/xdp-hints-via-kfuncs/20221115-110547
        git checkout 5c7e4e60d13af8491563dd4d2ad5fc684420e540
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=arm64 SHELL=/bin/bash net/mac80211/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All warnings (new ones prefixed by >>):

   In file included from net/mac80211/s1g.c:6:
   In file included from include/linux/ieee80211.h:20:
   In file included from include/linux/etherdevice.h:21:
   In file included from include/linux/netdevice.h:43:
   include/net/xdp.h:449:2: error: void function 'xdp_metadata_export_to_skb' should not return a value [-Wreturn-type]
           return 0;
           ^      ~
>> net/mac80211/s1g.c:103:36: warning: implicit conversion from 'unsigned long' to '__u16' (aka 'unsigned short') changes value from 18446744073709551614 to 65534 [-Wconstant-conversion]
           twt_agrt->req_type &= cpu_to_le16(~IEEE80211_TWT_REQTYPE_REQUEST);
                                 ~~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
   include/linux/byteorder/generic.h:90:21: note: expanded from macro 'cpu_to_le16'
   #define cpu_to_le16 __cpu_to_le16
                       ^
   include/uapi/linux/byteorder/big_endian.h:36:53: note: expanded from macro '__cpu_to_le16'
   #define __cpu_to_le16(x) ((__force __le16)__swab16((x)))
                                             ~~~~~~~~~~^~~
   include/uapi/linux/swab.h:107:12: note: expanded from macro '__swab16'
           __fswab16(x))
           ~~~~~~~~~ ^
   1 warning and 1 error generated.


vim +103 net/mac80211/s1g.c

f5a4c24e689f54 Lorenzo Bianconi 2021-08-23   94  
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23   95  static void
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23   96  ieee80211_s1g_rx_twt_setup(struct ieee80211_sub_if_data *sdata,
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23   97  			   struct sta_info *sta, struct sk_buff *skb)
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23   98  {
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23   99  	struct ieee80211_mgmt *mgmt = (void *)skb->data;
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23  100  	struct ieee80211_twt_setup *twt = (void *)mgmt->u.action.u.s1g.variable;
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23  101  	struct ieee80211_twt_params *twt_agrt = (void *)twt->params;
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23  102  
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23 @103  	twt_agrt->req_type &= cpu_to_le16(~IEEE80211_TWT_REQTYPE_REQUEST);
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23  104  
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23  105  	/* broadcast TWT not supported yet */
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23  106  	if (twt->control & IEEE80211_TWT_CONTROL_NEG_TYPE_BROADCAST) {
7ff379ba2d4b7b Johannes Berg    2021-09-27  107  		twt_agrt->req_type &=
7ff379ba2d4b7b Johannes Berg    2021-09-27  108  			~cpu_to_le16(IEEE80211_TWT_REQTYPE_SETUP_CMD);
7ff379ba2d4b7b Johannes Berg    2021-09-27  109  		twt_agrt->req_type |=
7ff379ba2d4b7b Johannes Berg    2021-09-27  110  			le16_encode_bits(TWT_SETUP_CMD_REJECT,
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23  111  					 IEEE80211_TWT_REQTYPE_SETUP_CMD);
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23  112  		goto out;
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23  113  	}
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23  114  
30ac96f7cc973b Howard Hsu       2022-10-27  115  	/* TWT Information not supported yet */
30ac96f7cc973b Howard Hsu       2022-10-27  116  	twt->control |= IEEE80211_TWT_CONTROL_RX_DISABLED;
30ac96f7cc973b Howard Hsu       2022-10-27  117  
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23  118  	drv_add_twt_setup(sdata->local, sdata, &sta->sta, twt);
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23  119  out:
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23  120  	ieee80211_s1g_send_twt_setup(sdata, mgmt->sa, sdata->vif.addr, twt);
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23  121  }
f5a4c24e689f54 Lorenzo Bianconi 2021-08-23  122
Toke Høiland-Jørgensen Nov. 16, 2022, 9:30 a.m. UTC | #7
Stanislav Fomichev <sdf@google.com> writes:

> On Tue, Nov 15, 2022 at 3:20 PM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> > diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
>> > index b444b1118c4f..71e3bc7ad839 100644
>> > --- a/include/uapi/linux/bpf.h
>> > +++ b/include/uapi/linux/bpf.h
>> > @@ -6116,6 +6116,12 @@ enum xdp_action {
>> >       XDP_REDIRECT,
>> >  };
>> >
>> > +/* Subset of XDP metadata exported to skb context.
>> > + */
>> > +struct xdp_skb_metadata {
>> > +     __u64 rx_timestamp;
>> > +};
>>
>> Okay, so given Alexei's comment about __randomize_struct not actually
>> working, I think we need to come up with something else for this. Just
>> sticking this in a regular UAPI header seems like a bad idea; we'd just
>> be inviting people to use it as-is.
>>
>> Do we actually need the full definition here? It's just a pointer
>> declaration below, so is an opaque forward-definition enough? Then we
>> could have the full definition in an internal header, moving the full
>> definition back to being in vmlinux.h only?
>
> Looks like having a uapi-declaration only (and moving the definition
> into the kernel headers) might work. At least it does in my limited
> testing :-) So let's go with that for now. Alexei, thanks for the
> context on __randomize_struct!

Sounds good! :)

-Toke
Toke Høiland-Jørgensen Nov. 16, 2022, 9:48 a.m. UTC | #8
Martin KaFai Lau <martin.lau@linux.dev> writes:

> On 11/14/22 7:02 PM, Stanislav Fomichev wrote:
>> Implement new bpf_xdp_metadata_export_to_skb kfunc which
>> prepares compatible xdp metadata for kernel consumption.
>> This kfunc should be called prior to bpf_redirect
>> or when XDP_PASS'ing the frame into the kernel (note, the drivers
>> have to be updated to enable consuming XDP_PASS'ed metadata).
>> 
>> veth driver is amended to consume this metadata when converting to skb.
>> 
>> Internally, XDP_FLAGS_HAS_SKB_METADATA flag is used to indicate
>> whether the frame has skb metadata. The metadata is currently
>> stored prior to xdp->data_meta. bpf_xdp_adjust_meta refuses
>> to work after a call to bpf_xdp_metadata_export_to_skb (can lift
>> this requirement later on if needed, we'd have to memmove
>> xdp_skb_metadata).
>
> It is ok to refuse bpf_xdp_adjust_meta() after bpf_xdp_metadata_export_to_skb() 
> for now.  However, it will also need to refuse bpf_xdp_adjust_head().

I'm also OK with deferring this, although I'm wondering if it isn't just
as easy to just add the memmove() straight away? :)

-Toke
kernel test robot Nov. 16, 2022, 1:46 p.m. UTC | #9
Hi Stanislav,

I love your patch! Yet something to improve:

[auto build test ERROR on bpf-next/master]

url:    https://github.com/intel-lab-lkp/linux/commits/Stanislav-Fomichev/xdp-hints-via-kfuncs/20221115-110547
base:   https://git.kernel.org/pub/scm/linux/kernel/git/bpf/bpf-next.git master
patch link:    https://lore.kernel.org/r/20221115030210.3159213-7-sdf%40google.com
patch subject: [PATCH bpf-next 06/11] xdp: Carry over xdp metadata into skb context
config: hexagon-randconfig-r041-20221115
compiler: clang version 16.0.0 (https://github.com/llvm/llvm-project 463da45892e2d2a262277b91b96f5f8c05dc25d0)
reproduce (this is a W=1 build):
        wget https://raw.githubusercontent.com/intel/lkp-tests/master/sbin/make.cross -O ~/bin/make.cross
        chmod +x ~/bin/make.cross
        # https://github.com/intel-lab-lkp/linux/commit/5c7e4e60d13af8491563dd4d2ad5fc684420e540
        git remote add linux-review https://github.com/intel-lab-lkp/linux
        git fetch --no-tags linux-review Stanislav-Fomichev/xdp-hints-via-kfuncs/20221115-110547
        git checkout 5c7e4e60d13af8491563dd4d2ad5fc684420e540
        # save the config file
        mkdir build_dir && cp config build_dir/.config
        COMPILER_INSTALL_PATH=$HOME/0day COMPILER=clang make.cross W=1 O=build_dir ARCH=hexagon SHELL=/bin/bash net/core/

If you fix the issue, kindly add following tag where applicable
| Reported-by: kernel test robot <lkp@intel.com>

All errors (new ones prefixed by >>):

   In file included from net/core/dev_addr_lists_test.c:4:
   In file included from include/linux/etherdevice.h:20:
   In file included from include/linux/if_ether.h:19:
   In file included from include/linux/skbuff.h:17:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:334:
   include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __raw_readb(PCI_IOBASE + addr);
                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
                                                     ^
   In file included from net/core/dev_addr_lists_test.c:4:
   In file included from include/linux/etherdevice.h:20:
   In file included from include/linux/if_ether.h:19:
   In file included from include/linux/skbuff.h:17:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:334:
   include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
                                                     ^
   In file included from net/core/dev_addr_lists_test.c:4:
   In file included from include/linux/etherdevice.h:20:
   In file included from include/linux/if_ether.h:19:
   In file included from include/linux/skbuff.h:17:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:334:
   include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writeb(value, PCI_IOBASE + addr);
                               ~~~~~~~~~~ ^
   include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   In file included from net/core/dev_addr_lists_test.c:4:
   In file included from include/linux/etherdevice.h:21:
   In file included from include/linux/netdevice.h:43:
>> include/net/xdp.h:449:2: error: void function 'xdp_metadata_export_to_skb' should not return a value [-Wreturn-type]
           return 0;
           ^      ~
   6 warnings and 1 error generated.
--
   In file included from net/core/xdp.c:9:
   In file included from include/linux/filter.h:12:
   In file included from include/linux/skbuff.h:17:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:334:
   include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __raw_readb(PCI_IOBASE + addr);
                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
                                                     ^
   In file included from net/core/xdp.c:9:
   In file included from include/linux/filter.h:12:
   In file included from include/linux/skbuff.h:17:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:334:
   include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
                                                     ^
   In file included from net/core/xdp.c:9:
   In file included from include/linux/filter.h:12:
   In file included from include/linux/skbuff.h:17:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:334:
   include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writeb(value, PCI_IOBASE + addr);
                               ~~~~~~~~~~ ^
   include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   In file included from net/core/xdp.c:9:
   In file included from include/linux/filter.h:20:
   In file included from include/linux/if_vlan.h:10:
   In file included from include/linux/netdevice.h:43:
>> include/net/xdp.h:449:2: error: void function 'xdp_metadata_export_to_skb' should not return a value [-Wreturn-type]
           return 0;
           ^      ~
   net/core/xdp.c:734:14: warning: no previous prototype for function 'bpf_xdp_metadata_export_to_skb' [-Wmissing-prototypes]
   noinline int bpf_xdp_metadata_export_to_skb(const struct xdp_md *ctx)
                ^
   net/core/xdp.c:734:10: note: declare 'static' if the function is not intended to be used outside of this translation unit
   noinline int bpf_xdp_metadata_export_to_skb(const struct xdp_md *ctx)
            ^
            static 
   net/core/xdp.c:743:14: warning: no previous prototype for function 'bpf_xdp_metadata_rx_timestamp_supported' [-Wmissing-prototypes]
   noinline int bpf_xdp_metadata_rx_timestamp_supported(const struct xdp_md *ctx)
                ^
   net/core/xdp.c:743:10: note: declare 'static' if the function is not intended to be used outside of this translation unit
   noinline int bpf_xdp_metadata_rx_timestamp_supported(const struct xdp_md *ctx)
            ^
            static 
   net/core/xdp.c:751:10: warning: 'const' type qualifier on return type has no effect [-Wignored-qualifiers]
   noinline const __u64 bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx)
            ^~~~~~
   net/core/xdp.c:751:22: warning: no previous prototype for function 'bpf_xdp_metadata_rx_timestamp' [-Wmissing-prototypes]
   noinline const __u64 bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx)
                        ^
   net/core/xdp.c:751:16: note: declare 'static' if the function is not intended to be used outside of this translation unit
   noinline const __u64 bpf_xdp_metadata_rx_timestamp(const struct xdp_md *ctx)
                  ^
                  static 
   net/core/xdp.c:881:6: error: redefinition of 'xdp_metadata_export_to_skb'
   void xdp_metadata_export_to_skb(const struct bpf_prog *prog, struct bpf_patch *patch)
        ^
   include/net/xdp.h:447:13: note: previous definition is here
   static void xdp_metadata_export_to_skb(const struct bpf_prog *prog, struct bpf_patch *patch)
               ^
   10 warnings and 2 errors generated.
--
   In file included from net/core/net-traces.c:8:
   In file included from include/linux/netdevice.h:38:
   In file included from include/net/net_namespace.h:43:
   In file included from include/linux/skbuff.h:17:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:334:
   include/asm-generic/io.h:547:31: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __raw_readb(PCI_IOBASE + addr);
                             ~~~~~~~~~~ ^
   include/asm-generic/io.h:560:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le16_to_cpu((__le16 __force)__raw_readw(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:37:51: note: expanded from macro '__le16_to_cpu'
   #define __le16_to_cpu(x) ((__force __u16)(__le16)(x))
                                                     ^
   In file included from net/core/net-traces.c:8:
   In file included from include/linux/netdevice.h:38:
   In file included from include/net/net_namespace.h:43:
   In file included from include/linux/skbuff.h:17:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:334:
   include/asm-generic/io.h:573:61: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           val = __le32_to_cpu((__le32 __force)__raw_readl(PCI_IOBASE + addr));
                                                           ~~~~~~~~~~ ^
   include/uapi/linux/byteorder/little_endian.h:35:51: note: expanded from macro '__le32_to_cpu'
   #define __le32_to_cpu(x) ((__force __u32)(__le32)(x))
                                                     ^
   In file included from net/core/net-traces.c:8:
   In file included from include/linux/netdevice.h:38:
   In file included from include/net/net_namespace.h:43:
   In file included from include/linux/skbuff.h:17:
   In file included from include/linux/bvec.h:10:
   In file included from include/linux/highmem.h:12:
   In file included from include/linux/hardirq.h:11:
   In file included from ./arch/hexagon/include/generated/asm/hardirq.h:1:
   In file included from include/asm-generic/hardirq.h:17:
   In file included from include/linux/irq.h:20:
   In file included from include/linux/io.h:13:
   In file included from arch/hexagon/include/asm/io.h:334:
   include/asm-generic/io.h:584:33: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writeb(value, PCI_IOBASE + addr);
                               ~~~~~~~~~~ ^
   include/asm-generic/io.h:594:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writew((u16 __force)cpu_to_le16(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   include/asm-generic/io.h:604:59: warning: performing pointer arithmetic on a null pointer has undefined behavior [-Wnull-pointer-arithmetic]
           __raw_writel((u32 __force)cpu_to_le32(value), PCI_IOBASE + addr);
                                                         ~~~~~~~~~~ ^
   In file included from net/core/net-traces.c:8:
   In file included from include/linux/netdevice.h:43:
>> include/net/xdp.h:449:2: error: void function 'xdp_metadata_export_to_skb' should not return a value [-Wreturn-type]
           return 0;
           ^      ~
   In file included from net/core/net-traces.c:50:
   In file included from include/trace/events/neigh.h:255:
   In file included from include/trace/define_trace.h:102:
   In file included from include/trace/trace_events.h:419:
   include/trace/events/neigh.h:42:20: warning: variable 'pin6' set but not used [-Wunused-but-set-variable]
                   struct in6_addr *pin6;
                                    ^
   In file included from net/core/net-traces.c:50:
   In file included from include/trace/events/neigh.h:255:
   In file included from include/trace/define_trace.h:103:
   In file included from include/trace/perf.h:113:
   include/trace/events/neigh.h:42:20: warning: variable 'pin6' set but not used [-Wunused-but-set-variable]
                   struct in6_addr *pin6;
                                    ^
   8 warnings and 1 error generated.


vim +/xdp_metadata_export_to_skb +449 include/net/xdp.h

   437	
   438	#ifdef CONFIG_DEBUG_INFO_BTF
   439	extern struct btf_id_set8 xdp_metadata_kfunc_ids;
   440	static inline u32 xdp_metadata_kfunc_id(int id)
   441	{
   442		return xdp_metadata_kfunc_ids.pairs[id].id;
   443	}
   444	void xdp_metadata_export_to_skb(const struct bpf_prog *prog, struct bpf_patch *patch);
   445	#else
   446	static inline u32 xdp_metadata_kfunc_id(int id) { return 0; }
   447	static void xdp_metadata_export_to_skb(const struct bpf_prog *prog, struct bpf_patch *patch)
   448	{
 > 449		return 0;
   450	}
   451	#endif
   452
Stanislav Fomichev Nov. 16, 2022, 8:51 p.m. UTC | #10
On Wed, Nov 16, 2022 at 1:48 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Martin KaFai Lau <martin.lau@linux.dev> writes:
>
> > On 11/14/22 7:02 PM, Stanislav Fomichev wrote:
> >> Implement new bpf_xdp_metadata_export_to_skb kfunc which
> >> prepares compatible xdp metadata for kernel consumption.
> >> This kfunc should be called prior to bpf_redirect
> >> or when XDP_PASS'ing the frame into the kernel (note, the drivers
> >> have to be updated to enable consuming XDP_PASS'ed metadata).
> >>
> >> veth driver is amended to consume this metadata when converting to skb.
> >>
> >> Internally, XDP_FLAGS_HAS_SKB_METADATA flag is used to indicate
> >> whether the frame has skb metadata. The metadata is currently
> >> stored prior to xdp->data_meta. bpf_xdp_adjust_meta refuses
> >> to work after a call to bpf_xdp_metadata_export_to_skb (can lift
> >> this requirement later on if needed, we'd have to memmove
> >> xdp_skb_metadata).
> >
> > It is ok to refuse bpf_xdp_adjust_meta() after bpf_xdp_metadata_export_to_skb()
> > for now.  However, it will also need to refuse bpf_xdp_adjust_head().
>
> I'm also OK with deferring this, although I'm wondering if it isn't just
> as easy to just add the memmove() straight away? :)

SG, let me try that! Martin also mentioned bpf_xdp_adjust_head needs
to be taken care of..
Stanislav Fomichev Nov. 16, 2022, 8:51 p.m. UTC | #11
On Tue, Nov 15, 2022 at 11:04 PM Martin KaFai Lau <martin.lau@linux.dev> wrote:
>
> On 11/14/22 7:02 PM, Stanislav Fomichev wrote:
> > Implement new bpf_xdp_metadata_export_to_skb kfunc which
> > prepares compatible xdp metadata for kernel consumption.
> > This kfunc should be called prior to bpf_redirect
> > or when XDP_PASS'ing the frame into the kernel (note, the drivers
> > have to be updated to enable consuming XDP_PASS'ed metadata).
> >
> > veth driver is amended to consume this metadata when converting to skb.
> >
> > Internally, XDP_FLAGS_HAS_SKB_METADATA flag is used to indicate
> > whether the frame has skb metadata. The metadata is currently
> > stored prior to xdp->data_meta. bpf_xdp_adjust_meta refuses
> > to work after a call to bpf_xdp_metadata_export_to_skb (can lift
> > this requirement later on if needed, we'd have to memmove
> > xdp_skb_metadata).
>
> It is ok to refuse bpf_xdp_adjust_meta() after bpf_xdp_metadata_export_to_skb()
> for now.  However, it will also need to refuse bpf_xdp_adjust_head().

Good catch!

> [ ... ]
>
> > +/* For the packets directed to the kernel, this kfunc exports XDP metadata
> > + * into skb context.
> > + */
> > +noinline int bpf_xdp_metadata_export_to_skb(const struct xdp_md *ctx)
> > +{
> > +     return 0;
> > +}
> > +
>
> I think it is still better to return 'struct xdp_skb_metata *' instead of
> true/false.  Like:
>
> noinline struct xdp_skb_metata *bpf_xdp_metadata_export_to_skb(const struct
> xdp_md *ctx)
> {
>         return 0;
> }
>
> The KF_RET_NULL has already been set in
> BTF_SET8_START_GLOBAL(xdp_metadata_kfunc_ids).  There is
> "xdp_btf_struct_access()" that can allow write access to 'struct xdp_skb_metata'
> What else is missing? We can try to solve it.

Ah, that KF_RET_NULL is a left-over from my previous attempt to return
pointers :-)
I can retry with returning a pointer, I don't have a preference, but I
felt like returning simple -errno might be more simple api-wise.
(with bpf_xdp_metadata_export_to_skb returning NULL vs return loggable
errno - I'd prefer the latter, but not strongly)

> Then there is no need for this double check in patch 8 selftest which is not
> easy to use:
>
> +               if (bpf_xdp_metadata_export_to_skb(ctx) < 0) {
> +                       bpf_printk("bpf_xdp_metadata_export_to_skb failed");
> +                       return XDP_DROP;
> +               }
>
> [ ... ]
>
> +               skb_metadata = ctx->skb_metadata;
> +               if (!skb_metadata) {
> +                       bpf_printk("no ctx->skb_metadata");
> +                       return XDP_DROP;
> +               }
>
> [ ... ]
>
>
> > diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
> > index b444b1118c4f..71e3bc7ad839 100644
> > --- a/tools/include/uapi/linux/bpf.h
> > +++ b/tools/include/uapi/linux/bpf.h
> > @@ -6116,6 +6116,12 @@ enum xdp_action {
> >       XDP_REDIRECT,
> >   };
> >
> > +/* Subset of XDP metadata exported to skb context.
> > + */
> > +struct xdp_skb_metadata {
> > +     __u64 rx_timestamp;
> > +};
> > +
> >   /* user accessible metadata for XDP packet hook
> >    * new fields must be added to the end of this structure
> >    */
> > @@ -6128,6 +6134,7 @@ struct xdp_md {
> >       __u32 rx_queue_index;  /* rxq->queue_index  */
> >
> >       __u32 egress_ifindex;  /* txq->dev->ifindex */
> > +     __bpf_md_ptr(struct xdp_skb_metadata *, skb_metadata);
>
> Once the above bpf_xdp_metadata_export_to_skb() returning a pointer works, then
> it can be another kfunc 'struct xdp_skb_metata * bpf_xdp_get_skb_metadata(const
> struct xdp_md *ctx)' to return the skb_metadata which was a similar point
> discussed in the previous RFC.

I see. I think you've mentioned it previously somewhere to have a
kfunc accessor vs uapi field and I totally forgot. Will try to address
it, ty!
Jakub Kicinski Nov. 16, 2022, 9:12 p.m. UTC | #12
On Mon, 14 Nov 2022 19:02:05 -0800 Stanislav Fomichev wrote:
> Implement new bpf_xdp_metadata_export_to_skb kfunc which
> prepares compatible xdp metadata for kernel consumption.
> This kfunc should be called prior to bpf_redirect
> or when XDP_PASS'ing the frame into the kernel (note, the drivers
> have to be updated to enable consuming XDP_PASS'ed metadata).
> 
> veth driver is amended to consume this metadata when converting to skb.
> 
> Internally, XDP_FLAGS_HAS_SKB_METADATA flag is used to indicate
> whether the frame has skb metadata. The metadata is currently
> stored prior to xdp->data_meta. bpf_xdp_adjust_meta refuses
> to work after a call to bpf_xdp_metadata_export_to_skb (can lift
> this requirement later on if needed, we'd have to memmove
> xdp_skb_metadata).

IMO we should split the xdp -> skb work from the pure HW data access 
in XDP. We have a proof point here that there is a way of building 
on top of the first 5 patches to achieve the objective, and that's
sufficient to let the prior work going in.

..because some of us may not agree that we should be pushing in a
fixed-format structure that's also listed in uAPI. And prefer to,
say, let the user define their format and add a call point for a
BPF prog to populate the skb from whatever data they decided to stash...

That's how I interpreted some of John's comments as well, but I may be
wrong.

Either way, there is no technical reason for the xdp -> skb field
propagation to hold up the HW descriptor access, right? If anything 
we may be able to make quicker progress if prior work is in tree
and multiple people can hack...
Martin KaFai Lau Nov. 16, 2022, 9:49 p.m. UTC | #13
On 11/16/22 1:12 PM, Jakub Kicinski wrote:
> On Mon, 14 Nov 2022 19:02:05 -0800 Stanislav Fomichev wrote:
>> Implement new bpf_xdp_metadata_export_to_skb kfunc which
>> prepares compatible xdp metadata for kernel consumption.
>> This kfunc should be called prior to bpf_redirect
>> or when XDP_PASS'ing the frame into the kernel (note, the drivers
>> have to be updated to enable consuming XDP_PASS'ed metadata).
>>
>> veth driver is amended to consume this metadata when converting to skb.
>>
>> Internally, XDP_FLAGS_HAS_SKB_METADATA flag is used to indicate
>> whether the frame has skb metadata. The metadata is currently
>> stored prior to xdp->data_meta. bpf_xdp_adjust_meta refuses
>> to work after a call to bpf_xdp_metadata_export_to_skb (can lift
>> this requirement later on if needed, we'd have to memmove
>> xdp_skb_metadata).
> 
> IMO we should split the xdp -> skb work from the pure HW data access
> in XDP. We have a proof point here that there is a way of building
> on top of the first 5 patches to achieve the objective, and that's
> sufficient to let the prior work going in.

+1

Good idea.
Jesper Dangaard Brouer Nov. 18, 2022, 2:05 p.m. UTC | #14
On 15/11/2022 04.02, Stanislav Fomichev wrote:
> Implement new bpf_xdp_metadata_export_to_skb kfunc which
> prepares compatible xdp metadata for kernel consumption.
> This kfunc should be called prior to bpf_redirect
> or when XDP_PASS'ing the frame into the kernel (note, the drivers
> have to be updated to enable consuming XDP_PASS'ed metadata).
> 
> veth driver is amended to consume this metadata when converting to skb.
> 
> Internally, XDP_FLAGS_HAS_SKB_METADATA flag is used to indicate
> whether the frame has skb metadata. The metadata is currently
> stored prior to xdp->data_meta. bpf_xdp_adjust_meta refuses
> to work after a call to bpf_xdp_metadata_export_to_skb (can lift
> this requirement later on if needed, we'd have to memmove
> xdp_skb_metadata).
> 

I think it is wrong to refuses using metadata area (bpf_xdp_adjust_meta)
when the function bpf_xdp_metadata_export_to_skb() have been called.
In my design they were suppose to co-exist, and BPF-prog was expected to
access this directly themselves.

With this current design, I think it is better to place the struct
xdp_skb_metadata (maybe call it xdp_skb_hints) after xdp_frame (in the
top of the frame).  This way we don't conflict with metadata and
headroom use-cases.  Plus, verifier will keep BPF-prog from accessing
this area directly (which seems to be one of the new design goals).

By placing it after xdp_frame, I think it would be possible to let veth 
unroll functions seamlessly access this info for XDP_REDIRECT'ed 
xdp_frame's.

WDYT?

--Jesper
Stanislav Fomichev Nov. 18, 2022, 6:18 p.m. UTC | #15
On Fri, Nov 18, 2022 at 6:05 AM Jesper Dangaard Brouer
<jbrouer@redhat.com> wrote:
>
>
> On 15/11/2022 04.02, Stanislav Fomichev wrote:
> > Implement new bpf_xdp_metadata_export_to_skb kfunc which
> > prepares compatible xdp metadata for kernel consumption.
> > This kfunc should be called prior to bpf_redirect
> > or when XDP_PASS'ing the frame into the kernel (note, the drivers
> > have to be updated to enable consuming XDP_PASS'ed metadata).
> >
> > veth driver is amended to consume this metadata when converting to skb.
> >
> > Internally, XDP_FLAGS_HAS_SKB_METADATA flag is used to indicate
> > whether the frame has skb metadata. The metadata is currently
> > stored prior to xdp->data_meta. bpf_xdp_adjust_meta refuses
> > to work after a call to bpf_xdp_metadata_export_to_skb (can lift
> > this requirement later on if needed, we'd have to memmove
> > xdp_skb_metadata).
> >
>
> I think it is wrong to refuses using metadata area (bpf_xdp_adjust_meta)
> when the function bpf_xdp_metadata_export_to_skb() have been called.
> In my design they were suppose to co-exist, and BPF-prog was expected to
> access this directly themselves.
>
> With this current design, I think it is better to place the struct
> xdp_skb_metadata (maybe call it xdp_skb_hints) after xdp_frame (in the
> top of the frame).  This way we don't conflict with metadata and
> headroom use-cases.  Plus, verifier will keep BPF-prog from accessing
> this area directly (which seems to be one of the new design goals).
>
> By placing it after xdp_frame, I think it would be possible to let veth
> unroll functions seamlessly access this info for XDP_REDIRECT'ed
> xdp_frame's.
>
> WDYT?

Not everyone seems to be happy with exposing this xdp_skb_metadata via
uapi though :-(
So I'll drop this part in the v2 for now. But let's definitely keep
talking about the future approach.

Putting it after xdp_frame SGTM; with this we seem to avoid the need
to memmove it on adjust_{head,meta}.

But going back to the uapi part, what if we add separate kfunc
accessors for skb exported metadata?

To export:
bpf_xdp_metadata_export_rx_timestamp_to_skb(ctx, rx_timestamp)
bpf_xdp_metadata_export_rx_hash_to_skb(ctx, rx_hash)
// ^^ these prepare xdp_skb_metadata after xdp_frame, but not expose
it via uapi/af_xdp/etc

Then bpf_xdp_metadata_export_to_skb can be 'static inline' define in
the headers:

void bpf_xdp_metadata_export_to_skb(ctx)
{
  if (bpf_xdp_metadata_rx_timestamp_supported(ctx))
    bpf_xdp_metadata_export_rx_timestamp_to_skb(ctx,
bpf_xdp_metadata_rx_timestamp(ctx));
  if (bpf_xdp_metadata_rx_hash_supported(ctx))
    bpf_xdp_metadata_export_rx_hash_to_skb(ctx, bpf_xdp_metadata_rx_hash(ctx));
}

We can also do the accessors:
u64 bpf_xdp_metadata_skb_rx_timestamp(ctx)
u32 bpf_xdp_metadata_skb_rx_hash(ctx)

Hopefully we can unroll at least these, since they are not part of the
drivers, it should be easier to argue...

The only issue, it seems, is that if the final bpf program would like
to export this metadata to af_xdp, it has to manually adj_meta and use
bpf_xdp_metadata_skb_rx_xxx to prepare a custom layout. Not sure
whether performance would suffer with this extra copy; but we can at
least try and see..
Toke Høiland-Jørgensen Nov. 19, 2022, 12:31 p.m. UTC | #16
Stanislav Fomichev <sdf@google.com> writes:

> On Fri, Nov 18, 2022 at 6:05 AM Jesper Dangaard Brouer
> <jbrouer@redhat.com> wrote:
>>
>>
>> On 15/11/2022 04.02, Stanislav Fomichev wrote:
>> > Implement new bpf_xdp_metadata_export_to_skb kfunc which
>> > prepares compatible xdp metadata for kernel consumption.
>> > This kfunc should be called prior to bpf_redirect
>> > or when XDP_PASS'ing the frame into the kernel (note, the drivers
>> > have to be updated to enable consuming XDP_PASS'ed metadata).
>> >
>> > veth driver is amended to consume this metadata when converting to skb.
>> >
>> > Internally, XDP_FLAGS_HAS_SKB_METADATA flag is used to indicate
>> > whether the frame has skb metadata. The metadata is currently
>> > stored prior to xdp->data_meta. bpf_xdp_adjust_meta refuses
>> > to work after a call to bpf_xdp_metadata_export_to_skb (can lift
>> > this requirement later on if needed, we'd have to memmove
>> > xdp_skb_metadata).
>> >
>>
>> I think it is wrong to refuses using metadata area (bpf_xdp_adjust_meta)
>> when the function bpf_xdp_metadata_export_to_skb() have been called.
>> In my design they were suppose to co-exist, and BPF-prog was expected to
>> access this directly themselves.
>>
>> With this current design, I think it is better to place the struct
>> xdp_skb_metadata (maybe call it xdp_skb_hints) after xdp_frame (in the
>> top of the frame).  This way we don't conflict with metadata and
>> headroom use-cases.  Plus, verifier will keep BPF-prog from accessing
>> this area directly (which seems to be one of the new design goals).
>>
>> By placing it after xdp_frame, I think it would be possible to let veth
>> unroll functions seamlessly access this info for XDP_REDIRECT'ed
>> xdp_frame's.
>>
>> WDYT?
>
> Not everyone seems to be happy with exposing this xdp_skb_metadata via
> uapi though :-(
> So I'll drop this part in the v2 for now. But let's definitely keep
> talking about the future approach.

Jakub was objecting to putting it in the UAPI header, but didn't we
already agree that this wasn't necessary?

I.e., if we just define

struct xdp_skb_metadata *bpf_xdp_metadata_export_to_skb()

as a kfunc, the xdp_skb_metadata struct won't appear in any UAPI headers
and will only be accessible via BTF? And we can put the actual data
wherever we choose, since that bit is nicely hidden behind the kfunc,
while the returned pointer still allows programs to access it.

We could even make that kfunc smart enough that it checks if the field
is already populated and just return the pointer to the existing data
instead of re-populating it int his case (with a flag to override,
maybe?).

> Putting it after xdp_frame SGTM; with this we seem to avoid the need
> to memmove it on adjust_{head,meta}.
>
> But going back to the uapi part, what if we add separate kfunc
> accessors for skb exported metadata?
>
> To export:
> bpf_xdp_metadata_export_rx_timestamp_to_skb(ctx, rx_timestamp)
> bpf_xdp_metadata_export_rx_hash_to_skb(ctx, rx_hash)
> // ^^ these prepare xdp_skb_metadata after xdp_frame, but not expose
> it via uapi/af_xdp/etc
>
> Then bpf_xdp_metadata_export_to_skb can be 'static inline' define in
> the headers:
>
> void bpf_xdp_metadata_export_to_skb(ctx)
> {
>   if (bpf_xdp_metadata_rx_timestamp_supported(ctx))
>     bpf_xdp_metadata_export_rx_timestamp_to_skb(ctx,
> bpf_xdp_metadata_rx_timestamp(ctx));
>   if (bpf_xdp_metadata_rx_hash_supported(ctx))
>     bpf_xdp_metadata_export_rx_hash_to_skb(ctx, bpf_xdp_metadata_rx_hash(ctx));
> }

The problem with this is that the BPF programs then have to keep up with
the kernel. I.e., if the kernel later adds support for a new field that
is used in the SKB, old XDP programs won't be populating it, which seems
suboptimal. I think rather the kernel should be in control of the SKB
metadata, and just allow XDP to consume it (and change individual fields
as needed).

> The only issue, it seems, is that if the final bpf program would like
> to export this metadata to af_xdp, it has to manually adj_meta and use
> bpf_xdp_metadata_skb_rx_xxx to prepare a custom layout. Not sure
> whether performance would suffer with this extra copy; but we can at
> least try and see..

If we write the metadata after the packet data, that could still be
transferred to AF_XDP, couldn't it? Userspace would just have to know
how to find and read it, like it would  if it's before the metadata.

-Toke
Stanislav Fomichev Nov. 21, 2022, 5:53 p.m. UTC | #17
On Sat, Nov 19, 2022 at 4:31 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Stanislav Fomichev <sdf@google.com> writes:
>
> > On Fri, Nov 18, 2022 at 6:05 AM Jesper Dangaard Brouer
> > <jbrouer@redhat.com> wrote:
> >>
> >>
> >> On 15/11/2022 04.02, Stanislav Fomichev wrote:
> >> > Implement new bpf_xdp_metadata_export_to_skb kfunc which
> >> > prepares compatible xdp metadata for kernel consumption.
> >> > This kfunc should be called prior to bpf_redirect
> >> > or when XDP_PASS'ing the frame into the kernel (note, the drivers
> >> > have to be updated to enable consuming XDP_PASS'ed metadata).
> >> >
> >> > veth driver is amended to consume this metadata when converting to skb.
> >> >
> >> > Internally, XDP_FLAGS_HAS_SKB_METADATA flag is used to indicate
> >> > whether the frame has skb metadata. The metadata is currently
> >> > stored prior to xdp->data_meta. bpf_xdp_adjust_meta refuses
> >> > to work after a call to bpf_xdp_metadata_export_to_skb (can lift
> >> > this requirement later on if needed, we'd have to memmove
> >> > xdp_skb_metadata).
> >> >
> >>
> >> I think it is wrong to refuses using metadata area (bpf_xdp_adjust_meta)
> >> when the function bpf_xdp_metadata_export_to_skb() have been called.
> >> In my design they were suppose to co-exist, and BPF-prog was expected to
> >> access this directly themselves.
> >>
> >> With this current design, I think it is better to place the struct
> >> xdp_skb_metadata (maybe call it xdp_skb_hints) after xdp_frame (in the
> >> top of the frame).  This way we don't conflict with metadata and
> >> headroom use-cases.  Plus, verifier will keep BPF-prog from accessing
> >> this area directly (which seems to be one of the new design goals).
> >>
> >> By placing it after xdp_frame, I think it would be possible to let veth
> >> unroll functions seamlessly access this info for XDP_REDIRECT'ed
> >> xdp_frame's.
> >>
> >> WDYT?
> >
> > Not everyone seems to be happy with exposing this xdp_skb_metadata via
> > uapi though :-(
> > So I'll drop this part in the v2 for now. But let's definitely keep
> > talking about the future approach.
>
> Jakub was objecting to putting it in the UAPI header, but didn't we
> already agree that this wasn't necessary?
>
> I.e., if we just define
>
> struct xdp_skb_metadata *bpf_xdp_metadata_export_to_skb()
>
> as a kfunc, the xdp_skb_metadata struct won't appear in any UAPI headers
> and will only be accessible via BTF? And we can put the actual data
> wherever we choose, since that bit is nicely hidden behind the kfunc,
> while the returned pointer still allows programs to access it.
>
> We could even make that kfunc smart enough that it checks if the field
> is already populated and just return the pointer to the existing data
> instead of re-populating it int his case (with a flag to override,
> maybe?).

Even if we only expose it via btf, I think the fact that we still
expose a somewhat fixed layout is the problem?
I'm not sure the fact that we're not technically putting in the uapi
header is the issue here, but maybe I'm wrong?
Jakub?

> > Putting it after xdp_frame SGTM; with this we seem to avoid the need
> > to memmove it on adjust_{head,meta}.
> >
> > But going back to the uapi part, what if we add separate kfunc
> > accessors for skb exported metadata?
> >
> > To export:
> > bpf_xdp_metadata_export_rx_timestamp_to_skb(ctx, rx_timestamp)
> > bpf_xdp_metadata_export_rx_hash_to_skb(ctx, rx_hash)
> > // ^^ these prepare xdp_skb_metadata after xdp_frame, but not expose
> > it via uapi/af_xdp/etc
> >
> > Then bpf_xdp_metadata_export_to_skb can be 'static inline' define in
> > the headers:
> >
> > void bpf_xdp_metadata_export_to_skb(ctx)
> > {
> >   if (bpf_xdp_metadata_rx_timestamp_supported(ctx))
> >     bpf_xdp_metadata_export_rx_timestamp_to_skb(ctx,
> > bpf_xdp_metadata_rx_timestamp(ctx));
> >   if (bpf_xdp_metadata_rx_hash_supported(ctx))
> >     bpf_xdp_metadata_export_rx_hash_to_skb(ctx, bpf_xdp_metadata_rx_hash(ctx));
> > }
>
> The problem with this is that the BPF programs then have to keep up with
> the kernel. I.e., if the kernel later adds support for a new field that
> is used in the SKB, old XDP programs won't be populating it, which seems
> suboptimal. I think rather the kernel should be in control of the SKB
> metadata, and just allow XDP to consume it (and change individual fields
> as needed).

Good point. Although doesn't sound like a huge drawback to me? If that
bpf_xdp_metadata_export_to_skb is a part of libbpf/libxdp, the new
fields will get populated after a library update..

> > The only issue, it seems, is that if the final bpf program would like
> > to export this metadata to af_xdp, it has to manually adj_meta and use
> > bpf_xdp_metadata_skb_rx_xxx to prepare a custom layout. Not sure
> > whether performance would suffer with this extra copy; but we can at
> > least try and see..
>
> If we write the metadata after the packet data, that could still be
> transferred to AF_XDP, couldn't it? Userspace would just have to know
> how to find and read it, like it would  if it's before the metadata.

Right, but here we again bump into the fact that we need to somehow
communicate that layout to the userspace (via btf ids) which doesn't
make everybody excited :-)
Jakub Kicinski Nov. 21, 2022, 6:47 p.m. UTC | #18
On Mon, 21 Nov 2022 09:53:02 -0800 Stanislav Fomichev wrote:
> > Jakub was objecting to putting it in the UAPI header, but didn't we
> > already agree that this wasn't necessary?
> >
> > I.e., if we just define
> >
> > struct xdp_skb_metadata *bpf_xdp_metadata_export_to_skb()
> >
> > as a kfunc, the xdp_skb_metadata struct won't appear in any UAPI headers
> > and will only be accessible via BTF? And we can put the actual data
> > wherever we choose, since that bit is nicely hidden behind the kfunc,
> > while the returned pointer still allows programs to access it.
> >
> > We could even make that kfunc smart enough that it checks if the field
> > is already populated and just return the pointer to the existing data
> > instead of re-populating it int his case (with a flag to override,
> > maybe?).  
> 
> Even if we only expose it via btf, I think the fact that we still
> expose a somewhat fixed layout is the problem?
> I'm not sure the fact that we're not technically putting in the uapi
> header is the issue here, but maybe I'm wrong?
> Jakub?

Until the device metadata access from BPF is in bpf-next the only
opinion I have on this is something along the lines of "not right now".

I may be missing some concerns / perspectives, in which case - when
is the next "BPF office hours" meeting?
Stanislav Fomichev Nov. 21, 2022, 7:41 p.m. UTC | #19
On Mon, Nov 21, 2022 at 10:47 AM Jakub Kicinski <kuba@kernel.org> wrote:
>
> On Mon, 21 Nov 2022 09:53:02 -0800 Stanislav Fomichev wrote:
> > > Jakub was objecting to putting it in the UAPI header, but didn't we
> > > already agree that this wasn't necessary?
> > >
> > > I.e., if we just define
> > >
> > > struct xdp_skb_metadata *bpf_xdp_metadata_export_to_skb()
> > >
> > > as a kfunc, the xdp_skb_metadata struct won't appear in any UAPI headers
> > > and will only be accessible via BTF? And we can put the actual data
> > > wherever we choose, since that bit is nicely hidden behind the kfunc,
> > > while the returned pointer still allows programs to access it.
> > >
> > > We could even make that kfunc smart enough that it checks if the field
> > > is already populated and just return the pointer to the existing data
> > > instead of re-populating it int his case (with a flag to override,
> > > maybe?).
> >
> > Even if we only expose it via btf, I think the fact that we still
> > expose a somewhat fixed layout is the problem?
> > I'm not sure the fact that we're not technically putting in the uapi
> > header is the issue here, but maybe I'm wrong?
> > Jakub?
>
> Until the device metadata access from BPF is in bpf-next the only
> opinion I have on this is something along the lines of "not right now".
>
> I may be missing some concerns / perspectives, in which case - when
> is the next "BPF office hours" meeting?

SG! Let's get back to it once we get the basic rx metadata sorted out.
I'll probably look at the tx part next though; that xdp->skb path is
of least priority for me.
diff mbox series

Patch

diff --git a/drivers/net/veth.c b/drivers/net/veth.c
index c626580a2294..35349a232209 100644
--- a/drivers/net/veth.c
+++ b/drivers/net/veth.c
@@ -803,7 +803,7 @@  static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
 	void *orig_data, *orig_data_end;
 	struct bpf_prog *xdp_prog;
 	struct xdp_buff xdp;
-	u32 act, metalen;
+	u32 act;
 	int off;
 
 	skb_prepare_for_gro(skb);
@@ -886,9 +886,7 @@  static struct sk_buff *veth_xdp_rcv_skb(struct veth_rq *rq,
 
 	skb->protocol = eth_type_trans(skb, rq->dev);
 
-	metalen = xdp.data - xdp.data_meta;
-	if (metalen)
-		skb_metadata_set(skb, metalen);
+	xdp_convert_skb_metadata(&xdp, skb);
 out:
 	return skb;
 drop:
@@ -1663,7 +1661,9 @@  static int veth_xdp(struct net_device *dev, struct netdev_bpf *xdp)
 static void veth_unroll_kfunc(const struct bpf_prog *prog, u32 func_id,
 			      struct bpf_patch *patch)
 {
-	if (func_id == xdp_metadata_kfunc_id(XDP_METADATA_KFUNC_RX_TIMESTAMP_SUPPORTED)) {
+	if (func_id == xdp_metadata_kfunc_id(XDP_METADATA_KFUNC_EXPORT_TO_SKB)) {
+		return xdp_metadata_export_to_skb(prog, patch);
+	} else if (func_id == xdp_metadata_kfunc_id(XDP_METADATA_KFUNC_RX_TIMESTAMP_SUPPORTED)) {
 		/* return true; */
 		bpf_patch_append(patch, BPF_MOV64_IMM(BPF_REG_0, 1));
 	} else if (func_id == xdp_metadata_kfunc_id(XDP_METADATA_KFUNC_RX_TIMESTAMP)) {
diff --git a/include/linux/skbuff.h b/include/linux/skbuff.h
index 4e464a27adaf..be6a9559dbf1 100644
--- a/include/linux/skbuff.h
+++ b/include/linux/skbuff.h
@@ -4219,6 +4219,10 @@  static inline bool skb_metadata_differs(const struct sk_buff *skb_a,
 	       true : __skb_metadata_differs(skb_a, skb_b, len_a);
 }
 
+struct xdp_skb_metadata;
+bool skb_metadata_import_from_xdp(struct sk_buff *skb,
+				  struct xdp_skb_metadata *meta);
+
 static inline void skb_metadata_set(struct sk_buff *skb, u8 meta_len)
 {
 	skb_shinfo(skb)->meta_len = meta_len;
diff --git a/include/net/xdp.h b/include/net/xdp.h
index 2a82a98f2f9f..547a6a0e99f8 100644
--- a/include/net/xdp.h
+++ b/include/net/xdp.h
@@ -73,6 +73,7 @@  enum xdp_buff_flags {
 	XDP_FLAGS_FRAGS_PF_MEMALLOC	= BIT(1), /* xdp paged memory is under
 						   * pressure
 						   */
+	XDP_FLAGS_HAS_SKB_METADATA	= BIT(2), /* xdp_skb_metadata */
 };
 
 struct xdp_buff {
@@ -91,6 +92,11 @@  static __always_inline bool xdp_buff_has_frags(struct xdp_buff *xdp)
 	return !!(xdp->flags & XDP_FLAGS_HAS_FRAGS);
 }
 
+static __always_inline bool xdp_buff_has_skb_metadata(struct xdp_buff *xdp)
+{
+	return !!(xdp->flags & XDP_FLAGS_HAS_SKB_METADATA);
+}
+
 static __always_inline void xdp_buff_set_frags_flag(struct xdp_buff *xdp)
 {
 	xdp->flags |= XDP_FLAGS_HAS_FRAGS;
@@ -306,6 +312,8 @@  struct xdp_frame *xdp_convert_buff_to_frame(struct xdp_buff *xdp)
 	return xdp_frame;
 }
 
+bool xdp_convert_skb_metadata(struct xdp_buff *xdp, struct sk_buff *skb);
+
 void __xdp_return(void *data, struct xdp_mem_info *mem, bool napi_direct,
 		  struct xdp_buff *xdp);
 void xdp_return_frame(struct xdp_frame *xdpf);
@@ -411,6 +419,8 @@  void xdp_attachment_setup(struct xdp_attachment_info *info,
 #define DEV_MAP_BULK_SIZE XDP_BULK_QUEUE_SIZE
 
 #define XDP_METADATA_KFUNC_xxx	\
+	XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_EXPORT_TO_SKB, \
+			   bpf_xdp_metadata_export_to_skb) \
 	XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_TIMESTAMP_SUPPORTED, \
 			   bpf_xdp_metadata_rx_timestamp_supported) \
 	XDP_METADATA_KFUNC(XDP_METADATA_KFUNC_RX_TIMESTAMP, \
@@ -423,14 +433,21 @@  XDP_METADATA_KFUNC_xxx
 MAX_XDP_METADATA_KFUNC,
 };
 
+struct bpf_patch;
+
 #ifdef CONFIG_DEBUG_INFO_BTF
 extern struct btf_id_set8 xdp_metadata_kfunc_ids;
 static inline u32 xdp_metadata_kfunc_id(int id)
 {
 	return xdp_metadata_kfunc_ids.pairs[id].id;
 }
+void xdp_metadata_export_to_skb(const struct bpf_prog *prog, struct bpf_patch *patch);
 #else
 static inline u32 xdp_metadata_kfunc_id(int id) { return 0; }
+static void xdp_metadata_export_to_skb(const struct bpf_prog *prog, struct bpf_patch *patch)
+{
+	return 0;
+}
 #endif
 
 #endif /* __LINUX_NET_XDP_H__ */
diff --git a/include/uapi/linux/bpf.h b/include/uapi/linux/bpf.h
index b444b1118c4f..71e3bc7ad839 100644
--- a/include/uapi/linux/bpf.h
+++ b/include/uapi/linux/bpf.h
@@ -6116,6 +6116,12 @@  enum xdp_action {
 	XDP_REDIRECT,
 };
 
+/* Subset of XDP metadata exported to skb context.
+ */
+struct xdp_skb_metadata {
+	__u64 rx_timestamp;
+};
+
 /* user accessible metadata for XDP packet hook
  * new fields must be added to the end of this structure
  */
@@ -6128,6 +6134,7 @@  struct xdp_md {
 	__u32 rx_queue_index;  /* rxq->queue_index  */
 
 	__u32 egress_ifindex;  /* txq->dev->ifindex */
+	__bpf_md_ptr(struct xdp_skb_metadata *, skb_metadata);
 };
 
 /* DEVMAP map-value layout
diff --git a/kernel/bpf/verifier.c b/kernel/bpf/verifier.c
index b657ed6eb277..6879ad3a6026 100644
--- a/kernel/bpf/verifier.c
+++ b/kernel/bpf/verifier.c
@@ -14023,6 +14023,7 @@  static int unroll_kfunc_call(struct bpf_verifier_env *env,
 	enum bpf_prog_type prog_type;
 	struct bpf_prog_aux *aux;
 	struct btf *desc_btf;
+	u32 allowed, mangled;
 	u32 *kfunc_flags;
 	u32 func_id;
 
@@ -14050,6 +14051,20 @@  static int unroll_kfunc_call(struct bpf_verifier_env *env,
 		 */
 		bpf_patch_append(patch, BPF_MOV64_IMM(BPF_REG_0, 0));
 	}
+
+	allowed = 1 << BPF_REG_0;
+	allowed |= 1 << BPF_REG_1;
+	allowed |= 1 << BPF_REG_2;
+	allowed |= 1 << BPF_REG_3;
+	allowed |= 1 << BPF_REG_4;
+	allowed |= 1 << BPF_REG_5;
+	mangled = bpf_patch_magles_registers(patch);
+	if (WARN_ON_ONCE(mangled & ~allowed)) {
+		bpf_patch_free(patch);
+		verbose(env, "bpf verifier is misconfigured\n");
+		return -EINVAL;
+	}
+
 	return bpf_patch_err(patch);
 }
 
diff --git a/net/core/filter.c b/net/core/filter.c
index 6dd2baf5eeb2..2497144e4216 100644
--- a/net/core/filter.c
+++ b/net/core/filter.c
@@ -4094,6 +4094,8 @@  BPF_CALL_2(bpf_xdp_adjust_meta, struct xdp_buff *, xdp, int, offset)
 		return -EINVAL;
 	if (unlikely(xdp_metalen_invalid(metalen)))
 		return -EACCES;
+	if (unlikely(xdp_buff_has_skb_metadata(xdp)))
+		return -EACCES;
 
 	xdp->data_meta = meta;
 
@@ -8690,6 +8692,8 @@  static bool __is_valid_xdp_access(int off, int size)
 	return true;
 }
 
+BTF_ID_LIST_SINGLE(xdp_to_skb_metadata_btf_ids, struct, xdp_skb_metadata);
+
 static bool xdp_is_valid_access(int off, int size,
 				enum bpf_access_type type,
 				const struct bpf_prog *prog,
@@ -8722,6 +8726,18 @@  static bool xdp_is_valid_access(int off, int size,
 	case offsetof(struct xdp_md, data_end):
 		info->reg_type = PTR_TO_PACKET_END;
 		break;
+	case offsetof(struct xdp_md, skb_metadata):
+		info->btf = bpf_get_btf_vmlinux();
+		if (IS_ERR(info->btf))
+			return PTR_ERR(info->btf);
+		if (!info->btf)
+			return -EINVAL;
+
+		info->reg_type = PTR_TO_BTF_ID_OR_NULL;
+		info->btf_id = xdp_to_skb_metadata_btf_ids[0];
+		if (size == sizeof(__u64))
+			return true;
+		return false;
 	}
 
 	return __is_valid_xdp_access(off, size);
@@ -9814,6 +9830,30 @@  static u32 xdp_convert_ctx_access(enum bpf_access_type type,
 		*insn++ = BPF_LDX_MEM(BPF_W, si->dst_reg, si->dst_reg,
 				      offsetof(struct net_device, ifindex));
 		break;
+	case offsetof(struct xdp_md, skb_metadata):
+		/* dst_reg = xdp_buff->flags; */
+		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct xdp_buff, flags),
+				      si->dst_reg, si->src_reg,
+				      offsetof(struct xdp_buff, flags));
+		/* dst_reg &= XDP_FLAGS_HAS_SKB_METADATA; */
+		*insn++ = BPF_ALU64_IMM(BPF_AND, si->dst_reg,
+					XDP_FLAGS_HAS_SKB_METADATA);
+
+		/* if (dst_reg != 0) { */
+		*insn++ = BPF_JMP_IMM(BPF_JEQ, si->dst_reg, 0, 3);
+		/*	dst_reg = xdp_buff->data_meta; */
+		*insn++ = BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct xdp_buff, data_meta),
+				      si->dst_reg, si->src_reg,
+				      offsetof(struct xdp_buff, data_meta));
+		/*	dst_reg -= sizeof(struct xdp_skb_metadata); */
+		*insn++ = BPF_ALU64_IMM(BPF_SUB, si->dst_reg,
+					sizeof(struct xdp_skb_metadata));
+		*insn++ = BPF_JMP_A(1);
+		/* } else { */
+		/*	return 0; */
+		*insn++ = BPF_MOV32_IMM(si->dst_reg, 0);
+		/* } */
+		break;
 	}
 
 	return insn - insn_buf;
diff --git a/net/core/skbuff.c b/net/core/skbuff.c
index 90d085290d49..0cc24ca20e4d 100644
--- a/net/core/skbuff.c
+++ b/net/core/skbuff.c
@@ -72,6 +72,7 @@ 
 #include <net/mptcp.h>
 #include <net/mctp.h>
 #include <net/page_pool.h>
+#include <net/xdp.h>
 
 #include <linux/uaccess.h>
 #include <trace/events/skb.h>
@@ -6675,3 +6676,22 @@  nodefer:	__kfree_skb(skb);
 	if (unlikely(kick) && !cmpxchg(&sd->defer_ipi_scheduled, 0, 1))
 		smp_call_function_single_async(cpu, &sd->defer_csd);
 }
+
+bool skb_metadata_import_from_xdp(struct sk_buff *skb,
+				  struct xdp_skb_metadata *meta)
+{
+	/* Optional SKB info, currently missing:
+	 * - HW checksum info		(skb->ip_summed)
+	 * - HW RX hash			(skb_set_hash)
+	 * - RX ring dev queue index	(skb_record_rx_queue)
+	 */
+
+	if (meta->rx_timestamp) {
+		*skb_hwtstamps(skb) = (struct skb_shared_hwtstamps){
+			.hwtstamp = ns_to_ktime(meta->rx_timestamp),
+		};
+	}
+
+	return true;
+}
+EXPORT_SYMBOL(skb_metadata_import_from_xdp);
diff --git a/net/core/xdp.c b/net/core/xdp.c
index 22f1e44700eb..ede9b1b987d9 100644
--- a/net/core/xdp.c
+++ b/net/core/xdp.c
@@ -368,6 +368,22 @@  int xdp_rxq_info_reg_mem_model(struct xdp_rxq_info *xdp_rxq,
 
 EXPORT_SYMBOL_GPL(xdp_rxq_info_reg_mem_model);
 
+bool xdp_convert_skb_metadata(struct xdp_buff *xdp, struct sk_buff *skb)
+{
+	struct xdp_skb_metadata *meta;
+	u32 metalen;
+
+	metalen = xdp->data - xdp->data_meta;
+	if (metalen)
+		skb_metadata_set(skb, metalen);
+	if (xdp_buff_has_skb_metadata(xdp)) {
+		meta = xdp->data_meta - sizeof(*meta);
+		return skb_metadata_import_from_xdp(skb, meta);
+	}
+	return false;
+}
+EXPORT_SYMBOL(xdp_convert_skb_metadata);
+
 /* XDP RX runs under NAPI protection, and in different delivery error
  * scenarios (e.g. queue full), it is possible to return the xdp_frame
  * while still leveraging this protection.  The @napi_direct boolean
@@ -619,6 +635,7 @@  struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
 {
 	struct skb_shared_info *sinfo = xdp_get_shared_info_from_frame(xdpf);
 	unsigned int headroom, frame_size;
+	struct xdp_skb_metadata *meta;
 	void *hard_start;
 	u8 nr_frags;
 
@@ -653,11 +670,10 @@  struct sk_buff *__xdp_build_skb_from_frame(struct xdp_frame *xdpf,
 	/* Essential SKB info: protocol and skb->dev */
 	skb->protocol = eth_type_trans(skb, dev);
 
-	/* Optional SKB info, currently missing:
-	 * - HW checksum info		(skb->ip_summed)
-	 * - HW RX hash			(skb_set_hash)
-	 * - RX ring dev queue index	(skb_record_rx_queue)
-	 */
+	if (xdpf->flags & XDP_FLAGS_HAS_SKB_METADATA) {
+		meta = xdpf->data - xdpf->metasize - sizeof(*meta);
+		skb_metadata_import_from_xdp(skb, meta);
+	}
 
 	/* Until page_pool get SKB return path, release DMA here */
 	xdp_release_frame(xdpf);
@@ -712,6 +728,14 @@  struct xdp_frame *xdpf_clone(struct xdp_frame *xdpf)
 	return nxdpf;
 }
 
+/* For the packets directed to the kernel, this kfunc exports XDP metadata
+ * into skb context.
+ */
+noinline int bpf_xdp_metadata_export_to_skb(const struct xdp_md *ctx)
+{
+	return 0;
+}
+
 /* Indicates whether particular device supports rx_timestamp metadata.
  * This is an optional helper to support marking some branches as
  * "dead code" in the BPF programs.
@@ -736,15 +760,126 @@  BTF_SET8_START_GLOBAL(xdp_metadata_kfunc_ids)
 XDP_METADATA_KFUNC_xxx
 #undef XDP_METADATA_KFUNC
 BTF_SET8_END(xdp_metadata_kfunc_ids)
+EXPORT_SYMBOL(xdp_metadata_kfunc_ids);
 
 static const struct btf_kfunc_id_set xdp_metadata_kfunc_set = {
 	.owner = THIS_MODULE,
 	.set   = &xdp_metadata_kfunc_ids,
 };
 
+/* Since we're not actually doing a call but instead rewriting
+ * in place, we can only afford to use R0-R5 scratch registers
+ * and hidden BPF_PUSH64/BPF_POP64 opcodes to spill to the stack.
+ */
+void xdp_metadata_export_to_skb(const struct bpf_prog *prog, struct bpf_patch *patch)
+{
+	u32 func_id;
+
+	/* The code below generates the following:
+	 *
+	 * int bpf_xdp_metadata_export_to_skb(struct xdp_md *ctx)
+	 * {
+	 *	struct xdp_skb_metadata *meta = ctx->data_meta - sizeof(*meta);
+	 *	int ret;
+	 *
+	 *	if (ctx->flags & XDP_FLAGS_HAS_SKB_METADATA)
+	 *		return -1;
+	 *
+	 *	if (meta < ctx->data_hard_start + sizeof(struct xdp_frame))
+	 *		return -1;
+	 *
+	 *	meta->rx_timestamp = bpf_xdp_metadata_rx_timestamp(ctx);
+	 *	ctx->flags |= BPF_F_XDP_HAS_METADATA;
+	 *
+	 *	return 0;
+	 * }
+	 */
+
+	bpf_patch_append(patch,
+		BPF_MOV64_IMM(BPF_REG_0, -1),
+
+		/* r2 = ((struct xdp_buff *)r1)->flags; */
+		BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct xdp_buff, flags),
+			    BPF_REG_2, BPF_REG_1,
+			    offsetof(struct xdp_buff, flags)),
+
+		/* r2 &= XDP_FLAGS_HAS_SKB_METADATA; */
+		BPF_ALU64_IMM(BPF_AND, BPF_REG_2, XDP_FLAGS_HAS_SKB_METADATA),
+
+		/* if (xdp_buff->flags & XDP_FLAGS_HAS_SKB_METADATA) return -1; */
+		BPF_JMP_IMM(BPF_JNE, BPF_REG_2, 0, S16_MAX),
+
+		/* r2 = ((struct xdp_buff *)r1)->data_meta; */
+		BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1,
+			    offsetof(struct xdp_buff, data_meta)),
+		/* r2 -= sizeof(struct xdp_skb_metadata); */
+		BPF_ALU64_IMM(BPF_SUB, BPF_REG_2,
+			      sizeof(struct xdp_skb_metadata)),
+		/* r3 = ((struct xdp_buff *)r1)->data_hard_start; */
+		BPF_LDX_MEM(BPF_DW, BPF_REG_3, BPF_REG_1,
+			    offsetof(struct xdp_buff, data_hard_start)),
+		/* r3 += sizeof(struct xdp_frame) */
+		BPF_ALU64_IMM(BPF_ADD, BPF_REG_3,
+			      sizeof(struct xdp_frame)),
+		/* if (data_meta-sizeof(struct xdp_skb_metadata) <
+		 *     data_hard_start+sizeof(struct xdp_frame)) return -1;
+		 */
+		BPF_JMP_REG(BPF_JLT, BPF_REG_2, BPF_REG_3, S16_MAX),
+
+		/* r2 = ((struct xdp_buff *)r1)->flags; */
+		BPF_LDX_MEM(BPF_FIELD_SIZEOF(struct xdp_buff, flags),
+			    BPF_REG_2, BPF_REG_1,
+			    offsetof(struct xdp_buff, flags)),
+
+		/* r2 |= XDP_FLAGS_HAS_SKB_METADATA; */
+		BPF_ALU64_IMM(BPF_OR, BPF_REG_2, XDP_FLAGS_HAS_SKB_METADATA),
+
+		/* ((struct xdp_buff *)r1)->flags = r2; */
+		BPF_STX_MEM(BPF_FIELD_SIZEOF(struct xdp_buff, flags),
+			    BPF_REG_1, BPF_REG_2,
+			    offsetof(struct xdp_buff, flags)),
+
+		/* push r1 */
+		BPF_PUSH64(BPF_REG_1),
+	);
+
+	/*	r0 = bpf_xdp_metadata_rx_timestamp(ctx); */
+	func_id = xdp_metadata_kfunc_id(XDP_METADATA_KFUNC_RX_TIMESTAMP);
+	prog->aux->xdp_kfunc_ndo->ndo_unroll_kfunc(prog, func_id, patch);
+
+	bpf_patch_append(patch,
+		/* pop r1 */
+		BPF_POP64(BPF_REG_1),
+
+		/* r2 = ((struct xdp_buff *)r1)->data_meta; */
+		BPF_LDX_MEM(BPF_DW, BPF_REG_2, BPF_REG_1,
+			    offsetof(struct xdp_buff, data_meta)),
+		/* r2 -= sizeof(struct xdp_skb_metadata); */
+		BPF_ALU64_IMM(BPF_SUB, BPF_REG_2,
+			      sizeof(struct xdp_skb_metadata)),
+
+		/* *((struct xdp_skb_metadata *)r2)->rx_timestamp = r0; */
+		BPF_STX_MEM(BPF_DW, BPF_REG_2, BPF_REG_0,
+			    offsetof(struct xdp_skb_metadata, rx_timestamp)),
+
+		/* return 0; */
+		BPF_MOV64_IMM(BPF_REG_0, 0),
+	);
+
+	bpf_patch_resolve_jmp(patch);
+}
+EXPORT_SYMBOL(xdp_metadata_export_to_skb);
+
 static int __init xdp_metadata_init(void)
 {
 	return register_btf_kfunc_id_set(BPF_PROG_TYPE_XDP, &xdp_metadata_kfunc_set);
 }
 late_initcall(xdp_metadata_init);
+#else
+struct btf_id_set8 xdp_metadata_kfunc_ids = {};
+EXPORT_SYMBOL(xdp_metadata_kfunc_ids);
+void xdp_metadata_export_to_skb(const struct bpf_prog *prog, struct bpf_patch *patch)
+{
+}
+EXPORT_SYMBOL(xdp_metadata_export_to_skb);
 #endif
diff --git a/tools/include/uapi/linux/bpf.h b/tools/include/uapi/linux/bpf.h
index b444b1118c4f..71e3bc7ad839 100644
--- a/tools/include/uapi/linux/bpf.h
+++ b/tools/include/uapi/linux/bpf.h
@@ -6116,6 +6116,12 @@  enum xdp_action {
 	XDP_REDIRECT,
 };
 
+/* Subset of XDP metadata exported to skb context.
+ */
+struct xdp_skb_metadata {
+	__u64 rx_timestamp;
+};
+
 /* user accessible metadata for XDP packet hook
  * new fields must be added to the end of this structure
  */
@@ -6128,6 +6134,7 @@  struct xdp_md {
 	__u32 rx_queue_index;  /* rxq->queue_index  */
 
 	__u32 egress_ifindex;  /* txq->dev->ifindex */
+	__bpf_md_ptr(struct xdp_skb_metadata *, skb_metadata);
 };
 
 /* DEVMAP map-value layout