diff mbox series

[bpf-next,v5,7/7] selftests/bpf: Add selftest for XDP_REDIRECT in bpf_prog_run()

Message ID 20220103150812.87914-8-toke@redhat.com (mailing list archive)
State Superseded
Delegated to: BPF
Headers show
Series Add support for transmitting packets using XDP in bpf_prog_run() | expand

Checks

Context Check Description
netdev/tree_selection success Clearly marked for bpf-next
netdev/fixes_present success Fixes tag not required for -next series
netdev/subject_prefix success Link
netdev/cover_letter success Series has a cover letter
netdev/patch_count success Link
netdev/header_inline success No static functions without inline keyword in header files
netdev/build_32bit success Errors and warnings before: 0 this patch: 0
netdev/cc_maintainers warning 1 maintainers not CCed: linux-kselftest@vger.kernel.org
netdev/build_clang success Errors and warnings before: 0 this patch: 0
netdev/module_param success Was 0 now: 0
netdev/verify_signedoff success Signed-off-by tag matches author and committer
netdev/verify_fixes success No Fixes tag
netdev/build_allmodconfig_warn success Errors and warnings before: 0 this patch: 0
netdev/checkpatch warning WARNING: Macros with flow control statements should be avoided WARNING: Use of volatile is usually wrong: see Documentation/process/volatile-considered-harmful.rst WARNING: added, moved or deleted file(s), does MAINTAINERS need updating? WARNING: line length of 83 exceeds 80 columns
netdev/kdoc success Errors and warnings before: 0 this patch: 0
netdev/source_inline success Was 0 now: 0
bpf/vmtest-bpf-next-PR success PR summary
bpf/vmtest-bpf-next success VM_Test

Commit Message

Toke Høiland-Jørgensen Jan. 3, 2022, 3:08 p.m. UTC
This adds a selftest for the XDP_REDIRECT facility in bpf_prog_run, that
redirects packets into a veth and counts them using an XDP program on the
other side of the veth pair.

Signed-off-by: Toke Høiland-Jørgensen <toke@redhat.com>
---
 .../bpf/prog_tests/xdp_do_redirect.c          | 118 ++++++++++++++++++
 .../bpf/progs/test_xdp_do_redirect.c          |  39 ++++++
 2 files changed, 157 insertions(+)
 create mode 100644 tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
 create mode 100644 tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c

Comments

Alexei Starovoitov Jan. 6, 2022, 4:20 a.m. UTC | #1
On Mon, Jan 03, 2022 at 04:08:12PM +0100, Toke Høiland-Jørgensen wrote:
> +
> +#define NUM_PKTS 3

May be send a bit more than 3 packets?
Just to test skb_list logic for XDP_PASS.

> +
> +	/* We setup a veth pair that we can not only XDP_REDIRECT packets
> +	 * between, but also route them. The test packet (defined above) has
> +	 * address information so it will be routed back out the same interface
> +	 * after it has been received, which will allow it to be picked up by
> +	 * the XDP program on the destination interface.
> +	 *
> +	 * The XDP program we run with bpf_prog_run() will cycle through all
> +	 * four return codes (DROP/PASS/TX/REDIRECT), so we should end up with
> +	 * NUM_PKTS - 1 packets seen on the dst iface. We match the packets on
> +	 * the UDP payload.
> +	 */
> +	SYS("ip link add veth_src type veth peer name veth_dst");
> +	SYS("ip link set dev veth_src address 00:11:22:33:44:55");
> +	SYS("ip link set dev veth_dst address 66:77:88:99:aa:bb");
> +	SYS("ip link set dev veth_src up");
> +	SYS("ip link set dev veth_dst up");
> +	SYS("ip addr add dev veth_src fc00::1/64");
> +	SYS("ip addr add dev veth_dst fc00::2/64");
> +	SYS("ip neigh add fc00::2 dev veth_src lladdr 66:77:88:99:aa:bb");
> +	SYS("sysctl -w net.ipv6.conf.all.forwarding=1");

These commands pollute current netns. The test has to create its own netns
like other tests do.

The forwarding=1 is odd. Nothing in the comments or commit logs
talks about it.
I'm guessing it's due to patch 6 limitation of picking loopback
for XDP_PASS and XDP_TX, right?
There is ingress_ifindex field in struct xdp_md.
May be use that to setup dev and rxq in test_run in patch 6?
Then there will be no need to hack through forwarding=1 ?
Toke Høiland-Jørgensen Jan. 6, 2022, 2:34 p.m. UTC | #2
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Mon, Jan 03, 2022 at 04:08:12PM +0100, Toke Høiland-Jørgensen wrote:
>> +
>> +#define NUM_PKTS 3
>
> May be send a bit more than 3 packets?
> Just to test skb_list logic for XDP_PASS.

OK, can do.

>> +
>> +	/* We setup a veth pair that we can not only XDP_REDIRECT packets
>> +	 * between, but also route them. The test packet (defined above) has
>> +	 * address information so it will be routed back out the same interface
>> +	 * after it has been received, which will allow it to be picked up by
>> +	 * the XDP program on the destination interface.
>> +	 *
>> +	 * The XDP program we run with bpf_prog_run() will cycle through all
>> +	 * four return codes (DROP/PASS/TX/REDIRECT), so we should end up with
>> +	 * NUM_PKTS - 1 packets seen on the dst iface. We match the packets on
>> +	 * the UDP payload.
>> +	 */
>> +	SYS("ip link add veth_src type veth peer name veth_dst");
>> +	SYS("ip link set dev veth_src address 00:11:22:33:44:55");
>> +	SYS("ip link set dev veth_dst address 66:77:88:99:aa:bb");
>> +	SYS("ip link set dev veth_src up");
>> +	SYS("ip link set dev veth_dst up");
>> +	SYS("ip addr add dev veth_src fc00::1/64");
>> +	SYS("ip addr add dev veth_dst fc00::2/64");
>> +	SYS("ip neigh add fc00::2 dev veth_src lladdr 66:77:88:99:aa:bb");
>> +	SYS("sysctl -w net.ipv6.conf.all.forwarding=1");
>
> These commands pollute current netns. The test has to create its own netns
> like other tests do.

Right, will fix.

> The forwarding=1 is odd. Nothing in the comments or commit logs
> talks about it.

Hmm, yeah, should probably have added an explanation, sorry about that :)

> I'm guessing it's due to patch 6 limitation of picking loopback
> for XDP_PASS and XDP_TX, right?
> There is ingress_ifindex field in struct xdp_md.
> May be use that to setup dev and rxq in test_run in patch 6?
> Then there will be no need to hack through forwarding=1 ?

No, as you note there's already ingress_ifindex to set the device, and
the test does use that:

+	memcpy(skel->rodata->expect_dst, &pkt_udp.eth.h_dest, ETH_ALEN);
+	skel->rodata->ifindex_out = ifindex_src;
+	ctx_in.ingress_ifindex = ifindex_src;

I enable forwarding because the XDP program that counts the packets is
running on the other end of the veth pair (on veth_dst), while the
traffic gen is using veth_src as its ingress ifindex. So for XDP_TX and
XDP_REDIRECT we send the frame back out the veth device, and it ends up
being processed by the XDP program on veth_dst, and counted. But when
the test program returns XDP_PASS, the packet will go up the frame; so
to get it back to the counting program I enable forwarding and set the
packet dst IP so that the stack routes it back out the same interface.

I'll admit this is a bit hacky; I guess I can add a second TC ingress
program that will count the packets being XDP_PASS'ed instead...

-Toke
Alexei Starovoitov Jan. 6, 2022, 5:18 p.m. UTC | #3
On Thu, Jan 6, 2022 at 6:34 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>
> > On Mon, Jan 03, 2022 at 04:08:12PM +0100, Toke Høiland-Jørgensen wrote:
> >> +
> >> +#define NUM_PKTS 3
> >
> > May be send a bit more than 3 packets?
> > Just to test skb_list logic for XDP_PASS.
>
> OK, can do.
>
> >> +
> >> +    /* We setup a veth pair that we can not only XDP_REDIRECT packets
> >> +     * between, but also route them. The test packet (defined above) has
> >> +     * address information so it will be routed back out the same interface
> >> +     * after it has been received, which will allow it to be picked up by
> >> +     * the XDP program on the destination interface.
> >> +     *
> >> +     * The XDP program we run with bpf_prog_run() will cycle through all
> >> +     * four return codes (DROP/PASS/TX/REDIRECT), so we should end up with
> >> +     * NUM_PKTS - 1 packets seen on the dst iface. We match the packets on
> >> +     * the UDP payload.
> >> +     */
> >> +    SYS("ip link add veth_src type veth peer name veth_dst");
> >> +    SYS("ip link set dev veth_src address 00:11:22:33:44:55");
> >> +    SYS("ip link set dev veth_dst address 66:77:88:99:aa:bb");
> >> +    SYS("ip link set dev veth_src up");
> >> +    SYS("ip link set dev veth_dst up");
> >> +    SYS("ip addr add dev veth_src fc00::1/64");
> >> +    SYS("ip addr add dev veth_dst fc00::2/64");
> >> +    SYS("ip neigh add fc00::2 dev veth_src lladdr 66:77:88:99:aa:bb");
> >> +    SYS("sysctl -w net.ipv6.conf.all.forwarding=1");
> >
> > These commands pollute current netns. The test has to create its own netns
> > like other tests do.
>
> Right, will fix.
>
> > The forwarding=1 is odd. Nothing in the comments or commit logs
> > talks about it.
>
> Hmm, yeah, should probably have added an explanation, sorry about that :)
>
> > I'm guessing it's due to patch 6 limitation of picking loopback
> > for XDP_PASS and XDP_TX, right?
> > There is ingress_ifindex field in struct xdp_md.
> > May be use that to setup dev and rxq in test_run in patch 6?
> > Then there will be no need to hack through forwarding=1 ?
>
> No, as you note there's already ingress_ifindex to set the device, and
> the test does use that:
>
> +       memcpy(skel->rodata->expect_dst, &pkt_udp.eth.h_dest, ETH_ALEN);
> +       skel->rodata->ifindex_out = ifindex_src;
> +       ctx_in.ingress_ifindex = ifindex_src;

My point is that this ingress_ifindex should be used instead of loopback.
Otherwise the test_run infra is lying to the xdp program.

> I enable forwarding because the XDP program that counts the packets is
> running on the other end of the veth pair (on veth_dst), while the
> traffic gen is using veth_src as its ingress ifindex. So for XDP_TX and
> XDP_REDIRECT we send the frame back out the veth device, and it ends up
> being processed by the XDP program on veth_dst, and counted.

Not for XDP_TX. If I'm reading patch 6 correctly it gets xmited
out of loopback.

> But when
> the test program returns XDP_PASS, the packet will go up the frame; so
> to get it back to the counting program I enable forwarding and set the
> packet dst IP so that the stack routes it back out the same interface.
>
> I'll admit this is a bit hacky; I guess I can add a second TC ingress
> program that will count the packets being XDP_PASS'ed instead...

No. Please figure out how to XDP_PASS and XDP_TX without enabling forward
and counting in different places.
imo the forwarding hides the issue in the design that should be addressed.
When rx ifindex is an actual ifindex given by user space instead of
loopback all problems go away.
Toke Høiland-Jørgensen Jan. 6, 2022, 6:21 p.m. UTC | #4
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Thu, Jan 6, 2022 at 6:34 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>>
>> > On Mon, Jan 03, 2022 at 04:08:12PM +0100, Toke Høiland-Jørgensen wrote:
>> >> +
>> >> +#define NUM_PKTS 3
>> >
>> > May be send a bit more than 3 packets?
>> > Just to test skb_list logic for XDP_PASS.
>>
>> OK, can do.
>>
>> >> +
>> >> +    /* We setup a veth pair that we can not only XDP_REDIRECT packets
>> >> +     * between, but also route them. The test packet (defined above) has
>> >> +     * address information so it will be routed back out the same interface
>> >> +     * after it has been received, which will allow it to be picked up by
>> >> +     * the XDP program on the destination interface.
>> >> +     *
>> >> +     * The XDP program we run with bpf_prog_run() will cycle through all
>> >> +     * four return codes (DROP/PASS/TX/REDIRECT), so we should end up with
>> >> +     * NUM_PKTS - 1 packets seen on the dst iface. We match the packets on
>> >> +     * the UDP payload.
>> >> +     */
>> >> +    SYS("ip link add veth_src type veth peer name veth_dst");
>> >> +    SYS("ip link set dev veth_src address 00:11:22:33:44:55");
>> >> +    SYS("ip link set dev veth_dst address 66:77:88:99:aa:bb");
>> >> +    SYS("ip link set dev veth_src up");
>> >> +    SYS("ip link set dev veth_dst up");
>> >> +    SYS("ip addr add dev veth_src fc00::1/64");
>> >> +    SYS("ip addr add dev veth_dst fc00::2/64");
>> >> +    SYS("ip neigh add fc00::2 dev veth_src lladdr 66:77:88:99:aa:bb");
>> >> +    SYS("sysctl -w net.ipv6.conf.all.forwarding=1");
>> >
>> > These commands pollute current netns. The test has to create its own netns
>> > like other tests do.
>>
>> Right, will fix.
>>
>> > The forwarding=1 is odd. Nothing in the comments or commit logs
>> > talks about it.
>>
>> Hmm, yeah, should probably have added an explanation, sorry about that :)
>>
>> > I'm guessing it's due to patch 6 limitation of picking loopback
>> > for XDP_PASS and XDP_TX, right?
>> > There is ingress_ifindex field in struct xdp_md.
>> > May be use that to setup dev and rxq in test_run in patch 6?
>> > Then there will be no need to hack through forwarding=1 ?
>>
>> No, as you note there's already ingress_ifindex to set the device, and
>> the test does use that:
>>
>> +       memcpy(skel->rodata->expect_dst, &pkt_udp.eth.h_dest, ETH_ALEN);
>> +       skel->rodata->ifindex_out = ifindex_src;
>> +       ctx_in.ingress_ifindex = ifindex_src;
>
> My point is that this ingress_ifindex should be used instead of loopback.
> Otherwise the test_run infra is lying to the xdp program.

But it is already using that! There is just no explicit code in patch 6
to do that because that was already part of the XDP prog_run
functionality.

Specifically, the existing bpf_prog_test_run_xdp() will pass the context
through xdp_convert_md_to_buff() which will resolve the ifindex and get
a dev reference. So the xdp_buff object being passed to the new
bpf_test_run_xdp_live() function already has the right device in
ctx->rxq.

I'll add a check for this to the selftest to make it explicit.

>> I enable forwarding because the XDP program that counts the packets is
>> running on the other end of the veth pair (on veth_dst), while the
>> traffic gen is using veth_src as its ingress ifindex. So for XDP_TX and
>> XDP_REDIRECT we send the frame back out the veth device, and it ends up
>> being processed by the XDP program on veth_dst, and counted.
>
> Not for XDP_TX. If I'm reading patch 6 correctly it gets xmited
> out of loopback.

See above.

>> But when
>> the test program returns XDP_PASS, the packet will go up the frame; so
>> to get it back to the counting program I enable forwarding and set the
>> packet dst IP so that the stack routes it back out the same interface.
>>
>> I'll admit this is a bit hacky; I guess I can add a second TC ingress
>> program that will count the packets being XDP_PASS'ed instead...
>
> No. Please figure out how to XDP_PASS and XDP_TX without enabling forward
> and counting in different places.
> imo the forwarding hides the issue in the design that should be addressed.
> When rx ifindex is an actual ifindex given by user space instead of
> loopback all problems go away.

No the problem of XDP_PASS going in the opposite direction of XDP_TX and
XDP_REDIRECT remains. This is just like on a physical interface: if you
XDP_TX a packet it goes back out, if you XDP_PASS it, it goes up the
stack. To intercept both after the fact, you need to look in two
different places.

Anyhow, just using a TC hook for XDP_PASS works fine and gets rid of the
forwarding hack; I'll send a v6 with that just as soon as I verify that
I didn't break anything when running the traffic generator on bare metal :)

-Toke
Alexei Starovoitov Jan. 6, 2022, 7:56 p.m. UTC | #5
On Thu, Jan 6, 2022 at 10:21 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>
> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>
> > On Thu, Jan 6, 2022 at 6:34 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
> >>
> >> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
> >>
> >> > On Mon, Jan 03, 2022 at 04:08:12PM +0100, Toke Høiland-Jørgensen wrote:
> >> >> +
> >> >> +#define NUM_PKTS 3
> >> >
> >> > May be send a bit more than 3 packets?
> >> > Just to test skb_list logic for XDP_PASS.
> >>
> >> OK, can do.
> >>
> >> >> +
> >> >> +    /* We setup a veth pair that we can not only XDP_REDIRECT packets
> >> >> +     * between, but also route them. The test packet (defined above) has
> >> >> +     * address information so it will be routed back out the same interface
> >> >> +     * after it has been received, which will allow it to be picked up by
> >> >> +     * the XDP program on the destination interface.
> >> >> +     *
> >> >> +     * The XDP program we run with bpf_prog_run() will cycle through all
> >> >> +     * four return codes (DROP/PASS/TX/REDIRECT), so we should end up with
> >> >> +     * NUM_PKTS - 1 packets seen on the dst iface. We match the packets on
> >> >> +     * the UDP payload.
> >> >> +     */
> >> >> +    SYS("ip link add veth_src type veth peer name veth_dst");
> >> >> +    SYS("ip link set dev veth_src address 00:11:22:33:44:55");
> >> >> +    SYS("ip link set dev veth_dst address 66:77:88:99:aa:bb");
> >> >> +    SYS("ip link set dev veth_src up");
> >> >> +    SYS("ip link set dev veth_dst up");
> >> >> +    SYS("ip addr add dev veth_src fc00::1/64");
> >> >> +    SYS("ip addr add dev veth_dst fc00::2/64");
> >> >> +    SYS("ip neigh add fc00::2 dev veth_src lladdr 66:77:88:99:aa:bb");
> >> >> +    SYS("sysctl -w net.ipv6.conf.all.forwarding=1");
> >> >
> >> > These commands pollute current netns. The test has to create its own netns
> >> > like other tests do.
> >>
> >> Right, will fix.
> >>
> >> > The forwarding=1 is odd. Nothing in the comments or commit logs
> >> > talks about it.
> >>
> >> Hmm, yeah, should probably have added an explanation, sorry about that :)
> >>
> >> > I'm guessing it's due to patch 6 limitation of picking loopback
> >> > for XDP_PASS and XDP_TX, right?
> >> > There is ingress_ifindex field in struct xdp_md.
> >> > May be use that to setup dev and rxq in test_run in patch 6?
> >> > Then there will be no need to hack through forwarding=1 ?
> >>
> >> No, as you note there's already ingress_ifindex to set the device, and
> >> the test does use that:
> >>
> >> +       memcpy(skel->rodata->expect_dst, &pkt_udp.eth.h_dest, ETH_ALEN);
> >> +       skel->rodata->ifindex_out = ifindex_src;
> >> +       ctx_in.ingress_ifindex = ifindex_src;
> >
> > My point is that this ingress_ifindex should be used instead of loopback.
> > Otherwise the test_run infra is lying to the xdp program.
>
> But it is already using that! There is just no explicit code in patch 6
> to do that because that was already part of the XDP prog_run
> functionality.
>
> Specifically, the existing bpf_prog_test_run_xdp() will pass the context
> through xdp_convert_md_to_buff() which will resolve the ifindex and get
> a dev reference. So the xdp_buff object being passed to the new
> bpf_test_run_xdp_live() function already has the right device in
> ctx->rxq.

Got it. Please make it clear in the commit log.

> No the problem of XDP_PASS going in the opposite direction of XDP_TX and
> XDP_REDIRECT remains. This is just like on a physical interface: if you
> XDP_TX a packet it goes back out, if you XDP_PASS it, it goes up the
> stack. To intercept both after the fact, you need to look in two
> different places.
>
> Anyhow, just using a TC hook for XDP_PASS works fine and gets rid of the
> forwarding hack; I'll send a v6 with that just as soon as I verify that
> I didn't break anything when running the traffic generator on bare metal :)

Got it. You mean a tc ingress prog attached to veth_src ? That should work.
Toke Høiland-Jørgensen Jan. 6, 2022, 8:20 p.m. UTC | #6
Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:

> On Thu, Jan 6, 2022 at 10:21 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>>
>> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>>
>> > On Thu, Jan 6, 2022 at 6:34 AM Toke Høiland-Jørgensen <toke@redhat.com> wrote:
>> >>
>> >> Alexei Starovoitov <alexei.starovoitov@gmail.com> writes:
>> >>
>> >> > On Mon, Jan 03, 2022 at 04:08:12PM +0100, Toke Høiland-Jørgensen wrote:
>> >> >> +
>> >> >> +#define NUM_PKTS 3
>> >> >
>> >> > May be send a bit more than 3 packets?
>> >> > Just to test skb_list logic for XDP_PASS.
>> >>
>> >> OK, can do.
>> >>
>> >> >> +
>> >> >> +    /* We setup a veth pair that we can not only XDP_REDIRECT packets
>> >> >> +     * between, but also route them. The test packet (defined above) has
>> >> >> +     * address information so it will be routed back out the same interface
>> >> >> +     * after it has been received, which will allow it to be picked up by
>> >> >> +     * the XDP program on the destination interface.
>> >> >> +     *
>> >> >> +     * The XDP program we run with bpf_prog_run() will cycle through all
>> >> >> +     * four return codes (DROP/PASS/TX/REDIRECT), so we should end up with
>> >> >> +     * NUM_PKTS - 1 packets seen on the dst iface. We match the packets on
>> >> >> +     * the UDP payload.
>> >> >> +     */
>> >> >> +    SYS("ip link add veth_src type veth peer name veth_dst");
>> >> >> +    SYS("ip link set dev veth_src address 00:11:22:33:44:55");
>> >> >> +    SYS("ip link set dev veth_dst address 66:77:88:99:aa:bb");
>> >> >> +    SYS("ip link set dev veth_src up");
>> >> >> +    SYS("ip link set dev veth_dst up");
>> >> >> +    SYS("ip addr add dev veth_src fc00::1/64");
>> >> >> +    SYS("ip addr add dev veth_dst fc00::2/64");
>> >> >> +    SYS("ip neigh add fc00::2 dev veth_src lladdr 66:77:88:99:aa:bb");
>> >> >> +    SYS("sysctl -w net.ipv6.conf.all.forwarding=1");
>> >> >
>> >> > These commands pollute current netns. The test has to create its own netns
>> >> > like other tests do.
>> >>
>> >> Right, will fix.
>> >>
>> >> > The forwarding=1 is odd. Nothing in the comments or commit logs
>> >> > talks about it.
>> >>
>> >> Hmm, yeah, should probably have added an explanation, sorry about that :)
>> >>
>> >> > I'm guessing it's due to patch 6 limitation of picking loopback
>> >> > for XDP_PASS and XDP_TX, right?
>> >> > There is ingress_ifindex field in struct xdp_md.
>> >> > May be use that to setup dev and rxq in test_run in patch 6?
>> >> > Then there will be no need to hack through forwarding=1 ?
>> >>
>> >> No, as you note there's already ingress_ifindex to set the device, and
>> >> the test does use that:
>> >>
>> >> +       memcpy(skel->rodata->expect_dst, &pkt_udp.eth.h_dest, ETH_ALEN);
>> >> +       skel->rodata->ifindex_out = ifindex_src;
>> >> +       ctx_in.ingress_ifindex = ifindex_src;
>> >
>> > My point is that this ingress_ifindex should be used instead of loopback.
>> > Otherwise the test_run infra is lying to the xdp program.
>>
>> But it is already using that! There is just no explicit code in patch 6
>> to do that because that was already part of the XDP prog_run
>> functionality.
>>
>> Specifically, the existing bpf_prog_test_run_xdp() will pass the context
>> through xdp_convert_md_to_buff() which will resolve the ifindex and get
>> a dev reference. So the xdp_buff object being passed to the new
>> bpf_test_run_xdp_live() function already has the right device in
>> ctx->rxq.
>
> Got it. Please make it clear in the commit log.

Ah, sorry, already hit send on v6 before I saw this. If you want to fix
up the commit message while applying, how about a paragraph at the end
like:


The new mode reuses the setup code from the existing bpf_prog_run() for
XDP. This means that userspace can set the ingress ifindex and RXQ
number as part of the context object being passed to the kernel, in
which case packets will look like they arrived on that interface when
the test program returns XDP_PASS and the packets go up the stack.

>> No the problem of XDP_PASS going in the opposite direction of XDP_TX and
>> XDP_REDIRECT remains. This is just like on a physical interface: if you
>> XDP_TX a packet it goes back out, if you XDP_PASS it, it goes up the
>> stack. To intercept both after the fact, you need to look in two
>> different places.
>>
>> Anyhow, just using a TC hook for XDP_PASS works fine and gets rid of the
>> forwarding hack; I'll send a v6 with that just as soon as I verify that
>> I didn't break anything when running the traffic generator on bare metal :)
>
> Got it. You mean a tc ingress prog attached to veth_src ? That should work.

Yup, exactly!

-Toke
diff mbox series

Patch

diff --git a/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
new file mode 100644
index 000000000000..a587c351d495
--- /dev/null
+++ b/tools/testing/selftests/bpf/prog_tests/xdp_do_redirect.c
@@ -0,0 +1,118 @@ 
+// SPDX-License-Identifier: GPL-2.0
+#include <test_progs.h>
+#include <network_helpers.h>
+#include <net/if.h>
+#include <linux/if_ether.h>
+#include <linux/if_packet.h>
+#include <linux/ipv6.h>
+#include <linux/in6.h>
+#include <linux/udp.h>
+#include <bpf/bpf_endian.h>
+#include "test_xdp_do_redirect.skel.h"
+
+#define SYS(fmt, ...)						\
+	({							\
+		char cmd[1024];					\
+		snprintf(cmd, sizeof(cmd), fmt, ##__VA_ARGS__);	\
+		if (!ASSERT_OK(system(cmd), cmd))		\
+			goto fail;				\
+	})
+
+struct udp_packet {
+	struct ethhdr eth;
+	struct ipv6hdr iph;
+	struct udphdr udp;
+	__u8 payload[64 - sizeof(struct udphdr)
+		     - sizeof(struct ethhdr) - sizeof(struct ipv6hdr)];
+} __packed;
+
+static struct udp_packet pkt_udp = {
+	.eth.h_proto = __bpf_constant_htons(ETH_P_IPV6),
+	.eth.h_dest = {0x00, 0x11, 0x22, 0x33, 0x44, 0x55},
+	.eth.h_source = {0x66, 0x77, 0x88, 0x99, 0xaa, 0xbb},
+	.iph.version = 6,
+	.iph.nexthdr = IPPROTO_UDP,
+	.iph.payload_len = bpf_htons(sizeof(struct udp_packet)
+				     - offsetof(struct udp_packet, udp)),
+	.iph.hop_limit = 2,
+	.iph.saddr.s6_addr16 = {bpf_htons(0xfc00), 0, 0, 0, 0, 0, 0, bpf_htons(1)},
+	.iph.daddr.s6_addr16 = {bpf_htons(0xfc00), 0, 0, 0, 0, 0, 0, bpf_htons(2)},
+	.udp.source = bpf_htons(1),
+	.udp.dest = bpf_htons(1),
+	.udp.len = bpf_htons(sizeof(struct udp_packet)
+			     - offsetof(struct udp_packet, udp)),
+	.payload = {0x42}, /* receiver XDP program matches on this */
+};
+
+#define NUM_PKTS 3
+void test_xdp_do_redirect(void)
+{
+	struct test_xdp_do_redirect *skel = NULL;
+	struct xdp_md ctx_in = { .data_end = sizeof(pkt_udp) };
+	DECLARE_LIBBPF_OPTS(bpf_test_run_opts, opts,
+			    .data_in = &pkt_udp,
+			    .data_size_in = sizeof(pkt_udp),
+			    .ctx_in = &ctx_in,
+			    .ctx_size_in = sizeof(ctx_in),
+			    .flags = BPF_F_TEST_XDP_LIVE_FRAMES,
+			    .repeat = NUM_PKTS,
+		);
+	int err, prog_fd, ifindex_src, ifindex_dst;
+	struct bpf_link *link;
+
+	skel = test_xdp_do_redirect__open();
+	if (!ASSERT_OK_PTR(skel, "skel"))
+		return;
+
+	/* We setup a veth pair that we can not only XDP_REDIRECT packets
+	 * between, but also route them. The test packet (defined above) has
+	 * address information so it will be routed back out the same interface
+	 * after it has been received, which will allow it to be picked up by
+	 * the XDP program on the destination interface.
+	 *
+	 * The XDP program we run with bpf_prog_run() will cycle through all
+	 * four return codes (DROP/PASS/TX/REDIRECT), so we should end up with
+	 * NUM_PKTS - 1 packets seen on the dst iface. We match the packets on
+	 * the UDP payload.
+	 */
+	SYS("ip link add veth_src type veth peer name veth_dst");
+	SYS("ip link set dev veth_src address 00:11:22:33:44:55");
+	SYS("ip link set dev veth_dst address 66:77:88:99:aa:bb");
+	SYS("ip link set dev veth_src up");
+	SYS("ip link set dev veth_dst up");
+	SYS("ip addr add dev veth_src fc00::1/64");
+	SYS("ip addr add dev veth_dst fc00::2/64");
+	SYS("ip neigh add fc00::2 dev veth_src lladdr 66:77:88:99:aa:bb");
+	SYS("sysctl -w net.ipv6.conf.all.forwarding=1");
+
+	ifindex_src = if_nametoindex("veth_src");
+	ifindex_dst = if_nametoindex("veth_dst");
+	if (!ASSERT_NEQ(ifindex_src, 0, "ifindex_src") ||
+	    !ASSERT_NEQ(ifindex_dst, 0, "ifindex_dst"))
+		goto fail;
+
+	memcpy(skel->rodata->expect_dst, &pkt_udp.eth.h_dest, ETH_ALEN);
+	skel->rodata->ifindex_out = ifindex_src;
+	ctx_in.ingress_ifindex = ifindex_src;
+
+	if (!ASSERT_OK(test_xdp_do_redirect__load(skel), "load"))
+		goto fail;
+
+	link = bpf_program__attach_xdp(skel->progs.xdp_count_pkts, ifindex_dst);
+	if (!ASSERT_OK_PTR(link, "prog_attach"))
+		goto fail;
+	skel->links.xdp_count_pkts = link;
+
+	prog_fd = bpf_program__fd(skel->progs.xdp_redirect_notouch);
+	err = bpf_prog_test_run_opts(prog_fd, &opts);
+	if (!ASSERT_OK(err, "prog_run"))
+		goto fail;
+
+	/* wait for the packets to be flushed */
+	kern_sync_rcu();
+
+	ASSERT_EQ(skel->bss->pkts_seen, NUM_PKTS - 1, "pkt_count");
+fail:
+	system("ip link del dev veth_src");
+	test_xdp_do_redirect__destroy(skel);
+}
diff --git a/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c b/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c
new file mode 100644
index 000000000000..f9ea587b0876
--- /dev/null
+++ b/tools/testing/selftests/bpf/progs/test_xdp_do_redirect.c
@@ -0,0 +1,39 @@ 
+// SPDX-License-Identifier: GPL-2.0
+#include <vmlinux.h>
+#include <bpf/bpf_helpers.h>
+
+#define ETH_ALEN 6
+const volatile int ifindex_out;
+const volatile __u8 expect_dst[ETH_ALEN];
+volatile int pkts_seen = 0;
+volatile int retcode = XDP_DROP;
+
+SEC("xdp")
+int xdp_redirect_notouch(struct xdp_md *xdp)
+{
+	if (retcode == XDP_REDIRECT)
+		bpf_redirect(ifindex_out, 0);
+	return retcode++;
+}
+
+SEC("xdp")
+int xdp_count_pkts(struct xdp_md *xdp)
+{
+	void *data = (void *)(long)xdp->data;
+	void *data_end = (void *)(long)xdp->data_end;
+	struct ethhdr *eth = data;
+	struct ipv6hdr *iph = (void *)(eth + 1);
+	struct udphdr *udp = (void *)(iph + 1);
+	__u8 *payload = (void *)(udp + 1);
+	int i;
+
+	if (payload + 1 > data_end)
+		return XDP_ABORTED;
+
+	if (iph->nexthdr == IPPROTO_UDP && *payload == 0x42)
+		pkts_seen++;
+
+	return XDP_PASS;
+}
+
+char _license[] SEC("license") = "GPL";