From patchwork Thu Aug 17 14:55:50 2023
X-Patchwork-Submitter: Breno Leitao
X-Patchwork-Id: 13356661
From: Breno Leitao <leitao@debian.org>
To: sdf@google.com, axboe@kernel.dk, asml.silence@gmail.com,
    willemdebruijn.kernel@gmail.com, martin.lau@linux.dev,
    "David S. Miller", Eric Dumazet, Jakub Kicinski, Paolo Abeni,
    Shuah Khan
Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
    netdev@vger.kernel.org, io-uring@vger.kernel.org, krisman@suse.de,
    linux-kselftest@vger.kernel.org (open list:KERNEL SELFTEST FRAMEWORK)
Subject: [PATCH v3 5/9] selftests/net: Extract uring helpers to be reusable
Date: Thu, 17 Aug 2023 07:55:50 -0700
Message-Id: <20230817145554.892543-6-leitao@debian.org>
In-Reply-To: <20230817145554.892543-1-leitao@debian.org>
References: <20230817145554.892543-1-leitao@debian.org>
X-Mailing-List: linux-kselftest@vger.kernel.org

Instead of defining basic io_uring functions in the test case, move them
to a common directory, so other tests can use them. This simplifies the
test code and reuses the common liburing infrastructure.

This is basically a copy of what we have in io_uring_zerocopy_tx, with
some minor improvements to make checkpatch happy. A follow-up test will
use the same helpers in a BPF sockopt test.
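As an illustration of the resulting API, here is a minimal usage sketch
(not part of the patch): it drives the helpers end to end for a plain
send, assuming an already-connected socket and the -I../../../include/
include path that the Makefile hunk below adds.

#include <io_uring/mini_liburing.h>

static int send_one(int sockfd, const void *buf, size_t len)
{
	struct io_uring_cqe *cqe;
	struct io_uring_sqe *sqe;
	struct io_uring ring;
	int ret;

	ret = io_uring_queue_init(8, &ring, 0);	/* 8-entry SQ/CQ rings */
	if (ret)
		return ret;

	sqe = io_uring_get_sqe(&ring);		/* NULL only if the SQ is full */
	if (!sqe) {
		io_uring_queue_exit(&ring);
		return -1;
	}
	io_uring_prep_send(sqe, sockfd, buf, len, 0);

	ret = io_uring_submit(&ring);		/* #SQEs submitted, or -errno */
	if (ret != 1) {
		io_uring_queue_exit(&ring);
		return ret < 0 ? ret : -1;
	}

	ret = io_uring_wait_cqe(&ring, &cqe);	/* blocks in io_uring_enter() */
	if (!ret) {
		ret = cqe->res;			/* bytes sent, or -errno */
		io_uring_cqe_seen(&ring);	/* consume the CQE */
	}

	io_uring_queue_exit(&ring);
	return ret;
}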
Signed-off-by: Breno Leitao <leitao@debian.org>
---
 tools/include/io_uring/mini_liburing.h        | 282 ++++++++++++++++++
 tools/testing/selftests/net/Makefile          |   1 +
 .../selftests/net/io_uring_zerocopy_tx.c      | 268 +----------------
 3 files changed, 285 insertions(+), 266 deletions(-)
 create mode 100644 tools/include/io_uring/mini_liburing.h

diff --git a/tools/include/io_uring/mini_liburing.h b/tools/include/io_uring/mini_liburing.h
new file mode 100644
index 000000000000..9ccb16074eb5
--- /dev/null
+++ b/tools/include/io_uring/mini_liburing.h
@@ -0,0 +1,282 @@
+/* SPDX-License-Identifier: MIT */
+
+#include <linux/io_uring.h>
+#include <sys/mman.h>
+#include <sys/syscall.h>
+#include <errno.h>
+#include <string.h>
+#include <unistd.h>
+
+struct io_sq_ring {
+	unsigned int *head;
+	unsigned int *tail;
+	unsigned int *ring_mask;
+	unsigned int *ring_entries;
+	unsigned int *flags;
+	unsigned int *array;
+};
+
+struct io_cq_ring {
+	unsigned int *head;
+	unsigned int *tail;
+	unsigned int *ring_mask;
+	unsigned int *ring_entries;
+	struct io_uring_cqe *cqes;
+};
+
+struct io_uring_sq {
+	unsigned int *khead;
+	unsigned int *ktail;
+	unsigned int *kring_mask;
+	unsigned int *kring_entries;
+	unsigned int *kflags;
+	unsigned int *kdropped;
+	unsigned int *array;
+	struct io_uring_sqe *sqes;
+
+	unsigned int sqe_head;
+	unsigned int sqe_tail;
+
+	size_t ring_sz;
+};
+
+struct io_uring_cq {
+	unsigned int *khead;
+	unsigned int *ktail;
+	unsigned int *kring_mask;
+	unsigned int *kring_entries;
+	unsigned int *koverflow;
+	struct io_uring_cqe *cqes;
+
+	size_t ring_sz;
+};
+
+struct io_uring {
+	struct io_uring_sq sq;
+	struct io_uring_cq cq;
+	int ring_fd;
+};
+
+#if defined(__x86_64) || defined(__i386__)
+#define read_barrier()	__asm__ __volatile__("":::"memory")
+#define write_barrier()	__asm__ __volatile__("":::"memory")
+#else
+#define read_barrier()	__sync_synchronize()
+#define write_barrier()	__sync_synchronize()
+#endif

+static inline int io_uring_mmap(int fd, struct io_uring_params *p,
+				struct io_uring_sq *sq, struct io_uring_cq *cq)
+{
+	size_t size;
+	void *ptr;
+	int ret;
+
+	sq->ring_sz = p->sq_off.array + p->sq_entries * sizeof(unsigned int);
+	ptr = mmap(0, sq->ring_sz, PROT_READ | PROT_WRITE,
+		   MAP_SHARED | MAP_POPULATE, fd, IORING_OFF_SQ_RING);
+	if (ptr == MAP_FAILED)
+		return -errno;
+	sq->khead = ptr + p->sq_off.head;
+	sq->ktail = ptr + p->sq_off.tail;
+	sq->kring_mask = ptr + p->sq_off.ring_mask;
+	sq->kring_entries = ptr + p->sq_off.ring_entries;
+	sq->kflags = ptr + p->sq_off.flags;
+	sq->kdropped = ptr + p->sq_off.dropped;
+	sq->array = ptr + p->sq_off.array;
+
+	size = p->sq_entries * sizeof(struct io_uring_sqe);
+	sq->sqes = mmap(0, size, PROT_READ | PROT_WRITE,
+			MAP_SHARED | MAP_POPULATE, fd, IORING_OFF_SQES);
+	if (sq->sqes == MAP_FAILED) {
+		ret = -errno;
+err:
+		munmap(sq->khead, sq->ring_sz);
+		return ret;
+	}
+
+	cq->ring_sz = p->cq_off.cqes + p->cq_entries * sizeof(struct io_uring_cqe);
+	ptr = mmap(0, cq->ring_sz, PROT_READ | PROT_WRITE,
+		   MAP_SHARED | MAP_POPULATE, fd, IORING_OFF_CQ_RING);
+	if (ptr == MAP_FAILED) {
+		ret = -errno;
+		munmap(sq->sqes, p->sq_entries * sizeof(struct io_uring_sqe));
+		goto err;
+	}
+	cq->khead = ptr + p->cq_off.head;
+	cq->ktail = ptr + p->cq_off.tail;
+	cq->kring_mask = ptr + p->cq_off.ring_mask;
+	cq->kring_entries = ptr + p->cq_off.ring_entries;
+	cq->koverflow = ptr + p->cq_off.overflow;
+	cq->cqes = ptr + p->cq_off.cqes;
+	return 0;
+}
+
+static inline int io_uring_setup(unsigned int entries,
+				 struct io_uring_params *p)
+{
+	return syscall(__NR_io_uring_setup, entries, p);
+}
+
+static inline int io_uring_enter(int fd, unsigned int to_submit,
+				 unsigned int min_complete,
+				 unsigned int flags, sigset_t *sig)
+{
+	return syscall(__NR_io_uring_enter, fd, to_submit, min_complete,
+		       flags, sig, _NSIG / 8);
+}
+
+static inline int io_uring_queue_init(unsigned int entries,
+				      struct io_uring *ring,
+				      unsigned int flags)
+{
+	struct io_uring_params p;
+	int fd, ret;
+
+	memset(ring, 0, sizeof(*ring));
+	memset(&p, 0, sizeof(p));
+	p.flags = flags;
+
+	fd = io_uring_setup(entries, &p);
+	if (fd < 0)
+		return fd;
+	ret = io_uring_mmap(fd, &p, &ring->sq, &ring->cq);
+	if (!ret)
+		ring->ring_fd = fd;
+	else
+		close(fd);
+	return ret;
+}
+
+/* Get a sqe */
+static inline struct io_uring_sqe *io_uring_get_sqe(struct io_uring *ring)
+{
+	struct io_uring_sq *sq = &ring->sq;
+
+	if (sq->sqe_tail + 1 - sq->sqe_head > *sq->kring_entries)
+		return NULL;
+	return &sq->sqes[sq->sqe_tail++ & *sq->kring_mask];
+}
+
+static inline int io_uring_wait_cqe(struct io_uring *ring,
+				    struct io_uring_cqe **cqe_ptr)
+{
+	struct io_uring_cq *cq = &ring->cq;
+	const unsigned int mask = *cq->kring_mask;
+	unsigned int head = *cq->khead;
+	int ret;
+
+	*cqe_ptr = NULL;
+	do {
+		read_barrier();
+		if (head != *cq->ktail) {
+			*cqe_ptr = &cq->cqes[head & mask];
+			break;
+		}
+		ret = io_uring_enter(ring->ring_fd, 0, 1,
+				     IORING_ENTER_GETEVENTS, NULL);
+		if (ret < 0)
+			return -errno;
+	} while (1);
+
+	return 0;
+}
+
+static inline int io_uring_submit(struct io_uring *ring)
+{
+	struct io_uring_sq *sq = &ring->sq;
+	const unsigned int mask = *sq->kring_mask;
+	unsigned int ktail, submitted, to_submit;
+	int ret;
+
+	read_barrier();
+	if (*sq->khead != *sq->ktail) {
+		submitted = *sq->kring_entries;
+		goto submit;
+	}
+	if (sq->sqe_head == sq->sqe_tail)
+		return 0;
+
+	ktail = *sq->ktail;
+	to_submit = sq->sqe_tail - sq->sqe_head;
+	for (submitted = 0; submitted < to_submit; submitted++) {
+		read_barrier();
+		sq->array[ktail++ & mask] = sq->sqe_head++ & mask;
+	}
+	if (!submitted)
+		return 0;
+
+	if (*sq->ktail != ktail) {
+		write_barrier();
+		*sq->ktail = ktail;
+		write_barrier();
+	}
+submit:
+	ret = io_uring_enter(ring->ring_fd, submitted, 0,
+			     IORING_ENTER_GETEVENTS, NULL);
+	return ret < 0 ? -errno : ret;
+}
+
+static inline void io_uring_queue_exit(struct io_uring *ring)
+{
+	struct io_uring_sq *sq = &ring->sq;
+
+	munmap(sq->sqes, *sq->kring_entries * sizeof(struct io_uring_sqe));
+	munmap(sq->khead, sq->ring_sz);
+	close(ring->ring_fd);
+}
+
+/* Prepare and send the SQE */
+static inline void io_uring_prep_cmd(struct io_uring_sqe *sqe, int op,
+				     int sockfd,
+				     int level, int optname,
+				     const void *optval,
+				     int optlen)
+{
+	memset(sqe, 0, sizeof(*sqe));
+	sqe->opcode = (__u8)IORING_OP_URING_CMD;
+	sqe->fd = sockfd;
+	sqe->cmd_op = op;
+
+	sqe->level = level;
+	sqe->optname = optname;
+	sqe->optval = (unsigned long long)optval;
+	sqe->optlen = optlen;
+}
+
+static inline int io_uring_register_buffers(struct io_uring *ring,
+					    const struct iovec *iovecs,
+					    unsigned int nr_iovecs)
+{
+	int ret;
+
+	ret = syscall(__NR_io_uring_register, ring->ring_fd,
+		      IORING_REGISTER_BUFFERS, iovecs, nr_iovecs);
+	return (ret < 0) ? -errno : ret;
+}
+
+static inline void io_uring_prep_send(struct io_uring_sqe *sqe, int sockfd,
+				      const void *buf, size_t len, int flags)
+{
+	memset(sqe, 0, sizeof(*sqe));
+	sqe->opcode = (__u8)IORING_OP_SEND;
+	sqe->fd = sockfd;
+	sqe->addr = (unsigned long)buf;
+	sqe->len = len;
+	sqe->msg_flags = (__u32)flags;
+}
+
+static inline void io_uring_prep_sendzc(struct io_uring_sqe *sqe, int sockfd,
+					const void *buf, size_t len, int flags,
+					unsigned int zc_flags)
+{
+	io_uring_prep_send(sqe, sockfd, buf, len, flags);
+	sqe->opcode = (__u8)IORING_OP_SEND_ZC;
+	sqe->ioprio = zc_flags;
+}
+
+static inline void io_uring_cqe_seen(struct io_uring *ring)
+{
+	*(&ring->cq)->khead += 1;
+	write_barrier();
+}
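To show how these pieces fit together for the zero-copy path that
io_uring_zerocopy_tx exercises, here is a hypothetical sketch (again,
not part of the patch) of one IORING_OP_SEND_ZC round trip, assuming a
connected socket. The two-CQE handling follows the general io_uring
send-zc contract: the first CQE carries the result, and when
IORING_CQE_F_MORE is set a second notification CQE follows once the
kernel is done with the buffer.

#include <io_uring/mini_liburing.h>

static int send_one_zc(int sockfd, const void *buf, size_t len)
{
	struct io_uring_cqe *cqe;
	struct io_uring_sqe *sqe;
	struct io_uring ring;
	unsigned int more;
	int ret, res;

	ret = io_uring_queue_init(8, &ring, 0);
	if (ret)
		return ret;

	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_sendzc(sqe, sockfd, buf, len, 0, 0);

	ret = io_uring_submit(&ring);
	if (ret != 1) {
		io_uring_queue_exit(&ring);
		return ret < 0 ? ret : -1;
	}

	/* First CQE: byte count on success, -errno on failure. */
	ret = io_uring_wait_cqe(&ring, &cqe);
	if (ret) {
		io_uring_queue_exit(&ring);
		return ret;
	}
	res = cqe->res;
	more = cqe->flags & IORING_CQE_F_MORE;	/* notification pending? */
	io_uring_cqe_seen(&ring);

	/* Second CQE: the zero-copy notification, meaning the kernel no
	 * longer references the buffer and it may be reused.
	 */
	if (more && !io_uring_wait_cqe(&ring, &cqe))
		io_uring_cqe_seen(&ring);

	io_uring_queue_exit(&ring);
	return res;
}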
diff --git a/tools/testing/selftests/net/Makefile b/tools/testing/selftests/net/Makefile
index 7f3ab2a93ed6..70f4a5982b01 100644
--- a/tools/testing/selftests/net/Makefile
+++ b/tools/testing/selftests/net/Makefile
@@ -94,6 +94,7 @@ $(OUTPUT)/reuseport_bpf_numa: LDLIBS += -lnuma
 $(OUTPUT)/tcp_mmap: LDLIBS += -lpthread -lcrypto
 $(OUTPUT)/tcp_inq: LDLIBS += -lpthread
 $(OUTPUT)/bind_bhash: LDLIBS += -lpthread
+$(OUTPUT)/io_uring_zerocopy_tx: CFLAGS += -I../../../include/
 
 # Rules to generate bpf obj nat6to4.o
 CLANG ?= clang
diff --git a/tools/testing/selftests/net/io_uring_zerocopy_tx.c b/tools/testing/selftests/net/io_uring_zerocopy_tx.c
index 154287740172..76e604e4810e 100644
--- a/tools/testing/selftests/net/io_uring_zerocopy_tx.c
+++ b/tools/testing/selftests/net/io_uring_zerocopy_tx.c
@@ -36,6 +36,8 @@
 #include
 #include
 
+#include <io_uring/mini_liburing.h>
+
 #define NOTIF_TAG 0xfffffffULL
 #define NONZC_TAG 0
 #define ZC_TAG 1
@@ -60,272 +62,6 @@ static struct sockaddr_storage cfg_dst_addr;
 
 static char payload[IP_MAXPACKET] __attribute__((aligned(4096)));
 
-struct io_sq_ring {
-	unsigned *head;
-	unsigned *tail;
-	unsigned *ring_mask;
-	unsigned *ring_entries;
-	unsigned *flags;
-	unsigned *array;
-};
-
-struct io_cq_ring {
-	unsigned *head;
-	unsigned *tail;
-	unsigned *ring_mask;
-	unsigned *ring_entries;
-	struct io_uring_cqe *cqes;
-};
-
-struct io_uring_sq {
-	unsigned *khead;
-	unsigned *ktail;
-	unsigned *kring_mask;
-	unsigned *kring_entries;
-	unsigned *kflags;
-	unsigned *kdropped;
-	unsigned *array;
-	struct io_uring_sqe *sqes;
-
-	unsigned sqe_head;
-	unsigned sqe_tail;
-
-	size_t ring_sz;
-};
-
-struct io_uring_cq {
-	unsigned *khead;
-	unsigned *ktail;
-	unsigned *kring_mask;
-	unsigned *kring_entries;
-	unsigned *koverflow;
-	struct io_uring_cqe *cqes;
-
-	size_t ring_sz;
-};
-
-struct io_uring {
-	struct io_uring_sq sq;
-	struct io_uring_cq cq;
-	int ring_fd;
-};
-
-#ifdef __alpha__
-# ifndef __NR_io_uring_setup
-#  define __NR_io_uring_setup		535
-# endif
-# ifndef __NR_io_uring_enter
-#  define __NR_io_uring_enter		536
-# endif
-# ifndef __NR_io_uring_register
-#  define __NR_io_uring_register	537
-# endif
-#else /* !__alpha__ */
-# ifndef __NR_io_uring_setup
-#  define __NR_io_uring_setup		425
-# endif
-# ifndef __NR_io_uring_enter
-#  define __NR_io_uring_enter		426
-# endif
-# ifndef __NR_io_uring_register
-#  define __NR_io_uring_register	427
-# endif
-#endif
-
-#if defined(__x86_64) || defined(__i386__)
-#define read_barrier()	__asm__ __volatile__("":::"memory")
-#define write_barrier()	__asm__ __volatile__("":::"memory")
-#else
-
-#define read_barrier()	__sync_synchronize()
-#define write_barrier()	__sync_synchronize()
-#endif
-
-static int io_uring_setup(unsigned int entries, struct io_uring_params *p)
-{
-	return syscall(__NR_io_uring_setup, entries, p);
-}
-
-static int io_uring_enter(int fd, unsigned int to_submit,
-			  unsigned int min_complete,
-			  unsigned int flags, sigset_t *sig)
-{
-	return syscall(__NR_io_uring_enter, fd, to_submit, min_complete,
-		       flags, sig, _NSIG / 8);
-}
-
-static int io_uring_register_buffers(struct io_uring *ring,
-				     const struct iovec *iovecs,
-				     unsigned nr_iovecs)
-{
-	int ret;
-
-	ret = syscall(__NR_io_uring_register, ring->ring_fd,
-		      IORING_REGISTER_BUFFERS, iovecs, nr_iovecs);
-	return (ret < 0) ? -errno : ret;
-}
-
-static int io_uring_mmap(int fd, struct io_uring_params *p,
-			 struct io_uring_sq *sq, struct io_uring_cq *cq)
-{
-	size_t size;
-	void *ptr;
-	int ret;
-
-	sq->ring_sz = p->sq_off.array + p->sq_entries * sizeof(unsigned);
-	ptr = mmap(0, sq->ring_sz, PROT_READ | PROT_WRITE,
-		   MAP_SHARED | MAP_POPULATE, fd, IORING_OFF_SQ_RING);
-	if (ptr == MAP_FAILED)
-		return -errno;
-	sq->khead = ptr + p->sq_off.head;
-	sq->ktail = ptr + p->sq_off.tail;
-	sq->kring_mask = ptr + p->sq_off.ring_mask;
-	sq->kring_entries = ptr + p->sq_off.ring_entries;
-	sq->kflags = ptr + p->sq_off.flags;
-	sq->kdropped = ptr + p->sq_off.dropped;
-	sq->array = ptr + p->sq_off.array;
-
-	size = p->sq_entries * sizeof(struct io_uring_sqe);
-	sq->sqes = mmap(0, size, PROT_READ | PROT_WRITE,
-			MAP_SHARED | MAP_POPULATE, fd, IORING_OFF_SQES);
-	if (sq->sqes == MAP_FAILED) {
-		ret = -errno;
-err:
-		munmap(sq->khead, sq->ring_sz);
-		return ret;
-	}
-
-	cq->ring_sz = p->cq_off.cqes + p->cq_entries * sizeof(struct io_uring_cqe);
-	ptr = mmap(0, cq->ring_sz, PROT_READ | PROT_WRITE,
-		   MAP_SHARED | MAP_POPULATE, fd, IORING_OFF_CQ_RING);
-	if (ptr == MAP_FAILED) {
-		ret = -errno;
-		munmap(sq->sqes, p->sq_entries * sizeof(struct io_uring_sqe));
-		goto err;
-	}
-	cq->khead = ptr + p->cq_off.head;
-	cq->ktail = ptr + p->cq_off.tail;
-	cq->kring_mask = ptr + p->cq_off.ring_mask;
-	cq->kring_entries = ptr + p->cq_off.ring_entries;
-	cq->koverflow = ptr + p->cq_off.overflow;
-	cq->cqes = ptr + p->cq_off.cqes;
-	return 0;
-}
-
-static int io_uring_queue_init(unsigned entries, struct io_uring *ring,
-			       unsigned flags)
-{
-	struct io_uring_params p;
-	int fd, ret;
-
-	memset(ring, 0, sizeof(*ring));
-	memset(&p, 0, sizeof(p));
-	p.flags = flags;
-
-	fd = io_uring_setup(entries, &p);
-	if (fd < 0)
-		return fd;
-	ret = io_uring_mmap(fd, &p, &ring->sq, &ring->cq);
-	if (!ret)
-		ring->ring_fd = fd;
-	else
-		close(fd);
-	return ret;
-}
-
-static int io_uring_submit(struct io_uring *ring)
-{
-	struct io_uring_sq *sq = &ring->sq;
-	const unsigned mask = *sq->kring_mask;
-	unsigned ktail, submitted, to_submit;
-	int ret;
-
-	read_barrier();
-	if (*sq->khead != *sq->ktail) {
-		submitted = *sq->kring_entries;
-		goto submit;
-	}
-	if (sq->sqe_head == sq->sqe_tail)
-		return 0;
-
-	ktail = *sq->ktail;
-	to_submit = sq->sqe_tail - sq->sqe_head;
-	for (submitted = 0; submitted < to_submit; submitted++) {
-		read_barrier();
-		sq->array[ktail++ & mask] = sq->sqe_head++ & mask;
-	}
-	if (!submitted)
-		return 0;
-
-	if (*sq->ktail != ktail) {
-		write_barrier();
-		*sq->ktail = ktail;
-		write_barrier();
-	}
-submit:
-	ret = io_uring_enter(ring->ring_fd, submitted, 0,
-			     IORING_ENTER_GETEVENTS, NULL);
-	return ret < 0 ? -errno : ret;
-}
-
-static inline void io_uring_prep_send(struct io_uring_sqe *sqe, int sockfd,
-				      const void *buf, size_t len, int flags)
-{
-	memset(sqe, 0, sizeof(*sqe));
-	sqe->opcode = (__u8) IORING_OP_SEND;
-	sqe->fd = sockfd;
-	sqe->addr = (unsigned long) buf;
-	sqe->len = len;
-	sqe->msg_flags = (__u32) flags;
-}
-
-static inline void io_uring_prep_sendzc(struct io_uring_sqe *sqe, int sockfd,
-					const void *buf, size_t len, int flags,
-					unsigned zc_flags)
-{
-	io_uring_prep_send(sqe, sockfd, buf, len, flags);
-	sqe->opcode = (__u8) IORING_OP_SEND_ZC;
-	sqe->ioprio = zc_flags;
-}
-
-static struct io_uring_sqe *io_uring_get_sqe(struct io_uring *ring)
-{
-	struct io_uring_sq *sq = &ring->sq;
-
-	if (sq->sqe_tail + 1 - sq->sqe_head > *sq->kring_entries)
-		return NULL;
-	return &sq->sqes[sq->sqe_tail++ & *sq->kring_mask];
-}
-
-static int io_uring_wait_cqe(struct io_uring *ring, struct io_uring_cqe **cqe_ptr)
-{
-	struct io_uring_cq *cq = &ring->cq;
-	const unsigned mask = *cq->kring_mask;
-	unsigned head = *cq->khead;
-	int ret;
-
-	*cqe_ptr = NULL;
-	do {
-		read_barrier();
-		if (head != *cq->ktail) {
-			*cqe_ptr = &cq->cqes[head & mask];
-			break;
-		}
-		ret = io_uring_enter(ring->ring_fd, 0, 1,
-				     IORING_ENTER_GETEVENTS, NULL);
-		if (ret < 0)
-			return -errno;
-	} while (1);
-
-	return 0;
-}
-
-static inline void io_uring_cqe_seen(struct io_uring *ring)
-{
-	*(&ring->cq)->khead += 1;
-	write_barrier();
-}
-
 static unsigned long gettimeofday_ms(void)
 {
 	struct timeval tv;

From patchwork Thu Aug 17 14:55:54 2023
X-Patchwork-Submitter: Breno Leitao
X-Patchwork-Id: 13356660
From: Breno Leitao <leitao@debian.org>
To: sdf@google.com, axboe@kernel.dk, asml.silence@gmail.com,
    willemdebruijn.kernel@gmail.com, martin.lau@linux.dev,
    Andrii Nakryiko, Mykola Lysenko, Alexei Starovoitov,
    Daniel Borkmann, Song Liu, Yonghong Song, John Fastabend, KP Singh,
    Hao Luo, Jiri Olsa, Shuah Khan
Cc: bpf@vger.kernel.org, linux-kernel@vger.kernel.org,
    netdev@vger.kernel.org, io-uring@vger.kernel.org, kuba@kernel.org,
    pabeni@redhat.com, krisman@suse.de, Wang Yufen, Daniel Müller,
    linux-kselftest@vger.kernel.org (open list:KERNEL SELFTEST FRAMEWORK)
Subject: [PATCH v3 9/9] selftests/bpf/sockopt: Add io_uring support
Date: Thu, 17 Aug 2023 07:55:54 -0700
Message-Id: <20230817145554.892543-10-leitao@debian.org>
In-Reply-To: <20230817145554.892543-1-leitao@debian.org>
References: <20230817145554.892543-1-leitao@debian.org>
X-Mailing-List: linux-kselftest@vger.kernel.org

Expand the sockopt test to also check the io_uring {g,s}etsockopt
command operations. This patch starts by marking each test with whether
it supports io_uring or not. Right now, the io_uring getsockopt()
command has a limitation of only accepting level == SOL_SOCKET;
otherwise it returns -EOPNOTSUPP.

Later, each test runs using the regular {g,s}etsockopt system calls
and, if the test supports io_uring, runs the same test again, this time
calling the io_uring {g,s}etsockopt commands. For that, leverage the
mini liburing to do the basic io_uring operations.
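For context, a hypothetical sketch (not part of the patch) of what one
such io_uring getsockopt call looks like when built on the mini liburing
helpers; the patch's uring_sockopt() below generalizes this over both
commands. SOCKET_URING_OP_GETSOCKOPT is the command opcode this series
adds, and on success cqe->res carries the resulting optlen.

#include <sys/socket.h>
#include <io_uring/mini_liburing.h>

static int get_reuseaddr_via_uring(int sockfd, int *val)
{
	struct io_uring_cqe *cqe;
	struct io_uring_sqe *sqe;
	struct io_uring ring;
	int ret;

	ret = io_uring_queue_init(1, &ring, 0);
	if (ret)
		return ret;

	/* SOL_SOCKET is the only level currently accepted; any other
	 * level completes with cqe->res == -EOPNOTSUPP.
	 */
	sqe = io_uring_get_sqe(&ring);
	io_uring_prep_cmd(sqe, SOCKET_URING_OP_GETSOCKOPT, sockfd,
			  SOL_SOCKET, SO_REUSEADDR, val, sizeof(*val));

	ret = io_uring_submit(&ring);
	if (ret != 1) {
		io_uring_queue_exit(&ring);
		return ret < 0 ? ret : -1;
	}

	ret = io_uring_wait_cqe(&ring, &cqe);
	if (!ret)
		ret = cqe->res;	/* resulting optlen, or -errno */

	io_uring_queue_exit(&ring);
	return ret;
}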
Signed-off-by: Breno Leitao <leitao@debian.org>
---
 tools/testing/selftests/bpf/Makefile          |   1 +
 .../selftests/bpf/prog_tests/sockopt.c        | 129 +++++++++++++++++-
 2 files changed, 124 insertions(+), 6 deletions(-)

diff --git a/tools/testing/selftests/bpf/Makefile b/tools/testing/selftests/bpf/Makefile
index 538df8fb8c42..4da04242b848 100644
--- a/tools/testing/selftests/bpf/Makefile
+++ b/tools/testing/selftests/bpf/Makefile
@@ -362,6 +362,7 @@ CLANG_CFLAGS = $(CLANG_SYS_INCLUDES) \
 
 $(OUTPUT)/test_l4lb_noinline.o: BPF_CFLAGS += -fno-inline
 $(OUTPUT)/test_xdp_noinline.o: BPF_CFLAGS += -fno-inline
+$(OUTPUT)/test_progs.o: CFLAGS += -I../../../include/
 
 $(OUTPUT)/flow_dissector_load.o: flow_dissector_load.h
 $(OUTPUT)/cgroup_getset_retval_hooks.o: cgroup_getset_retval_hooks.h
diff --git a/tools/testing/selftests/bpf/prog_tests/sockopt.c b/tools/testing/selftests/bpf/prog_tests/sockopt.c
index 9e6a5e3ed4de..4693ad8bfe8f 100644
--- a/tools/testing/selftests/bpf/prog_tests/sockopt.c
+++ b/tools/testing/selftests/bpf/prog_tests/sockopt.c
@@ -1,5 +1,6 @@
 // SPDX-License-Identifier: GPL-2.0
 #include <test_progs.h>
+#include <io_uring/mini_liburing.h>
 #include "cgroup_helpers.h"
 
 static char bpf_log_buf[4096];
@@ -38,6 +39,7 @@ static struct sockopt_test {
 	socklen_t			get_optlen_ret;
 
 	enum sockopt_test_error		error;
+	bool				io_uring_support;
 } tests[] = {
 
 	/* ==================== getsockopt ==================== */
@@ -53,6 +55,7 @@ static struct sockopt_test {
 		.attach_type = BPF_CGROUP_GETSOCKOPT,
 		.expected_attach_type = 0,
 		.error = DENY_LOAD,
+		.io_uring_support = true,
 	},
 	{
 		.descr = "getsockopt: wrong expected_attach_type",
@@ -65,6 +68,7 @@ static struct sockopt_test {
 		.attach_type = BPF_CGROUP_GETSOCKOPT,
 		.expected_attach_type = BPF_CGROUP_SETSOCKOPT,
 		.error = DENY_ATTACH,
+		.io_uring_support = true,
 	},
 	{
 		.descr = "getsockopt: bypass bpf hook",
@@ -121,6 +125,7 @@ static struct sockopt_test {
 		.attach_type = BPF_CGROUP_GETSOCKOPT,
 		.expected_attach_type = BPF_CGROUP_GETSOCKOPT,
 		.error = DENY_LOAD,
+		.io_uring_support = true,
 	},
 	{
 		.descr = "getsockopt: read ctx->level",
@@ -164,6 +169,7 @@ static struct sockopt_test {
 		.expected_attach_type = BPF_CGROUP_GETSOCKOPT,
 
 		.error = DENY_LOAD,
+		.io_uring_support = true,
 	},
 	{
 		.descr = "getsockopt: read ctx->optname",
@@ -225,6 +231,7 @@ static struct sockopt_test {
 		.expected_attach_type = BPF_CGROUP_GETSOCKOPT,
 
 		.error = DENY_LOAD,
+		.io_uring_support = true,
 	},
 	{
 		.descr = "getsockopt: read ctx->optlen",
@@ -252,6 +259,7 @@ static struct sockopt_test {
 		.expected_attach_type = BPF_CGROUP_GETSOCKOPT,
 
 		.get_optlen = 64,
+		.io_uring_support = true,
 	},
 	{
 		.descr = "getsockopt: deny bigger ctx->optlen",
@@ -276,6 +284,7 @@ static struct sockopt_test {
 		.get_optlen = 64,
 
 		.error = EFAULT_GETSOCKOPT,
+		.io_uring_support = true,
 	},
 	{
 		.descr = "getsockopt: ignore >PAGE_SIZE optlen",
@@ -318,6 +327,7 @@ static struct sockopt_test {
 		.get_optval = {}, /* the changes are ignored */
 		.get_optlen = PAGE_SIZE + 1,
 		.error = EOPNOTSUPP_GETSOCKOPT,
+		.io_uring_support = true,
 	},
 	{
 		.descr = "getsockopt: support smaller ctx->optlen",
@@ -339,6 +349,7 @@ static struct sockopt_test {
 
 		.get_optlen = 64,
 		.get_optlen_ret = 32,
+		.io_uring_support = true,
 	},
 	{
 		.descr = "getsockopt: deny writing to ctx->optval",
@@ -353,6 +364,7 @@ static struct sockopt_test {
 		.expected_attach_type = BPF_CGROUP_GETSOCKOPT,
 
 		.error = DENY_LOAD,
+		.io_uring_support = true,
 	},
 	{
 		.descr = "getsockopt: deny writing to ctx->optval_end",
@@ -367,6 +379,7 @@ static struct sockopt_test {
 		.expected_attach_type = BPF_CGROUP_GETSOCKOPT,
 
 		.error = DENY_LOAD,
+		.io_uring_support = true,
 	},
 	{
 		.descr = "getsockopt: rewrite value",
= "getsockopt: rewrite value", @@ -421,6 +434,7 @@ static struct sockopt_test { .attach_type = BPF_CGROUP_SETSOCKOPT, .expected_attach_type = 0, .error = DENY_LOAD, + .io_uring_support = true, }, { .descr = "setsockopt: wrong expected_attach_type", @@ -433,6 +447,7 @@ static struct sockopt_test { .attach_type = BPF_CGROUP_SETSOCKOPT, .expected_attach_type = BPF_CGROUP_GETSOCKOPT, .error = DENY_ATTACH, + .io_uring_support = true, }, { .descr = "setsockopt: bypass bpf hook", @@ -518,6 +533,7 @@ static struct sockopt_test { .set_level = 123, .set_optlen = 1, + .io_uring_support = true, }, { .descr = "setsockopt: allow changing ctx->level", @@ -572,6 +588,7 @@ static struct sockopt_test { .set_optname = 123, .set_optlen = 1, + .io_uring_support = true, }, { .descr = "setsockopt: allow changing ctx->optname", @@ -624,6 +641,7 @@ static struct sockopt_test { .expected_attach_type = BPF_CGROUP_SETSOCKOPT, .set_optlen = 64, + .io_uring_support = true, }, { .descr = "setsockopt: ctx->optlen == -1 is ok", @@ -640,6 +658,7 @@ static struct sockopt_test { .expected_attach_type = BPF_CGROUP_SETSOCKOPT, .set_optlen = 64, + .io_uring_support = true, }, { .descr = "setsockopt: deny ctx->optlen < 0 (except -1)", @@ -658,6 +677,7 @@ static struct sockopt_test { .set_optlen = 4, .error = EFAULT_SETSOCKOPT, + .io_uring_support = true, }, { .descr = "setsockopt: deny ctx->optlen > input optlen", @@ -675,6 +695,7 @@ static struct sockopt_test { .set_optlen = 64, .error = EFAULT_SETSOCKOPT, + .io_uring_support = true, }, { .descr = "setsockopt: ignore >PAGE_SIZE optlen", @@ -775,6 +796,7 @@ static struct sockopt_test { .expected_attach_type = BPF_CGROUP_SETSOCKOPT, .error = DENY_LOAD, + .io_uring_support = true, }, { .descr = "setsockopt: deny read ctx->retval", @@ -791,6 +813,7 @@ static struct sockopt_test { .expected_attach_type = BPF_CGROUP_SETSOCKOPT, .error = DENY_LOAD, + .io_uring_support = true, }, { .descr = "setsockopt: deny writing to ctx->optval", @@ -805,6 +828,7 @@ static struct sockopt_test { .expected_attach_type = BPF_CGROUP_SETSOCKOPT, .error = DENY_LOAD, + .io_uring_support = true, }, { .descr = "setsockopt: deny writing to ctx->optval_end", @@ -819,6 +843,7 @@ static struct sockopt_test { .expected_attach_type = BPF_CGROUP_SETSOCKOPT, .error = DENY_LOAD, + .io_uring_support = true, }, { .descr = "setsockopt: allow IP_TOS <= 128", @@ -940,7 +965,94 @@ static int load_prog(const struct bpf_insn *insns, return fd; } -static int run_test(int cgroup_fd, struct sockopt_test *test) +/* Core function that handles io_uring ring initialization, + * sending SQE with sockopt command and waiting for the CQE. 
+ */
+static int uring_sockopt(int op, int fd, int level, int optname,
+			 const void *optval, socklen_t optlen)
+{
+	struct io_uring_cqe *cqe;
+	struct io_uring_sqe *sqe;
+	struct io_uring ring;
+	int err;
+
+	err = io_uring_queue_init(1, &ring, 0);
+	if (err) {
+		log_err("Failed to initialize io_uring ring");
+		return err;
+	}
+
+	sqe = io_uring_get_sqe(&ring);
+	if (!sqe) {
+		log_err("Failed to get an SQE");
+		return -1;
+	}
+
+	io_uring_prep_cmd(sqe, op, fd, level, optname, optval, optlen);
+
+	err = io_uring_submit(&ring);
+	if (err != 1) {
+		log_err("Failed to submit SQE");
+		return err;
+	}
+
+	err = io_uring_wait_cqe(&ring, &cqe);
+	if (err) {
+		log_err("Failed to wait for CQE");
+		return err;
+	}
+
+	err = cqe->res;
+
+	io_uring_queue_exit(&ring);
+
+	return err;
+}
+
+static int uring_setsockopt(int fd, int level, int optname, const void *optval,
+			    socklen_t optlen)
+{
+	return uring_sockopt(SOCKET_URING_OP_SETSOCKOPT, fd, level, optname,
+			     optval, optlen);
+}
+
+static int uring_getsockopt(int fd, int level, int optname, void *optval,
+			    socklen_t *optlen)
+{
+	int ret = uring_sockopt(SOCKET_URING_OP_GETSOCKOPT, fd, level, optname,
+				optval, *optlen);
+	if (ret < 0)
+		return ret;
+
+	/* Populate optlen back to be compatible with the system call
+	 * interface, and simplify the test.
+	 */
+	*optlen = ret;
+
+	return 0;
+}
+
+/* Execute the setsockopt operation */
+static int call_setsockopt(bool use_io_uring, int fd, int level, int optname,
+			   const void *optval, socklen_t optlen)
+{
+	if (use_io_uring)
+		return uring_setsockopt(fd, level, optname, optval, optlen);
+
+	return setsockopt(fd, level, optname, optval, optlen);
+}
+
+/* Execute the getsockopt operation */
+static int call_getsockopt(bool use_io_uring, int fd, int level, int optname,
+			   void *optval, socklen_t *optlen)
+{
+	if (use_io_uring)
+		return uring_getsockopt(fd, level, optname, optval, optlen);
+
+	return getsockopt(fd, level, optname, optval, optlen);
+}
+
+static int run_test(int cgroup_fd, struct sockopt_test *test, bool use_io_uring)
 {
 	int sock_fd, err, prog_fd;
 	void *optval = NULL;
@@ -980,8 +1092,9 @@ static int run_test(int cgroup_fd, struct sockopt_test *test)
 		test->set_optlen = num_pages * sysconf(_SC_PAGESIZE) + remainder;
 	}
 
-	err = setsockopt(sock_fd, test->set_level, test->set_optname,
-			 test->set_optval, test->set_optlen);
+	err = call_setsockopt(use_io_uring, sock_fd, test->set_level,
+			      test->set_optname, test->set_optval,
+			      test->set_optlen);
 	if (err) {
 		if (errno == EPERM && test->error == EPERM_SETSOCKOPT)
 			goto close_sock_fd;
@@ -1008,8 +1121,8 @@ static int run_test(int cgroup_fd, struct sockopt_test *test)
 		socklen_t expected_get_optlen = test->get_optlen_ret ?:
 			test->get_optlen;
 
-		err = getsockopt(sock_fd, test->get_level, test->get_optname,
-				 optval, &optlen);
+		err = call_getsockopt(use_io_uring, sock_fd, test->get_level,
+				      test->get_optname, optval, &optlen);
 		if (err) {
 			if (errno == EOPNOTSUPP && test->error == EOPNOTSUPP_GETSOCKOPT)
 				goto free_optval;
@@ -1063,7 +1176,11 @@ void test_sockopt(void)
 		if (!test__start_subtest(tests[i].descr))
 			continue;
 
-		ASSERT_OK(run_test(cgroup_fd, &tests[i]), tests[i].descr);
+		ASSERT_OK(run_test(cgroup_fd, &tests[i], false),
+			  tests[i].descr);
+		if (tests[i].io_uring_support)
+			ASSERT_OK(run_test(cgroup_fd, &tests[i], true),
+				  tests[i].descr);
 	}
 
 	close(cgroup_fd);