From patchwork Wed Jan 11 12:37:33 2023 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A . Shutemov" X-Patchwork-Id: 13096658 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 573C1C46467 for ; Wed, 11 Jan 2023 13:24:02 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E8A3A900004; Wed, 11 Jan 2023 08:24:01 -0500 (EST) Received: by kanga.kvack.org (Postfix, from userid 40) id E3A85900002; Wed, 11 Jan 2023 08:24:01 -0500 (EST) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id D02D4900004; Wed, 11 Jan 2023 08:24:01 -0500 (EST) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id C3544900002 for ; Wed, 11 Jan 2023 08:24:01 -0500 (EST) Received: from smtpin23.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 970E5C0552 for ; Wed, 11 Jan 2023 13:24:01 +0000 (UTC) X-FDA: 80342586282.23.FFB9F38 Received: from mga07.intel.com (mga07.intel.com [134.134.136.100]) by imf17.hostedemail.com (Postfix) with ESMTP id E4A114000E for ; Wed, 11 Jan 2023 13:23:58 +0000 (UTC) Authentication-Results: imf17.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=ScoAVMcW; spf=none (imf17.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.100) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1673443439; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=uuiBgt0fB2kTjl1Ku+81wfV6gcDi8StWysDGKsByKac=; b=Z7a6YovenLvjWk2vUSdLv88KkVMwjCxY/koEcNB/mtLO6CuH0OWQaj68NhLNaahClelr2A l91JBpWbeno0eGUk7VA0WKbgxPa5N0jBH4gnTkhWlEGcYDg2D6wkt7ux5lz+aYcp40ex5p Hi53RXvrs48+zC1ZE2R91Vy1trxtyxc= ARC-Authentication-Results: i=1; imf17.hostedemail.com; dkim=pass header.d=intel.com header.s=Intel header.b=ScoAVMcW; spf=none (imf17.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.100) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=pass (policy=none) header.from=intel.com ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1673443439; a=rsa-sha256; cv=none; b=beawe2rzW/USqN9fuR2OYEKGaZQ0px5AfhnADcn9xKIwSK5X+z77TOFDawAAXXntN7pEf/ G0fRF7/JGBKwN6nsqcfgti8IG9IdV8DvPcqGQxG9lIfwCN1TkeDOFR5ysI9Tit77wMXb8O BHYRjyQYfBNkyZcxJHleplonRy9Hr20= DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1673443439; x=1704979439; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=9gNQ5WNG2l+gZLhtnf8anitiVNgYtGVQwEkYU4Rdi08=; b=ScoAVMcWMVoo7K1Vn2ywS2YrUfvp3VNgTMvcM2T6AFlORhfZyiu1IGQW JJPHSEhrrcdV5nJ4mJnvRfw5wscpPr/E6wVCIt3R7B5Js9XKPQYU2u6TZ oX20UbAGC0Q1S6z904r/SrkvKaspyn6C5j8yNgOZWS2AOT5BPlmvLmq6X svtSPN9Lmeodx1pz5eb90ixKGZLoyb0IkTtQybXpg3ma5xp71bpCpMom5 M4dpxAlKJ6/JPkgFVcsG4pNYRs267c61F/lEQZDF9AXq2excO7C8LNNwd ilukl67za9MaRIb1i7t00ni9MdWVlCaBG1HnY9eEzRBz+cKVU4apxCsoR A==; X-IronPort-AV: E=McAfee;i="6500,9779,10586"; a="387872533" X-IronPort-AV: E=Sophos;i="5.96,317,1665471600"; d="scan'208";a="387872533" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga105.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Jan 2023 05:23:55 -0800 X-IronPort-AV: E=McAfee;i="6500,9779,10586"; a="687927419" X-IronPort-AV: E=Sophos;i="5.96,317,1665471600"; d="scan'208";a="687927419" Received: from bachaue1-mobl1.ger.corp.intel.com (HELO box.shutemov.name) ([10.252.37.250]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 11 Jan 2023 05:23:50 -0800 Received: by box.shutemov.name (Postfix, from userid 1000) id 95E44109D23; Wed, 11 Jan 2023 15:37:41 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , Bharata B Rao , Jacob Pan , Ashok Raj , Linus Torvalds , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Weihong Zhang , "Kirill A . Shutemov" Subject: [PATCHv14 14/17] selftests/x86/lam: Add io_uring test cases for linear-address masking Date: Wed, 11 Jan 2023 15:37:33 +0300 Message-Id: <20230111123736.20025-15-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.38.2 In-Reply-To: <20230111123736.20025-1-kirill.shutemov@linux.intel.com> References: <20230111123736.20025-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: E4A114000E X-Stat-Signature: zdyhh6ze8ausg7ceryt3uihrcm3n9iz1 X-Rspam-User: X-HE-Tag: 1673443438-732273 X-HE-Meta: U2FsdGVkX19fE4jNVuQVpsvKCfJPI38B97rschH5RbCUiMI2xO0AgxBlvZU8X0bVLSDHPXZnWUkupAeKfjxjeiFiYCpw3dQIdr//ATfEvsI+ExZibpwRH9CL9RAZL5NgLdFewXvw1GROP36HCgoyI8y5wbSwyyB+CGphEGt2OikjmG/ru+Zz6pp9cxo3ER4aJEo/+5EO5lWApUQtPKoM3MExJLAGZf3EpVt+47du2scrMOO1qrBJRUtH5XHgtDz7W10iuT/tLAtV+uruR/ielzGqECz5w5kOKyoeZ90ZvEmcQkf0NeIVZ9wMo4aeDQ6FHZVrNM/yCGo206tyohCTICVMzUsUnyurBrTpuYNuikq4Mj+hIvehWlJRAJc1IQZB6yvOnVG6Rp2Es0lEXlD1Wwor1yBXh7jIYTBXtUnNMpX4+tzKloY7o1AEMb1edLuCb3t1UY5CCMZPIxF9dBiuyiGfHkFv/vhPUMYiWJwgRVeZlCRiiIdq80JiofM4A8hHIFmqW9SwzjNmMHz/deLutclYOocxjwYV4jjXT2c1XRxIbpXCm5XJIIWcv8O3FGUADYa2EMAST2xQEFvxBQZA8xfY5ExPIgPlj8i3eYW+WW738Gonhx1CZNsGus8T/dp8UFMRa2FFvATCNPlFapUpaR/WiCDdPeQNUs4SmQ0xaAL017Fk97nAGGFHPwM/m1ygncs8DgHzNJU8tXVqvSvyzvo2TrEaOAUUBYt88wR7s4dCUTNEnop6kxEEACASQnEjxPwfyErIT9uTmvwH0BTQsbE9pKwFiYo7tcgMjvah5R0S6V2IqulS1M9jWCYEgiZlFybxlhFrAhjpk8ZcEebP7vtUx3oixAiGMDjmarAzUsavjldngXrUBw+LyJZp+2hVZwrlalWn+rDWCocfxSDD4z0j8oiEq8oK4Mp83JqOMpUdoTI2r8K9/PsOaYPZqPn01+tw7lvSg+wNyA0z0Ie 8gpBYL5V fcJLJOdXvbNeGezKeI0rKc8kajN8ncu9xKtIT X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Weihong Zhang LAM should be supported in kernel thread, using io_uring to verify LAM feature. The test cases implement read a file through io_uring, the test cases choose an iovec array as receiving buffer, which used to receive data, according to LAM mode, set metadata in high bits of these buffer. io_uring can deal with these buffers that pointed to pointers with the metadata in high bits. Signed-off-by: Weihong Zhang Signed-off-by: Kirill A. Shutemov --- tools/testing/selftests/x86/lam.c | 341 +++++++++++++++++++++++++++++- 1 file changed, 339 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/x86/lam.c b/tools/testing/selftests/x86/lam.c index 39ebfc511685..52750ebd0887 100644 --- a/tools/testing/selftests/x86/lam.c +++ b/tools/testing/selftests/x86/lam.c @@ -9,8 +9,12 @@ #include #include #include +#include +#include #include +#include +#include #include "../kselftest.h" #ifndef __x86_64__ @@ -32,8 +36,9 @@ #define FUNC_BITS 0x2 #define FUNC_MMAP 0x4 #define FUNC_SYSCALL 0x8 +#define FUNC_URING 0x10 -#define TEST_MASK 0xf +#define TEST_MASK 0x1f #define LOW_ADDR (0x1UL << 30) #define HIGH_ADDR (0x3UL << 48) @@ -42,6 +47,13 @@ #define PAGE_SIZE (4 << 10) +#define barrier() ({ \ + __asm__ __volatile__("" : : : "memory"); \ +}) + +#define URING_QUEUE_SZ 1 +#define URING_BLOCK_SZ 2048 + struct testcases { unsigned int later; int expected; /* 2: SIGSEGV Error; 1: other errors */ @@ -51,6 +63,33 @@ struct testcases { const char *msg; }; +/* Used by CQ of uring, source file handler and file's size */ +struct file_io { + int file_fd; + off_t file_sz; + struct iovec iovecs[]; +}; + +struct io_uring_queue { + unsigned int *head; + unsigned int *tail; + unsigned int *ring_mask; + unsigned int *ring_entries; + unsigned int *flags; + unsigned int *array; + union { + struct io_uring_cqe *cqes; + struct io_uring_sqe *sqes; + } queue; + size_t ring_sz; +}; + +struct io_ring { + int ring_fd; + struct io_uring_queue sq_ring; + struct io_uring_queue cq_ring; +}; + int tests_cnt; jmp_buf segv_env; @@ -294,6 +333,285 @@ static int handle_syscall(struct testcases *test) return ret; } +int sys_uring_setup(unsigned int entries, struct io_uring_params *p) +{ + return (int)syscall(__NR_io_uring_setup, entries, p); +} + +int sys_uring_enter(int fd, unsigned int to, unsigned int min, unsigned int flags) +{ + return (int)syscall(__NR_io_uring_enter, fd, to, min, flags, NULL, 0); +} + +/* Init submission queue and completion queue */ +int mmap_io_uring(struct io_uring_params p, struct io_ring *s) +{ + struct io_uring_queue *sring = &s->sq_ring; + struct io_uring_queue *cring = &s->cq_ring; + + sring->ring_sz = p.sq_off.array + p.sq_entries * sizeof(unsigned int); + cring->ring_sz = p.cq_off.cqes + p.cq_entries * sizeof(struct io_uring_cqe); + + if (p.features & IORING_FEAT_SINGLE_MMAP) { + if (cring->ring_sz > sring->ring_sz) + sring->ring_sz = cring->ring_sz; + + cring->ring_sz = sring->ring_sz; + } + + void *sq_ptr = mmap(0, sring->ring_sz, PROT_READ | PROT_WRITE, + MAP_SHARED | MAP_POPULATE, s->ring_fd, + IORING_OFF_SQ_RING); + + if (sq_ptr == MAP_FAILED) { + perror("sub-queue!"); + return 1; + } + + void *cq_ptr = sq_ptr; + + if (!(p.features & IORING_FEAT_SINGLE_MMAP)) { + cq_ptr = mmap(0, cring->ring_sz, PROT_READ | PROT_WRITE, + MAP_SHARED | MAP_POPULATE, s->ring_fd, + IORING_OFF_CQ_RING); + if (cq_ptr == MAP_FAILED) { + perror("cpl-queue!"); + munmap(sq_ptr, sring->ring_sz); + return 1; + } + } + + sring->head = sq_ptr + p.sq_off.head; + sring->tail = sq_ptr + p.sq_off.tail; + sring->ring_mask = sq_ptr + p.sq_off.ring_mask; + sring->ring_entries = sq_ptr + p.sq_off.ring_entries; + sring->flags = sq_ptr + p.sq_off.flags; + sring->array = sq_ptr + p.sq_off.array; + + /* Map a queue as mem map */ + s->sq_ring.queue.sqes = mmap(0, p.sq_entries * sizeof(struct io_uring_sqe), + PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE, + s->ring_fd, IORING_OFF_SQES); + if (s->sq_ring.queue.sqes == MAP_FAILED) { + munmap(sq_ptr, sring->ring_sz); + if (sq_ptr != cq_ptr) { + ksft_print_msg("failed to mmap uring queue!"); + munmap(cq_ptr, cring->ring_sz); + return 1; + } + } + + cring->head = cq_ptr + p.cq_off.head; + cring->tail = cq_ptr + p.cq_off.tail; + cring->ring_mask = cq_ptr + p.cq_off.ring_mask; + cring->ring_entries = cq_ptr + p.cq_off.ring_entries; + cring->queue.cqes = cq_ptr + p.cq_off.cqes; + + return 0; +} + +/* Init io_uring queues */ +int setup_io_uring(struct io_ring *s) +{ + struct io_uring_params para; + + memset(¶, 0, sizeof(para)); + s->ring_fd = sys_uring_setup(URING_QUEUE_SZ, ¶); + if (s->ring_fd < 0) + return 1; + + return mmap_io_uring(para, s); +} + +/* + * Get data from completion queue. the data buffer saved the file data + * return 0: success; others: error; + */ +int handle_uring_cq(struct io_ring *s) +{ + struct file_io *fi = NULL; + struct io_uring_queue *cring = &s->cq_ring; + struct io_uring_cqe *cqe; + unsigned int head; + off_t len = 0; + + head = *cring->head; + + do { + barrier(); + if (head == *cring->tail) + break; + /* Get the entry */ + cqe = &cring->queue.cqes[head & *s->cq_ring.ring_mask]; + fi = (struct file_io *)cqe->user_data; + if (cqe->res < 0) + break; + + int blocks = (int)(fi->file_sz + URING_BLOCK_SZ - 1) / URING_BLOCK_SZ; + + for (int i = 0; i < blocks; i++) + len += fi->iovecs[i].iov_len; + + head++; + } while (1); + + *cring->head = head; + barrier(); + + return (len != fi->file_sz); +} + +/* + * Submit squeue. specify via IORING_OP_READV. + * the buffer need to be set metadata according to LAM mode + */ +int handle_uring_sq(struct io_ring *ring, struct file_io *fi, unsigned long lam) +{ + int file_fd = fi->file_fd; + struct io_uring_queue *sring = &ring->sq_ring; + unsigned int index = 0, cur_block = 0, tail = 0, next_tail = 0; + struct io_uring_sqe *sqe; + + off_t remain = fi->file_sz; + int blocks = (int)(remain + URING_BLOCK_SZ - 1) / URING_BLOCK_SZ; + + while (remain) { + off_t bytes = remain; + void *buf; + + if (bytes > URING_BLOCK_SZ) + bytes = URING_BLOCK_SZ; + + fi->iovecs[cur_block].iov_len = bytes; + + if (posix_memalign(&buf, URING_BLOCK_SZ, URING_BLOCK_SZ)) + return 1; + + fi->iovecs[cur_block].iov_base = (void *)set_metadata((uint64_t)buf, lam); + remain -= bytes; + cur_block++; + } + + next_tail = *sring->tail; + tail = next_tail; + next_tail++; + + barrier(); + + index = tail & *ring->sq_ring.ring_mask; + + sqe = &ring->sq_ring.queue.sqes[index]; + sqe->fd = file_fd; + sqe->flags = 0; + sqe->opcode = IORING_OP_READV; + sqe->addr = (unsigned long)fi->iovecs; + sqe->len = blocks; + sqe->off = 0; + sqe->user_data = (uint64_t)fi; + + sring->array[index] = index; + tail = next_tail; + + if (*sring->tail != tail) { + *sring->tail = tail; + barrier(); + } + + if (sys_uring_enter(ring->ring_fd, 1, 1, IORING_ENTER_GETEVENTS) < 0) + return 1; + + return 0; +} + +/* + * Test LAM in async I/O and io_uring, read current binery through io_uring + * Set metadata in pointers to iovecs buffer. + */ +int do_uring(unsigned long lam) +{ + struct io_ring *ring; + struct file_io *fi; + struct stat st; + int ret = 1; + char path[PATH_MAX]; + + /* get current process path */ + if (readlink("/proc/self/exe", path, PATH_MAX) <= 0) + return 1; + + int file_fd = open(path, O_RDONLY); + + if (file_fd < 0) + return 1; + + if (fstat(file_fd, &st) < 0) + return 1; + + off_t file_sz = st.st_size; + + int blocks = (int)(file_sz + URING_BLOCK_SZ - 1) / URING_BLOCK_SZ; + + fi = malloc(sizeof(*fi) + sizeof(struct iovec) * blocks); + if (!fi) + return 1; + + fi->file_sz = file_sz; + fi->file_fd = file_fd; + + ring = malloc(sizeof(*ring)); + if (!ring) + return 1; + + memset(ring, 0, sizeof(struct io_ring)); + + if (setup_io_uring(ring)) + goto out; + + if (handle_uring_sq(ring, fi, lam)) + goto out; + + ret = handle_uring_cq(ring); + +out: + free(ring); + + for (int i = 0; i < blocks; i++) { + if (fi->iovecs[i].iov_base) { + uint64_t addr = ((uint64_t)fi->iovecs[i].iov_base); + + switch (lam) { + case LAM_U57_BITS: /* Clear bits 62:57 */ + addr = (addr & ~(0x3fULL << 57)); + break; + } + free((void *)addr); + fi->iovecs[i].iov_base = NULL; + } + } + + free(fi); + + return ret; +} + +int handle_uring(struct testcases *test) +{ + int ret = 0; + + if (test->later == 0 && test->lam != 0) + if (set_lam(test->lam) != 0) + return 1; + + if (sigsetjmp(segv_env, 1) == 0) { + signal(SIGSEGV, segv_handler); + ret = do_uring(test->lam); + } else { + ret = 2; + } + + return ret; +} + static int fork_test(struct testcases *test) { int ret, child_ret; @@ -340,6 +658,22 @@ static void run_test(struct testcases *test, int count) } } +static struct testcases uring_cases[] = { + { + .later = 0, + .lam = LAM_U57_BITS, + .test_func = handle_uring, + .msg = "URING: LAM_U57. Dereferencing pointer with metadata\n", + }, + { + .later = 1, + .expected = 1, + .lam = LAM_U57_BITS, + .test_func = handle_uring, + .msg = "URING:[Negative] Disable LAM. Dereferencing pointer with metadata.\n", + }, +}; + static struct testcases malloc_cases[] = { { .later = 0, @@ -410,7 +744,7 @@ static void cmd_help(void) { printf("usage: lam [-h] [-t test list]\n"); printf("\t-t test list: run tests specified in the test list, default:0x%x\n", TEST_MASK); - printf("\t\t0x1:malloc; 0x2:max_bits; 0x4:mmap; 0x8:syscall.\n"); + printf("\t\t0x1:malloc; 0x2:max_bits; 0x4:mmap; 0x8:syscall; 0x10:io_uring.\n"); printf("\t-h: help\n"); } @@ -456,6 +790,9 @@ int main(int argc, char **argv) if (tests & FUNC_SYSCALL) run_test(syscall_cases, ARRAY_SIZE(syscall_cases)); + if (tests & FUNC_URING) + run_test(uring_cases, ARRAY_SIZE(uring_cases)); + ksft_set_plan(tests_cnt); return ksft_exit_pass();