From patchwork Tue Aug 30 01:00:54 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12958581 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3051EECAAD2 for ; Tue, 30 Aug 2022 01:01:34 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 16EF66B0074; Mon, 29 Aug 2022 21:01:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 15DD9940008; Mon, 29 Aug 2022 21:01:34 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id F18C2940007; Mon, 29 Aug 2022 21:01:33 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0010.hostedemail.com [216.40.44.10]) by kanga.kvack.org (Postfix) with ESMTP id E19D66B0073 for ; Mon, 29 Aug 2022 21:01:33 -0400 (EDT) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay09.hostedemail.com (Postfix) with ESMTP id BE1FB8031E for ; Tue, 30 Aug 2022 01:01:33 +0000 (UTC) X-FDA: 79854456066.02.653C33E Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by imf18.hostedemail.com (Postfix) with ESMTP id 2D77D1C0017 for ; Tue, 30 Aug 2022 01:01:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1661821293; x=1693357293; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=Q72u8o9NplBRndRYgPTxCNaKqaPHER7MLnoX2dvImDc=; b=QnHPGmN+FRHx0q9eYA0ZfC+lZo+QhNWYnvH7PVG3t/btYygJ6BeXbzeO naLqPEs1QzmTNL4GF5RKRipt4sDIYsiG8Lb2m7hdJ++NsxP2cWrWr0wvq xpm1kKJVkeua/S1wvyvo+KDVUR79jTWftFlf4KsS86wAqIh+eN7rFnjZe BHlthFyOY106w6qlqmtxy2HmGoC0qO3ZGocSoqZA350HAbz2cwECyb1WG ei6idjrLP31Ck2JHrOp2ZTnCKZ5iFIZIljMcUp6K3wF+8DwyHSd5Froj5 JQ07vxiA/DA53Irtl1PywniKR1qTKJF6pvm4sGpWY9mXFiRSh7sEA7OvE A==; X-IronPort-AV: E=McAfee;i="6500,9779,10454"; a="381342281" X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="381342281" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:31 -0700 X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="787305212" Received: from fpalamon-mobl.amr.corp.intel.com (HELO box.shutemov.name) ([10.252.54.23]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:27 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 6425A1040C2; Tue, 30 Aug 2022 04:01:24 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv8 01/11] x86/mm: Fix CR3_ADDR_MASK Date: Tue, 30 Aug 2022 04:00:54 +0300 Message-Id: <20220830010104.1282-2-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> References: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1661821293; a=rsa-sha256; cv=none; b=0vGfSJRekA99Aqu7gbYw1k2pOYkFAuvV2yUn7kyvn6LME9kHL3A31aXZePD1m/F8yGc6A4 P3jlXkN425shfW68lbqA37R75GmubwfpSQ7DSn+CrD/ul4vGIPTaHfjwLvfRRjVAmTo6ia Txd6bq2gThOcH5lSoc9aERcoMUPBGAk= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=QnHPGmN+; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none); spf=none (imf18.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.43) smtp.mailfrom=kirill.shutemov@linux.intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1661821293; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=6mUU88d77bz9mORxAl7NMj1cyDeY2JAozvJp4Bvmrgc=; b=I16PDAUjq8m+dM6KL8SktusIgNgw0+y22qqh841i4sMjCAncjBTeiu7t+tuwUimGzDgrtr PMyNJcrUJG+VePNg++g7TmRFhmGsjqCBkC4BNN28xgZ5Ajdpy/17BAZc/FNO8TXMhnrfQT XZE4k1bGan5y+vaGEnWJ2vgw7e54o2E= X-Stat-Signature: 8ubgz4p58esrjao47q6jnk6igjdsoupo X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: 2D77D1C0017 Authentication-Results: imf18.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=QnHPGmN+; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none); spf=none (imf18.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.43) smtp.mailfrom=kirill.shutemov@linux.intel.com X-Rspam-User: X-HE-Tag: 1661821292-805175 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: The mask must not include bits above physical address mask. These bits are reserved and can be used for other things. Bits 61 and 62 are used for Linear Address Masking. Signed-off-by: Kirill A. Shutemov Reviewed-by: Rick Edgecombe Reviewed-by: Alexander Potapenko Tested-by: Alexander Potapenko Acked-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/processor-flags.h | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/arch/x86/include/asm/processor-flags.h b/arch/x86/include/asm/processor-flags.h index 02c2cbda4a74..a7f3d9100adb 100644 --- a/arch/x86/include/asm/processor-flags.h +++ b/arch/x86/include/asm/processor-flags.h @@ -35,7 +35,7 @@ */ #ifdef CONFIG_X86_64 /* Mask off the address space ID and SME encryption bits. */ -#define CR3_ADDR_MASK __sme_clr(0x7FFFFFFFFFFFF000ull) +#define CR3_ADDR_MASK __sme_clr(PHYSICAL_PAGE_MASK) #define CR3_PCID_MASK 0xFFFull #define CR3_NOFLUSH BIT_ULL(63) From patchwork Tue Aug 30 01:00:55 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12958584 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 8D57EECAAD2 for ; Tue, 30 Aug 2022 01:01:40 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B9BDC940009; Mon, 29 Aug 2022 21:01:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 999F694000A; Mon, 29 Aug 2022 21:01:34 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7EC06940007; Mon, 29 Aug 2022 21:01:34 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id 5FE9D940008 for ; Mon, 29 Aug 2022 21:01:34 -0400 (EDT) Received: from smtpin08.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 21612140B4C for ; Tue, 30 Aug 2022 01:01:34 +0000 (UTC) X-FDA: 79854456108.08.6812D86 Received: from mga11.intel.com (mga11.intel.com [192.55.52.93]) by imf02.hostedemail.com (Postfix) with ESMTP id 6703E8004C for ; Tue, 30 Aug 2022 01:01:33 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1661821293; x=1693357293; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=yXN+5JLX5zkyOdU6mDFL78amRIUnBbLZ8kI+mjwqWTM=; b=M1KeGQS2fQYIfoqBuJd1qxRYzuFn92xikLcxJOoIkBDZgTqYBfCdZHqt 1i42wYeHJBFyk7ZYH+VAV/FI6qUHI1w78HGy0Yr3fmu06Thaugs9aJCMa 6dGHLsOsCNjjZFOU7P2GSXt1rwnahS1UKFGDo0Qiz66x27Tl8FCY7G/bG JR0ZPNYUAdMV1dVb622QTz+vXfeGel+WgntvYD7SpRkqFWykUmIUH45yK /CsxOaAAl/Njz5s+Q/b2XgUELnHJ5VxwtXF0P/FJjsGjnn2gkYc+Hndm2 aqyc2D3uQ2Gw0lKQz9Ee6B7Am4TVW+SU+hHFS57nauhoUOlPnYLTfBgPH w==; X-IronPort-AV: E=McAfee;i="6500,9779,10454"; a="292622764" X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="292622764" Received: from orsmga004.jf.intel.com ([10.7.209.38]) by fmsmga102.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:30 -0700 X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="737549978" Received: from fpalamon-mobl.amr.corp.intel.com (HELO box.shutemov.name) ([10.252.54.23]) by orsmga004-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:26 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 6D7DC1040C3; Tue, 30 Aug 2022 04:01:24 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv8 02/11] x86: CPUID and CR3/CR4 flags for Linear Address Masking Date: Tue, 30 Aug 2022 04:00:55 +0300 Message-Id: <20220830010104.1282-3-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> References: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1661821293; a=rsa-sha256; cv=none; b=tD5DktqK5RAZRDvOxNp5XqJHxwgmyHNY1AmY3kFyUgmEpmQqDNyn0/dEf91agGHLQVXHDn 5tCpMnYLcx3K5AxcmVXRFaeof4Pe9B89ON2eaC87kS12/EY3DwBUvEFmkkWD3Qlm2FU4Xs ia8iq/4JYjoXpvkVSmXUwpyEcTeZDXw= ARC-Authentication-Results: i=1; imf02.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=M1KeGQS2; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none); spf=none (imf02.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.93) smtp.mailfrom=kirill.shutemov@linux.intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1661821293; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=PeqbkaNgpUsZaiiO8skv1lk5XMXT8HIrdI5C04nNfV0=; b=LnOlMDpBhCICD2FRB4cRDTjPGvIAimIPc7YL/brtbkCYrvnbzuvo/JHrYGZK9fztjNr94a gxs/J4DGpBK06YAG5+/3xJIJCZEDCGmjjruorqy3hqCd0D8MgP37LSHXB38r2Ny9g+cAE1 HY2Qw85e5bvD3IgSBmEj+0sCwxjJ4nE= X-Rspam-User: X-Rspamd-Queue-Id: 6703E8004C X-Rspamd-Server: rspam03 X-Stat-Signature: auztqq6hbk6xzkcgcykknac7b1gtufpo Authentication-Results: imf02.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=M1KeGQS2; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none); spf=none (imf02.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.93) smtp.mailfrom=kirill.shutemov@linux.intel.com X-HE-Tag: 1661821293-766634 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Enumerate Linear Address Masking and provide defines for CR3 and CR4 flags. Signed-off-by: Kirill A. Shutemov Reviewed-by: Alexander Potapenko Tested-by: Alexander Potapenko Acked-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/cpufeatures.h | 1 + arch/x86/include/asm/processor-flags.h | 2 ++ arch/x86/include/uapi/asm/processor-flags.h | 6 ++++++ 3 files changed, 9 insertions(+) diff --git a/arch/x86/include/asm/cpufeatures.h b/arch/x86/include/asm/cpufeatures.h index 235dc85c91c3..73c0cf5bd8a1 100644 --- a/arch/x86/include/asm/cpufeatures.h +++ b/arch/x86/include/asm/cpufeatures.h @@ -308,6 +308,7 @@ /* Intel-defined CPU features, CPUID level 0x00000007:1 (EAX), word 12 */ #define X86_FEATURE_AVX_VNNI (12*32+ 4) /* AVX VNNI instructions */ #define X86_FEATURE_AVX512_BF16 (12*32+ 5) /* AVX512 BFLOAT16 instructions */ +#define X86_FEATURE_LAM (12*32+26) /* Linear Address Masking */ /* AMD-defined CPU features, CPUID level 0x80000008 (EBX), word 13 */ #define X86_FEATURE_CLZERO (13*32+ 0) /* CLZERO instruction */ diff --git a/arch/x86/include/asm/processor-flags.h b/arch/x86/include/asm/processor-flags.h index a7f3d9100adb..d8cccadc83a6 100644 --- a/arch/x86/include/asm/processor-flags.h +++ b/arch/x86/include/asm/processor-flags.h @@ -28,6 +28,8 @@ * On systems with SME, one bit (in a variable position!) is stolen to indicate * that the top-level paging structure is encrypted. * + * On systemms with LAM, bits 61 and 62 are used to indicate LAM mode. + * * All of the remaining bits indicate the physical address of the top-level * paging structure. * diff --git a/arch/x86/include/uapi/asm/processor-flags.h b/arch/x86/include/uapi/asm/processor-flags.h index c47cc7f2feeb..d898432947ff 100644 --- a/arch/x86/include/uapi/asm/processor-flags.h +++ b/arch/x86/include/uapi/asm/processor-flags.h @@ -82,6 +82,10 @@ #define X86_CR3_PCID_BITS 12 #define X86_CR3_PCID_MASK (_AC((1UL << X86_CR3_PCID_BITS) - 1, UL)) +#define X86_CR3_LAM_U57_BIT 61 /* Activate LAM for userspace, 62:57 bits masked */ +#define X86_CR3_LAM_U57 _BITULL(X86_CR3_LAM_U57_BIT) +#define X86_CR3_LAM_U48_BIT 62 /* Activate LAM for userspace, 62:48 bits masked */ +#define X86_CR3_LAM_U48 _BITULL(X86_CR3_LAM_U48_BIT) #define X86_CR3_PCID_NOFLUSH_BIT 63 /* Preserve old PCID */ #define X86_CR3_PCID_NOFLUSH _BITULL(X86_CR3_PCID_NOFLUSH_BIT) @@ -132,6 +136,8 @@ #define X86_CR4_PKE _BITUL(X86_CR4_PKE_BIT) #define X86_CR4_CET_BIT 23 /* enable Control-flow Enforcement Technology */ #define X86_CR4_CET _BITUL(X86_CR4_CET_BIT) +#define X86_CR4_LAM_SUP_BIT 28 /* LAM for supervisor pointers */ +#define X86_CR4_LAM_SUP _BITUL(X86_CR4_LAM_SUP_BIT) /* * x86-64 Task Priority Register, CR8 From patchwork Tue Aug 30 01:00:56 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12958583 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id C32C9C0502C for ; Tue, 30 Aug 2022 01:01:37 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 841E4940008; Mon, 29 Aug 2022 21:01:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 728DE940009; Mon, 29 Aug 2022 21:01:34 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 57ECA940007; Mon, 29 Aug 2022 21:01:34 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id 331286B0075 for ; Mon, 29 Aug 2022 21:01:34 -0400 (EDT) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 001E4AAC73 for ; Tue, 30 Aug 2022 01:01:33 +0000 (UTC) X-FDA: 79854456066.29.4C51400 Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by imf24.hostedemail.com (Postfix) with ESMTP id CF809180012 for ; Tue, 30 Aug 2022 01:01:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1661821292; x=1693357292; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=veo5OpinfCD1h5ORpII6jqRUYHXSqkFscgpnAOXxvIc=; b=SKmbTTXX1iBKFPtjC8eqoLGARpfuV4IqUOtwtd9qhMlTFrzMIXSuiwlK ekAbQ6ye0WT+s+S65RZj+qjbRIMzXE9s/K0+vnfwBV9GTpetsj2WrsF/I EgoZBuzpIR2IxWD32y8jxRwSmAFMprT1p8Ax3AZ6CCJmn39b8ScjUyCF5 4h6WRWlI9PBu/Ny59ZtFD7n6cdyz+zJzu8cZXSkS5rpaLk2oFwHDh/kIw 7YwGiOYNtCMUV/KDjD0UPclcy91Z737XoWSu63we46HUvhz0iIxsG+8Tl 2f+T87syKt4zfOHCPwY1V2kvBwx7QT9Pe11uTuTP9wxCa9KxSXDjungMj A==; X-IronPort-AV: E=McAfee;i="6500,9779,10454"; a="296322923" X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="296322923" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:31 -0700 X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="641155092" Received: from fpalamon-mobl.amr.corp.intel.com (HELO box.shutemov.name) ([10.252.54.23]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:27 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 7895D1040C4; Tue, 30 Aug 2022 04:01:24 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv8 03/11] mm: Pass down mm_struct to untagged_addr() Date: Tue, 30 Aug 2022 04:00:56 +0300 Message-Id: <20220830010104.1282-4-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> References: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1661821293; a=rsa-sha256; cv=none; b=hC8vlFO9GsP+K8uQyZ+HNsTsBGMHSYGIj/5sVQtG67Vrqk/q4h50cZrS7NTjZkektX41b1 ZsA0gd/RNJsDehqUZ5dEOs8DNwGMYjbR0dxuJOJYDLCAReDCwiwbKFnS8eL/Z9+dPVPxCo asapHLFDx4hWXmDQZ1J8s4nIr849Vt4= ARC-Authentication-Results: i=1; imf24.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=SKmbTTXX; spf=none (imf24.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.65) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none) ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1661821293; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=kBfLFk/alMsICMxoawaTxuAC3HeuaX3bsw0PCDYnw0w=; b=7MFrD/AO9IeGTI28+Osox6qq5k7QvAJncdNxnWFt8K8bewpboizphG7CHbMemVqmzZRe0c 48SdLZnMFvTVvzwxdxNLfAjd9lZ5VmjqzepSmay7YahnEpUujnQDRppTJDlAr3NP3XblG/ C8kE6Co4r8OtkhMfgghZLT8ySbQBs3Y= Authentication-Results: imf24.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=SKmbTTXX; spf=none (imf24.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.65) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none) X-Rspam-User: X-Rspamd-Server: rspam11 X-Stat-Signature: 814qsbo5wmdari4161xnis3fhopwratt X-Rspamd-Queue-Id: CF809180012 X-HE-Tag: 1661821292-381320 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Intel Linear Address Masking (LAM) brings per-mm untagging rules. Pass down mm_struct to the untagging helper. It will help to apply untagging policy correctly. In most cases, current->mm is the one to use, but there are some exceptions, such as get_user_page_remote(). Move dummy implementation of untagged_addr() from to . can override the implementation. Moving the dummy header outside helps to avoid header hell if you need to defer mm_struct within the helper. Signed-off-by: Kirill A. Shutemov Reviewed-by: Rick Edgecombe Reviewed-by: Alexander Potapenko Tested-by: Alexander Potapenko Acked-by: Peter Zijlstra (Intel) --- arch/arm64/include/asm/memory.h | 4 ++-- arch/arm64/include/asm/signal.h | 2 +- arch/arm64/include/asm/uaccess.h | 4 ++-- arch/arm64/kernel/hw_breakpoint.c | 2 +- arch/arm64/kernel/traps.c | 4 ++-- arch/arm64/mm/fault.c | 10 +++++----- arch/sparc/include/asm/pgtable_64.h | 2 +- arch/sparc/include/asm/uaccess_64.h | 2 ++ drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c | 2 +- drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c | 2 +- drivers/gpu/drm/radeon/radeon_gem.c | 2 +- drivers/infiniband/hw/mlx4/mr.c | 2 +- drivers/media/common/videobuf2/frame_vector.c | 2 +- drivers/media/v4l2-core/videobuf-dma-contig.c | 2 +- drivers/staging/media/atomisp/pci/hmm/hmm_bo.c | 2 +- drivers/tee/tee_shm.c | 2 +- drivers/vfio/vfio_iommu_type1.c | 2 +- fs/proc/task_mmu.c | 2 +- include/linux/mm.h | 11 ----------- include/linux/uaccess.h | 15 +++++++++++++++ lib/strncpy_from_user.c | 2 +- lib/strnlen_user.c | 2 +- mm/gup.c | 6 +++--- mm/madvise.c | 2 +- mm/mempolicy.c | 6 +++--- mm/migrate.c | 2 +- mm/mincore.c | 2 +- mm/mlock.c | 4 ++-- mm/mmap.c | 2 +- mm/mprotect.c | 2 +- mm/mremap.c | 2 +- mm/msync.c | 2 +- virt/kvm/kvm_main.c | 2 +- 33 files changed, 59 insertions(+), 53 deletions(-) diff --git a/arch/arm64/include/asm/memory.h b/arch/arm64/include/asm/memory.h index 9dd08cd339c3..5b24ef93c6b9 100644 --- a/arch/arm64/include/asm/memory.h +++ b/arch/arm64/include/asm/memory.h @@ -227,8 +227,8 @@ static inline unsigned long kaslr_offset(void) #define __untagged_addr(addr) \ ((__force __typeof__(addr))sign_extend64((__force u64)(addr), 55)) -#define untagged_addr(addr) ({ \ - u64 __addr = (__force u64)(addr); \ +#define untagged_addr(mm, addr) ({ \ + u64 __addr = (__force u64)(addr); \ __addr &= __untagged_addr(__addr); \ (__force __typeof__(addr))__addr; \ }) diff --git a/arch/arm64/include/asm/signal.h b/arch/arm64/include/asm/signal.h index ef449f5f4ba8..0899c355c398 100644 --- a/arch/arm64/include/asm/signal.h +++ b/arch/arm64/include/asm/signal.h @@ -18,7 +18,7 @@ static inline void __user *arch_untagged_si_addr(void __user *addr, if (sig == SIGTRAP && si_code == TRAP_BRKPT) return addr; - return untagged_addr(addr); + return untagged_addr(current->mm, addr); } #define arch_untagged_si_addr arch_untagged_si_addr diff --git a/arch/arm64/include/asm/uaccess.h b/arch/arm64/include/asm/uaccess.h index 2fc9f0861769..0e17f44cf997 100644 --- a/arch/arm64/include/asm/uaccess.h +++ b/arch/arm64/include/asm/uaccess.h @@ -44,7 +44,7 @@ static inline int access_ok(const void __user *addr, unsigned long size) */ if (IS_ENABLED(CONFIG_ARM64_TAGGED_ADDR_ABI) && (current->flags & PF_KTHREAD || test_thread_flag(TIF_TAGGED_ADDR))) - addr = untagged_addr(addr); + addr = untagged_addr(current->mm, addr); return likely(__access_ok(addr, size)); } @@ -217,7 +217,7 @@ static inline void __user *__uaccess_mask_ptr(const void __user *ptr) " csel %0, %1, xzr, eq\n" : "=&r" (safe_ptr) : "r" (ptr), "r" (TASK_SIZE_MAX - 1), - "r" (untagged_addr(ptr)) + "r" (untagged_addr(current->mm, ptr)) : "cc"); csdb(); diff --git a/arch/arm64/kernel/hw_breakpoint.c b/arch/arm64/kernel/hw_breakpoint.c index b29a311bb055..d637cee7b771 100644 --- a/arch/arm64/kernel/hw_breakpoint.c +++ b/arch/arm64/kernel/hw_breakpoint.c @@ -715,7 +715,7 @@ static u64 get_distance_from_watchpoint(unsigned long addr, u64 val, u64 wp_low, wp_high; u32 lens, lene; - addr = untagged_addr(addr); + addr = untagged_addr(current->mm, addr); lens = __ffs(ctrl->len); lene = __fls(ctrl->len); diff --git a/arch/arm64/kernel/traps.c b/arch/arm64/kernel/traps.c index b7fed33981f7..9edef0e155b6 100644 --- a/arch/arm64/kernel/traps.c +++ b/arch/arm64/kernel/traps.c @@ -476,7 +476,7 @@ void arm64_notify_segfault(unsigned long addr) int code; mmap_read_lock(current->mm); - if (find_vma(current->mm, untagged_addr(addr)) == NULL) + if (find_vma(current->mm, untagged_addr(current->mm, addr)) == NULL) code = SEGV_MAPERR; else code = SEGV_ACCERR; @@ -540,7 +540,7 @@ static void user_cache_maint_handler(unsigned long esr, struct pt_regs *regs) int ret = 0; tagged_address = pt_regs_read_reg(regs, rt); - address = untagged_addr(tagged_address); + address = untagged_addr(current->mm, tagged_address); switch (crm) { case ESR_ELx_SYS64_ISS_CRM_DC_CVAU: /* DC CVAU, gets promoted */ diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c index c33f1fad2745..1fa0f1166ac0 100644 --- a/arch/arm64/mm/fault.c +++ b/arch/arm64/mm/fault.c @@ -454,7 +454,7 @@ static void set_thread_esr(unsigned long address, unsigned long esr) static void do_bad_area(unsigned long far, unsigned long esr, struct pt_regs *regs) { - unsigned long addr = untagged_addr(far); + unsigned long addr = untagged_addr(current->mm, far); /* * If we are in kernel mode at this point, we have no context to @@ -524,7 +524,7 @@ static int __kprobes do_page_fault(unsigned long far, unsigned long esr, vm_fault_t fault; unsigned long vm_flags; unsigned int mm_flags = FAULT_FLAG_DEFAULT; - unsigned long addr = untagged_addr(far); + unsigned long addr = untagged_addr(mm, far); if (kprobe_page_fault(regs, esr)) return 0; @@ -679,7 +679,7 @@ static int __kprobes do_translation_fault(unsigned long far, unsigned long esr, struct pt_regs *regs) { - unsigned long addr = untagged_addr(far); + unsigned long addr = untagged_addr(current->mm, far); if (is_ttbr0_addr(addr)) return do_page_fault(far, esr, regs); @@ -723,7 +723,7 @@ static int do_sea(unsigned long far, unsigned long esr, struct pt_regs *regs) * UNKNOWN for synchronous external aborts. Mask them out now * so that userspace doesn't see them. */ - siaddr = untagged_addr(far); + siaddr = untagged_addr(current->mm, far); } arm64_notify_die(inf->name, regs, inf->sig, inf->code, siaddr, esr); @@ -813,7 +813,7 @@ static const struct fault_info fault_info[] = { void do_mem_abort(unsigned long far, unsigned long esr, struct pt_regs *regs) { const struct fault_info *inf = esr_to_fault_info(esr); - unsigned long addr = untagged_addr(far); + unsigned long addr = untagged_addr(current->mm, far); if (!inf->fn(far, esr, regs)) return; diff --git a/arch/sparc/include/asm/pgtable_64.h b/arch/sparc/include/asm/pgtable_64.h index a779418ceba9..aa996ffe5c8c 100644 --- a/arch/sparc/include/asm/pgtable_64.h +++ b/arch/sparc/include/asm/pgtable_64.h @@ -1052,7 +1052,7 @@ static inline unsigned long __untagged_addr(unsigned long start) return start; } -#define untagged_addr(addr) \ +#define untagged_addr(mm, addr) \ ((__typeof__(addr))(__untagged_addr((unsigned long)(addr)))) static inline bool pte_access_permitted(pte_t pte, bool write) diff --git a/arch/sparc/include/asm/uaccess_64.h b/arch/sparc/include/asm/uaccess_64.h index 94266a5c5b04..b825a5dd0210 100644 --- a/arch/sparc/include/asm/uaccess_64.h +++ b/arch/sparc/include/asm/uaccess_64.h @@ -8,8 +8,10 @@ #include #include +#include #include #include +#include #include #include diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c index a699134a1e8c..71babd0eb70a 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_amdkfd_gpuvm.c @@ -1653,7 +1653,7 @@ int amdgpu_amdkfd_gpuvm_alloc_memory_of_gpu( if (flags & KFD_IOC_ALLOC_MEM_FLAGS_USERPTR) { if (!offset || !*offset) return -EINVAL; - user_addr = untagged_addr(*offset); + user_addr = untagged_addr(current->mm, *offset); } else if (flags & (KFD_IOC_ALLOC_MEM_FLAGS_DOORBELL | KFD_IOC_ALLOC_MEM_FLAGS_MMIO_REMAP)) { bo_type = ttm_bo_type_sg; diff --git a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c index 8ef31d687ef3..691dfb3f2c0e 100644 --- a/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c +++ b/drivers/gpu/drm/amd/amdgpu/amdgpu_gem.c @@ -382,7 +382,7 @@ int amdgpu_gem_userptr_ioctl(struct drm_device *dev, void *data, uint32_t handle; int r; - args->addr = untagged_addr(args->addr); + args->addr = untagged_addr(current->mm, args->addr); if (offset_in_page(args->addr | args->size)) return -EINVAL; diff --git a/drivers/gpu/drm/radeon/radeon_gem.c b/drivers/gpu/drm/radeon/radeon_gem.c index 261fcbae88d7..cba2f4b19838 100644 --- a/drivers/gpu/drm/radeon/radeon_gem.c +++ b/drivers/gpu/drm/radeon/radeon_gem.c @@ -371,7 +371,7 @@ int radeon_gem_userptr_ioctl(struct drm_device *dev, void *data, uint32_t handle; int r; - args->addr = untagged_addr(args->addr); + args->addr = untagged_addr(current->mm, args->addr); if (offset_in_page(args->addr | args->size)) return -EINVAL; diff --git a/drivers/infiniband/hw/mlx4/mr.c b/drivers/infiniband/hw/mlx4/mr.c index 04a67b481608..b2860feeae3c 100644 --- a/drivers/infiniband/hw/mlx4/mr.c +++ b/drivers/infiniband/hw/mlx4/mr.c @@ -379,7 +379,7 @@ static struct ib_umem *mlx4_get_umem_mr(struct ib_device *device, u64 start, * again */ if (!ib_access_writable(access_flags)) { - unsigned long untagged_start = untagged_addr(start); + unsigned long untagged_start = untagged_addr(current->mm, start); struct vm_area_struct *vma; mmap_read_lock(current->mm); diff --git a/drivers/media/common/videobuf2/frame_vector.c b/drivers/media/common/videobuf2/frame_vector.c index 542dde9d2609..7e62f7a2555d 100644 --- a/drivers/media/common/videobuf2/frame_vector.c +++ b/drivers/media/common/videobuf2/frame_vector.c @@ -47,7 +47,7 @@ int get_vaddr_frames(unsigned long start, unsigned int nr_frames, if (WARN_ON_ONCE(nr_frames > vec->nr_allocated)) nr_frames = vec->nr_allocated; - start = untagged_addr(start); + start = untagged_addr(mm, start); ret = pin_user_pages_fast(start, nr_frames, FOLL_FORCE | FOLL_WRITE | FOLL_LONGTERM, diff --git a/drivers/media/v4l2-core/videobuf-dma-contig.c b/drivers/media/v4l2-core/videobuf-dma-contig.c index 52312ce2ba05..a1444f8afa05 100644 --- a/drivers/media/v4l2-core/videobuf-dma-contig.c +++ b/drivers/media/v4l2-core/videobuf-dma-contig.c @@ -157,8 +157,8 @@ static void videobuf_dma_contig_user_put(struct videobuf_dma_contig_memory *mem) static int videobuf_dma_contig_user_get(struct videobuf_dma_contig_memory *mem, struct videobuf_buffer *vb) { - unsigned long untagged_baddr = untagged_addr(vb->baddr); struct mm_struct *mm = current->mm; + unsigned long untagged_baddr = untagged_addr(mm, vb->baddr); struct vm_area_struct *vma; unsigned long prev_pfn, this_pfn; unsigned long pages_done, user_address; diff --git a/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c b/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c index f50494123f03..a43c65950554 100644 --- a/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c +++ b/drivers/staging/media/atomisp/pci/hmm/hmm_bo.c @@ -794,7 +794,7 @@ static int alloc_user_pages(struct hmm_buffer_object *bo, * and map to user space */ - userptr = untagged_addr(userptr); + userptr = untagged_addr(current->mm, userptr); if (vma->vm_flags & (VM_IO | VM_PFNMAP)) { page_nr = pin_user_pages((unsigned long)userptr, bo->pgnr, diff --git a/drivers/tee/tee_shm.c b/drivers/tee/tee_shm.c index f2b1bcefcadd..386be09cb2cd 100644 --- a/drivers/tee/tee_shm.c +++ b/drivers/tee/tee_shm.c @@ -261,7 +261,7 @@ register_shm_helper(struct tee_context *ctx, unsigned long addr, shm->flags = flags; shm->ctx = ctx; shm->id = id; - addr = untagged_addr(addr); + addr = untagged_addr(current->mm, addr); start = rounddown(addr, PAGE_SIZE); shm->offset = addr - start; shm->size = length; diff --git a/drivers/vfio/vfio_iommu_type1.c b/drivers/vfio/vfio_iommu_type1.c index db516c90a977..1ab5adba6482 100644 --- a/drivers/vfio/vfio_iommu_type1.c +++ b/drivers/vfio/vfio_iommu_type1.c @@ -562,7 +562,7 @@ static int vaddr_get_pfns(struct mm_struct *mm, unsigned long vaddr, goto done; } - vaddr = untagged_addr(vaddr); + vaddr = untagged_addr(mm, vaddr); retry: vma = vma_lookup(mm, vaddr); diff --git a/fs/proc/task_mmu.c b/fs/proc/task_mmu.c index a3398d0f1927..9255596b690f 100644 --- a/fs/proc/task_mmu.c +++ b/fs/proc/task_mmu.c @@ -1662,7 +1662,7 @@ static ssize_t pagemap_read(struct file *file, char __user *buf, /* watch out for wraparound */ start_vaddr = end_vaddr; if (svpfn <= (ULONG_MAX >> PAGE_SHIFT)) - start_vaddr = untagged_addr(svpfn << PAGE_SHIFT); + start_vaddr = untagged_addr(mm, svpfn << PAGE_SHIFT); /* Ensure the address is inside the task */ if (start_vaddr > mm->task_size) diff --git a/include/linux/mm.h b/include/linux/mm.h index 3bedc449c14d..ae2f8c9fbc4d 100644 --- a/include/linux/mm.h +++ b/include/linux/mm.h @@ -95,17 +95,6 @@ extern int mmap_rnd_compat_bits __read_mostly; #include #include -/* - * Architectures that support memory tagging (assigning tags to memory regions, - * embedding these tags into addresses that point to these memory regions, and - * checking that the memory and the pointer tags match on memory accesses) - * redefine this macro to strip tags from pointers. - * It's defined as noop for architectures that don't support memory tagging. - */ -#ifndef untagged_addr -#define untagged_addr(addr) (addr) -#endif - #ifndef __pa_symbol #define __pa_symbol(x) __pa(RELOC_HIDE((unsigned long)(x), 0)) #endif diff --git a/include/linux/uaccess.h b/include/linux/uaccess.h index 47e5d374c7eb..aed9555aed67 100644 --- a/include/linux/uaccess.h +++ b/include/linux/uaccess.h @@ -10,6 +10,21 @@ #include +/* + * Architectures that support memory tagging (assigning tags to memory regions, + * embedding these tags into addresses that point to these memory regions, and + * checking that the memory and the pointer tags match on memory accesses) + * redefine this macro to strip tags from pointers. + * + * Passing down mm_struct allows to define untagging rules on per-process + * basis. + * + * It's defined as noop for architectures that don't support memory tagging. + */ +#ifndef untagged_addr +#define untagged_addr(mm, addr) (addr) +#endif + /* * Architectures should provide two primitives (raw_copy_{to,from}_user()) * and get rid of their private instances of copy_{to,from}_user() and diff --git a/lib/strncpy_from_user.c b/lib/strncpy_from_user.c index 6432b8c3e431..6e1e2aa0c994 100644 --- a/lib/strncpy_from_user.c +++ b/lib/strncpy_from_user.c @@ -121,7 +121,7 @@ long strncpy_from_user(char *dst, const char __user *src, long count) return 0; max_addr = TASK_SIZE_MAX; - src_addr = (unsigned long)untagged_addr(src); + src_addr = (unsigned long)untagged_addr(current->mm, src); if (likely(src_addr < max_addr)) { unsigned long max = max_addr - src_addr; long retval; diff --git a/lib/strnlen_user.c b/lib/strnlen_user.c index feeb935a2299..abc096a68f05 100644 --- a/lib/strnlen_user.c +++ b/lib/strnlen_user.c @@ -97,7 +97,7 @@ long strnlen_user(const char __user *str, long count) return 0; max_addr = TASK_SIZE_MAX; - src_addr = (unsigned long)untagged_addr(str); + src_addr = (unsigned long)untagged_addr(current->mm, str); if (likely(src_addr < max_addr)) { unsigned long max = max_addr - src_addr; long retval; diff --git a/mm/gup.c b/mm/gup.c index 732825157430..9f2a14f7e77a 100644 --- a/mm/gup.c +++ b/mm/gup.c @@ -1125,7 +1125,7 @@ static long __get_user_pages(struct mm_struct *mm, if (!nr_pages) return 0; - start = untagged_addr(start); + start = untagged_addr(mm, start); VM_BUG_ON(!!pages != !!(gup_flags & (FOLL_GET | FOLL_PIN))); @@ -1307,7 +1307,7 @@ int fixup_user_fault(struct mm_struct *mm, struct vm_area_struct *vma; vm_fault_t ret; - address = untagged_addr(address); + address = untagged_addr(mm, address); if (unlocked) fault_flags |= FAULT_FLAG_ALLOW_RETRY | FAULT_FLAG_KILLABLE; @@ -2935,7 +2935,7 @@ static int internal_get_user_pages_fast(unsigned long start, if (!(gup_flags & FOLL_FAST_ONLY)) might_lock_read(¤t->mm->mmap_lock); - start = untagged_addr(start) & PAGE_MASK; + start = untagged_addr(current->mm, start) & PAGE_MASK; len = nr_pages << PAGE_SHIFT; if (check_add_overflow(start, len, &end)) return 0; diff --git a/mm/madvise.c b/mm/madvise.c index 5f0f0948a50e..fb78881329c4 100644 --- a/mm/madvise.c +++ b/mm/madvise.c @@ -1373,7 +1373,7 @@ int do_madvise(struct mm_struct *mm, unsigned long start, size_t len_in, int beh size_t len; struct blk_plug plug; - start = untagged_addr(start); + start = untagged_addr(mm, start); if (!madvise_behavior_valid(behavior)) return -EINVAL; diff --git a/mm/mempolicy.c b/mm/mempolicy.c index b73d3248d976..92819bccb082 100644 --- a/mm/mempolicy.c +++ b/mm/mempolicy.c @@ -1456,7 +1456,7 @@ static long kernel_mbind(unsigned long start, unsigned long len, int lmode = mode; int err; - start = untagged_addr(start); + start = untagged_addr(current->mm, start); err = sanitize_mpol_flags(&lmode, &mode_flags); if (err) return err; @@ -1479,7 +1479,7 @@ SYSCALL_DEFINE4(set_mempolicy_home_node, unsigned long, start, unsigned long, le unsigned long end; int err = -ENOENT; - start = untagged_addr(start); + start = untagged_addr(mm, start); if (start & ~PAGE_MASK) return -EINVAL; /* @@ -1682,7 +1682,7 @@ static int kernel_get_mempolicy(int __user *policy, if (nmask != NULL && maxnode < nr_node_ids) return -EINVAL; - addr = untagged_addr(addr); + addr = untagged_addr(current->mm, addr); err = do_get_mempolicy(&pval, &nodes, addr, flags); diff --git a/mm/migrate.c b/mm/migrate.c index 6a1597c92261..e05e56605bce 100644 --- a/mm/migrate.c +++ b/mm/migrate.c @@ -1768,7 +1768,7 @@ static int do_pages_move(struct mm_struct *mm, nodemask_t task_nodes, goto out_flush; if (get_user(node, nodes + i)) goto out_flush; - addr = (unsigned long)untagged_addr(p); + addr = (unsigned long)untagged_addr(mm, p); err = -ENODEV; if (node < 0 || node >= MAX_NUMNODES) diff --git a/mm/mincore.c b/mm/mincore.c index fa200c14185f..72c55bd9d184 100644 --- a/mm/mincore.c +++ b/mm/mincore.c @@ -236,7 +236,7 @@ SYSCALL_DEFINE3(mincore, unsigned long, start, size_t, len, unsigned long pages; unsigned char *tmp; - start = untagged_addr(start); + start = untagged_addr(current->mm, start); /* Check the start address: needs to be page-aligned.. */ if (start & ~PAGE_MASK) diff --git a/mm/mlock.c b/mm/mlock.c index b14e929084cc..1ea100fe3db4 100644 --- a/mm/mlock.c +++ b/mm/mlock.c @@ -571,7 +571,7 @@ static __must_check int do_mlock(unsigned long start, size_t len, vm_flags_t fla unsigned long lock_limit; int error = -ENOMEM; - start = untagged_addr(start); + start = untagged_addr(current->mm, start); if (!can_do_mlock()) return -EPERM; @@ -634,7 +634,7 @@ SYSCALL_DEFINE2(munlock, unsigned long, start, size_t, len) { int ret; - start = untagged_addr(start); + start = untagged_addr(current->mm, start); len = PAGE_ALIGN(len + (offset_in_page(start))); start &= PAGE_MASK; diff --git a/mm/mmap.c b/mm/mmap.c index c035020d0c89..90047dc5098a 100644 --- a/mm/mmap.c +++ b/mm/mmap.c @@ -2877,7 +2877,7 @@ EXPORT_SYMBOL(vm_munmap); SYSCALL_DEFINE2(munmap, unsigned long, addr, size_t, len) { - addr = untagged_addr(addr); + addr = untagged_addr(current->mm, addr); return __vm_munmap(addr, len, true); } diff --git a/mm/mprotect.c b/mm/mprotect.c index 3a23dde73723..78adf9635194 100644 --- a/mm/mprotect.c +++ b/mm/mprotect.c @@ -669,7 +669,7 @@ static int do_mprotect_pkey(unsigned long start, size_t len, (prot & PROT_READ); struct mmu_gather tlb; - start = untagged_addr(start); + start = untagged_addr(current->mm, start); prot &= ~(PROT_GROWSDOWN|PROT_GROWSUP); if (grows == (PROT_GROWSDOWN|PROT_GROWSUP)) /* can't be both */ diff --git a/mm/mremap.c b/mm/mremap.c index b522cd0259a0..f76648bc4f67 100644 --- a/mm/mremap.c +++ b/mm/mremap.c @@ -906,7 +906,7 @@ SYSCALL_DEFINE5(mremap, unsigned long, addr, unsigned long, old_len, * * See Documentation/arm64/tagged-address-abi.rst for more information. */ - addr = untagged_addr(addr); + addr = untagged_addr(mm, addr); if (flags & ~(MREMAP_FIXED | MREMAP_MAYMOVE | MREMAP_DONTUNMAP)) return ret; diff --git a/mm/msync.c b/mm/msync.c index 137d1c104f3e..5fe989bd3c4b 100644 --- a/mm/msync.c +++ b/mm/msync.c @@ -37,7 +37,7 @@ SYSCALL_DEFINE3(msync, unsigned long, start, size_t, len, int, flags) int unmapped_error = 0; int error = -EINVAL; - start = untagged_addr(start); + start = untagged_addr(mm, start); if (flags & ~(MS_ASYNC | MS_INVALIDATE | MS_SYNC)) goto out; diff --git a/virt/kvm/kvm_main.c b/virt/kvm/kvm_main.c index 515dfe9d3bcf..d2239aa85cf5 100644 --- a/virt/kvm/kvm_main.c +++ b/virt/kvm/kvm_main.c @@ -1943,7 +1943,7 @@ int __kvm_set_memory_region(struct kvm *kvm, return -EINVAL; /* We can read the guest memory with __xxx_user() later on. */ if ((mem->userspace_addr & (PAGE_SIZE - 1)) || - (mem->userspace_addr != untagged_addr(mem->userspace_addr)) || + (mem->userspace_addr != untagged_addr(kvm->mm, mem->userspace_addr)) || !access_ok((void __user *)(unsigned long)mem->userspace_addr, mem->memory_size)) return -EINVAL; From patchwork Tue Aug 30 01:00:57 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12958585 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 802B1ECAAD4 for ; Tue, 30 Aug 2022 01:01:42 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id EFD1D94000A; Mon, 29 Aug 2022 21:01:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D513B940007; Mon, 29 Aug 2022 21:01:34 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BE28594000B; Mon, 29 Aug 2022 21:01:34 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id A6C7F940007 for ; Mon, 29 Aug 2022 21:01:34 -0400 (EDT) Received: from smtpin10.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay06.hostedemail.com (Postfix) with ESMTP id 7D1C8AAC73 for ; Tue, 30 Aug 2022 01:01:34 +0000 (UTC) X-FDA: 79854456108.10.2202F4A Received: from mga06.intel.com (mga06b.intel.com [134.134.136.31]) by imf28.hostedemail.com (Postfix) with ESMTP id B17CDC0032 for ; Tue, 30 Aug 2022 01:01:32 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1661821292; x=1693357292; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=8J6pDAIm6QgLoimLJ7h7q7C6o4sSo9xSYFO7v0QZKb4=; b=OBszPv/Wc2OJh3n6NCr16HDxLqkTlfVP+N1P0MucEt5/5u2swOrj0EKs rypMyPqz/s549TKFEXyqYCvhsHEs//GoI60qtxBxbz6BaP3H+JcJJevK5 fIKGvmWV//AhBKuznFRtEkt+TkBJmtGE2yDRj2TYwr+ZNWtuLQf47NJmn 8NY1s4eAXr9d+CqKEZairrGiESme/wehbCpgG+kTUaj0ZqwklfZxpT64w Aednx4lu+J9bIwMikuWNqQ7uuMT3zBq3Kr6oPuSrjrL6f5Vg7mr3ClpQY n7Jn0qvP3qZv7VbAuwZotxjGGUkY9/OCllm/AK7NpS5Ex+mYC5QpMOleQ A==; X-IronPort-AV: E=McAfee;i="6500,9779,10454"; a="356762061" X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="356762061" Received: from fmsmga003.fm.intel.com ([10.253.24.29]) by orsmga104.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:30 -0700 X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="700784183" Received: from fpalamon-mobl.amr.corp.intel.com (HELO box.shutemov.name) ([10.252.54.23]) by fmsmga003-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:26 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 81FF11040C5; Tue, 30 Aug 2022 04:01:24 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv8 04/11] x86/mm: Handle LAM on context switch Date: Tue, 30 Aug 2022 04:00:57 +0300 Message-Id: <20220830010104.1282-5-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> References: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1661821293; a=rsa-sha256; cv=none; b=XpGelm4tADP1Nlk/gwdOdcVipjUw0nJFf6Wf2n/a11/Q+5+S6+IpBNJX9YfqzXCdd1N4oX D16vz8CxRamEYA6MgMmGO0MhE5Ov7aWxuQ1s3ZqJPDKaBLWd5ZkMQNL+xSMnjvKG//kH6v JiPxlY2Ck7H+GrLgIQsjU9yRq2qlx6A= ARC-Authentication-Results: i=1; imf28.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b="OBszPv/W"; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none); spf=none (imf28.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.31) smtp.mailfrom=kirill.shutemov@linux.intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1661821293; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=tzs/kdPjiBstMBvtDPuXTbdGvQ289egQY5+JuxNQGLo=; b=wV81PzrAXVpdpveS1aNb60tGoR7q0yXGngeGx1lfj/8VPItBXQJznjG58wajRyiOguCcNH gDuPEm1/OR+iFMCzwtq/+XAY6bHeJJ0iIXFDAHTaR1y7h5Sshu2dc72e/CexduQJd/tppE zeY8UwJEjEfGiNooj2tqL45ARFmKxtA= X-Stat-Signature: ync7jfu7m9ztk8zrug8j8rpkc5foebkt X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: B17CDC0032 Authentication-Results: imf28.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b="OBszPv/W"; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none); spf=none (imf28.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.31) smtp.mailfrom=kirill.shutemov@linux.intel.com X-Rspam-User: X-HE-Tag: 1661821292-335609 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Linear Address Masking mode for userspace pointers encoded in CR3 bits. The mode is selected per-process and stored in mm_context_t. switch_mm_irqs_off() now respects selected LAM mode and constructs CR3 accordingly. The active LAM mode gets recorded in the tlb_state. Signed-off-by: Kirill A. Shutemov Tested-by: Alexander Potapenko Acked-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/mmu.h | 3 ++ arch/x86/include/asm/mmu_context.h | 24 +++++++++++++++ arch/x86/include/asm/tlbflush.h | 35 ++++++++++++++++++++++ arch/x86/mm/tlb.c | 48 ++++++++++++++++++++---------- 4 files changed, 94 insertions(+), 16 deletions(-) diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h index 5d7494631ea9..002889ca8978 100644 --- a/arch/x86/include/asm/mmu.h +++ b/arch/x86/include/asm/mmu.h @@ -40,6 +40,9 @@ typedef struct { #ifdef CONFIG_X86_64 unsigned short flags; + + /* Active LAM mode: X86_CR3_LAM_U48 or X86_CR3_LAM_U57 or 0 (disabled) */ + unsigned long lam_cr3_mask; #endif struct mutex lock; diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index b8d40ddeab00..69c943b2ae90 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -91,6 +91,29 @@ static inline void switch_ldt(struct mm_struct *prev, struct mm_struct *next) } #endif +#ifdef CONFIG_X86_64 +static inline unsigned long mm_lam_cr3_mask(struct mm_struct *mm) +{ + return mm->context.lam_cr3_mask; +} + +static inline void dup_lam(struct mm_struct *oldmm, struct mm_struct *mm) +{ + mm->context.lam_cr3_mask = oldmm->context.lam_cr3_mask; +} + +#else + +static inline unsigned long mm_lam_cr3_mask(struct mm_struct *mm) +{ + return 0; +} + +static inline void dup_lam(struct mm_struct *oldmm, struct mm_struct *mm) +{ +} +#endif + #define enter_lazy_tlb enter_lazy_tlb extern void enter_lazy_tlb(struct mm_struct *mm, struct task_struct *tsk); @@ -168,6 +191,7 @@ static inline int arch_dup_mmap(struct mm_struct *oldmm, struct mm_struct *mm) { arch_dup_pkeys(oldmm, mm); paravirt_arch_dup_mmap(oldmm, mm); + dup_lam(oldmm, mm); return ldt_dup_context(oldmm, mm); } diff --git a/arch/x86/include/asm/tlbflush.h b/arch/x86/include/asm/tlbflush.h index cda3118f3b27..1ad080163363 100644 --- a/arch/x86/include/asm/tlbflush.h +++ b/arch/x86/include/asm/tlbflush.h @@ -101,6 +101,16 @@ struct tlb_state { */ bool invalidate_other; +#ifdef CONFIG_X86_64 + /* + * Active LAM mode. + * + * X86_CR3_LAM_U57/U48 shifted right by X86_CR3_LAM_U57_BIT or 0 if LAM + * disabled. + */ + u8 lam; +#endif + /* * Mask that contains TLB_NR_DYN_ASIDS+1 bits to indicate * the corresponding user PCID needs a flush next time we @@ -357,6 +367,30 @@ static inline bool huge_pmd_needs_flush(pmd_t oldpmd, pmd_t newpmd) } #define huge_pmd_needs_flush huge_pmd_needs_flush +#ifdef CONFIG_X86_64 +static inline unsigned long tlbstate_lam_cr3_mask(void) +{ + unsigned long lam = this_cpu_read(cpu_tlbstate.lam); + + return lam << X86_CR3_LAM_U57_BIT; +} + +static inline void set_tlbstate_cr3_lam_mask(unsigned long mask) +{ + this_cpu_write(cpu_tlbstate.lam, mask >> X86_CR3_LAM_U57_BIT); +} + +#else + +static inline unsigned long tlbstate_lam_cr3_mask(void) +{ + return 0; +} + +static inline void set_tlbstate_cr3_lam_mask(u64 mask) +{ +} +#endif #endif /* !MODULE */ static inline void __native_tlb_flush_global(unsigned long cr4) @@ -364,4 +398,5 @@ static inline void __native_tlb_flush_global(unsigned long cr4) native_write_cr4(cr4 ^ X86_CR4_PGE); native_write_cr4(cr4); } + #endif /* _ASM_X86_TLBFLUSH_H */ diff --git a/arch/x86/mm/tlb.c b/arch/x86/mm/tlb.c index c1e31e9a85d7..d6c9c15d2ad2 100644 --- a/arch/x86/mm/tlb.c +++ b/arch/x86/mm/tlb.c @@ -154,26 +154,30 @@ static inline u16 user_pcid(u16 asid) return ret; } -static inline unsigned long build_cr3(pgd_t *pgd, u16 asid) +static inline unsigned long build_cr3(pgd_t *pgd, u16 asid, unsigned long lam) { + unsigned long cr3 = __sme_pa(pgd) | lam; + if (static_cpu_has(X86_FEATURE_PCID)) { - return __sme_pa(pgd) | kern_pcid(asid); + VM_WARN_ON_ONCE(asid > MAX_ASID_AVAILABLE); + cr3 |= kern_pcid(asid); } else { VM_WARN_ON_ONCE(asid != 0); - return __sme_pa(pgd); } + + return cr3; } -static inline unsigned long build_cr3_noflush(pgd_t *pgd, u16 asid) +static inline unsigned long build_cr3_noflush(pgd_t *pgd, u16 asid, + unsigned long lam) { - VM_WARN_ON_ONCE(asid > MAX_ASID_AVAILABLE); /* * Use boot_cpu_has() instead of this_cpu_has() as this function * might be called during early boot. This should work even after * boot because all CPU's the have same capabilities: */ VM_WARN_ON_ONCE(!boot_cpu_has(X86_FEATURE_PCID)); - return __sme_pa(pgd) | kern_pcid(asid) | CR3_NOFLUSH; + return build_cr3(pgd, asid, lam) | CR3_NOFLUSH; } /* @@ -274,15 +278,16 @@ static inline void invalidate_user_asid(u16 asid) (unsigned long *)this_cpu_ptr(&cpu_tlbstate.user_pcid_flush_mask)); } -static void load_new_mm_cr3(pgd_t *pgdir, u16 new_asid, bool need_flush) +static void load_new_mm_cr3(pgd_t *pgdir, u16 new_asid, unsigned long lam, + bool need_flush) { unsigned long new_mm_cr3; if (need_flush) { invalidate_user_asid(new_asid); - new_mm_cr3 = build_cr3(pgdir, new_asid); + new_mm_cr3 = build_cr3(pgdir, new_asid, lam); } else { - new_mm_cr3 = build_cr3_noflush(pgdir, new_asid); + new_mm_cr3 = build_cr3_noflush(pgdir, new_asid, lam); } /* @@ -491,6 +496,8 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, { struct mm_struct *real_prev = this_cpu_read(cpu_tlbstate.loaded_mm); u16 prev_asid = this_cpu_read(cpu_tlbstate.loaded_mm_asid); + unsigned long prev_lam = tlbstate_lam_cr3_mask(); + unsigned long new_lam = mm_lam_cr3_mask(next); bool was_lazy = this_cpu_read(cpu_tlbstate_shared.is_lazy); unsigned cpu = smp_processor_id(); u64 next_tlb_gen; @@ -520,7 +527,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, * isn't free. */ #ifdef CONFIG_DEBUG_VM - if (WARN_ON_ONCE(__read_cr3() != build_cr3(real_prev->pgd, prev_asid))) { + if (WARN_ON_ONCE(__read_cr3() != build_cr3(real_prev->pgd, prev_asid, prev_lam))) { /* * If we were to BUG here, we'd be very likely to kill * the system so hard that we don't see the call trace. @@ -554,6 +561,7 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, if (real_prev == next) { VM_WARN_ON(this_cpu_read(cpu_tlbstate.ctxs[prev_asid].ctx_id) != next->context.ctx_id); + VM_WARN_ON(prev_lam != new_lam); /* * Even in lazy TLB mode, the CPU should stay set in the @@ -622,15 +630,16 @@ void switch_mm_irqs_off(struct mm_struct *prev, struct mm_struct *next, barrier(); } + set_tlbstate_cr3_lam_mask(new_lam); if (need_flush) { this_cpu_write(cpu_tlbstate.ctxs[new_asid].ctx_id, next->context.ctx_id); this_cpu_write(cpu_tlbstate.ctxs[new_asid].tlb_gen, next_tlb_gen); - load_new_mm_cr3(next->pgd, new_asid, true); + load_new_mm_cr3(next->pgd, new_asid, new_lam, true); trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, TLB_FLUSH_ALL); } else { /* The new ASID is already up to date. */ - load_new_mm_cr3(next->pgd, new_asid, false); + load_new_mm_cr3(next->pgd, new_asid, new_lam, false); trace_tlb_flush(TLB_FLUSH_ON_TASK_SWITCH, 0); } @@ -691,6 +700,10 @@ void initialize_tlbstate_and_flush(void) /* Assert that CR3 already references the right mm. */ WARN_ON((cr3 & CR3_ADDR_MASK) != __pa(mm->pgd)); + /* LAM expected to be disabled in CR3 and init_mm */ + WARN_ON(cr3 & (X86_CR3_LAM_U48 | X86_CR3_LAM_U57)); + WARN_ON(mm_lam_cr3_mask(&init_mm)); + /* * Assert that CR4.PCIDE is set if needed. (CR4.PCIDE initialization * doesn't work like other CR4 bits because it can only be set from @@ -699,8 +712,8 @@ void initialize_tlbstate_and_flush(void) WARN_ON(boot_cpu_has(X86_FEATURE_PCID) && !(cr4_read_shadow() & X86_CR4_PCIDE)); - /* Force ASID 0 and force a TLB flush. */ - write_cr3(build_cr3(mm->pgd, 0)); + /* Disable LAM, force ASID 0 and force a TLB flush. */ + write_cr3(build_cr3(mm->pgd, 0, 0)); /* Reinitialize tlbstate. */ this_cpu_write(cpu_tlbstate.last_user_mm_spec, LAST_USER_MM_INIT); @@ -708,6 +721,7 @@ void initialize_tlbstate_and_flush(void) this_cpu_write(cpu_tlbstate.next_asid, 1); this_cpu_write(cpu_tlbstate.ctxs[0].ctx_id, mm->context.ctx_id); this_cpu_write(cpu_tlbstate.ctxs[0].tlb_gen, tlb_gen); + set_tlbstate_cr3_lam_mask(0); for (i = 1; i < TLB_NR_DYN_ASIDS; i++) this_cpu_write(cpu_tlbstate.ctxs[i].ctx_id, 0); @@ -1071,8 +1085,10 @@ void flush_tlb_kernel_range(unsigned long start, unsigned long end) */ unsigned long __get_current_cr3_fast(void) { - unsigned long cr3 = build_cr3(this_cpu_read(cpu_tlbstate.loaded_mm)->pgd, - this_cpu_read(cpu_tlbstate.loaded_mm_asid)); + unsigned long cr3 = + build_cr3(this_cpu_read(cpu_tlbstate.loaded_mm)->pgd, + this_cpu_read(cpu_tlbstate.loaded_mm_asid), + tlbstate_lam_cr3_mask()); /* For now, be very restrictive about when this can be called. */ VM_WARN_ON(in_nmi() || preemptible()); From patchwork Tue Aug 30 01:00:58 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12958591 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 5B688ECAAD3 for ; Tue, 30 Aug 2022 01:02:04 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id E0A65940010; Mon, 29 Aug 2022 21:02:03 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id DB8B5940007; Mon, 29 Aug 2022 21:02:03 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id C32D0940010; Mon, 29 Aug 2022 21:02:03 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0014.hostedemail.com [216.40.44.14]) by kanga.kvack.org (Postfix) with ESMTP id B3D07940007 for ; Mon, 29 Aug 2022 21:02:03 -0400 (EDT) Received: from smtpin02.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay10.hostedemail.com (Postfix) with ESMTP id 8E510C0A97 for ; Tue, 30 Aug 2022 01:02:03 +0000 (UTC) X-FDA: 79854457326.02.06B3298 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by imf03.hostedemail.com (Postfix) with ESMTP id F1A1B20012 for ; Tue, 30 Aug 2022 01:02:02 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1661821323; x=1693357323; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=T/1SON8x3I6gBc9Gilj4Fh9sNy8RCIdk2WdRlZ6DEwk=; b=NGt90BTfZHwFIbypgX+INtu6+lqILpjW9rX+OjL9o8pRCcH6hxiL3ozr LXHZ/L9CWBptEy1thfJT2lv+2SPhdXJg6h/G0GInVn+cBHD+xU+VEloXB SEr00/nGEif06qIK7QaTpNLXW+l2cAETX759A++ZZQ0CSH+CML/5ZNQFj eJ794uZuiEuUCHyZHNMm8z6dbbubTFhRbH383VO2aNmgd4XGMwjNOFit7 vndB+Hb2XAm9XLyd+Z5x9GWDaBlGRvJSFXqJqThqr8UsU2JcFprni0YY7 Q1THuK1iDKpPArsfyfmAwcOG5srFY8TXV68r4RoAT2onMm3dIG8yojroi A==; X-IronPort-AV: E=McAfee;i="6500,9779,10454"; a="321170266" X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="321170266" Received: from fmsmga005.fm.intel.com ([10.253.24.32]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:36 -0700 X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="939803136" Received: from fpalamon-mobl.amr.corp.intel.com (HELO box.shutemov.name) ([10.252.54.23]) by fmsmga005-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:32 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 8B9E1104167; Tue, 30 Aug 2022 04:01:24 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv8 05/11] x86/uaccess: Provide untagged_addr() and remove tags before address check Date: Tue, 30 Aug 2022 04:00:58 +0300 Message-Id: <20220830010104.1282-6-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> References: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Authentication-Results: i=1; imf03.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=NGt90BTf; spf=none (imf03.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.88) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none) ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1661821323; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=+gpgieOlIPxx05fS70p4VSaJETO7zeherGqCBT59DhU=; b=0EGcM3HJB8Zsnh7iXr1kehTuTxdlAvQHa7LOrpftBOGjUPJDvGATrijoQLXCeIXCO3vnmD 4XiDfg6EUMfo+Xdlw+wmeeH6OxtK+MaUM+OecZ0bhpZ1zKaYT2NFB/m4QrlBPr7hzkh8pI sTNOQOdHo3kfA8ECAlPMiOPazCDcDjQ= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1661821323; a=rsa-sha256; cv=none; b=6zi2XMOLJnQL0tXcUmZjEsPfXJ373ussDSnuBO+JrZOIqLMVHUW4uUo8ISvwMk+4VZr+fl ewFviK1vJm15jK1lKVlz60Wa0+UuAcOZQ4P52AWRDbKLr2SY1t/qlr7Oz0TJdqdAA1jHqx Kdrm+gmejX4JeFdCEjWHTcCxSL6g4/A= X-Stat-Signature: ohnssoau9feedma5cutyeyqg7bnno8bc X-Rspamd-Queue-Id: F1A1B20012 Authentication-Results: imf03.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=NGt90BTf; spf=none (imf03.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.88) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none) X-Rspam-User: X-Rspamd-Server: rspam01 X-HE-Tag: 1661821322-473450 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: untagged_addr() is a helper used by the core-mm to strip tag bits and get the address to the canonical shape. In only handles userspace addresses. The untagging mask is stored in mmu_context and will be set on enabling LAM for the process. The tags must not be included into check whether it's okay to access the userspace address. Strip tags in access_ok(). get_user() and put_user() don't use access_ok(), but check access against TASK_SIZE directly in assembly. Strip tags, before calling into the assembly helper. Signed-off-by: Kirill A. Shutemov Tested-by: Alexander Potapenko Acked-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/mmu.h | 3 +++ arch/x86/include/asm/mmu_context.h | 11 ++++++++ arch/x86/include/asm/uaccess.h | 42 +++++++++++++++++++++++++++--- arch/x86/kernel/process.c | 3 +++ 4 files changed, 56 insertions(+), 3 deletions(-) diff --git a/arch/x86/include/asm/mmu.h b/arch/x86/include/asm/mmu.h index 002889ca8978..2fdb390040b5 100644 --- a/arch/x86/include/asm/mmu.h +++ b/arch/x86/include/asm/mmu.h @@ -43,6 +43,9 @@ typedef struct { /* Active LAM mode: X86_CR3_LAM_U48 or X86_CR3_LAM_U57 or 0 (disabled) */ unsigned long lam_cr3_mask; + + /* Significant bits of the virtual address. Excludes tag bits. */ + u64 untag_mask; #endif struct mutex lock; diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 69c943b2ae90..5bd3d46685dc 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -100,6 +100,12 @@ static inline unsigned long mm_lam_cr3_mask(struct mm_struct *mm) static inline void dup_lam(struct mm_struct *oldmm, struct mm_struct *mm) { mm->context.lam_cr3_mask = oldmm->context.lam_cr3_mask; + mm->context.untag_mask = oldmm->context.untag_mask; +} + +static inline void mm_reset_untag_mask(struct mm_struct *mm) +{ + mm->context.untag_mask = -1UL; } #else @@ -112,6 +118,10 @@ static inline unsigned long mm_lam_cr3_mask(struct mm_struct *mm) static inline void dup_lam(struct mm_struct *oldmm, struct mm_struct *mm) { } + +static inline void mm_reset_untag_mask(struct mm_struct *mm) +{ +} #endif #define enter_lazy_tlb enter_lazy_tlb @@ -138,6 +148,7 @@ static inline int init_new_context(struct task_struct *tsk, mm->context.execute_only_pkey = -1; } #endif + mm_reset_untag_mask(mm); init_new_context_ldt(mm); return 0; } diff --git a/arch/x86/include/asm/uaccess.h b/arch/x86/include/asm/uaccess.h index 913e593a3b45..803241dfc473 100644 --- a/arch/x86/include/asm/uaccess.h +++ b/arch/x86/include/asm/uaccess.h @@ -6,6 +6,7 @@ */ #include #include +#include #include #include #include @@ -20,6 +21,30 @@ static inline bool pagefault_disabled(void); # define WARN_ON_IN_IRQ() #endif +#ifdef CONFIG_X86_64 +/* + * Mask out tag bits from the address. + * + * Magic with the 'sign' allows to untag userspace pointer without any branches + * while leaving kernel addresses intact. + */ +#define untagged_addr(mm, addr) ({ \ + u64 __addr = (__force u64)(addr); \ + s64 sign = (s64)__addr >> 63; \ + __addr &= (mm)->context.untag_mask | sign; \ + (__force __typeof__(addr))__addr; \ +}) + +#define untagged_ptr(mm, ptr) ({ \ + u64 __ptrval = (__force u64)(ptr); \ + __ptrval = untagged_addr(mm, __ptrval); \ + (__force __typeof__(*(ptr)) *)__ptrval; \ +}) +#else +#define untagged_addr(mm, addr) (addr) +#define untagged_ptr(mm, ptr) (ptr) +#endif + /** * access_ok - Checks if a user space pointer is valid * @addr: User space pointer to start of block to check @@ -40,7 +65,7 @@ static inline bool pagefault_disabled(void); #define access_ok(addr, size) \ ({ \ WARN_ON_IN_IRQ(); \ - likely(__access_ok(addr, size)); \ + likely(__access_ok(untagged_addr(current->mm, addr), size)); \ }) #include @@ -125,7 +150,13 @@ extern int __get_user_bad(void); * Return: zero on success, or -EFAULT on error. * On error, the variable @x is set to zero. */ -#define get_user(x,ptr) ({ might_fault(); do_get_user_call(get_user,x,ptr); }) +#define get_user(x,ptr) \ +({ \ + __typeof__(*(ptr)) __user *__ptr_clean; \ + __ptr_clean = untagged_ptr(current->mm, ptr); \ + might_fault(); \ + do_get_user_call(get_user,x,__ptr_clean); \ +}) /** * __get_user - Get a simple variable from user space, with less checking. @@ -222,7 +253,12 @@ extern void __put_user_nocheck_8(void); * * Return: zero on success, or -EFAULT on error. */ -#define put_user(x, ptr) ({ might_fault(); do_put_user_call(put_user,x,ptr); }) +#define put_user(x, ptr) ({ \ + __typeof__(*(ptr)) __user *__ptr_clean; \ + __ptr_clean = untagged_ptr(current->mm, ptr); \ + might_fault(); \ + do_put_user_call(put_user,x,__ptr_clean); \ +}) /** * __put_user - Write a simple value into user space, with less checking. diff --git a/arch/x86/kernel/process.c b/arch/x86/kernel/process.c index 58a6ea472db9..b0e86fb11ffa 100644 --- a/arch/x86/kernel/process.c +++ b/arch/x86/kernel/process.c @@ -47,6 +47,7 @@ #include #include #include +#include #include "process.h" @@ -367,6 +368,8 @@ void arch_setup_new_exec(void) task_clear_spec_ssb_noexec(current); speculation_ctrl_update(read_thread_flags()); } + + mm_reset_untag_mask(current->mm); } #ifdef CONFIG_X86_IOPL_IOPERM From patchwork Tue Aug 30 01:00:59 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12958586 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 83DFBECAAD2 for ; Tue, 30 Aug 2022 01:01:44 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id EF3CC94000B; Mon, 29 Aug 2022 21:01:37 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id E7CF0940007; Mon, 29 Aug 2022 21:01:37 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id CCF1094000B; Mon, 29 Aug 2022 21:01:37 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0016.hostedemail.com [216.40.44.16]) by kanga.kvack.org (Postfix) with ESMTP id BA6F6940007 for ; Mon, 29 Aug 2022 21:01:37 -0400 (EDT) Received: from smtpin26.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 88B28160689 for ; Tue, 30 Aug 2022 01:01:37 +0000 (UTC) X-FDA: 79854456234.26.78DADE1 Received: from mga05.intel.com (mga05.intel.com [192.55.52.43]) by imf18.hostedemail.com (Postfix) with ESMTP id D8EC91C0010 for ; Tue, 30 Aug 2022 01:01:36 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1661821296; x=1693357296; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=gJsP/Lgemx2usecScQbMEyqztN6q+W6cu2oGGdXpnWc=; b=ilwlbiyG3czK7XCwXDobki+6YV/H89cyYTUhDyD7RHFuklzj0CRKFOPp CBSTpW6vZiiStbEOszKoE5cLOGQqh+zzM5E73eHew4Ex6hTOhYcA3Y2r1 azSdjPOslmUFwhYuJ+JYLc9FwqDFBbd58KpJDR3UcYIYFEn5qTQHhFtZU Gdy1pfTohVBp1vfig2ROYdOsXO/KzBQHcGDdqtJ0tcDH0ygE0fvVj/vYk CTCjbUmJxTe4BiNYWA2troywxy1eMCHcLiPrEtOkd+JrFCh98121IBrjS ARA+Gi7C9aGXgaczVwbjlmHnCYy6wePNjFhHlowClbsDah6Xe0WQakt9E Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10454"; a="381342292" X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="381342292" Received: from fmsmga002.fm.intel.com ([10.253.24.26]) by fmsmga105.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:36 -0700 X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="715096951" Received: from fpalamon-mobl.amr.corp.intel.com (HELO box.shutemov.name) ([10.252.54.23]) by fmsmga002-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:33 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 9588610419A; Tue, 30 Aug 2022 04:01:24 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv8 06/11] x86/mm: Provide arch_prctl() interface for LAM Date: Tue, 30 Aug 2022 04:00:59 +0300 Message-Id: <20220830010104.1282-7-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> References: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1661821297; a=rsa-sha256; cv=none; b=l9ZI9Oo/0uBkNCMznbRs0+y4CrHtqmeFY/AaJqUQDqt5nkpatXKFaq/WMJUNiAnOPGzx6r O26GD1VTQvLz+3Dd9OsQpa9wKhIVFJ7LhuoEaxTGpxM3NCGEEocebsNiZFjqXJ1E7UGgTy yhqY4fDQL0JtGBcTv7MzTCdQy6JhTU8= ARC-Authentication-Results: i=1; imf18.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=ilwlbiyG; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none); spf=none (imf18.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.43) smtp.mailfrom=kirill.shutemov@linux.intel.com ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1661821297; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=gjgrFfjsqL4RUcsssh7Fcj0Rl9fhqB64ajPEVgRnofs=; b=qmeEJeHTeCYKImy4Sj2Hp2MUBVUN9LWiGobZNnxX/tpLMuZBfA1KLOku8UOl1/nhiQ5MxD lx5bieyb+yv9sWwJbJ5BT0cFFYZP1F9NYg+nnWYovaTE/DjjeV+crbwPRhwfJqvNbaaTB1 ZhJW2E9r+q7P5xCtLuPgUY8sWSoYSok= X-Stat-Signature: tbickfrnf6sh8sq3ayzeycb4u3zkphe5 X-Rspamd-Server: rspam09 X-Rspamd-Queue-Id: D8EC91C0010 Authentication-Results: imf18.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=ilwlbiyG; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none); spf=none (imf18.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.43) smtp.mailfrom=kirill.shutemov@linux.intel.com X-Rspam-User: X-HE-Tag: 1661821296-394787 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Add a couple of arch_prctl() handles: - ARCH_ENABLE_TAGGED_ADDR enabled LAM. The argument is required number of tag bits. It is rounded up to the nearest LAM mode that can provide it. For now only LAM_U57 is supported, with 6 tag bits. - ARCH_GET_UNTAG_MASK returns untag mask. It can indicates where tag bits located in the address. - ARCH_GET_MAX_TAG_BITS returns the maximum tag bits user can request. Zero if LAM is not supported. Signed-off-by: Kirill A. Shutemov Tested-by: Alexander Potapenko Reviewed-by: Alexander Potapenko Acked-by: Peter Zijlstra (Intel) --- arch/x86/include/uapi/asm/prctl.h | 4 ++ arch/x86/kernel/process_64.c | 64 ++++++++++++++++++++++++++++++- 2 files changed, 67 insertions(+), 1 deletion(-) diff --git a/arch/x86/include/uapi/asm/prctl.h b/arch/x86/include/uapi/asm/prctl.h index 500b96e71f18..a31e27b95b19 100644 --- a/arch/x86/include/uapi/asm/prctl.h +++ b/arch/x86/include/uapi/asm/prctl.h @@ -20,4 +20,8 @@ #define ARCH_MAP_VDSO_32 0x2002 #define ARCH_MAP_VDSO_64 0x2003 +#define ARCH_GET_UNTAG_MASK 0x4001 +#define ARCH_ENABLE_TAGGED_ADDR 0x4002 +#define ARCH_GET_MAX_TAG_BITS 0x4003 + #endif /* _ASM_X86_PRCTL_H */ diff --git a/arch/x86/kernel/process_64.c b/arch/x86/kernel/process_64.c index 1962008fe743..455601a3cc3f 100644 --- a/arch/x86/kernel/process_64.c +++ b/arch/x86/kernel/process_64.c @@ -742,6 +742,59 @@ static long prctl_map_vdso(const struct vdso_image *image, unsigned long addr) } #endif +static void enable_lam_func(void *mm) +{ + struct mm_struct *loaded_mm = this_cpu_read(cpu_tlbstate.loaded_mm); + unsigned long lam_mask; + unsigned long cr3; + + if (loaded_mm != mm) + return; + + lam_mask = READ_ONCE(loaded_mm->context.lam_cr3_mask); + + /* Update CR3 to get LAM active on the CPU */ + cr3 = __read_cr3(); + cr3 &= ~(X86_CR3_LAM_U48 | X86_CR3_LAM_U57); + cr3 |= lam_mask; + write_cr3(cr3); + set_tlbstate_cr3_lam_mask(lam_mask); +} + +#define LAM_U57_BITS 6 + +static int prctl_enable_tagged_addr(struct mm_struct *mm, unsigned long nr_bits) +{ + int ret = 0; + + if (!cpu_feature_enabled(X86_FEATURE_LAM)) + return -ENODEV; + + mutex_lock(&mm->context.lock); + + /* Already enabled? */ + if (mm->context.lam_cr3_mask) { + ret = -EBUSY; + goto out; + } + + if (!nr_bits) { + ret = -EINVAL; + goto out; + } else if (nr_bits <= LAM_U57_BITS) { + mm->context.lam_cr3_mask = X86_CR3_LAM_U57; + mm->context.untag_mask = ~GENMASK(62, 57); + } else { + ret = -EINVAL; + goto out; + } + + on_each_cpu_mask(mm_cpumask(mm), enable_lam_func, mm, true); +out: + mutex_unlock(&mm->context.lock); + return ret; +} + long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2) { int ret = 0; @@ -829,7 +882,16 @@ long do_arch_prctl_64(struct task_struct *task, int option, unsigned long arg2) case ARCH_MAP_VDSO_64: return prctl_map_vdso(&vdso_image_64, arg2); #endif - + case ARCH_GET_UNTAG_MASK: + return put_user(task->mm->context.untag_mask, + (unsigned long __user *)arg2); + case ARCH_ENABLE_TAGGED_ADDR: + return prctl_enable_tagged_addr(task->mm, arg2); + case ARCH_GET_MAX_TAG_BITS: + if (!cpu_feature_enabled(X86_FEATURE_LAM)) + return put_user(0, (unsigned long __user *)arg2); + else + return put_user(LAM_U57_BITS, (unsigned long __user *)arg2); default: ret = -EINVAL; break; From patchwork Tue Aug 30 01:01:00 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12958588 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 3C2CDECAAD2 for ; Tue, 30 Aug 2022 01:01:47 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 9533494000C; Mon, 29 Aug 2022 21:01:39 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 8FF9F940007; Mon, 29 Aug 2022 21:01:39 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 7000294000E; Mon, 29 Aug 2022 21:01:39 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0015.hostedemail.com [216.40.44.15]) by kanga.kvack.org (Postfix) with ESMTP id 40B1994000C for ; Mon, 29 Aug 2022 21:01:39 -0400 (EDT) Received: from smtpin10.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay07.hostedemail.com (Postfix) with ESMTP id 10F611603D4 for ; Tue, 30 Aug 2022 01:01:39 +0000 (UTC) X-FDA: 79854456318.10.24F9B53 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf31.hostedemail.com (Postfix) with ESMTP id 84A8820028 for ; Tue, 30 Aug 2022 01:01:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1661821298; x=1693357298; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=jd8VhWR6+MxI2xd+VMMl3UWdT0J2xvWmPhIEJUGOumA=; b=CsczzA2Q8eDJX0ENrP7p3ZvCNCuTRcEgxEUggA6pRASS/aAOsf0c6Sf3 zljCiPSjzyrbC1H32JWRKIclHox0yGUdfmB1rzAHqK2riGsCF1hpTxe1n vtumJQeLw9i82kHmdOPwTJ5KGB+2+JznPMjcf/b6by77l5oMYS8Cf+DOu O8K55SHCQ4OQ40pBKlQ97KMudJUXkw6N9bFDmBcoyXoshdDNWc34ZGivo 5qBTKvK4MExrrQo9JuEfdSjfz+GqilyyK+PquACftGJLte1gYGq2FUkxv 5Lp77EJlU9XoawZ6gpmJGOm6S+sqlaErzAgsf6NqTjDEVuIK4R5TW3YFk A==; X-IronPort-AV: E=McAfee;i="6500,9779,10454"; a="282015882" X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="282015882" Received: from fmsmga007.fm.intel.com ([10.253.24.52]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:36 -0700 X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="614412094" Received: from fpalamon-mobl.amr.corp.intel.com (HELO box.shutemov.name) ([10.252.54.23]) by fmsmga007-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:33 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id 9ED8D1041A8; Tue, 30 Aug 2022 04:01:24 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, "Kirill A. Shutemov" Subject: [PATCHv8 07/11] x86: Expose untagging mask in /proc/$PID/arch_status Date: Tue, 30 Aug 2022 04:01:00 +0300 Message-Id: <20220830010104.1282-8-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> References: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1661821298; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=ZUpT27YX5llTrHeH6urwZh29rqjbnoTlZoihQgUdkw0=; b=S8Iic4FkQUi/IVWSdzrtIiy85bXheyZMFORcGURz9S1mCTdjaFkejUbvluVQj3WwW9rT3S adIH0K3+3nfA9AyVr5nCSCAujwimp6j5kk3PzLf7SFY4MIRY4FKrFqfYFfncLEkFUWNtFO fmHwXds94M/+GAUQmt0r/NTWQkcXA/o= ARC-Authentication-Results: i=1; imf31.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=CsczzA2Q; spf=none (imf31.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.20) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none) ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1661821298; a=rsa-sha256; cv=none; b=hTqt12NGDsTo9mNqt2SisyvYD0oBkPwmtEdVHcMYwWhpVJ20Lcnr9kOXZ+KzSrbHtujyU7 DOe0R18XSnMvJnUoCjtA2Y8K/O3xZlz6n03Wn4XXRFJYcX0yyDm6fLox8RgQHUUO1Ae5l/ Y251JRa0QF/U23kV9vo+gYXkKGnPL40= X-Rspam-User: X-Rspamd-Queue-Id: 84A8820028 X-Stat-Signature: 8bc6c74aam1kdump3jrj7663upzrck7n Authentication-Results: imf31.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=CsczzA2Q; spf=none (imf31.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.20) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none) X-Rspamd-Server: rspam08 X-HE-Tag: 1661821298-992716 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: Add a line in /proc/$PID/arch_status to report untag_mask. It can be used to find out LAM status of the process from the outside. It is useful for debuggers. Signed-off-by: Kirill A. Shutemov Tested-by: Alexander Potapenko Acked-by: Peter Zijlstra (Intel) --- arch/x86/include/asm/mmu_context.h | 10 +++++ arch/x86/kernel/Makefile | 2 + arch/x86/kernel/fpu/xstate.c | 47 ----------------------- arch/x86/kernel/proc.c | 60 ++++++++++++++++++++++++++++++ 4 files changed, 72 insertions(+), 47 deletions(-) create mode 100644 arch/x86/kernel/proc.c diff --git a/arch/x86/include/asm/mmu_context.h b/arch/x86/include/asm/mmu_context.h index 5bd3d46685dc..b0e9ea23758b 100644 --- a/arch/x86/include/asm/mmu_context.h +++ b/arch/x86/include/asm/mmu_context.h @@ -103,6 +103,11 @@ static inline void dup_lam(struct mm_struct *oldmm, struct mm_struct *mm) mm->context.untag_mask = oldmm->context.untag_mask; } +static inline unsigned long mm_untag_mask(struct mm_struct *mm) +{ + return mm->context.untag_mask; +} + static inline void mm_reset_untag_mask(struct mm_struct *mm) { mm->context.untag_mask = -1UL; @@ -119,6 +124,11 @@ static inline void dup_lam(struct mm_struct *oldmm, struct mm_struct *mm) { } +static inline unsigned long mm_untag_mask(struct mm_struct *mm) +{ + return -1UL; +} + static inline void mm_reset_untag_mask(struct mm_struct *mm) { } diff --git a/arch/x86/kernel/Makefile b/arch/x86/kernel/Makefile index a20a5ebfacd7..fada0e36031b 100644 --- a/arch/x86/kernel/Makefile +++ b/arch/x86/kernel/Makefile @@ -139,6 +139,8 @@ obj-$(CONFIG_UNWINDER_GUESS) += unwind_guess.o obj-$(CONFIG_AMD_MEM_ENCRYPT) += sev.o +obj-$(CONFIG_PROC_FS) += proc.o + ### # 64 bit specific files ifeq ($(CONFIG_X86_64),y) diff --git a/arch/x86/kernel/fpu/xstate.c b/arch/x86/kernel/fpu/xstate.c index c8340156bfd2..838a6f0627fd 100644 --- a/arch/x86/kernel/fpu/xstate.c +++ b/arch/x86/kernel/fpu/xstate.c @@ -10,8 +10,6 @@ #include #include #include -#include -#include #include #include @@ -1745,48 +1743,3 @@ long fpu_xstate_prctl(int option, unsigned long arg2) return -EINVAL; } } - -#ifdef CONFIG_PROC_PID_ARCH_STATUS -/* - * Report the amount of time elapsed in millisecond since last AVX512 - * use in the task. - */ -static void avx512_status(struct seq_file *m, struct task_struct *task) -{ - unsigned long timestamp = READ_ONCE(task->thread.fpu.avx512_timestamp); - long delta; - - if (!timestamp) { - /* - * Report -1 if no AVX512 usage - */ - delta = -1; - } else { - delta = (long)(jiffies - timestamp); - /* - * Cap to LONG_MAX if time difference > LONG_MAX - */ - if (delta < 0) - delta = LONG_MAX; - delta = jiffies_to_msecs(delta); - } - - seq_put_decimal_ll(m, "AVX512_elapsed_ms:\t", delta); - seq_putc(m, '\n'); -} - -/* - * Report architecture specific information - */ -int proc_pid_arch_status(struct seq_file *m, struct pid_namespace *ns, - struct pid *pid, struct task_struct *task) -{ - /* - * Report AVX512 state if the processor and build option supported. - */ - if (cpu_feature_enabled(X86_FEATURE_AVX512F)) - avx512_status(m, task); - - return 0; -} -#endif /* CONFIG_PROC_PID_ARCH_STATUS */ diff --git a/arch/x86/kernel/proc.c b/arch/x86/kernel/proc.c new file mode 100644 index 000000000000..9765b4d05ce4 --- /dev/null +++ b/arch/x86/kernel/proc.c @@ -0,0 +1,60 @@ +// SPDX-License-Identifier: GPL-2.0-only +#include +#include +#include +#include +#include + +/* + * Report the amount of time elapsed in millisecond since last AVX512 + * use in the task. + */ +static void avx512_status(struct seq_file *m, struct task_struct *task) +{ + unsigned long timestamp = READ_ONCE(task->thread.fpu.avx512_timestamp); + long delta; + + if (!timestamp) { + /* + * Report -1 if no AVX512 usage + */ + delta = -1; + } else { + delta = (long)(jiffies - timestamp); + /* + * Cap to LONG_MAX if time difference > LONG_MAX + */ + if (delta < 0) + delta = LONG_MAX; + delta = jiffies_to_msecs(delta); + } + + seq_put_decimal_ll(m, "AVX512_elapsed_ms:\t", delta); + seq_putc(m, '\n'); +} + +/* + * Report architecture specific information + */ +int proc_pid_arch_status(struct seq_file *m, struct pid_namespace *ns, + struct pid *pid, struct task_struct *task) +{ + struct mm_struct *mm; + unsigned long untag_mask = -1UL; + + /* + * Report AVX512 state if the processor and build option supported. + */ + if (cpu_feature_enabled(X86_FEATURE_AVX512F)) + avx512_status(m, task); + + mm = get_task_mm(task); + if (mm) { + untag_mask = mm_untag_mask(task->mm); + mmput(mm); + } + + seq_printf(m, "untag_mask:\t%#lx\n", untag_mask); + + return 0; +} From patchwork Tue Aug 30 01:01:01 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12958590 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 0059CC0502C for ; Tue, 30 Aug 2022 01:01:49 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B757C94000F; Mon, 29 Aug 2022 21:01:40 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id AF98E940007; Mon, 29 Aug 2022 21:01:40 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 8D7C294000F; Mon, 29 Aug 2022 21:01:40 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0013.hostedemail.com [216.40.44.13]) by kanga.kvack.org (Postfix) with ESMTP id 76DF6940007 for ; Mon, 29 Aug 2022 21:01:40 -0400 (EDT) Received: from smtpin18.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay04.hostedemail.com (Postfix) with ESMTP id 544541A0B34 for ; Tue, 30 Aug 2022 01:01:40 +0000 (UTC) X-FDA: 79854456360.18.9AC68FA Received: from mga12.intel.com (mga12.intel.com [192.55.52.136]) by imf27.hostedemail.com (Postfix) with ESMTP id A55614005F for ; Tue, 30 Aug 2022 01:01:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1661821299; x=1693357299; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=zPu33O1T6sljBPKjIPbbTHJ1NfVpNGPZfq20bC8/1jM=; b=ZMWRy3EPLb/xzBylj9JzebHGEvzUeIYQQHE5Vq/rGC+tfihrz9ZMjx+n DO+skQVLyvyALcOAaETT3pw0lfVVPAAuBRxdzPKUUZSLnJtUFyDNLJC8K 2Go7OD0/Hz5KFySbtM5ekobCXsdPEXcqq8UmlYhzxiIs5dseqzt3bHiSn +hdFor9SlJDva5Xys6+VFnHeUdnBl1LEy7TgsF21d9G88WSYPO9ew+CGe xbDlwA4XLJbcdqJkcro1QT4XsYkrAAoJi0XEF9eNNf5cEKNwKEDu1+ry4 cwX9QG8bnfdIZZzWFbDfXpL/CK6hR7i5mryGhb0DsOpiISeqfmbF2hQiI A==; X-IronPort-AV: E=McAfee;i="6500,9779,10454"; a="274793660" X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="274793660" Received: from orsmga008.jf.intel.com ([10.7.209.65]) by fmsmga106.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:38 -0700 X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="641155109" Received: from fpalamon-mobl.amr.corp.intel.com (HELO box.shutemov.name) ([10.252.54.23]) by orsmga008-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:33 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id A7FD01041E2; Tue, 30 Aug 2022 04:01:24 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Weihong Zhang , "Kirill A . Shutemov" Subject: [PATCHv8 08/11] selftests/x86/lam: Add malloc and tag-bits test cases for linear-address masking Date: Tue, 30 Aug 2022 04:01:01 +0300 Message-Id: <20220830010104.1282-9-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> References: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1661821300; a=rsa-sha256; cv=none; b=h+nmOjnU4snBvdiBrkHTZ/meJyEILnYBLWLoWpYYDz7L6cHwvAAfgEtO45zGSaAaXT79n0 48rJVIUfRlqR5EsvRrSn7Nng7ju/OR7TJ/2KtQwFunwqhDnomv2xhMlFNamTfrHyBoy9Or EroKSwl4NLy1tyCUoY7GKguRa9BpkX8= ARC-Authentication-Results: i=1; imf27.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=ZMWRy3EP; spf=none (imf27.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.136) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none) ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1661821300; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=wORd8+rAE/qgM6efl0xzFe3Jzp3uv5G57A3pBUZB7YI=; b=MvtwpbcHWhfa49SICvHWsGoRMyRQa4goFgXM6i3CyAb9Bha91IeJSHkO5cmlWZ5ysZGZ58 YeKG9wJEpfzGtI6bB8u6hXBsAX4cMFM2ZCpmMlWa2tH6jZepg9O5mtq2aOlrddjX2Rnb+h rOLntInhJRsJqH3tykQeGuyqQoY4g7w= Authentication-Results: imf27.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=ZMWRy3EP; spf=none (imf27.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.136) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none) X-Rspamd-Server: rspam05 X-Stat-Signature: b96s8yqfzxt4ffapwkz86kzjkz8aw43b X-Rspamd-Queue-Id: A55614005F X-Rspam-User: X-HE-Tag: 1661821299-660570 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Weihong Zhang LAM is supported only in 64-bit mode and applies only addresses used for data accesses. In 64-bit mode, linear address have 64 bits. LAM is applied to 64-bit linear address and allow software to use high bits for metadata. LAM supports configurations that differ regarding which pointer bits are masked and can be used for metadata. LAM includes following mode: - LAM_U57, pointer bits in positions 62:57 are masked (LAM width 6), allows bits 62:57 of a user pointer to be used as metadata. There are some arch_prctls: ARCH_ENABLE_TAGGED_ADDR: enable LAM mode, mask high bits of a user pointer. ARCH_GET_UNTAG_MASK: get current untagged mask. ARCH_GET_MAX_TAG_BITS: the maximum tag bits user can request. zero if LAM is not supported. The LAM mode is for pre-process, a process has only one chance to set LAM mode. But there is no API to disable LAM mode. So all of test cases are run under child process. Functions of this test: MALLOC - LAM_U57 masks bits 57:62 of a user pointer. Process on user space can dereference such pointers. - Disable LAM, dereference a pointer with metadata above 48 bit or 57 bit lead to trigger SIGSEGV. TAG_BITS - Max tag bits of LAM_U57 is 6. Signed-off-by: Weihong Zhang Signed-off-by: Kirill A. Shutemov Acked-by: Peter Zijlstra (Intel) --- tools/testing/selftests/x86/Makefile | 2 +- tools/testing/selftests/x86/lam.c | 326 +++++++++++++++++++++++++++ 2 files changed, 327 insertions(+), 1 deletion(-) create mode 100644 tools/testing/selftests/x86/lam.c diff --git a/tools/testing/selftests/x86/Makefile b/tools/testing/selftests/x86/Makefile index 0388c4d60af0..c1a16a9d4f2f 100644 --- a/tools/testing/selftests/x86/Makefile +++ b/tools/testing/selftests/x86/Makefile @@ -18,7 +18,7 @@ TARGETS_C_32BIT_ONLY := entry_from_vm86 test_syscall_vdso unwind_vdso \ test_FCMOV test_FCOMI test_FISTTP \ vdso_restorer TARGETS_C_64BIT_ONLY := fsgsbase sysret_rip syscall_numbering \ - corrupt_xstate_header amx + corrupt_xstate_header amx lam # Some selftests require 32bit support enabled also on 64bit systems TARGETS_C_32BIT_NEEDED := ldt_gdt ptrace_syscall diff --git a/tools/testing/selftests/x86/lam.c b/tools/testing/selftests/x86/lam.c new file mode 100644 index 000000000000..900a3a0fb709 --- /dev/null +++ b/tools/testing/selftests/x86/lam.c @@ -0,0 +1,326 @@ +// SPDX-License-Identifier: GPL-2.0 +#include +#include +#include +#include +#include +#include +#include +#include +#include +#include + +#include "../kselftest.h" + +#ifndef __x86_64__ +# error This test is 64-bit only +#endif + +/* LAM modes, these definitions were copied from kernel code */ +#define LAM_NONE 0 +#define LAM_U57_BITS 6 + +#define LAM_U57_MASK (0x3fULL << 57) +/* arch prctl for LAM */ +#define ARCH_GET_UNTAG_MASK 0x4001 +#define ARCH_ENABLE_TAGGED_ADDR 0x4002 +#define ARCH_GET_MAX_TAG_BITS 0x4003 + +/* Specified test function bits */ +#define FUNC_MALLOC 0x1 +#define FUNC_BITS 0x2 + +#define TEST_MASK 0x3 + +#define MALLOC_LEN 32 + +struct testcases { + unsigned int later; + int expected; /* 2: SIGSEGV Error; 1: other errors */ + unsigned long lam; + uint64_t addr; + int (*test_func)(struct testcases *test); + const char *msg; +}; + +int tests_cnt; +jmp_buf segv_env; + +static void segv_handler(int sig) +{ + ksft_print_msg("Get segmentation fault(%d).", sig); + siglongjmp(segv_env, 1); +} + +static inline int cpu_has_lam(void) +{ + unsigned int cpuinfo[4]; + + __cpuid_count(0x7, 1, cpuinfo[0], cpuinfo[1], cpuinfo[2], cpuinfo[3]); + + return (cpuinfo[0] & (1 << 26)); +} + +/* + * Set tagged address and read back untag mask. + * check if the untagged mask is expected. + * + * @return: + * 0: Set LAM mode successfully + * others: failed to set LAM + */ +static int set_lam(unsigned long lam) +{ + int ret = 0; + uint64_t ptr = 0; + + if (lam != LAM_U57_BITS && lam != LAM_NONE) + return -1; + + /* Skip check return */ + syscall(SYS_arch_prctl, ARCH_ENABLE_TAGGED_ADDR, lam); + + /* Get untagged mask */ + syscall(SYS_arch_prctl, ARCH_GET_UNTAG_MASK, &ptr); + + /* Check mask returned is expected */ + if (lam == LAM_U57_BITS) + ret = (ptr != ~(LAM_U57_MASK)); + else if (lam == LAM_NONE) + ret = (ptr != -1ULL); + + return ret; +} + +static unsigned long get_default_tag_bits(void) +{ + pid_t pid; + int lam = LAM_NONE; + int ret = 0; + + pid = fork(); + if (pid < 0) { + perror("Fork failed."); + } else if (pid == 0) { + /* Set LAM mode in child process */ + if (set_lam(LAM_U57_BITS) == 0) + lam = LAM_U57_BITS; + else + lam = LAM_NONE; + exit(lam); + } else { + wait(&ret); + lam = WEXITSTATUS(ret); + } + + return lam; +} + +/* According to LAM mode, set metadata in high bits */ +static uint64_t set_metadata(uint64_t src, unsigned long lam) +{ + uint64_t metadata; + + srand(time(NULL)); + /* Get a random value as metadata */ + metadata = rand(); + + switch (lam) { + case LAM_U57_BITS: /* Set metadata in bits 62:57 */ + metadata = (src & ~(LAM_U57_MASK)) | ((metadata & 0x3f) << 57); + break; + default: + metadata = src; + break; + } + + return metadata; +} + +/* + * Set metadata in user pointer, compare new pointer with original pointer. + * both pointers should point to the same address. + * + * @return: + * 0: value on the pointer with metadate and value on original are same + * 1: not same. + */ +static int handle_lam_test(void *src, unsigned int lam) +{ + char *ptr; + + strcpy((char *)src, "USER POINTER"); + + ptr = (char *)set_metadata((uint64_t)src, lam); + if (src == ptr) + return 0; + + /* Copy a string into the pointer with metadata */ + strcpy((char *)ptr, "METADATA POINTER"); + + return (!!strcmp((char *)src, (char *)ptr)); +} + + +int handle_max_bits(struct testcases *test) +{ + unsigned long exp_bits = get_default_tag_bits(); + unsigned long bits = 0; + + if (exp_bits != LAM_NONE) + exp_bits = LAM_U57_BITS; + + /* Get LAM max tag bits */ + if (syscall(SYS_arch_prctl, ARCH_GET_MAX_TAG_BITS, &bits) == -1) + return 1; + + return (exp_bits != bits); +} + +/* + * Test lam feature through dereference pointer get from malloc. + * @return 0: Pass test. 1: Get failure during test 2: Get SIGSEGV + */ +static int handle_malloc(struct testcases *test) +{ + char *ptr = NULL; + int ret = 0; + + if (test->later == 0 && test->lam != 0) + if (set_lam(test->lam) == -1) + return 1; + + ptr = (char *)malloc(MALLOC_LEN); + if (ptr == NULL) { + perror("malloc() failure\n"); + return 1; + } + + /* Set signal handler */ + if (sigsetjmp(segv_env, 1) == 0) { + signal(SIGSEGV, segv_handler); + ret = handle_lam_test(ptr, test->lam); + } else { + ret = 2; + } + + if (test->later != 0 && test->lam != 0) + if (set_lam(test->lam) == -1 && ret == 0) + ret = 1; + + free(ptr); + + return ret; +} + +static int fork_test(struct testcases *test) +{ + int ret, child_ret; + pid_t pid; + + pid = fork(); + if (pid < 0) { + perror("Fork failed."); + ret = 1; + } else if (pid == 0) { + ret = test->test_func(test); + exit(ret); + } else { + wait(&child_ret); + ret = WEXITSTATUS(child_ret); + } + + return ret; +} + +static void run_test(struct testcases *test, int count) +{ + int i, ret = 0; + + for (i = 0; i < count; i++) { + struct testcases *t = test + i; + + /* fork a process to run test case */ + ret = fork_test(t); + if (ret != 0) + ret = (t->expected == ret); + else + ret = !(t->expected); + + tests_cnt++; + ksft_test_result(ret, t->msg); + } +} + +static struct testcases malloc_cases[] = { + { + .later = 0, + .lam = LAM_U57_BITS, + .test_func = handle_malloc, + .msg = "MALLOC: LAM_U57. Dereferencing pointer with metadata\n", + }, + { + .later = 1, + .expected = 2, + .lam = LAM_U57_BITS, + .test_func = handle_malloc, + .msg = "MALLOC:[Negative] Disable LAM. Dereferencing pointer with metadata.\n", + }, +}; + + +static struct testcases bits_cases[] = { + { + .test_func = handle_max_bits, + .msg = "BITS: Check default tag bits\n", + }, +}; + +static void cmd_help(void) +{ + printf("usage: lam [-h] [-t test list]\n"); + printf("\t-t test list: run tests specified in the test list, default:0x%x\n", TEST_MASK); + printf("\t\t0x1:malloc; 0x2:max_bits;\n"); + printf("\t-h: help\n"); +} + +int main(int argc, char **argv) +{ + int c = 0; + unsigned int tests = TEST_MASK; + + tests_cnt = 0; + + if (!cpu_has_lam()) { + ksft_print_msg("Unsupported LAM feature!\n"); + return -1; + } + + while ((c = getopt(argc, argv, "ht:")) != -1) { + switch (c) { + case 't': + tests = strtoul(optarg, NULL, 16); + if (!(tests & TEST_MASK)) { + ksft_print_msg("Invalid argument!\n"); + return -1; + } + break; + case 'h': + cmd_help(); + return 0; + default: + ksft_print_msg("Invalid argument\n"); + return -1; + } + } + + if (tests & FUNC_MALLOC) + run_test(malloc_cases, ARRAY_SIZE(malloc_cases)); + + if (tests & FUNC_BITS) + run_test(bits_cases, ARRAY_SIZE(bits_cases)); + + ksft_set_plan(tests_cnt); + + return ksft_exit_pass(); +} From patchwork Tue Aug 30 01:01:02 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12958587 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 049B4ECAAD3 for ; Tue, 30 Aug 2022 01:01:46 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 4980594000D; Mon, 29 Aug 2022 21:01:39 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 24A43940007; Mon, 29 Aug 2022 21:01:39 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 0994F94000C; Mon, 29 Aug 2022 21:01:39 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0017.hostedemail.com [216.40.44.17]) by kanga.kvack.org (Postfix) with ESMTP id EDBF9940007 for ; Mon, 29 Aug 2022 21:01:38 -0400 (EDT) Received: from smtpin31.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay03.hostedemail.com (Postfix) with ESMTP id CBEE9A04CD for ; Tue, 30 Aug 2022 01:01:38 +0000 (UTC) X-FDA: 79854456276.31.DF03721 Received: from mga03.intel.com (mga03.intel.com [134.134.136.65]) by imf24.hostedemail.com (Postfix) with ESMTP id 38BCC180012 for ; Tue, 30 Aug 2022 01:01:38 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1661821298; x=1693357298; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=paLLfnuCD3vCLKnZtUsd0mpWPekiZ/gq8Xe6lyMWCSA=; b=EZur8Cd9BcROwVmaDDK6nkqClPGoYJIKEvLugCsRCHJUFpChvfBOdt/v jplAEb1JCqevju46jGhX6VX9bvJblqlIbX3RbZcR7IvV3loEYL+jwplDH yJLxIzDv6d5baBgdBcxD6YodZtXMODbcYtyUTzLBs92ijSgsKfv+7ENKN 96issaMF7TVaoejKjfXDn57uc+XuBM3lcjWTHFH5B+0KVIBB0GKp/Rl4d lSaG76Db4EYUqcOIFMJHhcUaZ/cSoRl+CFLVz93LIosFr/7oAcDpZZS+a CDaW4r2C/pl9OAuVRw6ecSeZ/mhfa0+oQJkZvsPpmZadgFfkkiYhJOWGF Q==; X-IronPort-AV: E=McAfee;i="6500,9779,10454"; a="296322937" X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="296322937" Received: from orsmga005.jf.intel.com ([10.7.209.41]) by orsmga103.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:37 -0700 X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="787305244" Received: from fpalamon-mobl.amr.corp.intel.com (HELO box.shutemov.name) ([10.252.54.23]) by orsmga005-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:33 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id B1E3C10428A; Tue, 30 Aug 2022 04:01:24 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Weihong Zhang , "Kirill A . Shutemov" Subject: [PATCHv8 09/11] selftests/x86/lam: Add mmap and SYSCALL test cases for linear-address masking Date: Tue, 30 Aug 2022 04:01:02 +0300 Message-Id: <20220830010104.1282-10-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> References: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1661821298; a=rsa-sha256; cv=none; b=SAIzxc65Opn0KyALQkvZEY2zNik/AM0XeYYz4LFNm27hS4BWVvYDxIfTAidqHoiIKcLnKO l3Sgd3Ahwt3NQKVAqbojqe18dFdKDB5gs3SU90VxfQ9g2T6LNYPW6D2m4hMCHH8xkIIVPS /jbkZh0kqAnDReWNCGLId+w56hFrtJI= ARC-Authentication-Results: i=1; imf24.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=EZur8Cd9; spf=none (imf24.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.65) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none) ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1661821298; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=2nynb095/h4K5Rcr0D/L0f1/TgQgPtmEt2VP6orJNkk=; b=Fg1UqPTYoLOehL4xSILrA9gIO50RT1+LmxfTcWHNV8le1I/0oy6j+pFuKeud2zHcvtjGSj XwsyBkdVR5EWQXIy38a5ta+VysQ7KOF96auuEnZIYF2bVN1uv/A3SX3BLIoi4P7CuQJb9u KRFnLYudhsw/8/IapKU6ifxiYspEHig= Authentication-Results: imf24.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=EZur8Cd9; spf=none (imf24.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.65) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none) X-Rspam-User: X-Rspamd-Server: rspam11 X-Stat-Signature: 6i4of5jaqm4uiis8rb931mnyg9kg8xte X-Rspamd-Queue-Id: 38BCC180012 X-HE-Tag: 1661821298-410831 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Weihong Zhang Add mmap and SYSCALL test cases. SYSCALL test cases: - LAM supports set metadata in high bits 62:57 (LAM_U57) of a user pointer, pass the pointer to SYSCALL, SYSCALL can dereference the pointer and return correct result. - Disable LAM, pass a pointer with metadata in high bits to SYSCALL, SYSCALL returns -1 (EFAULT). MMAP test cases: - Enable LAM_U57, MMAP with low address (below bits 47), set metadata in high bits of the address, dereference the address should be allowed. - Enable LAM_U57, MMAP with high address (above bits 47), set metadata in high bits of the address, dereference the address should be allowed. Signed-off-by: Weihong Zhang Signed-off-by: Kirill A. Shutemov Acked-by: Peter Zijlstra (Intel) --- tools/testing/selftests/x86/lam.c | 135 +++++++++++++++++++++++++++++- 1 file changed, 132 insertions(+), 3 deletions(-) diff --git a/tools/testing/selftests/x86/lam.c b/tools/testing/selftests/x86/lam.c index 900a3a0fb709..b88e007ee0a3 100644 --- a/tools/testing/selftests/x86/lam.c +++ b/tools/testing/selftests/x86/lam.c @@ -7,6 +7,7 @@ #include #include #include +#include #include #include @@ -29,11 +30,18 @@ /* Specified test function bits */ #define FUNC_MALLOC 0x1 #define FUNC_BITS 0x2 +#define FUNC_MMAP 0x4 +#define FUNC_SYSCALL 0x8 -#define TEST_MASK 0x3 +#define TEST_MASK 0xf + +#define LOW_ADDR (0x1UL << 30) +#define HIGH_ADDR (0x3UL << 48) #define MALLOC_LEN 32 +#define PAGE_SIZE (4 << 10) + struct testcases { unsigned int later; int expected; /* 2: SIGSEGV Error; 1: other errors */ @@ -49,6 +57,7 @@ jmp_buf segv_env; static void segv_handler(int sig) { ksft_print_msg("Get segmentation fault(%d).", sig); + siglongjmp(segv_env, 1); } @@ -61,6 +70,16 @@ static inline int cpu_has_lam(void) return (cpuinfo[0] & (1 << 26)); } +/* Check 5-level page table feature in CPUID.(EAX=07H, ECX=00H):ECX.[bit 16] */ +static inline int cpu_has_la57(void) +{ + unsigned int cpuinfo[4]; + + __cpuid_count(0x7, 0, cpuinfo[0], cpuinfo[1], cpuinfo[2], cpuinfo[3]); + + return (cpuinfo[2] & (1 << 16)); +} + /* * Set tagged address and read back untag mask. * check if the untagged mask is expected. @@ -213,6 +232,68 @@ static int handle_malloc(struct testcases *test) return ret; } +static int handle_mmap(struct testcases *test) +{ + void *ptr; + unsigned int flags = MAP_PRIVATE | MAP_ANONYMOUS | MAP_FIXED; + int ret = 0; + + if (test->later == 0 && test->lam != 0) + if (set_lam(test->lam) != 0) + return 1; + + ptr = mmap((void *)test->addr, PAGE_SIZE, PROT_READ | PROT_WRITE, + flags, -1, 0); + if (ptr == MAP_FAILED) { + if (test->addr == HIGH_ADDR) + if (!cpu_has_la57()) + return 3; /* unsupport LA57 */ + return 1; + } + + if (test->later != 0 && test->lam != 0) + if (set_lam(test->lam) != 0) + ret = 1; + + if (ret == 0) { + if (sigsetjmp(segv_env, 1) == 0) { + signal(SIGSEGV, segv_handler); + ret = handle_lam_test(ptr, test->lam); + } else { + ret = 2; + } + } + + munmap(ptr, PAGE_SIZE); + return ret; +} + +static int handle_syscall(struct testcases *test) +{ + struct utsname unme, *pu; + int ret = 0; + + if (test->later == 0 && test->lam != 0) + if (set_lam(test->lam) != 0) + return 1; + + if (sigsetjmp(segv_env, 1) == 0) { + signal(SIGSEGV, segv_handler); + pu = (struct utsname *)set_metadata((uint64_t)&unme, test->lam); + ret = uname(pu); + if (ret < 0) + ret = 1; + } else { + ret = 2; + } + + if (test->later != 0 && test->lam != 0) + if (set_lam(test->lam) != -1 && ret == 0) + ret = 1; + + return ret; +} + static int fork_test(struct testcases *test) { int ret, child_ret; @@ -268,7 +349,6 @@ static struct testcases malloc_cases[] = { }, }; - static struct testcases bits_cases[] = { { .test_func = handle_max_bits, @@ -276,11 +356,54 @@ static struct testcases bits_cases[] = { }, }; +static struct testcases syscall_cases[] = { + { + .later = 0, + .lam = LAM_U57_BITS, + .test_func = handle_syscall, + .msg = "SYSCALL: LAM_U57. syscall with metadata\n", + }, + { + .later = 1, + .expected = 1, + .lam = LAM_U57_BITS, + .test_func = handle_syscall, + .msg = "SYSCALL:[Negative] Disable LAM. Dereferencing pointer with metadata.\n", + }, +}; + +static struct testcases mmap_cases[] = { + { + .later = 1, + .expected = 0, + .lam = LAM_U57_BITS, + .addr = HIGH_ADDR, + .test_func = handle_mmap, + .msg = "MMAP: First mmap high address, then set LAM_U57.\n", + }, + { + .later = 0, + .expected = 0, + .lam = LAM_U57_BITS, + .addr = HIGH_ADDR, + .test_func = handle_mmap, + .msg = "MMAP: First LAM_U57, then High address.\n", + }, + { + .later = 0, + .expected = 0, + .lam = LAM_U57_BITS, + .addr = LOW_ADDR, + .test_func = handle_mmap, + .msg = "MMAP: First LAM_U57, then Low address.\n", + }, +}; + static void cmd_help(void) { printf("usage: lam [-h] [-t test list]\n"); printf("\t-t test list: run tests specified in the test list, default:0x%x\n", TEST_MASK); - printf("\t\t0x1:malloc; 0x2:max_bits;\n"); + printf("\t\t0x1:malloc; 0x2:max_bits; 0x4:mmap; 0x8:syscall.\n"); printf("\t-h: help\n"); } @@ -320,6 +443,12 @@ int main(int argc, char **argv) if (tests & FUNC_BITS) run_test(bits_cases, ARRAY_SIZE(bits_cases)); + if (tests & FUNC_MMAP) + run_test(mmap_cases, ARRAY_SIZE(mmap_cases)); + + if (tests & FUNC_SYSCALL) + run_test(syscall_cases, ARRAY_SIZE(syscall_cases)); + ksft_set_plan(tests_cnt); return ksft_exit_pass(); From patchwork Tue Aug 30 01:01:03 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12958592 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id 370B6ECAAD2 for ; Tue, 30 Aug 2022 01:02:05 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id B9949940011; Mon, 29 Aug 2022 21:02:04 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id AFC5C940007; Mon, 29 Aug 2022 21:02:04 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 88A16940011; Mon, 29 Aug 2022 21:02:04 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0011.hostedemail.com [216.40.44.11]) by kanga.kvack.org (Postfix) with ESMTP id 716E2940007 for ; Mon, 29 Aug 2022 21:02:04 -0400 (EDT) Received: from smtpin30.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay08.hostedemail.com (Postfix) with ESMTP id 49112140A04 for ; Tue, 30 Aug 2022 01:02:04 +0000 (UTC) X-FDA: 79854457368.30.B6A5114 Received: from mga01.intel.com (mga01.intel.com [192.55.52.88]) by imf03.hostedemail.com (Postfix) with ESMTP id AF8A020012 for ; Tue, 30 Aug 2022 01:02:03 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1661821323; x=1693357323; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=tKXUIYG/Xz5e/k5YeoUZSvllZOQ+wWGcTbccvaDbCKc=; b=A/6kEa3mTBAoYxYMwjR5fLluN3Khtnt4tVrdCACoUQVvRYoqs/QunBNF uWYF+yIpixsMc+qSf+aOKxwod+pZiuJHLbBy/F9SBK7h21S5JD0hGTngO 4UbeTO9VJ45hDgG54OscU/sqVhFrUvKFsRBtUSMi17lUmbV0s+m569pkm ove8ojfWmyydqLxkrWuauze8MB4FQdBujXLD3RuNuvezmUBu1ml4aa8WB nYzdiEUV6iY3CKbKlcldBlIB8byKDwL9OMnpWw6OwkSlaaSIWJVhMuKvc CEr5V5OcZPpaXWonRGr1a/17aRK+xxk1gyI6GF/x+saM2oCEOHFDxz5oD w==; X-IronPort-AV: E=McAfee;i="6500,9779,10454"; a="321170269" X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="321170269" Received: from fmsmga004.fm.intel.com ([10.253.24.48]) by fmsmga101.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:37 -0700 X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="679838305" Received: from fpalamon-mobl.amr.corp.intel.com (HELO box.shutemov.name) ([10.252.54.23]) by fmsmga004-auth.fm.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:33 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id BC31C10428B; Tue, 30 Aug 2022 04:01:24 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Weihong Zhang , "Kirill A . Shutemov" Subject: [PATCHv8 10/11] selftests/x86/lam: Add io_uring test cases for linear-address masking Date: Tue, 30 Aug 2022 04:01:03 +0300 Message-Id: <20220830010104.1282-11-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> References: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Authentication-Results: i=1; imf03.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b="A/6kEa3m"; spf=none (imf03.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.88) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none) ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1661821323; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=FijMIhLfr7IFjJRcf1DUYaBmCXVqBTdBsHiL7pVG8G0=; b=dmhUCugAoakvnkqPLTC9Z+9F3PbPYk35cB/3r9qTPSBgsj2LZytwokyIPUz/4Zq3ks25O2 nXdoUWVXz913lIINf3fPtryOZWZC39Lo//4vWdhBZImVd2lCj+pDHHoO5e+XT46Y0dlzqu 8Zb6g09vuNB66Kv8G5D+XO9xEo0O+3Q= ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1661821323; a=rsa-sha256; cv=none; b=U7iP5825yBIFTyYN2kJK7ZrbIsBI2tVdL7dPpUrSaoupS42V+7fYNaxSxegokpXhlQfcau 2MshcYthVS7VmYC3tg4627zxVzysZUFCOzqECdvd9rX7gNQL+FBEmQfJ23JFz4B8/WVyXA O8It6+6CAweKeDTl2AU4uhbY8EdC12A= X-Stat-Signature: 1o3zo1z65e3drb43us5mmczbrw3tyt3z X-Rspamd-Queue-Id: AF8A020012 Authentication-Results: imf03.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b="A/6kEa3m"; spf=none (imf03.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 192.55.52.88) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none) X-Rspam-User: X-Rspamd-Server: rspam01 X-HE-Tag: 1661821323-300082 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Weihong Zhang LAM should be supported in kernel thread, using io_uring to verify LAM feature. The test cases implement read a file through io_uring, the test cases choose an iovec array as receiving buffer, which used to receive data, according to LAM mode, set metadata in high bits of these buffer. io_uring can deal with these buffers that pointed to pointers with the metadata in high bits. Signed-off-by: Weihong Zhang Signed-off-by: Kirill A. Shutemov Acked-by: Peter Zijlstra (Intel) --- tools/testing/selftests/x86/lam.c | 341 +++++++++++++++++++++++++++++- 1 file changed, 339 insertions(+), 2 deletions(-) diff --git a/tools/testing/selftests/x86/lam.c b/tools/testing/selftests/x86/lam.c index b88e007ee0a3..25e6d0fc4a83 100644 --- a/tools/testing/selftests/x86/lam.c +++ b/tools/testing/selftests/x86/lam.c @@ -9,8 +9,12 @@ #include #include #include +#include +#include #include +#include +#include #include "../kselftest.h" #ifndef __x86_64__ @@ -32,8 +36,9 @@ #define FUNC_BITS 0x2 #define FUNC_MMAP 0x4 #define FUNC_SYSCALL 0x8 +#define FUNC_URING 0x10 -#define TEST_MASK 0xf +#define TEST_MASK 0x1f #define LOW_ADDR (0x1UL << 30) #define HIGH_ADDR (0x3UL << 48) @@ -42,6 +47,13 @@ #define PAGE_SIZE (4 << 10) +#define barrier() ({ \ + __asm__ __volatile__("" : : : "memory"); \ +}) + +#define URING_QUEUE_SZ 1 +#define URING_BLOCK_SZ 2048 + struct testcases { unsigned int later; int expected; /* 2: SIGSEGV Error; 1: other errors */ @@ -51,6 +63,33 @@ struct testcases { const char *msg; }; +/* Used by CQ of uring, source file handler and file's size */ +struct file_io { + int file_fd; + off_t file_sz; + struct iovec iovecs[]; +}; + +struct io_uring_queue { + unsigned int *head; + unsigned int *tail; + unsigned int *ring_mask; + unsigned int *ring_entries; + unsigned int *flags; + unsigned int *array; + union { + struct io_uring_cqe *cqes; + struct io_uring_sqe *sqes; + } queue; + size_t ring_sz; +}; + +struct io_ring { + int ring_fd; + struct io_uring_queue sq_ring; + struct io_uring_queue cq_ring; +}; + int tests_cnt; jmp_buf segv_env; @@ -294,6 +333,285 @@ static int handle_syscall(struct testcases *test) return ret; } +int sys_uring_setup(unsigned int entries, struct io_uring_params *p) +{ + return (int)syscall(__NR_io_uring_setup, entries, p); +} + +int sys_uring_enter(int fd, unsigned int to, unsigned int min, unsigned int flags) +{ + return (int)syscall(__NR_io_uring_enter, fd, to, min, flags, NULL, 0); +} + +/* Init submission queue and completion queue */ +int mmap_io_uring(struct io_uring_params p, struct io_ring *s) +{ + struct io_uring_queue *sring = &s->sq_ring; + struct io_uring_queue *cring = &s->cq_ring; + + sring->ring_sz = p.sq_off.array + p.sq_entries * sizeof(unsigned int); + cring->ring_sz = p.cq_off.cqes + p.cq_entries * sizeof(struct io_uring_cqe); + + if (p.features & IORING_FEAT_SINGLE_MMAP) { + if (cring->ring_sz > sring->ring_sz) + sring->ring_sz = cring->ring_sz; + + cring->ring_sz = sring->ring_sz; + } + + void *sq_ptr = mmap(0, sring->ring_sz, PROT_READ | PROT_WRITE, + MAP_SHARED | MAP_POPULATE, s->ring_fd, + IORING_OFF_SQ_RING); + + if (sq_ptr == MAP_FAILED) { + perror("sub-queue!"); + return 1; + } + + void *cq_ptr = sq_ptr; + + if (!(p.features & IORING_FEAT_SINGLE_MMAP)) { + cq_ptr = mmap(0, cring->ring_sz, PROT_READ | PROT_WRITE, + MAP_SHARED | MAP_POPULATE, s->ring_fd, + IORING_OFF_CQ_RING); + if (cq_ptr == MAP_FAILED) { + perror("cpl-queue!"); + munmap(sq_ptr, sring->ring_sz); + return 1; + } + } + + sring->head = sq_ptr + p.sq_off.head; + sring->tail = sq_ptr + p.sq_off.tail; + sring->ring_mask = sq_ptr + p.sq_off.ring_mask; + sring->ring_entries = sq_ptr + p.sq_off.ring_entries; + sring->flags = sq_ptr + p.sq_off.flags; + sring->array = sq_ptr + p.sq_off.array; + + /* Map a queue as mem map */ + s->sq_ring.queue.sqes = mmap(0, p.sq_entries * sizeof(struct io_uring_sqe), + PROT_READ | PROT_WRITE, MAP_SHARED | MAP_POPULATE, + s->ring_fd, IORING_OFF_SQES); + if (s->sq_ring.queue.sqes == MAP_FAILED) { + munmap(sq_ptr, sring->ring_sz); + if (sq_ptr != cq_ptr) { + ksft_print_msg("failed to mmap uring queue!"); + munmap(cq_ptr, cring->ring_sz); + return 1; + } + } + + cring->head = cq_ptr + p.cq_off.head; + cring->tail = cq_ptr + p.cq_off.tail; + cring->ring_mask = cq_ptr + p.cq_off.ring_mask; + cring->ring_entries = cq_ptr + p.cq_off.ring_entries; + cring->queue.cqes = cq_ptr + p.cq_off.cqes; + + return 0; +} + +/* Init io_uring queues */ +int setup_io_uring(struct io_ring *s) +{ + struct io_uring_params para; + + memset(¶, 0, sizeof(para)); + s->ring_fd = sys_uring_setup(URING_QUEUE_SZ, ¶); + if (s->ring_fd < 0) + return 1; + + return mmap_io_uring(para, s); +} + +/* + * Get data from completion queue. the data buffer saved the file data + * return 0: success; others: error; + */ +int handle_uring_cq(struct io_ring *s) +{ + struct file_io *fi = NULL; + struct io_uring_queue *cring = &s->cq_ring; + struct io_uring_cqe *cqe; + unsigned int head; + off_t len = 0; + + head = *cring->head; + + do { + barrier(); + if (head == *cring->tail) + break; + /* Get the entry */ + cqe = &cring->queue.cqes[head & *s->cq_ring.ring_mask]; + fi = (struct file_io *)cqe->user_data; + if (cqe->res < 0) + break; + + int blocks = (int)(fi->file_sz + URING_BLOCK_SZ - 1) / URING_BLOCK_SZ; + + for (int i = 0; i < blocks; i++) + len += fi->iovecs[i].iov_len; + + head++; + } while (1); + + *cring->head = head; + barrier(); + + return (len != fi->file_sz); +} + +/* + * Submit squeue. specify via IORING_OP_READV. + * the buffer need to be set metadata according to LAM mode + */ +int handle_uring_sq(struct io_ring *ring, struct file_io *fi, unsigned long lam) +{ + int file_fd = fi->file_fd; + struct io_uring_queue *sring = &ring->sq_ring; + unsigned int index = 0, cur_block = 0, tail = 0, next_tail = 0; + struct io_uring_sqe *sqe; + + off_t remain = fi->file_sz; + int blocks = (int)(remain + URING_BLOCK_SZ - 1) / URING_BLOCK_SZ; + + while (remain) { + off_t bytes = remain; + void *buf; + + if (bytes > URING_BLOCK_SZ) + bytes = URING_BLOCK_SZ; + + fi->iovecs[cur_block].iov_len = bytes; + + if (posix_memalign(&buf, URING_BLOCK_SZ, URING_BLOCK_SZ)) + return 1; + + fi->iovecs[cur_block].iov_base = (void *)set_metadata((uint64_t)buf, lam); + remain -= bytes; + cur_block++; + } + + next_tail = *sring->tail; + tail = next_tail; + next_tail++; + + barrier(); + + index = tail & *ring->sq_ring.ring_mask; + + sqe = &ring->sq_ring.queue.sqes[index]; + sqe->fd = file_fd; + sqe->flags = 0; + sqe->opcode = IORING_OP_READV; + sqe->addr = (unsigned long)fi->iovecs; + sqe->len = blocks; + sqe->off = 0; + sqe->user_data = (uint64_t)fi; + + sring->array[index] = index; + tail = next_tail; + + if (*sring->tail != tail) { + *sring->tail = tail; + barrier(); + } + + if (sys_uring_enter(ring->ring_fd, 1, 1, IORING_ENTER_GETEVENTS) < 0) + return 1; + + return 0; +} + +/* + * Test LAM in async I/O and io_uring, read current binery through io_uring + * Set metadata in pointers to iovecs buffer. + */ +int do_uring(unsigned long lam) +{ + struct io_ring *ring; + struct file_io *fi; + struct stat st; + int ret = 1; + char path[PATH_MAX]; + + /* get current process path */ + if (readlink("/proc/self/exe", path, PATH_MAX) <= 0) + return 1; + + int file_fd = open(path, O_RDONLY); + + if (file_fd < 0) + return 1; + + if (fstat(file_fd, &st) < 0) + return 1; + + off_t file_sz = st.st_size; + + int blocks = (int)(file_sz + URING_BLOCK_SZ - 1) / URING_BLOCK_SZ; + + fi = malloc(sizeof(*fi) + sizeof(struct iovec) * blocks); + if (!fi) + return 1; + + fi->file_sz = file_sz; + fi->file_fd = file_fd; + + ring = malloc(sizeof(*ring)); + if (!ring) + return 1; + + memset(ring, 0, sizeof(struct io_ring)); + + if (setup_io_uring(ring)) + goto out; + + if (handle_uring_sq(ring, fi, lam)) + goto out; + + ret = handle_uring_cq(ring); + +out: + free(ring); + + for (int i = 0; i < blocks; i++) { + if (fi->iovecs[i].iov_base) { + uint64_t addr = ((uint64_t)fi->iovecs[i].iov_base); + + switch (lam) { + case LAM_U57_BITS: /* Clear bits 62:57 */ + addr = (addr & ~(0x3fULL << 57)); + break; + } + free((void *)addr); + fi->iovecs[i].iov_base = NULL; + } + } + + free(fi); + + return ret; +} + +int handle_uring(struct testcases *test) +{ + int ret = 0; + + if (test->later == 0 && test->lam != 0) + if (set_lam(test->lam) != 0) + return 1; + + if (sigsetjmp(segv_env, 1) == 0) { + signal(SIGSEGV, segv_handler); + ret = do_uring(test->lam); + } else { + ret = 2; + } + + return ret; +} + static int fork_test(struct testcases *test) { int ret, child_ret; @@ -333,6 +651,22 @@ static void run_test(struct testcases *test, int count) } } +static struct testcases uring_cases[] = { + { + .later = 0, + .lam = LAM_U57_BITS, + .test_func = handle_uring, + .msg = "URING: LAM_U57. Dereferencing pointer with metadata\n", + }, + { + .later = 1, + .expected = 1, + .lam = LAM_U57_BITS, + .test_func = handle_uring, + .msg = "URING:[Negative] Disable LAM. Dereferencing pointer with metadata.\n", + }, +}; + static struct testcases malloc_cases[] = { { .later = 0, @@ -403,7 +737,7 @@ static void cmd_help(void) { printf("usage: lam [-h] [-t test list]\n"); printf("\t-t test list: run tests specified in the test list, default:0x%x\n", TEST_MASK); - printf("\t\t0x1:malloc; 0x2:max_bits; 0x4:mmap; 0x8:syscall.\n"); + printf("\t\t0x1:malloc; 0x2:max_bits; 0x4:mmap; 0x8:syscall; 0x10:io_uring.\n"); printf("\t-h: help\n"); } @@ -449,6 +783,9 @@ int main(int argc, char **argv) if (tests & FUNC_SYSCALL) run_test(syscall_cases, ARRAY_SIZE(syscall_cases)); + if (tests & FUNC_URING) + run_test(uring_cases, ARRAY_SIZE(uring_cases)); + ksft_set_plan(tests_cnt); return ksft_exit_pass(); From patchwork Tue Aug 30 01:01:04 2022 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: "Kirill A. Shutemov" X-Patchwork-Id: 12958589 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by smtp.lore.kernel.org (Postfix) with ESMTP id A3FC0ECAAD3 for ; Tue, 30 Aug 2022 01:01:48 +0000 (UTC) Received: by kanga.kvack.org (Postfix) id 2E73894000E; Mon, 29 Aug 2022 21:01:40 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 29E75940007; Mon, 29 Aug 2022 21:01:40 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 1114794000E; Mon, 29 Aug 2022 21:01:40 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from relay.hostedemail.com (smtprelay0012.hostedemail.com [216.40.44.12]) by kanga.kvack.org (Postfix) with ESMTP id F3852940007 for ; Mon, 29 Aug 2022 21:01:39 -0400 (EDT) Received: from smtpin29.hostedemail.com (a10.router.float.18 [10.200.18.1]) by unirelay01.hostedemail.com (Postfix) with ESMTP id CFFB51C5BD1 for ; Tue, 30 Aug 2022 01:01:39 +0000 (UTC) X-FDA: 79854456318.29.B91AC14 Received: from mga02.intel.com (mga02.intel.com [134.134.136.20]) by imf31.hostedemail.com (Postfix) with ESMTP id 3559A2004A for ; Tue, 30 Aug 2022 01:01:39 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=intel.com; i=@intel.com; q=dns/txt; s=Intel; t=1661821299; x=1693357299; h=from:to:cc:subject:date:message-id:in-reply-to: references:mime-version:content-transfer-encoding; bh=4VaUqe2NAYSwSGWHP2GlkAzpRNxCF7gpiQRDfS8DC1s=; b=IDRHK3Uk7cJEs+ki5dyMHe/vQoZaygvQGpaSk4m29Q/GudOYfU/lGWK2 WF2nyCpX2QebgEAZOIAbmxVeuM82ZPVCFg8ScT8Ea0Gcw5JUwdZ2PG79R ozKz1D6bh2M4A+h+hLvKEXAsGsxxiVZfoMNt8zERa4PWjhcfd01erhXth LH1RTdjgzc24MBx0tYQpM554c6Lrl2AHoY7jT2zD3I4Fw4gKfGU0dYuos H4+0iCM9bo726J12oEn7ni9sw1I/Pik3B+ySieo2Td6iqvJhZ9P+kq0uu GGhKgPc8xliRXJfV4jJdU9X59XzHFjytRa+f9UR11TcZniIOGqLqygt7o A==; X-IronPort-AV: E=McAfee;i="6500,9779,10454"; a="282015891" X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="282015891" Received: from orsmga006.jf.intel.com ([10.7.209.51]) by orsmga101.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:37 -0700 X-IronPort-AV: E=Sophos;i="5.93,273,1654585200"; d="scan'208";a="588386803" Received: from fpalamon-mobl.amr.corp.intel.com (HELO box.shutemov.name) ([10.252.54.23]) by orsmga006-auth.jf.intel.com with ESMTP/TLS/ECDHE-RSA-AES256-GCM-SHA384; 29 Aug 2022 18:01:33 -0700 Received: by box.shutemov.name (Postfix, from userid 1000) id C691210428C; Tue, 30 Aug 2022 04:01:24 +0300 (+03) From: "Kirill A. Shutemov" To: Dave Hansen , Andy Lutomirski , Peter Zijlstra Cc: x86@kernel.org, Kostya Serebryany , Andrey Ryabinin , Andrey Konovalov , Alexander Potapenko , Taras Madan , Dmitry Vyukov , "H . J . Lu" , Andi Kleen , Rick Edgecombe , linux-mm@kvack.org, linux-kernel@vger.kernel.org, Weihong Zhang , "Kirill A . Shutemov" Subject: [PATCHv8 11/11] selftests/x86/lam: Add inherit test cases for linear-address masking Date: Tue, 30 Aug 2022 04:01:04 +0300 Message-Id: <20220830010104.1282-12-kirill.shutemov@linux.intel.com> X-Mailer: git-send-email 2.35.1 In-Reply-To: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> References: <20220830010104.1282-1-kirill.shutemov@linux.intel.com> MIME-Version: 1.0 ARC-Message-Signature: i=1; a=rsa-sha256; c=relaxed/relaxed; d=hostedemail.com; s=arc-20220608; t=1661821299; h=from:from:sender:reply-to:subject:subject:date:date: message-id:message-id:to:to:cc:cc:mime-version:mime-version: content-type:content-transfer-encoding:content-transfer-encoding: in-reply-to:in-reply-to:references:references:dkim-signature; bh=/UTn8/5psiAx3BgvFZcaYA443QDpEWyuY9IsVLa0sjs=; b=XMer2SR6huaE/EU5aZi6QXGcEWFdJ0TycTTOao3pLjqzYurJPdiFwroIFEzvnlH0f+7U1d x0rfsyOQ1VPnhzp8X0bplRumT2aSZD7p1lp9CLKLBj9Kr7uVKi19JIfncfOSixW8kiVty5 Lm3kiVbGMmYaTc7wEsT0pNievteFEJY= ARC-Authentication-Results: i=1; imf31.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=IDRHK3Uk; spf=none (imf31.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.20) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none) ARC-Seal: i=1; s=arc-20220608; d=hostedemail.com; t=1661821299; a=rsa-sha256; cv=none; b=BMOL0m00dhGXBFueI5rZNZFq749ZBaq1Wiz8ULlKtGez1WiN/5JJo4XnKH3wK+mk//GdfS 772zmYSM/8JYb07MHfP7UpdV1ulDRr9wEqGCw5HuC1DDrFIIAamDeuTCIMmRcTTCwxPDsV fxitkUQgXyvfP47UK5G5VSSTepXAx5s= X-Rspam-User: X-Rspamd-Queue-Id: 3559A2004A X-Stat-Signature: s4am5wx5a6fo9inn6xwzin59z38uwoqt Authentication-Results: imf31.hostedemail.com; dkim=none ("invalid DKIM record") header.d=intel.com header.s=Intel header.b=IDRHK3Uk; spf=none (imf31.hostedemail.com: domain of kirill.shutemov@linux.intel.com has no SPF policy when checking 134.134.136.20) smtp.mailfrom=kirill.shutemov@linux.intel.com; dmarc=fail reason="No valid SPF" header.from=intel.com (policy=none) X-Rspamd-Server: rspam08 X-HE-Tag: 1661821299-537987 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Weihong Zhang LAM is enabled per-thread and gets inherited on fork(2)/clone(2). exec() reverts LAM status to the default disabled state. There are two test scenarios: - Fork test cases: These cases were used to test the inheritance of LAM for per-thread, Child process generated by fork() should inherit LAM feature from parent process, Child process can get the LAM mode same as parent process. - Execve test cases: Processes generated by execve() are different from processes generated by fork(), these processes revert LAM status to disabled status. Signed-off-by: Weihong Zhang Signed-off-by: Kirill A. Shutemov Acked-by: Peter Zijlstra (Intel) --- tools/testing/selftests/x86/lam.c | 125 +++++++++++++++++++++++++++++- 1 file changed, 121 insertions(+), 4 deletions(-) diff --git a/tools/testing/selftests/x86/lam.c b/tools/testing/selftests/x86/lam.c index 25e6d0fc4a83..0d76f883efdd 100644 --- a/tools/testing/selftests/x86/lam.c +++ b/tools/testing/selftests/x86/lam.c @@ -37,8 +37,9 @@ #define FUNC_MMAP 0x4 #define FUNC_SYSCALL 0x8 #define FUNC_URING 0x10 +#define FUNC_INHERITE 0x20 -#define TEST_MASK 0x1f +#define TEST_MASK 0x3f #define LOW_ADDR (0x1UL << 30) #define HIGH_ADDR (0x3UL << 48) @@ -174,6 +175,28 @@ static unsigned long get_default_tag_bits(void) return lam; } +/* + * Set tagged address and read back untag mask. + * check if the untag mask is expected. + */ +static int get_lam(void) +{ + uint64_t ptr = 0; + int ret = -1; + /* Get untagged mask */ + if (syscall(SYS_arch_prctl, ARCH_GET_UNTAG_MASK, &ptr) == -1) + return -1; + + /* Check mask returned is expected */ + if (ptr == ~(LAM_U57_MASK)) + ret = LAM_U57_BITS; + else if (ptr == -1ULL) + ret = LAM_NONE; + + + return ret; +} + /* According to LAM mode, set metadata in high bits */ static uint64_t set_metadata(uint64_t src, unsigned long lam) { @@ -581,7 +604,7 @@ int do_uring(unsigned long lam) switch (lam) { case LAM_U57_BITS: /* Clear bits 62:57 */ - addr = (addr & ~(0x3fULL << 57)); + addr = (addr & ~(LAM_U57_MASK)); break; } free((void *)addr); @@ -632,6 +655,72 @@ static int fork_test(struct testcases *test) return ret; } +static int handle_execve(struct testcases *test) +{ + int ret, child_ret; + int lam = test->lam; + pid_t pid; + + pid = fork(); + if (pid < 0) { + perror("Fork failed."); + ret = 1; + } else if (pid == 0) { + char path[PATH_MAX]; + + /* Set LAM mode in parent process */ + if (set_lam(lam) != 0) + return 1; + + /* Get current binary's path and the binary was run by execve */ + if (readlink("/proc/self/exe", path, PATH_MAX) <= 0) + exit(-1); + + /* run binary to get LAM mode and return to parent process */ + if (execlp(path, path, "-t 0x0", NULL) < 0) { + perror("error on exec"); + exit(-1); + } + } else { + wait(&child_ret); + ret = WEXITSTATUS(child_ret); + if (ret != LAM_NONE) + return 1; + } + + return 0; +} + +static int handle_inheritance(struct testcases *test) +{ + int ret, child_ret; + int lam = test->lam; + pid_t pid; + + /* Set LAM mode in parent process */ + if (set_lam(lam) != 0) + return 1; + + pid = fork(); + if (pid < 0) { + perror("Fork failed."); + return 1; + } else if (pid == 0) { + /* Set LAM mode in parent process */ + int child_lam = get_lam(); + + exit(child_lam); + } else { + wait(&child_ret); + ret = WEXITSTATUS(child_ret); + + if (lam != ret) + return 1; + } + + return 0; +} + static void run_test(struct testcases *test, int count) { int i, ret = 0; @@ -733,11 +822,26 @@ static struct testcases mmap_cases[] = { }, }; +static struct testcases inheritance_cases[] = { + { + .expected = 0, + .lam = LAM_U57_BITS, + .test_func = handle_inheritance, + .msg = "FORK: LAM_U57, child process should get LAM mode same as parent\n", + }, + { + .expected = 0, + .lam = LAM_U57_BITS, + .test_func = handle_execve, + .msg = "EXECVE: LAM_U57, child process should get disabled LAM mode\n", + }, +}; + static void cmd_help(void) { printf("usage: lam [-h] [-t test list]\n"); printf("\t-t test list: run tests specified in the test list, default:0x%x\n", TEST_MASK); - printf("\t\t0x1:malloc; 0x2:max_bits; 0x4:mmap; 0x8:syscall; 0x10:io_uring.\n"); + printf("\t\t0x1:malloc; 0x2:max_bits; 0x4:mmap; 0x8:syscall; 0x10:io_uring; 0x20:inherit;\n"); printf("\t-h: help\n"); } @@ -757,7 +861,7 @@ int main(int argc, char **argv) switch (c) { case 't': tests = strtoul(optarg, NULL, 16); - if (!(tests & TEST_MASK)) { + if (tests && !(tests & TEST_MASK)) { ksft_print_msg("Invalid argument!\n"); return -1; } @@ -771,6 +875,16 @@ int main(int argc, char **argv) } } + /* + * When tests is 0, it is not a real test case; + * the option used by test case(execve) to check the lam mode in + * process generated by execve, the process read back lam mode and + * check with lam mode in parent process. + */ + if (!tests) + return (get_lam()); + + /* Run test cases */ if (tests & FUNC_MALLOC) run_test(malloc_cases, ARRAY_SIZE(malloc_cases)); @@ -786,6 +900,9 @@ int main(int argc, char **argv) if (tests & FUNC_URING) run_test(uring_cases, ARRAY_SIZE(uring_cases)); + if (tests & FUNC_INHERITE) + run_test(inheritance_cases, ARRAY_SIZE(inheritance_cases)); + ksft_set_plan(tests_cnt); return ksft_exit_pass();