From patchwork Tue Aug 27 13:08:27 2024
X-Patchwork-Submitter: Alex Bennée
X-Patchwork-Id: 13779530
From: Alex Bennée
To: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, maz@kernel.org, arnd@linaro.org, D Scott Phillips, Alex Bennée
Subject: [PATCH 1/3] ampere/arm64: Add a fixup handler for alignment faults in aarch64 code
Date: Tue, 27 Aug 2024 14:08:27 +0100
Message-Id: <20240827130829.43632-2-alex.bennee@linaro.org>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20240827130829.43632-1-alex.bennee@linaro.org>
References: <20240827130829.43632-1-alex.bennee@linaro.org>
X-Mailing-List: kvm@vger.kernel.org

From: D Scott Phillips

A later patch will hand out Device memory in some cases to code which
expects a Normal memory type, as an errata workaround. Unaligned
accesses to Device memory will fault though, so here we add a fixup
handler to emulate faulting accesses, at a performance penalty.

Many of the instructions in the Loads and Stores group are supported,
but these groups are not handled here:

 * Advanced SIMD load/store multiple structures
 * Advanced SIMD load/store multiple structures (post-indexed)
 * Advanced SIMD load/store single structure
 * Advanced SIMD load/store single structure (post-indexed)
 * Load/store memory tags
 * Load/store exclusive
 * LDAPR/STLR (unscaled immediate)
 * Load register (literal) [cannot Alignment fault]
 * Load/store register (unprivileged)
 * Atomic memory operations
 * Load/store register (pac)

Instruction implementations are translated from the Exploration tools'
ASL specifications.
Upstream-Status: Pending
Signed-off-by: D Scott Phillips
[AJB: fix align_ldst_regoff_simdfp]
Signed-off-by: Alex Bennée
---
v2
  - fix handling of some registers
vAJB:
  - fix align_ldst_regoff_simdfp
  - fix scale calculation (ternary instead of |)
  - don't skip n == t && n != 31 (not relevant to simd/fp)
  - check for invalid option<1>
  - expand opc & 0x2 check to include size
  - add failure pr_warn to fixup_alignment
---
 arch/arm64/include/asm/insn.h |   1 +
 arch/arm64/mm/Makefile        |   3 +-
 arch/arm64/mm/fault.c         | 721 ++++++++++++++++++++++++++++++++++
 arch/arm64/mm/fault_neon.c    |  59 +++
 4 files changed, 783 insertions(+), 1 deletion(-)
 create mode 100644 arch/arm64/mm/fault_neon.c

diff --git a/arch/arm64/include/asm/insn.h b/arch/arm64/include/asm/insn.h
index 8c0a36f72d6fc..d6e926b5046c1 100644
--- a/arch/arm64/include/asm/insn.h
+++ b/arch/arm64/include/asm/insn.h
@@ -431,6 +431,7 @@ __AARCH64_INSN_FUNCS(clrex, 0xFFFFF0FF, 0xD503305F)
 __AARCH64_INSN_FUNCS(ssbb, 0xFFFFFFFF, 0xD503309F)
 __AARCH64_INSN_FUNCS(pssbb, 0xFFFFFFFF, 0xD503349F)
 __AARCH64_INSN_FUNCS(bti, 0xFFFFFF3F, 0xD503241f)
+__AARCH64_INSN_FUNCS(dc_zva, 0xFFFFFFE0, 0xD50B7420)

 #undef __AARCH64_INSN_FUNCS

diff --git a/arch/arm64/mm/Makefile b/arch/arm64/mm/Makefile
index 60454256945b8..05f1ac75e315c 100644
--- a/arch/arm64/mm/Makefile
+++ b/arch/arm64/mm/Makefile
@@ -1,5 +1,5 @@
 # SPDX-License-Identifier: GPL-2.0
-obj-y				:= dma-mapping.o extable.o fault.o init.o \
+obj-y				:= dma-mapping.o extable.o fault.o fault_neon.o init.o \
 				   cache.o copypage.o flush.o \
 				   ioremap.o mmap.o pgd.o mmu.o \
 				   context.o proc.o pageattr.o fixmap.o
@@ -13,5 +13,6 @@
 obj-$(CONFIG_DEBUG_VIRTUAL)	+= physaddr.o
 obj-$(CONFIG_ARM64_MTE)		+= mteswap.o
 KASAN_SANITIZE_physaddr.o	:= n
+CFLAGS_REMOVE_fault_neon.o	+= -mgeneral-regs-only
 obj-$(CONFIG_KASAN)		+= kasan_init.o
 KASAN_SANITIZE_kasan_init.o	:= n

diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 451ba7cbd5adb..744e7b1664b1c 100644
--- a/arch/arm64/mm/fault.c
+++
b/arch/arm64/mm/fault.c @@ -5,6 +5,7 @@ * Copyright (C) 1995 Linus Torvalds * Copyright (C) 1995-2004 Russell King * Copyright (C) 2012 ARM Ltd. + * Copyright (C) 2020 Ampere Computing LLC */ #include @@ -42,8 +43,10 @@ #include #include #include +#include struct fault_info { + /* fault handler, return 0 on successful handling */ int (*fn)(unsigned long far, unsigned long esr, struct pt_regs *regs); int sig; @@ -693,9 +696,727 @@ static int __kprobes do_translation_fault(unsigned long far, return 0; } +static int copy_from_user_io(void *to, const void __user *from, unsigned long n) +{ + const u8 __user *src = from; + u8 *dest = to; + + for (; n; n--) + if (get_user(*dest++, src++)) + break; + return n; +} + +static int copy_to_user_io(void __user *to, const void *from, unsigned long n) +{ + const u8 *src = from; + u8 __user *dest = to; + + for (; n; n--) + if (put_user(*src++, dest++)) + break; + return n; +} + +static int align_load(unsigned long addr, int sz, u64 *out) +{ + union { + u8 d8; + u16 d16; + u32 d32; + u64 d64; + char c[8]; + } data; + + if (sz != 1 && sz != 2 && sz != 4 && sz != 8) + return 1; + if (is_ttbr0_addr(addr)) { + if (copy_from_user_io(data.c, (const void __user *)addr, sz)) + return 1; + } else + memcpy_fromio(data.c, (const void __iomem *)addr, sz); + switch (sz) { + case 1: + *out = data.d8; + break; + case 2: + *out = data.d16; + break; + case 4: + *out = data.d32; + break; + case 8: + *out = data.d64; + break; + default: + return 1; + } + return 0; +} + +static int align_store(unsigned long addr, int sz, u64 val) +{ + union { + u8 d8; + u16 d16; + u32 d32; + u64 d64; + char c[8]; + } data; + + switch (sz) { + case 1: + data.d8 = val; + break; + case 2: + data.d16 = val; + break; + case 4: + data.d32 = val; + break; + case 8: + data.d64 = val; + break; + default: + return 1; + } + if (is_ttbr0_addr(addr)) { + if (copy_to_user_io((void __user *)addr, data.c, sz)) + return 1; + } else + memcpy_toio((void __iomem *)addr, data.c, sz); + 
return 0;
+}
+
+static int align_dc_zva(unsigned long addr, struct pt_regs *regs)
+{
+	int bs = read_cpuid(DCZID_EL0) & 0xf;
+	int sz = 1 << (bs + 2);
+
+	addr &= ~(sz - 1);
+	if (is_ttbr0_addr(addr)) {
+		for (; sz; sz--, addr++) {
+			if (align_store(addr, 1, 0))
+				return 1;
+		}
+	} else
+		memset_io((void __iomem *)addr, 0, sz);
+	return 0;
+}
+
+extern u64 __arm64_get_vn_dt(int n, int t);
+extern void __arm64_set_vn_dt(int n, int t, u64 val);
+
+#define get_vn_dt __arm64_get_vn_dt
+#define set_vn_dt __arm64_set_vn_dt
+
+static int align_ldst_pair(u32 insn, struct pt_regs *regs)
+{
+	const u32 OPC = GENMASK(31, 30);
+	const u32 L_MASK = BIT(22);
+
+	int opc = FIELD_GET(OPC, insn);
+	int L = FIELD_GET(L_MASK, insn);
+
+	bool wback = !!(insn & BIT(23));
+	bool postindex = !(insn & BIT(24));
+
+	int n = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RN, insn);
+	int t = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn);
+	int t2 = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT2, insn);
+	bool is_store = !L;
+	bool is_signed = !!(opc & 1);
+	int scale = 2 + (opc >> 1);
+	int datasize = 8 << scale;
+	u64 uoffset = aarch64_insn_decode_immediate(AARCH64_INSN_IMM_7, insn);
+	s64 offset = sign_extend64(uoffset, 6) << scale;
+	u64 address;
+	u64 data1, data2;
+	u64 dbytes;
+
+	if ((is_store && (opc & 1)) || opc == 3)
+		return 1;
+
+	if (wback && (t == n || t2 == n) && n != 31)
+		return 1;
+
+	if (!is_store && t == t2)
+		return 1;
+
+	dbytes = datasize / 8;
+
+	address = regs_get_register(regs, n << 3);
+
+	if (!postindex)
+		address += offset;
+
+	if (is_store) {
+		data1 = pt_regs_read_reg(regs, t);
+		data2 = pt_regs_read_reg(regs, t2);
+		if (align_store(address, dbytes, data1) ||
+		    align_store(address + dbytes, dbytes, data2))
+			return 1;
+	} else {
+		if (align_load(address, dbytes, &data1) ||
+		    align_load(address + dbytes, dbytes, &data2))
+			return 1;
+		if (is_signed) {
+			data1 = sign_extend64(data1, datasize - 1);
+			data2 = sign_extend64(data2, datasize - 1);
+		}
+
pt_regs_write_reg(regs, t, data1); + pt_regs_write_reg(regs, t2, data2); + } + + if (wback) { + if (postindex) + address += offset; + if (n == 31) + regs->sp = address; + else + pt_regs_write_reg(regs, n, address); + } + + return 0; +} + +static int align_ldst_pair_simdfp(u32 insn, struct pt_regs *regs) +{ + const u32 OPC = GENMASK(31, 30); + const u32 L_MASK = BIT(22); + + int opc = FIELD_GET(OPC, insn); + int L = FIELD_GET(L_MASK, insn); + + bool wback = !!(insn & BIT(23)); + bool postindex = !(insn & BIT(24)); + + int n = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RN, insn); + int t = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn); + int t2 = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT2, insn); + bool is_store = !L; + int scale = 2 + opc; + int datasize = 8 << scale; + u64 uoffset = aarch64_insn_decode_immediate(AARCH64_INSN_IMM_7, insn); + s64 offset = sign_extend64(uoffset, 6) << scale; + u64 address; + u64 data1_d0, data1_d1, data2_d0, data2_d1; + u64 dbytes; + + if (opc == 0x3) + return 1; + + if (!is_store && t == t2) + return 1; + + dbytes = datasize / 8; + + address = regs_get_register(regs, n << 3); + + if (!postindex) + address += offset; + + if (is_store) { + data1_d0 = get_vn_dt(t, 0); + data2_d0 = get_vn_dt(t2, 0); + if (datasize == 128) { + data1_d1 = get_vn_dt(t, 1); + data2_d1 = get_vn_dt(t2, 1); + if (align_store(address, 8, data1_d0) || + align_store(address + 8, 8, data1_d1) || + align_store(address + 16, 8, data2_d0) || + align_store(address + 24, 8, data2_d1)) + return 1; + } else { + if (align_store(address, dbytes, data1_d0) || + align_store(address + dbytes, dbytes, data2_d0)) + return 1; + } + } else { + if (datasize == 128) { + if (align_load(address, 8, &data1_d0) || + align_load(address + 8, 8, &data1_d1) || + align_load(address + 16, 8, &data2_d0) || + align_load(address + 24, 8, &data2_d1)) + return 1; + } else { + if (align_load(address, dbytes, &data1_d0) || + align_load(address + dbytes, dbytes, 
&data2_d0)) + return 1; + data1_d1 = data2_d1 = 0; + } + set_vn_dt(t, 0, data1_d0); + set_vn_dt(t, 1, data1_d1); + set_vn_dt(t2, 0, data2_d0); + set_vn_dt(t2, 1, data2_d1); + } + + if (wback) { + if (postindex) + address += offset; + if (n == 31) + regs->sp = address; + else + pt_regs_write_reg(regs, n, address); + } + + return 0; +} + +static int align_ldst_regoff(u32 insn, struct pt_regs *regs) +{ + const u32 SIZE = GENMASK(31, 30); + const u32 OPC = GENMASK(23, 22); + const u32 OPTION = GENMASK(15, 13); + const u32 S = BIT(12); + + u32 size = FIELD_GET(SIZE, insn); + u32 opc = FIELD_GET(OPC, insn); + u32 option = FIELD_GET(OPTION, insn); + u32 s = FIELD_GET(S, insn); + int scale = size; + int extend_len = (option & 0x1) ? 64 : 32; + bool extend_unsigned = !(option & 0x4); + int shift = s ? scale : 0; + + int n = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RN, insn); + int t = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn); + int m = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RM, insn); + bool is_store; + bool is_signed; + int regsize; + int datasize; + u64 offset; + u64 address; + u64 data; + + if ((opc & 0x2) == 0) { + /* store or zero-extending load */ + is_store = !(opc & 0x1); + regsize = size == 0x3 ? 64 : 32; + is_signed = false; + } else { + if (size == 0x3) { + if ((opc & 0x1) == 0) { + /* prefetch */ + return 0; + } else { + /* undefined */ + return 1; + } + } else { + /* sign-extending load */ + is_store = false; + if (size == 0x2 && (opc & 0x1) == 0x1) { + /* undefined */ + return 1; + } + regsize = (opc & 0x1) == 0x1 ? 
32 : 64;
+			is_signed = true;
+		}
+	}
+
+	datasize = 8 << scale;
+
+	if (n == t && n != 31)
+		return 1;
+
+	offset = pt_regs_read_reg(regs, m);
+	if (extend_len == 32) {
+		offset &= (u32)~0;
+		if (!extend_unsigned)
+			offset = sign_extend64(offset, 31);
+	}
+	offset <<= shift;
+
+	address = regs_get_register(regs, n << 3) + offset;
+
+	if (is_store) {
+		data = pt_regs_read_reg(regs, t);
+		if (align_store(address, datasize / 8, data))
+			return 1;
+	} else {
+		if (align_load(address, datasize / 8, &data))
+			return 1;
+		if (is_signed) {
+			if (regsize == 32)
+				data = sign_extend32(data, datasize - 1);
+			else
+				data = sign_extend64(data, datasize - 1);
+		}
+		pt_regs_write_reg(regs, t, data);
+	}
+
+	return 0;
+}
+
+static int align_ldst_regoff_simdfp(u32 insn, struct pt_regs *regs)
+{
+	const u32 SIZE = GENMASK(31, 30);
+	const u32 OPC = GENMASK(23, 22);
+	const u32 OPTION = GENMASK(15, 13);
+	const u32 S = BIT(12);
+
+	u32 size = FIELD_GET(SIZE, insn);
+	u32 opc = FIELD_GET(OPC, insn);
+	u32 option = FIELD_GET(OPTION, insn);
+	u32 s = FIELD_GET(S, insn);
+	/* this elides the 8/16 bit sign extensions */
+	int extend_len = (option & 0x1) ? 64 : 32;
+	bool extend_unsigned = !(option & 0x4);
+
+	int n = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RN, insn);
+	int t = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn);
+	int m = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RM, insn);
+	bool is_store = !(opc & BIT(0));
+	int scale;
+	int shift;
+	int datasize;
+	u64 offset;
+	u64 address;
+	u64 data_d0, data_d1;
+
+	/* if option<1> == '0' then UNDEFINED; // sub-word index */
+	if ((option & 0x2) == 0) {
+		pr_warn("option<1> == 0 is UNDEFINED\n");
+		return 1;
+	}
+
+	/* if opc<1> == '1' && size != '00' then UNDEFINED; */
+	if ((opc & 0x2) && size != 0b00) {
+		pr_warn("opc<1> == '1' && size != '00' is UNDEFINED\n");
+		return 1;
+	}
+
+	/*
+	 * constant integer scale = if opc<1> == '1' then 4 else UInt(size);
+	 */
+	scale = opc & 0x2 ? 4 : size;
+	shift = s ?
scale : 0;
+
+	datasize = 8 << scale;
+
+	offset = pt_regs_read_reg(regs, m);
+	if (extend_len == 32) {
+		offset &= (u32)~0;
+		if (!extend_unsigned)
+			offset = sign_extend64(offset, 31);
+	}
+	offset <<= shift;
+
+	address = regs_get_register(regs, n << 3) + offset;
+
+	if (is_store) {
+		data_d0 = get_vn_dt(t, 0);
+		if (datasize == 128) {
+			data_d1 = get_vn_dt(t, 1);
+			if (align_store(address, 8, data_d0) ||
+			    align_store(address + 8, 8, data_d1))
+				return 1;
+		} else {
+			if (align_store(address, datasize / 8, data_d0))
+				return 1;
+		}
+	} else {
+		if (datasize == 128) {
+			if (align_load(address, 8, &data_d0) ||
+			    align_load(address + 8, 8, &data_d1))
+				return 1;
+		} else {
+			if (align_load(address, datasize / 8, &data_d0))
+				return 1;
+			data_d1 = 0;
+		}
+		set_vn_dt(t, 0, data_d0);
+		set_vn_dt(t, 1, data_d1);
+	}
+
+	return 0;
+}
+
+static int align_ldst_imm(u32 insn, struct pt_regs *regs)
+{
+	const u32 SIZE = GENMASK(31, 30);
+	const u32 OPC = GENMASK(23, 22);
+
+	u32 size = FIELD_GET(SIZE, insn);
+	u32 opc = FIELD_GET(OPC, insn);
+	bool wback = !(insn & BIT(24)) && !!(insn & BIT(10));
+	bool postindex = wback && !(insn & BIT(11));
+	int scale = size;
+	u64 offset;
+
+	int n = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RN, insn);
+	int t = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn);
+	bool is_store;
+	bool is_signed;
+	int regsize;
+	int datasize;
+	u64 address;
+	u64 data;
+
+	if (!(insn & BIT(24))) {
+		u64 uoffset =
+			aarch64_insn_decode_immediate(AARCH64_INSN_IMM_9, insn);
+		offset = sign_extend64(uoffset, 8);
+	} else {
+		offset = aarch64_insn_decode_immediate(AARCH64_INSN_IMM_12, insn);
+		offset <<= scale;
+	}
+
+	if ((opc & 0x2) == 0) {
+		/* store or zero-extending load */
+		is_store = !(opc & 0x1);
+		regsize = size == 0x3 ?
64 : 32; + is_signed = false; + } else { + if (size == 0x3) { + if (FIELD_GET(GENMASK(11, 10), insn) == 0 && (opc & 0x1) == 0) { + /* prefetch */ + return 0; + } else { + /* undefined */ + return 1; + } + } else { + /* sign-extending load */ + is_store = false; + if (size == 0x2 && (opc & 0x1) == 0x1) { + /* undefined */ + return 1; + } + regsize = (opc & 0x1) == 0x1 ? 32 : 64; + is_signed = true; + } + } + + datasize = 8 << scale; + + if (n == t && n != 31) + return 1; + + address = regs_get_register(regs, n << 3); + + if (!postindex) + address += offset; + + if (is_store) { + data = pt_regs_read_reg(regs, t); + if (align_store(address, datasize / 8, data)) + return 1; + } else { + if (align_load(address, datasize / 8, &data)) + return 1; + if (is_signed) { + if (regsize == 32) + data = sign_extend32(data, datasize - 1); + else + data = sign_extend64(data, datasize - 1); + } + pt_regs_write_reg(regs, t, data); + } + + if (wback) { + if (postindex) + address += offset; + if (n == 31) + regs->sp = address; + else + pt_regs_write_reg(regs, n, address); + } + + return 0; +} + +static int align_ldst_imm_simdfp(u32 insn, struct pt_regs *regs) +{ + const u32 SIZE = GENMASK(31, 30); + const u32 OPC = GENMASK(23, 22); + + u32 size = FIELD_GET(SIZE, insn); + u32 opc = FIELD_GET(OPC, insn); + bool wback = !(insn & BIT(24)) && !!(insn & BIT(10)); + bool postindex = wback && !(insn & BIT(11)); + int scale = (opc & 0x2) << 1 | size; + u64 offset; + + int n = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RN, insn); + int t = aarch64_insn_decode_register(AARCH64_INSN_REGTYPE_RT, insn); + bool is_store = !(opc & BIT(0)) ; + int datasize; + u64 address; + u64 data_d0, data_d1; + + if (scale > 4) + return 1; + + if (!(insn & BIT(24))) { + u64 uoffset = + aarch64_insn_decode_immediate(AARCH64_INSN_IMM_9, insn); + offset = sign_extend64(uoffset, 8); + } else { + offset = aarch64_insn_decode_immediate(AARCH64_INSN_IMM_12, insn); + offset <<= scale; + } + + datasize = 8 << scale; + 
+ address = regs_get_register(regs, n << 3); + + if (!postindex) + address += offset; + + if (is_store) { + data_d0 = get_vn_dt(t, 0); + if (datasize == 128) { + data_d1 = get_vn_dt(t, 1); + if (align_store(address, 8, data_d0) || + align_store(address + 8, 8, data_d1)) + return 1; + } else { + if (align_store(address, datasize / 8, data_d0)) + return 1; + } + } else { + if (datasize == 128) { + if (align_load(address, 8, &data_d0) || + align_load(address + 8, 8, &data_d1)) + return 1; + } else { + if (align_load(address, datasize / 8, &data_d0)) + return 1; + data_d1 = 0; + } + set_vn_dt(t, 0, data_d0); + set_vn_dt(t, 1, data_d1); + } + + if (wback) { + if (postindex) + address += offset; + if (n == 31) + regs->sp = address; + else + pt_regs_write_reg(regs, n, address); + } + + return 0; +} + +static int align_ldst(u32 insn, struct pt_regs *regs) +{ + const u32 op0 = FIELD_GET(GENMASK(31, 28), insn); + const u32 op1 = FIELD_GET(BIT(26), insn); + const u32 op2 = FIELD_GET(GENMASK(24, 23), insn); + const u32 op3 = FIELD_GET(GENMASK(21, 16), insn); + const u32 op4 = FIELD_GET(GENMASK(11, 10), insn); + + if ((op0 & 0x3) == 0x2) { + /* + * |------+-----+-----+-----+-----+-----------------------------------------| + * | op0 | op1 | op2 | op3 | op4 | Decode group | + * |------+-----+-----+-----+-----+-----------------------------------------| + * | xx10 | - | 00 | - | - | Load/store no-allocate pair (offset) | + * | xx10 | - | 01 | - | - | Load/store register pair (post-indexed) | + * | xx10 | - | 10 | - | - | Load/store register pair (offset) | + * | xx10 | - | 11 | - | - | Load/store register pair (pre-indexed) | + * |------+-----+-----+-----+-----+-----------------------------------------| + */ + + if (op1 == 0) { /* V == 0 */ + /* general */ + return align_ldst_pair(insn, regs); + } else { + /* simdfp */ + return align_ldst_pair_simdfp(insn, regs); + } + } else if ((op0 & 0x3) == 0x3 && + (((op2 & 0x2) == 0 && (op3 & 0x20) == 0 && op4 != 0x2) || + ((op2 & 0x2) == 
0x2))) { + /* + * |------+-----+-----+--------+-----+----------------------------------------------| + * | op0 | op1 | op2 | op3 | op4 | Decode group | + * |------+-----+-----+--------+-----+----------------------------------------------| + * | xx11 | - | 0x | 0xxxxx | 00 | Load/store register (unscaled immediate) | + * | xx11 | - | 0x | 0xxxxx | 01 | Load/store register (immediate post-indexed) | + * | xx11 | - | 0x | 0xxxxx | 11 | Load/store register (immediate pre-indexed) | + * | xx11 | - | 1x | - | - | Load/store register (unsigned immediate) | + * |------+-----+-----+--------+-----+----------------------------------------------| + */ + + if (op1 == 0) { /* V == 0 */ + /* general */ + return align_ldst_imm(insn, regs); + } else { + /* simdfp */ + return align_ldst_imm_simdfp(insn, regs); + } + } else if ((op0 & 0x3) == 0x3 && (op2 & 0x2) == 0 && + (op3 & 0x20) == 0x20 && op4 == 0x2) { + /* + * |------+-----+-----+--------+-----+---------------------------------------| + * | op0 | op1 | op2 | op3 | op4 | | + * |------+-----+-----+--------+-----+---------------------------------------| + * | xx11 | - | 0x | 1xxxxx | 10 | Load/store register (register offset) | + * |------+-----+-----+--------+-----+---------------------------------------| + */ + if (op1 == 0) { /* V == 0 */ + /* general */ + return align_ldst_regoff(insn, regs); + } else { + /* simdfp */ + return align_ldst_regoff_simdfp(insn, regs); + } + } else + return 1; +} + +static int fixup_alignment(unsigned long addr, unsigned int esr, + struct pt_regs *regs) +{ + u32 insn; + int res; + + if (user_mode(regs)) { + __le32 insn_le; + + if (!is_ttbr0_addr(addr)) + return 1; + + if (get_user(insn_le, + (__le32 __user *)instruction_pointer(regs))) + return 1; + insn = le32_to_cpu(insn_le); + } else { + if (aarch64_insn_read((void *)instruction_pointer(regs), &insn)) + return 1; + } + + if (aarch64_insn_is_class_branch_sys(insn)) { + if (aarch64_insn_is_dc_zva(insn)) + res = align_dc_zva(addr, regs); + else + 
res = 1;
+	} else if (((insn >> 25) & 0x5) == 0x4) {
+		res = align_ldst(insn, regs);
+	} else {
+		res = 1;
+	}
+
+	if (!res)
+		instruction_pointer_set(regs, instruction_pointer(regs) + 4);
+	else
+		pr_warn("%s: failed to fixup 0x%08x\n", __func__, insn);
+
+	return res;
+}
+
 static int do_alignment_fault(unsigned long far, unsigned long esr,
 			      struct pt_regs *regs)
 {
+#ifdef CONFIG_ALTRA_ERRATUM_82288
+	if (!fixup_alignment(far, esr, regs))
+		return 0;
+#endif
 	if (IS_ENABLED(CONFIG_COMPAT_ALIGNMENT_FIXUPS) &&
 	    compat_user_mode(regs))
 		return do_compat_alignment_fixup(far, regs);
diff --git a/arch/arm64/mm/fault_neon.c b/arch/arm64/mm/fault_neon.c
new file mode 100644
index 0000000000000..d5319ed07d89b
--- /dev/null
+++ b/arch/arm64/mm/fault_neon.c
@@ -0,0 +1,59 @@
+/*
+ * These functions require asimd, which is not accepted by Clang in normal
+ * kernel code, which is compiled with -mgeneral-regs-only. GCC will somehow
+ * eat it regardless, but we want it to be portable, so move these into their
+ * own translation unit. This allows us to turn off -mgeneral-regs-only for
+ * these (where it should be harmless) without risking the compiler doing
+ * wrong things in places where we don't want it to.
+ *
+ * Otherwise this is identical to the original patch.
+ *
+ * -- q66
+ *
+ */
+
+#include
+
+u64 __arm64_get_vn_dt(int n, int t) {
+	u64 res;
+
+	switch (n) {
+#define V(n) \
+	case n: \
+		asm("cbnz %w1, 1f\n\t" \
+		    "mov %0, v"#n".d[0]\n\t" \
+		    "b 2f\n\t" \
+		    "1: mov %0, v"#n".d[1]\n\t" \
+		    "2:" : "=r" (res) : "r" (t)); \
+		break
+	V( 0); V( 1); V( 2); V( 3); V( 4); V( 5); V( 6); V( 7);
+	V( 8); V( 9); V(10); V(11); V(12); V(13); V(14); V(15);
+	V(16); V(17); V(18); V(19); V(20); V(21); V(22); V(23);
+	V(24); V(25); V(26); V(27); V(28); V(29); V(30); V(31);
+#undef V
+	default:
+		res = 0;
+		break;
+	}
+	return res;
+}
+
+void __arm64_set_vn_dt(int n, int t, u64 val) {
+	switch (n) {
+#define V(n) \
+	case n: \
+		asm("cbnz %w1, 1f\n\t" \
+		    "mov v"#n".d[0], %0\n\t" \
+		    "b 2f\n\t" \
+		    "1: mov v"#n".d[1], %0\n\t" \
+		    "2:" :: "r" (val), "r" (t)); \
+		break
+	V( 0); V( 1); V( 2); V( 3); V( 4); V( 5); V( 6); V( 7);
+	V( 8); V( 9); V(10); V(11); V(12); V(13); V(14); V(15);
+	V(16); V(17); V(18); V(19); V(20); V(21); V(22); V(23);
+	V(24); V(25); V(26); V(27); V(28); V(29); V(30); V(31);
+#undef V
+	default:
+		break;
+	}
+}

From patchwork Tue Aug 27 13:08:28 2024
X-Patchwork-Submitter: Alex Bennée
X-Patchwork-Id: 13779528
From: Alex Bennée
To: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, maz@kernel.org, arnd@linaro.org, D Scott Phillips, Alex Bennée
Subject: [PATCH 2/3] ampere/arm64: Work around Ampere Altra erratum #82288 PCIE_65
Date: Tue, 27 Aug 2024 14:08:28 +0100
Message-Id: <20240827130829.43632-3-alex.bennee@linaro.org>
X-Mailer: git-send-email 2.39.2
In-Reply-To: <20240827130829.43632-1-alex.bennee@linaro.org>
References: <20240827130829.43632-1-alex.bennee@linaro.org>
X-Mailing-List: kvm@vger.kernel.org

From: D Scott Phillips

Altra's PCIe
controller may generate incorrect addresses when receiving writes from the CPU
with a discontiguous set of byte enables. Attempt to work around this by
handing out Device-nGnRE maps instead of Normal Non-cacheable maps for PCIe
memory areas.

Upstream-Status: Pending
Signed-off-by: D Scott Phillips
Signed-off-by: Alex Bennée
---
 arch/arm64/Kconfig               | 22 +++++++++++++++++++++-
 arch/arm64/include/asm/io.h      |  3 +++
 arch/arm64/include/asm/pgtable.h | 27 ++++++++++++++++++++++-----
 arch/arm64/mm/ioremap.c          | 27 +++++++++++++++++++++++++++
 drivers/pci/quirks.c             |  9 +++++++++
 include/asm-generic/io.h         |  4 ++++
 mm/ioremap.c                     |  2 +-
 7 files changed, 87 insertions(+), 7 deletions(-)

diff --git a/arch/arm64/Kconfig b/arch/arm64/Kconfig
index b3fc891f15442..01adb50df214e 100644
--- a/arch/arm64/Kconfig
+++ b/arch/arm64/Kconfig
@@ -440,6 +440,27 @@ config AMPERE_ERRATUM_AC03_CPU_38
 config ARM64_WORKAROUND_CLEAN_CACHE
 	bool
 
+config ALTRA_ERRATUM_82288
+	bool "Ampere Altra: 82288: PCIE_65: PCIe Root Port outbound write combining issue"
+	default y
+	help
+	  This option adds an alternative code sequence to work around
+	  Ampere Altra erratum 82288.
+
+	  PCIe device drivers may map MMIO space with the Normal, non-cacheable
+	  memory attribute (e.g. Linux kernel drivers mapping MMIO using
+	  ioremap_wc), for example to enable write combining or unaligned
+	  accesses. This can result in data corruption on the PCIe interface's
+	  outbound MMIO writes due to issues with the write-combining operation.
+
+	  The workaround modifies software that maps PCIe MMIO space as Normal,
+	  non-cacheable memory (e.g. ioremap_wc) to use Device, non-gathering
+	  memory (e.g. ioremap) instead. All memory operations on PCIe MMIO
+	  space must then be strictly aligned.
+
+	  If unsure, say Y.
+
 config ARM64_ERRATUM_826319
 	bool "Cortex-A53: 826319: System might deadlock if a write cannot complete until read data is accepted"
 	default y
@@ -2388,4 +2409,3 @@ endmenu # "CPU Power Management"
 
 source "drivers/acpi/Kconfig"
 source "arch/arm64/kvm/Kconfig"
-
diff --git a/arch/arm64/include/asm/io.h b/arch/arm64/include/asm/io.h
index 41fd90895dfc3..403b65f2f44de 100644
--- a/arch/arm64/include/asm/io.h
+++ b/arch/arm64/include/asm/io.h
@@ -273,6 +273,9 @@ __iowrite64_copy(void __iomem *to, const void *from, size_t count)
 
 #define ioremap_prot ioremap_prot
 
+pgprot_t ioremap_map_prot(phys_addr_t phys_addr, size_t size, unsigned long prot);
+#define ioremap_map_prot ioremap_map_prot
+
 #define _PAGE_IOREMAP PROT_DEVICE_nGnRE
 
 #define ioremap_wc(addr, size)	\
diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index 7a4f5604be3f7..f4603924390eb 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -236,11 +236,6 @@ static inline pte_t pte_mkyoung(pte_t pte)
 	return set_pte_bit(pte, __pgprot(PTE_AF));
 }
 
-static inline pte_t pte_mkspecial(pte_t pte)
-{
-	return set_pte_bit(pte, __pgprot(PTE_SPECIAL));
-}
-
 static inline pte_t pte_mkcont(pte_t pte)
 {
 	pte = set_pte_bit(pte, __pgprot(PTE_CONT));
@@ -682,6 +677,28 @@ static inline bool pud_table(pud_t pud) { return true; }
 		 PUD_TYPE_TABLE)
 #endif
 
+#ifdef CONFIG_ALTRA_ERRATUM_82288
+extern bool __read_mostly have_altra_erratum_82288;
+#endif
+
+static inline pte_t pte_mkspecial(pte_t pte)
+{
+#ifdef CONFIG_ALTRA_ERRATUM_82288
+	phys_addr_t phys = __pte_to_phys(pte);
+	pgprot_t prot = __pgprot(pte_val(pte) & ~PTE_ADDR_LOW);
+
+	if (unlikely(have_altra_erratum_82288) &&
+	    (phys < 0x80000000 ||
+	     (phys >= 0x200000000000 && phys < 0x400000000000) ||
+	     (phys >= 0x600000000000 && phys < 0x800000000000))) {
+		pte = __pte(__phys_to_pte_val(phys) | pgprot_val(pgprot_device(prot)));
+	}
+#endif
+
+	return set_pte_bit(pte, __pgprot(PTE_SPECIAL));
+}
+
+
 extern pgd_t
init_pg_dir[];
 extern pgd_t init_pg_end[];
 extern pgd_t swapper_pg_dir[];
diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
index 269f2f63ab7dc..8965766181359 100644
--- a/arch/arm64/mm/ioremap.c
+++ b/arch/arm64/mm/ioremap.c
@@ -3,6 +3,33 @@
 #include
 #include
 
+#ifdef CONFIG_ALTRA_ERRATUM_82288
+
+bool have_altra_erratum_82288 __read_mostly;
+EXPORT_SYMBOL(have_altra_erratum_82288);
+
+static bool is_altra_pci(phys_addr_t phys_addr, size_t size)
+{
+	phys_addr_t end = phys_addr + size;
+
+	return (phys_addr < 0x80000000 ||
+		(end > 0x200000000000 && phys_addr < 0x400000000000) ||
+		(end > 0x600000000000 && phys_addr < 0x800000000000));
+}
+#endif
+
+pgprot_t ioremap_map_prot(phys_addr_t phys_addr, size_t size,
+			  unsigned long prot_val)
+{
+	pgprot_t prot = __pgprot(prot_val);
+
+#ifdef CONFIG_ALTRA_ERRATUM_82288
+	if (unlikely(have_altra_erratum_82288 && is_altra_pci(phys_addr, size))) {
+		prot = pgprot_device(prot);
+	}
+#endif
+	return prot;
+}
+
 void __iomem *ioremap_prot(phys_addr_t phys_addr, size_t size,
 			   unsigned long prot)
 {
diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index a2ce4e08edf5a..8baf90ee3357c 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -6234,6 +6234,15 @@ DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0xa73f, dpc_log_size);
 DECLARE_PCI_FIXUP_HEADER(PCI_VENDOR_ID_INTEL, 0xa76e, dpc_log_size);
 #endif
 
+#ifdef CONFIG_ALTRA_ERRATUM_82288
+static void quirk_altra_erratum_82288(struct pci_dev *dev)
+{
+	pr_info_once("Write combining PCI maps disabled due to hardware erratum\n");
+	have_altra_erratum_82288 = true;
+}
+DECLARE_PCI_FIXUP_EARLY(PCI_VENDOR_ID_AMPERE, 0xe100, quirk_altra_erratum_82288);
+#endif
+
 /*
  * For a PCI device with multiple downstream devices, its driver may use
  * a flattened device tree to describe the downstream devices.
diff --git a/include/asm-generic/io.h b/include/asm-generic/io.h
index 80de699bf6af4..75670d7094537 100644
--- a/include/asm-generic/io.h
+++ b/include/asm-generic/io.h
@@ -1047,6 +1047,10 @@ static inline void iounmap(volatile void __iomem *addr)
 #elif defined(CONFIG_GENERIC_IOREMAP)
 #include
 
+#ifndef ioremap_map_prot
+#define ioremap_map_prot(phys_addr, size, prot) __pgprot(prot)
+#endif
+
 void __iomem *generic_ioremap_prot(phys_addr_t phys_addr, size_t size,
 				   pgprot_t prot);
diff --git a/mm/ioremap.c b/mm/ioremap.c
index 3e049dfb28bd0..a4e6950682f33 100644
--- a/mm/ioremap.c
+++ b/mm/ioremap.c
@@ -52,7 +52,7 @@ void __iomem *generic_ioremap_prot(phys_addr_t phys_addr, size_t size,
 void __iomem *ioremap_prot(phys_addr_t phys_addr, size_t size,
 			   unsigned long prot)
 {
-	return generic_ioremap_prot(phys_addr, size, __pgprot(prot));
+	return generic_ioremap_prot(phys_addr, size, ioremap_map_prot(phys_addr, size, prot));
 }
 EXPORT_SYMBOL(ioremap_prot);
 #endif

From patchwork Tue Aug 27 13:08:29 2024
From: Alex Bennée
To: linux-kernel@vger.kernel.org
Cc: kvm@vger.kernel.org, linux-arm-kernel@lists.infradead.org, kvmarm@lists.linux.dev, maz@kernel.org, arnd@linaro.org, Alex Bennée
Subject: [PATCH 3/3] ampere/arm64: instrument the altra workarounds
Date: Tue, 27 Aug 2024 14:08:29 +0100
Message-Id: <20240827130829.43632-4-alex.bennee@linaro.org>
In-Reply-To: <20240827130829.43632-1-alex.bennee@linaro.org>
References: <20240827130829.43632-1-alex.bennee@linaro.org>

This is mostly a debugging aid to measure the impact of workarounds.
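[Editorial note: on a kernel built with this series and CONFIG_ALTRA_ERRATUM_82288=y, the new events would be consumed through the standard tracefs interface; the paths below are the usual tracefs layout, nothing specific to this series beyond the altra_fixup event group name.]

```shell
# Standard tracefs usage; requires affected hardware running the patched kernel.
cd /sys/kernel/tracing
echo 1 > events/altra_fixup/enable   # enable all three altra_fixup events
cat trace_pipe                       # stream altra_fixup_alignment / altra_mkspecial / altra_ioremap_prot
```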
Signed-off-by: Alex Bennée
---
 arch/arm64/include/asm/pgtable.h   |  2 ++
 arch/arm64/mm/fault.c              |  9 +++--
 arch/arm64/mm/ioremap.c            | 11 ++++++
 include/trace/events/altra_fixup.h | 57 ++++++++++++++++++++++++++++++
 4 files changed, 77 insertions(+), 2 deletions(-)
 create mode 100644 include/trace/events/altra_fixup.h

diff --git a/arch/arm64/include/asm/pgtable.h b/arch/arm64/include/asm/pgtable.h
index f4603924390eb..26812b7fc6d93 100644
--- a/arch/arm64/include/asm/pgtable.h
+++ b/arch/arm64/include/asm/pgtable.h
@@ -679,6 +679,7 @@ static inline bool pud_table(pud_t pud) { return true; }
 #ifdef CONFIG_ALTRA_ERRATUM_82288
 extern bool __read_mostly have_altra_erratum_82288;
+void do_trace_altra_mkspecial(pte_t pte);
 #endif
 
 static inline pte_t pte_mkspecial(pte_t pte)
@@ -692,6 +693,7 @@ static inline pte_t pte_mkspecial(pte_t pte)
 	     (phys >= 0x200000000000 && phys < 0x400000000000) ||
 	     (phys >= 0x600000000000 && phys < 0x800000000000))) {
 		pte = __pte(__phys_to_pte_val(phys) | pgprot_val(pgprot_device(prot)));
+		do_trace_altra_mkspecial(pte);
 	}
 #endif
 
diff --git a/arch/arm64/mm/fault.c b/arch/arm64/mm/fault.c
index 744e7b1664b1c..6cb3c600cc56a 100644
--- a/arch/arm64/mm/fault.c
+++ b/arch/arm64/mm/fault.c
@@ -45,6 +45,8 @@
 #include
 #include
 
+#include
+
 struct fault_info {
 	/* fault handler, return 0 on successful handling */
 	int (*fn)(unsigned long far, unsigned long esr,
@@ -1376,6 +1378,8 @@ static int fixup_alignment(unsigned long addr, unsigned int esr,
 	u32 insn;
 	int res;
 
+	trace_altra_fixup_alignment(addr, esr);
+
 	if (user_mode(regs)) {
 		__le32 insn_le;
 
@@ -1414,8 +1418,9 @@ static int do_alignment_fault(unsigned long far, unsigned long esr,
 			      struct pt_regs *regs)
 {
 #ifdef CONFIG_ALTRA_ERRATUM_82288
-	if (!fixup_alignment(far, esr, regs))
-		return 0;
+	if (!fixup_alignment(far, esr, regs)) {
+		return 0;
+	}
 #endif
 
 	if (IS_ENABLED(CONFIG_COMPAT_ALIGNMENT_FIXUPS) && compat_user_mode(regs))
diff --git a/arch/arm64/mm/ioremap.c b/arch/arm64/mm/ioremap.c
index
8965766181359..d38d903d8a063 100644
--- a/arch/arm64/mm/ioremap.c
+++ b/arch/arm64/mm/ioremap.c
@@ -5,9 +5,19 @@
 
 #ifdef CONFIG_ALTRA_ERRATUM_82288
 
+#define CREATE_TRACE_POINTS
+#include
+
 bool have_altra_erratum_82288 __read_mostly;
 EXPORT_SYMBOL(have_altra_erratum_82288);
 
+void do_trace_altra_mkspecial(pte_t pte)
+{
+	trace_altra_mkspecial(pte);
+}
+EXPORT_SYMBOL(do_trace_altra_mkspecial);
+EXPORT_TRACEPOINT_SYMBOL(altra_mkspecial);
+
 static bool is_altra_pci(phys_addr_t phys_addr, size_t size)
 {
 	phys_addr_t end = phys_addr + size;
@@ -25,6 +35,7 @@ pgprot_t ioremap_map_prot(phys_addr_t phys_addr, size_t size,
 #ifdef CONFIG_ALTRA_ERRATUM_82288
 	if (unlikely(have_altra_erratum_82288 && is_altra_pci(phys_addr, size))) {
 		prot = pgprot_device(prot);
+		trace_altra_ioremap_prot(prot);
 	}
 #endif
 	return prot;
diff --git a/include/trace/events/altra_fixup.h b/include/trace/events/altra_fixup.h
new file mode 100644
index 0000000000000..73115740c5d84
--- /dev/null
+++ b/include/trace/events/altra_fixup.h
@@ -0,0 +1,57 @@
+/* SPDX-License-Identifier: GPL-2.0 */
+#undef TRACE_SYSTEM
+#define TRACE_SYSTEM altra_fixup
+
+#if !defined(_ALTRA_FIXUP_H_) || defined(TRACE_HEADER_MULTI_READ)
+#define _ALTRA_FIXUP_H_
+
+#include
+#include
+
+#ifdef CONFIG_ALTRA_ERRATUM_82288
+
+TRACE_EVENT(altra_fixup_alignment,
+	TP_PROTO(unsigned long far, unsigned long esr),
+	TP_ARGS(far, esr),
+	TP_STRUCT__entry(
+		__field(unsigned long, far)
+		__field(unsigned long, esr)
+	),
+	TP_fast_assign(
+		__entry->far = far;
+		__entry->esr = esr;
+	),
+	TP_printk("far=0x%016lx esr=0x%016lx",
+		  __entry->far, __entry->esr)
+);
+
+TRACE_EVENT(altra_mkspecial,
+	TP_PROTO(pte_t pte),
+	TP_ARGS(pte),
+	TP_STRUCT__entry(
+		__field(pteval_t, pte)
+	),
+	TP_fast_assign(
+		__entry->pte = pte_val(pte);
+	),
+	TP_printk("pte=0x%016llx", __entry->pte)
+);
+
+TRACE_EVENT(altra_ioremap_prot,
+	TP_PROTO(pgprot_t prot),
+	TP_ARGS(prot),
+	TP_STRUCT__entry(
+		__field(pteval_t, pte)
+	),
+	TP_fast_assign(
+		__entry->pte =
pgprot_val(prot);
+	),
+	TP_printk("prot=0x%016llx", __entry->pte)
+);
+
+#endif /* CONFIG_ALTRA_ERRATUM_82288 */
+
+#endif /* _ALTRA_FIXUP_H_ */
+
+/* This part must be outside protection */
+#include