From patchwork Mon Dec 3 18:05:48 2018
X-Patchwork-Submitter: James Morse
X-Patchwork-Id: 10710183
From: James Morse
To: linux-acpi@vger.kernel.org
Cc: kvmarm@lists.cs.columbia.edu, linux-arm-kernel@lists.infradead.org,
 linux-mm@kvack.org, Borislav Petkov, Marc Zyngier, Christoffer Dall,
 Will Deacon, Catalin Marinas, Naoya Horiguchi, Rafael Wysocki, Len Brown,
 Tony Luck, Dongjiu Geng, Xie XiuQi, Fan Wu, James Morse
Subject: [PATCH v7 00/25] APEI in_nmi() rework and SDEI wire-up
Date: Mon, 3 Dec 2018 18:05:48 +0000
Message-Id: <20181203180613.228133-1-james.morse@arm.com>
X-Mailer: git-send-email 2.19.2
MIME-Version: 1.0
Content-Type: text/plain; charset="utf-8"
Content-Transfer-Encoding: 7bit

Hello,

This series aims to wire-up arm64's fancy new software-NMI notifications
for firmware-first RAS. These need to use the estatus-queue, which is
also needed for notifications via emulated-SError. All of these
things take the 'in_nmi()' path through ghes_copy_tofrom_phys(), and
so will deadlock if they can interact, which they might.

To that end, this series removes the in_nmi() stuff from ghes.c.
Locks are pushed out to the notification helpers, and fixmap entries
are passed in to the code that needs them. This means the estatus-queue
users can interrupt each other however they like.

While doing this there is a fair amount of cleanup, which is (now) at the
beginning of the series. NMI-like notifications interrupting
ghes_probe() can go wrong for three different reasons. CPER record
blocks greater than PAGE_SIZE don't work.
The estatus-pool allocation is simplified and the silent-flag/oops-begin
is removed.

Nothing in this series is intended as fixes, as it's all cleanup or
never-worked.

----------%<----------
The earlier boiler-plate:

What's SDEI? It's ARM's "Software Delegated Exception Interface" [0]. It's
used by firmware to tell the OS about firmware-first RAS events.

These Software exceptions can interrupt anything, so I describe them as
NMI-like. They aren't the only NMI-like way to notify the OS about
firmware-first RAS events: the ACPI spec also defines 'NOTIFY_SEA' and
'NOTIFY_SEI'.

(Acronyms: SEA, Synchronous External Abort. The CPU requested some memory,
but the owner of that memory said no. These are always synchronous with the
instruction that caused them. SEI, System-Error Interrupt, commonly called
SError. This is an asynchronous external abort: the memory-owner didn't say
no at the right point. Collectively these things are called external-aborts.
How is firmware involved?
It traps these and re-injects them into the kernel once it's written the
CPER records.)

APEI's GHES code only expects one source of NMI. If a platform implements
more than one of these mechanisms, APEI needs to handle the interaction.
'SEA' and 'SEI' can interact as 'SEI' is asynchronous.
SDEI can interact with itself: its exceptions can be 'normal' or 'critical',
and firmware could use both types for RAS (errors using normal, 'panic-now'
using critical).
----------%<----------

Known issue:
 * ghes_copy_tofrom_phys() already takes a lock in NMI context. This series
   moves that around, and makes sure we never try to take the same lock
   from different NMI-like notifications. Since the switch to queued
   spinlocks it looks like the kernel can only be 4 contexts deep in
   spinlocks, which arm64 could exceed as it doesn't have a single
   architected NMI. It either needs an additional idx-bit in the qspinlock,
   or for ghes.c to switch to using a different type of lock for NMI-like
   notifications.

Changes since v6:
 * Changed the order of the series.
 * Made hest.c own the estatus pool, which is now vmalloc()d.
 * Culled #ifdef, hopefully without generating too much noise.
 * Added GHESv2 'ack' support to NMI-like notifications.
 * Use task-work to kick the memory_failure_queue().

Specific changes are noted in each patch.
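
For anyone who wants the locking idea from the top of this letter in one
place, here is a rough userspace sketch of it: each notification type owns
its own lock and its own mapping slot, and the copy routine takes no lock
itself. This is not code from the series. Every sketch_* name, the slot
enum and the try-lock behaviour are assumptions made up for illustration;
the real helpers live in ghes.c and use kernel spinlocks and fixmap entries.

```c
/* Userspace sketch only: every name here is hypothetical, not kernel API. */
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_WINDOW_SIZE 64

/* One mapping slot per notification type, standing in for a fixmap entry. */
enum sketch_slot {
	SKETCH_SLOT_SEA,
	SKETCH_SLOT_SDEI_NORMAL,
	SKETCH_SLOT_SDEI_CRITICAL,
	SKETCH_SLOT_COUNT,
};

struct sketch_source {
	atomic_flag lock;       /* only ever taken by this notification type */
	enum sketch_slot slot;  /* only ever used by this notification type */
};

static char sketch_window[SKETCH_SLOT_COUNT][SKETCH_WINDOW_SIZE];

static void sketch_source_init(struct sketch_source *src, enum sketch_slot slot)
{
	atomic_flag_clear(&src->lock);
	src->slot = slot;
}

/*
 * The copy routine takes no lock and never asks "am I in an NMI?".
 * The caller says which slot to map through.
 */
static void sketch_copy_from_phys(char *dst, const char *phys, size_t len,
				  enum sketch_slot slot)
{
	assert(len <= SKETCH_WINDOW_SIZE);
	memcpy(sketch_window[slot], phys, len); /* stands in for mapping the page */
	memcpy(dst, sketch_window[slot], len);
}

/*
 * Notification helper: owns the locking. A real handler would spin; this
 * sketch uses a try-lock so that re-entry by the same source is reported
 * instead of hanging.
 */
static bool sketch_notify(struct sketch_source *src, const char *record,
			  size_t len, char *out)
{
	if (atomic_flag_test_and_set_explicit(&src->lock, memory_order_acquire))
		return false;

	sketch_copy_from_phys(out, record, len, src->slot);
	atomic_flag_clear_explicit(&src->lock, memory_order_release);
	return true;
}
```

If one source is interrupted while holding its lock, a different source
still gets through because it shares neither the lock nor the slot. Only
re-entry by the same source is refused, which is the case that would
deadlock with a spinning lock.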

[v6] https://www.spinics.net/lists/linux-acpi/msg84228.html
[v5] https://www.spinics.net/lists/linux-acpi/msg82993.html
[v4] https://www.spinics.net/lists/arm-kernel/msg653078.html
[v3] https://www.spinics.net/lists/arm-kernel/msg649230.html

[0] https://static.docs.arm.com/den0054/a/ARM_DEN0054A_Software_Delegated_Exception_Interface.pdf

Feedback welcome,

Thanks

James Morse (25):
  ACPI / APEI: Don't wait to serialise with oops messages when panic()ing
  ACPI / APEI: Remove silent flag from ghes_read_estatus()
  ACPI / APEI: Switch estatus pool to use vmalloc memory
  ACPI / APEI: Make hest.c manage the estatus memory pool
  ACPI / APEI: Make estatus pool allocation a static size
  ACPI / APEI: Don't store CPER records physical address in struct ghes
  ACPI / APEI: Remove spurious GHES_TO_CLEAR check
  ACPI / APEI: Don't update struct ghes' flags in read/clear estatus
  ACPI / APEI: Generalise the estatus queue's notify code
  ACPI / APEI: Tell firmware the estatus queue consumed the records
  ACPI / APEI: Move NOTIFY_SEA between the estatus-queue and NOTIFY_NMI
  ACPI / APEI: Switch NOTIFY_SEA to use the estatus queue
  KVM: arm/arm64: Add kvm_ras.h to collect kvm specific RAS plumbing
  arm64: KVM/mm: Move SEA handling behind a single 'claim' interface
  ACPI / APEI: Move locking to the notification helper
  ACPI / APEI: Let the notification helper specify the fixmap slot
  ACPI / APEI: Pass ghes and estatus separately to avoid a later copy
  ACPI / APEI: Split ghes_read_estatus() to allow a peek at the CPER length
  ACPI / APEI: Only use queued estatus entry during _in_nmi_notify_one()
  ACPI / APEI: Use separate fixmap pages for arm64 NMI-like notifications
  mm/memory-failure: Add memory_failure_queue_kick()
  ACPI / APEI: Kick the memory_failure() queue for synchronous errors
  arm64: acpi: Make apei_claim_sea() synchronise with APEI's irq work
  firmware: arm_sdei: Add ACPI GHES registration helper
  ACPI / APEI: Add support for the SDEI GHES Notification type

 arch/arm/include/asm/kvm_ras.h       |  14 +
 arch/arm/include/asm/system_misc.h   |   5 -
 arch/arm64/include/asm/acpi.h        |   4 +-
 arch/arm64/include/asm/daifflags.h   |   1 +
 arch/arm64/include/asm/fixmap.h      |   6 +-
 arch/arm64/include/asm/kvm_ras.h     |  25 +
 arch/arm64/include/asm/system_misc.h |   2 -
 arch/arm64/kernel/acpi.c             |  51 +++
 arch/arm64/mm/fault.c                |  25 +-
 drivers/acpi/apei/Kconfig            |  12 +-
 drivers/acpi/apei/ghes.c             | 652 ++++++++++++++++-----------
 drivers/acpi/apei/hest.c             |   5 +
 drivers/firmware/arm_sdei.c          |  70 +++
 include/acpi/ghes.h                  |   4 +-
 include/linux/arm_sdei.h             |   9 +
 include/linux/mm.h                   |   1 +
 mm/memory-failure.c                  |  15 +-
 virt/kvm/arm/mmu.c                   |   4 +-
 18 files changed, 606 insertions(+), 299 deletions(-)
 create mode 100644 arch/arm/include/asm/kvm_ras.h
 create mode 100644 arch/arm64/include/asm/kvm_ras.h