From patchwork Mon Apr 12 22:43:18 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naoya Horiguchi X-Patchwork-Id: 12198985 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.5 required=3.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED,DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 30375C433ED for ; Mon, 12 Apr 2021 22:43:33 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id CBD206121E for ; Mon, 12 Apr 2021 22:43:32 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org CBD206121E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 5EC166B006E; Mon, 12 Apr 2021 18:43:32 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 5C3576B0070; Mon, 12 Apr 2021 18:43:32 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 463CD6B0071; Mon, 12 Apr 2021 18:43:32 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0027.hostedemail.com [216.40.44.27]) by kanga.kvack.org (Postfix) with ESMTP id 2B1C26B006E for ; Mon, 12 Apr 2021 18:43:32 -0400 (EDT) Received: from smtpin05.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay04.hostedemail.com (Postfix) with ESMTP id CA5EA689C for ; Mon, 12 Apr 2021 22:43:31 +0000 (UTC) X-FDA: 
78025193022.05.65F0A63 Received: from mail-pj1-f41.google.com (mail-pj1-f41.google.com [209.85.216.41]) by imf25.hostedemail.com (Postfix) with ESMTP id 48E3A600010E for ; Mon, 12 Apr 2021 22:43:29 +0000 (UTC) Received: by mail-pj1-f41.google.com with SMTP id e8-20020a17090a7288b029014e51f5a6baso2653748pjg.2 for ; Mon, 12 Apr 2021 15:43:31 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=JkA3Ny2V6Qq9Dk3aSySw7xYrKKcRmImrxlIoiSgIjQ0=; b=X0R7t1a+/vUebM2KusNVgIyrf8ZQZp4ygxUvDy25rGxEtvJoLMDsHRy7aV1gFRm/wk NM5SL9lNBEg7xDvdaBfI05WJqBVaArAYzFx5PMGvhQ5sEkph/y8LEjcXZStSlTqB2Xej XqaytIGgnAlq2YhnsXpj+uu0h1YWIE2qVUwrtrVWtKn9q9exriujEzk8rvqkz2mOwQuN +9mL+t1FV8qrKEJHe/TBJgLUA+fqpY6vGIjUCQULhxghq2pxhqpGoz6aPe6gILzvEIJ0 msoxnEDFV+G5c0zmVQLP3NhTVz4ca+9EmUwwP3rwHFWLzuxkbdjTnqjPGDcGuBFcDzpp E2DA== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=JkA3Ny2V6Qq9Dk3aSySw7xYrKKcRmImrxlIoiSgIjQ0=; b=ldFF3XX9e6ccVDhxOV7FQ+VZ2mmruPXexrqn9tgiiDeq+xYjZO/E3RzxtMQt36x3V/ slrH+1Uz9Px7OycfAlX1ZASEW7GhOBLvAHFNHH8RBzHTGKQsMRbVWvETBBSBg7TrjZFx JQ0UoNOlI8rLVISEXHrBi+etRtZUKjGxRs7fmyUU7NtWD40n4iP8kcml7m8qR28jmc2g Nslj6WWa2yHVKzZO9iidhI4c3jiZt7QRhD3gMRx3wZguIUrcouuRMdUiKBUu2MH9OPWt a5FyES1BQlfNhJNDvD5e+Y0IItVye0b8g0tRjAn9OcPub4FFPlIcIadEHEmuNi/Qaepc Rffg== X-Gm-Message-State: AOAM531vlsV+St/P/rjIKtnN5zJKWSrXnPwZ3drqEhE/Wl5B8sJog4/u DEN2j/lFRXtOhRDKEFvqsznEO5IeNzLbP5U= X-Google-Smtp-Source: ABdhPJykwz0wDYo5DwqyUPL0KJ+J6h8Y3bmJ5iNVsNLoy4vJCeDpgEL+MWPvo4ds3wa3fN4u8f0dXA== X-Received: by 2002:a17:90b:4b8c:: with SMTP id lr12mr1556254pjb.124.1618267410185; Mon, 12 Apr 2021 15:43:30 -0700 (PDT) Received: from localhost.localdomain (h175-177-040-153.catv02.itscom.jp. 
[175.177.40.153]) by smtp.gmail.com with ESMTPSA id l25sm13365373pgu.72.2021.04.12.15.43.27 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 12 Apr 2021 15:43:29 -0700 (PDT) From: Naoya Horiguchi To: linux-mm@kvack.org, Tony Luck , Aili Yao Cc: Andrew Morton , Oscar Salvador , David Hildenbrand , Borislav Petkov , Andy Lutomirski , Naoya Horiguchi , linux-kernel@vger.kernel.org Subject: [PATCH v1 1/3] mm/memory-failure: Use a mutex to avoid memory_failure() races Date: Tue, 13 Apr 2021 07:43:18 +0900 Message-Id: <20210412224320.1747638-2-nao.horiguchi@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210412224320.1747638-1-nao.horiguchi@gmail.com> References: <20210412224320.1747638-1-nao.horiguchi@gmail.com> MIME-Version: 1.0 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 48E3A600010E X-Stat-Signature: 4mp5jmhod1xy9g9379h6t3fexb5y1tu9 Received-SPF: none (gmail.com>: No applicable sender policy available) receiver=imf25; identity=mailfrom; envelope-from=""; helo=mail-pj1-f41.google.com; client-ip=209.85.216.41 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1618267409-62786 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID: From: Tony Luck There can be races when multiple CPUs consume poison from the same page. The first into memory_failure() atomically sets the HWPoison page flag and begins hunting for tasks that map this page. Eventually it invalidates those mappings and may send a SIGBUS to the affected tasks. But while all that work is going on, other CPUs see a "success" return code from memory_failure() and so they believe the error has been handled and continue executing. Fix by wrapping most of the internal parts of memory_failure() in a mutex. 
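The serialization described above can be sketched in userspace. This is only an illustration, assuming a pthread mutex in place of the kernel mutex and two plain flags in place of the HWPoison page flag and the "mappings invalidated" state. The names handle_failure_sketch, mf_mutex_sketch, page_poisoned and page_unmapped are hypothetical and do not appear in the patch, and the return values are illustrative (this patch still returns 0 in the already-poisoned case).

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-ins; the patch uses DEFINE_MUTEX(mf_mutex). */
static pthread_mutex_t mf_mutex_sketch = PTHREAD_MUTEX_INITIALIZER;
static bool page_poisoned;	/* stands in for the HWPoison page flag */
static bool page_unmapped;	/* stands in for "mappings invalidated" */

/*
 * Returns 0 when this caller performed the recovery and 1 when the page
 * was already poisoned by an earlier caller.
 */
static int handle_failure_sketch(void)
{
	int res = 0;

	pthread_mutex_lock(&mf_mutex_sketch);
	if (page_poisoned) {
		/*
		 * The first caller released the lock only after it finished,
		 * so a later caller never sees a half-handled page.
		 */
		assert(page_unmapped);
		res = 1;
		goto out;
	}
	page_poisoned = true;
	/* ... slow work: find tasks, invalidate mappings, send SIGBUS ... */
	page_unmapped = true;
out:
	pthread_mutex_unlock(&mf_mutex_sketch);
	return res;
}
```

Without the lock, a second caller could observe page_poisoned set while page_unmapped is still false, which corresponds to the premature "success" described above.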
Signed-off-by: Tony Luck Signed-off-by: Naoya Horiguchi --- mm/memory-failure.c | 19 +++++++++++++++++-- 1 file changed, 17 insertions(+), 2 deletions(-) diff --git v5.12-rc5/mm/memory-failure.c v5.12-rc5_patched/mm/memory-failure.c index 24210c9bd843..c1509f4b565e 100644 --- v5.12-rc5/mm/memory-failure.c +++ v5.12-rc5_patched/mm/memory-failure.c @@ -1381,6 +1381,8 @@ static int memory_failure_dev_pagemap(unsigned long pfn, int flags, return rc; } +static DEFINE_MUTEX(mf_mutex); + /** * memory_failure - Handle memory failure of a page. * @pfn: Page Number of the corrupted page @@ -1424,12 +1426,18 @@ int memory_failure(unsigned long pfn, int flags) return -ENXIO; } + mutex_lock(&mf_mutex); + try_again: - if (PageHuge(p)) - return memory_failure_hugetlb(pfn, flags); + if (PageHuge(p)) { + res = memory_failure_hugetlb(pfn, flags); + goto out2; + } + if (TestSetPageHWPoison(p)) { pr_err("Memory failure: %#lx: already hardware poisoned\n", pfn); + mutex_unlock(&mf_mutex); return 0; } @@ -1463,9 +1471,11 @@ int memory_failure(unsigned long pfn, int flags) res = MF_FAILED; } action_result(pfn, MF_MSG_BUDDY, res); + mutex_unlock(&mf_mutex); return res == MF_RECOVERED ? 
0 : -EBUSY; } else { action_result(pfn, MF_MSG_KERNEL_HIGH_ORDER, MF_IGNORED); + mutex_unlock(&mf_mutex); return -EBUSY; } } @@ -1473,6 +1483,7 @@ int memory_failure(unsigned long pfn, int flags) if (PageTransHuge(hpage)) { if (try_to_split_thp_page(p, "Memory Failure") < 0) { action_result(pfn, MF_MSG_UNSPLIT_THP, MF_IGNORED); + mutex_unlock(&mf_mutex); return -EBUSY; } VM_BUG_ON_PAGE(!page_count(p), p); @@ -1517,6 +1528,7 @@ int memory_failure(unsigned long pfn, int flags) num_poisoned_pages_dec(); unlock_page(p); put_page(p); + mutex_unlock(&mf_mutex); return 0; } if (hwpoison_filter(p)) { @@ -1524,6 +1536,7 @@ int memory_failure(unsigned long pfn, int flags) num_poisoned_pages_dec(); unlock_page(p); put_page(p); + mutex_unlock(&mf_mutex); return 0; } @@ -1559,6 +1572,8 @@ int memory_failure(unsigned long pfn, int flags) res = identify_page_state(pfn, p, page_flags); out: unlock_page(p); +out2: + mutex_unlock(&mf_mutex); return res; } EXPORT_SYMBOL_GPL(memory_failure); From patchwork Mon Apr 12 22:43:19 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naoya Horiguchi X-Patchwork-Id: 12198987 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.5 required=3.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED,DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id B7332C43460 for ; Mon, 12 Apr 2021 22:43:35 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 4816E61244 for ; Mon, 12 Apr 2021 22:43:35 +0000 (UTC) DMARC-Filter: OpenDMARC Filter 
v1.3.2 mail.kernel.org 4816E61244 Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id DDCE66B0070; Mon, 12 Apr 2021 18:43:34 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id D667D6B0072; Mon, 12 Apr 2021 18:43:34 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id BEB066B0070; Mon, 12 Apr 2021 18:43:34 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0220.hostedemail.com [216.40.44.220]) by kanga.kvack.org (Postfix) with ESMTP id A0A256B0070 for ; Mon, 12 Apr 2021 18:43:34 -0400 (EDT) Received: from smtpin14.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay01.hostedemail.com (Postfix) with ESMTP id 5E5A2180AD81D for ; Mon, 12 Apr 2021 22:43:34 +0000 (UTC) X-FDA: 78025193148.14.01EAD31 Received: from mail-pl1-f172.google.com (mail-pl1-f172.google.com [209.85.214.172]) by imf05.hostedemail.com (Postfix) with ESMTP id 66C86E000104 for ; Mon, 12 Apr 2021 22:43:33 +0000 (UTC) Received: by mail-pl1-f172.google.com with SMTP id z22so2031401plo.3 for ; Mon, 12 Apr 2021 15:43:33 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=+eLCTporA8diA9+4AwwE0Qpo+13q5cqafF4gWgDZDZA=; b=f/HYbNRLsApenom0NFRbF9JdUXkmoFwxcRf6BNOXYUII4ox16vyEHHeI6wxMAPi0Gq C3Uv9pW8IY5/cYsfSHcKeS0HXJZiKFUiCi3r+DZXg1ZmG6w5NcIFEE5o+cpNmAesCwtm Rqy/VkoEfacLhP2zBOCA3hjpRJ2a/rAbtiz8huTKx1WT62yutctCdJ6XNr5yaMz7wBb7 aIKRheBotsTFfaSb4IEluArWWfySIPI13A2cU4XdSUlpEjqDgyOwHsScSatylhPLNM+F K8MhiLvUU+5o0yDouHqYdeL/uwuxIHWkvY6x09CV4FinCKkCAQVdaZ/DPsWAgxnosXwT V1og== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; 
h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; bh=+eLCTporA8diA9+4AwwE0Qpo+13q5cqafF4gWgDZDZA=; b=fB6Y5MEV+8Uogi+4dSy7bmQrjXLldIZ/Shcv4qKFHuExYkh1fT1MxCYHSCpl5dI6MJ lboDFoqoB0qMkaZopqnrvO6pTGQQUqowLYRQLB3xs98I9r/cF2/wnEtbV9bjxu1WxZQZ ztO9fp8kJO3yLINB1zU7tOvhYKHwqhw/8zVRxUdcCsmnsy3XmGgK0IQRoDNlwmqK2C7/ 68iF6aT3bUYXUnCa6PCfCrvuh7RK/kt9FKmPRnX3raFR/PZp9z4HiWwOc4mO/SDaHeAX iwIS3n1issrqK4HMCkw1nY9KleUM5J4cEy/7kJ09gWiblFZGzCYmVMLsw45G6K7AX/4q xI7w== X-Gm-Message-State: AOAM533PknZIMiYZTetpeYVQTYWj13GQ5SbB50k9SnM2Gdk8HFUKXC6L 3jErQbUMtkXSXS11ZEFsjwa9EL+izXyyDHM= X-Google-Smtp-Source: ABdhPJxIkwcM/K0oArvegDYbDl0TUq2BZZovmIRLlI/eji+k5V+qSKF/qUJPjmgKVVWaOVZcY0wx9g== X-Received: by 2002:a17:902:8f89:b029:ea:ea23:a02c with SMTP id z9-20020a1709028f89b02900eaea23a02cmr8991008plo.71.1618267413002; Mon, 12 Apr 2021 15:43:33 -0700 (PDT) Received: from localhost.localdomain (h175-177-040-153.catv02.itscom.jp. [175.177.40.153]) by smtp.gmail.com with ESMTPSA id l25sm13365373pgu.72.2021.04.12.15.43.30 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 12 Apr 2021 15:43:32 -0700 (PDT) From: Naoya Horiguchi To: linux-mm@kvack.org, Tony Luck , Aili Yao Cc: Andrew Morton , Oscar Salvador , David Hildenbrand , Borislav Petkov , Andy Lutomirski , Naoya Horiguchi , linux-kernel@vger.kernel.org Subject: [PATCH v1 2/3] mm,hwpoison: return -EHWPOISON when page already Date: Tue, 13 Apr 2021 07:43:19 +0900 Message-Id: <20210412224320.1747638-3-nao.horiguchi@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210412224320.1747638-1-nao.horiguchi@gmail.com> References: <20210412224320.1747638-1-nao.horiguchi@gmail.com> MIME-Version: 1.0 X-Rspamd-Server: rspam05 X-Rspamd-Queue-Id: 66C86E000104 X-Stat-Signature: nrbjqpx7npir5o7hhthsm8z6y3skiy4f Received-SPF: none (gmail.com>: No applicable sender policy available) receiver=imf05; identity=mailfrom; envelope-from=""; helo=mail-pl1-f172.google.com; 
client-ip=209.85.214.172 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1618267413-275349 X-Bogosity: Ham, tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:
From: Aili Yao

When the page is already poisoned, another memory_failure() call on the same page currently returns 0, meaning OK. For nested MCE handling, this behavior may lead to an MCE loop. Example:

1. With LMCE enabled, processes A and B run on different cores X and Y and access the same page. The page is corrupted when process A accesses it, so an MCE is raised on core X and error handling gets underway.

2. Then B accesses the page and triggers another MCE on core Y. Its error handling sees TestSetPageHWPoison() return true, so 0 is returned.

3. kill_me_maybe() checks the return value:

	static void kill_me_maybe(struct callback_head *cb)
	{
	...
		if (!memory_failure(p->mce_addr >> PAGE_SHIFT, flags) &&
		    !(p->mce_kflags & MCE_IN_KERNEL_COPYIN)) {
			set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page);
			sync_core();
			return;
		}
	...
	}

4. Error handling for B ends, and nothing may happen if kill-early is not set. Process B re-executes the instruction, hits the MCE again, and the loop begins.

The set_mce_nospec() call here is also improper; see commit fd0e786d9d09 ("x86/mm, mm/hwpoison: Don't unconditionally unmap kernel 1:1 pages").

Other callers that care about the return value of memory_failure() should check why they want to process a memory error that has already been processed. Returning -EHWPOISON in this case therefore seems reasonable.
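The effect of the new return value on the caller can be sketched as a small userspace decision function. This is a simplified model of the check in kill_me_maybe() quoted above, not the kernel code itself. The enum and the function name after_memory_failure_sketch are hypothetical, and the fallback definition of EHWPOISON assumes the asm-generic Linux value.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>

#ifndef EHWPOISON
#define EHWPOISON 133	/* asm-generic Linux value, for libcs that lack it */
#endif

/* Hypothetical outcome labels for the check in kill_me_maybe(). */
enum mce_action_sketch {
	MCE_RESUME_TASK,	/* error handled: return to the interrupted code */
	MCE_SEND_SIGBUS,	/* not handled by this call: signal the task */
};

/*
 * mf_ret is the value returned by memory_failure(); in_kernel_copyin
 * models the MCE_IN_KERNEL_COPYIN flag test.
 */
static enum mce_action_sketch after_memory_failure_sketch(int mf_ret,
							   bool in_kernel_copyin)
{
	if (!mf_ret && !in_kernel_copyin)
		return MCE_RESUME_TASK;
	return MCE_SEND_SIGBUS;
}
```

With the old return value of 0, the second CPU takes the MCE_RESUME_TASK branch and re-executes the faulting instruction. With -EHWPOISON it falls through to the signal path instead, which breaks the loop.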
Signed-off-by: Aili Yao Signed-off-by: Naoya Horiguchi --- mm/memory-failure.c | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-) diff --git v5.12-rc5/mm/memory-failure.c v5.12-rc5_patched/mm/memory-failure.c index c1509f4b565e..368ef77e01f9 100644 --- v5.12-rc5/mm/memory-failure.c +++ v5.12-rc5_patched/mm/memory-failure.c @@ -1228,7 +1228,7 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags) if (TestSetPageHWPoison(head)) { pr_err("Memory failure: %#lx: already hardware poisoned\n", pfn); - return 0; + return -EHWPOISON; } num_poisoned_pages_inc(); @@ -1438,7 +1438,7 @@ int memory_failure(unsigned long pfn, int flags) pr_err("Memory failure: %#lx: already hardware poisoned\n", pfn); mutex_unlock(&mf_mutex); - return 0; + return -EHWPOISON; } orig_head = hpage = compound_head(p); From patchwork Mon Apr 12 22:43:20 2021 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Naoya Horiguchi X-Patchwork-Id: 12198989 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-13.5 required=3.0 tests=BAYES_00, DKIM_ADSP_CUSTOM_MED,DKIM_INVALID,DKIM_SIGNED,FREEMAIL_FORGED_FROMDOMAIN, FREEMAIL_FROM,HEADER_FROM_DIFFERENT_DOMAINS,INCLUDES_CR_TRAILER, INCLUDES_PATCH,MAILING_LIST_MULTI,SPF_HELO_NONE,SPF_PASS,USER_AGENT_GIT autolearn=ham autolearn_force=no version=3.4.0 Received: from mail.kernel.org (mail.kernel.org [198.145.29.99]) by smtp.lore.kernel.org (Postfix) with ESMTP id 1143DC43461 for ; Mon, 12 Apr 2021 22:43:39 +0000 (UTC) Received: from kanga.kvack.org (kanga.kvack.org [205.233.56.17]) by mail.kernel.org (Postfix) with ESMTP id 9B9526121E for ; Mon, 12 Apr 2021 22:43:38 +0000 (UTC) DMARC-Filter: OpenDMARC Filter v1.3.2 mail.kernel.org 9B9526121E Authentication-Results: mail.kernel.org; dmarc=fail (p=none dis=none) header.from=gmail.com Authentication-Results: mail.kernel.org; spf=pass 
smtp.mailfrom=owner-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix) id 3F1306B0071; Mon, 12 Apr 2021 18:43:38 -0400 (EDT) Received: by kanga.kvack.org (Postfix, from userid 40) id 3A11E6B0072; Mon, 12 Apr 2021 18:43:38 -0400 (EDT) X-Delivered-To: int-list-linux-mm@kvack.org Received: by kanga.kvack.org (Postfix, from userid 63042) id 21A096B0073; Mon, 12 Apr 2021 18:43:38 -0400 (EDT) X-Delivered-To: linux-mm@kvack.org Received: from forelay.hostedemail.com (smtprelay0215.hostedemail.com [216.40.44.215]) by kanga.kvack.org (Postfix) with ESMTP id 0094F6B0071 for ; Mon, 12 Apr 2021 18:43:37 -0400 (EDT) Received: from smtpin21.hostedemail.com (10.5.19.251.rfc1918.com [10.5.19.251]) by forelay02.hostedemail.com (Postfix) with ESMTP id 9F3C1689C for ; Mon, 12 Apr 2021 22:43:37 +0000 (UTC) X-FDA: 78025193274.21.9FCF3EB Received: from mail-pj1-f45.google.com (mail-pj1-f45.google.com [209.85.216.45]) by imf18.hostedemail.com (Postfix) with ESMTP id 6C5602000249 for ; Mon, 12 Apr 2021 22:43:37 +0000 (UTC) Received: by mail-pj1-f45.google.com with SMTP id nm3-20020a17090b19c3b029014e1bbf6c60so3767609pjb.4 for ; Mon, 12 Apr 2021 15:43:36 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=gmail.com; s=20161025; h=from:to:cc:subject:date:message-id:in-reply-to:references :mime-version:content-transfer-encoding; bh=NEcfkWjzQjcy5UJ/ItAB81OJjAIgDt8AwI4FNa9iha4=; b=JU/b6l8DSFz0zV7qBlusgUefLKuWDkZ14uln/XgG8qdgR3V+PpN3v36OleMrM2NyHI /biJAQJldS0fqoIrsOgDss5/LeYzxoMcigzDvWTgCs1JSLsQstvzYVGDOnUU1/UL/l/V xP5yyIu9HgUYzhp8mG4hTqlCTUb06EyQJG1A7FHvPqvgItMn7Z/MqxaOTFtrk3J7XI+Y 6JItWUhWKlbIv1+bvQXvwBooU57N/NrfFZnf1IxKAyU1WuF3S2XNKI5VTMdJANic1jOu Ge8y+sM6Y7sjKMx9pPTyevVC9o6KBc+1C9uVrur7ri1pABmYN4LbTtiX81wz2d/6pG6U 0bbQ== X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20161025; h=x-gm-message-state:from:to:cc:subject:date:message-id:in-reply-to :references:mime-version:content-transfer-encoding; 
bh=NEcfkWjzQjcy5UJ/ItAB81OJjAIgDt8AwI4FNa9iha4=; b=uWULp4Z387RAIGBnsf74jntAY5EiLnlCRAldeT5Y3pHgrV/ZoCYCzFi4+FH3d3bL0b P3cXBJcXcGWIuuLly+TuWcnykNskPxpHYnh3w6Ovp2l7ZQZqFLKRPGWv8lxhs67y9kw1 MIL/l/SOR6efrP8vwrwLD8sUqtSBc2I1YHci8AzzHSaTWFwZ+4LetIaPtFvru/A0qWjP a/eS0YrmDag2jEKS7FXWZGnxMi1wtRSf0WiC19gFpWfashUAjJWwq58pEdhy6TwQsHzY amB30ym4YSFHMyi05Tyc2PJjqAU4TnzHfakzUpACkgcCecKHhzP8Wdp9sxRjynab3gmE Xdpw== X-Gm-Message-State: AOAM532dg7SlxtBuhZXdgLOeAnHH7ILNxFrx082Bcic8mI71tOuU/zrV iV8jmEa/ubwzozP32vSDLfOfg9K5YCa2hvY= X-Google-Smtp-Source: ABdhPJz/vrU0iXVN/bE+hDIGcvFZngGKof/HaE1In90UBYxRTyxthWWXv1zVg0HsSK7flMVYag1eKw== X-Received: by 2002:a17:902:59d4:b029:ea:bbc5:c775 with SMTP id d20-20020a17090259d4b02900eabbc5c775mr14523796plj.11.1618267415716; Mon, 12 Apr 2021 15:43:35 -0700 (PDT) Received: from localhost.localdomain (h175-177-040-153.catv02.itscom.jp. [175.177.40.153]) by smtp.gmail.com with ESMTPSA id l25sm13365373pgu.72.2021.04.12.15.43.33 (version=TLS1_3 cipher=TLS_AES_256_GCM_SHA384 bits=256/256); Mon, 12 Apr 2021 15:43:35 -0700 (PDT) From: Naoya Horiguchi To: linux-mm@kvack.org, Tony Luck , Aili Yao Cc: Andrew Morton , Oscar Salvador , David Hildenbrand , Borislav Petkov , Andy Lutomirski , Naoya Horiguchi , linux-kernel@vger.kernel.org Subject: [PATCH v1 3/3] mm,hwpoison: add kill_accessing_process() to find error virtual address Date: Tue, 13 Apr 2021 07:43:20 +0900 Message-Id: <20210412224320.1747638-4-nao.horiguchi@gmail.com> X-Mailer: git-send-email 2.25.1 In-Reply-To: <20210412224320.1747638-1-nao.horiguchi@gmail.com> References: <20210412224320.1747638-1-nao.horiguchi@gmail.com> MIME-Version: 1.0 X-Rspamd-Queue-Id: 6C5602000249 X-Stat-Signature: 78ej7dhu1osm3f3xri5nnocainizuwtj X-Rspamd-Server: rspam02 Received-SPF: none (gmail.com>: No applicable sender policy available) receiver=imf18; identity=mailfrom; envelope-from=""; helo=mail-pj1-f45.google.com; client-ip=209.85.216.45 X-HE-DKIM-Result: pass/pass X-HE-Tag: 1618267417-952192 X-Bogosity: Ham, 
tests=bogofilter, spamicity=0.000000, version=1.2.4 Sender: owner-linux-mm@kvack.org Precedence: bulk X-Loop: owner-majordomo@kvack.org List-ID:
From: Naoya Horiguchi

The previous patch solves the infinite MCE loop that occurs when multiple MCE events race. The remaining issue is to make sure that every thread processing an Action Required MCE sends SIGBUS to the current process with the proper virtual address and error size.

This patch walks the page table to find the error virtual address. If the walk finds multiple virtual addresses, we currently cannot determine which one is correct, so we fall back to sending SIGBUS from kill_me_maybe() without error info, as we do now. This corner case needs to be solved in the future.

Signed-off-by: Naoya Horiguchi --- arch/x86/kernel/cpu/mce/core.c | 13 ++- include/linux/swapops.h | 5 ++ mm/memory-failure.c | 147 ++++++++++++++++++++++++++++++++- 3 files changed, 161 insertions(+), 4 deletions(-) diff --git v5.12-rc5/arch/x86/kernel/cpu/mce/core.c v5.12-rc5_patched/arch/x86/kernel/cpu/mce/core.c index 7962355436da..3ce23445a48c 100644 --- v5.12-rc5/arch/x86/kernel/cpu/mce/core.c +++ v5.12-rc5_patched/arch/x86/kernel/cpu/mce/core.c @@ -1257,19 +1257,28 @@ static void kill_me_maybe(struct callback_head *cb) { struct task_struct *p = container_of(cb, struct task_struct, mce_kill_me); int flags = MF_ACTION_REQUIRED; + int ret; pr_err("Uncorrected hardware memory error in user-access at %llx", p->mce_addr); if (!p->mce_ripv) flags |= MF_MUST_KILL; - if (!memory_failure(p->mce_addr >> PAGE_SHIFT, flags) && - !(p->mce_kflags & MCE_IN_KERNEL_COPYIN)) { + ret = memory_failure(p->mce_addr >> PAGE_SHIFT, flags); + if (!ret && !(p->mce_kflags & MCE_IN_KERNEL_COPYIN)) { set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page); sync_core(); return; } + /* + * -EHWPOISON from memory_failure() means that it already sent SIGBUS + * to the current process with the proper error info, so no need to + * send it here again. 
+ */ + if (ret == -EHWPOISON) + return; + if (p->mce_vaddr != (void __user *)-1l) { force_sig_mceerr(BUS_MCEERR_AR, p->mce_vaddr, PAGE_SHIFT); } else { diff --git v5.12-rc5/include/linux/swapops.h v5.12-rc5_patched/include/linux/swapops.h index d9b7c9132c2f..98ea67fcf360 100644 --- v5.12-rc5/include/linux/swapops.h +++ v5.12-rc5_patched/include/linux/swapops.h @@ -323,6 +323,11 @@ static inline int is_hwpoison_entry(swp_entry_t entry) return swp_type(entry) == SWP_HWPOISON; } +static inline unsigned long hwpoison_entry_to_pfn(swp_entry_t entry) +{ + return swp_offset(entry); +} + static inline void num_poisoned_pages_inc(void) { atomic_long_inc(&num_poisoned_pages); diff --git v5.12-rc5/mm/memory-failure.c v5.12-rc5_patched/mm/memory-failure.c index 368ef77e01f9..04e002bd573a 100644 --- v5.12-rc5/mm/memory-failure.c +++ v5.12-rc5_patched/mm/memory-failure.c @@ -56,6 +56,7 @@ #include #include #include +#include #include "internal.h" #include "ras/ras_event.h" @@ -554,6 +555,142 @@ static void collect_procs(struct page *page, struct list_head *tokill, collect_procs_file(page, tokill, force_early); } +struct hwp_walk { + struct to_kill tk; + unsigned long pfn; + int flags; +}; + +static int set_to_kill(struct to_kill *tk, unsigned long addr, short shift) +{ + /* Abort pagewalk when finding multiple mappings to the error page. 
*/ + if (tk->addr) + return 1; + tk->addr = addr; + tk->size_shift = shift; + return 0; +} + +static int check_hwpoisoned_entry(pte_t pte, unsigned long addr, short shift, + unsigned long poisoned_pfn, struct to_kill *tk) +{ + unsigned long pfn = 0; + + if (pte_present(pte)) { + pfn = pte_pfn(pte); + } else { + swp_entry_t swp = pte_to_swp_entry(pte); + + if (is_hwpoison_entry(swp)) + pfn = hwpoison_entry_to_pfn(swp); + } + + if (!pfn || pfn != poisoned_pfn) + return 0; + + return set_to_kill(tk, addr, shift); +} + +static int hwpoison_pte_range(pmd_t *pmdp, unsigned long addr, + unsigned long end, struct mm_walk *walk) +{ + struct hwp_walk *hwp = (struct hwp_walk *)walk->private; + int ret = 0; + pte_t *ptep; + spinlock_t *ptl; + + ptl = pmd_trans_huge_lock(pmdp, walk->vma); + if (ptl) { + pmd_t pmd = *pmdp; + + if (pmd_present(pmd)) { + unsigned long pfn = pmd_pfn(pmd); + + if (pfn <= hwp->pfn && hwp->pfn < pfn + HPAGE_PMD_NR) { + unsigned long hwpoison_vaddr = addr + + (hwp->pfn << PAGE_SHIFT & ~PMD_MASK); + + ret = set_to_kill(&hwp->tk, hwpoison_vaddr, + PAGE_SHIFT); + } + } + spin_unlock(ptl); + goto out; + } + + if (pmd_trans_unstable(pmdp)) + goto out; + + ptep = pte_offset_map_lock(walk->vma->vm_mm, pmdp, addr, &ptl); + for (; addr != end; ptep++, addr += PAGE_SIZE) { + ret = check_hwpoisoned_entry(*ptep, addr, PAGE_SHIFT, + hwp->pfn, &hwp->tk); + if (ret == 1) + break; + } + pte_unmap_unlock(ptep - 1, ptl); +out: + cond_resched(); + return ret; +} + +#ifdef CONFIG_HUGETLB_PAGE +static int hwpoison_hugetlb_range(pte_t *ptep, unsigned long hmask, + unsigned long addr, unsigned long end, + struct mm_walk *walk) +{ + struct hwp_walk *hwp = (struct hwp_walk *)walk->private; + pte_t pte = huge_ptep_get(ptep); + struct hstate *h = hstate_vma(walk->vma); + + return check_hwpoisoned_entry(pte, addr, huge_page_shift(h), + hwp->pfn, &hwp->tk); +} +#else +#define hwpoison_hugetlb_range NULL +#endif + +static struct mm_walk_ops hwp_walk_ops = { + .pmd_entry = hwpoison_pte_range, + 
.hugetlb_entry = hwpoison_hugetlb_range, +}; + +/* + * Sends SIGBUS to the current process with the error info. + * + * This function is intended to handle "Action Required" MCEs on already + * hardware poisoned pages. They could happen, for example, when + * memory_failure() failed to unmap the error page at the first call, or + * when multiple Action Optional MCE events race on different CPUs with + * Local MCE enabled. + * + * MCE handler currently has no easy access to the error virtual address, + * so this function walks the page table to find it. One challenge here is + * to reliably get the proper virtual address of the error to report to + * applications via SIGBUS. A process could map a page multiple times to + * different virtual addresses, then we now have no way to tell which virtual + * address was accessed when the Action Required MCE was generated. + * So in such a corner case, we now give up and fall back to sending SIGBUS + * with no error info. + */ +static int kill_accessing_process(struct task_struct *p, unsigned long pfn, + int flags) +{ + int ret; + struct hwp_walk priv = { + .pfn = pfn, + }; + priv.tk.tsk = p; + + mmap_read_lock(p->mm); + ret = walk_page_range(p->mm, 0, TASK_SIZE, &hwp_walk_ops, + (void *)&priv); + if (!ret && priv.tk.addr) + kill_proc(&priv.tk, pfn, flags); + mmap_read_unlock(p->mm); + return ret ? 
-EFAULT : -EHWPOISON; +} + static const char *action_name[] = { [MF_IGNORED] = "Ignored", [MF_FAILED] = "Failed", @@ -1228,7 +1365,10 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags) if (TestSetPageHWPoison(head)) { pr_err("Memory failure: %#lx: already hardware poisoned\n", pfn); - return -EHWPOISON; + res = -EHWPOISON; + if (flags & MF_ACTION_REQUIRED) + res = kill_accessing_process(current, page_to_pfn(head), flags); + return res; } num_poisoned_pages_inc(); @@ -1437,8 +1577,11 @@ int memory_failure(unsigned long pfn, int flags) if (TestSetPageHWPoison(p)) { pr_err("Memory failure: %#lx: already hardware poisoned\n", pfn); + res = -EHWPOISON; + if (flags & MF_ACTION_REQUIRED) + res = kill_accessing_process(current, pfn, flags); mutex_unlock(&mf_mutex); - return -EHWPOISON; + return res; } orig_head = hpage = compound_head(p);
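
The single-mapping rule that set_to_kill() applies during the page table walk in patch 3/3 can be sketched in userspace as follows. This is a minimal model, assuming a flat array of page frame numbers in place of real page tables. All names ending in _sketch are hypothetical, and the 4 KiB page shift is an assumption. As in the patch, an address of 0 means "not found yet", so the sketch expects a nonzero base address.

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SHIFT_SKETCH 12	/* assumed 4 KiB pages */

struct to_kill_sketch {
	unsigned long addr;	/* 0 means no mapping recorded yet */
	short size_shift;
};

/* Records the first mapping; returns 1 if one was already recorded. */
static int set_to_kill_sketch(struct to_kill_sketch *tk, unsigned long addr,
			      short shift)
{
	if (tk->addr)
		return 1;
	tk->addr = addr;
	tk->size_shift = shift;
	return 0;
}

/*
 * pfns[i] is the page frame mapped at base + i pages; 0 means "not mapped".
 * Returns 1 as soon as the poisoned pfn is found a second time, else 0.
 */
static int find_error_vaddr_sketch(const unsigned long *pfns, size_t n,
				   unsigned long base,
				   unsigned long poisoned_pfn,
				   struct to_kill_sketch *tk)
{
	for (size_t i = 0; i < n; i++) {
		unsigned long addr = base + ((unsigned long)i << PAGE_SHIFT_SKETCH);

		if (!pfns[i] || pfns[i] != poisoned_pfn)
			continue;
		if (set_to_kill_sketch(tk, addr, PAGE_SHIFT_SKETCH))
			return 1;	/* ambiguous: abort the walk */
	}
	return 0;
}
```

A return value of 0 with tk->addr set corresponds to the case where kill_accessing_process() sends SIGBUS with the error address. A return value of 1 corresponds to the fallback where the signal is sent without error info.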