From patchwork Mon Apr 12 22:43:19 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Naoya Horiguchi
X-Patchwork-Id: 12198987
From: Naoya Horiguchi
To: linux-mm@kvack.org, Tony Luck, Aili Yao
Cc: Andrew Morton, Oscar Salvador, David Hildenbrand, Borislav Petkov,
 Andy Lutomirski, Naoya Horiguchi, linux-kernel@vger.kernel.org
Subject: [PATCH v1 2/3] mm,hwpoison: return -EHWPOISON when page already
Date: Tue, 13 Apr 2021 07:43:19 +0900
Message-Id: <20210412224320.1747638-3-nao.horiguchi@gmail.com>
X-Mailer: git-send-email 2.25.1
In-Reply-To: <20210412224320.1747638-1-nao.horiguchi@gmail.com>
References: <20210412224320.1747638-1-nao.horiguchi@gmail.com>
Sender: owner-linux-mm@kvack.org
Precedence: bulk

From: Aili Yao

When a page is already poisoned, another memory_failure() call on the
same page currently returns 0, meaning OK. For nested MCE (machine
check exception) handling, this behavior can make one MCE loop.
Example:

1. LMCE is enabled, and two processes A and B run on different cores
   X and Y respectively and access the same page. The page is
   corrupted when process A accesses it, an MCE is raised on core X,
   and error handling for it is underway.

2. B then accesses the page and triggers another MCE on core Y. Error
   handling runs there as well, sees TestSetPageHWPoison() return
   true, and memory_failure() returns 0.

3. kill_me_maybe() checks the return value:

   static void kill_me_maybe(struct callback_head *cb)
   {
   	...
   	if (!memory_failure(p->mce_addr >> PAGE_SHIFT, flags) &&
   	    !(p->mce_kflags & MCE_IN_KERNEL_COPYIN)) {
   		set_mce_nospec(p->mce_addr >> PAGE_SHIFT, p->mce_whole_page);
   		sync_core();
   		return;
   	}
   	...
   }

4. Error handling for B ends, and nothing may happen to B if
   kill-early is not set. Process B re-executes the instruction, gets
   into the MCE again, and the loop happens.

The set_mce_nospec() call here is also not proper; refer to commit
fd0e786d9d09 ("x86/mm, mm/hwpoison: Don't unconditionally unmap kernel
1:1 pages").

Other callers that care about the return value of memory_failure()
should check why they want to process a memory error which has already
been processed. This behavior seems reasonable.

Signed-off-by: Aili Yao
Signed-off-by: Naoya Horiguchi
---
 mm/memory-failure.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git v5.12-rc5/mm/memory-failure.c v5.12-rc5_patched/mm/memory-failure.c
index c1509f4b565e..368ef77e01f9 100644
--- v5.12-rc5/mm/memory-failure.c
+++ v5.12-rc5_patched/mm/memory-failure.c
@@ -1228,7 +1228,7 @@ static int memory_failure_hugetlb(unsigned long pfn, int flags)
 	if (TestSetPageHWPoison(head)) {
 		pr_err("Memory failure: %#lx: already hardware poisoned\n",
 		       pfn);
-		return 0;
+		return -EHWPOISON;
 	}
 
 	num_poisoned_pages_inc();
@@ -1438,7 +1438,7 @@ int memory_failure(unsigned long pfn, int flags)
 		pr_err("Memory failure: %#lx: already hardware poisoned\n",
 			pfn);
 		mutex_unlock(&mf_mutex);
-		return 0;
+		return -EHWPOISON;
 	}
 
 	orig_head = hpage = compound_head(p);