From patchwork Wed Oct 20 21:07:54 2021
Content-Type: text/plain; charset="utf-8"
MIME-Version: 1.0
Content-Transfer-Encoding: 7bit
X-Patchwork-Submitter: Yang Shi
X-Patchwork-Id: 12573369
From: Yang Shi
To: naoya.horiguchi@nec.com, hughd@google.com,
 kirill.shutemov@linux.intel.com, willy@infradead.org, peterx@redhat.com,
 osalvador@suse.de, akpm@linux-foundation.org
Cc: shy828301@gmail.com, linux-mm@kvack.org, linux-fsdevel@vger.kernel.org,
 linux-kernel@vger.kernel.org
Subject: [v5 PATCH 5/6] mm: shmem: don't truncate page if memory failure happens
Date: Wed, 20 Oct 2021 14:07:54 -0700
Message-Id: <20211020210755.23964-6-shy828301@gmail.com>
X-Mailer: git-send-email 2.26.2
In-Reply-To: <20211020210755.23964-1-shy828301@gmail.com>
References: <20211020210755.23964-1-shy828301@gmail.com>

The current behavior of memory failure is to truncate the page cache
regardless of whether the page is dirty or clean. If the page is dirty,
a later access will get obsolete data from disk without any notification
to the user. This may cause silent data loss. It is even worse for
shmem: since shmem is an in-memory filesystem, truncating the page cache
means discarding the data blocks, and a later read would return all
zeros.

The right approach is to keep the corrupted page in the page cache. Any
later access then returns an error for syscalls or SIGBUS for page
faults, until the file is truncated, hole punched or removed. Regular
storage-backed filesystems would be more complicated, so this patch
focuses on shmem. This also unblocks support for soft offlining shmem
THP.

Signed-off-by: Yang Shi
---
 mm/memory-failure.c | 10 +++++++++-
 mm/shmem.c          | 38 +++++++++++++++++++++++++++++++++++---
 mm/userfaultfd.c    |  5 +++++
 3 files changed, 49 insertions(+), 4 deletions(-)

diff --git a/mm/memory-failure.c b/mm/memory-failure.c
index aaeda93d26fb..3603a3acf7b3 100644
--- a/mm/memory-failure.c
+++ b/mm/memory-failure.c
@@ -57,6 +57,7 @@
 #include
 #include
 #include
+#include
 #include "internal.h"
 #include "ras/ras_event.h"
 
@@ -866,6 +867,7 @@ static int me_pagecache_clean(struct page_state *ps, struct page *p)
 {
 	int ret;
 	struct address_space *mapping;
+	bool extra_pins;
 
 	delete_from_lru_cache(p);
 
@@ -894,6 +896,12 @@ static int me_pagecache_clean(struct page_state *ps, struct page *p)
 		goto out;
 	}
 
+	/*
+	 * The shmem page is kept in page cache instead of truncating
+	 * so is expected to have an extra refcount after error-handling.
+	 */
+	extra_pins = shmem_mapping(mapping);
+
 	/*
 	 * Truncation is a bit tricky. Enable it per file system for now.
 	 *
@@ -903,7 +911,7 @@ static int me_pagecache_clean(struct page_state *ps, struct page *p)
 out:
 	unlock_page(p);
 
-	if (has_extra_refcount(ps, p, false))
+	if (has_extra_refcount(ps, p, extra_pins))
 		ret = MF_FAILED;
 
 	return ret;
diff --git a/mm/shmem.c b/mm/shmem.c
index b5860f4a2738..89062ce85db8 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -2456,6 +2456,7 @@ shmem_write_begin(struct file *file, struct address_space *mapping,
 	struct inode *inode = mapping->host;
 	struct shmem_inode_info *info = SHMEM_I(inode);
 	pgoff_t index = pos >> PAGE_SHIFT;
+	int ret = 0;
 
 	/* i_rwsem is held by caller */
 	if (unlikely(info->seals & (F_SEAL_GROW |
@@ -2466,7 +2467,15 @@ shmem_write_begin(struct file *file, struct address_space *mapping,
 			return -EPERM;
 	}
 
-	return shmem_getpage(inode, index, pagep, SGP_WRITE);
+	ret = shmem_getpage(inode, index, pagep, SGP_WRITE);
+
+	if (*pagep && PageHWPoison(*pagep)) {
+		unlock_page(*pagep);
+		put_page(*pagep);
+		ret = -EIO;
+	}
+
+	return ret;
 }
 
 static int
@@ -2553,6 +2562,12 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 			if (sgp == SGP_CACHE)
 				set_page_dirty(page);
 			unlock_page(page);
+
+			if (PageHWPoison(page)) {
+				put_page(page);
+				error = -EIO;
+				break;
+			}
 		}
 
 		/*
@@ -3114,7 +3129,8 @@ static const char *shmem_get_link(struct dentry *dentry,
 		page = find_get_page(inode->i_mapping, 0);
 		if (!page)
 			return ERR_PTR(-ECHILD);
-		if (!PageUptodate(page)) {
+		if (PageHWPoison(page) ||
+		    !PageUptodate(page)) {
 			put_page(page);
 			return ERR_PTR(-ECHILD);
 		}
@@ -3122,6 +3138,11 @@ static const char *shmem_get_link(struct dentry *dentry,
 		error = shmem_getpage(inode, 0, &page, SGP_READ);
 		if (error)
 			return ERR_PTR(error);
+		if (page && PageHWPoison(page)) {
+			unlock_page(page);
+			put_page(page);
+			return ERR_PTR(-ECHILD);
+		}
 		unlock_page(page);
 	}
 	set_delayed_call(done, shmem_put_link, page);
@@ -3772,6 +3793,13 @@ static void shmem_destroy_inodecache(void)
 	kmem_cache_destroy(shmem_inode_cachep);
 }
 
+/* Keep the page in page cache instead of truncating it */
+static int shmem_error_remove_page(struct address_space *mapping,
+				   struct page *page)
+{
+	return 0;
+}
+
 const struct address_space_operations shmem_aops = {
 	.writepage	= shmem_writepage,
 	.set_page_dirty	= __set_page_dirty_no_writeback,
@@ -3782,7 +3810,7 @@ const struct address_space_operations shmem_aops = {
 #ifdef CONFIG_MIGRATION
 	.migratepage	= migrate_page,
 #endif
-	.error_remove_page = generic_error_remove_page,
+	.error_remove_page = shmem_error_remove_page,
 };
 EXPORT_SYMBOL(shmem_aops);
 
@@ -4193,6 +4221,10 @@ struct page *shmem_read_mapping_page_gfp(struct address_space *mapping,
 		page = ERR_PTR(error);
 	else
 		unlock_page(page);
+
+	if (PageHWPoison(page))
+		page = ERR_PTR(-EIO);
+
 	return page;
 #else
 	/*
diff --git a/mm/userfaultfd.c b/mm/userfaultfd.c
index 7a9008415534..b688d5327177 100644
--- a/mm/userfaultfd.c
+++ b/mm/userfaultfd.c
@@ -233,6 +233,11 @@ static int mcontinue_atomic_pte(struct mm_struct *dst_mm,
 		goto out;
 	}
 
+	if (PageHWPoison(page)) {
+		ret = -EIO;
+		goto out_release;
+	}
+
 	ret = mfill_atomic_install_pte(dst_mm, dst_pmd, dst_vma, dst_addr,
 				       page, false, wp_copy);
 	if (ret)
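
For readers who want to see the behavioral change in isolation, the
following is a toy userspace model of it, not kernel code. Every toy_*
name is hypothetical and exists only for this illustration. It also
assumes that a page missing from shmem reads back as zeros, as the
changelog describes. The model contrasts truncating on memory failure
with keeping the poisoned page and returning -EIO until a hole is
punched.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Toy userspace model. Every toy_* name is hypothetical, not a kernel API. */
#define TOY_PAGE_SIZE 8

struct toy_page {
	bool cached;	/* page is still present in the toy page cache */
	bool hwpoison;	/* stands in for the kernel's PageHWPoison() test */
	char data[TOY_PAGE_SIZE];
};

/* Old behavior: memory failure truncates the page, so the data is gone. */
void toy_memory_failure_truncate(struct toy_page *page)
{
	page->cached = false;
	memset(page->data, 0, sizeof(page->data));
}

/* Behavior described by this patch: keep the page cached, only flag it. */
void toy_memory_failure_keep(struct toy_page *page)
{
	page->hwpoison = true;
}

/* Read one page. Returns the byte count, or a negative errno on failure. */
int toy_read(const struct toy_page *page, char *buf)
{
	if (!page->cached) {
		/* Assumption: a missing shmem page reads back as zeros. */
		memset(buf, 0, TOY_PAGE_SIZE);
		return TOY_PAGE_SIZE;
	}
	if (page->hwpoison)
		return -EIO;	/* report the error instead of stale data */
	memcpy(buf, page->data, TOY_PAGE_SIZE);
	return TOY_PAGE_SIZE;
}

/* Hole punch drops the poisoned page; later reads see a hole again. */
void toy_punch_hole(struct toy_page *page)
{
	page->cached = false;
	page->hwpoison = false;
	memset(page->data, 0, sizeof(page->data));
}

/* Walks both behaviors; returns 0 if the model behaves as described. */
int toy_selfcheck(void)
{
	struct toy_page old_page = { .cached = true, .data = "shmem!!" };
	struct toy_page new_page = old_page;
	const char zeros[TOY_PAGE_SIZE] = { 0 };
	char buf[TOY_PAGE_SIZE];

	/* Truncating: the read succeeds but silently returns zeros. */
	toy_memory_failure_truncate(&old_page);
	assert(toy_read(&old_page, buf) == TOY_PAGE_SIZE);
	assert(memcmp(buf, zeros, TOY_PAGE_SIZE) == 0);

	/* Keeping the page: the read fails loudly with -EIO. */
	toy_memory_failure_keep(&new_page);
	assert(toy_read(&new_page, buf) == -EIO);

	/* After a hole punch the error goes away. */
	toy_punch_hole(&new_page);
	assert(toy_read(&new_page, buf) == TOY_PAGE_SIZE);
	return 0;
}
```

Calling toy_selfcheck() from a small main() exercises both paths. The
model deliberately ignores locking and reference counting, which the
real hunks above handle with unlock_page() and put_page().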