From patchwork Fri Oct 25 03:26:39 2024
X-Patchwork-Submitter: Baolin Wang
X-Patchwork-Id: 13849987
From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, david@redhat.com, wangkefeng.wang@huawei.com,
    shy828301@gmail.com, dhowells@redhat.com, baolin.wang@linux.alibaba.com,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH] mm: shmem: fallback to page size splice if large folio has poisoned subpages
Date: Fri, 25 Oct 2024 11:26:39 +0800

tmpfs already supports PMD-sized large folios, but splice() cannot read
any subpage of a large folio that contains a poisoned subpage. That is
undesirable, as discussed in a previous mail [1]. Add a fallback to
PAGE_SIZE splice() so that the normal subpages can still be read when a
large folio has hwpoisoned subpages.

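Illustration only, not part of the patch: the per-iteration length clamp
described above can be modelled in user space roughly as follows. This is a
minimal sketch under stated assumptions: MODEL_PAGE_SIZE (fixed at 4096 here)
and model_splice_part() are hypothetical names invented for this example, and
the model ignores folio lookup, locking and the actual hwpoison checks.

```c
#include <assert.h>
#include <stddef.h>

/* Assumption: 4 KiB pages. The kernel's PAGE_SIZE depends on the architecture. */
#define MODEL_PAGE_SIZE 4096UL

/*
 * Hypothetical helper (not a kernel function): number of bytes one loop
 * iteration would splice, given the file position, the inode size, the
 * remaining request length, and whether the large folio backing 'pos'
 * contains a hwpoisoned subpage.
 */
static size_t model_splice_part(unsigned long long pos,
				unsigned long long isize,
				size_t len, int fallback_page_splice)
{
	size_t size = len;

	/* Nothing to read at or beyond end of file. */
	if (pos >= isize)
		return 0;

	if (fallback_page_splice) {
		/* Stop at the end of the current page-sized subpage. */
		size_t offset = (size_t)(pos & (MODEL_PAGE_SIZE - 1));
		size_t room = MODEL_PAGE_SIZE - offset;

		if (room < size)
			size = room;
	}

	/* Never read past end of file. */
	if (isize - pos < size)
		size = (size_t)(isize - pos);

	return size;
}

/* Small self-check exercising both the normal and the fallback path. */
static void model_selfcheck(void)
{
	const unsigned long long two_mib = 2ULL * 1024 * 1024;

	assert(model_splice_part(0, two_mib, 65536, 0) == 65536);
	assert(model_splice_part(0, two_mib, 65536, 1) == 4096);
	assert(model_splice_part(100, two_mib, 65536, 1) == 3996);
}
```

In this model, a poisoned subpage only limits each iteration to the current
page. The next iteration starts on the following page boundary, where the
per-page poison check decides whether reading continues or fails with -EIO.
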
[1] https://lore.kernel.org/all/Zw_d0EVAJkpNJEbA@casper.infradead.org/

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/shmem.c | 39 +++++++++++++++++++++++++++++++--------
 1 file changed, 31 insertions(+), 8 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index 1bef6e32a1fa..79010e636056 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3291,11 +3291,16 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
 	len = min_t(size_t, len, npages * PAGE_SIZE);
 
 	do {
+		bool fallback_page_splice = false;
+		struct page *page = NULL;
+		pgoff_t index;
+		size_t size;
+
 		if (*ppos >= i_size_read(inode))
 			break;
 
-		error = shmem_get_folio(inode, *ppos / PAGE_SIZE, 0, &folio,
-					SGP_READ);
+		index = *ppos >> PAGE_SHIFT;
+		error = shmem_get_folio(inode, index, 0, &folio, SGP_READ);
 		if (error) {
 			if (error == -EINVAL)
 				error = 0;
@@ -3304,12 +3309,15 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
 		if (folio) {
 			folio_unlock(folio);
 
-			if (folio_test_hwpoison(folio) ||
-			    (folio_test_large(folio) &&
-			     folio_test_has_hwpoisoned(folio))) {
+			page = folio_file_page(folio, index);
+			if (PageHWPoison(page)) {
 				error = -EIO;
 				break;
 			}
+
+			if (folio_test_large(folio) &&
+			    folio_test_has_hwpoisoned(folio))
+				fallback_page_splice = true;
 		}
 
 		/*
@@ -3323,7 +3331,18 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
 		isize = i_size_read(inode);
 		if (unlikely(*ppos >= isize))
 			break;
-		part = min_t(loff_t, isize - *ppos, len);
+		/*
+		 * Fallback to PAGE_SIZE splice if the large folio has hwpoisoned
+		 * subpages.
+		 */
+		if (likely(!fallback_page_splice)) {
+			size = len;
+		} else {
+			size_t offset = *ppos & ~PAGE_MASK;
+
+			size = min_t(loff_t, PAGE_SIZE - offset, len);
+		}
+		part = min_t(loff_t, isize - *ppos, size);
 
 		if (folio) {
 			/*
@@ -3331,8 +3350,12 @@ static ssize_t shmem_file_splice_read(struct file *in, loff_t *ppos,
 			 * virtual addresses, take care about potential aliasing
 			 * before reading the page on the kernel side.
 			 */
-			if (mapping_writably_mapped(mapping))
-				flush_dcache_folio(folio);
+			if (mapping_writably_mapped(mapping)) {
+				if (likely(!fallback_page_splice))
+					flush_dcache_folio(folio);
+				else
+					flush_dcache_page(page);
+			}
 			folio_mark_accessed(folio);
 			/*
 			 * Ok, we have the page, and it's up-to-date, so we can