From patchwork Wed Oct 16 10:09:30 2024
X-Patchwork-Submitter: Baolin Wang
X-Patchwork-Id: 13838104
From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: akpm@linux-foundation.org, hughd@google.com
Cc: willy@infradead.org, david@redhat.com, baolin.wang@linux.alibaba.com,
    linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: [PATCH 2/2] mm: shmem: improve the tmpfs large folio read performance
Date: Wed, 16 Oct 2024 18:09:30 +0800
X-Mailer: git-send-email 2.39.3

tmpfs has already supported PMD-sized large folios, but the tmpfs read
operation still performs copying at PAGE_SIZE granularity, which is
suboptimal. Change it to copy data at the folio granularity, which can
improve the read performance, and also switch to using the folio
related functions.
Using 'fio bs=64k' to read a 1G tmpfs file populated with 2M THPs, I
can see about 20% performance improvement, and no regression with
bs=4k.

Before the patch:
READ: bw=10.0GiB/s

After the patch:
READ: bw=12.0GiB/s

Signed-off-by: Baolin Wang <baolin.wang@linux.alibaba.com>
---
 mm/shmem.c | 22 ++++++++++++----------
 1 file changed, 12 insertions(+), 10 deletions(-)

diff --git a/mm/shmem.c b/mm/shmem.c
index edab02a26aac..7e79b6a96da0 100644
--- a/mm/shmem.c
+++ b/mm/shmem.c
@@ -3108,13 +3108,12 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 	ssize_t retval = 0;
 
 	index = iocb->ki_pos >> PAGE_SHIFT;
-	offset = iocb->ki_pos & ~PAGE_MASK;
 
 	for (;;) {
 		struct folio *folio = NULL;
-		struct page *page = NULL;
 		unsigned long nr, ret;
 		loff_t end_offset, i_size = i_size_read(inode);
+		size_t fsize;
 
 		if (unlikely(iocb->ki_pos >= i_size))
 			break;
@@ -3128,8 +3127,9 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 		if (folio) {
 			folio_unlock(folio);
 
-			page = folio_file_page(folio, index);
-			if (PageHWPoison(page)) {
+			if (folio_test_hwpoison(folio) ||
+			    (folio_test_large(folio) &&
+			     folio_test_has_hwpoisoned(folio))) {
 				folio_put(folio);
 				error = -EIO;
 				break;
@@ -3147,7 +3147,12 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 			break;
 		}
 		end_offset = min_t(loff_t, i_size, iocb->ki_pos + to->count);
-		nr = min_t(loff_t, end_offset - iocb->ki_pos, PAGE_SIZE - offset);
+		if (folio)
+			fsize = folio_size(folio);
+		else
+			fsize = PAGE_SIZE;
+		offset = iocb->ki_pos & (fsize - 1);
+		nr = min_t(loff_t, end_offset - iocb->ki_pos, fsize - offset);
 
 		if (folio) {
 			/*
@@ -3156,7 +3161,7 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 			 * If users can be writing to this page using arbitrary
 			 * virtual addresses, take care about potential aliasing
 			 * before reading the page on the kernel side.
 			 */
 			if (mapping_writably_mapped(mapping))
-				flush_dcache_page(page);
+				flush_dcache_folio(folio);
 			/*
 			 * Mark the page accessed if we read the beginning.
 			 */
@@ -3166,9 +3171,8 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 			 * Ok, we have the page, and it's up-to-date, so
 			 * now we can copy it to user space...
 			 */
-			ret = copy_page_to_iter(page, offset, nr, to);
+			ret = copy_folio_to_iter(folio, offset, nr, to);
 			folio_put(folio);
-
 		} else if (user_backed_iter(to)) {
 			/*
 			 * Copy to user tends to be so well optimized, but
@@ -3186,8 +3190,6 @@ static ssize_t shmem_file_read_iter(struct kiocb *iocb, struct iov_iter *to)
 		}
 		retval += ret;
-		offset += ret;
-		offset &= ~PAGE_MASK;
 		iocb->ki_pos += ret;
 		index = iocb->ki_pos >> PAGE_SHIFT;