From patchwork Thu Jan 18 22:19:39 2024
X-Patchwork-Submitter: Dave Chinner
X-Patchwork-Id: 13523246
From: Dave Chinner
To: linux-xfs@vger.kernel.org
Cc: willy@infradead.org, linux-mm@kvack.org
Subject: [PATCH 1/3] xfs: unmapped buffer item size straddling mismatch
Date: Fri, 19 Jan 2024 09:19:39 +1100
Message-ID: <20240118222216.4131379-2-david@fromorbit.com>
In-Reply-To: <20240118222216.4131379-1-david@fromorbit.com>
References: <20240118222216.4131379-1-david@fromorbit.com>

From: Dave Chinner

We never log large contiguous regions of unmapped buffers, so this
bug is never triggered by the current code. However, the slowpath for
formatting buffer straddling regions is broken.

That is, the size and shape of the log vector calculated across a
straddle does not match how the formatting code formats a straddle.
This results in a log vector with an uninitialised iovec and this
causes a crash when xlog_write_full() goes to copy the iovec into the
journal.

Whilst touching this code, don't bother checking mapped or single
folio buffers for discontiguous regions because they don't have them.
This significantly reduces the overhead of this check when logging
large buffers as calling xfs_buf_offset() is not free and it occurs a
*lot* in those cases.

Fixes: 929f8b0deb83 ("xfs: optimise xfs_buf_item_size/format for contiguous regions")
Signed-off-by: Dave Chinner
Reviewed-by: Christoph Hellwig
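[Editorial note: to make the size/format mismatch easier to see, here is a
minimal standalone userspace sketch of the counting rule, not the kernel
code itself. A dirty-chunk bitmap must be split into log vectors by exactly
the same rule the format step uses, otherwise the format step emits an iovec
the size step never counted. straddle() and CHUNK below are hypothetical
stand-ins for xfs_buf_item_straddle() and XFS_BLF_CHUNK.]

/*
 * Model of the slow-path counting: one vector per contiguous run of
 * dirty chunks, starting a new run wherever the run is broken or would
 * straddle a discontiguity in an unmapped buffer.
 */
#include <stdbool.h>
#include <stdio.h>

#define CHUNK	128	/* stand-in for XFS_BLF_CHUNK */

/* pretend chunks 4 and 5 sit in different unmapped pages */
static bool straddle(int first, int nbits)
{
	return first < 5 && first + nbits > 5;
}

static void count(const bool *dirty, int nchunks, int *nvecs, int *nbytes)
{
	int i = 0;

	*nvecs = 0;
	*nbytes = 0;
	while (i < nchunks) {
		int first = i, nbits = 0;

		if (!dirty[i]) {
			i++;
			continue;
		}
		/* extend the run while it stays contiguous and non-straddling */
		while (i < nchunks && dirty[i] && !straddle(first, nbits + 1)) {
			nbits++;
			i++;
		}
		(*nvecs)++;
		*nbytes += nbits * CHUNK;
	}
}

int main(void)
{
	/* chunks 3-6 dirty: the run straddles between 4 and 5, so two vectors */
	bool dirty[8] = { 0, 0, 0, 1, 1, 1, 1, 0 };
	int nvecs, nbytes;

	count(dirty, 8, &nvecs, &nbytes);
	printf("nvecs=%d nbytes=%d\n", nvecs, nbytes);	/* nvecs=2 nbytes=512 */
	return 0;
}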
---
 fs/xfs/xfs_buf_item.c | 18 ++++++++++++++----
 1 file changed, 14 insertions(+), 4 deletions(-)

diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index 43031842341a..83a81cb52d8e 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -56,6 +56,10 @@ xfs_buf_log_format_size(
 		(blfp->blf_map_size * sizeof(blfp->blf_data_map[0]));
 }
 
+/*
+ * We only have to worry about discontiguous buffer range straddling on unmapped
+ * buffers. Everything else will have a contiguous data region we can copy from.
+ */
 static inline bool
 xfs_buf_item_straddle(
 	struct xfs_buf		*bp,
@@ -65,6 +69,9 @@ xfs_buf_item_straddle(
 {
 	void			*first, *last;
 
+	if (bp->b_page_count == 1 || !(bp->b_flags & XBF_UNMAPPED))
+		return false;
+
 	first = xfs_buf_offset(bp, offset + (first_bit << XFS_BLF_SHIFT));
 	last = xfs_buf_offset(bp,
 			offset + ((first_bit + nbits) << XFS_BLF_SHIFT));
@@ -132,11 +139,13 @@ xfs_buf_item_size_segment(
 	return;
 
 slow_scan:
-	/* Count the first bit we jumped out of the above loop from */
-	(*nvecs)++;
-	*nbytes += XFS_BLF_CHUNK;
+	ASSERT(bp->b_addr == NULL);
 	last_bit = first_bit;
+	nbits = 1;
 	while (last_bit != -1) {
+
+		*nbytes += XFS_BLF_CHUNK;
+
 		/*
 		 * This takes the bit number to start looking from and
 		 * returns the next set bit from there.  It returns -1
@@ -151,6 +160,8 @@ xfs_buf_item_size_segment(
 		 * else keep scanning the current set of bits.
 		 */
 		if (next_bit == -1) {
+			if (first_bit != last_bit)
+				(*nvecs)++;
 			break;
 		} else if (next_bit != last_bit + 1 ||
 			   xfs_buf_item_straddle(bp, offset, first_bit, nbits)) {
@@ -162,7 +173,6 @@ xfs_buf_item_size_segment(
 			last_bit++;
 			nbits++;
 		}
-		*nbytes += XFS_BLF_CHUNK;
 	}
 }
 
From patchwork Thu Jan 18 22:19:40 2024
X-Patchwork-Submitter: Dave Chinner
X-Patchwork-Id: 13523249

From: Dave Chinner
To: linux-xfs@vger.kernel.org
Cc: willy@infradead.org, linux-mm@kvack.org
Subject: [PATCH 2/3] xfs: use folios in the buffer cache
Date: Fri, 19 Jan 2024 09:19:40 +1100
Message-ID: <20240118222216.4131379-3-david@fromorbit.com>
In-Reply-To: <20240118222216.4131379-1-david@fromorbit.com>
References: <20240118222216.4131379-1-david@fromorbit.com>

From: Dave Chinner

Convert the use of struct pages to struct folio everywhere. This is
just a direct API conversion; no actual logic changes should result.

Note: this conversion currently assumes only single page folios are
allocated. Because some of the MM interfaces we use take pointers to
arrays of struct pages, it relies on the address of a single page
folio being the same as the address of its struct page, e.g. for
alloc_pages_bulk_array(), vm_map_ram(), etc.
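[Editorial note: a quick standalone userspace sketch, not part of the patch,
of the single-page-folio sizing and indexing the conversion keeps: a buffer
of b_length 512-byte basic blocks is backed by DIV_ROUND_UP(BBTOB(len),
PAGE_SIZE) folios, and a byte offset maps to folio (offset >> PAGE_SHIFT) at
(offset & (PAGE_SIZE - 1)), as xfs_buf_offset() does for multi-folio
buffers. The BBTOB/PAGE_SIZE definitions below are local stand-ins that
assume 4kB pages.]

#include <stdio.h>

#define BBSHIFT		9			/* 512-byte basic blocks */
#define BBTOB(bbs)	((bbs) << BBSHIFT)
#define PAGE_SHIFT	12			/* assume 4kB pages */
#define PAGE_SIZE	((size_t)1 << PAGE_SHIFT)
#define DIV_ROUND_UP(n, d)	(((n) + (d) - 1) / (d))

int main(void)
{
	unsigned int b_length = 16;		/* 16 BBs = an 8kB buffer */
	size_t folio_count = DIV_ROUND_UP(BBTOB(b_length), PAGE_SIZE);
	size_t offset = 5000;			/* byte offset into the buffer */

	printf("folios needed: %zu\n", folio_count);	/* 2 */
	printf("offset %zu -> folio %zu, offset-in-folio %zu\n",
	       offset, offset >> PAGE_SHIFT, offset & (PAGE_SIZE - 1));
	return 0;
}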
Signed-off-by: Dave Chinner --- fs/xfs/xfs_buf.c | 127 +++++++++++++++++++++--------------------- fs/xfs/xfs_buf.h | 14 ++--- fs/xfs/xfs_buf_item.c | 2 +- fs/xfs/xfs_linux.h | 8 +++ 4 files changed, 80 insertions(+), 71 deletions(-) diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c index 08f2fbc04db5..15907e92d0d3 100644 --- a/fs/xfs/xfs_buf.c +++ b/fs/xfs/xfs_buf.c @@ -60,25 +60,25 @@ xfs_buf_submit( return __xfs_buf_submit(bp, !(bp->b_flags & XBF_ASYNC)); } +/* + * Return true if the buffer is vmapped. + * + * b_addr is null if the buffer is not mapped, but the code is clever enough to + * know it doesn't have to map a single folio, so the check has to be both for + * b_addr and bp->b_folio_count > 1. + */ static inline int xfs_buf_is_vmapped( struct xfs_buf *bp) { - /* - * Return true if the buffer is vmapped. - * - * b_addr is null if the buffer is not mapped, but the code is clever - * enough to know it doesn't have to map a single page, so the check has - * to be both for b_addr and bp->b_page_count > 1. - */ - return bp->b_addr && bp->b_page_count > 1; + return bp->b_addr && bp->b_folio_count > 1; } static inline int xfs_buf_vmap_len( struct xfs_buf *bp) { - return (bp->b_page_count * PAGE_SIZE); + return (bp->b_folio_count * PAGE_SIZE); } /* @@ -197,7 +197,7 @@ xfs_buf_get_maps( } /* - * Frees b_pages if it was allocated. + * Frees b_maps if it was allocated. */ static void xfs_buf_free_maps( @@ -273,26 +273,26 @@ _xfs_buf_alloc( } static void -xfs_buf_free_pages( +xfs_buf_free_folios( struct xfs_buf *bp) { uint i; - ASSERT(bp->b_flags & _XBF_PAGES); + ASSERT(bp->b_flags & _XBF_FOLIOS); if (xfs_buf_is_vmapped(bp)) - vm_unmap_ram(bp->b_addr, bp->b_page_count); + vm_unmap_ram(bp->b_addr, bp->b_folio_count); - for (i = 0; i < bp->b_page_count; i++) { - if (bp->b_pages[i]) - __free_page(bp->b_pages[i]); + for (i = 0; i < bp->b_folio_count; i++) { + if (bp->b_folios[i]) + __folio_put(bp->b_folios[i]); } - mm_account_reclaimed_pages(bp->b_page_count); + mm_account_reclaimed_pages(bp->b_folio_count); - if (bp->b_pages != bp->b_page_array) - kfree(bp->b_pages); - bp->b_pages = NULL; - bp->b_flags &= ~_XBF_PAGES; + if (bp->b_folios != bp->b_folio_array) + kfree(bp->b_folios); + bp->b_folios = NULL; + bp->b_flags &= ~_XBF_FOLIOS; } static void @@ -313,8 +313,8 @@ xfs_buf_free( ASSERT(list_empty(&bp->b_lru)); - if (bp->b_flags & _XBF_PAGES) - xfs_buf_free_pages(bp); + if (bp->b_flags & _XBF_FOLIOS) + xfs_buf_free_folios(bp); else if (bp->b_flags & _XBF_KMEM) kfree(bp->b_addr); @@ -345,15 +345,15 @@ xfs_buf_alloc_kmem( return -ENOMEM; } bp->b_offset = offset_in_page(bp->b_addr); - bp->b_pages = bp->b_page_array; - bp->b_pages[0] = kmem_to_page(bp->b_addr); - bp->b_page_count = 1; + bp->b_folios = bp->b_folio_array; + bp->b_folios[0] = kmem_to_folio(bp->b_addr); + bp->b_folio_count = 1; bp->b_flags |= _XBF_KMEM; return 0; } static int -xfs_buf_alloc_pages( +xfs_buf_alloc_folios( struct xfs_buf *bp, xfs_buf_flags_t flags) { @@ -364,16 +364,16 @@ xfs_buf_alloc_pages( gfp_mask |= __GFP_NORETRY; /* Make sure that we have a page list */ - bp->b_page_count = DIV_ROUND_UP(BBTOB(bp->b_length), PAGE_SIZE); - if (bp->b_page_count <= XB_PAGES) { - bp->b_pages = bp->b_page_array; + bp->b_folio_count = DIV_ROUND_UP(BBTOB(bp->b_length), PAGE_SIZE); + if (bp->b_folio_count <= XB_FOLIOS) { + bp->b_folios = bp->b_folio_array; } else { - bp->b_pages = kzalloc(sizeof(struct page *) * bp->b_page_count, + bp->b_folios = kzalloc(sizeof(struct folio *) * bp->b_folio_count, gfp_mask); - if (!bp->b_pages) + if 
(!bp->b_folios) return -ENOMEM; } - bp->b_flags |= _XBF_PAGES; + bp->b_flags |= _XBF_FOLIOS; /* Assure zeroed buffer for non-read cases. */ if (!(flags & XBF_READ)) @@ -387,9 +387,9 @@ xfs_buf_alloc_pages( for (;;) { long last = filled; - filled = alloc_pages_bulk_array(gfp_mask, bp->b_page_count, - bp->b_pages); - if (filled == bp->b_page_count) { + filled = alloc_pages_bulk_array(gfp_mask, bp->b_folio_count, + (struct page **)bp->b_folios); + if (filled == bp->b_folio_count) { XFS_STATS_INC(bp->b_mount, xb_page_found); break; } @@ -398,7 +398,7 @@ xfs_buf_alloc_pages( continue; if (flags & XBF_READ_AHEAD) { - xfs_buf_free_pages(bp); + xfs_buf_free_folios(bp); return -ENOMEM; } @@ -412,14 +412,14 @@ xfs_buf_alloc_pages( * Map buffer into kernel address-space if necessary. */ STATIC int -_xfs_buf_map_pages( +_xfs_buf_map_folios( struct xfs_buf *bp, xfs_buf_flags_t flags) { - ASSERT(bp->b_flags & _XBF_PAGES); - if (bp->b_page_count == 1) { + ASSERT(bp->b_flags & _XBF_FOLIOS); + if (bp->b_folio_count == 1) { /* A single page buffer is always mappable */ - bp->b_addr = page_address(bp->b_pages[0]); + bp->b_addr = folio_address(bp->b_folios[0]); } else if (flags & XBF_UNMAPPED) { bp->b_addr = NULL; } else { @@ -443,8 +443,8 @@ _xfs_buf_map_pages( */ nofs_flag = memalloc_nofs_save(); do { - bp->b_addr = vm_map_ram(bp->b_pages, bp->b_page_count, - -1); + bp->b_addr = vm_map_ram((struct page **)bp->b_folios, + bp->b_folio_count, -1); if (bp->b_addr) break; vm_unmap_aliases(); @@ -571,7 +571,7 @@ xfs_buf_find_lock( return -ENOENT; } ASSERT((bp->b_flags & _XBF_DELWRI_Q) == 0); - bp->b_flags &= _XBF_KMEM | _XBF_PAGES; + bp->b_flags &= _XBF_KMEM | _XBF_FOLIOS; bp->b_ops = NULL; } return 0; @@ -629,14 +629,15 @@ xfs_buf_find_insert( goto out_drop_pag; /* - * For buffers that fit entirely within a single page, first attempt to - * allocate the memory from the heap to minimise memory usage. If we - * can't get heap memory for these small buffers, we fall back to using - * the page allocator. + * For buffers that fit entirely within a single page folio, first + * attempt to allocate the memory from the heap to minimise memory + * usage. If we can't get heap memory for these small buffers, we fall + * back to using the page allocator. */ + if (BBTOB(new_bp->b_length) >= PAGE_SIZE || xfs_buf_alloc_kmem(new_bp, flags) < 0) { - error = xfs_buf_alloc_pages(new_bp, flags); + error = xfs_buf_alloc_folios(new_bp, flags); if (error) goto out_free_buf; } @@ -728,11 +729,11 @@ xfs_buf_get_map( /* We do not hold a perag reference anymore. 
*/ if (!bp->b_addr) { - error = _xfs_buf_map_pages(bp, flags); + error = _xfs_buf_map_folios(bp, flags); if (unlikely(error)) { xfs_warn_ratelimited(btp->bt_mount, - "%s: failed to map %u pages", __func__, - bp->b_page_count); + "%s: failed to map %u folios", __func__, + bp->b_folio_count); xfs_buf_relse(bp); return error; } @@ -963,14 +964,14 @@ xfs_buf_get_uncached( if (error) return error; - error = xfs_buf_alloc_pages(bp, flags); + error = xfs_buf_alloc_folios(bp, flags); if (error) goto fail_free_buf; - error = _xfs_buf_map_pages(bp, 0); + error = _xfs_buf_map_folios(bp, 0); if (unlikely(error)) { xfs_warn(target->bt_mount, - "%s: failed to map pages", __func__); + "%s: failed to map folios", __func__); goto fail_free_buf; } @@ -1465,7 +1466,7 @@ xfs_buf_ioapply_map( blk_opf_t op) { int page_index; - unsigned int total_nr_pages = bp->b_page_count; + unsigned int total_nr_pages = bp->b_folio_count; int nr_pages; struct bio *bio; sector_t sector = bp->b_maps[map].bm_bn; @@ -1503,7 +1504,7 @@ xfs_buf_ioapply_map( if (nbytes > size) nbytes = size; - rbytes = bio_add_page(bio, bp->b_pages[page_index], nbytes, + rbytes = bio_add_folio(bio, bp->b_folios[page_index], nbytes, offset); if (rbytes < nbytes) break; @@ -1716,13 +1717,13 @@ xfs_buf_offset( struct xfs_buf *bp, size_t offset) { - struct page *page; + struct folio *folio; if (bp->b_addr) return bp->b_addr + offset; - page = bp->b_pages[offset >> PAGE_SHIFT]; - return page_address(page) + (offset & (PAGE_SIZE-1)); + folio = bp->b_folios[offset >> PAGE_SHIFT]; + return folio_address(folio) + (offset & (PAGE_SIZE-1)); } void @@ -1735,18 +1736,18 @@ xfs_buf_zero( bend = boff + bsize; while (boff < bend) { - struct page *page; + struct folio *folio; int page_index, page_offset, csize; page_index = (boff + bp->b_offset) >> PAGE_SHIFT; page_offset = (boff + bp->b_offset) & ~PAGE_MASK; - page = bp->b_pages[page_index]; + folio = bp->b_folios[page_index]; csize = min_t(size_t, PAGE_SIZE - page_offset, BBTOB(bp->b_length) - boff); ASSERT((csize + page_offset) <= PAGE_SIZE); - memset(page_address(page) + page_offset, 0, csize); + memset(folio_address(folio) + page_offset, 0, csize); boff += csize; } diff --git a/fs/xfs/xfs_buf.h b/fs/xfs/xfs_buf.h index b470de08a46c..1e7298ff3fa5 100644 --- a/fs/xfs/xfs_buf.h +++ b/fs/xfs/xfs_buf.h @@ -29,7 +29,7 @@ struct xfs_buf; #define XBF_READ_AHEAD (1u << 2) /* asynchronous read-ahead */ #define XBF_NO_IOACCT (1u << 3) /* bypass I/O accounting (non-LRU bufs) */ #define XBF_ASYNC (1u << 4) /* initiator will not wait for completion */ -#define XBF_DONE (1u << 5) /* all pages in the buffer uptodate */ +#define XBF_DONE (1u << 5) /* all folios in the buffer uptodate */ #define XBF_STALE (1u << 6) /* buffer has been staled, do not find it */ #define XBF_WRITE_FAIL (1u << 7) /* async writes have failed on this buffer */ @@ -39,7 +39,7 @@ struct xfs_buf; #define _XBF_LOGRECOVERY (1u << 18)/* log recovery buffer */ /* flags used only internally */ -#define _XBF_PAGES (1u << 20)/* backed by refcounted pages */ +#define _XBF_FOLIOS (1u << 20)/* backed by refcounted folios */ #define _XBF_KMEM (1u << 21)/* backed by heap memory */ #define _XBF_DELWRI_Q (1u << 22)/* buffer on a delwri queue */ @@ -68,7 +68,7 @@ typedef unsigned int xfs_buf_flags_t; { _XBF_INODES, "INODES" }, \ { _XBF_DQUOTS, "DQUOTS" }, \ { _XBF_LOGRECOVERY, "LOG_RECOVERY" }, \ - { _XBF_PAGES, "PAGES" }, \ + { _XBF_FOLIOS, "FOLIOS" }, \ { _XBF_KMEM, "KMEM" }, \ { _XBF_DELWRI_Q, "DELWRI_Q" }, \ /* The following interface flags should never be set */ \ 
@@ -116,7 +116,7 @@ typedef struct xfs_buftarg {
 	struct ratelimit_state	bt_ioerror_rl;
 } xfs_buftarg_t;
 
-#define XB_PAGES	2
+#define XB_FOLIOS	2
 
 struct xfs_buf_map {
 	xfs_daddr_t		bm_bn;	/* block number for I/O */
@@ -180,14 +180,14 @@ struct xfs_buf {
 	struct xfs_buf_log_item	*b_log_item;
 	struct list_head	b_li_list;	/* Log items list head */
 	struct xfs_trans	*b_transp;
-	struct page		**b_pages;	/* array of page pointers */
-	struct page		*b_page_array[XB_PAGES]; /* inline pages */
+	struct folio		**b_folios;	/* array of folio pointers */
+	struct folio		*b_folio_array[XB_FOLIOS]; /* inline folios */
 	struct xfs_buf_map	*b_maps;	/* compound buffer map */
 	struct xfs_buf_map	__b_map;	/* inline compound buffer map */
 	int			b_map_count;
 	atomic_t		b_pin_count;	/* pin count */
 	atomic_t		b_io_remaining;	/* #outstanding I/O requests */
-	unsigned int		b_page_count;	/* size of page array */
+	unsigned int		b_folio_count;	/* size of folio array */
 	unsigned int		b_offset;	/* page offset of b_addr,
 						   only for _XBF_KMEM buffers */
 	int			b_error;	/* error code on I/O */
diff --git a/fs/xfs/xfs_buf_item.c b/fs/xfs/xfs_buf_item.c
index 83a81cb52d8e..d1407cee48d9 100644
--- a/fs/xfs/xfs_buf_item.c
+++ b/fs/xfs/xfs_buf_item.c
@@ -69,7 +69,7 @@ xfs_buf_item_straddle(
 {
 	void			*first, *last;
 
-	if (bp->b_page_count == 1 || !(bp->b_flags & XBF_UNMAPPED))
+	if (bp->b_folio_count == 1 || !(bp->b_flags & XBF_UNMAPPED))
 		return false;
 
 	first = xfs_buf_offset(bp, offset + (first_bit << XFS_BLF_SHIFT));
diff --git a/fs/xfs/xfs_linux.h b/fs/xfs/xfs_linux.h
index caccb7f76690..804389b8e802 100644
--- a/fs/xfs/xfs_linux.h
+++ b/fs/xfs/xfs_linux.h
@@ -279,4 +279,12 @@ kmem_to_page(void *addr)
 	return virt_to_page(addr);
 }
 
+static inline struct folio *
+kmem_to_folio(void *addr)
+{
+	if (is_vmalloc_addr(addr))
+		return page_folio(vmalloc_to_page(addr));
+	return virt_to_folio(addr);
+}
+
 #endif	/* __XFS_LINUX__ */

From patchwork Thu Jan 18 22:19:41 2024
X-Patchwork-Submitter: Dave Chinner
X-Patchwork-Id: 13523248
From: Dave Chinner
To: linux-xfs@vger.kernel.org
Cc: willy@infradead.org, linux-mm@kvack.org
Subject: [PATCH 3/3] xfs: convert buffer cache to use high order folios
Date: Fri, 19 Jan 2024 09:19:41 +1100
Message-ID: <20240118222216.4131379-4-david@fromorbit.com>
In-Reply-To: <20240118222216.4131379-1-david@fromorbit.com>
References: <20240118222216.4131379-1-david@fromorbit.com>

From: Dave Chinner

Now that we have the buffer cache using the folio API, we can extend
the use of folios to allocate high order folios for multi-page buffers
rather than an array of single pages that are then vmapped into a
contiguous range.

This creates two types of buffers: single folio buffers that can have
arbitrary order, and multi-folio buffers made up of many single page
folios that get vmapped. The latter is essentially the existing code,
so there are no logic changes to handle this case.

There are a few places where we iterate the folios on a buffer. These
need to be converted to handle the high order folio case. Luckily,
this only occurs when bp->b_folio_count == 1, and the code for
handling this case is just a simple application of the folio API to
the operations that need to be performed.

The code that allocates buffers will optimistically attempt a high
order folio allocation as a fast path. If this high order allocation
fails, then we fall back to the existing multi-folio allocation code.
This now forms the slow allocation path, and hopefully will be largely
unused in normal conditions.

This should improve performance of large buffer operations (e.g.
large directory block sizes) as we should now mostly avoid the expense
of vmapping large buffers (and the vmap lock contention that can
occur) as well as avoid the runtime pressure that frequently accessing
kernel vmapped pages puts on the TLBs.

Signed-off-by: Dave Chinner
---
 fs/xfs/xfs_buf.c | 150 +++++++++++++++++++++++++++++++++++++----------
 1 file changed, 119 insertions(+), 31 deletions(-)

diff --git a/fs/xfs/xfs_buf.c b/fs/xfs/xfs_buf.c
index 15907e92d0d3..df363f17ea1a 100644
--- a/fs/xfs/xfs_buf.c
+++ b/fs/xfs/xfs_buf.c
@@ -74,6 +74,10 @@ xfs_buf_is_vmapped(
 	return bp->b_addr && bp->b_folio_count > 1;
 }
 
+/*
+ * See comment above xfs_buf_alloc_folios() about the constraints placed on
+ * allocating vmapped buffers.
+ */ static inline int xfs_buf_vmap_len( struct xfs_buf *bp) @@ -344,14 +348,72 @@ xfs_buf_alloc_kmem( bp->b_addr = NULL; return -ENOMEM; } - bp->b_offset = offset_in_page(bp->b_addr); bp->b_folios = bp->b_folio_array; bp->b_folios[0] = kmem_to_folio(bp->b_addr); + bp->b_offset = offset_in_folio(bp->b_folios[0], bp->b_addr); bp->b_folio_count = 1; bp->b_flags |= _XBF_KMEM; return 0; } +/* + * Allocating a high order folio makes the assumption that buffers are a + * power-of-2 size so that ilog2() returns the exact order needed to fit + * the contents of the buffer. Buffer lengths are mostly a power of two, + * so this is not an unreasonable approach to take by default. + * + * The exception here are user xattr data buffers, which can be arbitrarily + * sized up to 64kB plus structure metadata. In that case, round up the order. + */ +static bool +xfs_buf_alloc_folio( + struct xfs_buf *bp, + gfp_t gfp_mask) +{ + int length = BBTOB(bp->b_length); + int order; + + order = ilog2(length); + if ((1 << order) < length) + order = ilog2(length - 1) + 1; + + if (order <= PAGE_SHIFT) + order = 0; + else + order -= PAGE_SHIFT; + + bp->b_folio_array[0] = folio_alloc(gfp_mask, order); + if (!bp->b_folio_array[0]) + return false; + + bp->b_folios = bp->b_folio_array; + bp->b_folio_count = 1; + bp->b_flags |= _XBF_FOLIOS; + return true; +} + +/* + * When we allocate folios for a buffer, we end up with one of two types of + * buffer. + * + * The first type is a single folio buffer - this may be a high order + * folio or just a single page sized folio, but either way they get treated the + * same way by the rest of the code - the buffer memory spans a single + * contiguous memory region that we don't have to map and unmap to access the + * data directly. + * + * The second type of buffer is the multi-folio buffer. These are *always* made + * up of single page folios so that they can be fed to vmap_ram() to return a + * contiguous memory region we can access the data through, or mark it as + * XBF_UNMAPPED and access the data directly through individual folio_address() + * calls. + * + * We don't use high order folios for this second type of buffer (yet) because + * having variable size folios makes offset-to-folio indexing and iteration of + * the data range more complex than if they are fixed size. This case should now + * be the slow path, though, so unless we regularly fail to allocate high order + * folios, there should be little need to optimise this path. + */ static int xfs_buf_alloc_folios( struct xfs_buf *bp, @@ -363,7 +425,15 @@ xfs_buf_alloc_folios( if (flags & XBF_READ_AHEAD) gfp_mask |= __GFP_NORETRY; - /* Make sure that we have a page list */ + /* Assure zeroed buffer for non-read cases. */ + if (!(flags & XBF_READ)) + gfp_mask |= __GFP_ZERO; + + /* Optimistically attempt a single high order folio allocation. */ + if (xfs_buf_alloc_folio(bp, gfp_mask)) + return 0; + + /* Fall back to allocating an array of single page folios. */ bp->b_folio_count = DIV_ROUND_UP(BBTOB(bp->b_length), PAGE_SIZE); if (bp->b_folio_count <= XB_FOLIOS) { bp->b_folios = bp->b_folio_array; @@ -375,9 +445,6 @@ xfs_buf_alloc_folios( } bp->b_flags |= _XBF_FOLIOS; - /* Assure zeroed buffer for non-read cases. */ - if (!(flags & XBF_READ)) - gfp_mask |= __GFP_ZERO; /* * Bulk filling of pages can take multiple calls. 
Not filling the entire @@ -418,7 +485,7 @@ _xfs_buf_map_folios( { ASSERT(bp->b_flags & _XBF_FOLIOS); if (bp->b_folio_count == 1) { - /* A single page buffer is always mappable */ + /* A single folio buffer is always mappable */ bp->b_addr = folio_address(bp->b_folios[0]); } else if (flags & XBF_UNMAPPED) { bp->b_addr = NULL; @@ -1465,20 +1532,28 @@ xfs_buf_ioapply_map( int *count, blk_opf_t op) { - int page_index; - unsigned int total_nr_pages = bp->b_folio_count; - int nr_pages; + int folio_index; + unsigned int total_nr_folios = bp->b_folio_count; + int nr_folios; struct bio *bio; sector_t sector = bp->b_maps[map].bm_bn; int size; int offset; - /* skip the pages in the buffer before the start offset */ - page_index = 0; + /* + * If the start offset if larger than a single page, we need to be + * careful. We might have a high order folio, in which case the indexing + * is from the start of the buffer. However, if we have more than one + * folio single page folio in the buffer, we need to skip the folios in + * the buffer before the start offset. + */ + folio_index = 0; offset = *buf_offset; - while (offset >= PAGE_SIZE) { - page_index++; - offset -= PAGE_SIZE; + if (bp->b_folio_count > 1) { + while (offset >= PAGE_SIZE) { + folio_index++; + offset -= PAGE_SIZE; + } } /* @@ -1491,28 +1566,28 @@ xfs_buf_ioapply_map( next_chunk: atomic_inc(&bp->b_io_remaining); - nr_pages = bio_max_segs(total_nr_pages); + nr_folios = bio_max_segs(total_nr_folios); - bio = bio_alloc(bp->b_target->bt_bdev, nr_pages, op, GFP_NOIO); + bio = bio_alloc(bp->b_target->bt_bdev, nr_folios, op, GFP_NOIO); bio->bi_iter.bi_sector = sector; bio->bi_end_io = xfs_buf_bio_end_io; bio->bi_private = bp; - for (; size && nr_pages; nr_pages--, page_index++) { - int rbytes, nbytes = PAGE_SIZE - offset; + for (; size && nr_folios; nr_folios--, folio_index++) { + struct folio *folio = bp->b_folios[folio_index]; + int nbytes = folio_size(folio) - offset; if (nbytes > size) nbytes = size; - rbytes = bio_add_folio(bio, bp->b_folios[page_index], nbytes, - offset); - if (rbytes < nbytes) + if (!bio_add_folio(bio, folio, nbytes, + offset_in_folio(folio, offset))) break; offset = 0; sector += BTOBB(nbytes); size -= nbytes; - total_nr_pages--; + total_nr_folios--; } if (likely(bio->bi_iter.bi_size)) { @@ -1722,6 +1797,13 @@ xfs_buf_offset( if (bp->b_addr) return bp->b_addr + offset; + /* Single folio buffers may use large folios. */ + if (bp->b_folio_count == 1) { + folio = bp->b_folios[0]; + return folio_address(folio) + offset_in_folio(folio, offset); + } + + /* Multi-folio buffers always use PAGE_SIZE folios */ folio = bp->b_folios[offset >> PAGE_SHIFT]; return folio_address(folio) + (offset & (PAGE_SIZE-1)); } @@ -1737,18 +1819,24 @@ xfs_buf_zero( bend = boff + bsize; while (boff < bend) { struct folio *folio; - int page_index, page_offset, csize; + int folio_index, folio_offset, csize; - page_index = (boff + bp->b_offset) >> PAGE_SHIFT; - page_offset = (boff + bp->b_offset) & ~PAGE_MASK; - folio = bp->b_folios[page_index]; - csize = min_t(size_t, PAGE_SIZE - page_offset, + /* Single folio buffers may use large folios. 
*/ + if (bp->b_folio_count == 1) { + folio = bp->b_folios[0]; + folio_offset = offset_in_folio(folio, + bp->b_offset + boff); + } else { + folio_index = (boff + bp->b_offset) >> PAGE_SHIFT; + folio_offset = (boff + bp->b_offset) & ~PAGE_MASK; + folio = bp->b_folios[folio_index]; + } + + csize = min_t(size_t, folio_size(folio) - folio_offset, BBTOB(bp->b_length) - boff); + ASSERT((csize + folio_offset) <= folio_size(folio)); - ASSERT((csize + page_offset) <= PAGE_SIZE); - - memset(folio_address(folio) + page_offset, 0, csize); - + memset(folio_address(folio) + folio_offset, 0, csize); boff += csize; } }
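[Editorial note: for reference, a standalone userspace sketch, not the
kernel code itself, of the folio order calculation that the new
xfs_buf_alloc_folio() in this patch performs: round the buffer length up to
the next power of two and express it as a folio order, so power-of-2 sized
buffers get an exactly-sized folio and odd sizes such as 64kB xattr buffers
plus metadata are rounded up. The PAGE_SHIFT value and the local ilog2()
helper are stand-ins assuming 4kB pages.]

#include <stdio.h>

#define PAGE_SHIFT	12	/* assume 4kB pages */

static int ilog2(unsigned int x)	/* floor(log2(x)), x > 0 */
{
	return 31 - __builtin_clz(x);
}

static int buf_folio_order(unsigned int length)
{
	int order = ilog2(length);

	if ((1U << order) < length)	/* not a power of 2: round up */
		order = ilog2(length - 1) + 1;

	if (order <= PAGE_SHIFT)
		return 0;
	return order - PAGE_SHIFT;
}

int main(void)
{
	unsigned int sizes[] = { 4096, 8192, 65536, 65536 + 512 };

	for (unsigned int i = 0; i < sizeof(sizes) / sizeof(sizes[0]); i++)
		printf("%6u bytes -> order %d\n", sizes[i],
		       buf_folio_order(sizes[i]));
	/* 4096 -> 0, 8192 -> 1, 65536 -> 4, 66048 -> 5 (128kB folio) */
	return 0;
}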