From patchwork Sat Mar 3 13:59:50 2018 Content-Type: text/plain; charset="utf-8" MIME-Version: 1.0 Content-Transfer-Encoding: 7bit X-Patchwork-Submitter: Brian Foster X-Patchwork-Id: 10256093 Return-Path: Received: from mail.wl.linuxfoundation.org (pdx-wl-mail.web.codeaurora.org [172.30.200.125]) by pdx-korg-patchwork.web.codeaurora.org (Postfix) with ESMTP id 1C01860211 for ; Sat, 3 Mar 2018 13:59:54 +0000 (UTC) Received: from mail.wl.linuxfoundation.org (localhost [127.0.0.1]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id 0400A28777 for ; Sat, 3 Mar 2018 13:59:54 +0000 (UTC) Received: by mail.wl.linuxfoundation.org (Postfix, from userid 486) id EA70B2878C; Sat, 3 Mar 2018 13:59:53 +0000 (UTC) X-Spam-Checker-Version: SpamAssassin 3.3.1 (2010-03-16) on pdx-wl-mail.web.codeaurora.org X-Spam-Level: X-Spam-Status: No, score=-6.9 required=2.0 tests=BAYES_00,RCVD_IN_DNSWL_HI autolearn=ham version=3.3.1 Received: from vger.kernel.org (vger.kernel.org [209.132.180.67]) by mail.wl.linuxfoundation.org (Postfix) with ESMTP id C9B9C28777 for ; Sat, 3 Mar 2018 13:59:52 +0000 (UTC) Received: (majordomo@vger.kernel.org) by vger.kernel.org via listexpand id S932090AbeCCN7w (ORCPT ); Sat, 3 Mar 2018 08:59:52 -0500 Received: from mx1.redhat.com ([209.132.183.28]:39880 "EHLO mx1.redhat.com" rhost-flags-OK-OK-OK-OK) by vger.kernel.org with ESMTP id S932088AbeCCN7v (ORCPT ); Sat, 3 Mar 2018 08:59:51 -0500 Received: from smtp.corp.redhat.com (int-mx04.intmail.prod.int.phx2.redhat.com [10.5.11.14]) (using TLSv1.2 with cipher AECDH-AES256-SHA (256/256 bits)) (No client certificate requested) by mx1.redhat.com (Postfix) with ESMTPS id 002CA7FEC8; Sat, 3 Mar 2018 13:59:51 +0000 (UTC) Received: from bfoster.bfoster (dhcp-41-20.bos.redhat.com [10.18.41.20]) by smtp.corp.redhat.com (Postfix) with ESMTP id C2DD15D9CB; Sat, 3 Mar 2018 13:59:50 +0000 (UTC) Received: by bfoster.bfoster (Postfix, from userid 1000) id 1EAF6121EC9; Sat, 3 Mar 2018 08:59:50 -0500 (EST) Date: Sat, 3 Mar 2018 08:59:50 -0500 From: Brian Foster To: "Darrick J. Wong" Cc: linux-xfs@vger.kernel.org, djwong@kernel.org Subject: Re: [PATCH 5/5] xfs: fix agfl wrapping Message-ID: <20180303135949.GA2767@bfoster.bfoster> References: <151935118240.21654.10146641674018880885.stgit@magnolia> <151935121574.21654.7158330354778549108.stgit@magnolia> <20180227193548.GF3479@bfoster.bfoster> <20180227210313.GK19312@magnolia> <20180228224351.GA31332@bfoster.bfoster> <20180228232032.GK19318@magnolia> <20180301172832.GA34164@bfoster.bfoster> <20180301205541.GE12763@magnolia> <20180302131206.GA38614@bfoster.bfoster> MIME-Version: 1.0 Content-Disposition: inline In-Reply-To: <20180302131206.GA38614@bfoster.bfoster> User-Agent: Mutt/1.9.1 (2017-09-22) X-Scanned-By: MIMEDefang 2.79 on 10.5.11.14 X-Greylist: Sender IP whitelisted, not delayed by milter-greylist-4.5.16 (mx1.redhat.com [10.5.110.25]); Sat, 03 Mar 2018 13:59:51 +0000 (UTC) Sender: linux-xfs-owner@vger.kernel.org Precedence: bulk List-ID: X-Mailing-List: linux-xfs@vger.kernel.org X-Virus-Scanned: ClamAV using ClamSMTP On Fri, Mar 02, 2018 at 08:12:07AM -0500, Brian Foster wrote: > On Thu, Mar 01, 2018 at 12:55:41PM -0800, Darrick J. Wong wrote: > > On Thu, Mar 01, 2018 at 12:28:33PM -0500, Brian Foster wrote: > > > On Wed, Feb 28, 2018 at 03:20:32PM -0800, Darrick J. Wong wrote: > > > > On Wed, Feb 28, 2018 at 05:43:51PM -0500, Brian Foster wrote: > > > > > On Tue, Feb 27, 2018 at 01:03:13PM -0800, Darrick J. Wong wrote: > > > > > > On Tue, Feb 27, 2018 at 02:35:49PM -0500, Brian Foster wrote: > > > > > > > On Thu, Feb 22, 2018 at 06:00:15PM -0800, Darrick J. Wong wrote: > > > > > > > > From: Darrick J. Wong ... > > Going forward, I want the number of unpacked kernels to decrease as > > quickly as possible. I understand that distro kernel maintainers are > > not willing to apply the packing patch to their kernel until we come up > > with a smooth transition path. > > > > I agree wrt to upstream, but note that I don't anticipate including the > padding fix downstream any time soon. > > > I don't want to support fixing agfls to be 118 units long on 64-bit > > unpacked kernels and 119 units long on 32-bit unpacked kernels, and I > > only want to support the packed kernels with their 119 unit long agfls. > > An AGFL that starts at 0 and ends at flcount-1 is compatible with packed > > and unpacked kernels, so the v2 patch I sent last night removes the > > delicate per-case surgery in favor of a new strategy where the mount > > time and unmount time helpers both look for agfl configurationss that > > are known to cause problems, and solves them all with the same solution: > > moving the agfl list towards the start of the block. > > > > The purpose of the patch I sent was not for upstream unpacked support > going forward. Upstream has clearly moved forward with the packed > format. The goal of the patch was to explore a single/generic patch that > could be merged upstream/downstream and handle compatibility cleanly. > FWIW, here's a new variant of a bidirectional fixup. It's refactored and polished up a bit into a patch. It basically inspects the agf when first read for any evidence that the on-disk fields reflect a size mismatch with the current kernel and sets a flag if so. agfl gets/puts check the flag and thus the first transaction that attempts to modify a mismatched agfl swaps a block into or out of the gap slot appropriately. This avoids the need for any new transactions or mount time scan and the (downstream motivated) packed -> unpacked case is only 10 or so lines of additional code. Only spot tested, but I _think_ it covers all of the cases. Hm? Brian --- 8< --- --- To unsubscribe from this list: send the line "unsubscribe linux-xfs" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html diff --git a/fs/xfs/libxfs/xfs_alloc.c b/fs/xfs/libxfs/xfs_alloc.c index c02781a4c091..1cbf80d8481f 100644 --- a/fs/xfs/libxfs/xfs_alloc.c +++ b/fs/xfs/libxfs/xfs_alloc.c @@ -2053,6 +2053,115 @@ xfs_alloc_space_available( return true; } +static int +xfs_agfl_ondisk_size( + struct xfs_mount *mp, + int first, + int last, + int count) +{ + int active = count; + int agfl_size = XFS_AGFL_SIZE(mp); + + if (count && last >= first) + active = last - first + 1; + else if (count) + active = agfl_size - first + last + 1; + + if (active == count + 1) + return agfl_size - 1; + if (active == count - 1 || first == agfl_size || last == agfl_size) + return agfl_size + 1; + + ASSERT(active == count); + return agfl_size; +} + +static bool +xfs_agfl_need_padfix( + struct xfs_mount *mp, + struct xfs_agf *agf) +{ + int f = be32_to_cpu(agf->agf_flfirst); + int l = be32_to_cpu(agf->agf_fllast); + int c = be32_to_cpu(agf->agf_flcount); + + return xfs_agfl_ondisk_size(mp, f, l, c) != XFS_AGFL_SIZE(mp); +} + +static int +xfs_agfl_check_padfix( + struct xfs_trans *tp, + struct xfs_buf *agbp, + struct xfs_buf *agflbp, + struct xfs_perag *pag) +{ + struct xfs_mount *mp = tp->t_mountp; + struct xfs_agf *agf = XFS_BUF_TO_AGF(agbp); + __be32 *agfl_bno = XFS_BUF_TO_AGFL_BNO(mp, agflbp); + int agfl_size = XFS_AGFL_SIZE(mp); + int ofirst, olast, osize; + int nfirst, nlast; + int logflags = 0; + int startoff = 0; + + if (!pag->pagf_needpadfix) + return 0; + + ofirst = nfirst = be32_to_cpu(agf->agf_flfirst); + olast = nlast = be32_to_cpu(agf->agf_fllast); + osize = xfs_agfl_ondisk_size(mp, ofirst, olast, pag->pagf_flcount); + + /* + * If the on-disk agfl is smaller than what the kernel expects, the + * last slot of the on-disk agfl is a gap with bogus data. Move the + * first valid block into the gap and bump the pointer. + */ + if (osize < agfl_size) { + ASSERT(pag->pagf_flcount != 0); + agfl_bno[agfl_size - 1] = agfl_bno[ofirst]; + startoff = (char *) &agfl_bno[agfl_size - 1] - (char *) agflbp->b_addr; + nfirst++; + goto done; + } + + /* + * Otherwise, the on-disk agfl is larger than what the current kernel + * expects. If empty, just fix up the first and last pointers. If not, + * move the inaccessible block to the end of the valid range. + */ + nfirst = do_mod(nfirst, agfl_size); + if (pag->pagf_flcount == 0) { + nlast = (nfirst == 0 ? agfl_size - 1 : nfirst - 1); + goto done; + } + if (nlast != agfl_size) + nlast++; + nlast = do_mod(nlast, agfl_size); + agfl_bno[nlast] = agfl_bno[osize - 1]; + startoff = (char *) &agfl_bno[nlast] - (char *) agflbp->b_addr; + +done: + if (nfirst != ofirst) { + agf->agf_flfirst = cpu_to_be32(nfirst); + logflags |= XFS_AGF_FLFIRST; + } + if (nlast != olast) { + agf->agf_fllast = cpu_to_be32(nlast); + logflags |= XFS_AGF_FLLAST; + } + if (startoff) { + xfs_trans_buf_set_type(tp, agflbp, XFS_BLFT_AGFL_BUF); + xfs_trans_log_buf(tp, agflbp, startoff, + startoff + sizeof(xfs_agblock_t) - 1); + } + if (logflags) + xfs_alloc_log_agf(tp, agbp, logflags); + + pag->pagf_needpadfix = false; + return 0; +} + /* * Decide whether to use this allocation group for this allocation. * If so, fix up the btree freelist's size. @@ -2258,6 +2367,12 @@ xfs_alloc_get_freelist( if (error) return error; + pag = xfs_perag_get(mp, be32_to_cpu(agf->agf_seqno)); + error = xfs_agfl_check_padfix(tp, agbp, agflbp, pag); + if (error) { + xfs_perag_put(pag); + return error; + } /* * Get the block number and update the data structures. @@ -2269,7 +2384,6 @@ xfs_alloc_get_freelist( if (be32_to_cpu(agf->agf_flfirst) == XFS_AGFL_SIZE(mp)) agf->agf_flfirst = 0; - pag = xfs_perag_get(mp, be32_to_cpu(agf->agf_seqno)); be32_add_cpu(&agf->agf_flcount, -1); xfs_trans_agflist_delta(tp, -1); pag->pagf_flcount--; @@ -2376,11 +2490,18 @@ xfs_alloc_put_freelist( if (!agflbp && (error = xfs_alloc_read_agfl(mp, tp, be32_to_cpu(agf->agf_seqno), &agflbp))) return error; + + pag = xfs_perag_get(mp, be32_to_cpu(agf->agf_seqno)); + error = xfs_agfl_check_padfix(tp, agbp, agflbp, pag); + if (error) { + xfs_perag_put(pag); + return error; + } + be32_add_cpu(&agf->agf_fllast, 1); if (be32_to_cpu(agf->agf_fllast) == XFS_AGFL_SIZE(mp)) agf->agf_fllast = 0; - pag = xfs_perag_get(mp, be32_to_cpu(agf->agf_seqno)); be32_add_cpu(&agf->agf_flcount, 1); xfs_trans_agflist_delta(tp, 1); pag->pagf_flcount++; @@ -2588,6 +2709,7 @@ xfs_alloc_read_agf( pag->pagb_count = 0; pag->pagb_tree = RB_ROOT; pag->pagf_init = 1; + pag->pagf_needpadfix = xfs_agfl_need_padfix(mp, agf); } #ifdef DEBUG else if (!XFS_FORCED_SHUTDOWN(mp)) { diff --git a/fs/xfs/xfs_mount.h b/fs/xfs/xfs_mount.h index e0792d036be2..78a6377a9b38 100644 --- a/fs/xfs/xfs_mount.h +++ b/fs/xfs/xfs_mount.h @@ -353,6 +353,7 @@ typedef struct xfs_perag { char pagi_inodeok; /* The agi is ok for inodes */ uint8_t pagf_levels[XFS_BTNUM_AGF]; /* # of levels in bno & cnt btree */ + bool pagf_needpadfix; uint32_t pagf_flcount; /* count of blocks in freelist */ xfs_extlen_t pagf_freeblks; /* total free blocks */ xfs_extlen_t pagf_longest; /* longest free space */