Message ID | 160494585913.772802.17231950418756379430.stgit@magnolia (mailing list archive) |
---|---|
State | Accepted, archived |
Headers | show |
Series | xfs: fix various scrub problems | expand |
On Mon, Nov 09, 2020 at 10:17:39AM -0800, Darrick J. Wong wrote: > From: Darrick J. Wong <darrick.wong@oracle.com> > > Fix some serious WTF in the reference count scrubber's rmap fragment > processing. The code comment says that this loop is supposed to move > all fragment records starting at or before bno onto the worklist, but > there's no obvious reason why nr (the number of items added) should > increment starting from 1, and breaking the loop when we've added the > target number seems dubious since we could have more rmap fragments that > should have been added to the worklist. > > This seems to manifest in xfs/411 when adding one to the refcount field. > > Fixes: dbde19da9637 ("xfs: cross-reference the rmapbt data with the refcountbt") Looks good, Reviewed-by: Christoph Hellwig <hch@lst.de>
On Monday 9 November 2020 11:47:39 PM IST Darrick J. Wong wrote: > From: Darrick J. Wong <darrick.wong@oracle.com> > > Fix some serious WTF in the reference count scrubber's rmap fragment > processing. The code comment says that this loop is supposed to move > all fragment records starting at or before bno onto the worklist, but > there's no obvious reason why nr (the number of items added) should > increment starting from 1, and breaking the loop when we've added the > target number seems dubious since we could have more rmap fragments that > should have been added to the worklist. > > This seems to manifest in xfs/411 when adding one to the refcount field. > > Fixes: dbde19da9637 ("xfs: cross-reference the rmapbt data with the refcountbt") > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > --- > fs/xfs/scrub/refcount.c | 8 +++----- > 1 file changed, 3 insertions(+), 5 deletions(-) > > > diff --git a/fs/xfs/scrub/refcount.c b/fs/xfs/scrub/refcount.c > index beaeb6fa3119..dd672e6bbc75 100644 > --- a/fs/xfs/scrub/refcount.c > +++ b/fs/xfs/scrub/refcount.c > @@ -170,7 +170,6 @@ xchk_refcountbt_process_rmap_fragments( > */ > INIT_LIST_HEAD(&worklist); > rbno = NULLAGBLOCK; > - nr = 1; > > /* Make sure the fragments actually /are/ in agbno order. */ > bno = 0; > @@ -184,15 +183,14 @@ xchk_refcountbt_process_rmap_fragments( > * Find all the rmaps that start at or before the refc extent, > * and put them on the worklist. > */ > + nr = 0; > list_for_each_entry_safe(frag, n, &refchk->fragments, list) { > - if (frag->rm.rm_startblock > refchk->bno) > - goto done; > + if (frag->rm.rm_startblock > refchk->bno || nr > target_nr) > + break; In the case of fuzzed refcnt value of 1, The condition "nr > target_nr" causes "nr != target_nr" condition (appearing after the loop) to evaluate to true (since atleast two rmap entries would be present for the refcount extent) which in turn causes xchk_refcountbt_xref_rmap() to flag the data structure as corrupt. Please let me know if my understanding of the code flow is correct? > bno = frag->rm.rm_startblock + frag->rm.rm_blockcount; > if (bno < rbno) > rbno = bno; > list_move_tail(&frag->list, &worklist); > - if (nr == target_nr) > - break; > nr++; > } > > >
On Thu, Nov 12, 2020 at 06:21:52PM +0530, Chandan Babu R wrote: > On Monday 9 November 2020 11:47:39 PM IST Darrick J. Wong wrote: > > From: Darrick J. Wong <darrick.wong@oracle.com> > > > > Fix some serious WTF in the reference count scrubber's rmap fragment > > processing. The code comment says that this loop is supposed to move > > all fragment records starting at or before bno onto the worklist, but > > there's no obvious reason why nr (the number of items added) should > > increment starting from 1, and breaking the loop when we've added the > > target number seems dubious since we could have more rmap fragments that > > should have been added to the worklist. > > > > This seems to manifest in xfs/411 when adding one to the refcount field. > > > > Fixes: dbde19da9637 ("xfs: cross-reference the rmapbt data with the refcountbt") > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > > --- > > fs/xfs/scrub/refcount.c | 8 +++----- > > 1 file changed, 3 insertions(+), 5 deletions(-) > > > > > > diff --git a/fs/xfs/scrub/refcount.c b/fs/xfs/scrub/refcount.c > > index beaeb6fa3119..dd672e6bbc75 100644 > > --- a/fs/xfs/scrub/refcount.c > > +++ b/fs/xfs/scrub/refcount.c > > @@ -170,7 +170,6 @@ xchk_refcountbt_process_rmap_fragments( > > */ > > INIT_LIST_HEAD(&worklist); > > rbno = NULLAGBLOCK; > > - nr = 1; > > > > /* Make sure the fragments actually /are/ in agbno order. */ > > bno = 0; > > @@ -184,15 +183,14 @@ xchk_refcountbt_process_rmap_fragments( > > * Find all the rmaps that start at or before the refc extent, > > * and put them on the worklist. > > */ > > + nr = 0; > > list_for_each_entry_safe(frag, n, &refchk->fragments, list) { > > - if (frag->rm.rm_startblock > refchk->bno) > > - goto done; > > + if (frag->rm.rm_startblock > refchk->bno || nr > target_nr) > > + break; > > In the case of fuzzed refcnt value of 1, The condition "nr > target_nr" causes > "nr != target_nr" condition (appearing after the loop) to evaluate to true > (since atleast two rmap entries would be present for the refcount extent) > which in turn causes xchk_refcountbt_xref_rmap() to flag the data structure as > corrupt. Please let me know if my understanding of the code flow is correct? Right. --D > > bno = frag->rm.rm_startblock + frag->rm.rm_blockcount; > > if (bno < rbno) > > rbno = bno; > > list_move_tail(&frag->list, &worklist); > > - if (nr == target_nr) > > - break; > > nr++; > > } > > > > > > > > > -- > chandan > > >
On Thursday 12 November 2020 9:35:26 PM IST Darrick J. Wong wrote: > On Thu, Nov 12, 2020 at 06:21:52PM +0530, Chandan Babu R wrote: > > On Monday 9 November 2020 11:47:39 PM IST Darrick J. Wong wrote: > > > From: Darrick J. Wong <darrick.wong@oracle.com> > > > > > > Fix some serious WTF in the reference count scrubber's rmap fragment > > > processing. The code comment says that this loop is supposed to move > > > all fragment records starting at or before bno onto the worklist, but > > > there's no obvious reason why nr (the number of items added) should > > > increment starting from 1, and breaking the loop when we've added the > > > target number seems dubious since we could have more rmap fragments that > > > should have been added to the worklist. > > > > > > This seems to manifest in xfs/411 when adding one to the refcount field. > > > > > > Fixes: dbde19da9637 ("xfs: cross-reference the rmapbt data with the refcountbt") > > > Signed-off-by: Darrick J. Wong <darrick.wong@oracle.com> > > > --- > > > fs/xfs/scrub/refcount.c | 8 +++----- > > > 1 file changed, 3 insertions(+), 5 deletions(-) > > > > > > > > > diff --git a/fs/xfs/scrub/refcount.c b/fs/xfs/scrub/refcount.c > > > index beaeb6fa3119..dd672e6bbc75 100644 > > > --- a/fs/xfs/scrub/refcount.c > > > +++ b/fs/xfs/scrub/refcount.c > > > @@ -170,7 +170,6 @@ xchk_refcountbt_process_rmap_fragments( > > > */ > > > INIT_LIST_HEAD(&worklist); > > > rbno = NULLAGBLOCK; > > > - nr = 1; > > > > > > /* Make sure the fragments actually /are/ in agbno order. */ > > > bno = 0; > > > @@ -184,15 +183,14 @@ xchk_refcountbt_process_rmap_fragments( > > > * Find all the rmaps that start at or before the refc extent, > > > * and put them on the worklist. > > > */ > > > + nr = 0; > > > list_for_each_entry_safe(frag, n, &refchk->fragments, list) { > > > - if (frag->rm.rm_startblock > refchk->bno) > > > - goto done; > > > + if (frag->rm.rm_startblock > refchk->bno || nr > target_nr) > > > + break; > > > > In the case of fuzzed refcnt value of 1, The condition "nr > target_nr" causes > > "nr != target_nr" condition (appearing after the loop) to evaluate to true > > (since atleast two rmap entries would be present for the refcount extent) > > which in turn causes xchk_refcountbt_xref_rmap() to flag the data structure as > > corrupt. Please let me know if my understanding of the code flow is correct? > > Right. > Ok. In that case the code change in this patch is handling the erroneous scenario correctly. Reviewed-by: Chandan Babu R <chandanrlinux@gmail.com> > --D > > > > bno = frag->rm.rm_startblock + frag->rm.rm_blockcount; > > > if (bno < rbno) > > > rbno = bno; > > > list_move_tail(&frag->list, &worklist); > > > - if (nr == target_nr) > > > - break; > > > nr++; > > > } > > > > > > > > > > > > > >
diff --git a/fs/xfs/scrub/refcount.c b/fs/xfs/scrub/refcount.c index beaeb6fa3119..dd672e6bbc75 100644 --- a/fs/xfs/scrub/refcount.c +++ b/fs/xfs/scrub/refcount.c @@ -170,7 +170,6 @@ xchk_refcountbt_process_rmap_fragments( */ INIT_LIST_HEAD(&worklist); rbno = NULLAGBLOCK; - nr = 1; /* Make sure the fragments actually /are/ in agbno order. */ bno = 0; @@ -184,15 +183,14 @@ xchk_refcountbt_process_rmap_fragments( * Find all the rmaps that start at or before the refc extent, * and put them on the worklist. */ + nr = 0; list_for_each_entry_safe(frag, n, &refchk->fragments, list) { - if (frag->rm.rm_startblock > refchk->bno) - goto done; + if (frag->rm.rm_startblock > refchk->bno || nr > target_nr) + break; bno = frag->rm.rm_startblock + frag->rm.rm_blockcount; if (bno < rbno) rbno = bno; list_move_tail(&frag->list, &worklist); - if (nr == target_nr) - break; nr++; }