xfs_repair: don't call xfs_sb_quota_from_disk twice

Message ID 13dd3974-956d-c3af-86ed-f6ce5cc1b996@redhat.com (mailing list archive)
State Accepted

Commit Message

Eric Sandeen June 24, 2016, 9:24 p.m. UTC
kernel commit 5ef828c4
xfs: avoid false quotacheck after unclean shutdown

made xfs_sb_from_disk() also call xfs_sb_quota_from_disk
by default.

However, when this was merged to libxfs, existing separate
calls to libxfs_sb_quota_from_disk remained, and calling it
twice in a row on a V4 superblock leads to issues, because:


        if (sbp->sb_qflags & XFS_PQUOTA_ACCT)  {
...
                sbp->sb_pquotino = sbp->sb_gquotino;
                sbp->sb_gquotino = NULLFSINO;

and after the second call, we have set both pquotino and gquotino
to NULLFSINO.

Fix this by making it safe to call twice, and also remove the extra
calls to libxfs_sb_quota_from_disk.

This is only spotted when running xfstests with "-m crc=0" because
the sb_from_disk change came about after V5 became default, and
the above behavior only exists on a V4 superblock.

Reported-by: Eryu Guan <eguan@redhat.com>
Signed-off-by: Eric Sandeen <sandeen@redhat.com>
---
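
The double-call failure described above can be reproduced in isolation. The following is a minimal sketch, not the libxfs code: the struct, the `DEMO_*` constants, the inode number 131, and both function names are hypothetical stand-ins chosen for illustration. Only the two assignments inside the `if` are modelled on the hunk quoted in the commit message.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real definitions live in the libxfs headers. */
#define DEMO_PQUOTA_ACCT 0x0040u
#define DEMO_NULLFSINO   ((uint64_t)-1)

struct demo_sb {
	uint16_t qflags;
	uint64_t gquotino;
	uint64_t pquotino;
};

/* Unguarded conversion, modelled on the pre-patch code quoted above. */
static void demo_quota_from_disk_unguarded(struct demo_sb *sbp)
{
	if (sbp->qflags & DEMO_PQUOTA_ACCT) {
		sbp->pquotino = sbp->gquotino;
		sbp->gquotino = DEMO_NULLFSINO;
	}
}

/* Returns 1 if a second call leaves both quota inodes at the null value. */
static int demo_second_call_loses_pquotino(void)
{
	struct demo_sb sb = {
		.qflags = DEMO_PQUOTA_ACCT,
		.gquotino = 131,		/* arbitrary inode number */
		.pquotino = DEMO_NULLFSINO,
	};

	/* First call: the group slot is moved to the project slot. */
	demo_quota_from_disk_unguarded(&sb);
	assert(sb.pquotino == 131 && sb.gquotino == DEMO_NULLFSINO);

	/* Second call: the already-null group slot overwrites the project slot. */
	demo_quota_from_disk_unguarded(&sb);
	return sb.pquotino == DEMO_NULLFSINO && sb.gquotino == DEMO_NULLFSINO;
}
```

Calling `demo_second_call_loses_pquotino()` from a test harness should return 1 under these assumptions, which mirrors the "both set to NULLFSINO" outcome in the commit message.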

Comments

Carlos Maiolino June 27, 2016, 9:48 a.m. UTC | #1
On Fri, Jun 24, 2016 at 04:24:58PM -0500, Eric Sandeen wrote:
> kernel commit 5ef828c4
> xfs: avoid false quotacheck after unclean shutdown
> 
> made xfs_sb_from_disk() also call xfs_sb_quota_from_disk
> by default.
> 
> However, when this was merged to libxfs, existing separate
> calls to libxfs_sb_quota_from_disk remained, and calling it
> twice in a row on a V4 superblock leads to issues, because:
> 
> 
>         if (sbp->sb_qflags & XFS_PQUOTA_ACCT)  {
> ...
>                 sbp->sb_pquotino = sbp->sb_gquotino;
>                 sbp->sb_gquotino = NULLFSINO;
> 
> and after the second call, we have set both pquotino and gquotino
> to NULLFSINO.
> 
> Fix this by making it safe to call twice, and also remove the extra
> calls to libxfs_sb_quota_from_disk.
> 
> This is only spotted when running xfstests with "-m crc=0" because
> the sb_from_disk change came about after V5 became default, and
> the above behavior only exists on a V4 superblock.
> 
> Reported-by: Eryu Guan <eguan@redhat.com>
> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> ---
> 
> 
> diff --git a/libxfs/xfs_sb.c b/libxfs/xfs_sb.c
> index 45db6ae..44f3e3e 100644
> --- a/libxfs/xfs_sb.c
> +++ b/libxfs/xfs_sb.c
> @@ -316,13 +316,16 @@ xfs_sb_quota_from_disk(struct xfs_sb *sbp)
>  					XFS_PQUOTA_CHKD : XFS_GQUOTA_CHKD;
>  	sbp->sb_qflags &= ~(XFS_OQUOTA_ENFD | XFS_OQUOTA_CHKD);
>  
> -	if (sbp->sb_qflags & XFS_PQUOTA_ACCT)  {
> +	if (sbp->sb_qflags & XFS_PQUOTA_ACCT &&
> +	    sbp->sb_gquotino != NULLFSINO)  {

Although I agree with this check, shouldn't we report some sort of error when it
happens? Since it's not supposed to happen, it might be a sign of corruption.

Cheers

>  		/*
>  		 * In older version of superblock, on-disk superblock only
>  		 * has sb_gquotino, and in-core superblock has both sb_gquotino
>  		 * and sb_pquotino. But, only one of them is supported at any
>  		 * point of time. So, if PQUOTA is set in disk superblock,
> -		 * copy over sb_gquotino to sb_pquotino.
> +		 * copy over sb_gquotino to sb_pquotino.  The NULLFSINO test
> +		 * above is to make sure we don't do this twice and wipe them
> +		 * both out!
>  		 */
>  		sbp->sb_pquotino = sbp->sb_gquotino;
>  		sbp->sb_gquotino = NULLFSINO;
> diff --git a/repair/sb.c b/repair/sb.c
> index 3965953..8087242 100644
> --- a/repair/sb.c
> +++ b/repair/sb.c
> @@ -155,7 +155,6 @@ __find_secondary_sb(
>  		for (i = 0; !done && i < bsize; i += BBSIZE)  {
>  			c_bufsb = (char *)sb + i;
>  			libxfs_sb_from_disk(&bufsb, (xfs_dsb_t *)c_bufsb);
> -			libxfs_sb_quota_from_disk(&bufsb);
>  
>  			if (verify_sb(c_bufsb, &bufsb, 0) != XR_OK)
>  				continue;
> @@ -568,7 +567,6 @@ get_sb(xfs_sb_t *sbp, xfs_off_t off, int size, xfs_agnumber_t agno)
>  		do_error("%s\n", strerror(error));
>  	}
>  	libxfs_sb_from_disk(sbp, buf);
> -	libxfs_sb_quota_from_disk(sbp);
>  
>  	rval = verify_sb((char *)buf, sbp, agno == 0);
>  	free(buf);
> diff --git a/repair/scan.c b/repair/scan.c
> index 964ff06..366ce16 100644
> --- a/repair/scan.c
> +++ b/repair/scan.c
> @@ -1622,7 +1622,6 @@ scan_ag(
>  		goto out_free_sb;
>  	}
>  	libxfs_sb_from_disk(sb, XFS_BUF_TO_SBP(sbbuf));
> -	libxfs_sb_quota_from_disk(sb);
>  
>  	agfbuf = libxfs_readbuf(mp->m_dev,
>  			XFS_AG_DADDR(mp, agno, XFS_AGF_DADDR(mp)),
> 
> _______________________________________________
> xfs mailing list
> xfs@oss.sgi.com
> http://oss.sgi.com/mailman/listinfo/xfs
Eric Sandeen June 27, 2016, 3:34 p.m. UTC | #2
On 6/27/16 4:48 AM, Carlos Maiolino wrote:
> On Fri, Jun 24, 2016 at 04:24:58PM -0500, Eric Sandeen wrote:
>> kernel commit 5ef828c4
>> xfs: avoid false quotacheck after unclean shutdown
>>
>> made xfs_sb_from_disk() also call xfs_sb_quota_from_disk
>> by default.
>>
>> However, when this was merged to libxfs, existing separate
>> calls to libxfs_sb_quota_from_disk remained, and calling it
>> twice in a row on a V4 superblock leads to issues, because:
>>
>>
>>         if (sbp->sb_qflags & XFS_PQUOTA_ACCT)  {
>> ...
>>                 sbp->sb_pquotino = sbp->sb_gquotino;
>>                 sbp->sb_gquotino = NULLFSINO;
>>
>> and after the second call, we have set both pquotino and gquotino
>> to NULLFSINO.
>>
>> Fix this by making it safe to call twice, and also remove the extra
>> calls to libxfs_sb_quota_from_disk.
>>
>> This is only spotted when running xfstests with "-m crc=0" because
>> the sb_from_disk change came about after V5 became default, and
>> the above behavior only exists on a V4 superblock.
>>
>> Reported-by: Eryu Guan <eguan@redhat.com>
>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>> ---
>>
>>
>> diff --git a/libxfs/xfs_sb.c b/libxfs/xfs_sb.c
>> index 45db6ae..44f3e3e 100644
>> --- a/libxfs/xfs_sb.c
>> +++ b/libxfs/xfs_sb.c
>> @@ -316,13 +316,16 @@ xfs_sb_quota_from_disk(struct xfs_sb *sbp)
>>  					XFS_PQUOTA_CHKD : XFS_GQUOTA_CHKD;
>>  	sbp->sb_qflags &= ~(XFS_OQUOTA_ENFD | XFS_OQUOTA_CHKD);
>>  
>> -	if (sbp->sb_qflags & XFS_PQUOTA_ACCT)  {
>> +	if (sbp->sb_qflags & XFS_PQUOTA_ACCT &&
>> +	    sbp->sb_gquotino != NULLFSINO)  {
> 
> Although I agree with this check, shouldn't we report some sort of error when it
> happens? Since it's not supposed to happen, it might be a sign of corruption.

I dunno, it would also happen if it gets called twice, which is intentionally
made harmless by this change.  We don't warn on free(NULL) for example...

I don't think it needs a warning.

-Eric
Carlos Maiolino June 28, 2016, 8:57 a.m. UTC | #3
On Mon, Jun 27, 2016 at 10:34:39AM -0500, Eric Sandeen wrote:
> 
> 
> On 6/27/16 4:48 AM, Carlos Maiolino wrote:
> > On Fri, Jun 24, 2016 at 04:24:58PM -0500, Eric Sandeen wrote:
> >> kernel commit 5ef828c4
> >> xfs: avoid false quotacheck after unclean shutdown
> >>
> >> made xfs_sb_from_disk() also call xfs_sb_quota_from_disk
> >> by default.
> >>
> >> However, when this was merged to libxfs, existing separate
> >> calls to libxfs_sb_quota_from_disk remained, and calling it
> >> twice in a row on a V4 superblock leads to issues, because:
> >>
> >>
> >>         if (sbp->sb_qflags & XFS_PQUOTA_ACCT)  {
> >> ...
> >>                 sbp->sb_pquotino = sbp->sb_gquotino;
> >>                 sbp->sb_gquotino = NULLFSINO;
> >>
> >> and after the second call, we have set both pquotino and gquotino
> >> to NULLFSINO.
> >>
> >> Fix this by making it safe to call twice, and also remove the extra
> >> calls to libxfs_sb_quota_from_disk.
> >>
> >> This is only spotted when running xfstests with "-m crc=0" because
> >> the sb_from_disk change came about after V5 became default, and
> >> the above behavior only exists on a V4 superblock.
> >>
> >> Reported-by: Eryu Guan <eguan@redhat.com>
> >> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> >> ---
> >>
> >>
> >> diff --git a/libxfs/xfs_sb.c b/libxfs/xfs_sb.c
> >> index 45db6ae..44f3e3e 100644
> >> --- a/libxfs/xfs_sb.c
> >> +++ b/libxfs/xfs_sb.c
> >> @@ -316,13 +316,16 @@ xfs_sb_quota_from_disk(struct xfs_sb *sbp)
> >>  					XFS_PQUOTA_CHKD : XFS_GQUOTA_CHKD;
> >>  	sbp->sb_qflags &= ~(XFS_OQUOTA_ENFD | XFS_OQUOTA_CHKD);
> >>  
> >> -	if (sbp->sb_qflags & XFS_PQUOTA_ACCT)  {
> >> +	if (sbp->sb_qflags & XFS_PQUOTA_ACCT &&
> >> +	    sbp->sb_gquotino != NULLFSINO)  {
> > 
> > Although I agree with this check, shouldn't we report some sort of error when it
> > happens? Since it's not supposed to happen, it might be a sign of corruption.
> 
> I dunno, it would also happen if it gets called twice, which is intentionally
> made harmless by this change.  We don't warn on free(NULL) for example...
> 

Well, I don't 100% agree with not having a warning here, but it doesn't make the
patch less valuable.

Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>

> I don't think it needs a warning.
> 
> -Eric
>  
> 
Eric Sandeen June 28, 2016, 3:40 p.m. UTC | #4
On 6/28/16 3:57 AM, Carlos Maiolino wrote:
> On Mon, Jun 27, 2016 at 10:34:39AM -0500, Eric Sandeen wrote:
>>
>>
>> On 6/27/16 4:48 AM, Carlos Maiolino wrote:
>>> On Fri, Jun 24, 2016 at 04:24:58PM -0500, Eric Sandeen wrote:
>>>> kernel commit 5ef828c4
>>>> xfs: avoid false quotacheck after unclean shutdown
>>>>
>>>> made xfs_sb_from_disk() also call xfs_sb_quota_from_disk
>>>> by default.
>>>>
>>>> However, when this was merged to libxfs, existing separate
>>>> calls to libxfs_sb_quota_from_disk remained, and calling it
>>>> twice in a row on a V4 superblock leads to issues, because:
>>>>
>>>>
>>>>         if (sbp->sb_qflags & XFS_PQUOTA_ACCT)  {
>>>> ...
>>>>                 sbp->sb_pquotino = sbp->sb_gquotino;
>>>>                 sbp->sb_gquotino = NULLFSINO;
>>>>
>>>> and after the second call, we have set both pquotino and gquotino
>>>> to NULLFSINO.
>>>>
>>>> Fix this by making it safe to call twice, and also remove the extra
>>>> calls to libxfs_sb_quota_from_disk.
>>>>
>>>> This is only spotted when running xfstests with "-m crc=0" because
>>>> the sb_from_disk change came about after V5 became default, and
>>>> the above behavior only exists on a V4 superblock.
>>>>
>>>> Reported-by: Eryu Guan <eguan@redhat.com>
>>>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
>>>> ---
>>>>
>>>>
>>>> diff --git a/libxfs/xfs_sb.c b/libxfs/xfs_sb.c
>>>> index 45db6ae..44f3e3e 100644
>>>> --- a/libxfs/xfs_sb.c
>>>> +++ b/libxfs/xfs_sb.c
>>>> @@ -316,13 +316,16 @@ xfs_sb_quota_from_disk(struct xfs_sb *sbp)
>>>>  					XFS_PQUOTA_CHKD : XFS_GQUOTA_CHKD;
>>>>  	sbp->sb_qflags &= ~(XFS_OQUOTA_ENFD | XFS_OQUOTA_CHKD);
>>>>  
>>>> -	if (sbp->sb_qflags & XFS_PQUOTA_ACCT)  {
>>>> +	if (sbp->sb_qflags & XFS_PQUOTA_ACCT &&
>>>> +	    sbp->sb_gquotino != NULLFSINO)  {
>>>
>>> Although I agree with this check, shouldn't we report some sort of error when it
>>> happens? Since it's not supposed to happen, it might be a sign of corruption.
>>
>> I dunno, it would also happen if it gets called twice, which is intentionally
>> made harmless by this change.  We don't warn on free(NULL) for example...
>>
> 
> Well, I don't 100% agree with not having a warning here, but it doesn't make the
> patch less valuable.

Thanks Carlos - 

Maybe I don't understand what you want to warn about.

If we get here with:

	if (sbp->sb_qflags & XFS_PQUOTA_ACCT &&
	    sbp->sb_gquotino != NULLFSINO)  {

that means we have an on-disk super without the pquotino field,
the XFS_PQUOTA_ACCT flag is set, and so the gquotino field was
used for the project quota; this is valid, and there is
nothing to warn about in this case.

If we get here with:

	if (sbp->sb_qflags & XFS_PQUOTA_ACCT &&
	    sbp->sb_gquotino == NULLFSINO)  {

that means we have an on-disk super without the pquotino field,
the XFS_PQUOTA_ACCT flag is set, and the gquotino was not set
to a valid value.  This could happen either from a bad on-disk
value, or it could mean that we called the function twice in a
row.  Without maintaining more state, we can't know which, and
warning the user about a programming error wouldn't be helpful.
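
The two cases above can be sketched with a small stand-alone model of the guarded conversion. This is an illustration under stated assumptions, not the libxfs implementation: the struct, the `DEMO_*` constants, the inode number 131, and all function names are hypothetical. It shows that the guarded version is safe to call twice, and that the guard sees the same inputs for a bad on-disk value as for a repeated call.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical stand-ins; the real definitions live in the libxfs headers. */
#define DEMO_PQUOTA_ACCT 0x0040u
#define DEMO_NULLFSINO   ((uint64_t)-1)

struct demo_sb {
	uint16_t qflags;
	uint64_t gquotino;
	uint64_t pquotino;
};

/* Guarded conversion, modelled on the condition added by the patch. */
static void demo_quota_from_disk_guarded(struct demo_sb *sbp)
{
	if ((sbp->qflags & DEMO_PQUOTA_ACCT) &&
	    sbp->gquotino != DEMO_NULLFSINO) {
		sbp->pquotino = sbp->gquotino;
		sbp->gquotino = DEMO_NULLFSINO;
	}
}

/* Returns 1 if the project quota inode survives two consecutive calls. */
static int demo_guarded_is_idempotent(void)
{
	struct demo_sb sb = {
		.qflags = DEMO_PQUOTA_ACCT,
		.gquotino = 131,		/* arbitrary inode number */
		.pquotino = DEMO_NULLFSINO,
	};

	demo_quota_from_disk_guarded(&sb);
	demo_quota_from_disk_guarded(&sb);	/* no-op: gquotino is already null */
	return sb.pquotino == 131 && sb.gquotino == DEMO_NULLFSINO;
}

/*
 * Returns 1 if the guard's inputs (flag and gquotino) are identical for a
 * superblock read with a null gquotino and for one already converted.
 */
static int demo_guard_inputs_are_ambiguous(void)
{
	struct demo_sb bad_on_disk = {
		.qflags = DEMO_PQUOTA_ACCT,
		.gquotino = DEMO_NULLFSINO,
		.pquotino = DEMO_NULLFSINO,
	};
	struct demo_sb converted = {
		.qflags = DEMO_PQUOTA_ACCT,
		.gquotino = 131,
		.pquotino = DEMO_NULLFSINO,
	};

	demo_quota_from_disk_guarded(&converted);
	return bad_on_disk.qflags == converted.qflags &&
	       bad_on_disk.gquotino == converted.gquotino;
}
```

In this model the guard only reads the flag and `gquotino`, so it has no way to tell the two situations apart, which is the reason given above for not warning at this point.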

Actually, repair already handles this case elsewhere:

quota_sb_check(xfs_mount_t *mp)
{
        /*
         * if the sb says we have quotas and we lost both,
         * signal a superblock downgrade.  that will cause
         * the quota flags to get zeroed.  (if we only lost
         * one quota inode, do nothing and complain later.)
         *
         * if the sb says we have quotas but we didn't start out
         * with any quota inodes, signal a superblock downgrade.

In the case where quota flags are on but all quota inodes are
zero, it silently clears the quota flags.  Whether or not that
should be silent I'm not sure, but I think that is separate
from this patch.

Thanks,
-Eric


> Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>
> 
>> I don't think it needs a warning.
>>
>> -Eric
>>  
>>
>
Carlos Maiolino June 29, 2016, 8:30 a.m. UTC | #5
On Tue, Jun 28, 2016 at 10:40:31AM -0500, Eric Sandeen wrote:
> 
> 
> On 6/28/16 3:57 AM, Carlos Maiolino wrote:
> > On Mon, Jun 27, 2016 at 10:34:39AM -0500, Eric Sandeen wrote:
> >>
> >>
> >> On 6/27/16 4:48 AM, Carlos Maiolino wrote:
> >>> On Fri, Jun 24, 2016 at 04:24:58PM -0500, Eric Sandeen wrote:
> >>>> kernel commit 5ef828c4
> >>>> xfs: avoid false quotacheck after unclean shutdown
> >>>>
> >>>> made xfs_sb_from_disk() also call xfs_sb_quota_from_disk
> >>>> by default.
> >>>>
> >>>> However, when this was merged to libxfs, existing separate
> >>>> calls to libxfs_sb_quota_from_disk remained, and calling it
> >>>> twice in a row on a V4 superblock leads to issues, because:
> >>>>
> >>>>
> >>>>         if (sbp->sb_qflags & XFS_PQUOTA_ACCT)  {
> >>>> ...
> >>>>                 sbp->sb_pquotino = sbp->sb_gquotino;
> >>>>                 sbp->sb_gquotino = NULLFSINO;
> >>>>
> >>>> and after the second call, we have set both pquotino and gquotino
> >>>> to NULLFSINO.
> >>>>
> >>>> Fix this by making it safe to call twice, and also remove the extra
> >>>> calls to libxfs_sb_quota_from_disk.
> >>>>
> >>>> This is only spotted when running xfstests with "-m crc=0" because
> >>>> the sb_from_disk change came about after V5 became default, and
> >>>> the above behavior only exists on a V4 superblock.
> >>>>
> >>>> Reported-by: Eryu Guan <eguan@redhat.com>
> >>>> Signed-off-by: Eric Sandeen <sandeen@redhat.com>
> >>>> ---
> >>>>
> >>>>
> >>>> diff --git a/libxfs/xfs_sb.c b/libxfs/xfs_sb.c
> >>>> index 45db6ae..44f3e3e 100644
> >>>> --- a/libxfs/xfs_sb.c
> >>>> +++ b/libxfs/xfs_sb.c
> >>>> @@ -316,13 +316,16 @@ xfs_sb_quota_from_disk(struct xfs_sb *sbp)
> >>>>  					XFS_PQUOTA_CHKD : XFS_GQUOTA_CHKD;
> >>>>  	sbp->sb_qflags &= ~(XFS_OQUOTA_ENFD | XFS_OQUOTA_CHKD);
> >>>>  
> >>>> -	if (sbp->sb_qflags & XFS_PQUOTA_ACCT)  {
> >>>> +	if (sbp->sb_qflags & XFS_PQUOTA_ACCT &&
> >>>> +	    sbp->sb_gquotino != NULLFSINO)  {
> >>>
> >>> Although I agree with this check, shouldn't we report some sort of error when it
> >>> happens? Since it's not supposed to happen, it might be a sign of corruption.
> >>
> >> I dunno, it would also happen if it gets called twice, which is intentionally
> >> made harmless by this change.  We don't warn on free(NULL) for example...
> >>
> > 
> > Well, I don't 100% agree with not having a warning here, but it doesn't make the
> > patch less valuable.
> 
> Thanks Carlos - 
> 
> Maybe I don't understand what you want to warn about.
> 
> If we get here with:
> 
> 	if (sbp->sb_qflags & XFS_PQUOTA_ACCT &&
> 	    sbp->sb_gquotino != NULLFSINO)  {
> 
> that means we have an on-disk super without the pquotino field,
> the XFS_PQUOTA_ACCT flag is set, and so the gquotino field was
> used for the project quota; this is valid, and there is
> nothing to warn about in this case.
> 
> If we get here with:
> 
> 	if (sbp->sb_qflags & XFS_PQUOTA_ACCT &&
> 	    sbp->sb_gquotino == NULLFSINO)  {
> 
> that means we have an on-disk super without the pquotino field,
> the XFS_PQUOTA_ACCT flag is set, and the gquotino was not set
> to a valid value.  This could happen either from a bad on-disk
> value, or it could mean that we called the function twice in a
> row.  Without maintaining more state, we can't know which, and
> warning the user about a programming error wouldn't be helpful.
> 
> Actually, repair already handles this case elsewhere:
> 
> quota_sb_check(xfs_mount_t *mp)
> {
>         /*
>          * if the sb says we have quotas and we lost both,
>          * signal a superblock downgrade.  that will cause
>          * the quota flags to get zeroed.  (if we only lost
>          * one quota inode, do nothing and complain later.)
>          *
>          * if the sb says we have quotas but we didn't start out
>          * with any quota inodes, signal a superblock downgrade.
> 
> In the case where quota flags are on but all quota inodes are
> zero, it silently clears the quota flags.  Whether or not that
> should be silent I'm not sure, but I think that is separate
> from this patch.
> 
> Thanks,
> -Eric

Thanks for the great and detailed explanation, Eric. I think I was just being too
careful about not having a warning there, without completely understanding why a
warning isn't necessary there. :)

> 
> 
> > Reviewed-by: Carlos Maiolino <cmaiolino@redhat.com>

Patch

diff --git a/libxfs/xfs_sb.c b/libxfs/xfs_sb.c
index 45db6ae..44f3e3e 100644
--- a/libxfs/xfs_sb.c
+++ b/libxfs/xfs_sb.c
@@ -316,13 +316,16 @@  xfs_sb_quota_from_disk(struct xfs_sb *sbp)
 					XFS_PQUOTA_CHKD : XFS_GQUOTA_CHKD;
 	sbp->sb_qflags &= ~(XFS_OQUOTA_ENFD | XFS_OQUOTA_CHKD);
 
-	if (sbp->sb_qflags & XFS_PQUOTA_ACCT)  {
+	if (sbp->sb_qflags & XFS_PQUOTA_ACCT &&
+	    sbp->sb_gquotino != NULLFSINO)  {
 		/*
 		 * In older version of superblock, on-disk superblock only
 		 * has sb_gquotino, and in-core superblock has both sb_gquotino
 		 * and sb_pquotino. But, only one of them is supported at any
 		 * point of time. So, if PQUOTA is set in disk superblock,
-		 * copy over sb_gquotino to sb_pquotino.
+		 * copy over sb_gquotino to sb_pquotino.  The NULLFSINO test
+		 * above is to make sure we don't do this twice and wipe them
+		 * both out!
 		 */
 		sbp->sb_pquotino = sbp->sb_gquotino;
 		sbp->sb_gquotino = NULLFSINO;
diff --git a/repair/sb.c b/repair/sb.c
index 3965953..8087242 100644
--- a/repair/sb.c
+++ b/repair/sb.c
@@ -155,7 +155,6 @@  __find_secondary_sb(
 		for (i = 0; !done && i < bsize; i += BBSIZE)  {
 			c_bufsb = (char *)sb + i;
 			libxfs_sb_from_disk(&bufsb, (xfs_dsb_t *)c_bufsb);
-			libxfs_sb_quota_from_disk(&bufsb);
 
 			if (verify_sb(c_bufsb, &bufsb, 0) != XR_OK)
 				continue;
@@ -568,7 +567,6 @@  get_sb(xfs_sb_t *sbp, xfs_off_t off, int size, xfs_agnumber_t agno)
 		do_error("%s\n", strerror(error));
 	}
 	libxfs_sb_from_disk(sbp, buf);
-	libxfs_sb_quota_from_disk(sbp);
 
 	rval = verify_sb((char *)buf, sbp, agno == 0);
 	free(buf);
diff --git a/repair/scan.c b/repair/scan.c
index 964ff06..366ce16 100644
--- a/repair/scan.c
+++ b/repair/scan.c
@@ -1622,7 +1622,6 @@  scan_ag(
 		goto out_free_sb;
 	}
 	libxfs_sb_from_disk(sb, XFS_BUF_TO_SBP(sbbuf));
-	libxfs_sb_quota_from_disk(sb);
 
 	agfbuf = libxfs_readbuf(mp->m_dev,
 			XFS_AG_DADDR(mp, agno, XFS_AGF_DADDR(mp)),