diff mbox series

[RFC,v1,2/2] xfs: don't set warns on the id==0 dquot

Message ID 20220421165815.87837-3-catherine.hoang@oracle.com (mailing list archive)
State Deferred, archived
Headers show
Series xfs: remove quota warning limits | expand

Commit Message

Catherine Hoang April 21, 2022, 4:58 p.m. UTC
Quotas are not enforced on the id==0 dquot, so the quota code uses it
to store warning limits and timeouts.  Having just dropped support for
warning limits, this field no longer has any meaning.  Return -EINVAL
for this dquot id if the fieldmask has any of the QC_*_WARNS set.

Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
---
 fs/xfs/xfs_qm_syscalls.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Dave Chinner April 22, 2022, 1:40 a.m. UTC | #1
On Thu, Apr 21, 2022 at 09:58:15AM -0700, Catherine Hoang wrote:
> Quotas are not enforced on the id==0 dquot, so the quota code uses it
> to store warning limits and timeouts.  Having just dropped support for
> warning limits, this field no longer has any meaning.  Return -EINVAL
> for this dquot id if the fieldmask has any of the QC_*_WARNS set.
> 
> Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
> ---
>  fs/xfs/xfs_qm_syscalls.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/xfs/xfs_qm_syscalls.c b/fs/xfs/xfs_qm_syscalls.c
> index e7f3ac60ebd9..bdbd5c83b08e 100644
> --- a/fs/xfs/xfs_qm_syscalls.c
> +++ b/fs/xfs/xfs_qm_syscalls.c
> @@ -290,6 +290,8 @@ xfs_qm_scall_setqlim(
>  		return -EINVAL;
>  	if ((newlim->d_fieldmask & XFS_QC_MASK) == 0)
>  		return 0;
> +	if ((newlim->d_fieldmask & QC_WARNS_MASK) && id == 0)
> +		return -EINVAL;

Why would we do this for only id == 0? This will still allow
non-zero warning values to be written to dquots that have id != 0,
but I'm not sure why we'd allow this given that the previous
patch just removed all the warning limit checking?

Which then makes me ask: why are we still reading the warning counts
from on disk dquots and writing in-memory values back to dquots?
Shouldn't xfs_dquot_to_disk() just write zeros to the warning fields
now, and xfs_dquot_from_disk() elide reading the warning counts
altogether? i.e. can we remove d_bwarns, d_iwarns and d_rtbwarns
from the struct fs_disk_quota altogether now?

Which then raises the question of whether copy_from_xfs_dqblk() and
friends should still support warn counts up in fs/quota/quota.c...?

Cheers,

Dave.
Catherine Hoang April 25, 2022, 3:34 p.m. UTC | #2
> On Apr 21, 2022, at 6:40 PM, Dave Chinner <david@fromorbit.com> wrote:
> 
> On Thu, Apr 21, 2022 at 09:58:15AM -0700, Catherine Hoang wrote:
>> Quotas are not enforced on the id==0 dquot, so the quota code uses it
>> to store warning limits and timeouts.  Having just dropped support for
>> warning limits, this field no longer has any meaning.  Return -EINVAL
>> for this dquot id if the fieldmask has any of the QC_*_WARNS set.
>> 
>> Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
>> ---
>> fs/xfs/xfs_qm_syscalls.c | 2 ++
>> 1 file changed, 2 insertions(+)
>> 
>> diff --git a/fs/xfs/xfs_qm_syscalls.c b/fs/xfs/xfs_qm_syscalls.c
>> index e7f3ac60ebd9..bdbd5c83b08e 100644
>> --- a/fs/xfs/xfs_qm_syscalls.c
>> +++ b/fs/xfs/xfs_qm_syscalls.c
>> @@ -290,6 +290,8 @@ xfs_qm_scall_setqlim(
>> 		return -EINVAL;
>> 	if ((newlim->d_fieldmask & XFS_QC_MASK) == 0)
>> 		return 0;
>> +	if ((newlim->d_fieldmask & QC_WARNS_MASK) && id == 0)
>> +		return -EINVAL;
> 
> Why would we do this for only id == 0? This will still allow
> non-zero warning values to be written to dquots that have id != 0,
> but I'm not sure why we'd allow this given that the previous
> patch just removed all the warning limit checking?
> 
> Which then makes me ask: why are we still reading the warning counts
> from on disk dquots and writing in-memory values back to dquots?
> Shouldn't xfs_dquot_to_disk() just write zeros to the warning fields
> now, and xfs_dquot_from_disk() elide reading the warning counts
> altogether? i.e. can we remove d_bwarns, d_iwarns and d_rtbwarns
> from the struct fs_disk_quota altogether now?
> 
> Which then raises the question of whether copy_from_xfs_dqblk() and
> friends should still support warn counts up in fs/quota/quota.c...?

The intent of this patchset is to only remove warning limits, not warning
counts. The id == 0 dquot stores warning limits, so we no longer want
warning values to be written here. Dquots with id != 0 store warning
counts, so we still allow these values to be updated. 

In regards to whether or not warning counts should be removed, it seems
like they currently do serve a purpose. Although there's some debate about
this in a previous thread:
https://lore.kernel.org/linux-xfs/20220303003808.GM117732@magnolia/

Catherine
> 
> Cheers,
> 
> Dave.
> -- 
> Dave Chinner
> david@fromorbit.com
Dave Chinner April 25, 2022, 10:42 p.m. UTC | #3
On Mon, Apr 25, 2022 at 03:34:52PM +0000, Catherine Hoang wrote:
> > On Apr 21, 2022, at 6:40 PM, Dave Chinner <david@fromorbit.com> wrote:
> > 
> > On Thu, Apr 21, 2022 at 09:58:15AM -0700, Catherine Hoang wrote:
> >> Quotas are not enforced on the id==0 dquot, so the quota code uses it
> >> to store warning limits and timeouts.  Having just dropped support for
> >> warning limits, this field no longer has any meaning.  Return -EINVAL
> >> for this dquot id if the fieldmask has any of the QC_*_WARNS set.
> >> 
> >> Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
> >> ---
> >> fs/xfs/xfs_qm_syscalls.c | 2 ++
> >> 1 file changed, 2 insertions(+)
> >> 
> >> diff --git a/fs/xfs/xfs_qm_syscalls.c b/fs/xfs/xfs_qm_syscalls.c
> >> index e7f3ac60ebd9..bdbd5c83b08e 100644
> >> --- a/fs/xfs/xfs_qm_syscalls.c
> >> +++ b/fs/xfs/xfs_qm_syscalls.c
> >> @@ -290,6 +290,8 @@ xfs_qm_scall_setqlim(
> >> 		return -EINVAL;
> >> 	if ((newlim->d_fieldmask & XFS_QC_MASK) == 0)
> >> 		return 0;
> >> +	if ((newlim->d_fieldmask & QC_WARNS_MASK) && id == 0)
> >> +		return -EINVAL;
> > 
> > Why would we do this for only id == 0? This will still allow
> > non-zero warning values to be written to dquots that have id != 0,
> > but I'm not sure why we'd allow this given that the previous
> > patch just removed all the warning limit checking?
> > 
> > Which then makes me ask: why are we still reading the warning counts
> > from on disk dquots and writing in-memory values back to dquots?
> > Shouldn't xfs_dquot_to_disk() just write zeros to the warning fields
> > now, and xfs_dquot_from_disk() elide reading the warning counts
> > altogether? i.e. can we remove d_bwarns, d_iwarns and d_rtbwarns
> > from the struct fs_disk_quota altogether now?
> > 
> > Which then raises the question of whether copy_from_xfs_dqblk() and
> > friends should still support warn counts up in fs/quota/quota.c...?
> 
> The intent of this patchset is to only remove warning limits, not warning
> counts. The id == 0 dquot stores warning limits, so we no longer want
> warning values to be written here. Dquots with id != 0 store warning
> counts, so we still allow these values to be updated. 

Why? What uses the value? After this patchset the warning count is
not used anywhere in the kernel. Absent in-kernel enforced limits,
the warning count fundamentally useless as a userspace decision tool
because of the warning count is not deterministic in any way.
Nothing can be inferred about the absolute value of the count
because it can't be related back to individual user operations.

i.e. A single write() can get different numbers of warnings issuedi
simply because of file layout or inode config (e.g. DIO vs buffered,
writes into sparse regions, unwritten regions, if extent size hints
are enabled, etc) and so the absolute magnitude of the warning count
doesn't carry any significant meaning.

I still haven't seen a use case for these quota warning counts,
either with or without the limiting code. Who needs this, and why?

> In regards to whether or not warning counts should be removed, it seems
> like they currently do serve a purpose. Although there's some debate about
> this in a previous thread:
> https://lore.kernel.org/linux-xfs/20220303003808.GM117732@magnolia/

Once the warning count increment is removed, there are zero users of
the warning counts.

Legacy functionality policy (as defined by ALLOCSP removal) kicks in
here: remove anything questionable ASAP, before it bites us hard.
And this has already bitten us real hard...

Cheers,

Dave.
Darrick J. Wong April 27, 2022, 8:29 p.m. UTC | #4
On Thu, Apr 21, 2022 at 09:58:15AM -0700, Catherine Hoang wrote:
> Quotas are not enforced on the id==0 dquot, so the quota code uses it
> to store warning limits and timeouts.  Having just dropped support for
> warning limits, this field no longer has any meaning.  Return -EINVAL
> for this dquot id if the fieldmask has any of the QC_*_WARNS set.
> 
> Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
> ---
>  fs/xfs/xfs_qm_syscalls.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/fs/xfs/xfs_qm_syscalls.c b/fs/xfs/xfs_qm_syscalls.c
> index e7f3ac60ebd9..bdbd5c83b08e 100644
> --- a/fs/xfs/xfs_qm_syscalls.c
> +++ b/fs/xfs/xfs_qm_syscalls.c
> @@ -290,6 +290,8 @@ xfs_qm_scall_setqlim(
>  		return -EINVAL;
>  	if ((newlim->d_fieldmask & XFS_QC_MASK) == 0)
>  		return 0;
> +	if ((newlim->d_fieldmask & QC_WARNS_MASK) && id == 0)
> +		return -EINVAL;

Assuming there'll be more patches coming to turn off the rest of the
warnings counters, this is a reasonable start:

Reviewed-by: Darrick J. Wong <djwong@kernel.org>

--D

>  
>  	/*
>  	 * Get the dquot (locked) before we start, as we need to do a
> -- 
> 2.27.0
>
kernel test robot April 28, 2022, 6:36 a.m. UTC | #5
Greeting,

FYI, we noticed the following commit (built with gcc-11):

commit: 3937433c634eaaa2e4b3abb112d83e07df13362c ("[RFC PATCH v1 2/2] xfs: don't set warns on the id==0 dquot")
url: https://github.com/intel-lab-lkp/linux/commits/Catherine-Hoang/xfs-remove-quota-warning-limits/20220422-005909
base: https://git.kernel.org/cgit/fs/xfs/xfs-linux.git for-next
patch link: https://lore.kernel.org/linux-xfs/20220421165815.87837-3-catherine.hoang@oracle.com

in testcase: xfstests
version: xfstests-x86_64-46e1b83-1_20220414
with following parameters:

	disk: 4HDD
	fs: xfs
	test: xfs-group-15
	ucode: 0x21

test-description: xfstests is a regression test suite for xfs and other files ystems.
test-url: git://git.kernel.org/pub/scm/fs/xfs/xfstests-dev.git


on test machine: 4 threads 1 sockets Intel(R) Core(TM) i3-3220 CPU @ 3.30GHz with 8G memory

caused below changes (please refer to attached dmesg/kmsg for entire log/backtrace):




If you fix the issue, kindly add following tag
Reported-by: kernel test robot <oliver.sang@intel.com>

xfs/153	- output mismatch (see /lkp/benchmarks/xfstests/results//xfs/153.out.bad)
    --- tests/xfs/153.out	2022-04-14 12:51:49.000000000 +0000
    +++ /lkp/benchmarks/xfstests/results//xfs/153.out.bad	2022-04-27 09:56:33.037088046 +0000
    @@ -6,6 +6,7 @@
     naming   =VERN bsize=XXX
     log      =LDEV bsize=XXX blocks=XXX
     realtime =RDEV extsz=XXX blocks=XXX, rtextents=XXX
    +xfs_quota: cannot set warnings: Invalid argument
     
     *** report no quota settings
     [ROOT] 0 0 0 00 [--------] 3 0 0 00 [--------] 0 0 0 00 [--------]
    ...
    (Run 'diff -u /lkp/benchmarks/xfstests/tests/xfs/153.out /lkp/benchmarks/xfstests/results//xfs/153.out.bad'  to see the entire diff)



To reproduce:

        git clone https://github.com/intel/lkp-tests.git
        cd lkp-tests
        sudo bin/lkp install job.yaml           # job file is attached in this email
        bin/lkp split-job --compatible job.yaml # generate the yaml file for lkp run
        sudo bin/lkp run generated-yaml-file

        # if come across any failure that blocks the test,
        # please remove ~/.lkp and /lkp dir to run from a clean state.
Allison Henderson April 29, 2022, 5:20 p.m. UTC | #6
On Tue, 2022-04-26 at 08:42 +1000, Dave Chinner wrote:
> On Mon, Apr 25, 2022 at 03:34:52PM +0000, Catherine Hoang wrote:
> > > On Apr 21, 2022, at 6:40 PM, Dave Chinner <david@fromorbit.com>
> > > wrote:
> > > 
> > > On Thu, Apr 21, 2022 at 09:58:15AM -0700, Catherine Hoang wrote:
> > > > Quotas are not enforced on the id==0 dquot, so the quota code
> > > > uses it
> > > > to store warning limits and timeouts.  Having just dropped
> > > > support for
> > > > warning limits, this field no longer has any meaning.  Return
> > > > -EINVAL
> > > > for this dquot id if the fieldmask has any of the QC_*_WARNS
> > > > set.
> > > > 
> > > > Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
> > > > ---
> > > > fs/xfs/xfs_qm_syscalls.c | 2 ++
> > > > 1 file changed, 2 insertions(+)
> > > > 
> > > > diff --git a/fs/xfs/xfs_qm_syscalls.c
> > > > b/fs/xfs/xfs_qm_syscalls.c
> > > > index e7f3ac60ebd9..bdbd5c83b08e 100644
> > > > --- a/fs/xfs/xfs_qm_syscalls.c
> > > > +++ b/fs/xfs/xfs_qm_syscalls.c
> > > > @@ -290,6 +290,8 @@ xfs_qm_scall_setqlim(
> > > > 		return -EINVAL;
> > > > 	if ((newlim->d_fieldmask & XFS_QC_MASK) == 0)
> > > > 		return 0;
> > > > +	if ((newlim->d_fieldmask & QC_WARNS_MASK) && id == 0)
> > > > +		return -EINVAL;
> > > 
> > > Why would we do this for only id == 0? This will still allow
> > > non-zero warning values to be written to dquots that have id !=
> > > 0,
> > > but I'm not sure why we'd allow this given that the previous
> > > patch just removed all the warning limit checking?
> > > 
> > > Which then makes me ask: why are we still reading the warning
> > > counts
> > > from on disk dquots and writing in-memory values back to dquots?
> > > Shouldn't xfs_dquot_to_disk() just write zeros to the warning
> > > fields
> > > now, and xfs_dquot_from_disk() elide reading the warning counts
> > > altogether? i.e. can we remove d_bwarns, d_iwarns and d_rtbwarns
> > > from the struct fs_disk_quota altogether now?
> > > 
> > > Which then raises the question of whether copy_from_xfs_dqblk()
> > > and
> > > friends should still support warn counts up in
> > > fs/quota/quota.c...?
> > 
> > The intent of this patchset is to only remove warning limits, not
> > warning
> > counts. The id == 0 dquot stores warning limits, so we no longer
> > want
> > warning values to be written here. Dquots with id != 0 store
> > warning
> > counts, so we still allow these values to be updated. 
> 
> Why? What uses the value? After this patchset the warning count is
> not used anywhere in the kernel. Absent in-kernel enforced limits,
> the warning count fundamentally useless as a userspace decision tool
> because of the warning count is not deterministic in any way.
> Nothing can be inferred about the absolute value of the count
> because it can't be related back to individual user operations.
> 
> i.e. A single write() can get different numbers of warnings issuedi
> simply because of file layout or inode config (e.g. DIO vs buffered,
> writes into sparse regions, unwritten regions, if extent size hints
> are enabled, etc) and so the absolute magnitude of the warning count
> doesn't carry any significant meaning.
> 
> I still haven't seen a use case for these quota warning counts,
> either with or without the limiting code. Who needs this, and why?

Hmm, I wonder if this is eluding to the idea of just removing the
function all together then?  And replacing it with a -EINVAL or maybe
-EOPNOTSUPP?  I only see a few calls to it in xfs_quotaops.c

Allison
> 
> > In regards to whether or not warning counts should be removed, it
> > seems
> > like they currently do serve a purpose. Although there's some
> > debate about
> > this in a previous thread:
> > https://lore.kernel.org/linux-xfs/20220303003808.GM117732@magnolia/
> 
> Once the warning count increment is removed, there are zero users of
> the warning counts.
> 
> Legacy functionality policy (as defined by ALLOCSP removal) kicks in
> here: remove anything questionable ASAP, before it bites us hard.
> And this has already bitten us real hard...
> 
> Cheers,
> 
> Dave.
Darrick J. Wong April 29, 2022, 7:38 p.m. UTC | #7
On Fri, Apr 29, 2022 at 10:20:18AM -0700, Alli wrote:
> On Tue, 2022-04-26 at 08:42 +1000, Dave Chinner wrote:
> > On Mon, Apr 25, 2022 at 03:34:52PM +0000, Catherine Hoang wrote:
> > > > On Apr 21, 2022, at 6:40 PM, Dave Chinner <david@fromorbit.com>
> > > > wrote:
> > > > 
> > > > On Thu, Apr 21, 2022 at 09:58:15AM -0700, Catherine Hoang wrote:
> > > > > Quotas are not enforced on the id==0 dquot, so the quota code
> > > > > uses it
> > > > > to store warning limits and timeouts.  Having just dropped
> > > > > support for
> > > > > warning limits, this field no longer has any meaning.  Return
> > > > > -EINVAL
> > > > > for this dquot id if the fieldmask has any of the QC_*_WARNS
> > > > > set.
> > > > > 
> > > > > Signed-off-by: Catherine Hoang <catherine.hoang@oracle.com>
> > > > > ---
> > > > > fs/xfs/xfs_qm_syscalls.c | 2 ++
> > > > > 1 file changed, 2 insertions(+)
> > > > > 
> > > > > diff --git a/fs/xfs/xfs_qm_syscalls.c
> > > > > b/fs/xfs/xfs_qm_syscalls.c
> > > > > index e7f3ac60ebd9..bdbd5c83b08e 100644
> > > > > --- a/fs/xfs/xfs_qm_syscalls.c
> > > > > +++ b/fs/xfs/xfs_qm_syscalls.c
> > > > > @@ -290,6 +290,8 @@ xfs_qm_scall_setqlim(
> > > > > 		return -EINVAL;
> > > > > 	if ((newlim->d_fieldmask & XFS_QC_MASK) == 0)
> > > > > 		return 0;
> > > > > +	if ((newlim->d_fieldmask & QC_WARNS_MASK) && id == 0)
> > > > > +		return -EINVAL;
> > > > 
> > > > Why would we do this for only id == 0? This will still allow
> > > > non-zero warning values to be written to dquots that have id !=
> > > > 0,
> > > > but I'm not sure why we'd allow this given that the previous
> > > > patch just removed all the warning limit checking?
> > > > 
> > > > Which then makes me ask: why are we still reading the warning
> > > > counts
> > > > from on disk dquots and writing in-memory values back to dquots?
> > > > Shouldn't xfs_dquot_to_disk() just write zeros to the warning
> > > > fields
> > > > now, and xfs_dquot_from_disk() elide reading the warning counts
> > > > altogether? i.e. can we remove d_bwarns, d_iwarns and d_rtbwarns
> > > > from the struct fs_disk_quota altogether now?
> > > > 
> > > > Which then raises the question of whether copy_from_xfs_dqblk()
> > > > and
> > > > friends should still support warn counts up in
> > > > fs/quota/quota.c...?
> > > 
> > > The intent of this patchset is to only remove warning limits, not
> > > warning
> > > counts. The id == 0 dquot stores warning limits, so we no longer
> > > want
> > > warning values to be written here. Dquots with id != 0 store
> > > warning
> > > counts, so we still allow these values to be updated. 
> > 
> > Why? What uses the value? After this patchset the warning count is
> > not used anywhere in the kernel. Absent in-kernel enforced limits,
> > the warning count fundamentally useless as a userspace decision tool
> > because of the warning count is not deterministic in any way.
> > Nothing can be inferred about the absolute value of the count
> > because it can't be related back to individual user operations.
> > 
> > i.e. A single write() can get different numbers of warnings issuedi
> > simply because of file layout or inode config (e.g. DIO vs buffered,
> > writes into sparse regions, unwritten regions, if extent size hints
> > are enabled, etc) and so the absolute magnitude of the warning count
> > doesn't carry any significant meaning.
> > 
> > I still haven't seen a use case for these quota warning counts,
> > either with or without the limiting code. Who needs this, and why?
> 
> Hmm, I wonder if this is eluding to the idea of just removing the
> function all together then?  And replacing it with a -EINVAL or maybe
> -EOPNOTSUPP?  I only see a few calls to it in xfs_quotaops.c

I don't think /anyone/ needs it, but we're spending a lot of time
speculating and arguing about a poorly defined feature that has never
been satisfactorily implemented.  I broke it two LTSes ago based on my
misunderstanding (and nobody said anything during review) by making the
kernel bump the warning count, and nobody noticed until RH QA hit it.

I spoke to Jan Kara yesterday about removing the warning counts, and he
said he was fine if we simply made the VFS quotactl code return EINVAL
if you try to set the warning values, and then just remove everything
else behind that.

--D

> Allison
> > 
> > > In regards to whether or not warning counts should be removed, it
> > > seems
> > > like they currently do serve a purpose. Although there's some
> > > debate about
> > > this in a previous thread:
> > > https://lore.kernel.org/linux-xfs/20220303003808.GM117732@magnolia/
> > 
> > Once the warning count increment is removed, there are zero users of
> > the warning counts.
> > 
> > Legacy functionality policy (as defined by ALLOCSP removal) kicks in
> > here: remove anything questionable ASAP, before it bites us hard.
> > And this has already bitten us real hard...
> > 
> > Cheers,
> > 
> > Dave.
>
diff mbox series

Patch

diff --git a/fs/xfs/xfs_qm_syscalls.c b/fs/xfs/xfs_qm_syscalls.c
index e7f3ac60ebd9..bdbd5c83b08e 100644
--- a/fs/xfs/xfs_qm_syscalls.c
+++ b/fs/xfs/xfs_qm_syscalls.c
@@ -290,6 +290,8 @@  xfs_qm_scall_setqlim(
 		return -EINVAL;
 	if ((newlim->d_fieldmask & XFS_QC_MASK) == 0)
 		return 0;
+	if ((newlim->d_fieldmask & QC_WARNS_MASK) && id == 0)
+		return -EINVAL;
 
 	/*
 	 * Get the dquot (locked) before we start, as we need to do a