diff mbox series

[v2] inotify: Increase default inotify.max_user_watches limit to 1048576

Message ID 20201029154535.2074-1-longman@redhat.com
State New
Headers show
Series [v2] inotify: Increase default inotify.max_user_watches limit to 1048576 | expand

Commit Message

Waiman Long Oct. 29, 2020, 3:45 p.m. UTC
The default value of inotify.max_user_watches sysctl parameter was set
to 8192 since the introduction of the inotify feature in 2005 by
commit 0eeca28300df ("[PATCH] inotify"). Today this value is just too
small for many modern usage. As a result, users have to explicitly set
it to a larger value to make it work.

After some searching around the web, these are the
inotify.max_user_watches values used by some projects:
 - vscode:  524288
 - dropbox support: 100000
 - users on stackexchange: 12228
 - lsyncd user: 2000000
 - code42 support: 1048576
 - monodevelop: 16384
 - tectonic: 524288
 - openshift origin: 65536

Each watch point adds an inotify_inode_mark structure to an inode to
be watched. It also pins the watched inode as well as an inotify fdinfo
procfs file.

Modeled after the epoll.max_user_watches behavior to adjust the default
value according to the amount of addressable memory available, make
inotify.max_user_watches behave in a similar way to make it use no more
than 1% of addressable memory within the range [8192, 1048576].

For 64-bit archs, inotify_inode_mark plus 2 inode have a size close
to 2 kbytes. That means a system with 196GB or more memory should have
the maximum value of 1048576 for inotify.max_user_watches. This default
should be big enough for most use cases.

With my x86-64 config, the size of xfs_inode, proc_inode and
inotify_inode_mark is 1680 bytes. The estimated INOTIFY_WATCH_COST is
1760 bytes.

[v2: increase inotify watch cost as suggested by Amir and Honza]

Signed-off-by: Waiman Long <longman@redhat.com>
---
 fs/notify/inotify/inotify_user.c | 24 +++++++++++++++++++++++-
 1 file changed, 23 insertions(+), 1 deletion(-)

Comments

Amir Goldstein Oct. 29, 2020, 5:27 p.m. UTC | #1
On Thu, Oct 29, 2020 at 5:46 PM Waiman Long <longman@redhat.com> wrote:
>
> The default value of inotify.max_user_watches sysctl parameter was set
> to 8192 since the introduction of the inotify feature in 2005 by
> commit 0eeca28300df ("[PATCH] inotify"). Today this value is just too
> small for many modern usage. As a result, users have to explicitly set
> it to a larger value to make it work.
>
> After some searching around the web, these are the
> inotify.max_user_watches values used by some projects:
>  - vscode:  524288
>  - dropbox support: 100000
>  - users on stackexchange: 12228
>  - lsyncd user: 2000000
>  - code42 support: 1048576
>  - monodevelop: 16384
>  - tectonic: 524288
>  - openshift origin: 65536
>
> Each watch point adds an inotify_inode_mark structure to an inode to
> be watched. It also pins the watched inode as well as an inotify fdinfo
> procfs file.
>
> Modeled after the epoll.max_user_watches behavior to adjust the default
> value according to the amount of addressable memory available, make
> inotify.max_user_watches behave in a similar way to make it use no more
> than 1% of addressable memory within the range [8192, 1048576].
>
> For 64-bit archs, inotify_inode_mark plus 2 inode have a size close
> to 2 kbytes. That means a system with 196GB or more memory should have
> the maximum value of 1048576 for inotify.max_user_watches. This default
> should be big enough for most use cases.
>
> With my x86-64 config, the size of xfs_inode, proc_inode and
> inotify_inode_mark is 1680 bytes. The estimated INOTIFY_WATCH_COST is
> 1760 bytes.
>
> [v2: increase inotify watch cost as suggested by Amir and Honza]
>
> Signed-off-by: Waiman Long <longman@redhat.com>
> ---
>  fs/notify/inotify/inotify_user.c | 24 +++++++++++++++++++++++-
>  1 file changed, 23 insertions(+), 1 deletion(-)
>
> diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
> index 186722ba3894..37d9f09c226f 100644
> --- a/fs/notify/inotify/inotify_user.c
> +++ b/fs/notify/inotify/inotify_user.c
> @@ -37,6 +37,16 @@
>
>  #include <asm/ioctls.h>
>
> +/*
> + * An inotify watch requires allocating an inotify_inode_mark structure as
> + * well as pinning the watched inode and adding inotify fdinfo procfs file.

Maybe you misunderstood me.
There is no procfs file per watch.
There is a procfs file per inotify_init() fd.
The fdinfo of that procfile lists all the watches of that inotify instance.

> + * The increase in size of a filesystem inode versus a VFS inode varies
> + * depending on the filesystem. An extra 512 bytes is added as rough
> + * estimate of the additional filesystem inode cost.
> + */
> +#define INOTIFY_WATCH_COST     (sizeof(struct inotify_inode_mark) + \
> +                                2 * sizeof(struct inode) + 512)
> +

I would consider going with double the sizeof inode as rough approximation for
filesystem inode size.

It is a bit less arbitrary than 512 and it has some rationale behind it -
Some kernel config options will grow struct inode (debug, smp)
The same config options may also grow the filesystem part of the inode.

And this approximation can be pretty accurate at times.
For example, on Ubuntu 18.04 kernel 5.4.0:
inode_cache        608
nfs_inode_cache      1088
btrfs_inode            1168
xfs_inode              1024
ext4_inode_cache   1096


Thanks,
Amir.
Waiman Long Oct. 29, 2020, 6:05 p.m. UTC | #2
On 10/29/20 1:27 PM, Amir Goldstein wrote:
> On Thu, Oct 29, 2020 at 5:46 PM Waiman Long <longman@redhat.com> wrote:
>> The default value of inotify.max_user_watches sysctl parameter was set
>> to 8192 since the introduction of the inotify feature in 2005 by
>> commit 0eeca28300df ("[PATCH] inotify"). Today this value is just too
>> small for many modern usage. As a result, users have to explicitly set
>> it to a larger value to make it work.
>>
>> After some searching around the web, these are the
>> inotify.max_user_watches values used by some projects:
>>   - vscode:  524288
>>   - dropbox support: 100000
>>   - users on stackexchange: 12228
>>   - lsyncd user: 2000000
>>   - code42 support: 1048576
>>   - monodevelop: 16384
>>   - tectonic: 524288
>>   - openshift origin: 65536
>>
>> Each watch point adds an inotify_inode_mark structure to an inode to
>> be watched. It also pins the watched inode as well as an inotify fdinfo
>> procfs file.
>>
>> Modeled after the epoll.max_user_watches behavior to adjust the default
>> value according to the amount of addressable memory available, make
>> inotify.max_user_watches behave in a similar way to make it use no more
>> than 1% of addressable memory within the range [8192, 1048576].
>>
>> For 64-bit archs, inotify_inode_mark plus 2 inode have a size close
>> to 2 kbytes. That means a system with 196GB or more memory should have
>> the maximum value of 1048576 for inotify.max_user_watches. This default
>> should be big enough for most use cases.
>>
>> With my x86-64 config, the size of xfs_inode, proc_inode and
>> inotify_inode_mark is 1680 bytes. The estimated INOTIFY_WATCH_COST is
>> 1760 bytes.
>>
>> [v2: increase inotify watch cost as suggested by Amir and Honza]
>>
>> Signed-off-by: Waiman Long <longman@redhat.com>
>> ---
>>   fs/notify/inotify/inotify_user.c | 24 +++++++++++++++++++++++-
>>   1 file changed, 23 insertions(+), 1 deletion(-)
>>
>> diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
>> index 186722ba3894..37d9f09c226f 100644
>> --- a/fs/notify/inotify/inotify_user.c
>> +++ b/fs/notify/inotify/inotify_user.c
>> @@ -37,6 +37,16 @@
>>
>>   #include <asm/ioctls.h>
>>
>> +/*
>> + * An inotify watch requires allocating an inotify_inode_mark structure as
>> + * well as pinning the watched inode and adding inotify fdinfo procfs file.
> Maybe you misunderstood me.
> There is no procfs file per watch.
> There is a procfs file per inotify_init() fd.
> The fdinfo of that procfile lists all the watches of that inotify instance.
Thanks for the clarification. Yes, I probably had misunderstood you 
because of the 2 * sizeof(inode) figure you provided.
>> + * The increase in size of a filesystem inode versus a VFS inode varies
>> + * depending on the filesystem. An extra 512 bytes is added as rough
>> + * estimate of the additional filesystem inode cost.
>> + */
>> +#define INOTIFY_WATCH_COST     (sizeof(struct inotify_inode_mark) + \
>> +                                2 * sizeof(struct inode) + 512)
>> +
> I would consider going with double the sizeof inode as rough approximation for
> filesystem inode size.
>
> It is a bit less arbitrary than 512 and it has some rationale behind it -
> Some kernel config options will grow struct inode (debug, smp)
> The same config options may also grow the filesystem part of the inode.
>
> And this approximation can be pretty accurate at times.
> For example, on Ubuntu 18.04 kernel 5.4.0:
> inode_cache        608
> nfs_inode_cache      1088
> btrfs_inode            1168
> xfs_inode              1024
> ext4_inode_cache   1096

Just to clarify, is your original 2 * sizeof(struct inode) figure 
include the filesystem inode overhead or there is an additional inode 
somewhere that I needs to go to 4 * sizeof(struct inode)?

Cheers,
Longman
Amir Goldstein Oct. 29, 2020, 6:46 p.m. UTC | #3
On Thu, Oct 29, 2020 at 8:05 PM Waiman Long <longman@redhat.com> wrote:
>
> On 10/29/20 1:27 PM, Amir Goldstein wrote:
> > On Thu, Oct 29, 2020 at 5:46 PM Waiman Long <longman@redhat.com> wrote:
> >> The default value of inotify.max_user_watches sysctl parameter was set
> >> to 8192 since the introduction of the inotify feature in 2005 by
> >> commit 0eeca28300df ("[PATCH] inotify"). Today this value is just too
> >> small for many modern usage. As a result, users have to explicitly set
> >> it to a larger value to make it work.
> >>
> >> After some searching around the web, these are the
> >> inotify.max_user_watches values used by some projects:
> >>   - vscode:  524288
> >>   - dropbox support: 100000
> >>   - users on stackexchange: 12228
> >>   - lsyncd user: 2000000
> >>   - code42 support: 1048576
> >>   - monodevelop: 16384
> >>   - tectonic: 524288
> >>   - openshift origin: 65536
> >>
> >> Each watch point adds an inotify_inode_mark structure to an inode to
> >> be watched. It also pins the watched inode as well as an inotify fdinfo
> >> procfs file.
> >>
> >> Modeled after the epoll.max_user_watches behavior to adjust the default
> >> value according to the amount of addressable memory available, make
> >> inotify.max_user_watches behave in a similar way to make it use no more
> >> than 1% of addressable memory within the range [8192, 1048576].
> >>
> >> For 64-bit archs, inotify_inode_mark plus 2 inode have a size close
> >> to 2 kbytes. That means a system with 196GB or more memory should have
> >> the maximum value of 1048576 for inotify.max_user_watches. This default
> >> should be big enough for most use cases.
> >>
> >> With my x86-64 config, the size of xfs_inode, proc_inode and
> >> inotify_inode_mark is 1680 bytes. The estimated INOTIFY_WATCH_COST is
> >> 1760 bytes.
> >>
> >> [v2: increase inotify watch cost as suggested by Amir and Honza]
> >>
> >> Signed-off-by: Waiman Long <longman@redhat.com>
> >> ---
> >>   fs/notify/inotify/inotify_user.c | 24 +++++++++++++++++++++++-
> >>   1 file changed, 23 insertions(+), 1 deletion(-)
> >>
> >> diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
> >> index 186722ba3894..37d9f09c226f 100644
> >> --- a/fs/notify/inotify/inotify_user.c
> >> +++ b/fs/notify/inotify/inotify_user.c
> >> @@ -37,6 +37,16 @@
> >>
> >>   #include <asm/ioctls.h>
> >>
> >> +/*
> >> + * An inotify watch requires allocating an inotify_inode_mark structure as
> >> + * well as pinning the watched inode and adding inotify fdinfo procfs file.
> > Maybe you misunderstood me.
> > There is no procfs file per watch.
> > There is a procfs file per inotify_init() fd.
> > The fdinfo of that procfile lists all the watches of that inotify instance.
> Thanks for the clarification. Yes, I probably had misunderstood you
> because of the 2 * sizeof(inode) figure you provided.
> >> + * The increase in size of a filesystem inode versus a VFS inode varies
> >> + * depending on the filesystem. An extra 512 bytes is added as rough
> >> + * estimate of the additional filesystem inode cost.
> >> + */
> >> +#define INOTIFY_WATCH_COST     (sizeof(struct inotify_inode_mark) + \
> >> +                                2 * sizeof(struct inode) + 512)
> >> +
> > I would consider going with double the sizeof inode as rough approximation for
> > filesystem inode size.
> >
> > It is a bit less arbitrary than 512 and it has some rationale behind it -
> > Some kernel config options will grow struct inode (debug, smp)
> > The same config options may also grow the filesystem part of the inode.
> >
> > And this approximation can be pretty accurate at times.
> > For example, on Ubuntu 18.04 kernel 5.4.0:
> > inode_cache        608
> > nfs_inode_cache      1088
> > btrfs_inode            1168
> > xfs_inode              1024
> > ext4_inode_cache   1096
>
> Just to clarify, is your original 2 * sizeof(struct inode) figure
> include the filesystem inode overhead or there is an additional inode
> somewhere that I needs to go to 4 * sizeof(struct inode)?

No additional inode.

#define INOTIFY_WATCH_COST     (sizeof(struct inotify_inode_mark) + \
                                                      2 * sizeof(struct inode))

Not sure if the inotify_inode_mark part matters, but it doesn't hurt.
Do note that Jan had a different proposal for fs inode size estimation (1K).
I have no objection to this estimation if Jan insists.

Thanks,
Amir.
Waiman Long Oct. 29, 2020, 7:04 p.m. UTC | #4
On 10/29/20 2:46 PM, Amir Goldstein wrote:
> On Thu, Oct 29, 2020 at 8:05 PM Waiman Long <longman@redhat.com> wrote:
>> On 10/29/20 1:27 PM, Amir Goldstein wrote:
>>> On Thu, Oct 29, 2020 at 5:46 PM Waiman Long <longman@redhat.com> wrote:
>>>> The default value of inotify.max_user_watches sysctl parameter was set
>>>> to 8192 since the introduction of the inotify feature in 2005 by
>>>> commit 0eeca28300df ("[PATCH] inotify"). Today this value is just too
>>>> small for many modern usage. As a result, users have to explicitly set
>>>> it to a larger value to make it work.
>>>>
>>>> After some searching around the web, these are the
>>>> inotify.max_user_watches values used by some projects:
>>>>    - vscode:  524288
>>>>    - dropbox support: 100000
>>>>    - users on stackexchange: 12228
>>>>    - lsyncd user: 2000000
>>>>    - code42 support: 1048576
>>>>    - monodevelop: 16384
>>>>    - tectonic: 524288
>>>>    - openshift origin: 65536
>>>>
>>>> Each watch point adds an inotify_inode_mark structure to an inode to
>>>> be watched. It also pins the watched inode as well as an inotify fdinfo
>>>> procfs file.
>>>>
>>>> Modeled after the epoll.max_user_watches behavior to adjust the default
>>>> value according to the amount of addressable memory available, make
>>>> inotify.max_user_watches behave in a similar way to make it use no more
>>>> than 1% of addressable memory within the range [8192, 1048576].
>>>>
>>>> For 64-bit archs, inotify_inode_mark plus 2 inode have a size close
>>>> to 2 kbytes. That means a system with 196GB or more memory should have
>>>> the maximum value of 1048576 for inotify.max_user_watches. This default
>>>> should be big enough for most use cases.
>>>>
>>>> With my x86-64 config, the size of xfs_inode, proc_inode and
>>>> inotify_inode_mark is 1680 bytes. The estimated INOTIFY_WATCH_COST is
>>>> 1760 bytes.
>>>>
>>>> [v2: increase inotify watch cost as suggested by Amir and Honza]
>>>>
>>>> Signed-off-by: Waiman Long <longman@redhat.com>
>>>> ---
>>>>    fs/notify/inotify/inotify_user.c | 24 +++++++++++++++++++++++-
>>>>    1 file changed, 23 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
>>>> index 186722ba3894..37d9f09c226f 100644
>>>> --- a/fs/notify/inotify/inotify_user.c
>>>> +++ b/fs/notify/inotify/inotify_user.c
>>>> @@ -37,6 +37,16 @@
>>>>
>>>>    #include <asm/ioctls.h>
>>>>
>>>> +/*
>>>> + * An inotify watch requires allocating an inotify_inode_mark structure as
>>>> + * well as pinning the watched inode and adding inotify fdinfo procfs file.
>>> Maybe you misunderstood me.
>>> There is no procfs file per watch.
>>> There is a procfs file per inotify_init() fd.
>>> The fdinfo of that procfile lists all the watches of that inotify instance.
>> Thanks for the clarification. Yes, I probably had misunderstood you
>> because of the 2 * sizeof(inode) figure you provided.
>>>> + * The increase in size of a filesystem inode versus a VFS inode varies
>>>> + * depending on the filesystem. An extra 512 bytes is added as rough
>>>> + * estimate of the additional filesystem inode cost.
>>>> + */
>>>> +#define INOTIFY_WATCH_COST     (sizeof(struct inotify_inode_mark) + \
>>>> +                                2 * sizeof(struct inode) + 512)
>>>> +
>>> I would consider going with double the sizeof inode as rough approximation for
>>> filesystem inode size.
>>>
>>> It is a bit less arbitrary than 512 and it has some rationale behind it -
>>> Some kernel config options will grow struct inode (debug, smp)
>>> The same config options may also grow the filesystem part of the inode.
>>>
>>> And this approximation can be pretty accurate at times.
>>> For example, on Ubuntu 18.04 kernel 5.4.0:
>>> inode_cache        608
>>> nfs_inode_cache      1088
>>> btrfs_inode            1168
>>> xfs_inode              1024
>>> ext4_inode_cache   1096
>> Just to clarify, is your original 2 * sizeof(struct inode) figure
>> include the filesystem inode overhead or there is an additional inode
>> somewhere that I needs to go to 4 * sizeof(struct inode)?
> No additional inode.
>
> #define INOTIFY_WATCH_COST     (sizeof(struct inotify_inode_mark) + \
>                                                        2 * sizeof(struct inode))
>
> Not sure if the inotify_inode_mark part matters, but it doesn't hurt.
> Do note that Jan had a different proposal for fs inode size estimation (1K).
> I have no objection to this estimation if Jan insists.
>
> Thanks,
> Amir.
>
Thanks for the confirmation. 2*sizeof(struct inode) is more than 1k. 
Besides with debugging turned on, the size will increase more. So that 
figure is good enough.

Cheers,
Longman
Jan Kara Oct. 30, 2020, 10:48 a.m. UTC | #5
On Thu 29-10-20 15:04:56, Waiman Long wrote:
> On 10/29/20 2:46 PM, Amir Goldstein wrote:
> > On Thu, Oct 29, 2020 at 8:05 PM Waiman Long <longman@redhat.com> wrote:
> > > On 10/29/20 1:27 PM, Amir Goldstein wrote:
> > > > On Thu, Oct 29, 2020 at 5:46 PM Waiman Long <longman@redhat.com> wrote:
> > > > > The default value of inotify.max_user_watches sysctl parameter was set
> > > > > to 8192 since the introduction of the inotify feature in 2005 by
> > > > > commit 0eeca28300df ("[PATCH] inotify"). Today this value is just too
> > > > > small for many modern usage. As a result, users have to explicitly set
> > > > > it to a larger value to make it work.
> > > > > 
> > > > > After some searching around the web, these are the
> > > > > inotify.max_user_watches values used by some projects:
> > > > >    - vscode:  524288
> > > > >    - dropbox support: 100000
> > > > >    - users on stackexchange: 12228
> > > > >    - lsyncd user: 2000000
> > > > >    - code42 support: 1048576
> > > > >    - monodevelop: 16384
> > > > >    - tectonic: 524288
> > > > >    - openshift origin: 65536
> > > > > 
> > > > > Each watch point adds an inotify_inode_mark structure to an inode to
> > > > > be watched. It also pins the watched inode as well as an inotify fdinfo
> > > > > procfs file.
> > > > > 
> > > > > Modeled after the epoll.max_user_watches behavior to adjust the default
> > > > > value according to the amount of addressable memory available, make
> > > > > inotify.max_user_watches behave in a similar way to make it use no more
> > > > > than 1% of addressable memory within the range [8192, 1048576].
> > > > > 
> > > > > For 64-bit archs, inotify_inode_mark plus 2 inode have a size close
> > > > > to 2 kbytes. That means a system with 196GB or more memory should have
> > > > > the maximum value of 1048576 for inotify.max_user_watches. This default
> > > > > should be big enough for most use cases.
> > > > > 
> > > > > With my x86-64 config, the size of xfs_inode, proc_inode and
> > > > > inotify_inode_mark is 1680 bytes. The estimated INOTIFY_WATCH_COST is
> > > > > 1760 bytes.
> > > > > 
> > > > > [v2: increase inotify watch cost as suggested by Amir and Honza]
> > > > > 
> > > > > Signed-off-by: Waiman Long <longman@redhat.com>
> > > > > ---
> > > > >    fs/notify/inotify/inotify_user.c | 24 +++++++++++++++++++++++-
> > > > >    1 file changed, 23 insertions(+), 1 deletion(-)
> > > > > 
> > > > > diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
> > > > > index 186722ba3894..37d9f09c226f 100644
> > > > > --- a/fs/notify/inotify/inotify_user.c
> > > > > +++ b/fs/notify/inotify/inotify_user.c
> > > > > @@ -37,6 +37,16 @@
> > > > > 
> > > > >    #include <asm/ioctls.h>
> > > > > 
> > > > > +/*
> > > > > + * An inotify watch requires allocating an inotify_inode_mark structure as
> > > > > + * well as pinning the watched inode and adding inotify fdinfo procfs file.
> > > > Maybe you misunderstood me.
> > > > There is no procfs file per watch.
> > > > There is a procfs file per inotify_init() fd.
> > > > The fdinfo of that procfile lists all the watches of that inotify instance.
> > > Thanks for the clarification. Yes, I probably had misunderstood you
> > > because of the 2 * sizeof(inode) figure you provided.
> > > > > + * The increase in size of a filesystem inode versus a VFS inode varies
> > > > > + * depending on the filesystem. An extra 512 bytes is added as rough
> > > > > + * estimate of the additional filesystem inode cost.
> > > > > + */
> > > > > +#define INOTIFY_WATCH_COST     (sizeof(struct inotify_inode_mark) + \
> > > > > +                                2 * sizeof(struct inode) + 512)
> > > > > +
> > > > I would consider going with double the sizeof inode as rough approximation for
> > > > filesystem inode size.
> > > > 
> > > > It is a bit less arbitrary than 512 and it has some rationale behind it -
> > > > Some kernel config options will grow struct inode (debug, smp)
> > > > The same config options may also grow the filesystem part of the inode.
> > > > 
> > > > And this approximation can be pretty accurate at times.
> > > > For example, on Ubuntu 18.04 kernel 5.4.0:
> > > > inode_cache        608
> > > > nfs_inode_cache      1088
> > > > btrfs_inode            1168
> > > > xfs_inode              1024
> > > > ext4_inode_cache   1096
> > > Just to clarify, is your original 2 * sizeof(struct inode) figure
> > > include the filesystem inode overhead or there is an additional inode
> > > somewhere that I needs to go to 4 * sizeof(struct inode)?
> > No additional inode.
> > 
> > #define INOTIFY_WATCH_COST     (sizeof(struct inotify_inode_mark) + \
> >                                                        2 * sizeof(struct inode))
> > 
> > Not sure if the inotify_inode_mark part matters, but it doesn't hurt.
> > Do note that Jan had a different proposal for fs inode size estimation (1K).
> > I have no objection to this estimation if Jan insists.
> > 
> > Thanks,
> > Amir.
> > 
> Thanks for the confirmation. 2*sizeof(struct inode) is more than 1k. Besides
> with debugging turned on, the size will increase more. So that figure is
> good enough.

Yeah, the 2*sizeof(struct inode) is fine by me as well. Please don't forget
to update the comment explaining INOTIFY_WATCH_COST. Thanks!

								Honza
diff mbox series

Patch

diff --git a/fs/notify/inotify/inotify_user.c b/fs/notify/inotify/inotify_user.c
index 186722ba3894..37d9f09c226f 100644
--- a/fs/notify/inotify/inotify_user.c
+++ b/fs/notify/inotify/inotify_user.c
@@ -37,6 +37,16 @@ 
 
 #include <asm/ioctls.h>
 
+/*
+ * An inotify watch requires allocating an inotify_inode_mark structure as
+ * well as pinning the watched inode and adding inotify fdinfo procfs file.
+ * The increase in size of a filesystem inode versus a VFS inode varies
+ * depending on the filesystem. An extra 512 bytes is added as rough
+ * estimate of the additional filesystem inode cost.
+ */
+#define INOTIFY_WATCH_COST	(sizeof(struct inotify_inode_mark) + \
+				 2 * sizeof(struct inode) + 512)
+
 /* configurable via /proc/sys/fs/inotify/ */
 static int inotify_max_queued_events __read_mostly;
 
@@ -801,6 +811,18 @@  SYSCALL_DEFINE2(inotify_rm_watch, int, fd, __s32, wd)
  */
 static int __init inotify_user_setup(void)
 {
+	unsigned int watches_max;
+	struct sysinfo si;
+
+	si_meminfo(&si);
+	/*
+	 * Allow up to 1% of addressible memory to be allocated for inotify
+	 * watches (per user) limited to the range [8192, 1048576].
+	 */
+	watches_max = (((si.totalram - si.totalhigh) / 100) << PAGE_SHIFT) /
+			INOTIFY_WATCH_COST;
+	watches_max = min(1048576U, max(watches_max, 8192U));
+
 	BUILD_BUG_ON(IN_ACCESS != FS_ACCESS);
 	BUILD_BUG_ON(IN_MODIFY != FS_MODIFY);
 	BUILD_BUG_ON(IN_ATTRIB != FS_ATTRIB);
@@ -827,7 +849,7 @@  static int __init inotify_user_setup(void)
 
 	inotify_max_queued_events = 16384;
 	init_user_ns.ucount_max[UCOUNT_INOTIFY_INSTANCES] = 128;
-	init_user_ns.ucount_max[UCOUNT_INOTIFY_WATCHES] = 8192;
+	init_user_ns.ucount_max[UCOUNT_INOTIFY_WATCHES] = watches_max;
 
 	return 0;
 }