Message ID | 17f8948d-19b9-beac-cab1-e4bc587d9612@molgen.mpg.de (mailing list archive) |
---|---|
State | New, archived |
Series | nfsd: Fix overflow causing non-working mounts on 1 TB machines |
Good catch! And thanks for the detailed explanation. Applying for 5.2 and stable.

On Wed, Jul 03, 2019 at 02:54:43PM +0200, Paul Menzel wrote:
> Date: Wed, 3 Jul 2019 13:28:15 +0200
>
> Since commit 10a68cdf10 (nfsd: fix performance-limiting session
> calculation) (Linux 5.1-rc1 and 4.19.31), shares from NFS servers with
> 1 TB of memory cannot be mounted anymore. The mount just hangs on the
> client.
>
> The gist of commit 10a68cdf10 is the change below.
>
>     -avail = clamp_t(int, avail, slotsize, avail/3);
>     +avail = clamp_t(int, avail, slotsize, total_avail/3);
>
> Here are the macros.
>
>     #define min_t(type, x, y) __careful_cmp((type)(x), (type)(y), <)
>     #define clamp_t(type, val, lo, hi) min_t(type, max_t(type, val, lo), hi)
>
> `total_avail` is 8,434,659,328 on the 1 TB machine. `clamp_t()` casts
> the values to `int`, which for 32-bit integers can only hold values
> −2,147,483,648 (−2^31) through 2,147,483,647 (2^31 − 1).
>
> `avail` (in the function signature) is just 65536, so that no overflow
> was happening. Before the commit the assignment would result in 21845,
> and `num = 4`.
>
> When using `total_avail`, it is causing the assignment to be
> 18446744072226137429 (printed as %lu), and `num` is then 4164608182.
>
> My next guess is that `nfsd_drc_mem_used` is then exceeded, and the
> server thinks there is no memory available any more for this client.
>
> Updating the arguments of `clamp_t()` and `min_t()` to `unsigned long`
> fixes the issue.
>
> Now, `avail = 65536` (before commit 10a68cdf10 `avail = 21845`), but
> `num = 4` remains the same.
>
> Fixes: 10a68cdf10 (nfsd: fix performance-limiting session calculation)
> Cc: stable@vger.kernel.org
> Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
> ---
>
> 1. No idea if `min_t()` arguments also need updating.
> 2. Instead of `unsigned long`, should `size_t` be used?
>
>  fs/nfsd/nfs4state.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
> index 618e66078ee5..1a0cdeb3b875 100644
> --- a/fs/nfsd/nfs4state.c
> +++ b/fs/nfsd/nfs4state.c
> @@ -1563,7 +1563,7 @@ static u32 nfsd4_get_drc_mem(struct nfsd4_channel_attrs *ca)
>  	 * Never use more than a third of the remaining memory,
>  	 * unless it's the only way to give this client a slot:
>  	 */
> -	avail = clamp_t(int, avail, slotsize, total_avail/3);
> +	avail = clamp_t(unsigned long, avail, slotsize, total_avail/3);
>  	num = min_t(int, num, avail / slotsize);
>  	nfsd_drc_mem_used += num * slotsize;
>  	spin_unlock(&nfsd_drc_lock);
> --
> 2.22.0
>
Dear Bruce,

On 7/3/19 5:56 PM, J. Bruce Fields wrote:
> Good catch! And thanks for the detailed explanation. Applying for 5.2
> and stable.

Thanks. Please note that the last part contains some guesses, and I am not
well versed in the terminology. So please feel free to reword the commit
message.

Kind regards,

Paul
On Wed, Jul 03, 2019 at 06:03:06PM +0200, Paul Menzel wrote:
> Dear Bruce,
>
>
> On 7/3/19 5:56 PM, J. Bruce Fields wrote:
> > Good catch! And thanks for the detailed explanation. Applying for 5.2
> > and stable.
>
> Thanks. Please note, that in the last part are some guesses, and I am not
> well versed in the terminology. So please feel free to reword the commit
> messages.

I haven't checked all the arithmetic, but it sounds pretty plausible to
me, and clearly that type was wrong.

--b.
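For readers who want to check the arithmetic, the `avail` figure in the patch description can be reproduced by hand. This is only a sketch under stated assumptions: a 32-bit `int`, a 64-bit `unsigned long`, two's-complement wraparound on the narrowing cast (the behaviour of GCC on x86-64; the C standard leaves it implementation-defined), and a `slotsize` no larger than 65536.

```latex
\begin{align*}
\lfloor 8434659328 / 3 \rfloor &= 2811553109 \;>\; 2^{31} - 1 = 2147483647 \\
(\mathtt{int})\,2811553109 &= 2811553109 - 2^{32} = -1483414187 \\
\min(65536,\ -1483414187) &= -1483414187 \\
(\mathtt{unsigned\ long})(-1483414187) &= 2^{64} - 1483414187 = 18446744072226137429
\end{align*}
```

The last value matches the `avail` quoted in the patch description. With `unsigned long` operands nothing is truncated, and the clamp returns min(65536, 2811553109) = 65536.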
diff --git a/fs/nfsd/nfs4state.c b/fs/nfsd/nfs4state.c
index 618e66078ee5..1a0cdeb3b875 100644
--- a/fs/nfsd/nfs4state.c
+++ b/fs/nfsd/nfs4state.c
@@ -1563,7 +1563,7 @@ static u32 nfsd4_get_drc_mem(struct nfsd4_channel_attrs *ca)
 	 * Never use more than a third of the remaining memory,
 	 * unless it's the only way to give this client a slot:
 	 */
-	avail = clamp_t(int, avail, slotsize, total_avail/3);
+	avail = clamp_t(unsigned long, avail, slotsize, total_avail/3);
 	num = min_t(int, num, avail / slotsize);
 	nfsd_drc_mem_used += num * slotsize;
 	spin_unlock(&nfsd_drc_lock);
Date: Wed, 3 Jul 2019 13:28:15 +0200

Since commit 10a68cdf10 (nfsd: fix performance-limiting session
calculation) (Linux 5.1-rc1 and 4.19.31), shares from NFS servers with
1 TB of memory cannot be mounted anymore. The mount just hangs on the
client.

The gist of commit 10a68cdf10 is the change below.

    -avail = clamp_t(int, avail, slotsize, avail/3);
    +avail = clamp_t(int, avail, slotsize, total_avail/3);

Here are the macros.

    #define min_t(type, x, y) __careful_cmp((type)(x), (type)(y), <)
    #define clamp_t(type, val, lo, hi) min_t(type, max_t(type, val, lo), hi)

`total_avail` is 8,434,659,328 on the 1 TB machine. `clamp_t()` casts
the values to `int`, which for 32-bit integers can only hold values
−2,147,483,648 (−2^31) through 2,147,483,647 (2^31 − 1).

`avail` (in the function signature) is just 65536, so that no overflow
was happening. Before the commit the assignment would result in 21845,
and `num = 4`.

When using `total_avail`, it is causing the assignment to be
18446744072226137429 (printed as %lu), and `num` is then 4164608182.

My next guess is that `nfsd_drc_mem_used` is then exceeded, and the
server thinks there is no memory available any more for this client.

Updating the arguments of `clamp_t()` and `min_t()` to `unsigned long`
fixes the issue.

Now, `avail = 65536` (before commit 10a68cdf10 `avail = 21845`), but
`num = 4` remains the same.

Fixes: 10a68cdf10 (nfsd: fix performance-limiting session calculation)
Cc: stable@vger.kernel.org
Signed-off-by: Paul Menzel <pmenzel@molgen.mpg.de>
---

1. No idea if `min_t()` arguments also need updating.
2. Instead of `unsigned long`, should `size_t` be used?

 fs/nfsd/nfs4state.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
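The truncation described above can be reproduced in userspace. The following is a minimal sketch, not kernel code. The macros are simplified stand-ins that keep only the casts of the real `min_t()`/`max_t()`/`clamp_t()` (the kernel versions go through `__careful_cmp()`). The `clamp_avail_*` helper names are hypothetical, `int32_t` and `uint64_t` stand in for `int` and `unsigned long` on a 64-bit kernel, and the result of the narrowing cast assumes the usual two's-complement wraparound of GCC and Clang.

```c
#include <stdint.h>

/* Simplified stand-ins for the kernel macros: only the casts are kept. */
#define min_t(type, x, y)  ((type)(x) < (type)(y) ? (type)(x) : (type)(y))
#define max_t(type, x, y)  ((type)(x) > (type)(y) ? (type)(x) : (type)(y))
#define clamp_t(type, val, lo, hi) min_t(type, max_t(type, val, lo), hi)

/* Before commit 10a68cdf10: the upper bound is avail/3, which fits in an int. */
uint64_t clamp_avail_old(uint64_t avail, uint32_t slotsize)
{
	return clamp_t(int32_t, avail, slotsize, avail / 3);
}

/*
 * After commit 10a68cdf10: the upper bound is total_avail/3. Once that
 * exceeds INT32_MAX, the cast wraps to a negative number, the "minimum"
 * is that negative number, and converting it back to an unsigned 64-bit
 * value gives a huge result.
 */
uint64_t clamp_avail_int(uint64_t avail, uint32_t slotsize, uint64_t total_avail)
{
	return clamp_t(int32_t, avail, slotsize, total_avail / 3);
}

/* With the fix: all operands are compared as unsigned 64-bit values. */
uint64_t clamp_avail_ulong(uint64_t avail, uint32_t slotsize, uint64_t total_avail)
{
	return clamp_t(uint64_t, avail, slotsize, total_avail / 3);
}
```

With `avail = 65536`, `total_avail = 8434659328` and an arbitrary illustrative `slotsize` of 4096 (the real slot size is not stated in the thread), `clamp_avail_old()` returns 21845, `clamp_avail_int()` returns 18446744072226137429, and `clamp_avail_ulong()` returns 65536, which agrees with the three `avail` values in the patch description.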