diff mbox

vfs: fix statfs64() returning impossible EOVERFLOW for 64-bit f_files

Message ID 20171005183636.20732-1-sergey.m.klyaus@gmail.com (mailing list archive)
State New, archived
Headers show

Commit Message

Sergey Klyaus Oct. 5, 2017, 6:36 p.m. UTC
compat_statfs64 structure has some 32-bit and some 64-bit fields, so
64d2ab32e "vfs: fix put_compat_statfs64() does not handle errors" fixed
32-bit overflow checks not being performed, but accidentally enabled
checks for f_files and f_ffree that are 64-bit and cannot have overflow.
Now checks for both groups of fields are enabled by different
conditions.

This broke my Steam runtime and can be reproduced with this test case:

    # mount -t tmpfs -o nr_inodes=4294967297 tmpfs /mnt
    $ cat statfs.c

int main() {
        struct statvfs sv;
        statvfs("/mnt", &sv);
        printf("%d %llu %llu\n", errno,
                (unsigned long long) sv.f_files,
                (unsigned long long) sv.f_ffree);
        return 0;
}
    $ gcc -g -m32 -D_FILE_OFFSET_BITS=64 statfs.c
    $ ./a.out
    75 134513445 0
    |  \- some junk on stack
    EOVERFLOW

Signed-off-by: Sergey Klyaus <sergey.m.klyaus@gmail.com>
---
 fs/statfs.c | 2 ++
 1 file changed, 2 insertions(+)

Comments

Al Viro Oct. 5, 2017, 8:57 p.m. UTC | #1
On Thu, Oct 05, 2017 at 09:36:36PM +0300, Sergey Klyaus wrote:
> compat_statfs64 structure has some 32-bit and some 64-bit fields, so
> 64d2ab32e "vfs: fix put_compat_statfs64() does not handle errors" fixed
> 32-bit overflow checks not being performed, but accidentally enabled
> checks for f_files and f_ffree that are 64-bit and cannot have overflow.
> Now checks for both groups of fields are enabled by different
> conditions.

TBH, the logics in there looks very dubious.  First of all, could somebody
show an architecture where compat_statfs64 would *not* have 32bit f_bsize?

AFAICS, there are only 3 variants of struct compat_statfs64 declaration in
the entire tree:
arch/mips/include/uapi/asm/statfs.h:82:struct compat_statfs64 {
arch/s390/include/asm/compat.h:167:struct compat_statfs64 {
include/uapi/asm-generic/statfs.h:68:struct compat_statfs64 {

mips one has
        __u32   f_bsize;
s390 -
        u32             f_bsize;
and generic -
        __u32 f_bsize;

So what is that if (sizeof... == 4) about?  Before or after the commit in
question - f_blocks is consistently 64bit, f_bsize - 32bit.  IOW, that
commit has turned an obfuscated if (0) into equally obfuscated if (1).

In any case, that thing is supposed to behave like statfs64(2) on matching
32bit host, so what the hell is that EOVERFLOW about, anyway?  ->f_type value
not fitting into 32 bits?  That'd be an fs bug; I could see WARN_ON() on that,
but -EOVERFLOW is bloody odd.  And ->f_namelen exceeding 4Gb is even funnier...

Seriously, the old logics had been complete BS and the only saving grace had
been the fact that it never triggered.  What the hell is f_files and f_ffree
logics about?  Those are 64bit in *all* struct statfs64 variants.  Always
had been.

AFAICS, the real bug here is in hugetlbfs; that's where obscene values in
->f_bsize come from.  IMO all that code in put_compat_statfs64() should be
replaced with
	if (kbuf->bsize != (u32)kbuf->bsize)
		return -EOVERFLOW;
That, or hugetlbfs could be taught to fake saner ->f_bsize (recalculating
->f_bavail/->f_bfree/->f_blocks to go with that).

Comments?
Linus Torvalds Oct. 5, 2017, 10:31 p.m. UTC | #2
On Thu, Oct 5, 2017 at 1:57 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> AFAICS, the real bug here is in hugetlbfs; that's where obscene values in
> ->f_bsize come from.  IMO all that code in put_compat_statfs64() should be
> replaced with
>         if (kbuf->bsize != (u32)kbuf->bsize)
>                 return -EOVERFLOW;
> That, or hugetlbfs could be taught to fake saner ->f_bsize (recalculating
> ->f_bavail/->f_bfree/->f_blocks to go with that).

Make it so. Except you shouldn't do

     if (kbuf->bsize != (u32)kbuf->bsize)

you should do something like

    #define FITS_IN(x,y)  ({ typeof x __x = (x); typeof y __y = __x;
__x == __y; })

and then do

    if (!FITS_IN(kbuf->bsize, ubuf->bsize)) ...

because there is nothing that specifies that the ubuf size of all
fields has to be 32 bits.

But yes,m either you need to then special-case that crazy all-ones
value, or just fix hugetlbfs to not use crazy crap.

                  Linus
Al Viro Oct. 5, 2017, 11:06 p.m. UTC | #3
On Thu, Oct 05, 2017 at 03:31:05PM -0700, Linus Torvalds wrote:
> On Thu, Oct 5, 2017 at 1:57 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > AFAICS, the real bug here is in hugetlbfs; that's where obscene values in
> > ->f_bsize come from.  IMO all that code in put_compat_statfs64() should be
> > replaced with
> >         if (kbuf->bsize != (u32)kbuf->bsize)
> >                 return -EOVERFLOW;
> > That, or hugetlbfs could be taught to fake saner ->f_bsize (recalculating
> > ->f_bavail/->f_bfree/->f_blocks to go with that).
> 
> Make it so. Except you shouldn't do
> 
>      if (kbuf->bsize != (u32)kbuf->bsize)
> 
> you should do something like
> 
>     #define FITS_IN(x,y)  ({ typeof x __x = (x); typeof y __y = __x;
> __x == __y; })
> 
> and then do
> 
>     if (!FITS_IN(kbuf->bsize, ubuf->bsize)) ...
> 
> because there is nothing that specifies that the ubuf size of all
> fields has to be 32 bits.
> 
> But yes,m either you need to then special-case that crazy all-ones
> value, or just fix hugetlbfs to not use crazy crap.

All-ones is not a problem at all - those two fields are consistently
64bit in struct statfs64 on all 32bit architectures.  That had pretty
much been the rationale for statfs64(2) in the first place - statfs(2)
couldn't be used on large filesystems; 4Gfiles and you get an overflow
on 32bit.  So the entire "let's check if f_files/f_ffree/f_bavail/f_bfree/
f_blocks fit into 32 bits" had been an utter nonsense from the very
beginning and the only reason it hadn't been spotted earlier was that
this logics was under if (sizeof(u64) == 4) until the last November.

Just to make sure we are on the same page: out of kstatfs fields
we have 5 u64 ones (see above; all of them are u64 is struct statfs64
on all architectures), an opaque 64bit f_fsid and 5 fields that
are long: f_type (magic numbers, all 32bit), f_namelen (max filename
length), f_frsize (0 on most of filesystems, always fits into 32 bits),
f_flags (guaranteed to be 32bit) and f_bsize.

f_bsize is a mess - normal practice for Unices is to have f_blocks in
units of f_frsize, leaving f_bsize as preferred IO granularity.  Linux
didn't have f_frsize until 2003 or so, and f_bsize got used for units
of f_blocks.

hugetlbfs uses it to report the huge page size; the real problem
last year commit tried to deal with was that on boxen with huge pages
4Gb or bigger we get 0 observed in that field by 32bit processes
calling statfs64(2).  I'm not sure whether we treat that use of
f_bsize by hugetlbfs as an accidental ABI (in that case we need to
check that it fits into u32 and fail with EOVERFLOW otherwise;
again, all compat_statfs64 have f_bsize 32bit) or just cap it with
something sane (2Gb?) and adjast f_blocks/f_bavail/f_bfree accordingly.

Fields that are u64 in kstatfs don't need any checks - they are
64bit in compat_statfs64 as well.  Other four 32bit fields... sure,
we could check them, but for those the reasonable reaction is not
EOVERFLOW - it's WARN_ON().
Linus Torvalds Oct. 6, 2017, 1:31 a.m. UTC | #4
On Thu, Oct 5, 2017 at 4:06 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Just to make sure we are on the same page: out of kstatfs fields
> we have 5 u64 ones (see above; all of them are u64 is struct statfs64
> on all architectures), an opaque 64bit f_fsid and 5 fields that
> are long: f_type (magic numbers, all 32bit), f_namelen (max filename
> length), f_frsize (0 on most of filesystems, always fits into 32 bits),
> f_flags (guaranteed to be 32bit) and f_bsize.

Please just use that FITS_IN() kind of macro regardless.

If the sizes match, the compiler will optimize the test away.

If the sizes don't match, that FITS_IN() will do the right thing.

Do *not* manually go and say "these fields are ok, because..". The
whole bug was because people were confused about the field widths.

                 Linus
Al Viro Aug. 6, 2018, 5:06 p.m. UTC | #5
On Thu, Oct 05, 2017 at 06:31:29PM -0700, Linus Torvalds wrote:
> On Thu, Oct 5, 2017 at 4:06 PM, Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > Just to make sure we are on the same page: out of kstatfs fields
> > we have 5 u64 ones (see above; all of them are u64 is struct statfs64
> > on all architectures), an opaque 64bit f_fsid and 5 fields that
> > are long: f_type (magic numbers, all 32bit), f_namelen (max filename
> > length), f_frsize (0 on most of filesystems, always fits into 32 bits),
> > f_flags (guaranteed to be 32bit) and f_bsize.
> 
> Please just use that FITS_IN() kind of macro regardless.
> 
> If the sizes match, the compiler will optimize the test away.
> 
> If the sizes don't match, that FITS_IN() will do the right thing.
> 
> Do *not* manually go and say "these fields are ok, because..". The
> whole bug was because people were confused about the field widths.

To bring that thread back, with apologies for having it fall through
the cracks back last autumn:

compat_statfs64 "overflow checks" are completely bogus.  Some relevant
information:
	* all struct compat_statfs64 instances (all 3 of them) have
identical field sizes.  So any ifdefs on sizeof of anything in them
are nonsense to start with.
	* most of the fields have the same size as their struct kstatfs
counterparts, with the following exceptions - f_type, f_bsize, f_frsize,
f_namelen and f_flags are u32 in compat_statfs64 and long in kstatfs.
Anything other than those fields should not be getting any overflow
checks for obvious reasons - in particular this
                /* f_files and f_ffree may be -1; it's okay
                 * to stuff that into 32 bits */
                if (kbuf->f_files != 0xffffffffffffffffULL
                 && (kbuf->f_files & 0xffffffff00000000ULL))
                        return -EOVERFLOW;
                if (kbuf->f_ffree != 0xffffffffffffffffULL
                 && (kbuf->f_ffree & 0xffffffff00000000ULL))
                        return -EOVERFLOW;
is garbage, blindly copied from handling of compat_statfs where f_files is
32bit.
	* anyone who seriously suggests that some fs might want to report
f_namelen greater than 4Gb is welcome to bludgeon themselves with GNU Hurd
manuals, repeating the mantras about avoiding arbitrary limits until they
get enlightened, go join Hurd development or succeed in bashing out
whatever they had between their ears.  Either way, they won't be pestering
us again.
	* f_type is an opaque magic value; if it doesn't fit into 32 bits,
such filesystem can't be used on 32bit host.  Bug.
	* f_flags is generated right there in fs/statfs.c and the value is
currently limited to 13 bits.

That leaves us with f_bsize and f_frsize (the latter defaulting to the former).
Hugetlbfs can put greater than 4Gb values in there, for really huge pages.
And that's the only thing worth checking in there.

So the whole put_compat_statfs64() thing should be
static int put_compat_statfs64(struct compat_statfs64 __user *ubuf, struct kstatfs *kbuf)
{
	struct compat_statfs64 buf;

	if ((kbuf->f_bsize | kbuf->f_frsize) & 0xffffffff00000000ULL)
		return -EOVERFLOW;

	memset(&buf, 0, sizeof(struct compat_statfs64));
	buf.f_type = kbuf->f_type;
	buf.f_bsize = kbuf->f_bsize;
	buf.f_blocks = kbuf->f_blocks;
	buf.f_bfree = kbuf->f_bfree;
	buf.f_bavail = kbuf->f_bavail;
	buf.f_files = kbuf->f_files;
	buf.f_ffree = kbuf->f_ffree;
	buf.f_namelen = kbuf->f_namelen;
	buf.f_fsid.val[0] = kbuf->f_fsid.val[0];
	buf.f_fsid.val[1] = kbuf->f_fsid.val[1];
	buf.f_frsize = kbuf->f_frsize;
	buf.f_flags = kbuf->f_flags;
	if (copy_to_user(ubuf, &buf, sizeof(struct compat_statfs64)))
		return -EFAULT;
	return 0;
}
and that's it for compat_statfs64 bug reported by Sergey.

I'm somewhat tempted to get rid of those 'long' in struct kstatfs,
turning f_bsize and f_frsize into u64, with f_type, f_namelen and f_flags
becoming u32 and f_spare going to hell (not a single filesystem stores
anything in f_spare and e.g. mips zeroes those on the way to userland in
*native* statfs(2) anyway).  We'd lose the "fast" case of do_statfs64()
and do_statfs_native(), but frankly, it's not a great loss.  Doing e.g.

struct kstatfs {
        u32 f_type;
	u32 f_namelen;	// moved up here
        u64 f_bsize;
        u64 f_blocks;
        u64 f_bfree;
        u64 f_bavail;
        u64 f_files;
        u64 f_ffree;
        __kernel_fsid_t f_fsid;
        u64 f_frsize;
        u32 f_flags;
};

with

#define STATFS_COPYOUT(type)						\
static int put##type(struct kstatfs *st, struct type __user *p)		\
{									\
	struct type buf;						\
									\
	memset(&buf, 0, sizeof(buf));					\
	buf.f_type = st->f_type;					\
	if (!FIT_IN(st->f_bsize, buf.f_bsize))				\
		return -EOVERFLOW;					\
	buf.f_bsize = st->f_bsize;					\
	if (!FIT_IN(st->f_blocks, buf.f_blocks))			\
		return -EOVERFLOW;					\
	buf.f_blocks = st->f_blocks;					\
	if (!FIT_IN(st->f_bfree, buf.f_bfree))				\
		return -EOVERFLOW;					\
	buf.f_bfree = st->f_bfree;					\
	if (!FIT_IN(st->f_bavail, buf.f_bavail))			\
		return -EOVERFLOW;					\
	buf.f_bavail = st->f_bavail;					\
	if (!FIT_IN(st->f_files, buf.f_files) && st->f_files != -1)	\
		return -EOVERFLOW;					\
	buf.f_files = st->f_files;					\
	if (!FIT_IN(st->f_ffree, buf.f_ffree) && st->f_ffree != -1)	\
		return -EOVERFLOW;					\
	buf.f_ffree = st->f_ffree;					\
	buf.f_fsid = st->f_fsid;					\
	buf.f_namelen = st->f_namelen;					\
	if (!FIT_IN(st->f_frsize, buf.f_frsize))			\
		return -EOVERFLOW;					\
	buf.f_frsize = st->f_frsize;					\
	buf.f_flags = st->f_flags;					\
	if (copy_to_user(p, &buf, sizeof(buf)))				\
		return -EFAULT;						\
	return 0;							\
}

STATFS_COPYOUT(statfs)
STATFS_COPYOUT(compat_statfs)
STATFS_COPYOUT(statfs64)
STATFS_COPYOUT(compat_statfs64)

providing all the copyout helpers should be reasonably sane, IMO.

As for the convoluted macros Sergey has proposed...  Not without a very
good evidence that they win enough performance to be worth the complexity.

Comments?  Again, my apologies for losing that thread back then - I've just
found it while doing (unrelated) grep through the old mailboxen...
Linus Torvalds Aug. 6, 2018, 6:45 p.m. UTC | #6
On Mon, Aug 6, 2018 at 10:06 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> That leaves us with f_bsize and f_frsize (the latter defaulting to the former).
> Hugetlbfs can put greater than 4Gb values in there, for really huge pages.
> And that's the only thing worth checking in there.
>
> So the whole put_compat_statfs64() thing should be

Ack, I'm ok with this simplification.

> I'm somewhat tempted to get rid of those 'long' in struct kstatfs,

I'm ok with this one too.

> with
>
> #define STATFS_COPYOUT(type)                                            \
> static int put##type(struct kstatfs *st, struct type __user *p)         \

No. Don't do this.

I'm ok with the #define to avoid duplication, but don't bother with
the FIT_IN() after you've above successfully argued that it's
pointless for anything but f_bsize/frsize.

So if you do the macro to generate the different copyout versions,
just use your simplified code for that macro instead. With FIT_IN()
just for f_bsize/frsize.

                    Linus
Al Viro Aug. 6, 2018, 9:03 p.m. UTC | #7
On Mon, Aug 06, 2018 at 11:45:29AM -0700, Linus Torvalds wrote:
> On Mon, Aug 6, 2018 at 10:06 AM Al Viro <viro@zeniv.linux.org.uk> wrote:
> >
> > That leaves us with f_bsize and f_frsize (the latter defaulting to the former).
> > Hugetlbfs can put greater than 4Gb values in there, for really huge pages.
> > And that's the only thing worth checking in there.
> >
> > So the whole put_compat_statfs64() thing should be
> 
> Ack, I'm ok with this simplification.
> 
> > I'm somewhat tempted to get rid of those 'long' in struct kstatfs,
> 
> I'm ok with this one too.
> 
> > with
> >
> > #define STATFS_COPYOUT(type)                                            \
> > static int put##type(struct kstatfs *st, struct type __user *p)         \
> 
> No. Don't do this.
> 
> I'm ok with the #define to avoid duplication, but don't bother with
> the FIT_IN() after you've above successfully argued that it's
> pointless for anything but f_bsize/frsize.
> 
> So if you do the macro to generate the different copyout versions,
> just use your simplified code for that macro instead. With FIT_IN()
> just for f_bsize/frsize.

For statfs64 (both native and compat) that would do nicely; for statfs,
however...  The following describes the field sizes in all that mess:

        kstatfs	statfs          statfs64        compat_statfs compat_statfs64
       		!s390   s390   !s390   s390
type:   W	W       32      W       32              32      32
namelen:W	W       32      W       32              32      32
flags:  W	W       32      W       32              32      32
bsize:  W	W       32      W       32              32      32
frsize: W	W       32      W       32              32      32
blocks: 64	W       64      64      64              32      64
bfree:  64	W       64      64      64              32      64
bavail: 64	W       64      64      64              32      64
files:  64	W       64      64      64              32      64
ffree:  64	W       64      64      64              32      64
fsid:   __kernel_fsid_t on all

First of all, nobody gives a fuck about type, namelen and flags
overflows - can't happen.

blocks/bfree/bavail/files/ffree: can overflow in plain statfs on 32bit
and in compat statfs.  For files/ffree we also have "-1 means (s32)-1,
not an overflow" there.

bsize/frsize: can oveflow in anything on s390 or mips64 and any compat on anything

Oh, and then there's signedness - in compat_statfs64 everything's unsigned,
but for compat_statfs a bunch of those 32bit ones are signed.  Native on
32bit tend to be unsigned; native on 64bit and compat are often signed.
IMO that's a bug - compat ones should all be same as native 32bit.
As it is,
	arm, parisc, ppc, sparc, x86: on 32bit - unsigned, compat on 64bit - signed
	mips: both signed
	s390: both unsigned
I think we want to switch compat_statfs fields on the first group to u32.  These
declarations are not exposed to userland anyway.  mips is interesting - I've no
idea how does mips32 userland react to e.g. statfs() on 3G block filesystem...
diff mbox

Patch

diff --git a/fs/statfs.c b/fs/statfs.c
index fab9b6a3c116..073bb2d1871e 100644
--- a/fs/statfs.c
+++ b/fs/statfs.c
@@ -307,6 +307,8 @@  static int put_compat_statfs64(struct compat_statfs64 __user *ubuf, struct kstat
 		if ((kbuf->f_type | kbuf->f_bsize | kbuf->f_namelen |
 		     kbuf->f_frsize | kbuf->f_flags) & 0xffffffff00000000ULL)
 			return -EOVERFLOW;
+	}
+	if (sizeof(ubuf->f_blocks) == 4) {
 		/* f_files and f_ffree may be -1; it's okay
 		 * to stuff that into 32 bits */
 		if (kbuf->f_files != 0xffffffffffffffffULL