[Linux] fcntl: add new F_OFD_*32 constants and handle them appropriately
diff mbox

Message ID 1471521804-4291-1-git-send-email-jlayton@redhat.com
State New
Headers show

Commit Message

Jeff Layton Aug. 18, 2016, 12:03 p.m. UTC
When I originally added OFD lock support, we made the assumption that
userland would always pass in a struct flock64 for an OFD lock. It's
possible however for someone to build a binary without large file
support (aka LFS), which will call down into the kernel with the
standard F_OFD_* constants, but pass in a "legacy" struct flock.

My first thought was to patch glibc to cause a build break if anyone
tries to use OFD locks without LFS, but now I think it might be less
problematic to simply support OFD locks without LFS enabled.

The kernel handles this for classic POSIX locks with a separate set of
constants (postfixed with "64") to indicate that the incoming structure
is an LFS one. We can't do that here since the kernel already assumes
that the structure is an LFS one.

Instead, we can define a new set of constants that are postfixed with
"32" to indicate that the incoming structure is a non-LFS one. This
patch adds the kernel plumbing to handle that case. We'll also need a
small patch for glibc to make it to use these constants when LFS is
disabled.

In the event that someone builds a program with a new glibc, and runs it
on a kernel without support for F_OFD_*32 constants, they will just get
back EINVAL. That's preferable to the current situation which is
undefined behavior due to misinterpretation of the struct flock
argument.

Cc: stable@vger.kernel.org	# v3.15+
Reported-by: Cyril Hrubis <chrubis@suse.cz>
Signed-off-by: Jeff Layton <jlayton@redhat.com>
---
 fs/compat.c                      | 3 +++
 fs/fcntl.c                       | 4 +++-
 fs/locks.c                       | 4 +++-
 include/uapi/asm-generic/fcntl.h | 4 ++++
 4 files changed, 13 insertions(+), 2 deletions(-)

Comments

Cyril Hrubis Aug. 18, 2016, 1:24 p.m. UTC | #1
Hi!
> diff --git a/fs/fcntl.c b/fs/fcntl.c
> index 350a2c8cfd28..71704aa11170 100644
> --- a/fs/fcntl.c
> +++ b/fs/fcntl.c
> @@ -270,6 +270,7 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
>  	/* 32-bit arches must use fcntl64() */
>  	case F_OFD_GETLK:
>  #endif
> +	case F_OFD_GETLK32:
>  	case F_GETLK:
>  		err = fcntl_getlk(filp, cmd, (struct flock __user *) arg);
>  		break;
> @@ -278,7 +279,8 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
>  	case F_OFD_SETLK:
>  	case F_OFD_SETLKW:
>  #endif
> -		/* Fallthrough */
> +	case F_OFD_SETLK32:
> +	case F_OFD_SETLKW32:
>  	case F_SETLK:
>  	case F_SETLKW:
>  		err = fcntl_setlk(fd, filp, cmd, (struct flock __user *) arg);

Shouldn't we do #if BITS_PER_LONG == 32 around the newly added cases?

Since otherwise fcntl() with cmd F_OFD_SETLK32 would expect 64bit off_t
on 64 bit kernel. It will probably never be used that way, but I find it
quite confusing.

The rest looks good to me.
Jeff Layton Aug. 18, 2016, 1:54 p.m. UTC | #2
On Thu, 2016-08-18 at 15:24 +0200, Cyril Hrubis wrote:
> Hi!
> > 
> > diff --git a/fs/fcntl.c b/fs/fcntl.c
> > index 350a2c8cfd28..71704aa11170 100644
> > --- a/fs/fcntl.c
> > +++ b/fs/fcntl.c
> > @@ -270,6 +270,7 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
> > > >  	/* 32-bit arches must use fcntl64() */
> > > >  	case F_OFD_GETLK:
> >  #endif
> > > > +	case F_OFD_GETLK32:
> > > >  	case F_GETLK:
> > > >  		err = fcntl_getlk(filp, cmd, (struct flock __user *) arg);
> > > >  		break;
> > @@ -278,7 +279,8 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
> > > >  	case F_OFD_SETLK:
> > > >  	case F_OFD_SETLKW:
> >  #endif
> > > > -		/* Fallthrough */
> > > > +	case F_OFD_SETLK32:
> > > > +	case F_OFD_SETLKW32:
> > > >  	case F_SETLK:
> > > >  	case F_SETLKW:
> > > >  		err = fcntl_setlk(fd, filp, cmd, (struct flock __user *) arg);
> 
> Shouldn't we do #if BITS_PER_LONG == 32 around the newly added cases?
> 
> Since otherwise fcntl() with cmd F_OFD_SETLK32 would expect 64bit off_t
> on 64 bit kernel. It will probably never be used that way, but I find it
> quite confusing.
> 
> The rest looks good to me.
> 

No, 64 bit machines still need these for the compat syscall case.
Consider someone running a 32-bit, non-LFS binary on a 64-bit host.

Unfortunately, the way this has changed over the decades is just really
hard to follow. Eventually we ought to do a cleanup of this code to
make it simpler, but I'd really like this patch to be applicable to
stable kernels, so I think we ought to wait on that until later.
Cyril Hrubis Aug. 18, 2016, 2:06 p.m. UTC | #3
> > Shouldn't we do #if BITS_PER_LONG == 32 around the newly added cases?
> > 
> > Since otherwise fcntl() with cmd F_OFD_SETLK32 would expect 64bit off_t
> > on 64 bit kernel. It will probably never be used that way, but I find it
> > quite confusing.
> > 
> > The rest looks good to me.
> > 
> 
> No, 64 bit machines still need these for the compat syscall case.
> Consider someone running a 32-bit, non-LFS binary on a 64-bit host.

Ah, we call the sys_fcntl() with these from the compat code supposedly
so that it does all the checks we omit in the compat variant. Then it's
needed and confusing at the same time.

We do convert_fcntl_cmd() for the 64bit variants already, maybe we can
just add the 32bit variants to the switch there as well. I'm not sure if
it is worth of the code size increase though.

> Unfortunately, the way this has changed over the decades is just really
> hard to follow. Eventually we ought to do a cleanup of this code to
> make it simpler, but I'd really like this patch to be applicable to
> stable kernels, so I think we ought to wait on that until later.

I guess that this is fine for quick fix. Cleanup of the code would be
nice, it's quite a maze as it is.
Christoph Hellwig Aug. 18, 2016, 5:05 p.m. UTC | #4
NAK.  People should stop using 32-bit off_t and friends yesterday.
It's a shame that glibc hasn't cought up with last century yet and
stopped providing the non-LFS APIs for newly compiled code, but 
we certainly should not bloat the kernel for the idiotic behavior.

In addition anyone is going to use a new Linux-only feature like the
OFS locks should be using LFS support anyway.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Jeff Layton Aug. 18, 2016, 5:28 p.m. UTC | #5
On Thu, 2016-08-18 at 19:05 +0200, Christoph Hellwig wrote:
> NAK.  People should stop using 32-bit off_t and friends yesterday.
> It's a shame that glibc hasn't cought up with last century yet and
> stopped providing the non-LFS APIs for newly compiled code, but 
> we certainly should not bloat the kernel for the idiotic behavior.
> 
> In addition anyone is going to use a new Linux-only feature like the
> OFS locks should be using LFS support anyway.

That was my original thinking, but several people seemed to think that
we should just go ahead and support it. TBH, I don't much care either
way, but we either need to support it properly, or ensure that trying
to use OFD locks in a non-LFS program fails to compile.

The only real concern I have here is whether limiting this to LFS
enabled programs might make it tougher to get this into POSIX. Would
the POSIX standards folks object to having an interface like this that
doesn't support non-LFS cases? I guess if that ever happens though,
then we can just widen the support at that point.

If this is the consensus, then we can go with something like the glibc
patch I sent yesterday, which disables the command definitions when LFS
is not in effect.

Thoughts?
Christoph Hellwig Aug. 18, 2016, 5:31 p.m. UTC | #6
On Thu, Aug 18, 2016 at 01:28:20PM -0400, Jeff Layton wrote:
> That was my original thinking, but several people seemed to think that
> we should just go ahead and support it. TBH, I don't much care either
> way, but we either need to support it properly, or ensure that trying
> to use OFD locks in a non-LFS program fails to compile.

Yes, that's what glibc folks should do for now given that they still
seem to refuse being draggred into the present.

> The only real concern I have here is whether limiting this to LFS
> enabled programs might make it tougher to get this into POSIX. Would
> the POSIX standards folks object to having an interface like this that
> doesn't support non-LFS cases? I guess if that ever happens though,
> then we can just widen the support at that point.

LFS is perfectly Posix compliant (as is non-LFS).  It's really just
a glibc (aka Linux) special to still support non-LFS modes.  4.4BSD
and decendants have made the switch to 64-bit off_t in 1994 and haven't
supported a non-LFS since.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mike Frysinger Aug. 18, 2016, 5:46 p.m. UTC | #7
On 18 Aug 2016 19:31, Christoph Hellwig wrote:
> On Thu, Aug 18, 2016 at 01:28:20PM -0400, Jeff Layton wrote:
> > That was my original thinking, but several people seemed to think that
> > we should just go ahead and support it. TBH, I don't much care either
> > way, but we either need to support it properly, or ensure that trying
> > to use OFD locks in a non-LFS program fails to compile.
> 
> Yes, that's what glibc folks should do for now given that they still
> seem to refuse being draggred into the present.
> 
> > The only real concern I have here is whether limiting this to LFS
> > enabled programs might make it tougher to get this into POSIX. Would
> > the POSIX standards folks object to having an interface like this that
> > doesn't support non-LFS cases? I guess if that ever happens though,
> > then we can just widen the support at that point.
> 
> LFS is perfectly Posix compliant (as is non-LFS).  It's really just
> a glibc (aka Linux) special to still support non-LFS modes.  4.4BSD
> and decendants have made the switch to 64-bit off_t in 1994 and haven't
> supported a non-LFS since.

there's no need to be so dramatic here.  glibc didn't write the LFS logic
for fun, and hasn't maintained it out of laziness.  in fact, the code is
non-trivial to get right.  the trade offs were considered heavily, and not
breaking backwards compatibility was considered more important.  otherwise
we'd have a repeat of the libc4/libc5 (or gcc's libstdc++.so.5) breakage
where all binaries stop working.  BSD's can get away with it because their
entire modus operandi is that you get no backwards compatibility.

there has been discussion on the glibc lists for how we can move forward
*safely*, but it's not something to be taken lightly -- LFS defines will
implicitly change ABI structs all over the place.  in the mean time, let's
just drop the pointless inflammatory editorializing since it contributes
nothing.
-mike
Christoph Hellwig Aug. 18, 2016, 5:52 p.m. UTC | #8
On Thu, Aug 18, 2016 at 10:46:07AM -0700, Mike Frysinger wrote:
> there's no need to be so dramatic here.  glibc didn't write the LFS logic
> for fun, and hasn't maintained it out of laziness.  in fact, the code is
> non-trivial to get right.

It hasn't maintained it out of lazyness, but out of stupidity - it's been
20 years overdue to get rid of supporting non-LFS for _new code_.  Keeping
the old symbols around is perfectly fine.  And at least a few years
ago I could run FreeBSD 1.x (pre-4.4BSD) code on recent FreeBSD systems
with the right compat defines in the kernel build and the compat libraries
just fine, so it's not like it's an unsolved problem.

At the same time glibc lazuness has caused us Linux developers tons of
problems due to applications or even system programs using the wrong
APIs as they still are the default, including random errors due to "too large"
inode numbers or offset.

So yes, I'm pissed that this crap isn't sorted out and have all the reaons
to be "dramatic".
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Mike Frysinger Aug. 18, 2016, 6:16 p.m. UTC | #9
On 18 Aug 2016 19:52, Christoph Hellwig wrote:
> On Thu, Aug 18, 2016 at 10:46:07AM -0700, Mike Frysinger wrote:
> > there's no need to be so dramatic here.  glibc didn't write the LFS logic
> > for fun, and hasn't maintained it out of laziness.  in fact, the code is
> > non-trivial to get right.
> 
> It hasn't maintained it out of lazyness, but out of stupidity - it's been
> 20 years overdue to get rid of supporting non-LFS for _new code_.

as i said, we've been discussing it of late, and it's a non-trivial problem.
"just make it the default" ignores the fact that LFS shows up in many places
and changes ABIs of downstream libs implicitly when they use impacted structs.

> Keeping
> the old symbols around is perfectly fine.  And at least a few years
> ago I could run FreeBSD 1.x (pre-4.4BSD) code on recent FreeBSD systems
> with the right compat defines in the kernel build and the compat libraries
> just fine, so it's not like it's an unsolved problem.

"and the compat libs" is a pretty key point.  of course if your lowest ABI
boundary is the kernel, things are much easier.  you can do the same thing
with libc5 today because the boundary is the Linux syscall ABI.  the point
is to *not* have to do that but keep using the same SONAMEs which does work
under Linux/glibc today, and generally is what the BSDs do not care about.

> At the same time glibc lazuness has caused us Linux developers tons of
> problems due to applications or even system programs using the wrong
> APIs as they still are the default, including random errors due to "too large"
> inode numbers or offset.

the APIs need to stick around regardless of what glibc does moving forward.  
existing binaries aren't going anywhere.  so if the compat syscals are broken,
then they're broken and need fixing.

> So yes, I'm pissed that this crap isn't sorted out and have all the reaons
> to be "dramatic".

which isn't terribly useful and is more likely to have people just ignore you
-mike
Zack Weinberg Aug. 18, 2016, 7:01 p.m. UTC | #10
On Thu, Aug 18, 2016 at 1:52 PM, Christoph Hellwig <hch@lst.de> wrote:
> On Thu, Aug 18, 2016 at 10:46:07AM -0700, Mike Frysinger wrote:
>> there's no need to be so dramatic here.  glibc didn't write the LFS logic
>> for fun, and hasn't maintained it out of laziness.  in fact, the code is
>> non-trivial to get right.
>
> It hasn't maintained it out of lazyness, but out of stupidity - it's been
> 20 years overdue to get rid of supporting non-LFS for _new code_.

Please say concretely what you mean by "get rid of supporting non-LFS
for new code".  I can think of only two possibilities, both with
severe negative side effects.

We could change the libc headers used on old-ILP32 ABIs so that
_FILE_OFFSET_BITS=64 is defined by default (matching the LP64-ABI
headers).  This would break the ABI of every shared library that
exports a structure (transitively) containing a field of type off_t,
ino_t, fsblkcnt_t, fsfilcnt_t, or rlim_t.  It would also break
non-LFS-safe programs upon recompilation, although they could still be
recompiled with -D_FILE_OFFSET_BITS=32.  This latter breakage could
well be silent, going unnoticed until someone actually tries to use
the program with a very large file -- and such bugs can and have been
security-critical.  So by me this is already a bad plan.

Or we could go even further and remove all modes but
_FILE_OFFSET_BITS=64 from the libc headers, breaking non-LFS-safe
programs upon recompilation with no quick fix; that seems like an even
worse plan.

But perhaps you have something else in mind?

zw
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Joseph Myers Aug. 18, 2016, 8:52 p.m. UTC | #11
On Thu, 18 Aug 2016, Zack Weinberg wrote:

> We could change the libc headers used on old-ILP32 ABIs so that
> _FILE_OFFSET_BITS=64 is defined by default (matching the LP64-ABI
> headers).  This would break the ABI of every shared library that

This.  We have a patch 
<https://sourceware.org/ml/libc-alpha/2014-03/msg00290.html>, and we have 
an analysis <https://sourceware.org/ml/libc-alpha/2014-03/msg00351.html> 
providing some evidence, if not for a very large set of libraries, that 
_FILE_OFFSET_BITS=64 is the normal way GNU/Linux distributions build 
affected libraries.  We have, since then, _FILE_OFFSET_BITS=64 support for 
fts.  (It may be the case that such a patch would no longer build, or 
would cause linknamespace tests to fail because of bug 14106 - which would 
need addressing with new implementation-namespace symbols before such a 
patch could go in.)  But this requires someone to take the lead on getting 
all the prerequisites into glibc and establishing consensus for the change 
(possibly with a more detailed analysis of the use of _FILE_OFFSET_BITS=64 
to build affected libraries in GNU/Linux distributions).
Florian Weimer Aug. 25, 2016, 11:53 a.m. UTC | #12
* Christoph Hellwig:

> On Thu, Aug 18, 2016 at 01:28:20PM -0400, Jeff Layton wrote:
>> That was my original thinking, but several people seemed to think that
>> we should just go ahead and support it. TBH, I don't much care either
>> way, but we either need to support it properly, or ensure that trying
>> to use OFD locks in a non-LFS program fails to compile.
>
> Yes, that's what glibc folks should do for now given that they still
> seem to refuse being draggred into the present.

Your assumptions are wrong, at least for some (many?) of us.  32-bit
architectures are legacy; giving them a 64-bit off_t (or even time_t)
does not really change that.  A hard ABI transition is simply not
worth the effort.
--
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Cyril Hrubis Aug. 25, 2016, 12:05 p.m. UTC | #13
Hi!
> > Yes, that's what glibc folks should do for now given that they still
> > seem to refuse being draggred into the present.
> 
> Your assumptions are wrong, at least for some (many?) of us.  32-bit
> architectures are legacy; giving them a 64-bit off_t (or even time_t)
> does not really change that.  A hard ABI transition is simply not
> worth the effort.

Unfortunately there are still 32bit processors manufactured and sold
even these days, mainly for IOT though.

For instance Intel has released Quark (32bit i586 400Mhz CPU used in
Edison board) in 2014. First two batches of Raspberry Pi were 32bit
ARMv6/ARMv7, these are still sold today, etc.

So I would say that 32bit will stick with us for another ten years at
least.

Patch
diff mbox

diff --git a/fs/compat.c b/fs/compat.c
index be6e48b0a46c..a7e9640e9107 100644
--- a/fs/compat.c
+++ b/fs/compat.c
@@ -426,6 +426,9 @@  COMPAT_SYSCALL_DEFINE3(fcntl64, unsigned int, fd, unsigned int, cmd,
 	case F_GETLK:
 	case F_SETLK:
 	case F_SETLKW:
+	case F_OFD_GETLK32:
+	case F_OFD_SETLK32:
+	case F_OFD_SETLKW32:
 		ret = get_compat_flock(&f, compat_ptr(arg));
 		if (ret != 0)
 			break;
diff --git a/fs/fcntl.c b/fs/fcntl.c
index 350a2c8cfd28..71704aa11170 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -270,6 +270,7 @@  static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
 	/* 32-bit arches must use fcntl64() */
 	case F_OFD_GETLK:
 #endif
+	case F_OFD_GETLK32:
 	case F_GETLK:
 		err = fcntl_getlk(filp, cmd, (struct flock __user *) arg);
 		break;
@@ -278,7 +279,8 @@  static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
 	case F_OFD_SETLK:
 	case F_OFD_SETLKW:
 #endif
-		/* Fallthrough */
+	case F_OFD_SETLK32:
+	case F_OFD_SETLKW32:
 	case F_SETLK:
 	case F_SETLKW:
 		err = fcntl_setlk(fd, filp, cmd, (struct flock __user *) arg);
diff --git a/fs/locks.c b/fs/locks.c
index 7e428b78be07..4ffde68322de 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -2065,7 +2065,7 @@  int fcntl_getlk(struct file *filp, unsigned int cmd, struct flock __user *l)
 	if (error)
 		goto out;
 
-	if (cmd == F_OFD_GETLK) {
+	if (cmd == F_OFD_GETLK || cmd == F_OFD_GETLK32) {
 		error = -EINVAL;
 		if (flock.l_pid != 0)
 			goto out;
@@ -2222,6 +2222,7 @@  int fcntl_setlk(unsigned int fd, struct file *filp, unsigned int cmd,
 	 */
 	switch (cmd) {
 	case F_OFD_SETLK:
+	case F_OFD_SETLK32:
 		error = -EINVAL;
 		if (flock.l_pid != 0)
 			goto out;
@@ -2231,6 +2232,7 @@  int fcntl_setlk(unsigned int fd, struct file *filp, unsigned int cmd,
 		file_lock->fl_owner = filp;
 		break;
 	case F_OFD_SETLKW:
+	case F_OFD_SETLKW32:
 		error = -EINVAL;
 		if (flock.l_pid != 0)
 			goto out;
diff --git a/include/uapi/asm-generic/fcntl.h b/include/uapi/asm-generic/fcntl.h
index e063effe0cc1..b407deee68e1 100644
--- a/include/uapi/asm-generic/fcntl.h
+++ b/include/uapi/asm-generic/fcntl.h
@@ -148,6 +148,10 @@ 
 #define F_OFD_SETLK	37
 #define F_OFD_SETLKW	38
 
+#define F_OFD_GETLK32	39
+#define F_OFD_SETLK32	40
+#define F_OFD_SETLKW32	41
+
 #define F_OWNER_TID	0
 #define F_OWNER_PID	1
 #define F_OWNER_PGRP	2