[2/3] fd/locks: allow get the lock owner by F_OFD_GETLK

Message ID	20230620095507.2677463-3-stsp2@yandex.ru (mailing list archive)
State	New, archived
Headers	show Return-Path: <linux-fsdevel-owner@vger.kernel.org> From: Stas Sergeev <stsp2@yandex.ru> To: linux-kernel@vger.kernel.org Cc: Stas Sergeev <stsp2@yandex.ru>, Jeff Layton <jlayton@kernel.org>, Chuck Lever <chuck.lever@oracle.com>, Alexander Viro <viro@zeniv.linux.org.uk>, Christian Brauner <brauner@kernel.org>, linux-fsdevel@vger.kernel.org Subject: [PATCH 2/3] fd/locks: allow get the lock owner by F_OFD_GETLK Date: Tue, 20 Jun 2023 14:55:06 +0500 Message-Id: <20230620095507.2677463-3-stsp2@yandex.ru> In-Reply-To: <20230620095507.2677463-1-stsp2@yandex.ru> References: <20230620095507.2677463-1-stsp2@yandex.ru> MIME-Version: 1.0 Content-Transfer-Encoding: 8bit Precedence: bulk
Series	RFC: F_OFD_GETLK should provide more info \| expand [0/3] RFC: F_OFD_GETLK should provide more info [1/3] fs/locks: F_UNLCK extension for F_OFD_GETLK [2/3] fd/locks: allow get the lock owner by F_OFD_GETLK [3/3] selftests: add OFD lock tests

stsp June 20, 2023, 9:55 a.m. UTC

Currently F_OFD_GETLK sets the pid of the lock owner to -1.
Remove such behavior to allow getting the proper owner's pid.
This may be helpful when you want to send some message (like SIGKILL)
to the offending locker.

Signed-off-by: Stas Sergeev <stsp2@yandex.ru>

CC: Jeff Layton <jlayton@kernel.org>
CC: Chuck Lever <chuck.lever@oracle.com>
CC: Alexander Viro <viro@zeniv.linux.org.uk>
CC: Christian Brauner <brauner@kernel.org>
CC: linux-fsdevel@vger.kernel.org
CC: linux-kernel@vger.kernel.org

---
 fs/locks.c | 2 --
 1 file changed, 2 deletions(-)

Jeff Layton June 20, 2023, 10:51 a.m. UTC | #1

On Tue, 2023-06-20 at 14:55 +0500, Stas Sergeev wrote:
> Currently F_OFD_GETLK sets the pid of the lock owner to -1.
> Remove such behavior to allow getting the proper owner's pid.
> This may be helpful when you want to send some message (like SIGKILL)
> to the offending locker.
> 
> Signed-off-by: Stas Sergeev <stsp2@yandex.ru>
> 
> CC: Jeff Layton <jlayton@kernel.org>
> CC: Chuck Lever <chuck.lever@oracle.com>
> CC: Alexander Viro <viro@zeniv.linux.org.uk>
> CC: Christian Brauner <brauner@kernel.org>
> CC: linux-fsdevel@vger.kernel.org
> CC: linux-kernel@vger.kernel.org
> 
> ---
>  fs/locks.c | 2 --
>  1 file changed, 2 deletions(-)
> 
> diff --git a/fs/locks.c b/fs/locks.c
> index 210766007e63..ee265e166542 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -2158,8 +2158,6 @@ static pid_t locks_translate_pid(struct file_lock *fl, struct pid_namespace *ns)
>  	pid_t vnr;
>  	struct pid *pid;
>  
> -	if (IS_OFDLCK(fl))
> -		return -1;
>  	if (IS_REMOTELCK(fl))
>  		return fl->fl_pid;
>  	/*

NACK on this one.

OFD locks are not owned by processes. They are owned by the file
description (hence the name). Because of this, returning a pid here is
wrong.

This precedent comes from BSD, where flock() and POSIX locks can
conflict. BSD returns -1 for the pid if you call F_GETLK on a file
locked with flock(). Since OFD locks have similar ownership semantics to
flock() locks, we use the same convention here.

Cheers,

stsp June 20, 2023, 10:57 a.m. UTC | #2

Hello,

20.06.2023 15:51, Jeff Layton пишет:
> On Tue, 2023-06-20 at 14:55 +0500, Stas Sergeev wrote:
>> Currently F_OFD_GETLK sets the pid of the lock owner to -1.
>> Remove such behavior to allow getting the proper owner's pid.
>> This may be helpful when you want to send some message (like SIGKILL)
>> to the offending locker.
>>
>> Signed-off-by: Stas Sergeev <stsp2@yandex.ru>
>>
>> CC: Jeff Layton <jlayton@kernel.org>
>> CC: Chuck Lever <chuck.lever@oracle.com>
>> CC: Alexander Viro <viro@zeniv.linux.org.uk>
>> CC: Christian Brauner <brauner@kernel.org>
>> CC: linux-fsdevel@vger.kernel.org
>> CC: linux-kernel@vger.kernel.org
>>
>> ---
>>   fs/locks.c | 2 --
>>   1 file changed, 2 deletions(-)
>>
>> diff --git a/fs/locks.c b/fs/locks.c
>> index 210766007e63..ee265e166542 100644
>> --- a/fs/locks.c
>> +++ b/fs/locks.c
>> @@ -2158,8 +2158,6 @@ static pid_t locks_translate_pid(struct file_lock *fl, struct pid_namespace *ns)
>>   	pid_t vnr;
>>   	struct pid *pid;
>>   
>> -	if (IS_OFDLCK(fl))
>> -		return -1;
>>   	if (IS_REMOTELCK(fl))
>>   		return fl->fl_pid;
>>   	/*
> NACK on this one.
>
> OFD locks are not owned by processes. They are owned by the file
> description (hence the name). Because of this, returning a pid here is
> wrong.

But fd is owned by a process.
PID has a meaning, you can send SIGKILL
to the returned PID, and the lock is clear.
Was there any reason to hide the PID at
a first place?


> This precedent comes from BSD, where flock() and POSIX locks can
> conflict. BSD returns -1 for the pid if you call F_GETLK on a file
> locked with flock(). Since OFD locks have similar ownership semantics to
> flock() locks, we use the same convention here.
OK if you insist I can drop this one and
search the PID by some other means.
Just a bit unsure what makes it so important
to overwrite the potentially useful info
with -1.

So in case you insist on that, then should
I send a v2 or can you just drop the patch
yourself?

Jeff Layton June 20, 2023, 11:12 a.m. UTC | #3

On Tue, 2023-06-20 at 15:57 +0500, stsp wrote:
> Hello,
> 
> 20.06.2023 15:51, Jeff Layton пишет:
> > On Tue, 2023-06-20 at 14:55 +0500, Stas Sergeev wrote:
> > > Currently F_OFD_GETLK sets the pid of the lock owner to -1.
> > > Remove such behavior to allow getting the proper owner's pid.
> > > This may be helpful when you want to send some message (like SIGKILL)
> > > to the offending locker.
> > > 
> > > Signed-off-by: Stas Sergeev <stsp2@yandex.ru>
> > > 
> > > CC: Jeff Layton <jlayton@kernel.org>
> > > CC: Chuck Lever <chuck.lever@oracle.com>
> > > CC: Alexander Viro <viro@zeniv.linux.org.uk>
> > > CC: Christian Brauner <brauner@kernel.org>
> > > CC: linux-fsdevel@vger.kernel.org
> > > CC: linux-kernel@vger.kernel.org
> > > 
> > > ---
> > >   fs/locks.c | 2 --
> > >   1 file changed, 2 deletions(-)
> > > 
> > > diff --git a/fs/locks.c b/fs/locks.c
> > > index 210766007e63..ee265e166542 100644
> > > --- a/fs/locks.c
> > > +++ b/fs/locks.c
> > > @@ -2158,8 +2158,6 @@ static pid_t locks_translate_pid(struct file_lock *fl, struct pid_namespace *ns)
> > >   	pid_t vnr;
> > >   	struct pid *pid;
> > >   
> > > -	if (IS_OFDLCK(fl))
> > > -		return -1;
> > >   	if (IS_REMOTELCK(fl))
> > >   		return fl->fl_pid;
> > >   	/*
> > NACK on this one.
> > 
> > OFD locks are not owned by processes. They are owned by the file
> > description (hence the name). Because of this, returning a pid here is
> > wrong.
> 
> But fd is owned by a process.

No, it isn't.

fd's can be handed off between processes via unix descriptors.

Multithreaded processes are also a bit of a gray area here: Suppose I
open a file and set an OFD lock on it in one task, and then let that
task exit while the file is still open. What should l_pid say in that
case?

> PID has a meaning, you can send SIGKILL
> to the returned PID, and the lock is clear.
> Was there any reason to hide the PID at
> a first place?
> 

Yes, because OFD locks are not tied to a pid in the same way that
traditional POSIX locks are.

> 
> > This precedent comes from BSD, where flock() and POSIX locks can
> > conflict. BSD returns -1 for the pid if you call F_GETLK on a file
> > locked with flock(). Since OFD locks have similar ownership semantics to
> > flock() locks, we use the same convention here.
> OK if you insist I can drop this one and
> search the PID by some other means.
> Just a bit unsure what makes it so important
> to overwrite the potentially useful info
> with -1.
> 
> So in case you insist on that, then should
> I send a v2 or can you just drop the patch
> yourself?

stsp June 20, 2023, 11:45 a.m. UTC | #4

20.06.2023 16:12, Jeff Layton пишет:
> Multithreaded processes are also a bit of a gray area here: Suppose I
> open a file and set an OFD lock on it in one task, and then let that
> task exit while the file is still open. What should l_pid say in that
> case?

If by the "task" you mean a process, then
the result should be no lock at all.
If you mean the thread exit, then I would
expect l_pid to contain tgid, in which case
it will still point to the valid pid.
Or do you mean l_pid contains tid?
Checking... no, l_pid contains tgid.
So I don't really see the problem you are
pointing with the above example, could
you please clarify?

Jeff Layton June 20, 2023, 12:02 p.m. UTC | #5

On Tue, 2023-06-20 at 16:45 +0500, stsp wrote:
> 20.06.2023 16:12, Jeff Layton пишет:
> > Multithreaded processes are also a bit of a gray area here: Suppose I
> > open a file and set an OFD lock on it in one task, and then let that
> > task exit while the file is still open. What should l_pid say in that
> > case?
> 
> If by the "task" you mean a process, then
> the result should be no lock at all.
> If you mean the thread exit, then I would
> expect l_pid to contain tgid, in which case
> it will still point to the valid pid.
> Or do you mean l_pid contains tid?
> Checking... no, l_pid contains tgid.
> So I don't really see the problem you are
> pointing with the above example, could
> you please clarify?
> 

Suppose I start a process (call it pid 100), and then spawn a thread
(101). I then have 101 open a file and set an OFD lock on it (such that
the resulting fl_pid field in the file_lock is set to 101). Now, join
the thread so that task 101 exits.

Wait a while, and eventually pid 101 will be recycled, so that now there
is a new task 101 that is unrelated to the process above. Another
(unrelated) task then calls F_GETLK on the file and sees that lock. Its
pid is now set to 101 and sends SIGKILL to it, killing an unrelated
process.

That's just one example, of course. The underlying problem is that OFD
locks are not owned by processes in the same way that traditional POSIX
locks are, so reporting a pid there is unreliable, at best.

stsp June 20, 2023, 12:34 p.m. UTC | #6

20.06.2023 17:02, Jeff Layton пишет:
> Suppose I start a process (call it pid 100), and then spawn a thread
> (101). I then have 101 open a file and set an OFD lock on it (such that
> the resulting fl_pid field in the file_lock is set to 101).

How come?
There are multiple places in locks.c
with this line:
fl->fl_pid = current->tgid;

And I've yet to see the line like:
fl->fl_pid = current->pid;
Its simply not there.

No, we put tgid into l_pid!
tgid will still be 100, no matter how
many threads you spawn or destroy.
Or what am I misseng?


> That's just one example, of course. The underlying problem is that OFD
> locks are not owned by processes in the same way that traditional POSIX
> locks are, so reporting a pid there is unreliable, at best.
But we report tgid.
It doesn't depend on threads.
I don't understand. :)

Jeff Layton June 20, 2023, 1:19 p.m. UTC | #7

On Tue, 2023-06-20 at 17:34 +0500, stsp wrote:
> 20.06.2023 17:02, Jeff Layton пишет:
> > Suppose I start a process (call it pid 100), and then spawn a thread
> > (101). I then have 101 open a file and set an OFD lock on it (such that
> > the resulting fl_pid field in the file_lock is set to 101).
> 
> How come?
> There are multiple places in locks.c
> with this line:
> fl->fl_pid = current->tgid;
> 
> And I've yet to see the line like:
> fl->fl_pid = current->pid;
> Its simply not there.
> 
> No, we put tgid into l_pid!
> tgid will still be 100, no matter how
> many threads you spawn or destroy.
> Or what am I misseng?
> 
>
> > That's just one example, of course. The underlying problem is that OFD
> > locks are not owned by processes in the same way that traditional POSIX
> > locks are, so reporting a pid there is unreliable, at best.
> But we report tgid.
> It doesn't depend on threads.
> I don't understand. :)

Good point. I had forgotten that we stuffed the l_pid in there. So in
principle, that example would be OK. But...there is still the problem of
passing file descriptors via unix sockets.

The bottom line is that these locks are specifically not owned by a
process, so returning the l_pid field is unreliable (at best). There is
no guarantee that the pid returned will still represent the task that
set the lock.

You may want to review this article. They're called "File-private" locks
here, but the name was later changed to "Open file description" (OFD)
locks:

    https://lwn.net/Articles/586904/

The rationale for why -1 is reported is noted there.

stsp June 20, 2023, 1:39 p.m. UTC | #8

20.06.2023 18:19, Jeff Layton пишет:
> The bottom line is that these locks are specifically not owned by a
> process, so returning the l_pid field is unreliable (at best). There is
> no guarantee that the pid returned will still represent the task that
> set the lock.

Though it will, for sure, represent the
task that _owns_ the lock.

> You may want to review this article. They're called "File-private" locks
> here, but the name was later changed to "Open file description" (OFD)
> locks:
>
>      https://lwn.net/Articles/586904/
>
> The rationale for why -1 is reported is noted there.
Well, they point to fork() and SCM_RIGHTS.
Yes, these 2 beasts can make the same lock
owned by more than one process.
Yet l_pid returned, is going to be always valid:
it will still represent one of the valid owners.
So my call is to be brave and just re-consider
the conclusion of that article, made 10 years
ago! :)

Of course if returning just 1 of possibly multiple
owners is a problem, then oh well, I'll drop
this patch.

Matthew Wilcox June 20, 2023, 1:46 p.m. UTC | #9

On Tue, Jun 20, 2023 at 06:39:07PM +0500, stsp wrote:
> Though it will, for sure, represent the
> task that _owns_ the lock.

No, it *DOESN'T*.  I can open a file, SCM_RIGHTS pass it to another task
and then exit.  Now the only owner of that lock is the recipient ...
who may not even have received the fd yet.

stsp June 20, 2023, 1:47 p.m. UTC | #10

20.06.2023 18:46, Matthew Wilcox пишет:
> On Tue, Jun 20, 2023 at 06:39:07PM +0500, stsp wrote:
>> Though it will, for sure, represent the
>> task that _owns_ the lock.
> No, it *DOESN'T*.  I can open a file, SCM_RIGHTS pass it to another task
> and then exit.  Now the only owner of that lock is the recipient ...
Won't I get the recipient's pid in an
l_pid then?

Jeff Layton June 20, 2023, 1:58 p.m. UTC | #11

On Tue, 2023-06-20 at 18:39 +0500, stsp wrote:
> 20.06.2023 18:19, Jeff Layton пишет:
> > The bottom line is that these locks are specifically not owned by a
> > process, so returning the l_pid field is unreliable (at best). There is
> > no guarantee that the pid returned will still represent the task that
> > set the lock.
> 
> Though it will, for sure, represent the
> task that _owns_ the lock.
> 
> > You may want to review this article. They're called "File-private" locks
> > here, but the name was later changed to "Open file description" (OFD)
> > locks:
> > 
> >      https://lwn.net/Articles/586904/
> > 
> > The rationale for why -1 is reported is noted there.
> Well, they point to fork() and SCM_RIGHTS.
> Yes, these 2 beasts can make the same lock
> owned by more than one process.
> Yet l_pid returned, is going to be always valid:
> it will still represent one of the valid owners.

No, it won't. The l_pid field is populated from the file_lock->fl_pid.
That field is set when the lock is set, and never updated. So it's quite
possible for F_GETLK to return the pid of a process that no longer
exists.

In principle, we could try to address that by changing how we track lock
ownership, but that's a fairly major overhaul, and I'm not clear on any
use-cases where that matters.

> So my call is to be brave and just re-consider
> the conclusion of that article, made 10 years
> ago! :)
> 

I think my foot has too many bullet wounds for that sort of bravery.

> Of course if returning just 1 of possibly multiple
> owners is a problem, then oh well, I'll drop
> this patch.

Matthew Wilcox June 20, 2023, 2:36 p.m. UTC | #12

On Tue, Jun 20, 2023 at 06:47:31PM +0500, stsp wrote:
> 
> 20.06.2023 18:46, Matthew Wilcox пишет:
> > On Tue, Jun 20, 2023 at 06:39:07PM +0500, stsp wrote:
> > > Though it will, for sure, represent the
> > > task that _owns_ the lock.
> > No, it *DOESN'T*.  I can open a file, SCM_RIGHTS pass it to another task
> > and then exit.  Now the only owner of that lock is the recipient ...
> Won't I get the recipient's pid in an
> l_pid then?

You snipped the part where I pointed out that at times there can be
_no_ task that owns it.  open a fd, set the lock, pass the fd to another
task, exit.  until that task calls recvmsg(), no task owns it.

stsp June 20, 2023, 3:45 p.m. UTC | #13

20.06.2023 19:36, Matthew Wilcox пишет:
> On Tue, Jun 20, 2023 at 06:47:31PM +0500, stsp wrote:
>> 20.06.2023 18:46, Matthew Wilcox пишет:
>>> On Tue, Jun 20, 2023 at 06:39:07PM +0500, stsp wrote:
>>>> Though it will, for sure, represent the
>>>> task that _owns_ the lock.
>>> No, it *DOESN'T*.  I can open a file, SCM_RIGHTS pass it to another task
>>> and then exit.  Now the only owner of that lock is the recipient ...
>> Won't I get the recipient's pid in an
>> l_pid then?
> You snipped the part where I pointed out that at times there can be
> _no_ task that owns it.  open a fd, set the lock, pass the fd to another
> task, exit.  until that task calls recvmsg(), no task owns it.
Hmm, interesting case.
So at least it seems if recipient also exits,
then the untransferred fd gets closed.
Does this mean, by any chance, that the
recipient actually owns an fd before
recvmsg() is done?

Matthew Wilcox June 20, 2023, 5:05 p.m. UTC | #14

On Tue, Jun 20, 2023 at 08:45:21PM +0500, stsp wrote:
> 
> 20.06.2023 19:36, Matthew Wilcox пишет:
> > On Tue, Jun 20, 2023 at 06:47:31PM +0500, stsp wrote:
> > > 20.06.2023 18:46, Matthew Wilcox пишет:
> > > > On Tue, Jun 20, 2023 at 06:39:07PM +0500, stsp wrote:
> > > > > Though it will, for sure, represent the
> > > > > task that _owns_ the lock.
> > > > No, it *DOESN'T*.  I can open a file, SCM_RIGHTS pass it to another task
> > > > and then exit.  Now the only owner of that lock is the recipient ...
> > > Won't I get the recipient's pid in an
> > > l_pid then?
> > You snipped the part where I pointed out that at times there can be
> > _no_ task that owns it.  open a fd, set the lock, pass the fd to another
> > task, exit.  until that task calls recvmsg(), no task owns it.
> Hmm, interesting case.
> So at least it seems if recipient also exits,
> then the untransferred fd gets closed.

yes, pretty sure this is done by garbage collection in the unix socket
handling code, though i've never looked at it.  it's done that way
because annoying people can do things like open two sockets and send the
fd of each to the other, then exit.

> Does this mean, by any chance, that the
> recipient actually owns an fd before
> recvmsg() is done?

no, it's not in their fd table.  they don't own it.

stsp June 21, 2023, 2:54 a.m. UTC | #15

20.06.2023 22:05, Matthew Wilcox пишет:
>> Does this mean, by any chance, that the
>> recipient actually owns an fd before
>> recvmsg() is done?
> no, it's not in their fd table.  they don't own it.
OK, thanks for showing this pathological
case. Let me just note that this changes
nothing at all. :)

The important thing to note here is that
any lock query is race-prone: locks can
come and go at any time. So if you need
some sequence of operations, you need
to employ some global locking for that.
I use flock(LOCK_EX) on the same fd, before
doing F_OFD_GETLK, and I do flock(LOCK_UN)
only when the entire sequence of operations
is completed. And I do the same on an
F_OFD_SETLK's side to guarantee the
atomicity. You can't do it otherwise,
it would be race-prone.

So given the above, the only thing we
need for l_pid consistency is for the
"donor" process to put LOCK_EX on an
fd before doing SCM_RIGHTS, and the
recipient should do LOCK_UN. Then
the other side, which also uses LOCK_EX,
will never see the owner-less state.
And as for the kernel's POV, l_pid should
be set to -1 only when there is no owner,
like in an example you mentioned.

stsp June 21, 2023, 6:57 a.m. UTC | #16

20.06.2023 18:58, Jeff Layton пишет:
> No, it won't. The l_pid field is populated from the file_lock->fl_pid.
> That field is set when the lock is set, and never updated. So it's quite
> possible for F_GETLK to return the pid of a process that no longer
> exists.
>
> In principle, we could try to address that by changing how we track lock
> ownership, but that's a fairly major overhaul, and I'm not clear on any
> use-cases where that matters.

OK, in this case I'll just put a comments
into the code, summarizing the info I got
from you and Matthew.
Thanks guys for all the info, its very helpful.

Now I only need to convert the current
"fundamental problem" attitude into a "not
implemented yet" via the code comment.


>> So my call is to be brave and just re-consider
>> the conclusion of that article, made 10 years
>> ago! :)
>>
> I think my foot has too many bullet wounds for that sort of bravery.
I am perfectly fine with leaving this thing
unimplemented. But what really bothers
me is the posix proposal, which I think was
done. Please tell me it allows fixing fl_pid
in the future (rather than to mandate -1),
and I am calm.

Jeff Layton June 21, 2023, 10:35 a.m. UTC | #17

On Wed, 2023-06-21 at 11:57 +0500, stsp wrote:
> 20.06.2023 18:58, Jeff Layton пишет:
> > No, it won't. The l_pid field is populated from the file_lock->fl_pid.
> > That field is set when the lock is set, and never updated. So it's quite
> > possible for F_GETLK to return the pid of a process that no longer
> > exists.
> > 
> > In principle, we could try to address that by changing how we track lock
> > ownership, but that's a fairly major overhaul, and I'm not clear on any
> > use-cases where that matters.
> 
> OK, in this case I'll just put a comments
> into the code, summarizing the info I got
> from you and Matthew.
> Thanks guys for all the info, its very helpful.
> 
> Now I only need to convert the current
> "fundamental problem" attitude into a "not
> implemented yet" via the code comment.
> 
> 
> > > So my call is to be brave and just re-consider
> > > the conclusion of that article, made 10 years
> > > ago! :)
> > > 
> > I think my foot has too many bullet wounds for that sort of bravery.
> I am perfectly fine with leaving this thing
> unimplemented. But what really bothers
> me is the posix proposal, which I think was
> done. Please tell me it allows fixing fl_pid
> in the future (rather than to mandate -1),
> and I am calm.

I don't think we can change this at this point.

The bottom line (again) is that OFD locks are owned by the file
descriptor (much like with flock()), and since file descriptors can be
shared across multiple process it's impossible to say that some single
process owns it.

stsp June 21, 2023, 10:42 a.m. UTC | #18

21.06.2023 15:35, Jeff Layton пишет:
> I don't think we can change this at this point.
>
> The bottom line (again) is that OFD locks are owned by the file
> descriptor (much like with flock()), and since file descriptors can be
> shared across multiple process it's impossible to say that some single
> process owns it.
What's the problem with 2 owners?
Can't you get one of them, rather than
meaningless -1?
Compare this situation with read locks.
They can overlap, so when you get an
info about a read lock (except for the
new F_UNLCK case), you get the info
about *some* of the locks in that range.
In the case of multiple owners, you
likewise get the info about about some
owner. If you iteratively send them a
"please release this lock" message
(eg in a form of SIGKILL), then you
traverse all, and end up with the
lock-free area.
Is there really any problem here?

Jeff Layton June 21, 2023, 11:05 a.m. UTC | #19

On Wed, 2023-06-21 at 15:42 +0500, stsp wrote:
> 21.06.2023 15:35, Jeff Layton пишет:
> > I don't think we can change this at this point.
> > 
> > The bottom line (again) is that OFD locks are owned by the file
> > descriptor (much like with flock()), and since file descriptors can be
> > shared across multiple process it's impossible to say that some single
> > process owns it.
> What's the problem with 2 owners?
> Can't you get one of them, rather than
> meaningless -1?
> Compare this situation with read locks.
> They can overlap, so when you get an
> info about a read lock (except for the
> new F_UNLCK case), you get the info
> about *some* of the locks in that range.
> In the case of multiple owners, you
> likewise get the info about about some
> owner. If you iteratively send them a
> "please release this lock" message
> (eg in a form of SIGKILL), then you
> traverse all, and end up with the
> lock-free area.
> Is there really any problem here?

Yes. Ambiguous answers are worse than none at all.

What problem are you trying to solve by having F_OFD_GETLK report a pid?

stsp June 21, 2023, 11:22 a.m. UTC | #20

21.06.2023 16:05, Jeff Layton пишет:
> Yes. Ambiguous answers are worse than none at all.

But same for read locks, when you
query them with F_OFD_GETLK.
It doesn't sound ambiguous to me,
you get the valid owner, and you can
iterate them if you kill them in a
process (same as for read locks).


> What problem are you trying to solve by having F_OFD_GETLK report a pid?
Just a way to abruptly kill offending lockers,
as that most likely means the process hanged
(I have a 3rd party code that drives the locking,
so it can't be trusted not to hang).
Its not essential though, for sure.
Curiosity also plays the role here. :)
Though if you don't want, I can as well
not add a TODO comment to the code.

stsp June 21, 2023, 11:26 a.m. UTC | #21

21.06.2023 16:22, stsp пишет:
>
> 21.06.2023 16:05, Jeff Layton пишет:
>> Yes. Ambiguous answers are worse than none at all.
>
> But same for read locks, when you
> query them with F_OFD_GETLK.
> It doesn't sound ambiguous to me,
> you get the valid owner, and you can
> iterate them if you kill them in a
> process (same as for read locks).
And in fact, no, you can't iterate the
read locks *unless* you use pid to
kill its owners! :) So this way you can
actually iterate the read locks, but
not in a most convenient way. :)

David Laight June 23, 2023, 1:10 p.m. UTC | #22

From: Matthew Wilcox
> Sent: 20 June 2023 14:46
> 
> On Tue, Jun 20, 2023 at 06:39:07PM +0500, stsp wrote:
> > Though it will, for sure, represent the
> > task that _owns_ the lock.
> 
> No, it *DOESN'T*.  I can open a file, SCM_RIGHTS pass it to another task
> and then exit.  Now the only owner of that lock is the recipient ...
> who may not even have received the fd yet.

Do these locks persist across fork+exec?

What happens is a completely unrelated process opens /proc/<pid>/fd
while a lock is held and then the (nominally) lock-holding
process closes the fd (or exits).

While it might be a useful diagnostic to know the pid of the
process that acquired the lock it clearly has no relationship
with any process that currently has an fd[] table entry that
references the file.

	David

-
Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK
Registration No: 1397386 (Wales)

Christian Brauner June 23, 2023, 3:25 p.m. UTC | #23

On Wed, Jun 21, 2023 at 07:05:12AM -0400, Jeff Layton wrote:
> On Wed, 2023-06-21 at 15:42 +0500, stsp wrote:
> > 21.06.2023 15:35, Jeff Layton пишет:
> > > I don't think we can change this at this point.
> > > 
> > > The bottom line (again) is that OFD locks are owned by the file
> > > descriptor (much like with flock()), and since file descriptors can be
> > > shared across multiple process it's impossible to say that some single
> > > process owns it.
> > What's the problem with 2 owners?
> > Can't you get one of them, rather than
> > meaningless -1?
> > Compare this situation with read locks.
> > They can overlap, so when you get an
> > info about a read lock (except for the
> > new F_UNLCK case), you get the info
> > about *some* of the locks in that range.
> > In the case of multiple owners, you
> > likewise get the info about about some
> > owner. If you iteratively send them a
> > "please release this lock" message
> > (eg in a form of SIGKILL), then you
> > traverse all, and end up with the
> > lock-free area.
> > Is there really any problem here?
> 
> Yes. Ambiguous answers are worse than none at all.

I agree.

A few minor observations:

SCM_RIGHTS have already been mentioned multiple times. But I'm not sure
it's been mentioned explicitly but that trivially means it's possible to
send an fd to a completely separate thread-group, then kill off the
sending thread-group by killing their thread-group leader. Bad enough as
the identifier is now useless. But it also means that at some later
point that pid can be recycled. So using that pid for killing sprees is
usually a bad idea and might cause killing a completely separate
thread-group.

pidfd_getfd() can be used to retrieve a file descriptor from another
process; like an inverse dup. Opens up similar problems as SCM_RIGHTS.

Expressing ownership doesn't make sense as this is an inherently shared
concept as Jeff mentioned. So best case is to try and track the creator
similar to SO_PEERCRED for unix sockets.

In any case, this is all made worse by the fact that struct file_lock
stashes raw pid numbers. That means that the kernel itself can't even
detect whether the creator of that lock has died and the pid possibly
been recycled. In essence, it risk reporting as owner a random process
that just happens to have the same pid number as the creator that
might've died.

Even for unix sockets SO_PEERCRED and SCM_CREDS are inherently racy even
though they stash a struct pid instead of a raw pid number and that
makes them problematic in a lot of scenario which is why they will
support pidfds with v6.5.

The situation could probably be improved by stashing a struct pid in
struct file_lock then you could at least somewhat reasonably track the
creator as the kernel would be able to detect whether the creating
process has already been reaped (not exited). For example, I did this
for pidfds where it reports -1 if the process the pidfd refers to has
been reaped.

Btw, is there a way to limit the number of file locks a user might
create? Is that implicitly bound to the maximum number of fds and files
a caller may create because I saw RLIMIT_LOCKS but I didn't see it
checked anywhere.

stsp June 23, 2023, 5:18 p.m. UTC | #24

23.06.2023 20:25, Christian Brauner пишет:
> On Wed, Jun 21, 2023 at 07:05:12AM -0400, Jeff Layton wrote:
>> On Wed, 2023-06-21 at 15:42 +0500, stsp wrote:
>>> 21.06.2023 15:35, Jeff Layton пишет:
>>>> I don't think we can change this at this point.
>>>>
>>>> The bottom line (again) is that OFD locks are owned by the file
>>>> descriptor (much like with flock()), and since file descriptors can be
>>>> shared across multiple process it's impossible to say that some single
>>>> process owns it.
>>> What's the problem with 2 owners?
>>> Can't you get one of them, rather than
>>> meaningless -1?
>>> Compare this situation with read locks.
>>> They can overlap, so when you get an
>>> info about a read lock (except for the
>>> new F_UNLCK case), you get the info
>>> about *some* of the locks in that range.
>>> In the case of multiple owners, you
>>> likewise get the info about about some
>>> owner. If you iteratively send them a
>>> "please release this lock" message
>>> (eg in a form of SIGKILL), then you
>>> traverse all, and end up with the
>>> lock-free area.
>>> Is there really any problem here?
>> Yes. Ambiguous answers are worse than none at all.
> I agree.
>
> A few minor observations:
>
> SCM_RIGHTS have already been mentioned multiple times. But I'm not sure
> it's been mentioned explicitly but that trivially means it's possible to
> send an fd to a completely separate thread-group, then kill off the
> sending thread-group by killing their thread-group leader. Bad enough as
> the identifier is now useless. But it also means that at some later
> point that pid can be recycled.
Come on.
I never proposed anything like this.
Of course the returned pid should be
the pid of the current, actual owner,
or one of current owners.
If someone else proposed to return
stalled pids, then it wasn't me.

Jeff Layton June 27, 2023, 4 p.m. UTC | #25

On Fri, 2023-06-23 at 22:18 +0500, stsp wrote:
> 23.06.2023 20:25, Christian Brauner пишет:
> > On Wed, Jun 21, 2023 at 07:05:12AM -0400, Jeff Layton wrote:
> > > On Wed, 2023-06-21 at 15:42 +0500, stsp wrote:
> > > > 21.06.2023 15:35, Jeff Layton пишет:
> > > > > I don't think we can change this at this point.
> > > > > 
> > > > > The bottom line (again) is that OFD locks are owned by the file
> > > > > descriptor (much like with flock()), and since file descriptors can be
> > > > > shared across multiple process it's impossible to say that some single
> > > > > process owns it.
> > > > What's the problem with 2 owners?
> > > > Can't you get one of them, rather than
> > > > meaningless -1?
> > > > Compare this situation with read locks.
> > > > They can overlap, so when you get an
> > > > info about a read lock (except for the
> > > > new F_UNLCK case), you get the info
> > > > about *some* of the locks in that range.
> > > > In the case of multiple owners, you
> > > > likewise get the info about about some
> > > > owner. If you iteratively send them a
> > > > "please release this lock" message
> > > > (eg in a form of SIGKILL), then you
> > > > traverse all, and end up with the
> > > > lock-free area.
> > > > Is there really any problem here?
> > > Yes. Ambiguous answers are worse than none at all.
> > I agree.
> > 
> > A few minor observations:
> > 
> > SCM_RIGHTS have already been mentioned multiple times. But I'm not sure
> > it's been mentioned explicitly but that trivially means it's possible to
> > send an fd to a completely separate thread-group, then kill off the
> > sending thread-group by killing their thread-group leader. Bad enough as
> > the identifier is now useless. But it also means that at some later
> > point that pid can be recycled.
> Come on.
> I never proposed anything like this.
> Of course the returned pid should be
> the pid of the current, actual owner,
> or one of current owners.
> If someone else proposed to return
> stalled pids, then it wasn't me.


Beyond all of this, there is a long history of problems with the l_pid
field as well with network filesystems, even with traditional POSIX
locks. What should go into the l_pid when a traditional POSIX lock is
held by a process on a separate host?

While POSIX mandates it, the l_pid is really sort of a "legacy" field
that is really just for informational purposes only nowadays. It might
have been a reliable bit of information back in the 1980's, but even
since the 90's it was suspect as a source of information.

Even if you _know_ you hold a traditional POSIX lock, be careful
trusting the information in that field.

stsp June 27, 2023, 4:20 p.m. UTC | #26

27.06.2023 21:00, Jeff Layton пишет:
> Beyond all of this, there is a long history of problems with the l_pid
> field as well with network filesystems, even with traditional POSIX
> locks. What should go into the l_pid when a traditional POSIX lock is
> held by a process on a separate host?
>
> While POSIX mandates it, the l_pid is really sort of a "legacy" field
> that is really just for informational purposes only nowadays. It might
> have been a reliable bit of information back in the 1980's, but even
> since the 90's it was suspect as a source of information.
>
> Even if you _know_ you hold a traditional POSIX lock, be careful
> trusting the information in that field.
Thanks for info.
Additional problem with multiple owners
that I can think of, is that you don't know
if more owners are present. And even if
you use SIGKILL to "iterate", you still don't
know if you got another owner of the prev
lock, or maybe you got entirely different
read lock with the same range from another
owner.

Still if you do "man fcntl" you'll see this:

                pid_t l_pid;     /* PID of process blocking our lock

                                    (set by F_GETLK and F_OFD_GETLK) */

And no, its not my patch that did this. :)
So unless properly documented, this would
be treated as a bug. And it should _not_ be
documented as "OFD locks has no owner by
definition" or alike - no one buys that.

[2/3] fd/locks: allow get the lock owner by F_OFD_GETLK

Commit Message

Comments

Patch