Message ID | 20230620095507.2677463-3-stsp2@yandex.ru (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | RFC: F_OFD_GETLK should provide more info | expand |
On Tue, 2023-06-20 at 14:55 +0500, Stas Sergeev wrote: > Currently F_OFD_GETLK sets the pid of the lock owner to -1. > Remove such behavior to allow getting the proper owner's pid. > This may be helpful when you want to send some message (like SIGKILL) > to the offending locker. > > Signed-off-by: Stas Sergeev <stsp2@yandex.ru> > > CC: Jeff Layton <jlayton@kernel.org> > CC: Chuck Lever <chuck.lever@oracle.com> > CC: Alexander Viro <viro@zeniv.linux.org.uk> > CC: Christian Brauner <brauner@kernel.org> > CC: linux-fsdevel@vger.kernel.org > CC: linux-kernel@vger.kernel.org > > --- > fs/locks.c | 2 -- > 1 file changed, 2 deletions(-) > > diff --git a/fs/locks.c b/fs/locks.c > index 210766007e63..ee265e166542 100644 > --- a/fs/locks.c > +++ b/fs/locks.c > @@ -2158,8 +2158,6 @@ static pid_t locks_translate_pid(struct file_lock *fl, struct pid_namespace *ns) > pid_t vnr; > struct pid *pid; > > - if (IS_OFDLCK(fl)) > - return -1; > if (IS_REMOTELCK(fl)) > return fl->fl_pid; > /* NACK on this one. OFD locks are not owned by processes. They are owned by the file description (hence the name). Because of this, returning a pid here is wrong. This precedent comes from BSD, where flock() and POSIX locks can conflict. BSD returns -1 for the pid if you call F_GETLK on a file locked with flock(). Since OFD locks have similar ownership semantics to flock() locks, we use the same convention here. Cheers,
Hello, 20.06.2023 15:51, Jeff Layton пишет: > On Tue, 2023-06-20 at 14:55 +0500, Stas Sergeev wrote: >> Currently F_OFD_GETLK sets the pid of the lock owner to -1. >> Remove such behavior to allow getting the proper owner's pid. >> This may be helpful when you want to send some message (like SIGKILL) >> to the offending locker. >> >> Signed-off-by: Stas Sergeev <stsp2@yandex.ru> >> >> CC: Jeff Layton <jlayton@kernel.org> >> CC: Chuck Lever <chuck.lever@oracle.com> >> CC: Alexander Viro <viro@zeniv.linux.org.uk> >> CC: Christian Brauner <brauner@kernel.org> >> CC: linux-fsdevel@vger.kernel.org >> CC: linux-kernel@vger.kernel.org >> >> --- >> fs/locks.c | 2 -- >> 1 file changed, 2 deletions(-) >> >> diff --git a/fs/locks.c b/fs/locks.c >> index 210766007e63..ee265e166542 100644 >> --- a/fs/locks.c >> +++ b/fs/locks.c >> @@ -2158,8 +2158,6 @@ static pid_t locks_translate_pid(struct file_lock *fl, struct pid_namespace *ns) >> pid_t vnr; >> struct pid *pid; >> >> - if (IS_OFDLCK(fl)) >> - return -1; >> if (IS_REMOTELCK(fl)) >> return fl->fl_pid; >> /* > NACK on this one. > > OFD locks are not owned by processes. They are owned by the file > description (hence the name). Because of this, returning a pid here is > wrong. But fd is owned by a process. PID has a meaning, you can send SIGKILL to the returned PID, and the lock is clear. Was there any reason to hide the PID at a first place? > This precedent comes from BSD, where flock() and POSIX locks can > conflict. BSD returns -1 for the pid if you call F_GETLK on a file > locked with flock(). Since OFD locks have similar ownership semantics to > flock() locks, we use the same convention here. OK if you insist I can drop this one and search the PID by some other means. Just a bit unsure what makes it so important to overwrite the potentially useful info with -1. So in case you insist on that, then should I send a v2 or can you just drop the patch yourself?
On Tue, 2023-06-20 at 15:57 +0500, stsp wrote: > Hello, > > 20.06.2023 15:51, Jeff Layton пишет: > > On Tue, 2023-06-20 at 14:55 +0500, Stas Sergeev wrote: > > > Currently F_OFD_GETLK sets the pid of the lock owner to -1. > > > Remove such behavior to allow getting the proper owner's pid. > > > This may be helpful when you want to send some message (like SIGKILL) > > > to the offending locker. > > > > > > Signed-off-by: Stas Sergeev <stsp2@yandex.ru> > > > > > > CC: Jeff Layton <jlayton@kernel.org> > > > CC: Chuck Lever <chuck.lever@oracle.com> > > > CC: Alexander Viro <viro@zeniv.linux.org.uk> > > > CC: Christian Brauner <brauner@kernel.org> > > > CC: linux-fsdevel@vger.kernel.org > > > CC: linux-kernel@vger.kernel.org > > > > > > --- > > > fs/locks.c | 2 -- > > > 1 file changed, 2 deletions(-) > > > > > > diff --git a/fs/locks.c b/fs/locks.c > > > index 210766007e63..ee265e166542 100644 > > > --- a/fs/locks.c > > > +++ b/fs/locks.c > > > @@ -2158,8 +2158,6 @@ static pid_t locks_translate_pid(struct file_lock *fl, struct pid_namespace *ns) > > > pid_t vnr; > > > struct pid *pid; > > > > > > - if (IS_OFDLCK(fl)) > > > - return -1; > > > if (IS_REMOTELCK(fl)) > > > return fl->fl_pid; > > > /* > > NACK on this one. > > > > OFD locks are not owned by processes. They are owned by the file > > description (hence the name). Because of this, returning a pid here is > > wrong. > > But fd is owned by a process. No, it isn't. fd's can be handed off between processes via unix descriptors. Multithreaded processes are also a bit of a gray area here: Suppose I open a file and set an OFD lock on it in one task, and then let that task exit while the file is still open. What should l_pid say in that case? > PID has a meaning, you can send SIGKILL > to the returned PID, and the lock is clear. > Was there any reason to hide the PID at > a first place? > Yes, because OFD locks are not tied to a pid in the same way that traditional POSIX locks are. > > > This precedent comes from BSD, where flock() and POSIX locks can > > conflict. BSD returns -1 for the pid if you call F_GETLK on a file > > locked with flock(). Since OFD locks have similar ownership semantics to > > flock() locks, we use the same convention here. > OK if you insist I can drop this one and > search the PID by some other means. > Just a bit unsure what makes it so important > to overwrite the potentially useful info > with -1. > > So in case you insist on that, then should > I send a v2 or can you just drop the patch > yourself?
20.06.2023 16:12, Jeff Layton пишет: > Multithreaded processes are also a bit of a gray area here: Suppose I > open a file and set an OFD lock on it in one task, and then let that > task exit while the file is still open. What should l_pid say in that > case? If by the "task" you mean a process, then the result should be no lock at all. If you mean the thread exit, then I would expect l_pid to contain tgid, in which case it will still point to the valid pid. Or do you mean l_pid contains tid? Checking... no, l_pid contains tgid. So I don't really see the problem you are pointing with the above example, could you please clarify?
On Tue, 2023-06-20 at 16:45 +0500, stsp wrote: > 20.06.2023 16:12, Jeff Layton пишет: > > Multithreaded processes are also a bit of a gray area here: Suppose I > > open a file and set an OFD lock on it in one task, and then let that > > task exit while the file is still open. What should l_pid say in that > > case? > > If by the "task" you mean a process, then > the result should be no lock at all. > If you mean the thread exit, then I would > expect l_pid to contain tgid, in which case > it will still point to the valid pid. > Or do you mean l_pid contains tid? > Checking... no, l_pid contains tgid. > So I don't really see the problem you are > pointing with the above example, could > you please clarify? > Suppose I start a process (call it pid 100), and then spawn a thread (101). I then have 101 open a file and set an OFD lock on it (such that the resulting fl_pid field in the file_lock is set to 101). Now, join the thread so that task 101 exits. Wait a while, and eventually pid 101 will be recycled, so that now there is a new task 101 that is unrelated to the process above. Another (unrelated) task then calls F_GETLK on the file and sees that lock. Its pid is now set to 101 and sends SIGKILL to it, killing an unrelated process. That's just one example, of course. The underlying problem is that OFD locks are not owned by processes in the same way that traditional POSIX locks are, so reporting a pid there is unreliable, at best.
20.06.2023 17:02, Jeff Layton пишет: > Suppose I start a process (call it pid 100), and then spawn a thread > (101). I then have 101 open a file and set an OFD lock on it (such that > the resulting fl_pid field in the file_lock is set to 101). How come? There are multiple places in locks.c with this line: fl->fl_pid = current->tgid; And I've yet to see the line like: fl->fl_pid = current->pid; Its simply not there. No, we put tgid into l_pid! tgid will still be 100, no matter how many threads you spawn or destroy. Or what am I misseng? > That's just one example, of course. The underlying problem is that OFD > locks are not owned by processes in the same way that traditional POSIX > locks are, so reporting a pid there is unreliable, at best. But we report tgid. It doesn't depend on threads. I don't understand. :)
On Tue, 2023-06-20 at 17:34 +0500, stsp wrote: > 20.06.2023 17:02, Jeff Layton пишет: > > Suppose I start a process (call it pid 100), and then spawn a thread > > (101). I then have 101 open a file and set an OFD lock on it (such that > > the resulting fl_pid field in the file_lock is set to 101). > > How come? > There are multiple places in locks.c > with this line: > fl->fl_pid = current->tgid; > > And I've yet to see the line like: > fl->fl_pid = current->pid; > Its simply not there. > > No, we put tgid into l_pid! > tgid will still be 100, no matter how > many threads you spawn or destroy. > Or what am I misseng? > > > > That's just one example, of course. The underlying problem is that OFD > > locks are not owned by processes in the same way that traditional POSIX > > locks are, so reporting a pid there is unreliable, at best. > But we report tgid. > It doesn't depend on threads. > I don't understand. :) Good point. I had forgotten that we stuffed the l_pid in there. So in principle, that example would be OK. But...there is still the problem of passing file descriptors via unix sockets. The bottom line is that these locks are specifically not owned by a process, so returning the l_pid field is unreliable (at best). There is no guarantee that the pid returned will still represent the task that set the lock. You may want to review this article. They're called "File-private" locks here, but the name was later changed to "Open file description" (OFD) locks: https://lwn.net/Articles/586904/ The rationale for why -1 is reported is noted there.
20.06.2023 18:19, Jeff Layton пишет: > The bottom line is that these locks are specifically not owned by a > process, so returning the l_pid field is unreliable (at best). There is > no guarantee that the pid returned will still represent the task that > set the lock. Though it will, for sure, represent the task that _owns_ the lock. > You may want to review this article. They're called "File-private" locks > here, but the name was later changed to "Open file description" (OFD) > locks: > > https://lwn.net/Articles/586904/ > > The rationale for why -1 is reported is noted there. Well, they point to fork() and SCM_RIGHTS. Yes, these 2 beasts can make the same lock owned by more than one process. Yet l_pid returned, is going to be always valid: it will still represent one of the valid owners. So my call is to be brave and just re-consider the conclusion of that article, made 10 years ago! :) Of course if returning just 1 of possibly multiple owners is a problem, then oh well, I'll drop this patch.
On Tue, Jun 20, 2023 at 06:39:07PM +0500, stsp wrote: > Though it will, for sure, represent the > task that _owns_ the lock. No, it *DOESN'T*. I can open a file, SCM_RIGHTS pass it to another task and then exit. Now the only owner of that lock is the recipient ... who may not even have received the fd yet.
20.06.2023 18:46, Matthew Wilcox пишет: > On Tue, Jun 20, 2023 at 06:39:07PM +0500, stsp wrote: >> Though it will, for sure, represent the >> task that _owns_ the lock. > No, it *DOESN'T*. I can open a file, SCM_RIGHTS pass it to another task > and then exit. Now the only owner of that lock is the recipient ... Won't I get the recipient's pid in an l_pid then?
On Tue, 2023-06-20 at 18:39 +0500, stsp wrote: > 20.06.2023 18:19, Jeff Layton пишет: > > The bottom line is that these locks are specifically not owned by a > > process, so returning the l_pid field is unreliable (at best). There is > > no guarantee that the pid returned will still represent the task that > > set the lock. > > Though it will, for sure, represent the > task that _owns_ the lock. > > > You may want to review this article. They're called "File-private" locks > > here, but the name was later changed to "Open file description" (OFD) > > locks: > > > > https://lwn.net/Articles/586904/ > > > > The rationale for why -1 is reported is noted there. > Well, they point to fork() and SCM_RIGHTS. > Yes, these 2 beasts can make the same lock > owned by more than one process. > Yet l_pid returned, is going to be always valid: > it will still represent one of the valid owners. No, it won't. The l_pid field is populated from the file_lock->fl_pid. That field is set when the lock is set, and never updated. So it's quite possible for F_GETLK to return the pid of a process that no longer exists. In principle, we could try to address that by changing how we track lock ownership, but that's a fairly major overhaul, and I'm not clear on any use-cases where that matters. > So my call is to be brave and just re-consider > the conclusion of that article, made 10 years > ago! :) > I think my foot has too many bullet wounds for that sort of bravery. > Of course if returning just 1 of possibly multiple > owners is a problem, then oh well, I'll drop > this patch.
On Tue, Jun 20, 2023 at 06:47:31PM +0500, stsp wrote: > > 20.06.2023 18:46, Matthew Wilcox пишет: > > On Tue, Jun 20, 2023 at 06:39:07PM +0500, stsp wrote: > > > Though it will, for sure, represent the > > > task that _owns_ the lock. > > No, it *DOESN'T*. I can open a file, SCM_RIGHTS pass it to another task > > and then exit. Now the only owner of that lock is the recipient ... > Won't I get the recipient's pid in an > l_pid then? You snipped the part where I pointed out that at times there can be _no_ task that owns it. open a fd, set the lock, pass the fd to another task, exit. until that task calls recvmsg(), no task owns it.
20.06.2023 19:36, Matthew Wilcox пишет: > On Tue, Jun 20, 2023 at 06:47:31PM +0500, stsp wrote: >> 20.06.2023 18:46, Matthew Wilcox пишет: >>> On Tue, Jun 20, 2023 at 06:39:07PM +0500, stsp wrote: >>>> Though it will, for sure, represent the >>>> task that _owns_ the lock. >>> No, it *DOESN'T*. I can open a file, SCM_RIGHTS pass it to another task >>> and then exit. Now the only owner of that lock is the recipient ... >> Won't I get the recipient's pid in an >> l_pid then? > You snipped the part where I pointed out that at times there can be > _no_ task that owns it. open a fd, set the lock, pass the fd to another > task, exit. until that task calls recvmsg(), no task owns it. Hmm, interesting case. So at least it seems if recipient also exits, then the untransferred fd gets closed. Does this mean, by any chance, that the recipient actually owns an fd before recvmsg() is done?
On Tue, Jun 20, 2023 at 08:45:21PM +0500, stsp wrote: > > 20.06.2023 19:36, Matthew Wilcox пишет: > > On Tue, Jun 20, 2023 at 06:47:31PM +0500, stsp wrote: > > > 20.06.2023 18:46, Matthew Wilcox пишет: > > > > On Tue, Jun 20, 2023 at 06:39:07PM +0500, stsp wrote: > > > > > Though it will, for sure, represent the > > > > > task that _owns_ the lock. > > > > No, it *DOESN'T*. I can open a file, SCM_RIGHTS pass it to another task > > > > and then exit. Now the only owner of that lock is the recipient ... > > > Won't I get the recipient's pid in an > > > l_pid then? > > You snipped the part where I pointed out that at times there can be > > _no_ task that owns it. open a fd, set the lock, pass the fd to another > > task, exit. until that task calls recvmsg(), no task owns it. > Hmm, interesting case. > So at least it seems if recipient also exits, > then the untransferred fd gets closed. yes, pretty sure this is done by garbage collection in the unix socket handling code, though i've never looked at it. it's done that way because annoying people can do things like open two sockets and send the fd of each to the other, then exit. > Does this mean, by any chance, that the > recipient actually owns an fd before > recvmsg() is done? no, it's not in their fd table. they don't own it.
20.06.2023 22:05, Matthew Wilcox пишет: >> Does this mean, by any chance, that the >> recipient actually owns an fd before >> recvmsg() is done? > no, it's not in their fd table. they don't own it. OK, thanks for showing this pathological case. Let me just note that this changes nothing at all. :) The important thing to note here is that any lock query is race-prone: locks can come and go at any time. So if you need some sequence of operations, you need to employ some global locking for that. I use flock(LOCK_EX) on the same fd, before doing F_OFD_GETLK, and I do flock(LOCK_UN) only when the entire sequence of operations is completed. And I do the same on an F_OFD_SETLK's side to guarantee the atomicity. You can't do it otherwise, it would be race-prone. So given the above, the only thing we need for l_pid consistency is for the "donor" process to put LOCK_EX on an fd before doing SCM_RIGHTS, and the recipient should do LOCK_UN. Then the other side, which also uses LOCK_EX, will never see the owner-less state. And as for the kernel's POV, l_pid should be set to -1 only when there is no owner, like in an example you mentioned.
20.06.2023 18:58, Jeff Layton пишет: > No, it won't. The l_pid field is populated from the file_lock->fl_pid. > That field is set when the lock is set, and never updated. So it's quite > possible for F_GETLK to return the pid of a process that no longer > exists. > > In principle, we could try to address that by changing how we track lock > ownership, but that's a fairly major overhaul, and I'm not clear on any > use-cases where that matters. OK, in this case I'll just put a comments into the code, summarizing the info I got from you and Matthew. Thanks guys for all the info, its very helpful. Now I only need to convert the current "fundamental problem" attitude into a "not implemented yet" via the code comment. >> So my call is to be brave and just re-consider >> the conclusion of that article, made 10 years >> ago! :) >> > I think my foot has too many bullet wounds for that sort of bravery. I am perfectly fine with leaving this thing unimplemented. But what really bothers me is the posix proposal, which I think was done. Please tell me it allows fixing fl_pid in the future (rather than to mandate -1), and I am calm.
On Wed, 2023-06-21 at 11:57 +0500, stsp wrote: > 20.06.2023 18:58, Jeff Layton пишет: > > No, it won't. The l_pid field is populated from the file_lock->fl_pid. > > That field is set when the lock is set, and never updated. So it's quite > > possible for F_GETLK to return the pid of a process that no longer > > exists. > > > > In principle, we could try to address that by changing how we track lock > > ownership, but that's a fairly major overhaul, and I'm not clear on any > > use-cases where that matters. > > OK, in this case I'll just put a comments > into the code, summarizing the info I got > from you and Matthew. > Thanks guys for all the info, its very helpful. > > Now I only need to convert the current > "fundamental problem" attitude into a "not > implemented yet" via the code comment. > > > > > So my call is to be brave and just re-consider > > > the conclusion of that article, made 10 years > > > ago! :) > > > > > I think my foot has too many bullet wounds for that sort of bravery. > I am perfectly fine with leaving this thing > unimplemented. But what really bothers > me is the posix proposal, which I think was > done. Please tell me it allows fixing fl_pid > in the future (rather than to mandate -1), > and I am calm. I don't think we can change this at this point. The bottom line (again) is that OFD locks are owned by the file descriptor (much like with flock()), and since file descriptors can be shared across multiple process it's impossible to say that some single process owns it.
21.06.2023 15:35, Jeff Layton пишет: > I don't think we can change this at this point. > > The bottom line (again) is that OFD locks are owned by the file > descriptor (much like with flock()), and since file descriptors can be > shared across multiple process it's impossible to say that some single > process owns it. What's the problem with 2 owners? Can't you get one of them, rather than meaningless -1? Compare this situation with read locks. They can overlap, so when you get an info about a read lock (except for the new F_UNLCK case), you get the info about *some* of the locks in that range. In the case of multiple owners, you likewise get the info about about some owner. If you iteratively send them a "please release this lock" message (eg in a form of SIGKILL), then you traverse all, and end up with the lock-free area. Is there really any problem here?
On Wed, 2023-06-21 at 15:42 +0500, stsp wrote: > 21.06.2023 15:35, Jeff Layton пишет: > > I don't think we can change this at this point. > > > > The bottom line (again) is that OFD locks are owned by the file > > descriptor (much like with flock()), and since file descriptors can be > > shared across multiple process it's impossible to say that some single > > process owns it. > What's the problem with 2 owners? > Can't you get one of them, rather than > meaningless -1? > Compare this situation with read locks. > They can overlap, so when you get an > info about a read lock (except for the > new F_UNLCK case), you get the info > about *some* of the locks in that range. > In the case of multiple owners, you > likewise get the info about about some > owner. If you iteratively send them a > "please release this lock" message > (eg in a form of SIGKILL), then you > traverse all, and end up with the > lock-free area. > Is there really any problem here? Yes. Ambiguous answers are worse than none at all. What problem are you trying to solve by having F_OFD_GETLK report a pid?
21.06.2023 16:05, Jeff Layton пишет: > Yes. Ambiguous answers are worse than none at all. But same for read locks, when you query them with F_OFD_GETLK. It doesn't sound ambiguous to me, you get the valid owner, and you can iterate them if you kill them in a process (same as for read locks). > What problem are you trying to solve by having F_OFD_GETLK report a pid? Just a way to abruptly kill offending lockers, as that most likely means the process hanged (I have a 3rd party code that drives the locking, so it can't be trusted not to hang). Its not essential though, for sure. Curiosity also plays the role here. :) Though if you don't want, I can as well not add a TODO comment to the code.
21.06.2023 16:22, stsp пишет: > > 21.06.2023 16:05, Jeff Layton пишет: >> Yes. Ambiguous answers are worse than none at all. > > But same for read locks, when you > query them with F_OFD_GETLK. > It doesn't sound ambiguous to me, > you get the valid owner, and you can > iterate them if you kill them in a > process (same as for read locks). And in fact, no, you can't iterate the read locks *unless* you use pid to kill its owners! :) So this way you can actually iterate the read locks, but not in a most convenient way. :)
From: Matthew Wilcox > Sent: 20 June 2023 14:46 > > On Tue, Jun 20, 2023 at 06:39:07PM +0500, stsp wrote: > > Though it will, for sure, represent the > > task that _owns_ the lock. > > No, it *DOESN'T*. I can open a file, SCM_RIGHTS pass it to another task > and then exit. Now the only owner of that lock is the recipient ... > who may not even have received the fd yet. Do these locks persist across fork+exec? What happens is a completely unrelated process opens /proc/<pid>/fd while a lock is held and then the (nominally) lock-holding process closes the fd (or exits). While it might be a useful diagnostic to know the pid of the process that acquired the lock it clearly has no relationship with any process that currently has an fd[] table entry that references the file. David - Registered Address Lakeside, Bramley Road, Mount Farm, Milton Keynes, MK1 1PT, UK Registration No: 1397386 (Wales)
On Wed, Jun 21, 2023 at 07:05:12AM -0400, Jeff Layton wrote: > On Wed, 2023-06-21 at 15:42 +0500, stsp wrote: > > 21.06.2023 15:35, Jeff Layton пишет: > > > I don't think we can change this at this point. > > > > > > The bottom line (again) is that OFD locks are owned by the file > > > descriptor (much like with flock()), and since file descriptors can be > > > shared across multiple process it's impossible to say that some single > > > process owns it. > > What's the problem with 2 owners? > > Can't you get one of them, rather than > > meaningless -1? > > Compare this situation with read locks. > > They can overlap, so when you get an > > info about a read lock (except for the > > new F_UNLCK case), you get the info > > about *some* of the locks in that range. > > In the case of multiple owners, you > > likewise get the info about about some > > owner. If you iteratively send them a > > "please release this lock" message > > (eg in a form of SIGKILL), then you > > traverse all, and end up with the > > lock-free area. > > Is there really any problem here? > > Yes. Ambiguous answers are worse than none at all. I agree. A few minor observations: SCM_RIGHTS have already been mentioned multiple times. But I'm not sure it's been mentioned explicitly but that trivially means it's possible to send an fd to a completely separate thread-group, then kill off the sending thread-group by killing their thread-group leader. Bad enough as the identifier is now useless. But it also means that at some later point that pid can be recycled. So using that pid for killing sprees is usually a bad idea and might cause killing a completely separate thread-group. pidfd_getfd() can be used to retrieve a file descriptor from another process; like an inverse dup. Opens up similar problems as SCM_RIGHTS. Expressing ownership doesn't make sense as this is an inherently shared concept as Jeff mentioned. So best case is to try and track the creator similar to SO_PEERCRED for unix sockets. In any case, this is all made worse by the fact that struct file_lock stashes raw pid numbers. That means that the kernel itself can't even detect whether the creator of that lock has died and the pid possibly been recycled. In essence, it risk reporting as owner a random process that just happens to have the same pid number as the creator that might've died. Even for unix sockets SO_PEERCRED and SCM_CREDS are inherently racy even though they stash a struct pid instead of a raw pid number and that makes them problematic in a lot of scenario which is why they will support pidfds with v6.5. The situation could probably be improved by stashing a struct pid in struct file_lock then you could at least somewhat reasonably track the creator as the kernel would be able to detect whether the creating process has already been reaped (not exited). For example, I did this for pidfds where it reports -1 if the process the pidfd refers to has been reaped. Btw, is there a way to limit the number of file locks a user might create? Is that implicitly bound to the maximum number of fds and files a caller may create because I saw RLIMIT_LOCKS but I didn't see it checked anywhere.
23.06.2023 20:25, Christian Brauner пишет: > On Wed, Jun 21, 2023 at 07:05:12AM -0400, Jeff Layton wrote: >> On Wed, 2023-06-21 at 15:42 +0500, stsp wrote: >>> 21.06.2023 15:35, Jeff Layton пишет: >>>> I don't think we can change this at this point. >>>> >>>> The bottom line (again) is that OFD locks are owned by the file >>>> descriptor (much like with flock()), and since file descriptors can be >>>> shared across multiple process it's impossible to say that some single >>>> process owns it. >>> What's the problem with 2 owners? >>> Can't you get one of them, rather than >>> meaningless -1? >>> Compare this situation with read locks. >>> They can overlap, so when you get an >>> info about a read lock (except for the >>> new F_UNLCK case), you get the info >>> about *some* of the locks in that range. >>> In the case of multiple owners, you >>> likewise get the info about about some >>> owner. If you iteratively send them a >>> "please release this lock" message >>> (eg in a form of SIGKILL), then you >>> traverse all, and end up with the >>> lock-free area. >>> Is there really any problem here? >> Yes. Ambiguous answers are worse than none at all. > I agree. > > A few minor observations: > > SCM_RIGHTS have already been mentioned multiple times. But I'm not sure > it's been mentioned explicitly but that trivially means it's possible to > send an fd to a completely separate thread-group, then kill off the > sending thread-group by killing their thread-group leader. Bad enough as > the identifier is now useless. But it also means that at some later > point that pid can be recycled. Come on. I never proposed anything like this. Of course the returned pid should be the pid of the current, actual owner, or one of current owners. If someone else proposed to return stalled pids, then it wasn't me.
On Fri, 2023-06-23 at 22:18 +0500, stsp wrote: > 23.06.2023 20:25, Christian Brauner пишет: > > On Wed, Jun 21, 2023 at 07:05:12AM -0400, Jeff Layton wrote: > > > On Wed, 2023-06-21 at 15:42 +0500, stsp wrote: > > > > 21.06.2023 15:35, Jeff Layton пишет: > > > > > I don't think we can change this at this point. > > > > > > > > > > The bottom line (again) is that OFD locks are owned by the file > > > > > descriptor (much like with flock()), and since file descriptors can be > > > > > shared across multiple process it's impossible to say that some single > > > > > process owns it. > > > > What's the problem with 2 owners? > > > > Can't you get one of them, rather than > > > > meaningless -1? > > > > Compare this situation with read locks. > > > > They can overlap, so when you get an > > > > info about a read lock (except for the > > > > new F_UNLCK case), you get the info > > > > about *some* of the locks in that range. > > > > In the case of multiple owners, you > > > > likewise get the info about about some > > > > owner. If you iteratively send them a > > > > "please release this lock" message > > > > (eg in a form of SIGKILL), then you > > > > traverse all, and end up with the > > > > lock-free area. > > > > Is there really any problem here? > > > Yes. Ambiguous answers are worse than none at all. > > I agree. > > > > A few minor observations: > > > > SCM_RIGHTS have already been mentioned multiple times. But I'm not sure > > it's been mentioned explicitly but that trivially means it's possible to > > send an fd to a completely separate thread-group, then kill off the > > sending thread-group by killing their thread-group leader. Bad enough as > > the identifier is now useless. But it also means that at some later > > point that pid can be recycled. > Come on. > I never proposed anything like this. > Of course the returned pid should be > the pid of the current, actual owner, > or one of current owners. > If someone else proposed to return > stalled pids, then it wasn't me. Beyond all of this, there is a long history of problems with the l_pid field as well with network filesystems, even with traditional POSIX locks. What should go into the l_pid when a traditional POSIX lock is held by a process on a separate host? While POSIX mandates it, the l_pid is really sort of a "legacy" field that is really just for informational purposes only nowadays. It might have been a reliable bit of information back in the 1980's, but even since the 90's it was suspect as a source of information. Even if you _know_ you hold a traditional POSIX lock, be careful trusting the information in that field.
27.06.2023 21:00, Jeff Layton пишет: > Beyond all of this, there is a long history of problems with the l_pid > field as well with network filesystems, even with traditional POSIX > locks. What should go into the l_pid when a traditional POSIX lock is > held by a process on a separate host? > > While POSIX mandates it, the l_pid is really sort of a "legacy" field > that is really just for informational purposes only nowadays. It might > have been a reliable bit of information back in the 1980's, but even > since the 90's it was suspect as a source of information. > > Even if you _know_ you hold a traditional POSIX lock, be careful > trusting the information in that field. Thanks for info. Additional problem with multiple owners that I can think of, is that you don't know if more owners are present. And even if you use SIGKILL to "iterate", you still don't know if you got another owner of the prev lock, or maybe you got entirely different read lock with the same range from another owner. Still if you do "man fcntl" you'll see this: pid_t l_pid; /* PID of process blocking our lock (set by F_GETLK and F_OFD_GETLK) */ And no, its not my patch that did this. :) So unless properly documented, this would be treated as a bug. And it should _not_ be documented as "OFD locks has no owner by definition" or alike - no one buys that.
diff --git a/fs/locks.c b/fs/locks.c index 210766007e63..ee265e166542 100644 --- a/fs/locks.c +++ b/fs/locks.c @@ -2158,8 +2158,6 @@ static pid_t locks_translate_pid(struct file_lock *fl, struct pid_namespace *ns) pid_t vnr; struct pid *pid; - if (IS_OFDLCK(fl)) - return -1; if (IS_REMOTELCK(fl)) return fl->fl_pid; /*
Currently F_OFD_GETLK sets the pid of the lock owner to -1. Remove such behavior to allow getting the proper owner's pid. This may be helpful when you want to send some message (like SIGKILL) to the offending locker. Signed-off-by: Stas Sergeev <stsp2@yandex.ru> CC: Jeff Layton <jlayton@kernel.org> CC: Chuck Lever <chuck.lever@oracle.com> CC: Alexander Viro <viro@zeniv.linux.org.uk> CC: Christian Brauner <brauner@kernel.org> CC: linux-fsdevel@vger.kernel.org CC: linux-kernel@vger.kernel.org --- fs/locks.c | 2 -- 1 file changed, 2 deletions(-)