mbox series

[GIT,PULL] iomap: new code for 5.13-rc1

Message ID 20210427025805.GD3122264@magnolia (mailing list archive)
State New
Headers show
Series [GIT,PULL] iomap: new code for 5.13-rc1 | expand

Pull-request

git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git tags/iomap-5.13-merge-2

Message

Darrick J. Wong April 27, 2021, 2:58 a.m. UTC
Hi Linus,

Please pull this single patch to the iomap code for 5.13-rc1, which
augments what gets logged when someone tries to swapon an unacceptable
swap file.  (Yes, this is a continuation of the swapfile drama from last
season...)

The branch merges cleanly with upstream as of a few minutes ago and has
been soaking in for-next for weeks without complaints.  Please let me
know if there are any strange problems.  I anticipate there will be a
second patch next week to remove some (AFAICT) unused struct fields to
reduce memory usage.

--D

The following changes since commit 0d02ec6b3136c73c09e7859f0d0e4e2c4c07b49b:

  Linux 5.12-rc4 (2021-03-21 14:56:43 -0700)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git tags/iomap-5.13-merge-2

for you to fetch changes up to ad89b66cbad18ca146cbc75f64706d4ca6635973:

  iomap: improve the warnings from iomap_swapfile_activate (2021-03-26 10:55:40 -0700)

----------------------------------------------------------------
New code for 5.13:
- When a swap file is rejected, actually log the /name/ of the swapfile.

----------------------------------------------------------------
Christoph Hellwig (1):
      iomap: improve the warnings from iomap_swapfile_activate

 fs/iomap/swapfile.c | 38 ++++++++++++++++++++++----------------
 1 file changed, 22 insertions(+), 16 deletions(-)

Comments

Linus Torvalds April 27, 2021, 7:40 p.m. UTC | #1
On Mon, Apr 26, 2021 at 7:58 PM Darrick J. Wong <djwong@kernel.org> wrote:
>
> Please pull this single patch to the iomap code for 5.13-rc1, which
> augments what gets logged when someone tries to swapon an unacceptable
> swap file.  (Yes, this is a continuation of the swapfile drama from last
> season...)

Hmm. I've pulled this, but that "iomap_swapfile_fail()" thing seems a
bit silly to me.

We have '%pD' for printing a filename. It may not be perfect (by
default it only prints one component, you can do "%pD4" to show up to
four components), but it should "JustWork(tm)".

And if it doesn't, we should fix it.

So instead of having a kmalloc/kfree for the path buffer, I think you
should have been able to just do

    pr_err("swapon: file %pD4 %s\n", isi->file, str);

and be done with it.

And no, we don't have a ton of %pD users, so if it's ugly or buggy
when the file is NULL, or has problems with more (of fewer) than four
path components, let's just fix that (added Jia He and Al Viro to
participants, they've been the two people doing %pd and %pD - for
'struct dentry *' and 'struct file *' respectively).

                Linus
Christoph Hellwig April 27, 2021, 7:57 p.m. UTC | #2
On Tue, Apr 27, 2021 at 12:40:09PM -0700, Linus Torvalds wrote:
> We have '%pD' for printing a filename. It may not be perfect (by
> default it only prints one component, you can do "%pD4" to show up to
> four components), but it should "JustWork(tm)".
> 
> And if it doesn't, we should fix it.
> 
> So instead of having a kmalloc/kfree for the path buffer, I think you
> should have been able to just do
> 
>     pr_err("swapon: file %pD4 %s\n", isi->file, str);
> 
> and be done with it.

I'm aware of %pD, but 4 components here are not enough.  People
need to distinguish between xfstests runs and something real in
the system for these somewhat scary sounding messages.
Linus Torvalds April 27, 2021, 8:05 p.m. UTC | #3
On Tue, Apr 27, 2021 at 12:57 PM Christoph Hellwig <hch@lst.de> wrote:
>
> I'm aware of %pD, but 4 components here are not enough.  People
> need to distinguish between xfstests runs and something real in
> the system for these somewhat scary sounding messages.

So how many _would_ be enough? IOW, what would make %pD work better
for this case?

Why are the xfstest messages so magically different from real cases
that they'd need to be separately distinguished, and that can't be
done with just the final path component?

If you think the message is somehow unique and the path is something
secure and identifiable, you're very confused. file_path() is in no
way more "secure" than using %pD4 would be, since if there's some
actual bad actor they can put newlines etc in the pathname, they can
do chroot() etc to make the path look anything they like.

So I seriously don't understand the thinking where you claim that "<n>
components are not enough". Please explain why that could ever be a
real issue.

             Linus
pr-tracker-bot@kernel.org April 27, 2021, 8:07 p.m. UTC | #4
The pull request you sent on Mon, 26 Apr 2021 19:58:05 -0700:

> git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git tags/iomap-5.13-merge-2

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/b34b95ebbba9a10257e3a2c9b2ba4119cb345dc3

Thank you!
Christoph Hellwig April 28, 2021, 6:17 a.m. UTC | #5
On Tue, Apr 27, 2021 at 01:05:13PM -0700, Linus Torvalds wrote:
> So how many _would_ be enough? IOW, what would make %pD work better
> for this case?

Preferably all.

> Why are the xfstest messages so magically different from real cases
> that they'd need to be separately distinguished, and that can't be
> done with just the final path component?
> 
> If you think the message is somehow unique and the path is something
> secure and identifiable, you're very confused. file_path() is in no
> way more "secure" than using %pD4 would be, since if there's some
> actual bad actor they can put newlines etc in the pathname, they can
> do chroot() etc to make the path look anything they like.

Nothing needs to be secure.  It just needs to not scare users because
they can see that the first usually two components clearly identify
this is the test file system.
Linus Torvalds April 28, 2021, 6:38 a.m. UTC | #6
On Tue, Apr 27, 2021 at 11:17 PM Christoph Hellwig <hch@lst.de> wrote:
>
> On Tue, Apr 27, 2021 at 01:05:13PM -0700, Linus Torvalds wrote:
> > So how many _would_ be enough? IOW, what would make %pD work better
> > for this case?
>
> Preferably all.

WHY?

You guys are making no sense at all. You're stating silly things,
backing it up with absolutely nothing.

> Nothing needs to be secure.  It just needs to not scare users because
> they can see that the first usually two components clearly identify
> this is the test file system.

This is inane blathering.

What "scary message"? It will never happen in any normal circumstance,
and the trivial thing to do for any xfs test is to make the last
component name be something really obvious for the tester - who is the
only one who will ever see it.

And if it ever *does* happen in real life, the full path really isn't
necessary either. We're talking swap files. They aren't going to be in
random places.

The "I need the whole path" thing is just crazy, and you seem to be in
denial about it. There is absolutely zero reason for it.

I don't particularly care about this code sequence, but I do care when
people start making completely pointless arguyments to make excuses
for stupid code. You have extra silly code for "oh, the temporary
allocation that we did for no good reason can fail, so now we print
"<unknown>" for that case.

So it's all kinds of odd extra code for something that never used to
even bother with a pathname at all before, and now it's suddenly
"scary" and "really important to have all the components" instead of
just being simple and straightforward.

It's a purely informational message, and you guys made it pointlessly
overcomplicated for absolutely zero reason, and now you're too
embarrassed to just admit how pointless it was.

                   Linus
Christoph Hellwig April 28, 2021, 6:41 a.m. UTC | #7
On Tue, Apr 27, 2021 at 11:38:24PM -0700, Linus Torvalds wrote:
> It's a purely informational message,

yes.

> and you guys made it pointlessly
> overcomplicated for absolutely zero reason, and now you're too
> embarrassed to just admit how pointless it was.

"you guys" here is purely me, so I take the blame.  And no, I actually
did have a first version usind %pD, tested it and looked at the output
and saw how it stripped the actual useful part of the path, that is the
first components.
Linus Torvalds April 28, 2021, 7:14 a.m. UTC | #8
On Tue, Apr 27, 2021 at 11:41 PM Christoph Hellwig <hch@lst.de> wrote:
>
> "you guys" here is purely me, so I take the blame.  And no, I actually
> did have a first version usind %pD, tested it and looked at the output
> and saw how it stripped the actual useful part of the path, that is the
> first components.

So that's why I cc'd Al and Jia.

You may not have realized that the default for %pD is to show only one
component, and if you want to see more, you need to use something like
%pD4.

Which should be _plenty_.

But it's also something where I think that default (ie "no number")
behavior may be a bit surprising, and perhaps not the greatest
interface.

So let me just quote that first reply of mine, because you seem to not
have seen it:

> We have '%pD' for printing a filename. It may not be perfect (by
> default it only prints one component, you can do "%pD4" to show up to
> four components), but it should "JustWork(tm)".
>
> And if it doesn't, we should fix it.

I really think %pD4 should be more than good enough. And I think maybe
we should make plain "%pD" mean "as much of the path that is
reasonable" rather than "as few components as possible" (ie 1).

So I don't think "%pD" (or "%pD4") is necessarily perfect, but I think
it's even worse when people then go and do odd ad-hoc things because
of some inconvenience in our %pD implementation.

For example, changing the default to be "show more by default" should
be as simple as something like the attached.  I do think that would be
the more natural behavior for %pD - don't limit it unnecessarily by
default, but for somebody who literally just wants to see a maximum of
2 components, using '%pD2' makes sense.

(Similarly, changing the limit of 4  components to something slightly
bigger would be trivial)

Hmm?

Grepping for existing users with

    git grep '%pD[^1-4]'

most of them would probably like a full pathname, and the odd s390
hmcdrv_dev.c use should just be fixed (it has a hardcoded "/dev/%pD",
which seems very wrong).

Of course, %pD has some other limitations too. It doesn't follow
mount-points up. It's kind of intentionally a "for simple
informational uses only", but good enough in practice exactly for
things like debug printouts.

             Linus
Rasmus Villemoes April 28, 2021, 7:38 a.m. UTC | #9
On 28/04/2021 09.14, Linus Torvalds wrote:
So let me just quote that first reply of mine, because you seem to not
> have seen it:
> 
>> We have '%pD' for printing a filename. It may not be perfect (by
>> default it only prints one component, you can do "%pD4" to show up to
>> four components), but it should "JustWork(tm)".
>>
>> And if it doesn't, we should fix it.
> 
> I really think %pD4 should be more than good enough. And I think maybe
> we should make plain "%pD" mean "as much of the path that is
> reasonable" rather than "as few components as possible" (ie 1).
> 
> So I don't think "%pD" (or "%pD4") is necessarily perfect, but I think
> it's even worse when people then go and do odd ad-hoc things because
> of some inconvenience in our %pD implementation.
> 
> For example, changing the default to be "show more by default" should
> be as simple as something like the attached.  I do think that would be
> the more natural behavior for %pD - don't limit it unnecessarily by
> default, but for somebody who literally just wants to see a maximum of
> 2 components, using '%pD2' makes sense.
> 
> (Similarly, changing the limit of 4  components to something slightly
> bigger would be trivial)
> 
> Hmm?
> 
> Grepping for existing users with
> 
>     git grep '%pD[^1-4]'
>
> most of them would probably like a full pathname, and the odd s390
> hmcdrv_dev.c use should just be fixed (it has a hardcoded "/dev/%pD",
> which seems very wrong).

So the patch makes sense to me. If somebody says '%pD5', it would get
capped at 4 instead of being forced down to 1. But note that while that
grep only produces ~36 hits, it also affects %pd, of which there are
~200 without a 2-4 following (including some vsprintf test cases that
would break). So I think one would first have to explicitly support '1',
switch over some users by adding that 1 in their format string
(test_vsprintf in particular), then flip the default for 'no digit
following %p[dD]'.

Rasmus
Jia He April 28, 2021, 8:47 a.m. UTC | #10
Hi

> -----Original Message-----
> From: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Sent: Wednesday, April 28, 2021 3:38 PM
> To: Linus Torvalds <torvalds@linux-foundation.org>; Christoph Hellwig
> <hch@lst.de>
> Cc: Darrick J. Wong <djwong@kernel.org>; Justin He <Justin.He@arm.com>; Al
> Viro <viro@zeniv.linux.org.uk>; linux-fsdevel <linux-
> fsdevel@vger.kernel.org>; linux-xfs <linux-xfs@vger.kernel.org>; Dave
> Chinner <david@fromorbit.com>; Linux Kernel Mailing List <linux-
> kernel@vger.kernel.org>; Eric Sandeen <sandeen@sandeen.net>
> Subject: Re: [GIT PULL] iomap: new code for 5.13-rc1
>
> On 28/04/2021 09.14, Linus Torvalds wrote:
> So let me just quote that first reply of mine, because you seem to not
> > have seen it:
> >
> >> We have '%pD' for printing a filename. It may not be perfect (by
> >> default it only prints one component, you can do "%pD4" to show up to
> >> four components), but it should "JustWork(tm)".
> >>
> >> And if it doesn't, we should fix it.
> >
> > I really think %pD4 should be more than good enough. And I think maybe
> > we should make plain "%pD" mean "as much of the path that is
> > reasonable" rather than "as few components as possible" (ie 1).
> >
> > So I don't think "%pD" (or "%pD4") is necessarily perfect, but I think
> > it's even worse when people then go and do odd ad-hoc things because
> > of some inconvenience in our %pD implementation.
> >
> > For example, changing the default to be "show more by default" should
> > be as simple as something like the attached.  I do think that would be
> > the more natural behavior for %pD - don't limit it unnecessarily by
> > default, but for somebody who literally just wants to see a maximum of
> > 2 components, using '%pD2' makes sense.
> >
> > (Similarly, changing the limit of 4  components to something slightly
> > bigger would be trivial)
> >
> > Hmm?
> >
> > Grepping for existing users with
> >
> >     git grep '%pD[^1-4]'
> >
> > most of them would probably like a full pathname, and the odd s390
> > hmcdrv_dev.c use should just be fixed (it has a hardcoded "/dev/%pD",
> > which seems very wrong).
>
> So the patch makes sense to me. If somebody says '%pD5', it would get
> capped at 4 instead of being forced down to 1. But note that while that
> grep only produces ~36 hits, it also affects %pd, of which there are
> ~200 without a 2-4 following (including some vsprintf test cases that
> would break). So I think one would first have to explicitly support '1',
> switch over some users by adding that 1 in their format string
> (test_vsprintf in particular), then flip the default for 'no digit
> following %p[dD]'.

I checked and found a few changes as follows, hoping I didn't miss else:
1. test_vsprintf %pD->%pD1  %pd->%pd1
2. drivers/net/wireless/intel/iwlwifi/mvm/debugfs-vif.c
../../../%pd3/%pd -> %pd
3. s390/hmcdrv as mentioned above

--
Cheers,
Justin (Jia He)


>
> Rasmus
IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Linus Torvalds April 28, 2021, 4:50 p.m. UTC | #11
[ Added Andy, who replied to the separate thread where Jia already
posted the patch ]

On Wed, Apr 28, 2021 at 12:38 AM Rasmus Villemoes
<linux@rasmusvillemoes.dk> wrote:
>
> So the patch makes sense to me. If somebody says '%pD5', it would get
> capped at 4 instead of being forced down to 1. But note that while that
> grep only produces ~36 hits, it also affects %pd, of which there are
> ~200 without a 2-4 following (including some vsprintf test cases that
> would break). So I think one would first have to explicitly support '1',
> switch over some users by adding that 1 in their format string
> (test_vsprintf in particular), then flip the default for 'no digit
> following %p[dD]'.

Yeah, and the "show one name" actually makes sense for "%pd", because
that's about the *dentry*.

A dentry has a parent, yes, but at the same time, a dentry really does
inherently have "one name" (and given just the dentry pointers, you
can't show mount-related parenthood, so in many ways the "show just
one name" makes sense for "%pd" in ways it doesn't necessarily for
"%pD"). But while a dentry arguably has that "one primary component",
a _file_ is certainly not exclusively about that last component.

So you're right - my "how about something like this" patch is too
simplistic. The default number of components to show should be about
whether it's %pd or %pD.

That also does explain the arguably odd %pD defaults: %pd came first,
and then %pD came afterwards.

              Linus
Rasmus Villemoes April 29, 2021, 6:39 a.m. UTC | #12
On 28/04/2021 18.50, Linus Torvalds wrote:
> [ Added Andy, who replied to the separate thread where Jia already
> posted the patch ]
> 
> On Wed, Apr 28, 2021 at 12:38 AM Rasmus Villemoes
> <linux@rasmusvillemoes.dk> wrote:
>>
>> So the patch makes sense to me. If somebody says '%pD5', it would get
>> capped at 4 instead of being forced down to 1. But note that while that
>> grep only produces ~36 hits, it also affects %pd, of which there are
>> ~200 without a 2-4 following (including some vsprintf test cases that
>> would break). So I think one would first have to explicitly support '1',
>> switch over some users by adding that 1 in their format string
>> (test_vsprintf in particular), then flip the default for 'no digit
>> following %p[dD]'.
> 
> Yeah, and the "show one name" actually makes sense for "%pd", because
> that's about the *dentry*.
> 
> A dentry has a parent, yes, but at the same time, a dentry really does
> inherently have "one name" (and given just the dentry pointers, you
> can't show mount-related parenthood, so in many ways the "show just
> one name" makes sense for "%pd" in ways it doesn't necessarily for
> "%pD"). But while a dentry arguably has that "one primary component",
> a _file_ is certainly not exclusively about that last component.
> 
> So you're right - my "how about something like this" patch is too
> simplistic. The default number of components to show should be about
> whether it's %pd or %pD.

Well, keeping the default at 1 for %pd would certainly simplify things
as there are much fewer %pD instances.

> That also does explain the arguably odd %pD defaults: %pd came first,
> and then %pD came afterwards.

Eh? 4b6ccca701ef5977d0ffbc2c932430dea88b38b6 added them both at the same
time.

Rasmus
Christoph Hellwig April 29, 2021, 6:43 a.m. UTC | #13
On Wed, Apr 28, 2021 at 12:14:42AM -0700, Linus Torvalds wrote:
> Of course, %pD has some other limitations too. It doesn't follow
> mount-points up. It's kind of intentionally a "for simple
> informational uses only", but good enough in practice exactly for
> things like debug printouts.

Which thinking about my testing is probably the real problem.  When
running xfstests the it only printed "swap" as the file name, as the
tests create it under the rest mount points.  Which really is of
not use.  While printing /fstests/scratch/swap actually is useful.

I suspect the s390 issue with the hardcoded "/dev/" prefix is somewhat
similar.
Linus Torvalds April 29, 2021, 4:45 p.m. UTC | #14
On Wed, Apr 28, 2021 at 11:40 PM Rasmus Villemoes
<linux@rasmusvillemoes.dk> wrote:
>
> > That also does explain the arguably odd %pD defaults: %pd came first,
> > and then %pD came afterwards.
>
> Eh? 4b6ccca701ef5977d0ffbc2c932430dea88b38b6 added them both at the same
> time.

Ahh, I looked at "git blame", and saw that file_dentry_name() was
added later. But that turns out to have been an additional fix on top,
not actually "later support".

Looking more at that code, I am starting to think that
"file_dentry_name()" simply shouldn't use "dentry_name()" at all.
Despite that shared code origin, and despite that similar letter
choice (lower-vs-upper case), a dentry and a file really are very very
different from a name standpoint.

And it's not the "a filename is the whale pathname, and a dentry has
its own private dentry name" issue. It's really that the 'struct file'
contains a _path_ - which is not just the dentry pointer, but the
'struct vfsmount' pointer too.

So '%pD' really *could* get the real path right (because it has all
the required information) in ways that '%pd' fundamentally cannot.

At the same time, I really don't like printk specifiers to take any
real locks (ie mount_lock or rename_lock), so I wouldn't want them to
use the full  d_path() logic.

                Linus
Jia He April 30, 2021, 3:17 a.m. UTC | #15
Hi

> -----Original Message-----
> From: Linus Torvalds <torvalds@linux-foundation.org>
> Sent: Friday, April 30, 2021 12:46 AM
> To: Rasmus Villemoes <linux@rasmusvillemoes.dk>
> Cc: Christoph Hellwig <hch@lst.de>; Darrick J. Wong <djwong@kernel.org>;
> Justin He <Justin.He@arm.com>; Al Viro <viro@zeniv.linux.org.uk>; linux-
> fsdevel <linux-fsdevel@vger.kernel.org>; linux-xfs <linux-
> xfs@vger.kernel.org>; Dave Chinner <david@fromorbit.com>; Linux Kernel
> Mailing List <linux-kernel@vger.kernel.org>; Eric Sandeen
> <sandeen@sandeen.net>; Andy Shevchenko <andy.shevchenko@gmail.com>
> Subject: Re: [GIT PULL] iomap: new code for 5.13-rc1
>
> On Wed, Apr 28, 2021 at 11:40 PM Rasmus Villemoes
> <linux@rasmusvillemoes.dk> wrote:
> >
> > > That also does explain the arguably odd %pD defaults: %pd came first,
> > > and then %pD came afterwards.
> >
> > Eh? 4b6ccca701ef5977d0ffbc2c932430dea88b38b6 added them both at the same
> > time.
>
> Ahh, I looked at "git blame", and saw that file_dentry_name() was
> added later. But that turns out to have been an additional fix on top,
> not actually "later support".
>
> Looking more at that code, I am starting to think that
> "file_dentry_name()" simply shouldn't use "dentry_name()" at all.
> Despite that shared code origin, and despite that similar letter
> choice (lower-vs-upper case), a dentry and a file really are very very
> different from a name standpoint.
>
> And it's not the "a filename is the whale pathname, and a dentry has
> its own private dentry name" issue. It's really that the 'struct file'
> contains a _path_ - which is not just the dentry pointer, but the
> 'struct vfsmount' pointer too.
>
> So '%pD' really *could* get the real path right (because it has all
> the required information) in ways that '%pd' fundamentally cannot.
>
> At the same time, I really don't like printk specifiers to take any
> real locks (ie mount_lock or rename_lock), so I wouldn't want them to
> use the full  d_path() logic.

Is it a good idea to introduce a new d_path_nolock() for file_dentry_name()?
In d_path_nolock(), if it detects that there is conflicts with mount_lock
or rename_lock, then returned NULL as a name of that vfsmount?

Thanks for further suggestion.

--
Cheers,
Justin (Jia He)


IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Al Viro April 30, 2021, 3:21 a.m. UTC | #16
On Fri, Apr 30, 2021 at 03:17:02AM +0000, Justin He wrote:

> Is it a good idea to introduce a new d_path_nolock() for file_dentry_name()?
> In d_path_nolock(), if it detects that there is conflicts with mount_lock
> or rename_lock, then returned NULL as a name of that vfsmount?

Just what does vfsmount have to do with rename_lock?  And what's the point
of the entire mess, anyway?
Jia He April 30, 2021, 6:13 a.m. UTC | #17
Hi Al Viro

> -----Original Message-----
> From: Al Viro <viro@ftp.linux.org.uk> On Behalf Of Al Viro
> Sent: Friday, April 30, 2021 11:21 AM
> To: Justin He <Justin.He@arm.com>
> Cc: Linus Torvalds <torvalds@linux-foundation.org>; Rasmus Villemoes
> <linux@rasmusvillemoes.dk>; Christoph Hellwig <hch@lst.de>; Darrick J. Wong
> <djwong@kernel.org>; linux-fsdevel <linux-fsdevel@vger.kernel.org>; linux-
> xfs <linux-xfs@vger.kernel.org>; Dave Chinner <david@fromorbit.com>; Linux
> Kernel Mailing List <linux-kernel@vger.kernel.org>; Eric Sandeen
> <sandeen@sandeen.net>; Andy Shevchenko <andy.shevchenko@gmail.com>
> Subject: Re: [GIT PULL] iomap: new code for 5.13-rc1
>
> On Fri, Apr 30, 2021 at 03:17:02AM +0000, Justin He wrote:
>
> > Is it a good idea to introduce a new d_path_nolock() for
> file_dentry_name()?
> > In d_path_nolock(), if it detects that there is conflicts with mount_lock
> > or rename_lock, then returned NULL as a name of that vfsmount?
>
> Just what does vfsmount have to do with rename_lock?  And what's the point
> of the entire mess, anyway?

Sorry, do you suggest not considering rename_lock/mount_lock at all for file_dentry_name()?

--
Cheers,
Justin (Jia He)


IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.
Eric W. Biederman April 30, 2021, 6:50 p.m. UTC | #18
Linus Torvalds <torvalds@linux-foundation.org> writes:

> On Wed, Apr 28, 2021 at 11:40 PM Rasmus Villemoes
> <linux@rasmusvillemoes.dk> wrote:
>>
>> > That also does explain the arguably odd %pD defaults: %pd came first,
>> > and then %pD came afterwards.
>>
>> Eh? 4b6ccca701ef5977d0ffbc2c932430dea88b38b6 added them both at the same
>> time.
>
> Ahh, I looked at "git blame", and saw that file_dentry_name() was
> added later. But that turns out to have been an additional fix on top,
> not actually "later support".
>
> Looking more at that code, I am starting to think that
> "file_dentry_name()" simply shouldn't use "dentry_name()" at all.
> Despite that shared code origin, and despite that similar letter
> choice (lower-vs-upper case), a dentry and a file really are very very
> different from a name standpoint.
>
> And it's not the "a filename is the whale pathname, and a dentry has
> its own private dentry name" issue. It's really that the 'struct file'
> contains a _path_ - which is not just the dentry pointer, but the
> 'struct vfsmount' pointer too.
>
> So '%pD' really *could* get the real path right (because it has all
> the required information) in ways that '%pd' fundamentally cannot.
>
> At the same time, I really don't like printk specifiers to take any
> real locks (ie mount_lock or rename_lock), so I wouldn't want them to
> use the full  d_path() logic.

Well prepend_path the core of d_path, which is essentially the logic
I think you are suggesting to use does:
read_seqbegin_or_lock(&mount_lock, ...);
read_seqbegin_or_lock(&rename_lock, ...);

A printk specific variant could easily be modified to always restart or
to simply ignore renames and changes to the mount tree.  There are
always the corner cases when there is no sensible full path to display.
A rename or a mount namespace operation could be handled like one of
those.

Eric
Linus Torvalds April 30, 2021, 6:58 p.m. UTC | #19
On Thu, Apr 29, 2021 at 8:21 PM Al Viro <viro@zeniv.linux.org.uk> wrote:
>
> Just what does vfsmount have to do with rename_lock?  And what's the point
> of the entire mess, anyway?

Currently "%pD" doesn't actually show a truly valid pathname.

So we have three cases:

 (a) __d_path and friends get the name right, but are being overly
careful about it, and take mount_lock and rename_lock in prepend_path

 (b) dentry_path() doesn't get the actual path name right (only the
in-filesystem one), and takes rename_lock in __dentry_path

 (c) for the vsnprintf case, dentry_name() is the nice lockless "good
for debugging and printk" that doesn't take any locks at all, and
optimistically gives a valid end result, even if it's perhaps not
*THE* valid end result

Basically, the vsnprintf case does the right thing for dentries, and
the whole "you can use this for debugging messages even when you hold
the rename lock" etc.

So (c) is the "debug messages version of (b)".

But there is no "debug messages version of (a)", which is what would
be good for %pD.

You can see it in how the s390 hmcdriv thing does that

        pr_debug("open file '/dev/%pD' with return code %d\n", fp, rc);

which is really just garbage: the "/dev/" part is just a guess, but
yes, if /dev is devtmpfs - like it usually is - then '%pD' simply
doesn't do the right thing (even if it had '%pD2')

              Linus
Linus Torvalds April 30, 2021, 7:02 p.m. UTC | #20
On Fri, Apr 30, 2021 at 11:50 AM Eric W. Biederman
<ebiederm@xmission.com> wrote:
>
> A printk specific variant could easily be modified to always restart or
> to simply ignore renames and changes to the mount tree.

Exactly. I think a "ignore renames and mount tree changes" version for
printk would be the right thing.

Yeah, you can in theory get inconsistent results, but everything is
RCU-protected, so you'd get the same kind of "its' kind of valid, but
in race situations you might get a mix of two components" that '%pd'
gives for a dentry case.

That would allow people to use '%pD' and get reasonable results,
without having them actually interact with locks (that may or may not
be held by the thread trying to print debug messages).

           Linus