diff mbox series

[09/11] vfs, fscache: Add an IS_KERNEL_FILE() macro for the S_KERNEL_FILE flag

Message ID 164251409447.3435901.10092442643336534999.stgit@warthog.procyon.org.uk (mailing list archive)
State New, archived
Headers show
Series fscache, cachefiles: Rewrite fixes/updates | expand

Commit Message

David Howells Jan. 18, 2022, 1:54 p.m. UTC
Add an IS_KERNEL_FILE() macro to test the S_KERNEL_FILE inode flag as is
common practice for the other inode flags[1].

Suggested-by: Jeff Layton <jlayton@kernel.org>
Signed-off-by: David Howells <dhowells@redhat.com>
Reviewed-by: Jeff Layton <jlayton@kernel.org>
cc: linux-cachefs@redhat.com
Link: https://lore.kernel.org/r/88d7f8970dcc0fd0ead891b1f42f160b8d17d60e.camel@kernel.org/ [1]
---

 fs/cachefiles/namei.c |    6 +++---
 fs/namei.c            |    2 +-
 include/linux/fs.h    |    1 +
 3 files changed, 5 insertions(+), 4 deletions(-)

Comments

Christoph Hellwig Jan. 18, 2022, 4:23 p.m. UTC | #1
On Tue, Jan 18, 2022 at 01:54:54PM +0000, David Howells wrote:
> Add an IS_KERNEL_FILE() macro to test the S_KERNEL_FILE inode flag as is
> common practice for the other inode flags[1].

Please fix the flag to have a sensible name first, as the naming of the
flag and this new helper is utterly wrong as we already discussed.
David Howells Jan. 18, 2022, 5:40 p.m. UTC | #2
Christoph Hellwig <hch@infradead.org> wrote:

> On Tue, Jan 18, 2022 at 01:54:54PM +0000, David Howells wrote:
> > Add an IS_KERNEL_FILE() macro to test the S_KERNEL_FILE inode flag as is
> > common practice for the other inode flags[1].
> 
> Please fix the flag to have a sensible name first, as the naming of the
> flag and this new helper is utterly wrong as we already discussed.

And I suggested a new name, which you didn't comment on.

David
Christoph Hellwig Jan. 19, 2022, 5:20 a.m. UTC | #3
On Tue, Jan 18, 2022 at 05:40:14PM +0000, David Howells wrote:
> Christoph Hellwig <hch@infradead.org> wrote:
> 
> > On Tue, Jan 18, 2022 at 01:54:54PM +0000, David Howells wrote:
> > > Add an IS_KERNEL_FILE() macro to test the S_KERNEL_FILE inode flag as is
> > > common practice for the other inode flags[1].
> > 
> > Please fix the flag to have a sensible name first, as the naming of the
> > flag and this new helper is utterly wrong as we already discussed.
> 
> And I suggested a new name, which you didn't comment on.

Again, look at the semantics of the flag:  The only thing it does in the
VFS is to prevent a rmdir.  So you might want to name it after that.

Or in fact drop the flag entirely.  We don't have that kind of
protection for other in-kernel file use or important userspace daemons
either.  I can't see why cachefiles is the magic snowflake here that
suddenly needs semantics no one else has.
David Howells Jan. 19, 2022, 9:18 a.m. UTC | #4
Christoph Hellwig <hch@infradead.org> wrote:

> On Tue, Jan 18, 2022 at 05:40:14PM +0000, David Howells wrote:
> > Christoph Hellwig <hch@infradead.org> wrote:
> > 
> > > On Tue, Jan 18, 2022 at 01:54:54PM +0000, David Howells wrote:
> > > > Add an IS_KERNEL_FILE() macro to test the S_KERNEL_FILE inode flag as is
> > > > common practice for the other inode flags[1].
> > > 
> > > Please fix the flag to have a sensible name first, as the naming of the
> > > flag and this new helper is utterly wrong as we already discussed.
> > 
> > And I suggested a new name, which you didn't comment on.
> 
> Again, look at the semantics of the flag:  The only thing it does in the
> VFS is to prevent a rmdir.  So you might want to name it after that.
> 
> Or in fact drop the flag entirely.  We don't have that kind of
> protection for other in-kernel file use or important userspace daemons
> either.  I can't see why cachefiles is the magic snowflake here that
> suddenly needs semantics no one else has.

The flag cannot just be dropped - it's an important part of the interaction
with cachefilesd with regard to culling.  Culling to free up space is
offloaded to userspace rather than being done within the kernel.

Previously, cachefiles, the kernel module, had to maintain a huge tree of
records of every backing inode that it was currently using so that it could
forbid cachefilesd to cull one when cachefilesd asked.  I've reduced that to a
single bit flag on the inode struct, thereby saving both memory and time.  You
can argue whether it's worth sacrificing an inode flag bit for that, but the
flag can be reused for any other kernel service that wants to similarly mark
an inode in use.

Further, it's used as a mark to prevent cachefiles accidentally using an inode
twice - say someone misconfigures a second cache overlapping the first - and,
again, this works if some other kernel driver wants to mark inode it is using
in use.  Cachefiles will refuse to use them if it ever sees them, so no
problem there.

And it's not true that we don't have that kind of protection for other
in-kernel file use.  See S_SWAPFILE.  I did consider using that, but that has
other side effects.  I mentioned that perhaps I should make swapon set
S_KERNEL_FILE also.  Also blockdevs have some exclusion also, I think.

The rmdir thing should really apply to rename and unlink also.  That's to
prevent someone, cachefilesd included, causing cachefiles to malfunction by
removing the directories it created.  Possibly this should be a separate bit
to S_KERNEL_FILE, maybe S_NO_DELETE.

So I could change S_KERNEL_FILE to S_KERNEL_LOCK, say, or maybe S_EXCLUSIVE.

David
Christian Brauner Jan. 19, 2022, 11:15 a.m. UTC | #5
On Wed, Jan 19, 2022 at 09:18:05AM +0000, David Howells wrote:
> Christoph Hellwig <hch@infradead.org> wrote:
> 
> > On Tue, Jan 18, 2022 at 05:40:14PM +0000, David Howells wrote:
> > > Christoph Hellwig <hch@infradead.org> wrote:
> > > 
> > > > On Tue, Jan 18, 2022 at 01:54:54PM +0000, David Howells wrote:
> > > > > Add an IS_KERNEL_FILE() macro to test the S_KERNEL_FILE inode flag as is
> > > > > common practice for the other inode flags[1].
> > > > 
> > > > Please fix the flag to have a sensible name first, as the naming of the
> > > > flag and this new helper is utterly wrong as we already discussed.
> > > 
> > > And I suggested a new name, which you didn't comment on.
> > 
> > Again, look at the semantics of the flag:  The only thing it does in the
> > VFS is to prevent a rmdir.  So you might want to name it after that.
> > 
> > Or in fact drop the flag entirely.  We don't have that kind of
> > protection for other in-kernel file use or important userspace daemons
> > either.  I can't see why cachefiles is the magic snowflake here that
> > suddenly needs semantics no one else has.
> 
> The flag cannot just be dropped - it's an important part of the interaction
> with cachefilesd with regard to culling.  Culling to free up space is
> offloaded to userspace rather than being done within the kernel.
> 
> Previously, cachefiles, the kernel module, had to maintain a huge tree of
> records of every backing inode that it was currently using so that it could
> forbid cachefilesd to cull one when cachefilesd asked.  I've reduced that to a
> single bit flag on the inode struct, thereby saving both memory and time.  You
> can argue whether it's worth sacrificing an inode flag bit for that, but the
> flag can be reused for any other kernel service that wants to similarly mark
> an inode in use.
> 
> Further, it's used as a mark to prevent cachefiles accidentally using an inode
> twice - say someone misconfigures a second cache overlapping the first - and,
> again, this works if some other kernel driver wants to mark inode it is using
> in use.  Cachefiles will refuse to use them if it ever sees them, so no
> problem there.
> 
> And it's not true that we don't have that kind of protection for other
> in-kernel file use.  See S_SWAPFILE.  I did consider using that, but that has
> other side effects.  I mentioned that perhaps I should make swapon set
> S_KERNEL_FILE also.  Also blockdevs have some exclusion also, I think.
> 
> The rmdir thing should really apply to rename and unlink also.  That's to
> prevent someone, cachefilesd included, causing cachefiles to malfunction by
> removing the directories it created.  Possibly this should be a separate bit
> to S_KERNEL_FILE, maybe S_NO_DELETE.
> 
> So I could change S_KERNEL_FILE to S_KERNEL_LOCK, say, or maybe S_EXCLUSIVE.

[ ] S_REMOVE_PROTECTED
[ ] S_UNREMOVABLE
[ ] S_HELD_BUSY
[ ] S_KERNEL_BUSY
[ ] S_BUSY_INTERNAL
[ ] S_BUSY
[ ] S_HELD

?
Christoph Hellwig Jan. 20, 2022, 9:08 a.m. UTC | #6
On Wed, Jan 19, 2022 at 09:18:05AM +0000, David Howells wrote:
> The flag cannot just be dropped - it's an important part of the interaction
> with cachefilesd with regard to culling.  Culling to free up space is
> offloaded to userspace rather than being done within the kernel.
> 
> Previously, cachefiles, the kernel module, had to maintain a huge tree of
> records of every backing inode that it was currently using so that it could
> forbid cachefilesd to cull one when cachefilesd asked.  I've reduced that to a
> single bit flag on the inode struct, thereby saving both memory and time.  You
> can argue whether it's worth sacrificing an inode flag bit for that, but the
> flag can be reused for any other kernel service that wants to similarly mark
> an inode in use.

Which is a horrible interface.   But you tricked Linus into merging this
crap, so let's not pretent it is a "kernel file".  We have plenty of
those, basically every caller of filp_open is one.

It is something like "pinned for fscache/cachefiles", so name it that
way and add a big fat comment expaining the atrocities.
David Howells Jan. 20, 2022, 9:37 a.m. UTC | #7
Christoph Hellwig <hch@infradead.org> wrote:

> But you tricked Linus

Tricked?  I put a notice explicitly pointing out that I was adding it and
indicating that it might be controversial in the cover note and the pull
request and further explained the use in the patches that handle it.  I posted
the patches adding/using it a bunch of times to various mailing lists.  TYVM.

David
diff mbox series

Patch

diff --git a/fs/cachefiles/namei.c b/fs/cachefiles/namei.c
index f256c8aff7bb..04563f759e99 100644
--- a/fs/cachefiles/namei.c
+++ b/fs/cachefiles/namei.c
@@ -20,7 +20,7 @@  static bool __cachefiles_mark_inode_in_use(struct cachefiles_object *object,
 	struct inode *inode = d_backing_inode(dentry);
 	bool can_use = false;
 
-	if (!(inode->i_flags & S_KERNEL_FILE)) {
+	if (!IS_KERNEL_FILE(inode)) {
 		inode->i_flags |= S_KERNEL_FILE;
 		trace_cachefiles_mark_active(object, inode);
 		can_use = true;
@@ -746,7 +746,7 @@  static struct dentry *cachefiles_lookup_for_cull(struct cachefiles_cache *cache,
 		goto lookup_error;
 	if (d_is_negative(victim))
 		goto lookup_put;
-	if (d_inode(victim)->i_flags & S_KERNEL_FILE)
+	if (IS_KERNEL_FILE(d_inode(victim)))
 		goto lookup_busy;
 	return victim;
 
@@ -793,7 +793,7 @@  int cachefiles_cull(struct cachefiles_cache *cache, struct dentry *dir,
 	/* check to see if someone is using this object */
 	inode = d_inode(victim);
 	inode_lock(inode);
-	if (inode->i_flags & S_KERNEL_FILE) {
+	if (IS_KERNEL_FILE(inode)) {
 		ret = -EBUSY;
 	} else {
 		/* Stop the cache from picking it back up */
diff --git a/fs/namei.c b/fs/namei.c
index d81f04f8d818..c2175ab3849d 100644
--- a/fs/namei.c
+++ b/fs/namei.c
@@ -3959,7 +3959,7 @@  int vfs_rmdir(struct user_namespace *mnt_userns, struct inode *dir,
 
 	error = -EBUSY;
 	if (is_local_mountpoint(dentry) ||
-	    (dentry->d_inode->i_flags & S_KERNEL_FILE))
+	    IS_KERNEL_FILE(dentry->d_inode))
 		goto out;
 
 	error = security_inode_rmdir(dir, dentry);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index f5d3bf5b69a6..227497793282 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -2216,6 +2216,7 @@  static inline bool sb_rdonly(const struct super_block *sb) { return sb->s_flags
 #define IS_ENCRYPTED(inode)	((inode)->i_flags & S_ENCRYPTED)
 #define IS_CASEFOLDED(inode)	((inode)->i_flags & S_CASEFOLD)
 #define IS_VERITY(inode)	((inode)->i_flags & S_VERITY)
+#define IS_KERNEL_FILE(inode)	((inode)->i_flags & S_KERNEL_FILE)
 
 #define IS_WHITEOUT(inode)	(S_ISCHR(inode->i_mode) && \
 				 (inode)->i_rdev == WHITEOUT_DEV)