diff mbox series

[v2] file: reclaim 24 bytes from f_owner

Message ID 20240813-work-f_owner-v2-1-4e9343a79f9f@kernel.org (mailing list archive)
State New
Headers show
Series [v2] file: reclaim 24 bytes from f_owner | expand

Commit Message

Christian Brauner Aug. 13, 2024, 12:30 p.m. UTC
We do embedd struct fown_struct into struct file letting it take up 32
bytes in total. We could tweak struct fown_struct to be more compact but
really it shouldn't even be embedded in struct file in the first place.

Instead, actual users of struct fown_struct should allocate the struct
on demand. This frees up 24 bytes in struct file.

That will have some potentially user-visible changes for the ownership
fcntl()s. Some of them can now fail due to allocation failures.
Practically, that probably will almost never happen as the allocations
are small and they only happen once per file.

The fown_struct is used during kill_fasync() which is used by e.g.,
pipes to generate a SIGIO signal. Sending of such signals is conditional
on userspace having set an owner for the file using one of the F_OWNER
fcntl()s. Such users will be unaffected if struct fown_struct is
allocated during the fcntl() call.

There are a few subsystems that call __f_setown() expecting
file->f_owner to be allocated:

(1) tun devices
    file->f_op->fasync::tun_chr_fasync()
    -> __f_setown()

    There are no callers of tun_chr_fasync().

(2) tty devices

    file->f_op->fasync::tty_fasync()
    -> __tty_fasync()
       -> __f_setown()

    tty_fasync() has no additional callers but __tty_fasync() has. Note
    that __tty_fasync() only calls __f_setown() if the @on argument is
    true. It's called from:

    file->f_op->release::tty_release()
    -> tty_release()
       -> __tty_fasync()
          -> __f_setown()

    tty_release() calls __tty_fasync() with @on false
    => __f_setown() is never called from tty_release().
       => All callers of tty_release() are safe as well.

    file->f_op->release::tty_open()
    -> tty_release()
       -> __tty_fasync()
          -> __f_setown()

    __tty_hangup() calls __tty_fasync() with @on false
    => __f_setown() is never called from tty_release().
       => All callers of __tty_hangup() are safe as well.

From the callchains it's obvious that (1) and (2) end up getting called
via file->f_op->fasync(). That can happen either through the F_SETFL
fcntl() with the FASYNC flag raised or via the FIOASYNC ioctl(). If
FASYNC is requested and the file isn't already FASYNC then
file->f_op->fasync() is called with @on true which ends up causing both
(1) and (2) to call __f_setown().

(1) and (2) are the only subsystems that call __f_setown() from the
file->f_op->fasync() handler. So both (1) and (2) have been updated to
allocate a struct fown_struct prior to calling fasync_helper() to
register with the fasync infrastructure. That's safe as they both call
fasync_helper() which also does allocations if @on is true.

The other interesting case are file leases:

(3) file leases
    lease_manager_ops->lm_setup::lease_setup()
    -> __f_setown()

    Which in turn is called from:

    generic_add_lease()
    -> lease_manager_ops->lm_setup::lease_setup()
       -> __f_setown()

So here again we can simply make generic_add_lease() allocate struct
fown_struct prior to the lease_manager_ops->lm_setup::lease_setup()
which happens under a spinlock.

With that the two remaining subsystems that call __f_setown() are:

(4) dnotify
(5) sockets

Both have their own custom ioctls to set struct fown_struct and both
have been converted to allocate a struct fown_struct on demand from
their respective ioctls.

Interactions with O_PATH are fine as well e.g., when opening a /dev/tty
as O_PATH then no file->f_op->open() happens thus no file->f_owner is
allocated. That's fine as no file operation will be set for those and
the device has never been opened. fcntl()s called on such things will
just allocate a ->f_owner on demand. Although I have zero idea why'd you
care about f_owner on an O_PATH fd.

Signed-off-by: Christian Brauner <brauner@kernel.org>
---
* There's no more cleanup macros used in this version as we can solve
  that all much simpler.
* Survives LTP which tests a bunch of that stuff.
* Survives perf's watermak signal tests which make use of FASYNC.

Goes into -next unless I hear objections.
---

---
 drivers/net/tun.c           |   6 ++
 drivers/tty/tty_io.c        |   6 ++
 fs/fcntl.c                  | 153 ++++++++++++++++++++++++++++++++++----------
 fs/file_table.c             |   6 +-
 fs/locks.c                  |   6 +-
 fs/notify/dnotify/dnotify.c |   6 +-
 include/linux/fs.h          |  11 +++-
 net/core/sock.c             |   2 +-
 security/selinux/hooks.c    |   2 +-
 security/smack/smack_lsm.c  |   2 +-
 10 files changed, 157 insertions(+), 43 deletions(-)


---
base-commit: 8400291e289ee6b2bf9779ff1c83a291501f017b
change-id: 20240813-work-f_owner-0fbbc50f9671

Comments

Mateusz Guzik Aug. 13, 2024, 12:42 p.m. UTC | #1
On Tue, Aug 13, 2024 at 2:31 PM Christian Brauner <brauner@kernel.org> wrote:
> -int send_sigurg(struct fown_struct *fown)
> +int send_sigurg(struct file *file)
>  {
> +       struct fown_struct *fown;
>         struct task_struct *p;
>         enum pid_type type;
>         struct pid *pid;
>         unsigned long flags;
>         int ret = 0;
>
> +       fown = file_f_owner(file);
> +       if (fown)
> +               return 0;
> +

if (!fown) ?

>         read_lock_irqsave(&fown->lock, flags);
>
>         type = fown->pid_type;
> @@ -1027,13 +1109,16 @@ static void kill_fasync_rcu(struct fasync_struct *fa, int sig, int band)
Christian Brauner Aug. 13, 2024, 1:01 p.m. UTC | #2
On Tue, Aug 13, 2024 at 02:42:11PM GMT, Mateusz Guzik wrote:
> On Tue, Aug 13, 2024 at 2:31 PM Christian Brauner <brauner@kernel.org> wrote:
> > -int send_sigurg(struct fown_struct *fown)
> > +int send_sigurg(struct file *file)
> >  {
> > +       struct fown_struct *fown;
> >         struct task_struct *p;
> >         enum pid_type type;
> >         struct pid *pid;
> >         unsigned long flags;
> >         int ret = 0;
> >
> > +       fown = file_f_owner(file);
> > +       if (fown)
> > +               return 0;
> > +
> 
> if (!fown) ?

Bah, I had fixed that in-tree. Thanks!
Jeff Layton Aug. 13, 2024, 1:02 p.m. UTC | #3
On Tue, 2024-08-13 at 14:30 +0200, Christian Brauner wrote:
> We do embedd struct fown_struct into struct file letting it take up 32
> bytes in total. We could tweak struct fown_struct to be more compact but
> really it shouldn't even be embedded in struct file in the first place.
> 
> Instead, actual users of struct fown_struct should allocate the struct
> on demand. This frees up 24 bytes in struct file.
> 
> That will have some potentially user-visible changes for the ownership
> fcntl()s. Some of them can now fail due to allocation failures.
> Practically, that probably will almost never happen as the allocations
> are small and they only happen once per file.
> 
> The fown_struct is used during kill_fasync() which is used by e.g.,
> pipes to generate a SIGIO signal. Sending of such signals is conditional
> on userspace having set an owner for the file using one of the F_OWNER
> fcntl()s. Such users will be unaffected if struct fown_struct is
> allocated during the fcntl() call.
> 
> There are a few subsystems that call __f_setown() expecting
> file->f_owner to be allocated:
> 
> (1) tun devices
>     file->f_op->fasync::tun_chr_fasync()
>     -> __f_setown()
> 
>     There are no callers of tun_chr_fasync().
> 
> (2) tty devices
> 
>     file->f_op->fasync::tty_fasync()
>     -> __tty_fasync()
>        -> __f_setown()
> 
>     tty_fasync() has no additional callers but __tty_fasync() has. Note
>     that __tty_fasync() only calls __f_setown() if the @on argument is
>     true. It's called from:
> 
>     file->f_op->release::tty_release()
>     -> tty_release()
>        -> __tty_fasync()
>           -> __f_setown()
> 
>     tty_release() calls __tty_fasync() with @on false
>     => __f_setown() is never called from tty_release().
>        => All callers of tty_release() are safe as well.
> 
>     file->f_op->release::tty_open()
>     -> tty_release()
>        -> __tty_fasync()
>           -> __f_setown()
> 
>     __tty_hangup() calls __tty_fasync() with @on false
>     => __f_setown() is never called from tty_release().
>        => All callers of __tty_hangup() are safe as well.
> 
> From the callchains it's obvious that (1) and (2) end up getting called
> via file->f_op->fasync(). That can happen either through the F_SETFL
> fcntl() with the FASYNC flag raised or via the FIOASYNC ioctl(). If
> FASYNC is requested and the file isn't already FASYNC then
> file->f_op->fasync() is called with @on true which ends up causing both
> (1) and (2) to call __f_setown().
> 
> (1) and (2) are the only subsystems that call __f_setown() from the
> file->f_op->fasync() handler. So both (1) and (2) have been updated to
> allocate a struct fown_struct prior to calling fasync_helper() to
> register with the fasync infrastructure. That's safe as they both call
> fasync_helper() which also does allocations if @on is true.
> 
> The other interesting case are file leases:
> 
> (3) file leases
>     lease_manager_ops->lm_setup::lease_setup()
>     -> __f_setown()
> 
>     Which in turn is called from:
> 
>     generic_add_lease()
>     -> lease_manager_ops->lm_setup::lease_setup()
>        -> __f_setown()
> 
> So here again we can simply make generic_add_lease() allocate struct
> fown_struct prior to the lease_manager_ops->lm_setup::lease_setup()
> which happens under a spinlock.
> 
> With that the two remaining subsystems that call __f_setown() are:
> 
> (4) dnotify
> (5) sockets
> 
> Both have their own custom ioctls to set struct fown_struct and both
> have been converted to allocate a struct fown_struct on demand from
> their respective ioctls.
> 
> Interactions with O_PATH are fine as well e.g., when opening a /dev/tty
> as O_PATH then no file->f_op->open() happens thus no file->f_owner is
> allocated. That's fine as no file operation will be set for those and
> the device has never been opened. fcntl()s called on such things will
> just allocate a ->f_owner on demand. Although I have zero idea why'd you
> care about f_owner on an O_PATH fd.
> 
> Signed-off-by: Christian Brauner <brauner@kernel.org>
> ---
> * There's no more cleanup macros used in this version as we can solve
>   that all much simpler.
> * Survives LTP which tests a bunch of that stuff.
> * Survives perf's watermak signal tests which make use of FASYNC.
> 
> Goes into -next unless I hear objections.
> ---
> 
> ---
>  drivers/net/tun.c           |   6 ++
>  drivers/tty/tty_io.c        |   6 ++
>  fs/fcntl.c                  | 153 ++++++++++++++++++++++++++++++++++----------
>  fs/file_table.c             |   6 +-
>  fs/locks.c                  |   6 +-
>  fs/notify/dnotify/dnotify.c |   6 +-
>  include/linux/fs.h          |  11 +++-
>  net/core/sock.c             |   2 +-
>  security/selinux/hooks.c    |   2 +-
>  security/smack/smack_lsm.c  |   2 +-
>  10 files changed, 157 insertions(+), 43 deletions(-)
> 
> diff --git a/drivers/net/tun.c b/drivers/net/tun.c
> index 1d06c560c5e6..6fe5e8f7017c 100644
> --- a/drivers/net/tun.c
> +++ b/drivers/net/tun.c
> @@ -3451,6 +3451,12 @@ static int tun_chr_fasync(int fd, struct file *file, int on)
>  	struct tun_file *tfile = file->private_data;
>  	int ret;
>  
> +	if (on) {
> +		ret = file_f_owner_allocate(file);
> +		if (ret)
> +			goto out;
> +	}
> +
>  	if ((ret = fasync_helper(fd, file, on, &tfile->fasync)) < 0)
>  		goto out;
>  
> diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
> index 407b0d87b7c1..7ae0c8934f42 100644
> --- a/drivers/tty/tty_io.c
> +++ b/drivers/tty/tty_io.c
> @@ -2225,6 +2225,12 @@ static int __tty_fasync(int fd, struct file *filp, int on)
>  	if (tty_paranoia_check(tty, file_inode(filp), "tty_fasync"))
>  		goto out;
>  
> +	if (on) {
> +		retval = file_f_owner_allocate(filp);
> +		if (retval)
> +			goto out;
> +	}
> +
>  	retval = fasync_helper(fd, filp, on, &tty->fasync);
>  	if (retval <= 0)
>  		goto out;
> diff --git a/fs/fcntl.c b/fs/fcntl.c
> index 300e5d9ad913..b002730fdcd1 100644
> --- a/fs/fcntl.c
> +++ b/fs/fcntl.c
> @@ -87,22 +87,53 @@ static int setfl(int fd, struct file * filp, unsigned int arg)
>  	return error;
>  }
>  
> +/*
> + * Allocate an file->f_owner struct if it doesn't exist, handling racing
> + * allocations correctly.
> + */
> +int file_f_owner_allocate(struct file *file)
> +{
> +	struct fown_struct *f_owner;
> +
> +	f_owner = file_f_owner(file);
> +	if (f_owner)
> +		return 0;
> +
> +	f_owner = kzalloc(sizeof(struct fown_struct), GFP_KERNEL);
> +	if (!f_owner)
> +		return -ENOMEM;
> +
> +	rwlock_init(&f_owner->lock);
> +	f_owner->file = file;
> +	/* If someone else raced us, drop our allocation. */
> +	if (unlikely(cmpxchg(&file->f_owner, NULL, f_owner)))

nit: try_cmpxchg generates better asm and should be fine here.

> +		kfree(f_owner);
> +	return 0;
> +}
> +EXPORT_SYMBOL(file_f_owner_allocate);
> +
>  static void f_modown(struct file *filp, struct pid *pid, enum pid_type type,
>                       int force)
>  {
> -	write_lock_irq(&filp->f_owner.lock);
> -	if (force || !filp->f_owner.pid) {
> -		put_pid(filp->f_owner.pid);
> -		filp->f_owner.pid = get_pid(pid);
> -		filp->f_owner.pid_type = type;
> +	struct fown_struct *f_owner;
> +
> +	f_owner = file_f_owner(filp);
> +	if (WARN_ON_ONCE(!f_owner)) 
> +		return;
> +
> +	write_lock_irq(&f_owner->lock);
> +	if (force || !f_owner->pid) {
> +		put_pid(f_owner->pid);
> +		f_owner->pid = get_pid(pid);
> +		f_owner->pid_type = type;
>  
>  		if (pid) {
>  			const struct cred *cred = current_cred();
> -			filp->f_owner.uid = cred->uid;
> -			filp->f_owner.euid = cred->euid;
> +			f_owner->uid = cred->uid;
> +			f_owner->euid = cred->euid;
>  		}
>  	}
> -	write_unlock_irq(&filp->f_owner.lock);
> +	write_unlock_irq(&f_owner->lock);
>  }
>  
>  void __f_setown(struct file *filp, struct pid *pid, enum pid_type type,
> @@ -119,6 +150,8 @@ int f_setown(struct file *filp, int who, int force)
>  	struct pid *pid = NULL;
>  	int ret = 0;
>  
> +	might_sleep();
> +
>  	type = PIDTYPE_TGID;
>  	if (who < 0) {
>  		/* avoid overflow below */
> @@ -129,6 +162,10 @@ int f_setown(struct file *filp, int who, int force)
>  		who = -who;
>  	}
>  
> +	ret = file_f_owner_allocate(filp);
> +	if (ret)
> +		return ret;
> +
>  	rcu_read_lock();
>  	if (who) {
>  		pid = find_vpid(who);
> @@ -152,16 +189,21 @@ void f_delown(struct file *filp)
>  pid_t f_getown(struct file *filp)
>  {
>  	pid_t pid = 0;
> +	struct fown_struct *f_owner;
>  
> -	read_lock_irq(&filp->f_owner.lock);
> +	f_owner = file_f_owner(filp);
> +	if (!f_owner)
> +		return pid;
> +
> +	read_lock_irq(&f_owner->lock);
>  	rcu_read_lock();
> -	if (pid_task(filp->f_owner.pid, filp->f_owner.pid_type)) {
> -		pid = pid_vnr(filp->f_owner.pid);
> -		if (filp->f_owner.pid_type == PIDTYPE_PGID)
> +	if (pid_task(f_owner->pid, f_owner->pid_type)) {
> +		pid = pid_vnr(f_owner->pid);
> +		if (f_owner->pid_type == PIDTYPE_PGID)
>  			pid = -pid;
>  	}
>  	rcu_read_unlock();
> -	read_unlock_irq(&filp->f_owner.lock);
> +	read_unlock_irq(&f_owner->lock);
>  	return pid;
>  }
>  
> @@ -194,6 +236,10 @@ static int f_setown_ex(struct file *filp, unsigned long arg)
>  		return -EINVAL;
>  	}
>  
> +	ret = file_f_owner_allocate(filp);
> +	if (ret)
> +		return ret;
> +
>  	rcu_read_lock();
>  	pid = find_vpid(owner.pid);
>  	if (owner.pid && !pid)
> @@ -210,13 +256,20 @@ static int f_getown_ex(struct file *filp, unsigned long arg)
>  	struct f_owner_ex __user *owner_p = (void __user *)arg;
>  	struct f_owner_ex owner = {};
>  	int ret = 0;
> +	struct fown_struct *f_owner;
> +	enum pid_type pid_type = PIDTYPE_PID;
>  
> -	read_lock_irq(&filp->f_owner.lock);
> -	rcu_read_lock();
> -	if (pid_task(filp->f_owner.pid, filp->f_owner.pid_type))
> -		owner.pid = pid_vnr(filp->f_owner.pid);
> -	rcu_read_unlock();
> -	switch (filp->f_owner.pid_type) {
> +	f_owner = file_f_owner(filp);
> +	if (f_owner) {
> +		read_lock_irq(&f_owner->lock);
> +		rcu_read_lock();
> +		if (pid_task(f_owner->pid, f_owner->pid_type))
> +			owner.pid = pid_vnr(f_owner->pid);
> +		rcu_read_unlock();
> +		pid_type = f_owner->pid_type;
> +	}
> +
> +	switch (pid_type) {
>  	case PIDTYPE_PID:
>  		owner.type = F_OWNER_TID;
>  		break;
> @@ -234,7 +287,8 @@ static int f_getown_ex(struct file *filp, unsigned long arg)
>  		ret = -EINVAL;
>  		break;
>  	}
> -	read_unlock_irq(&filp->f_owner.lock);
> +	if (f_owner)
> +		read_unlock_irq(&f_owner->lock);
>  
>  	if (!ret) {
>  		ret = copy_to_user(owner_p, &owner, sizeof(owner));
> @@ -248,14 +302,18 @@ static int f_getown_ex(struct file *filp, unsigned long arg)
>  static int f_getowner_uids(struct file *filp, unsigned long arg)
>  {
>  	struct user_namespace *user_ns = current_user_ns();
> +	struct fown_struct *f_owner;
>  	uid_t __user *dst = (void __user *)arg;
> -	uid_t src[2];
> +	uid_t src[2] = {0, 0};
>  	int err;
>  
> -	read_lock_irq(&filp->f_owner.lock);
> -	src[0] = from_kuid(user_ns, filp->f_owner.uid);
> -	src[1] = from_kuid(user_ns, filp->f_owner.euid);
> -	read_unlock_irq(&filp->f_owner.lock);
> +	f_owner = file_f_owner(filp);
> +	if (f_owner) {
> +		read_lock_irq(&f_owner->lock);
> +		src[0] = from_kuid(user_ns, f_owner->uid);
> +		src[1] = from_kuid(user_ns, f_owner->euid);
> +		read_unlock_irq(&f_owner->lock);
> +	}
>  
>  	err  = put_user(src[0], &dst[0]);
>  	err |= put_user(src[1], &dst[1]);
> @@ -343,6 +401,30 @@ static long f_dupfd_query(int fd, struct file *filp)
>  	return f.file == filp;
>  }
>  
> +static int f_owner_sig(struct file *filp, int signum, bool setsig)
> +{
> +	int ret = 0;
> +	struct fown_struct *f_owner;
> +
> +	might_sleep();
> +
> +	if (setsig) {
> +		if (!valid_signal(signum))
> +			return -EINVAL;
> +
> +		ret = file_f_owner_allocate(filp);
> +		if (ret)
> +			return ret;
> +	}
> +
> +	f_owner = file_f_owner(filp);
> +	if (setsig)
> +		f_owner->signum = signum;
> +	else if (f_owner)
> +		ret = f_owner->signum;
> +	return ret;
> +}
> +
>  static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
>  		struct file *filp)
>  {
> @@ -421,15 +503,10 @@ static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
>  		err = f_getowner_uids(filp, arg);
>  		break;
>  	case F_GETSIG:
> -		err = filp->f_owner.signum;
> +		err = f_owner_sig(filp, 0, false);
>  		break;
>  	case F_SETSIG:
> -		/* arg == 0 restores default behaviour. */
> -		if (!valid_signal(argi)) {
> -			break;
> -		}
> -		err = 0;
> -		filp->f_owner.signum = argi;
> +		err = f_owner_sig(filp, argi, true);
>  		break;
>  	case F_GETLEASE:
>  		err = fcntl_getlease(filp);
> @@ -844,14 +921,19 @@ static void send_sigurg_to_task(struct task_struct *p,
>  		do_send_sig_info(SIGURG, SEND_SIG_PRIV, p, type);
>  }
>  
> -int send_sigurg(struct fown_struct *fown)
> +int send_sigurg(struct file *file)
>  {
> +	struct fown_struct *fown;
>  	struct task_struct *p;
>  	enum pid_type type;
>  	struct pid *pid;
>  	unsigned long flags;
>  	int ret = 0;
>  	
> +	fown = file_f_owner(file);
> +	if (fown)
> +		return 0;

This needs to be fixed (as Mateusz pointed out).

> +
>  	read_lock_irqsave(&fown->lock, flags);
>  
>  	type = fown->pid_type;
> @@ -1027,13 +1109,16 @@ static void kill_fasync_rcu(struct fasync_struct *fa, int sig, int band)
>  		}
>  		read_lock_irqsave(&fa->fa_lock, flags);
>  		if (fa->fa_file) {
> -			fown = &fa->fa_file->f_owner;
> +			fown = file_f_owner(fa->fa_file);
> +			if (!fown)
> +				goto next;
>  			/* Don't send SIGURG to processes which have not set a
>  			   queued signum: SIGURG has its own default signalling
>  			   mechanism. */
>  			if (!(sig == SIGURG && fown->signum == 0))
>  				send_sigio(fown, fa->fa_fd, band);
>  		}
> +next:
>  		read_unlock_irqrestore(&fa->fa_lock, flags);
>  		fa = rcu_dereference(fa->fa_next);
>  	}
> diff --git a/fs/file_table.c b/fs/file_table.c
> index ca7843dde56d..41ff037a8dc9 100644
> --- a/fs/file_table.c
> +++ b/fs/file_table.c
> @@ -155,7 +155,6 @@ static int init_file(struct file *f, int flags, const struct cred *cred)
>  		return error;
>  	}
>  
> -	rwlock_init(&f->f_owner.lock);
>  	spin_lock_init(&f->f_lock);
>  	mutex_init(&f->f_pos_lock);
>  	f->f_flags = flags;
> @@ -425,7 +424,10 @@ static void __fput(struct file *file)
>  		cdev_put(inode->i_cdev);
>  	}
>  	fops_put(file->f_op);
> -	put_pid(file->f_owner.pid);
> +	if (file->f_owner) {
> +		put_pid(file->f_owner->pid);
> +		kfree(file->f_owner);
> +	}
>  	put_file_access(file);
>  	dput(dentry);
>  	if (unlikely(mode & FMODE_NEED_UNMOUNT))
> diff --git a/fs/locks.c b/fs/locks.c
> index 9afb16e0683f..c0d312481b97 100644
> --- a/fs/locks.c
> +++ b/fs/locks.c
> @@ -1451,7 +1451,7 @@ int lease_modify(struct file_lease *fl, int arg, struct list_head *dispose)
>  		struct file *filp = fl->c.flc_file;
>  
>  		f_delown(filp);
> -		filp->f_owner.signum = 0;
> +		file_f_owner(filp)->signum = 0;
>  		fasync_helper(0, fl->c.flc_file, 0, &fl->fl_fasync);
>  		if (fl->fl_fasync != NULL) {
>  			printk(KERN_ERR "locks_delete_lock: fasync == %p\n", fl->fl_fasync);
> @@ -1783,6 +1783,10 @@ generic_add_lease(struct file *filp, int arg, struct file_lease **flp, void **pr
>  	lease = *flp;
>  	trace_generic_add_lease(inode, lease);
>  
> +	error = file_f_owner_allocate(filp);
> +	if (error)
> +		return error;
> +
>  	/* Note that arg is never F_UNLCK here */
>  	ctx = locks_get_lock_context(inode, arg);
>  	if (!ctx)
> diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
> index f3669403fabf..46440fbb8662 100644
> --- a/fs/notify/dnotify/dnotify.c
> +++ b/fs/notify/dnotify/dnotify.c
> @@ -110,7 +110,7 @@ static int dnotify_handle_event(struct fsnotify_mark *inode_mark, u32 mask,
>  			prev = &dn->dn_next;
>  			continue;
>  		}
> -		fown = &dn->dn_filp->f_owner;
> +		fown = file_f_owner(dn->dn_filp);
>  		send_sigio(fown, dn->dn_fd, POLL_MSG);
>  		if (dn->dn_mask & FS_DN_MULTISHOT)
>  			prev = &dn->dn_next;
> @@ -316,6 +316,10 @@ int fcntl_dirnotify(int fd, struct file *filp, unsigned int arg)
>  		goto out_err;
>  	}
>  
> +	error = file_f_owner_allocate(filp);
> +	if (error)
> +		goto out_err;
> +
>  	/* set up the new_fsn_mark and new_dn_mark */
>  	new_fsn_mark = &new_dn_mark->fsn_mark;
>  	fsnotify_init_mark(new_fsn_mark, dnotify_group);
> diff --git a/include/linux/fs.h b/include/linux/fs.h
> index fd34b5755c0b..319c566a9e98 100644
> --- a/include/linux/fs.h
> +++ b/include/linux/fs.h
> @@ -947,6 +947,7 @@ static inline unsigned imajor(const struct inode *inode)
>  }
>  
>  struct fown_struct {
> +	struct file *file;	/* backpointer for security modules */

This struct was 32 bytes before (on x86_64). Now it'll be 40. That's
fine, but it may be worthwhile to create a dedicated slabcache for this
now, since it's no longer a power-of-2 size.

>  	rwlock_t lock;          /* protects pid, uid, euid fields */
>  	struct pid *pid;	/* pid or -pgrp where SIGIO should be sent */
>  	enum pid_type pid_type;	/* Kind of process group SIGIO should be sent to */
> @@ -1011,7 +1012,7 @@ struct file {
>  	struct mutex		f_pos_lock;
>  	loff_t			f_pos;
>  	unsigned int		f_flags;
> -	struct fown_struct	f_owner;
> +	struct fown_struct	*f_owner;
>  	const struct cred	*f_cred;
>  	struct file_ra_state	f_ra;
>  	struct path		f_path;
> @@ -1076,6 +1077,12 @@ struct file_lease;
>  #define OFFT_OFFSET_MAX	type_max(off_t)
>  #endif
>  
> +int file_f_owner_allocate(struct file *file);
> +static inline struct fown_struct *file_f_owner(const struct file *file)
> +{
> +	return READ_ONCE(file->f_owner);
> +}
> +
>  extern void send_sigio(struct fown_struct *fown, int fd, int band);
>  
>  static inline struct inode *file_inode(const struct file *f)
> @@ -1124,7 +1131,7 @@ extern void __f_setown(struct file *filp, struct pid *, enum pid_type, int force
>  extern int f_setown(struct file *filp, int who, int force);
>  extern void f_delown(struct file *filp);
>  extern pid_t f_getown(struct file *filp);
> -extern int send_sigurg(struct fown_struct *fown);
> +extern int send_sigurg(struct file *file);
>  
>  /*
>   * sb->s_flags.  Note that these mirror the equivalent MS_* flags where
> diff --git a/net/core/sock.c b/net/core/sock.c
> index 9abc4fe25953..bbe4c58470c3 100644
> --- a/net/core/sock.c
> +++ b/net/core/sock.c
> @@ -3429,7 +3429,7 @@ static void sock_def_destruct(struct sock *sk)
>  void sk_send_sigurg(struct sock *sk)
>  {
>  	if (sk->sk_socket && sk->sk_socket->file)
> -		if (send_sigurg(&sk->sk_socket->file->f_owner))
> +		if (send_sigurg(sk->sk_socket->file))
>  			sk_wake_async(sk, SOCK_WAKE_URG, POLL_PRI);
>  }
>  EXPORT_SYMBOL(sk_send_sigurg);
> diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
> index 55c78c318ccd..ed252cfba4e9 100644
> --- a/security/selinux/hooks.c
> +++ b/security/selinux/hooks.c
> @@ -3940,7 +3940,7 @@ static int selinux_file_send_sigiotask(struct task_struct *tsk,
>  	struct file_security_struct *fsec;
>  
>  	/* struct fown_struct is never outside the context of a struct file */
> -	file = container_of(fown, struct file, f_owner);
> +	file = fown->file;
>  
>  	fsec = selinux_file(file);
>  
> diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
> index 4164699cd4f6..cb33920ab67c 100644
> --- a/security/smack/smack_lsm.c
> +++ b/security/smack/smack_lsm.c
> @@ -1950,7 +1950,7 @@ static int smack_file_send_sigiotask(struct task_struct *tsk,
>  	/*
>  	 * struct fown_struct is never outside the context of a struct file
>  	 */
> -	file = container_of(fown, struct file, f_owner);
> +	file = fown->file;
>  
>  	/* we don't log here as rc can be overriden */
>  	blob = smack_file(file);
> 
> ---
> base-commit: 8400291e289ee6b2bf9779ff1c83a291501f017b
> change-id: 20240813-work-f_owner-0fbbc50f9671
> 

Aside from the bug that Mateusz pointed out, this looks fine to me.
Assuming you fix that bug:

Reviewed-by: Jeff Layton <jlayton@kernel.org>
Mateusz Guzik Aug. 13, 2024, 1:42 p.m. UTC | #4
On Tue, Aug 13, 2024 at 3:02 PM Jeff Layton <jlayton@kernel.org> wrote:
>
> On Tue, 2024-08-13 at 14:30 +0200, Christian Brauner wrote:
> > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > index fd34b5755c0b..319c566a9e98 100644
> > --- a/include/linux/fs.h
> > +++ b/include/linux/fs.h
> > @@ -947,6 +947,7 @@ static inline unsigned imajor(const struct inode *inode)
> >  }
> >
> >  struct fown_struct {
> > +     struct file *file;      /* backpointer for security modules */
>
> This struct was 32 bytes before (on x86_64). Now it'll be 40. That's
> fine, but it may be worthwhile to create a dedicated slabcache for this
> now, since it's no longer a power-of-2 size.
>

creating a dedicated slab would be a waste imo

If someone is concerned with memory usage here, I note that plausibly
the file pointer can be plumbed through sigio code, eliding any use of
the newly added ->file

However, a real daredevil would check if the thing to do is perhaps to
add a 48-byte malloc slab. But that would require quite a bit of
looking at allocations elsewhere, collecting stats and whatnot. Noting
this just in case, I have negative interest in looking at this.
Mateusz Guzik Aug. 13, 2024, 1:57 p.m. UTC | #5
On Tue, Aug 13, 2024 at 3:42 PM Mateusz Guzik <mjguzik@gmail.com> wrote:
>
> On Tue, Aug 13, 2024 at 3:02 PM Jeff Layton <jlayton@kernel.org> wrote:
> >
> > On Tue, 2024-08-13 at 14:30 +0200, Christian Brauner wrote:
> > > diff --git a/include/linux/fs.h b/include/linux/fs.h
> > > index fd34b5755c0b..319c566a9e98 100644
> > > --- a/include/linux/fs.h
> > > +++ b/include/linux/fs.h
> > > @@ -947,6 +947,7 @@ static inline unsigned imajor(const struct inode *inode)
> > >  }
> > >
> > >  struct fown_struct {
> > > +     struct file *file;      /* backpointer for security modules */
> >
> > This struct was 32 bytes before (on x86_64). Now it'll be 40. That's
> > fine, but it may be worthwhile to create a dedicated slabcache for this
> > now, since it's no longer a power-of-2 size.
> >
>
> creating a dedicated slab would be a waste imo
>
> If someone is concerned with memory usage here, I note that plausibly
> the file pointer can be plumbed through sigio code, eliding any use of
> the newly added ->file
>
> However, a real daredevil would check if the thing to do is perhaps to
> add a 48-byte malloc slab. But that would require quite a bit of
> looking at allocations elsewhere, collecting stats and whatnot. Noting
> this just in case, I have negative interest in looking at this.
>

Some massaging of the struct brings me back to 32 bytes despite the pointer:
- rwlock -> spinlock
- signum int->short
- pid_type to 1 byte enum

but I don't think making this work is necessary at this point

struct fown_struct_poc {
        struct file *              file;                 /*     0     8 */
        spinlock_t                 lock;                 /*     8     4 */
        short int                  signum;               /*    12     2 */

        /* Bitfield combined with previous fields */

        enum pid_type              pid_type:8;           /*    12:16  4 */

        /* XXX 8 bits hole, try to pack */

        struct pid *               pid;                  /*    16     8 */
        kuid_t                     uid;                  /*    24     4 */
        kuid_t                     euid;                 /*    28     4 */

        /* size: 32, cachelines: 1, members: 7 */
        /* sum members: 30 */
        /* sum bitfield members: 8 bits, bit holes: 1, sum bit holes: 8 bits */
        /* last cacheline: 32 bytes */
};
diff mbox series

Patch

diff --git a/drivers/net/tun.c b/drivers/net/tun.c
index 1d06c560c5e6..6fe5e8f7017c 100644
--- a/drivers/net/tun.c
+++ b/drivers/net/tun.c
@@ -3451,6 +3451,12 @@  static int tun_chr_fasync(int fd, struct file *file, int on)
 	struct tun_file *tfile = file->private_data;
 	int ret;
 
+	if (on) {
+		ret = file_f_owner_allocate(file);
+		if (ret)
+			goto out;
+	}
+
 	if ((ret = fasync_helper(fd, file, on, &tfile->fasync)) < 0)
 		goto out;
 
diff --git a/drivers/tty/tty_io.c b/drivers/tty/tty_io.c
index 407b0d87b7c1..7ae0c8934f42 100644
--- a/drivers/tty/tty_io.c
+++ b/drivers/tty/tty_io.c
@@ -2225,6 +2225,12 @@  static int __tty_fasync(int fd, struct file *filp, int on)
 	if (tty_paranoia_check(tty, file_inode(filp), "tty_fasync"))
 		goto out;
 
+	if (on) {
+		retval = file_f_owner_allocate(filp);
+		if (retval)
+			goto out;
+	}
+
 	retval = fasync_helper(fd, filp, on, &tty->fasync);
 	if (retval <= 0)
 		goto out;
diff --git a/fs/fcntl.c b/fs/fcntl.c
index 300e5d9ad913..b002730fdcd1 100644
--- a/fs/fcntl.c
+++ b/fs/fcntl.c
@@ -87,22 +87,53 @@  static int setfl(int fd, struct file * filp, unsigned int arg)
 	return error;
 }
 
+/*
+ * Allocate an file->f_owner struct if it doesn't exist, handling racing
+ * allocations correctly.
+ */
+int file_f_owner_allocate(struct file *file)
+{
+	struct fown_struct *f_owner;
+
+	f_owner = file_f_owner(file);
+	if (f_owner)
+		return 0;
+
+	f_owner = kzalloc(sizeof(struct fown_struct), GFP_KERNEL);
+	if (!f_owner)
+		return -ENOMEM;
+
+	rwlock_init(&f_owner->lock);
+	f_owner->file = file;
+	/* If someone else raced us, drop our allocation. */
+	if (unlikely(cmpxchg(&file->f_owner, NULL, f_owner)))
+		kfree(f_owner);
+	return 0;
+}
+EXPORT_SYMBOL(file_f_owner_allocate);
+
 static void f_modown(struct file *filp, struct pid *pid, enum pid_type type,
                      int force)
 {
-	write_lock_irq(&filp->f_owner.lock);
-	if (force || !filp->f_owner.pid) {
-		put_pid(filp->f_owner.pid);
-		filp->f_owner.pid = get_pid(pid);
-		filp->f_owner.pid_type = type;
+	struct fown_struct *f_owner;
+
+	f_owner = file_f_owner(filp);
+	if (WARN_ON_ONCE(!f_owner))
+		return;
+
+	write_lock_irq(&f_owner->lock);
+	if (force || !f_owner->pid) {
+		put_pid(f_owner->pid);
+		f_owner->pid = get_pid(pid);
+		f_owner->pid_type = type;
 
 		if (pid) {
 			const struct cred *cred = current_cred();
-			filp->f_owner.uid = cred->uid;
-			filp->f_owner.euid = cred->euid;
+			f_owner->uid = cred->uid;
+			f_owner->euid = cred->euid;
 		}
 	}
-	write_unlock_irq(&filp->f_owner.lock);
+	write_unlock_irq(&f_owner->lock);
 }
 
 void __f_setown(struct file *filp, struct pid *pid, enum pid_type type,
@@ -119,6 +150,8 @@  int f_setown(struct file *filp, int who, int force)
 	struct pid *pid = NULL;
 	int ret = 0;
 
+	might_sleep();
+
 	type = PIDTYPE_TGID;
 	if (who < 0) {
 		/* avoid overflow below */
@@ -129,6 +162,10 @@  int f_setown(struct file *filp, int who, int force)
 		who = -who;
 	}
 
+	ret = file_f_owner_allocate(filp);
+	if (ret)
+		return ret;
+
 	rcu_read_lock();
 	if (who) {
 		pid = find_vpid(who);
@@ -152,16 +189,21 @@  void f_delown(struct file *filp)
 pid_t f_getown(struct file *filp)
 {
 	pid_t pid = 0;
+	struct fown_struct *f_owner;
 
-	read_lock_irq(&filp->f_owner.lock);
+	f_owner = file_f_owner(filp);
+	if (!f_owner)
+		return pid;
+
+	read_lock_irq(&f_owner->lock);
 	rcu_read_lock();
-	if (pid_task(filp->f_owner.pid, filp->f_owner.pid_type)) {
-		pid = pid_vnr(filp->f_owner.pid);
-		if (filp->f_owner.pid_type == PIDTYPE_PGID)
+	if (pid_task(f_owner->pid, f_owner->pid_type)) {
+		pid = pid_vnr(f_owner->pid);
+		if (f_owner->pid_type == PIDTYPE_PGID)
 			pid = -pid;
 	}
 	rcu_read_unlock();
-	read_unlock_irq(&filp->f_owner.lock);
+	read_unlock_irq(&f_owner->lock);
 	return pid;
 }
 
@@ -194,6 +236,10 @@  static int f_setown_ex(struct file *filp, unsigned long arg)
 		return -EINVAL;
 	}
 
+	ret = file_f_owner_allocate(filp);
+	if (ret)
+		return ret;
+
 	rcu_read_lock();
 	pid = find_vpid(owner.pid);
 	if (owner.pid && !pid)
@@ -210,13 +256,20 @@  static int f_getown_ex(struct file *filp, unsigned long arg)
 	struct f_owner_ex __user *owner_p = (void __user *)arg;
 	struct f_owner_ex owner = {};
 	int ret = 0;
+	struct fown_struct *f_owner;
+	enum pid_type pid_type = PIDTYPE_PID;
 
-	read_lock_irq(&filp->f_owner.lock);
-	rcu_read_lock();
-	if (pid_task(filp->f_owner.pid, filp->f_owner.pid_type))
-		owner.pid = pid_vnr(filp->f_owner.pid);
-	rcu_read_unlock();
-	switch (filp->f_owner.pid_type) {
+	f_owner = file_f_owner(filp);
+	if (f_owner) {
+		read_lock_irq(&f_owner->lock);
+		rcu_read_lock();
+		if (pid_task(f_owner->pid, f_owner->pid_type))
+			owner.pid = pid_vnr(f_owner->pid);
+		rcu_read_unlock();
+		pid_type = f_owner->pid_type;
+	}
+
+	switch (pid_type) {
 	case PIDTYPE_PID:
 		owner.type = F_OWNER_TID;
 		break;
@@ -234,7 +287,8 @@  static int f_getown_ex(struct file *filp, unsigned long arg)
 		ret = -EINVAL;
 		break;
 	}
-	read_unlock_irq(&filp->f_owner.lock);
+	if (f_owner)
+		read_unlock_irq(&f_owner->lock);
 
 	if (!ret) {
 		ret = copy_to_user(owner_p, &owner, sizeof(owner));
@@ -248,14 +302,18 @@  static int f_getown_ex(struct file *filp, unsigned long arg)
 static int f_getowner_uids(struct file *filp, unsigned long arg)
 {
 	struct user_namespace *user_ns = current_user_ns();
+	struct fown_struct *f_owner;
 	uid_t __user *dst = (void __user *)arg;
-	uid_t src[2];
+	uid_t src[2] = {0, 0};
 	int err;
 
-	read_lock_irq(&filp->f_owner.lock);
-	src[0] = from_kuid(user_ns, filp->f_owner.uid);
-	src[1] = from_kuid(user_ns, filp->f_owner.euid);
-	read_unlock_irq(&filp->f_owner.lock);
+	f_owner = file_f_owner(filp);
+	if (f_owner) {
+		read_lock_irq(&f_owner->lock);
+		src[0] = from_kuid(user_ns, f_owner->uid);
+		src[1] = from_kuid(user_ns, f_owner->euid);
+		read_unlock_irq(&f_owner->lock);
+	}
 
 	err  = put_user(src[0], &dst[0]);
 	err |= put_user(src[1], &dst[1]);
@@ -343,6 +401,30 @@  static long f_dupfd_query(int fd, struct file *filp)
 	return f.file == filp;
 }
 
+static int f_owner_sig(struct file *filp, int signum, bool setsig)
+{
+	int ret = 0;
+	struct fown_struct *f_owner;
+
+	might_sleep();
+
+	if (setsig) {
+		if (!valid_signal(signum))
+			return -EINVAL;
+
+		ret = file_f_owner_allocate(filp);
+		if (ret)
+			return ret;
+	}
+
+	f_owner = file_f_owner(filp);
+	if (setsig)
+		f_owner->signum = signum;
+	else if (f_owner)
+		ret = f_owner->signum;
+	return ret;
+}
+
 static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
 		struct file *filp)
 {
@@ -421,15 +503,10 @@  static long do_fcntl(int fd, unsigned int cmd, unsigned long arg,
 		err = f_getowner_uids(filp, arg);
 		break;
 	case F_GETSIG:
-		err = filp->f_owner.signum;
+		err = f_owner_sig(filp, 0, false);
 		break;
 	case F_SETSIG:
-		/* arg == 0 restores default behaviour. */
-		if (!valid_signal(argi)) {
-			break;
-		}
-		err = 0;
-		filp->f_owner.signum = argi;
+		err = f_owner_sig(filp, argi, true);
 		break;
 	case F_GETLEASE:
 		err = fcntl_getlease(filp);
@@ -844,14 +921,19 @@  static void send_sigurg_to_task(struct task_struct *p,
 		do_send_sig_info(SIGURG, SEND_SIG_PRIV, p, type);
 }
 
-int send_sigurg(struct fown_struct *fown)
+int send_sigurg(struct file *file)
 {
+	struct fown_struct *fown;
 	struct task_struct *p;
 	enum pid_type type;
 	struct pid *pid;
 	unsigned long flags;
 	int ret = 0;
 	
+	fown = file_f_owner(file);
+	if (fown)
+		return 0;
+
 	read_lock_irqsave(&fown->lock, flags);
 
 	type = fown->pid_type;
@@ -1027,13 +1109,16 @@  static void kill_fasync_rcu(struct fasync_struct *fa, int sig, int band)
 		}
 		read_lock_irqsave(&fa->fa_lock, flags);
 		if (fa->fa_file) {
-			fown = &fa->fa_file->f_owner;
+			fown = file_f_owner(fa->fa_file);
+			if (!fown)
+				goto next;
 			/* Don't send SIGURG to processes which have not set a
 			   queued signum: SIGURG has its own default signalling
 			   mechanism. */
 			if (!(sig == SIGURG && fown->signum == 0))
 				send_sigio(fown, fa->fa_fd, band);
 		}
+next:
 		read_unlock_irqrestore(&fa->fa_lock, flags);
 		fa = rcu_dereference(fa->fa_next);
 	}
diff --git a/fs/file_table.c b/fs/file_table.c
index ca7843dde56d..41ff037a8dc9 100644
--- a/fs/file_table.c
+++ b/fs/file_table.c
@@ -155,7 +155,6 @@  static int init_file(struct file *f, int flags, const struct cred *cred)
 		return error;
 	}
 
-	rwlock_init(&f->f_owner.lock);
 	spin_lock_init(&f->f_lock);
 	mutex_init(&f->f_pos_lock);
 	f->f_flags = flags;
@@ -425,7 +424,10 @@  static void __fput(struct file *file)
 		cdev_put(inode->i_cdev);
 	}
 	fops_put(file->f_op);
-	put_pid(file->f_owner.pid);
+	if (file->f_owner) {
+		put_pid(file->f_owner->pid);
+		kfree(file->f_owner);
+	}
 	put_file_access(file);
 	dput(dentry);
 	if (unlikely(mode & FMODE_NEED_UNMOUNT))
diff --git a/fs/locks.c b/fs/locks.c
index 9afb16e0683f..c0d312481b97 100644
--- a/fs/locks.c
+++ b/fs/locks.c
@@ -1451,7 +1451,7 @@  int lease_modify(struct file_lease *fl, int arg, struct list_head *dispose)
 		struct file *filp = fl->c.flc_file;
 
 		f_delown(filp);
-		filp->f_owner.signum = 0;
+		file_f_owner(filp)->signum = 0;
 		fasync_helper(0, fl->c.flc_file, 0, &fl->fl_fasync);
 		if (fl->fl_fasync != NULL) {
 			printk(KERN_ERR "locks_delete_lock: fasync == %p\n", fl->fl_fasync);
@@ -1783,6 +1783,10 @@  generic_add_lease(struct file *filp, int arg, struct file_lease **flp, void **pr
 	lease = *flp;
 	trace_generic_add_lease(inode, lease);
 
+	error = file_f_owner_allocate(filp);
+	if (error)
+		return error;
+
 	/* Note that arg is never F_UNLCK here */
 	ctx = locks_get_lock_context(inode, arg);
 	if (!ctx)
diff --git a/fs/notify/dnotify/dnotify.c b/fs/notify/dnotify/dnotify.c
index f3669403fabf..46440fbb8662 100644
--- a/fs/notify/dnotify/dnotify.c
+++ b/fs/notify/dnotify/dnotify.c
@@ -110,7 +110,7 @@  static int dnotify_handle_event(struct fsnotify_mark *inode_mark, u32 mask,
 			prev = &dn->dn_next;
 			continue;
 		}
-		fown = &dn->dn_filp->f_owner;
+		fown = file_f_owner(dn->dn_filp);
 		send_sigio(fown, dn->dn_fd, POLL_MSG);
 		if (dn->dn_mask & FS_DN_MULTISHOT)
 			prev = &dn->dn_next;
@@ -316,6 +316,10 @@  int fcntl_dirnotify(int fd, struct file *filp, unsigned int arg)
 		goto out_err;
 	}
 
+	error = file_f_owner_allocate(filp);
+	if (error)
+		goto out_err;
+
 	/* set up the new_fsn_mark and new_dn_mark */
 	new_fsn_mark = &new_dn_mark->fsn_mark;
 	fsnotify_init_mark(new_fsn_mark, dnotify_group);
diff --git a/include/linux/fs.h b/include/linux/fs.h
index fd34b5755c0b..319c566a9e98 100644
--- a/include/linux/fs.h
+++ b/include/linux/fs.h
@@ -947,6 +947,7 @@  static inline unsigned imajor(const struct inode *inode)
 }
 
 struct fown_struct {
+	struct file *file;	/* backpointer for security modules */
 	rwlock_t lock;          /* protects pid, uid, euid fields */
 	struct pid *pid;	/* pid or -pgrp where SIGIO should be sent */
 	enum pid_type pid_type;	/* Kind of process group SIGIO should be sent to */
@@ -1011,7 +1012,7 @@  struct file {
 	struct mutex		f_pos_lock;
 	loff_t			f_pos;
 	unsigned int		f_flags;
-	struct fown_struct	f_owner;
+	struct fown_struct	*f_owner;
 	const struct cred	*f_cred;
 	struct file_ra_state	f_ra;
 	struct path		f_path;
@@ -1076,6 +1077,12 @@  struct file_lease;
 #define OFFT_OFFSET_MAX	type_max(off_t)
 #endif
 
+int file_f_owner_allocate(struct file *file);
+static inline struct fown_struct *file_f_owner(const struct file *file)
+{
+	return READ_ONCE(file->f_owner);
+}
+
 extern void send_sigio(struct fown_struct *fown, int fd, int band);
 
 static inline struct inode *file_inode(const struct file *f)
@@ -1124,7 +1131,7 @@  extern void __f_setown(struct file *filp, struct pid *, enum pid_type, int force
 extern int f_setown(struct file *filp, int who, int force);
 extern void f_delown(struct file *filp);
 extern pid_t f_getown(struct file *filp);
-extern int send_sigurg(struct fown_struct *fown);
+extern int send_sigurg(struct file *file);
 
 /*
  * sb->s_flags.  Note that these mirror the equivalent MS_* flags where
diff --git a/net/core/sock.c b/net/core/sock.c
index 9abc4fe25953..bbe4c58470c3 100644
--- a/net/core/sock.c
+++ b/net/core/sock.c
@@ -3429,7 +3429,7 @@  static void sock_def_destruct(struct sock *sk)
 void sk_send_sigurg(struct sock *sk)
 {
 	if (sk->sk_socket && sk->sk_socket->file)
-		if (send_sigurg(&sk->sk_socket->file->f_owner))
+		if (send_sigurg(sk->sk_socket->file))
 			sk_wake_async(sk, SOCK_WAKE_URG, POLL_PRI);
 }
 EXPORT_SYMBOL(sk_send_sigurg);
diff --git a/security/selinux/hooks.c b/security/selinux/hooks.c
index 55c78c318ccd..ed252cfba4e9 100644
--- a/security/selinux/hooks.c
+++ b/security/selinux/hooks.c
@@ -3940,7 +3940,7 @@  static int selinux_file_send_sigiotask(struct task_struct *tsk,
 	struct file_security_struct *fsec;
 
 	/* struct fown_struct is never outside the context of a struct file */
-	file = container_of(fown, struct file, f_owner);
+	file = fown->file;
 
 	fsec = selinux_file(file);
 
diff --git a/security/smack/smack_lsm.c b/security/smack/smack_lsm.c
index 4164699cd4f6..cb33920ab67c 100644
--- a/security/smack/smack_lsm.c
+++ b/security/smack/smack_lsm.c
@@ -1950,7 +1950,7 @@  static int smack_file_send_sigiotask(struct task_struct *tsk,
 	/*
 	 * struct fown_struct is never outside the context of a struct file
 	 */
-	file = container_of(fown, struct file, f_owner);
+	file = fown->file;
 
 	/* we don't log here as rc can be overriden */
 	blob = smack_file(file);