mbox series

[PATCHSET,v2,0/5] xfs: make attr forks permanent

Message ID 165740691606.73293.12753862498202082021.stgit@magnolia (mailing list archive)
Headers show
Series xfs: make attr forks permanent | expand

Message

Darrick J. Wong July 9, 2022, 10:48 p.m. UTC
Hi all,

This series fixes a use-after-free bug that syzbot uncovered.  The UAF
itself is a result of a race condition between getxattr and removexattr
because callers to getxattr do not necessarily take any sort of locks
before calling into the filesystem.

Although the race condition itself can be fixed through clever use of a
memory barrier, further consideration of the use cases of extended
attributes shows that most files always have at least one attribute, so
we might as well make them permanent.

Note: Patch 3 still needs review.

v2: Minor tweaks suggested by Dave, and convert some more macros to
helper functions.

If you're going to start using this mess, you probably ought to just
pull from my git trees, which are linked below.

This is an extraordinary way to destroy everything.  Enjoy!
Comments and questions are, as always, welcome.

--D

kernel git tree:
https://git.kernel.org/cgit/linux/kernel/git/djwong/xfs-linux.git/log/?h=make-attr-fork-permanent-5.20
---
 fs/xfs/libxfs/xfs_attr.c           |   20 ++++-----
 fs/xfs/libxfs/xfs_attr.h           |   10 ++--
 fs/xfs/libxfs/xfs_attr_leaf.c      |   29 ++++++-------
 fs/xfs/libxfs/xfs_bmap.c           |   81 ++++++++++++++++++------------------
 fs/xfs/libxfs/xfs_bmap_btree.c     |   10 ++--
 fs/xfs/libxfs/xfs_btree.c          |    4 +-
 fs/xfs/libxfs/xfs_dir2.c           |    2 -
 fs/xfs/libxfs/xfs_dir2_block.c     |    6 +--
 fs/xfs/libxfs/xfs_dir2_sf.c        |    8 ++--
 fs/xfs/libxfs/xfs_inode_buf.c      |    7 +--
 fs/xfs/libxfs/xfs_inode_fork.c     |   65 ++++++++++++++++-------------
 fs/xfs/libxfs/xfs_inode_fork.h     |   27 ++----------
 fs/xfs/libxfs/xfs_symlink_remote.c |    2 -
 fs/xfs/scrub/bmap.c                |   14 +++---
 fs/xfs/scrub/btree.c               |    2 -
 fs/xfs/scrub/dabtree.c             |    2 -
 fs/xfs/scrub/dir.c                 |    2 -
 fs/xfs/scrub/quota.c               |    2 -
 fs/xfs/scrub/symlink.c             |    6 +--
 fs/xfs/xfs_attr_inactive.c         |   16 +++----
 fs/xfs/xfs_attr_list.c             |    9 ++--
 fs/xfs/xfs_bmap_util.c             |   22 +++++-----
 fs/xfs/xfs_dir2_readdir.c          |    2 -
 fs/xfs/xfs_icache.c                |   12 +++--
 fs/xfs/xfs_inode.c                 |   24 +++++------
 fs/xfs/xfs_inode.h                 |   62 +++++++++++++++++++++++++++-
 fs/xfs/xfs_inode_item.c            |   58 +++++++++++++-------------
 fs/xfs/xfs_ioctl.c                 |    2 -
 fs/xfs/xfs_iomap.c                 |    8 ++--
 fs/xfs/xfs_iops.c                  |    2 -
 fs/xfs/xfs_itable.c                |    4 +-
 fs/xfs/xfs_qm.c                    |    2 -
 fs/xfs/xfs_reflink.c               |    6 +--
 fs/xfs/xfs_symlink.c               |    2 -
 fs/xfs/xfs_trace.h                 |    2 -
 35 files changed, 285 insertions(+), 247 deletions(-)

Comments

Christoph Hellwig July 11, 2022, 5:25 a.m. UTC | #1
On Sat, Jul 09, 2022 at 03:48:36PM -0700, Darrick J. Wong wrote:
> Although the race condition itself can be fixed through clever use of a
> memory barrier, further consideration of the use cases of extended
> attributes shows that most files always have at least one attribute, so
> we might as well make them permanent.

I kinda hat increase the size of the inode even more, but there is no
arguing about keeping nasty rarely used code simple vs micro-optimizing
it.  Do you have numbers on hand on how many inodes we can cache in
an order 0 or 1 cache before and after this?
Darrick J. Wong July 12, 2022, 1:53 a.m. UTC | #2
On Sun, Jul 10, 2022 at 10:25:43PM -0700, Christoph Hellwig wrote:
> On Sat, Jul 09, 2022 at 03:48:36PM -0700, Darrick J. Wong wrote:
> > Although the race condition itself can be fixed through clever use of a
> > memory barrier, further consideration of the use cases of extended
> > attributes shows that most files always have at least one attribute, so
> > we might as well make them permanent.
> 
> I kinda hat increase the size of the inode even more, but there is no
> arguing about keeping nasty rarely used code simple vs micro-optimizing
> it.  Do you have numbers on hand on how many inodes we can cache in
> an order 0 or 1 cache before and after this?

Hm.  On my laptop running 5.18, xfs_inode before the change was 928 bytes,
and here's what it looks like:

			928 bytes
Order	Pagebytes	Slack	Objs	Slack/Objs
0	4096		384	4	96
1	8192		768	8	96
2	16384		608	17	36
3	32768		288	35	9
4	65536		576	70	9

So I guess that's why it picks order-3 slabs.

On a freshly built djwong-dev kernel, it's now 976 bytes:

			976 bytes
Order	Pagebytes	Slack	Objs	Slack/Objs
0	4096		192	4	48
1	8192		384	8	48
2	16384		768	16	48
3	32768		560	33	17
4	65536		144	67	2

Here it seems to pick order-2 slabs, which admittedly isn't great.

--D
Christoph Hellwig July 12, 2022, 4:50 a.m. UTC | #3
On Mon, Jul 11, 2022 at 06:53:52PM -0700, Darrick J. Wong wrote:
> On a freshly built djwong-dev kernel, it's now 976 bytes:
> 
> 			976 bytes
> Order	Pagebytes	Slack	Objs	Slack/Objs
> 0	4096		192	4	48
> 1	8192		384	8	48
> 2	16384		768	16	48
> 3	32768		560	33	17
> 4	65536		144	67	2


Yes, we're getting at a point where the xfs_inode size starts to
really hurt. It's all the small incrementally and very useful changes
like this series or the in-memory unlinked inode list from Dave that
keep growing it.  I have no really good answer here except that I'd
need to dust off some of my project to reduze the size by removing
fields that we don't strictly need.