[0/6] xfs: CPU usage optimizations for realtime allocator

Message ID cover.1687296675.git.osandov@osandov.com (mailing list archive)

Message

Omar Sandoval June 20, 2023, 9:32 p.m. UTC
Hello,

Our distributed storage system uses XFS's realtime device support as a
way to split an XFS filesystem between an SSD and an HDD -- we configure
the HDD as the realtime device so that metadata goes on the SSD and data
goes on the HDD.

We've been running this in production for a few years now, so we have
some fairly fragmented filesystems. This has exposed various CPU
inefficiencies in the realtime allocator. These became even worse when
we experimented with using XFS_XFLAG_EXTSIZE to force files to be
allocated contiguously.
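
For readers unfamiliar with extent size hints, here is a minimal userspace sketch of how one might be set on an open file. It is an illustration, not code from this series. It assumes the generic FS_XFLAG_EXTSIZE flag and struct fsxattr from <linux/fs.h>, which are taken here to correspond to the XFS_XFLAG_EXTSIZE flag named above. The helper names fsx_set_extsize() and set_extsize_hint() are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>
#include <sys/ioctl.h>
#include <linux/fs.h>

/* Hypothetical helper: record an extent size hint in an fsxattr. */
void fsx_set_extsize(struct fsxattr *fsx, uint32_t extsize_bytes)
{
	fsx->fsx_xflags |= FS_XFLAG_EXTSIZE;	/* ask the fs to honor fsx_extsize */
	fsx->fsx_extsize = extsize_bytes;	/* hint, in bytes */
}

/*
 * Hypothetical helper: read-modify-write the attributes of an open file.
 * Returns 0 on success, -1 with errno set on failure. The filesystem may
 * reject the request, for example if the size is not suitably aligned.
 */
int set_extsize_hint(int fd, uint32_t extsize_bytes)
{
	struct fsxattr fsx;

	if (ioctl(fd, FS_IOC_FSGETXATTR, &fsx) < 0)
		return -1;
	fsx_set_extsize(&fsx, extsize_bytes);
	return ioctl(fd, FS_IOC_FSSETXATTR, &fsx);
}
```

A caller would typically invoke set_extsize_hint(fd, size) on a newly created file before writing any data to it.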

This series adds several optimizations that don't change the realtime
allocator's decisions, but make them happen more efficiently, mainly by
avoiding redundant work. We've tested these in production and measured
~10% lower CPU utilization. Furthermore, it made it possible to use
XFS_XFLAG_EXTSIZE to force contiguous allocations -- without these
patches, our most fragmented systems would become unresponsive due to
high CPU usage in the realtime allocator, but with them, CPU utilization
is actually ~4-6% lower than before, and disk I/O utilization is 15-20%
lower.

Patches 2 and 3 are preparations for later optimizations; the remaining
patches are the optimizations themselves.

This is based on Linus' tree as of today (commit
692b7dc87ca6d55ab254f8259e6f970171dc9d01).

Thanks!

Omar Sandoval (6):
  xfs: cache last bitmap block in realtime allocator
  xfs: invert the realtime summary cache
  xfs: return maximum free size from xfs_rtany_summary()
  xfs: limit maxlen based on available space in
    xfs_rtallocate_extent_near()
  xfs: don't try redundant allocations in xfs_rtallocate_extent_near()
  xfs: don't look for end of extent further than necessary in
    xfs_rtallocate_extent_near()

 fs/xfs/libxfs/xfs_rtbitmap.c | 173 ++++++++++++++--------------
 fs/xfs/xfs_mount.h           |   6 +-
 fs/xfs/xfs_rtalloc.c         | 215 ++++++++++++++++-------------------
 fs/xfs/xfs_rtalloc.h         |  28 +++--
 4 files changed, 207 insertions(+), 215 deletions(-)
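
The title of the first patch points at a general technique: remember the most recently used bitmap block so that consecutive lookups in the same block skip the block read. The sketch below is a self-contained userspace illustration of that idea only. It is not the code from the series, and every name in it (struct bitmap_dev, struct last_block_cache, get_block(), bitmap_test()) is hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define NBLOCKS		4
#define WORDS_PER_BLOCK	8
#define BITS_PER_BLOCK	(WORDS_PER_BLOCK * 32)

/* Hypothetical stand-in for a bitmap stored in fixed-size blocks. */
struct bitmap_dev {
	uint32_t blocks[NBLOCKS][WORDS_PER_BLOCK];
	unsigned long reads;		/* number of simulated block reads */
};

/* One-entry cache: remembers only the last block that was read. */
struct last_block_cache {
	int valid;
	size_t blockno;
	const uint32_t *data;
};

/* Simulated expensive read of one bitmap block. */
const uint32_t *read_block(struct bitmap_dev *dev, size_t blockno)
{
	dev->reads++;
	return dev->blocks[blockno];
}

/* Return the block, reading it only if it is not the cached one. */
const uint32_t *get_block(struct bitmap_dev *dev, struct last_block_cache *c,
			  size_t blockno)
{
	if (!c->valid || c->blockno != blockno) {
		c->data = read_block(dev, blockno);
		c->blockno = blockno;
		c->valid = 1;
	}
	return c->data;
}

/* Test a single bit, going through the one-entry cache. */
int bitmap_test(struct bitmap_dev *dev, struct last_block_cache *c, size_t bit)
{
	const uint32_t *blk = get_block(dev, c, bit / BITS_PER_BLOCK);
	size_t off = bit % BITS_PER_BLOCK;

	return (blk[off / 32] >> (off % 32)) & 1;
}
```

With this arrangement, a sequential scan over many bits costs one simulated read per block rather than one per bit, and the results of the lookups are unchanged.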

Comments

Omar Sandoval July 6, 2023, 9:39 p.m. UTC | #1

Gentle ping.

Dave Chinner July 7, 2023, 12:36 a.m. UTC | #2
On Thu, Jul 06, 2023 at 02:39:08PM -0700, Omar Sandoval wrote:
> Gentle ping.

Sorry, I haven't had time to get to this yet. I've still got a
couple more bug reports to work through before I can really start
thinking about looking at anything else..

Cheers,

Dave.