mbox series

[GIT,PULL] xfs: bug fixes for 6.4-rc2

Message ID 20230511020107.GI2651828@dread.disaster.area (mailing list archive)
State New, archived
Headers show
Series [GIT,PULL] xfs: bug fixes for 6.4-rc2 | expand

Pull-request

git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git tags/xfs-6.4-rc1-fixes

Message

Dave Chinner May 11, 2023, 2:01 a.m. UTC
[ and now with the correct cc's. DOH! ]

Hi Linus,

Can you please pull the latest XFS updates fixes from the tag below?
The are largely minor bug fixes and cleanups, th emost important of
which are probably the fixes for regressions in the extent
allocation code. It merges cleanly with your current tree as of a
few minutes ago, so I don't expect you to see anything weird from
it.

The only thing worth noting is that turning off the counter scrub
code temporarily may produce a new warning about eliding unreachable
code on some compilers. I have not seen this myself (using gcc-12)
but it is harmless and will go away with the kernel-side exclusive
freeze infrastructure that we are hoping will get merged in the next
cycle.

Cheers,

Dave.

---

The following changes since commit 9419092fb2630c30e4ffeb9ef61007ef0c61827a:

  xfs: fix livelock in delayed allocation at ENOSPC (2023-04-27 09:02:11 +1000)

are available in the Git repository at:

  git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git tags/xfs-6.4-rc1-fixes

for you to fetch changes up to 2254a7396a0ca6309854948ee1c0a33fa4268cec:

  xfs: fix xfs_inodegc_stop racing with mod_delayed_work (2023-05-02 09:16:14 +1000)

----------------------------------------------------------------
xfs: bug fixes for 6.4-rc2

o fixes for inode garbage collection shutdown racing with work queue
  updates
o ensure inodegc workers run on the CPU they are supposed to
o disable counter scrubbing until we can exclusively freeze the
  filesystem from the kernel
o Regression fixes for new allocation related bugs
o a couple of minor cleanups

----------------------------------------------------------------
Darrick J. Wong (9):
      xfs: don't unconditionally null args->pag in xfs_bmap_btalloc_at_eof
      xfs: set bnobt/cntbt numrecs correctly when formatting new AGs
      xfs: flush dirty data and drain directios before scrubbing cow fork
      xfs: don't allocate into the data fork for an unshare request
      xfs: fix negative array access in xfs_getbmap
      xfs: explicitly specify cpu when forcing inodegc delayed work to run immediately
      xfs: check that per-cpu inodegc workers actually run on that cpu
      xfs: disable reaping in fscounters scrub
      xfs: fix xfs_inodegc_stop racing with mod_delayed_work

 fs/xfs/libxfs/xfs_ag.c    | 19 +++++++++----------
 fs/xfs/libxfs/xfs_bmap.c  |  5 +++--
 fs/xfs/scrub/bmap.c       |  4 ++--
 fs/xfs/scrub/common.c     | 26 --------------------------
 fs/xfs/scrub/common.h     |  2 --
 fs/xfs/scrub/fscounters.c | 13 ++++++-------
 fs/xfs/scrub/scrub.c      |  2 --
 fs/xfs/scrub/scrub.h      |  1 -
 fs/xfs/scrub/trace.h      |  1 -
 fs/xfs/xfs_bmap_util.c    |  4 +++-
 fs/xfs/xfs_icache.c       | 40 +++++++++++++++++++++++++++++++++-------
 fs/xfs/xfs_iomap.c        |  5 +++--
 fs/xfs/xfs_mount.h        |  3 +++
 fs/xfs/xfs_super.c        |  3 +++
 14 files changed, 65 insertions(+), 63 deletions(-)

Comments

Linus Torvalds May 11, 2023, 2:49 a.m. UTC | #1
On Wed, May 10, 2023 at 9:01 PM Dave Chinner <david@fromorbit.com> wrote:
>
> [ and now with the correct cc's. DOH! ]

[ Bah. And now with the correct cc's on the reply too. ]

We do not consider build warnings even remotely "harmless".

That sounds like you will fail all build farms that happen to have
this compiler version.

Which in turn just means that I'd have to revert the commit.

The fact that you seem to think that build warnings don't matter and
are harmless AND you imply that you can just leave some compiler
warning to the next release just makes me go "No, I don't want to pull
this".

So I pulled this. It built fine for me. But then I went "Dave says he
isn't even bothering to fix some build warning he seems to know about,
and isn't even planning on fixing it until 6.5, so no, I think I'll
unpull again".

When you decide that compiler warnings matter, please re-send.

                    Linus
Linus Torvalds May 11, 2023, 3:12 a.m. UTC | #2
On Wed, May 10, 2023 at 9:49 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> That sounds like you will fail all build farms that happen to have
> this compiler version.

Side note: it might have helped if you indicated it was some very
particular old compiler. It's hard to tell. What made me decide to
unpull was really the fact that you seemed so cavalier about it.

The whole "no warnings" thing has been quite important. CONFIG_WERROR
can sometimes be annoying - particularly when a new compiler
introduces a new warning - but it got introduced because I got
*really* tired of people just ignoring warning, and as we had one or
two build warnings, new ones didn't stand out either.

So no, it's *not* ok to say "warnings are harmless". Because they
really really aren't. Unheeded warnings just beget more warnings, and
you end up being warning-blind.

And CONFIG_WERROR is important, because without that, maintainers
won't react to new warnings, because they'll do their random config
testing on farms, and not *look* at the result, they just look at
success/failure reports. I understand why it is like that, but it does
mean that yes, warnings need to actually cause failures.

So then a new warning in one subsystem will fail the build of all the
other subsystems too.

And who knows - maybe the warning came from some odd experimental (or
really old) compiler version that warned about other things too. It
happens. If so, it's fine - to paraphrase (and mangle): "you can make
some compilers happy all of the time, and all compilers happy some of
the time, but you can't make all compilers happy all of the time".

But if you already found the warning before it even hit the main tree,
I suspect it will hit more than a few oddball situations.

            Linus
Darrick J. Wong May 11, 2023, 4:50 p.m. UTC | #3
On Wed, May 10, 2023 at 09:49:25PM -0500, Linus Torvalds wrote:
> On Wed, May 10, 2023 at 9:01 PM Dave Chinner <david@fromorbit.com> wrote:
> >
> > [ and now with the correct cc's. DOH! ]
> 
> [ Bah. And now with the correct cc's on the reply too. ]
> 
> We do not consider build warnings even remotely "harmless".
> 
> That sounds like you will fail all build farms that happen to have
> this compiler version.

...and which version is that?  The build robot report just says ia64
without specifying any details about what compiler was running, etc:

https://lore.kernel.org/linux-xfs/20230510165934.5Zgh4%25lkp@intel.com/T/#u

I'll see if I can fix it, but there's not enough information to
reproduce the exact circumstances of the warning.  I've been building
this branch with gcc 11.3 on x86_64, aarch64, ppc64le, s390x, and
riscv64, and none of them complained.  I normally build the fs/xfs code
with most of the W=12e options enabled, though I confess to turning off
the warnings that trigger on random bits of source code outside xfs.

I don't have access to any ia64 environment, let alone whatever build
farm Intel has.

> Which in turn just means that I'd have to revert the commit.

The PR message didn't specify what warning was being ignored.  Assuming
it's the one the build robot found, would you actually revert a commit
over a build warning about unreachable code *only* on ia64?

> The fact that you seem to think that build warnings don't matter and
> are harmless AND you imply that you can just leave some compiler
> warning to the next release just makes me go "No, I don't want to pull
> this".
>
> So I pulled this. It built fine for me. But then I went "Dave says he
> isn't even bothering to fix some build warning he seems to know about,
> and isn't even planning on fixing it until 6.5, so no, I think I'll
> unpull again".

The real fixes for the code that I turned off are sufficiently involved
that we had to have a session about it at LSFMM.  There's no way to
merge it now; the window is closed.

--D

> When you decide that compiler warnings matter, please re-send.
> 
>                     Linus
Linus Torvalds May 11, 2023, 5:47 p.m. UTC | #4
On Thu, May 11, 2023 at 11:50 AM Darrick J. Wong <djwong@kernel.org> wrote:
>
> ...and which version is that?  The build robot report just says ia64
> without specifying any details about what compiler was running, etc:

Actually, you should find it if you follow the links to the config.

We have the compiler version saved in the config file partly exactly
for reasons like that.

HOWEVER.

If it's *this* report:

> https://lore.kernel.org/linux-xfs/20230510165934.5Zgh4%25lkp@intel.com/T/#u

then don't even worry about it.

That's not even a compiler warning - that "ignoring unreachable code"
is from smatch.

So if *that* single line of

   fs/xfs/scrub/fscounters.c:459 xchk_fscounters() warn: ignoring
unreachable code.

was all this was about, then there are no worries with that pull request.

Those extra warnings (some of them compiler warnings enabled with W=2
for extra warnings, some from smatch) are not a cause for worry. They
are janitorial.

I thought you had an actual failed build report due to some warning.
Those we *do* need to fix, exactly because they will affect other
peoples ability to do basic sanity testing.

So if you can confirm that it was just that one smatch warning line
and nothing else, then I'll happily do that pull.

            Linus
Linus Torvalds May 11, 2023, 6:04 p.m. UTC | #5
On Thu, May 11, 2023 at 12:47 PM Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> So if you can confirm that it was just that one smatch warning line
> and nothing else, then I'll happily do that pull.

Oh, and in case you wonder what the warning actually is about -
because some smatch warnings *are* worth looking at - that "ignoring
unreachable code" is generally things like case statements that have
all cases handled and no "break;" in any of the cases, and then
there's a "return" afterwards anyway.

It can be a sign of some logic error, though, so building smatch and
looking at what it reports could certainly be worthwhile. But no, a
smatch failure is very much not a fatal issue.

                  Linus
Darrick J. Wong May 11, 2023, 6:08 p.m. UTC | #6
On Thu, May 11, 2023 at 12:47:16PM -0500, Linus Torvalds wrote:
> On Thu, May 11, 2023 at 11:50 AM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > ...and which version is that?  The build robot report just says ia64
> > without specifying any details about what compiler was running, etc:
> 
> Actually, you should find it if you follow the links to the config.

The link in the 9 May report:

https://lore.kernel.org/oe-kbuild-all/202304140707.CoH337Ux-lkp@intel.com/

goes to an old report from 14 April involving gcc 12.1 on powerpc and
doesn't mention xfs at all.

> We have the compiler version saved in the config file partly exactly
> for reasons like that.
> 
> HOWEVER.
> 
> If it's *this* report:
> 
> > https://lore.kernel.org/linux-xfs/20230510165934.5Zgh4%25lkp@intel.com/T/#u
> 
> then don't even worry about it.
> 
> That's not even a compiler warning - that "ignoring unreachable code"
> is from smatch.
> 
> So if *that* single line of
> 
>    fs/xfs/scrub/fscounters.c:459 xchk_fscounters() warn: ignoring
> unreachable code.
> 
> was all this was about, then there are no worries with that pull request.
> 
> Those extra warnings (some of them compiler warnings enabled with W=2
> for extra warnings, some from smatch) are not a cause for worry. They
> are janitorial.
> 
> I thought you had an actual failed build report due to some warning.
> Those we *do* need to fix, exactly because they will affect other
> peoples ability to do basic sanity testing.

AFAICT there aren't any actual build failures, just the smatch warning.

> So if you can confirm that it was just that one smatch warning line
> and nothing else, then I'll happily do that pull.

Yes, it is the smatch warning.  Sprinkling #if 0's everywhere makes
the smatch warning go away, but the "shut up the warnings" fix isn't
necessarily better:

diff --git a/fs/xfs/scrub/fscounters.c b/fs/xfs/scrub/fscounters.c
index e382a35e98d8..09101ba0aae3 100644
--- a/fs/xfs/scrub/fscounters.c
+++ b/fs/xfs/scrub/fscounters.c
@@ -165,6 +165,7 @@ xchk_setup_fscounters(
  * xchk_*_process_error.
  */
 
+#if 0
 /* Count free space btree blocks manually for pre-lazysbcount filesystems. */
 static int
 xchk_fscount_btreeblks(
@@ -349,6 +350,7 @@ xchk_fscount_count_frextents(
 	return 0;
 }
 #endif /* CONFIG_XFS_RT */
+#endif /* 0 */
 
 /*
  * Part 2: Comparing filesystem summary counters.  All we have to do here is
@@ -422,7 +424,6 @@ xchk_fscounters(
 	struct xfs_mount	*mp = sc->mp;
 	struct xchk_fscounters	*fsc = sc->buf;
 	int64_t			icount, ifree, fdblocks, frextents;
-	int			error;
 
 	/* Snapshot the percpu counters. */
 	icount = percpu_counter_sum(&mp->m_icount);
@@ -449,9 +450,11 @@ xchk_fscounters(
 	/*
 	 * XXX: We can't quiesce percpu counter updates, so exit early.
 	 * This can be re-enabled when we gain exclusive freeze functionality.
+	 * Use preprocessor crud to avoid warnings about unreachable code
+	 * since we'd rather leave the git history of the (now unused) code
+	 * intact until we can fix the problem.
 	 */
-	return 0;
-
+#if 0
 	/*
 	 * If ifree exceeds icount by more than the minimum variance then
 	 * something's probably wrong with the counters.
@@ -488,5 +491,6 @@ xchk_fscounters(
 			fsc->frextents))
 		xchk_set_corrupt(sc);
 
+#endif
 	return 0;
 }

--D

>             Linus
pr-tracker-bot@kernel.org May 11, 2023, 10:04 p.m. UTC | #7
The pull request you sent on Thu, 11 May 2023 12:01:07 +1000:

> git://git.kernel.org/pub/scm/fs/xfs/xfs-linux.git tags/xfs-6.4-rc1-fixes

has been merged into torvalds/linux.git:
https://git.kernel.org/torvalds/c/849a4f09730ba3c02da01924c7a6e7a000a4d27c

Thank you!
Dave Chinner May 12, 2023, 1:28 a.m. UTC | #8
On Thu, May 11, 2023 at 12:47:16PM -0500, Linus Torvalds wrote:
> On Thu, May 11, 2023 at 11:50 AM Darrick J. Wong <djwong@kernel.org> wrote:
> >
> > ...and which version is that?  The build robot report just says ia64
> > without specifying any details about what compiler was running, etc:
> 
> Actually, you should find it if you follow the links to the config.
> 
> We have the compiler version saved in the config file partly exactly
> for reasons like that.
> 
> HOWEVER.
> 
> If it's *this* report:
> 
> > https://lore.kernel.org/linux-xfs/20230510165934.5Zgh4%25lkp@intel.com/T/#u
> 
> then don't even worry about it.
> 
> That's not even a compiler warning - that "ignoring unreachable code"
> is from smatch.
> 
> So if *that* single line of
> 
>    fs/xfs/scrub/fscounters.c:459 xchk_fscounters() warn: ignoring
> unreachable code.
>
> was all this was about, then there are no worries with that pull request.

Yes, that's exactly what I was referring to.

> Those extra warnings (some of them compiler warnings enabled with W=2
> for extra warnings, some from smatch) are not a cause for worry. They
> are janitorial.

Which is a pretty good definition of "harmless warning".

> I thought you had an actual failed build report due to some warning.
> Those we *do* need to fix, exactly because they will affect other
> peoples ability to do basic sanity testing.

In retrospect, next time you might first inquire as to what the
issue being referred to is. If you are going to assume anything,
assume good intentions and that the engineer sending the pull
request has done what they have done for a good reason. Once you
know what that reason is you've got the right context to then
educate the OP about the correct procedure in an amicable manner.

-Dave.