Message ID | 20230208175228.2226263-1-leah.rumancik@gmail.com (mailing list archive) |
---|---|
Headers | show |
Series | more xfs fixes for 5.15 | expand |
On Wed, Feb 8, 2023 at 7:52 PM Leah Rumancik <leah.rumancik@gmail.com> wrote: > > Hello again, > > Here is the next batch of backports for 5.15.y. Testing included > 25 runs of auto group on 12 xfs configs. No regressions were seen. > I checked xfs/538 was run without issue as this test was mentioned > in 56486f307100. Also, from 86d40f1e49e9, I ran ran xfs/117 with > XFS compiled as a module and TEST_FS_MODULE_REOLOAD set, but I was > unable to reproduce the issue. Did you find any tests that started to pass or whose failure rate reduced? There are very few Fixes: annotations in these commits so it is hard for me to assess if any of them are relevant/important/worth the effort/risk to backport to 5.10. Unless I know of reproducible bugs in 5.10, I don't think I will invest in backporting this series to 5.10. Chandan, if you find any of these fixes relevant and important for 5.4 let me know and I will make the effort to consider them for 5.10. Leah, please consider working on the SGID bug fixes for the next 5.15 update, because my 5.10 SGID fixes series [1] has been blocked for months and because there are several reproducible test cases in xfstest. I did not push on that until now because SGID test expectations were a moving target, but since xfstests commit 81e6f628 ("generic: update setgid tests") in this week's xfstests release, I think that tests should be stable and we can finally start backporting all relevant SGID fixes to align the SGID behavior of LTS kernels with that of upstream. Thanks, Amir. [1] https://github.com/amir73il/linux/commits/xfs-5.10.y-sgid-fixes > > Below I've outlined which series the backports came from: > > series "xfs: intent whiteouts" (1): > [01/10] cb512c921639613ce03f87e62c5e93ed9fe8c84d > xfs: zero inode fork buffer at allocation > [02/10] c230a4a85bcdbfc1a7415deec6caf04e8fca1301 > xfs: fix potential log item leak > > series "xfs: fix random format verification issues" (2): > [1/4] dc04db2aa7c9307e740d6d0e173085301c173b1a > xfs: detect self referencing btree sibling pointers > [2/4] 1eb70f54c445fcbb25817841e774adb3d912f3e8 -> already in 5.15.y > xfs: validate inode fork size against fork format > [3/4] dd0d2f9755191690541b09e6385d0f8cd8bc9d8f > xfs: set XFS_FEAT_NLINK correctly > [4/4] f0f5f658065a5af09126ec892e4c383540a1c77f > xfs: validate v5 feature fields > > series "xfs: small fixes for 5.19 cycle" (3): > [1/3] 5672225e8f2a872a22b0cecedba7a6644af1fb84 > xfs: avoid unnecessary runtime sibling pointer endian conversions > [2/3] 5b55cbc2d72632e874e50d2e36bce608e55aaaea > fs: don't assert fail on perag references on teardown > [2/3] 56486f307100e8fc66efa2ebd8a71941fa10bf6f > xfs: assert in xfs_btree_del_cursor should take into account error > > series "xfs: random fixes for 5.19" (4): > [1/2] 86d40f1e49e9a909d25c35ba01bea80dbcd758cb > xfs: purge dquots after inode walk fails during quotacheck > [2/2] a54f78def73d847cb060b18c4e4a3d1d26c9ca6d > xfs: don't leak btree cursor when insrec fails after a split > > (1) https://lore.kernel.org/all/20220503221728.185449-1-david@fromorbit.com/ > (2) https://lore.kernel.org/all/20220502082018.1076561-1-david@fromorbit.com/ > (3) https://lore.kernel.org/all/20220524022158.1849458-1-david@fromorbit.com/ > (4) https://lore.kernel.org/all/165337056527.993079.1232300816023906959.stgit@magnolia/ > > Darrick J. Wong (2): > xfs: purge dquots after inode walk fails during quotacheck > xfs: don't leak btree cursor when insrec fails after a split > > Dave Chinner (8): > xfs: zero inode fork buffer at allocation > xfs: fix potential log item leak > xfs: detect self referencing btree sibling pointers > xfs: set XFS_FEAT_NLINK correctly > xfs: validate v5 feature fields > xfs: avoid unnecessary runtime sibling pointer endian conversions > xfs: don't assert fail on perag references on teardown > xfs: assert in xfs_btree_del_cursor should take into account error > > fs/xfs/libxfs/xfs_ag.c | 3 +- > fs/xfs/libxfs/xfs_btree.c | 175 +++++++++++++++++++++++++-------- > fs/xfs/libxfs/xfs_inode_fork.c | 12 ++- > fs/xfs/libxfs/xfs_sb.c | 70 +++++++++++-- > fs/xfs/xfs_bmap_item.c | 2 + > fs/xfs/xfs_icreate_item.c | 1 + > fs/xfs/xfs_qm.c | 9 +- > fs/xfs/xfs_refcount_item.c | 2 + > fs/xfs/xfs_rmap_item.c | 2 + > 9 files changed, 221 insertions(+), 55 deletions(-) > > -- > 2.39.1.519.gcb327c4b5f-goog >
On Wed, Feb 08, 2023 at 09:02:58PM +0200, Amir Goldstein wrote: > On Wed, Feb 8, 2023 at 7:52 PM Leah Rumancik <leah.rumancik@gmail.com> wrote: > > > > Hello again, > > > > Here is the next batch of backports for 5.15.y. Testing included > > 25 runs of auto group on 12 xfs configs. No regressions were seen. > > I checked xfs/538 was run without issue as this test was mentioned > > in 56486f307100. Also, from 86d40f1e49e9, I ran ran xfs/117 with > > XFS compiled as a module and TEST_FS_MODULE_REOLOAD set, but I was > > unable to reproduce the issue. > > Did you find any tests that started to pass or whose failure rate reduced? I wish Leah had, but there basically aren't any tests for the problems fixed in this set for her to find. :( The first two patches I think were from when Dave was working on log intent whiteouts, turned on KASAN to diagnose some other problem he had, and began pulling on the ball of string (as it were) as he noticed other things in the codebase. We don't usually bother with regression tests for kernel memory leaks, since they're not so easy to reproduce. Patches 3-6 are fixes for a rash of fuzzer reports that someone in China posted last May: https://bugzilla.kernel.org/show_bug.cgi?id=215927 (There are more than just that one) As usual, the submitter didn't bother to help triage and just dumped a ton of work in our laps. They didn't follow up with any regression tests, because few fuzz kiddiez ever do. At the time, I was too burned out to deal with it, so Dave posted fixes. Patches 7-8 would manifest themselves as test VMs halting on ASSERTs if you configure your kernel to panic. Not strictly needed since most LTS kernels probably don't even have XFS_DEBUG=y, but it makes the lives of recoveryloop runners easier if they do. Patch 9 trips xfs/434 and xfs/436, but they only run if you have slab debugging enabled and build XFS as a module. I don't know why very few people do this. Patch 10 is a memory leak if you have XFS_DEBUG=y. No need for a separate test for this one, since kmemleak catches it. If you turn it on. (IOWs, LGTM for the whole set.) > There are very few Fixes: annotations in these commits so it is hard for > me to assess if any of them are relevant/important/worth the effort/risk > to backport to 5.10. <nod> Dave's fixpatches rarely have Fixes tags attached. It's difficult to get him to do that because he has so much bad history with AUTOSEL. I've tried to get him to add them in the past, but if I'm already stressed out and Dave doesn't reply then I tend to merge the fix and move on. > Unless I know of reproducible bugs in 5.10, I don't think I will invest > in backporting this series to 5.10. > Chandan, if you find any of these fixes relevant and important for 5.4 > let me know and I will make the effort to consider them for 5.10. > > Leah, please consider working on the SGID bug fixes for the next 5.15 > update, because my 5.10 SGID fixes series [1] has been blocked for > months and because there are several reproducible test cases in xfstest. Whenever y'all get to 6.0, beware of commit 2ed5b09b3e8f ("xfs: make inode attribute forks a permanent part of struct xfs_inode"), which is a KASAN UAF that we fixed by eliminating the 'F'. Do not pull on the devilstring ("...the file capabilities code calls getxattr with and without i_rwsem held...") if you can avoid it. > I did not push on that until now because SGID test expectations were > a moving target, but since xfstests commit 81e6f628 ("generic: update > setgid tests") in this week's xfstests release, I think that tests should be > stable and we can finally start backporting all relevant SGID fixes to > align the SGID behavior of LTS kernels with that of upstream. Oh good, I've been (gently) waiting on that one too. :) --D > Thanks, > Amir. > > [1] https://github.com/amir73il/linux/commits/xfs-5.10.y-sgid-fixes > > > > > Below I've outlined which series the backports came from: > > > > series "xfs: intent whiteouts" (1): > > [01/10] cb512c921639613ce03f87e62c5e93ed9fe8c84d > > xfs: zero inode fork buffer at allocation > > [02/10] c230a4a85bcdbfc1a7415deec6caf04e8fca1301 > > xfs: fix potential log item leak > > > > series "xfs: fix random format verification issues" (2): > > [1/4] dc04db2aa7c9307e740d6d0e173085301c173b1a > > xfs: detect self referencing btree sibling pointers > > [2/4] 1eb70f54c445fcbb25817841e774adb3d912f3e8 -> already in 5.15.y > > xfs: validate inode fork size against fork format > > [3/4] dd0d2f9755191690541b09e6385d0f8cd8bc9d8f > > xfs: set XFS_FEAT_NLINK correctly > > [4/4] f0f5f658065a5af09126ec892e4c383540a1c77f > > xfs: validate v5 feature fields > > > > series "xfs: small fixes for 5.19 cycle" (3): > > [1/3] 5672225e8f2a872a22b0cecedba7a6644af1fb84 > > xfs: avoid unnecessary runtime sibling pointer endian conversions > > [2/3] 5b55cbc2d72632e874e50d2e36bce608e55aaaea > > fs: don't assert fail on perag references on teardown > > [2/3] 56486f307100e8fc66efa2ebd8a71941fa10bf6f > > xfs: assert in xfs_btree_del_cursor should take into account error > > > > series "xfs: random fixes for 5.19" (4): > > [1/2] 86d40f1e49e9a909d25c35ba01bea80dbcd758cb > > xfs: purge dquots after inode walk fails during quotacheck > > [2/2] a54f78def73d847cb060b18c4e4a3d1d26c9ca6d > > xfs: don't leak btree cursor when insrec fails after a split > > > > (1) https://lore.kernel.org/all/20220503221728.185449-1-david@fromorbit.com/ > > (2) https://lore.kernel.org/all/20220502082018.1076561-1-david@fromorbit.com/ > > (3) https://lore.kernel.org/all/20220524022158.1849458-1-david@fromorbit.com/ > > (4) https://lore.kernel.org/all/165337056527.993079.1232300816023906959.stgit@magnolia/ > > > > Darrick J. Wong (2): > > xfs: purge dquots after inode walk fails during quotacheck > > xfs: don't leak btree cursor when insrec fails after a split > > > > Dave Chinner (8): > > xfs: zero inode fork buffer at allocation > > xfs: fix potential log item leak > > xfs: detect self referencing btree sibling pointers > > xfs: set XFS_FEAT_NLINK correctly > > xfs: validate v5 feature fields > > xfs: avoid unnecessary runtime sibling pointer endian conversions > > xfs: don't assert fail on perag references on teardown > > xfs: assert in xfs_btree_del_cursor should take into account error > > > > fs/xfs/libxfs/xfs_ag.c | 3 +- > > fs/xfs/libxfs/xfs_btree.c | 175 +++++++++++++++++++++++++-------- > > fs/xfs/libxfs/xfs_inode_fork.c | 12 ++- > > fs/xfs/libxfs/xfs_sb.c | 70 +++++++++++-- > > fs/xfs/xfs_bmap_item.c | 2 + > > fs/xfs/xfs_icreate_item.c | 1 + > > fs/xfs/xfs_qm.c | 9 +- > > fs/xfs/xfs_refcount_item.c | 2 + > > fs/xfs/xfs_rmap_item.c | 2 + > > 9 files changed, 221 insertions(+), 55 deletions(-) > > > > -- > > 2.39.1.519.gcb327c4b5f-goog > >
On Wed, Feb 08, 2023 at 11:40:59AM -0800, Darrick J. Wong wrote: > On Wed, Feb 08, 2023 at 09:02:58PM +0200, Amir Goldstein wrote: > > On Wed, Feb 8, 2023 at 7:52 PM Leah Rumancik <leah.rumancik@gmail.com> wrote: > > > > > > Hello again, > > > > > > Here is the next batch of backports for 5.15.y. Testing included > > > 25 runs of auto group on 12 xfs configs. No regressions were seen. > > > I checked xfs/538 was run without issue as this test was mentioned > > > in 56486f307100. Also, from 86d40f1e49e9, I ran ran xfs/117 with > > > XFS compiled as a module and TEST_FS_MODULE_REOLOAD set, but I was > > > unable to reproduce the issue. > > > > Did you find any tests that started to pass or whose failure rate reduced? > > I wish Leah had, but there basically aren't any tests for the problems > fixed in this set for her to find. :( ..... > > There are very few Fixes: annotations in these commits so it is hard for > > me to assess if any of them are relevant/important/worth the effort/risk > > to backport to 5.10. > > <nod> Dave's fixpatches rarely have Fixes tags attached. It's difficult > to get him to do that because he has so much bad history with AUTOSEL. > I've tried to get him to add them in the past, but if I'm already > stressed out and Dave doesn't reply then I tend to merge the fix and > move on. In my defense, all the "fixes" from me in this series (except for the one with a fixes tag on it) date back so long ago it was difficult to identify what commit actually introduced the issue. Once we're talking about "it's been there for at least a decade" - espcially for fuzzer issues - identifying the exact commit is time consuming and often not possible, nor really useful for anything. I'm also not going to tag a patch with "fixes commit xyz" when commit xyz isn't actually the cause of the problem just so that someone can blindly use that as a "it's got a fixes tag on it, we should back port it" trigger. That's the whole problem with AUTOSEL - blindly applying anything with a fixes tag on it that merges cleanly into an older kernel - and the whole point of having a human actually manage the stable kernel backports. The stable XFS kernel maintainer is supposed to be actively looking at the commits that go into the upstream kernel to determine if they are relevant or not to the given stable kernel, regardless of whether they address fstests failures, have fixes/stable tags on them, etc. If all we needed stable maintainers to do is turn a crank handle, then we'd be perfectly OK with AUTOSEL and the upstream stable kernel process.... -Dave.
On Thu, Feb 9, 2023 at 12:21 AM Dave Chinner <david@fromorbit.com> wrote: > > On Wed, Feb 08, 2023 at 11:40:59AM -0800, Darrick J. Wong wrote: > > On Wed, Feb 08, 2023 at 09:02:58PM +0200, Amir Goldstein wrote: > > > On Wed, Feb 8, 2023 at 7:52 PM Leah Rumancik <leah.rumancik@gmail.com> wrote: > > > > > > > > Hello again, > > > > > > > > Here is the next batch of backports for 5.15.y. Testing included > > > > 25 runs of auto group on 12 xfs configs. No regressions were seen. > > > > I checked xfs/538 was run without issue as this test was mentioned > > > > in 56486f307100. Also, from 86d40f1e49e9, I ran ran xfs/117 with > > > > XFS compiled as a module and TEST_FS_MODULE_REOLOAD set, but I was > > > > unable to reproduce the issue. > > > > > > Did you find any tests that started to pass or whose failure rate reduced? > > > > I wish Leah had, but there basically aren't any tests for the problems > > fixed in this set for her to find. :( > ..... > > > There are very few Fixes: annotations in these commits so it is hard for > > > me to assess if any of them are relevant/important/worth the effort/risk > > > to backport to 5.10. > > > > <nod> Dave's fixpatches rarely have Fixes tags attached. It's difficult > > to get him to do that because he has so much bad history with AUTOSEL. > > I've tried to get him to add them in the past, but if I'm already > > stressed out and Dave doesn't reply then I tend to merge the fix and > > move on. > > In my defense, all the "fixes" from me in this series (except for > the one with a fixes tag on it) date back so long ago it was > difficult to identify what commit actually introduced the issue. > Once we're talking about "it's been there for at least a decade" - > espcially for fuzzer issues - identifying the exact commit is time > consuming and often not possible, nor really useful for anything. > I wouldn't want anyone to spend more time on fuzzer issues and I myself have given those very little attention. My expectation is to always have a Fixes tag when a developer knows they are fixing a regression and best effort when it is easy for the developer to determine when the buggy code was merged. It doesn't even have to be in the form of a Fixes tag. If the cover letter says "fixes to the $FEATURE from v5.18" that is good enough for me. As a stable maintainer, if a commit has no Fixes tag and no hint in the cover letter about relevant kernel versions, I will assume it is relevant to the LTS kernel and spend time on the manual triage. If a developer adds a Fixes tag where relevant, that can save my time by eliminating commits irrelevant for e.g. 5.10.y. This wasn't the case in this series I guess. > I'm also not going to tag a patch with "fixes commit xyz" when > commit xyz isn't actually the cause of the problem just so that > someone can blindly use that as a "it's got a fixes tag on it, we > should back port it" trigger. > > That's the whole problem with AUTOSEL - blindly applying anything > with a fixes tag on it that merges cleanly into an older kernel - > and the whole point of having a human actually manage the stable > kernel backports. > > The stable XFS kernel maintainer is supposed to be actively looking > at the commits that go into the upstream kernel to determine if they > are relevant or not to the given stable kernel, regardless of > whether they address fstests failures, have fixes/stable tags on > them, etc. If all we needed stable maintainers to do is turn a crank > handle, then we'd be perfectly OK with AUTOSEL and the upstream > stable kernel process.... > Yes. The reason I did not pick any of the commits in this series for 5.10 is not because lack of Fixes/tests, but because I read the cover letters (thanks for the links Leah!) and commit messages and conclude that the risk/effort does not make the cut. But this conclusion was based very much on my own intuition and my own interpretation of the cover letters. This is why I asked Chandan to take another look. I encourage any developer who writes a cover letter to share their own opinion about the relevancy and risks associated with backporting their patch set to LTS kernels - make it as vague statement as you wish, anything that can help a human LTS maintainer do their job better. Thanks, Amir.
On Wed, Feb 08, 2023 at 09:02:58 PM +0200, Amir Goldstein wrote: > On Wed, Feb 8, 2023 at 7:52 PM Leah Rumancik <leah.rumancik@gmail.com> wrote: >> >> Hello again, >> >> Here is the next batch of backports for 5.15.y. Testing included >> 25 runs of auto group on 12 xfs configs. No regressions were seen. >> I checked xfs/538 was run without issue as this test was mentioned >> in 56486f307100. Also, from 86d40f1e49e9, I ran ran xfs/117 with >> XFS compiled as a module and TEST_FS_MODULE_REOLOAD set, but I was >> unable to reproduce the issue. > > Did you find any tests that started to pass or whose failure rate reduced? > > There are very few Fixes: annotations in these commits so it is hard for > me to assess if any of them are relevant/important/worth the effort/risk > to backport to 5.10. > > Unless I know of reproducible bugs in 5.10, I don't think I will invest > in backporting this series to 5.10. > Chandan, if you find any of these fixes relevant and important for 5.4 > let me know and I will make the effort to consider them for 5.10. Amir, Please find my observations listed below, Patch 1: Intent whiteouts was designed to overcome performance problems with delayed attributes feature. Hence this patch will not be needed for v5.4-LTS. Patch 2: This patch won't be needed since this memory leak is prevalent mostly when intent whiteouts are being used. Patch 3: IMHO this fix does not need to be back-ported since ondisk btree cycles are close to impossible to occur during regular operation of an XFS filesystem. Patch 4: This patch won't be needed since xfs_mount->m_features isn't present in v5.4 xfsprogs. Patch 5: I am not sure about this one. May be mkfs.xfs has always created V5 filesystems with the required V4 feature flags set and probably fuzzing is the only way we could modify a V5 filesystem to have some of the required V4 feature flags disabled. I don't think this patch needs to be backported if my assumptions are true. Patch 6: This patch does not need to be backported since this fixes a performance regression introduced by the third patch. Patch 7: I will want to backport this patch since it will prevent recovery-loop test execution from stalling. Patch 8: I think backporting XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT error tag to 5.4-LTS will be helpful since xfs/538 has revealed some important bugs. Patch 8 needs to backported after backporting patches relevant to XFS_ERRTAG_BMAP_ALLOC_MINLEN_EXTENT. Patch 9: I think this patch needs to be backported to prevent kmemleak warnings. Patch 10: I would backport this patch since this can prevent kmemleak warnings from showing up (even though the bug was detected after the third patch was applied). But maybe you could wait until I come across these patches when working on backporting commits from v5.11 and newer versions of the kernel. I will share the list of candidate patches with you. Once we agree upon the list of patches, you could then backport them to v5.10-LTS.
On Wed, Feb 08, 2023 at 11:40:59AM -0800, Darrick J. Wong wrote: > On Wed, Feb 08, 2023 at 09:02:58PM +0200, Amir Goldstein wrote: > > On Wed, Feb 8, 2023 at 7:52 PM Leah Rumancik <leah.rumancik@gmail.com> wrote: > > > > > > Hello again, > > > > > > Here is the next batch of backports for 5.15.y. Testing included > > > 25 runs of auto group on 12 xfs configs. No regressions were seen. > > > I checked xfs/538 was run without issue as this test was mentioned > > > in 56486f307100. Also, from 86d40f1e49e9, I ran ran xfs/117 with > > > XFS compiled as a module and TEST_FS_MODULE_REOLOAD set, but I was > > > unable to reproduce the issue. > > > > Did you find any tests that started to pass or whose failure rate reduced? > > I wish Leah had, but there basically aren't any tests for the problems > fixed in this set for her to find. :( > > The first two patches I think were from when Dave was working on log > intent whiteouts, turned on KASAN to diagnose some other problem he had, > and began pulling on the ball of string (as it were) as he noticed other > things in the codebase. We don't usually bother with regression tests > for kernel memory leaks, since they're not so easy to reproduce. > > Patches 3-6 are fixes for a rash of fuzzer reports that someone in China > posted last May: > https://bugzilla.kernel.org/show_bug.cgi?id=215927 > > (There are more than just that one) > > As usual, the submitter didn't bother to help triage and just dumped a > ton of work in our laps. They didn't follow up with any regression > tests, because few fuzz kiddiez ever do. At the time, I was too burned > out to deal with it, so Dave posted fixes. > > Patches 7-8 would manifest themselves as test VMs halting on ASSERTs if > you configure your kernel to panic. Not strictly needed since most LTS > kernels probably don't even have XFS_DEBUG=y, but it makes the lives of > recoveryloop runners easier if they do. > > Patch 9 trips xfs/434 and xfs/436, but they only run if you have slab > debugging enabled and build XFS as a module. I don't know why very few > people do this. > > Patch 10 is a memory leak if you have XFS_DEBUG=y. No need for a > separate test for this one, since kmemleak catches it. If you turn it > on. > > (IOWs, LGTM for the whole set.) Good to add Ack tag? > > > > > Leah, please consider working on the SGID bug fixes for the next 5.15 > > update, because my 5.10 SGID fixes series [1] has been blocked for > > months and because there are several reproducible test cases in xfstest. > > Whenever y'all get to 6.0, beware of commit 2ed5b09b3e8f ("xfs: make > inode attribute forks a permanent part of struct xfs_inode"), which is a > KASAN UAF that we fixed by eliminating the 'F'. Do not pull on the > devilstring ("...the file capabilities code calls getxattr with and > without i_rwsem held...") if you can avoid it. > > > I did not push on that until now because SGID test expectations were > > a moving target, but since xfstests commit 81e6f628 ("generic: update > > setgid tests") in this week's xfstests release, I think that tests should be > > stable and we can finally start backporting all relevant SGID fixes to > > align the SGID behavior of LTS kernels with that of upstream. Ooo goody, ok, will do this next. The following patches are on my radar to look into for this set. I have yet to look into dependencies, so the set may grow. If the sgid tests still fail after these ptaches, I will continue hunting for more fixes to include in this set. e014f37db1a2 xfs: use setattr_copy to set vfs inode attributes 472c6e46f589 xfs: remove XFS_PREALLOC_SYNC fbe7e5200365 xfs: fallocate() should call file_modified() 0b02c8c0d75a xfs: set prealloc flag in xfs_alloc_file_space() 2b3416ceff5e fs: add mode_strip_sgid() helper 1639a49ccdce fs: move S_ISGID stripping into the vfs_*() helpers ed5a7047d201 attr: use consistent sgid stripping checks 8d84e39d76bd fs: use consistent setgid checks in is_sxid() In addition to the normal regression testing, I will specifically look at the following tests for the sgid changes: generic/673 generic/68[3-7] generic/69[6-7] I will also do some extra runs on the entire perms group. Let me know if you think something should be dropped or added. - Leah > > Oh good, I've been (gently) waiting on that one too. :) > > --D > > > Thanks, > > Amir. > > > > [1] https://github.com/amir73il/linux/commits/xfs-5.10.y-sgid-fixes > > > > > > > > Below I've outlined which series the backports came from: > > > > > > series "xfs: intent whiteouts" (1): > > > [01/10] cb512c921639613ce03f87e62c5e93ed9fe8c84d > > > xfs: zero inode fork buffer at allocation > > > [02/10] c230a4a85bcdbfc1a7415deec6caf04e8fca1301 > > > xfs: fix potential log item leak > > > > > > series "xfs: fix random format verification issues" (2): > > > [1/4] dc04db2aa7c9307e740d6d0e173085301c173b1a > > > xfs: detect self referencing btree sibling pointers > > > [2/4] 1eb70f54c445fcbb25817841e774adb3d912f3e8 -> already in 5.15.y > > > xfs: validate inode fork size against fork format > > > [3/4] dd0d2f9755191690541b09e6385d0f8cd8bc9d8f > > > xfs: set XFS_FEAT_NLINK correctly > > > [4/4] f0f5f658065a5af09126ec892e4c383540a1c77f > > > xfs: validate v5 feature fields > > > > > > series "xfs: small fixes for 5.19 cycle" (3): > > > [1/3] 5672225e8f2a872a22b0cecedba7a6644af1fb84 > > > xfs: avoid unnecessary runtime sibling pointer endian conversions > > > [2/3] 5b55cbc2d72632e874e50d2e36bce608e55aaaea > > > fs: don't assert fail on perag references on teardown > > > [2/3] 56486f307100e8fc66efa2ebd8a71941fa10bf6f > > > xfs: assert in xfs_btree_del_cursor should take into account error > > > > > > series "xfs: random fixes for 5.19" (4): > > > [1/2] 86d40f1e49e9a909d25c35ba01bea80dbcd758cb > > > xfs: purge dquots after inode walk fails during quotacheck > > > [2/2] a54f78def73d847cb060b18c4e4a3d1d26c9ca6d > > > xfs: don't leak btree cursor when insrec fails after a split > > > > > > (1) https://lore.kernel.org/all/20220503221728.185449-1-david@fromorbit.com/ > > > (2) https://lore.kernel.org/all/20220502082018.1076561-1-david@fromorbit.com/ > > > (3) https://lore.kernel.org/all/20220524022158.1849458-1-david@fromorbit.com/ > > > (4) https://lore.kernel.org/all/165337056527.993079.1232300816023906959.stgit@magnolia/ > > > > > > Darrick J. Wong (2): > > > xfs: purge dquots after inode walk fails during quotacheck > > > xfs: don't leak btree cursor when insrec fails after a split > > > > > > Dave Chinner (8): > > > xfs: zero inode fork buffer at allocation > > > xfs: fix potential log item leak > > > xfs: detect self referencing btree sibling pointers > > > xfs: set XFS_FEAT_NLINK correctly > > > xfs: validate v5 feature fields > > > xfs: avoid unnecessary runtime sibling pointer endian conversions > > > xfs: don't assert fail on perag references on teardown > > > xfs: assert in xfs_btree_del_cursor should take into account error > > > > > > fs/xfs/libxfs/xfs_ag.c | 3 +- > > > fs/xfs/libxfs/xfs_btree.c | 175 +++++++++++++++++++++++++-------- > > > fs/xfs/libxfs/xfs_inode_fork.c | 12 ++- > > > fs/xfs/libxfs/xfs_sb.c | 70 +++++++++++-- > > > fs/xfs/xfs_bmap_item.c | 2 + > > > fs/xfs/xfs_icreate_item.c | 1 + > > > fs/xfs/xfs_qm.c | 9 +- > > > fs/xfs/xfs_refcount_item.c | 2 + > > > fs/xfs/xfs_rmap_item.c | 2 + > > > 9 files changed, 221 insertions(+), 55 deletions(-) > > > > > > -- > > > 2.39.1.519.gcb327c4b5f-goog > > >
On Fri, Feb 10, 2023 at 11:51:03AM -0800, Leah Rumancik wrote: > On Wed, Feb 08, 2023 at 11:40:59AM -0800, Darrick J. Wong wrote: > > On Wed, Feb 08, 2023 at 09:02:58PM +0200, Amir Goldstein wrote: > > > On Wed, Feb 8, 2023 at 7:52 PM Leah Rumancik <leah.rumancik@gmail.com> wrote: > > > > > > > > Hello again, > > > > > > > > Here is the next batch of backports for 5.15.y. Testing included > > > > 25 runs of auto group on 12 xfs configs. No regressions were seen. > > > > I checked xfs/538 was run without issue as this test was mentioned > > > > in 56486f307100. Also, from 86d40f1e49e9, I ran ran xfs/117 with > > > > XFS compiled as a module and TEST_FS_MODULE_REOLOAD set, but I was > > > > unable to reproduce the issue. > > > > > > Did you find any tests that started to pass or whose failure rate reduced? > > > > I wish Leah had, but there basically aren't any tests for the problems > > fixed in this set for her to find. :( > > > > The first two patches I think were from when Dave was working on log > > intent whiteouts, turned on KASAN to diagnose some other problem he had, > > and began pulling on the ball of string (as it were) as he noticed other > > things in the codebase. We don't usually bother with regression tests > > for kernel memory leaks, since they're not so easy to reproduce. > > > > Patches 3-6 are fixes for a rash of fuzzer reports that someone in China > > posted last May: > > https://bugzilla.kernel.org/show_bug.cgi?id=215927 > > > > (There are more than just that one) > > > > As usual, the submitter didn't bother to help triage and just dumped a > > ton of work in our laps. They didn't follow up with any regression > > tests, because few fuzz kiddiez ever do. At the time, I was too burned > > out to deal with it, so Dave posted fixes. > > > > Patches 7-8 would manifest themselves as test VMs halting on ASSERTs if > > you configure your kernel to panic. Not strictly needed since most LTS > > kernels probably don't even have XFS_DEBUG=y, but it makes the lives of > > recoveryloop runners easier if they do. > > > > Patch 9 trips xfs/434 and xfs/436, but they only run if you have slab > > debugging enabled and build XFS as a module. I don't know why very few > > people do this. > > > > Patch 10 is a memory leak if you have XFS_DEBUG=y. No need for a > > separate test for this one, since kmemleak catches it. If you turn it > > on. > > > > (IOWs, LGTM for the whole set.) > > Good to add Ack tag? Oh, yes, sorry, I forgot about the formal tagging bit: Acked-by: Darrick J. Wong <djwong@kernel.org> --D > > > > > > > > Leah, please consider working on the SGID bug fixes for the next 5.15 > > > update, because my 5.10 SGID fixes series [1] has been blocked for > > > months and because there are several reproducible test cases in xfstest. > > > > Whenever y'all get to 6.0, beware of commit 2ed5b09b3e8f ("xfs: make > > inode attribute forks a permanent part of struct xfs_inode"), which is a > > KASAN UAF that we fixed by eliminating the 'F'. Do not pull on the > > devilstring ("...the file capabilities code calls getxattr with and > > without i_rwsem held...") if you can avoid it. > > > > > I did not push on that until now because SGID test expectations were > > > a moving target, but since xfstests commit 81e6f628 ("generic: update > > > setgid tests") in this week's xfstests release, I think that tests should be > > > stable and we can finally start backporting all relevant SGID fixes to > > > align the SGID behavior of LTS kernels with that of upstream. > > Ooo goody, ok, will do this next. > > The following patches are on my radar to look into for this set. I have > yet to look into dependencies, so the set may grow. If the sgid tests > still fail after these ptaches, I will continue hunting for more fixes > to include in this set. > > e014f37db1a2 xfs: use setattr_copy to set vfs inode attributes > 472c6e46f589 xfs: remove XFS_PREALLOC_SYNC > fbe7e5200365 xfs: fallocate() should call file_modified() > 0b02c8c0d75a xfs: set prealloc flag in xfs_alloc_file_space() > 2b3416ceff5e fs: add mode_strip_sgid() helper > 1639a49ccdce fs: move S_ISGID stripping into the vfs_*() helpers > ed5a7047d201 attr: use consistent sgid stripping checks > 8d84e39d76bd fs: use consistent setgid checks in is_sxid() > > In addition to the normal regression testing, I will specifically look > at the following tests for the sgid changes: > > generic/673 > generic/68[3-7] > generic/69[6-7] > > I will also do some extra runs on the entire perms group. > > Let me know if you think something should be dropped or added. > > - Leah > > > > > Oh good, I've been (gently) waiting on that one too. :) > > > > --D > > > > > Thanks, > > > Amir. > > > > > > [1] https://github.com/amir73il/linux/commits/xfs-5.10.y-sgid-fixes > > > > > > > > > > > Below I've outlined which series the backports came from: > > > > > > > > series "xfs: intent whiteouts" (1): > > > > [01/10] cb512c921639613ce03f87e62c5e93ed9fe8c84d > > > > xfs: zero inode fork buffer at allocation > > > > [02/10] c230a4a85bcdbfc1a7415deec6caf04e8fca1301 > > > > xfs: fix potential log item leak > > > > > > > > series "xfs: fix random format verification issues" (2): > > > > [1/4] dc04db2aa7c9307e740d6d0e173085301c173b1a > > > > xfs: detect self referencing btree sibling pointers > > > > [2/4] 1eb70f54c445fcbb25817841e774adb3d912f3e8 -> already in 5.15.y > > > > xfs: validate inode fork size against fork format > > > > [3/4] dd0d2f9755191690541b09e6385d0f8cd8bc9d8f > > > > xfs: set XFS_FEAT_NLINK correctly > > > > [4/4] f0f5f658065a5af09126ec892e4c383540a1c77f > > > > xfs: validate v5 feature fields > > > > > > > > series "xfs: small fixes for 5.19 cycle" (3): > > > > [1/3] 5672225e8f2a872a22b0cecedba7a6644af1fb84 > > > > xfs: avoid unnecessary runtime sibling pointer endian conversions > > > > [2/3] 5b55cbc2d72632e874e50d2e36bce608e55aaaea > > > > fs: don't assert fail on perag references on teardown > > > > [2/3] 56486f307100e8fc66efa2ebd8a71941fa10bf6f > > > > xfs: assert in xfs_btree_del_cursor should take into account error > > > > > > > > series "xfs: random fixes for 5.19" (4): > > > > [1/2] 86d40f1e49e9a909d25c35ba01bea80dbcd758cb > > > > xfs: purge dquots after inode walk fails during quotacheck > > > > [2/2] a54f78def73d847cb060b18c4e4a3d1d26c9ca6d > > > > xfs: don't leak btree cursor when insrec fails after a split > > > > > > > > (1) https://lore.kernel.org/all/20220503221728.185449-1-david@fromorbit.com/ > > > > (2) https://lore.kernel.org/all/20220502082018.1076561-1-david@fromorbit.com/ > > > > (3) https://lore.kernel.org/all/20220524022158.1849458-1-david@fromorbit.com/ > > > > (4) https://lore.kernel.org/all/165337056527.993079.1232300816023906959.stgit@magnolia/ > > > > > > > > Darrick J. Wong (2): > > > > xfs: purge dquots after inode walk fails during quotacheck > > > > xfs: don't leak btree cursor when insrec fails after a split > > > > > > > > Dave Chinner (8): > > > > xfs: zero inode fork buffer at allocation > > > > xfs: fix potential log item leak > > > > xfs: detect self referencing btree sibling pointers > > > > xfs: set XFS_FEAT_NLINK correctly > > > > xfs: validate v5 feature fields > > > > xfs: avoid unnecessary runtime sibling pointer endian conversions > > > > xfs: don't assert fail on perag references on teardown > > > > xfs: assert in xfs_btree_del_cursor should take into account error > > > > > > > > fs/xfs/libxfs/xfs_ag.c | 3 +- > > > > fs/xfs/libxfs/xfs_btree.c | 175 +++++++++++++++++++++++++-------- > > > > fs/xfs/libxfs/xfs_inode_fork.c | 12 ++- > > > > fs/xfs/libxfs/xfs_sb.c | 70 +++++++++++-- > > > > fs/xfs/xfs_bmap_item.c | 2 + > > > > fs/xfs/xfs_icreate_item.c | 1 + > > > > fs/xfs/xfs_qm.c | 9 +- > > > > fs/xfs/xfs_refcount_item.c | 2 + > > > > fs/xfs/xfs_rmap_item.c | 2 + > > > > 9 files changed, 221 insertions(+), 55 deletions(-) > > > > > > > > -- > > > > 2.39.1.519.gcb327c4b5f-goog > > > >
On Fri, Feb 10, 2023 at 9:51 PM Leah Rumancik <leah.rumancik@gmail.com> wrote: > [...] > > > I did not push on that until now because SGID test expectations were > > > a moving target, but since xfstests commit 81e6f628 ("generic: update > > > setgid tests") in this week's xfstests release, I think that tests should be > > > stable and we can finally start backporting all relevant SGID fixes to > > > align the SGID behavior of LTS kernels with that of upstream. > > Ooo goody, ok, will do this next. > > The following patches are on my radar to look into for this set. I have > yet to look into dependencies, so the set may grow. If the sgid tests > still fail after these ptaches, I will continue hunting for more fixes > to include in this set. > > e014f37db1a2 xfs: use setattr_copy to set vfs inode attributes > 472c6e46f589 xfs: remove XFS_PREALLOC_SYNC > fbe7e5200365 xfs: fallocate() should call file_modified() > 0b02c8c0d75a xfs: set prealloc flag in xfs_alloc_file_space() > 2b3416ceff5e fs: add mode_strip_sgid() helper > 1639a49ccdce fs: move S_ISGID stripping into the vfs_*() helpers > ed5a7047d201 attr: use consistent sgid stripping checks > 8d84e39d76bd fs: use consistent setgid checks in is_sxid() > > In addition to the normal regression testing, I will specifically look > at the following tests for the sgid changes: > > generic/673 > generic/68[3-7] > generic/69[6-7] > > I will also do some extra runs on the entire perms group. > > Let me know if you think something should be dropped or added. > I reckon you will need those dependency prep commits from Christian's PR [1]: 11c2a8700cdc attr: add in_group_or_capable() e243e3f94c80 fs: move should_remove_suid() 72ae017c5451 attr: add setattr_should_drop_sgid() FYI, the ovl commits from this PR are independent fixes that were already applied to 5.15.y. Thanks, Amir. [1] https://lore.kernel.org/linux-fsdevel/20221212112053.99208-1-brauner@kernel.org/