Message ID | 20240429070200.1586537-3-hch@lst.de (mailing list archive) |
---|---|
State | New |
Headers | show |
Series | [1/3] xfs: fix log recovery buffer allocation for the legacy h_size fixup | expand |
On Mon, Apr 29, 2024 at 09:01:59AM +0200, Christoph Hellwig wrote: > The reflink and rmap features require a fixed xfsprogs, so don't allow > this fixup for them. > > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- Seems reasonable.. > fs/xfs/xfs_log_recover.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c > index bb8957927c3c2e..d73bec65f93b46 100644 > --- a/fs/xfs/xfs_log_recover.c > +++ b/fs/xfs/xfs_log_recover.c > @@ -3040,10 +3040,14 @@ xlog_do_recovery_pass( > * Detect this condition here. Use lsunit for the buffer size as > * long as this looks like the mkfs case. Otherwise, return an > * error to avoid a buffer overrun. > + * > + * Reject the invalid size if the file system has new enough > + * features that require a fixed mkfs. > */ > h_size = be32_to_cpu(rhead->h_size); > h_len = be32_to_cpu(rhead->h_len); > - if (h_len > h_size && h_len <= log->l_mp->m_logbsize && > + if (!xfs_has_reflink(log->l_mp) && xfs_has_reflink(log->l_mp) && > + h_len > h_size && h_len <= log->l_mp->m_logbsize && ... but I'm going to assume this hasn't been tested. ;) Do you mean to also check !rmapbt here? Can you please also just double check that we still handle the original mkfs problem correctly after these changes? I think that just means mkfs from a sufficiently old xfsprogs using a larger log stripe unit, and confirm the fs mounts (with a warning). Brian > rhead->h_num_logops == cpu_to_be32(1)) { > xfs_warn(log->l_mp, > "invalid iclog size (%d bytes), using lsunit (%d bytes)", > -- > 2.39.2 > >
On Mon, Apr 29, 2024 at 09:01:59AM +0200, Christoph Hellwig wrote: > The reflink and rmap features require a fixed xfsprogs, so don't allow > this fixup for them. > > Signed-off-by: Christoph Hellwig <hch@lst.de> > --- > fs/xfs/xfs_log_recover.c | 6 +++++- > 1 file changed, 5 insertions(+), 1 deletion(-) > > diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c > index bb8957927c3c2e..d73bec65f93b46 100644 > --- a/fs/xfs/xfs_log_recover.c > +++ b/fs/xfs/xfs_log_recover.c > @@ -3040,10 +3040,14 @@ xlog_do_recovery_pass( > * Detect this condition here. Use lsunit for the buffer size as > * long as this looks like the mkfs case. Otherwise, return an > * error to avoid a buffer overrun. > + * > + * Reject the invalid size if the file system has new enough > + * features that require a fixed mkfs. > */ > h_size = be32_to_cpu(rhead->h_size); > h_len = be32_to_cpu(rhead->h_len); > - if (h_len > h_size && h_len <= log->l_mp->m_logbsize && > + if (!xfs_has_reflink(log->l_mp) && xfs_has_reflink(log->l_mp) && > + h_len > h_size && h_len <= log->l_mp->m_logbsize && > rhead->h_num_logops == cpu_to_be32(1)) { Same comment about do you want to test for rmap and reflink here? I also wonder if this multiline predicate should turn into a static inline helper. I nearly wrote you one, but then I realize that I don't remember enough about the xfsprogs problem to know if the problem was limited to mkfs, or if xfs_repair zeroing the log would also write out a bad h_size? --D > xfs_warn(log->l_mp, > "invalid iclog size (%d bytes), using lsunit (%d bytes)", > -- > 2.39.2 > >
On Mon, Apr 29, 2024 at 08:18:44AM -0400, Brian Foster wrote: > > - if (h_len > h_size && h_len <= log->l_mp->m_logbsize && > > + if (!xfs_has_reflink(log->l_mp) && xfs_has_reflink(log->l_mp) && > > + h_len > h_size && h_len <= log->l_mp->m_logbsize && > > ... but I'm going to assume this hasn't been tested. ;) Do you mean to > also check !rmapbt here? Heh. Well, it has been tested in that we don't do the fixup for the reproducer fixed by the previous patch and in that xfstests still passes. I guess nothing in there hits the old mkfs fixup, which isn't surprising. > Can you please also just double check that we still handle the original > mkfs problem correctly after these changes? I think that just means mkfs > from a sufficiently old xfsprogs using a larger log stripe unit, and > confirm the fs mounts (with a warning). Yeah. Is there any way to commit a fs image to xfstests so that we actually regularly test for it?
On Mon, Apr 29, 2024 at 07:15:52PM +0200, Christoph Hellwig wrote: > On Mon, Apr 29, 2024 at 08:18:44AM -0400, Brian Foster wrote: > > > - if (h_len > h_size && h_len <= log->l_mp->m_logbsize && > > > + if (!xfs_has_reflink(log->l_mp) && xfs_has_reflink(log->l_mp) && > > > + h_len > h_size && h_len <= log->l_mp->m_logbsize && > > > > ... but I'm going to assume this hasn't been tested. ;) Do you mean to > > also check !rmapbt here? > > Heh. Well, it has been tested in that we don't do the fixup for the > reproducer fixed by the previous patch and in that xfstests still passes. > I guess nothing in there hits the old mkfs fixup, which isn't surprising. > Yeah.. (sorry, just teasing about the testing.. ;). > > Can you please also just double check that we still handle the original > > mkfs problem correctly after these changes? I think that just means mkfs > > from a sufficiently old xfsprogs using a larger log stripe unit, and > > confirm the fs mounts (with a warning). > > Yeah. Is there any way to commit a fs image to xfstests so that we > actually regularly test for it? > Not sure.. ideally we could fuzz the log record header somehow or another to test for these various scenarios, since we clearly broke this once already. I don't quite recall if I looked into that at the time of the original workaround. To Darrick's point, I wonder if there would be some use to an expert logformat command or something that allowed for some bonkers parameters (assuming something like that doesn't exist already). I'm out on PTO for (at least) today, but I can take a closer look at that once I'm back and caught up... Brian
On Tue, Apr 30, 2024 at 06:59:47AM -0400, Brian Foster wrote: > On Mon, Apr 29, 2024 at 07:15:52PM +0200, Christoph Hellwig wrote: > > On Mon, Apr 29, 2024 at 08:18:44AM -0400, Brian Foster wrote: > > > > - if (h_len > h_size && h_len <= log->l_mp->m_logbsize && > > > > + if (!xfs_has_reflink(log->l_mp) && xfs_has_reflink(log->l_mp) && > > > > + h_len > h_size && h_len <= log->l_mp->m_logbsize && > > > > > > ... but I'm going to assume this hasn't been tested. ;) Do you mean to > > > also check !rmapbt here? > > > > Heh. Well, it has been tested in that we don't do the fixup for the > > reproducer fixed by the previous patch and in that xfstests still passes. > > I guess nothing in there hits the old mkfs fixup, which isn't surprising. > > > > Yeah.. (sorry, just teasing about the testing.. ;). > > > > Can you please also just double check that we still handle the original > > > mkfs problem correctly after these changes? I think that just means mkfs > > > from a sufficiently old xfsprogs using a larger log stripe unit, and > > > confirm the fs mounts (with a warning). > > > > Yeah. Is there any way to commit a fs image to xfstests so that we > > actually regularly test for it? > > > > Not sure.. ideally we could fuzz the log record header somehow or > another to test for these various scenarios, since we clearly broke this > once already. > > I don't quite recall if I looked into that at the time of the original > workaround. To Darrick's point, I wonder if there would be some use to > an expert logformat command or something that allowed for some bonkers > parameters (assuming something like that doesn't exist already). > > I'm out on PTO for (at least) today, but I can take a closer look at > that once I'm back and caught up... > I've had a chance to poke at this a bit and so far I don't think it's as straightforward as I'd hoped. The logformat command already exists, which makes it pretty easy to malformat a log record, but the recovery code only runs into this path on a dirty log. I suspect the original issue that prompted this logic was something like a crash-recovery test leading to log record trimming (i.e. simulated torn log writes) and then happening onto previously mkfs-formatted log records that way, but that is a guess at this point. I did play around a bit with fuzzing h_size for those sorts of tests (i.e. xfs/141), but that ran into other issues. I went back to using xfsprogs v4.3 or so and that eventually also produces an unmountable filesystem with a similar error (even with the h_size fix patch). It fails to locate the head/tail, which is obviously necessary in order to process the log and perform record validation, so I'm wondering if something else changed on the kernel side to further gate this scenario. Of course it's also possible I'm looking at things wrong and this is just orthogonal to the original problem. But given all of that, I am a little skeptical on the value of retaining this logic at all. Does whatever test case that recently uncovered the h_size problem leave a mountable filesystem, or does it just work around bad behavior to provide a more graceful failure condition? Brian
diff --git a/fs/xfs/xfs_log_recover.c b/fs/xfs/xfs_log_recover.c index bb8957927c3c2e..d73bec65f93b46 100644 --- a/fs/xfs/xfs_log_recover.c +++ b/fs/xfs/xfs_log_recover.c @@ -3040,10 +3040,14 @@ xlog_do_recovery_pass( * Detect this condition here. Use lsunit for the buffer size as * long as this looks like the mkfs case. Otherwise, return an * error to avoid a buffer overrun. + * + * Reject the invalid size if the file system has new enough + * features that require a fixed mkfs. */ h_size = be32_to_cpu(rhead->h_size); h_len = be32_to_cpu(rhead->h_len); - if (h_len > h_size && h_len <= log->l_mp->m_logbsize && + if (!xfs_has_reflink(log->l_mp) && xfs_has_reflink(log->l_mp) && + h_len > h_size && h_len <= log->l_mp->m_logbsize && rhead->h_num_logops == cpu_to_be32(1)) { xfs_warn(log->l_mp, "invalid iclog size (%d bytes), using lsunit (%d bytes)",
The reflink and rmap features require a fixed xfsprogs, so don't allow this fixup for them. Signed-off-by: Christoph Hellwig <hch@lst.de> --- fs/xfs/xfs_log_recover.c | 6 +++++- 1 file changed, 5 insertions(+), 1 deletion(-)