Message ID | 20200922034249.20549-1-hsiangkao@aol.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | xfs: drop the obsolete comment on filestream locking | expand |
On Tue, Sep 22, 2020 at 11:42:49AM +0800, Gao Xiang wrote: > From: Gao Xiang <hsiangkao@redhat.com> > > Since commit 1c1c6ebcf52 ("xfs: Replace per-ag array with a radix > tree"), there is no m_peraglock anymore, so it's hard to understand > the described situation since per-ag is no longer an array and no > need to reallocate, call xfs_filestream_flush() in growfs. > > In addition, the race condition for shrink feature is quite confusing > to me currently as well. Get rid of it instead. > (Add some words) I think I understand what the race condition could mean after shrink fs is landed then, but the main point for now is inconsistent between code and comment, and there is no infrastructure on shrinkfs so when shrink fs is landed, the locking rule on filestream should be refined or redesigned and xfs_filestream_flush() for shrinkfs which was once deleted by 1c1c6ebcf52 might be restored to drain out in-flight xfs_fstrm_item for these shrink AGs then. From the current code logic, the comment has no use and has been outdated for years. Keep up with the code would be better IMO to save time.
On Tue, Sep 22, 2020 at 12:44:28PM +0800, Gao Xiang wrote: > On Tue, Sep 22, 2020 at 11:42:49AM +0800, Gao Xiang wrote: > > From: Gao Xiang <hsiangkao@redhat.com> > > > > Since commit 1c1c6ebcf52 ("xfs: Replace per-ag array with a radix > > tree"), there is no m_peraglock anymore, so it's hard to understand > > the described situation since per-ag is no longer an array and no > > need to reallocate, call xfs_filestream_flush() in growfs. > > > > In addition, the race condition for shrink feature is quite confusing > > to me currently as well. Get rid of it instead. > > > > (Add some words) I think I understand what the race condition could mean > after shrink fs is landed then, but the main point for now is inconsistent > between code and comment, and there is no infrastructure on shrinkfs so > when shrink fs is landed, the locking rule on filestream should be refined > or redesigned and xfs_filestream_flush() for shrinkfs which was once > deleted by 1c1c6ebcf52 might be restored to drain out in-flight > xfs_fstrm_item for these shrink AGs then. > > From the current code logic, the comment has no use and has been outdated > for years. Keep up with the code would be better IMO to save time. Not being familiar with the filestream code at all, I wonder, what replaced all that stuff? Does that need a comment? I can't really tell at a quick glance what coordinates growfs with filestreams. --D
Hi Darrick, On Tue, Sep 22, 2020 at 09:03:28AM -0700, Darrick J. Wong wrote: > On Tue, Sep 22, 2020 at 12:44:28PM +0800, Gao Xiang wrote: > > On Tue, Sep 22, 2020 at 11:42:49AM +0800, Gao Xiang wrote: > > > From: Gao Xiang <hsiangkao@redhat.com> > > > > > > Since commit 1c1c6ebcf52 ("xfs: Replace per-ag array with a radix > > > tree"), there is no m_peraglock anymore, so it's hard to understand > > > the described situation since per-ag is no longer an array and no > > > need to reallocate, call xfs_filestream_flush() in growfs. > > > > > > In addition, the race condition for shrink feature is quite confusing > > > to me currently as well. Get rid of it instead. > > > > > > > (Add some words) I think I understand what the race condition could mean > > after shrink fs is landed then, but the main point for now is inconsistent > > between code and comment, and there is no infrastructure on shrinkfs so > > when shrink fs is landed, the locking rule on filestream should be refined > > or redesigned and xfs_filestream_flush() for shrinkfs which was once > > deleted by 1c1c6ebcf52 might be restored to drain out in-flight > > xfs_fstrm_item for these shrink AGs then. > > > > From the current code logic, the comment has no use and has been outdated > > for years. Keep up with the code would be better IMO to save time. > > Not being familiar with the filestream code at all, I wonder, what > replaced all that stuff? Does that need a comment? I can't really tell > at a quick glance what coordinates growfs with filestreams. (try to cc Dave...) I'm not quite familiar with filestream as well. After several days random glance about the constraint of shrink feature in XFS, I found such comment and try to understand such constraint. Finally, I think it was useful only when perag was once an array and need to be reallocated (before 1c1c6ebcf52). So need to close the race by the m_peraglock (which is now deleted) and drain out in-flight AG filestream by xfs_filestream_flush() in growfs code (I think due to pag array reallocation). For now, m_peraglock and xfs_filestream_flush() in xfs_growfs_data_private() no longer exist... and we don't need to reallocate perag array but rather to use radix tree instead. but IMO, shrink an AG might need to restore to drain in-flight filestream, I couldn't tell much more of it... Overall, the current comment is quite confusing. I'd suggest it'd be better with some more reasonable comment about this at least... Thanks, Gao Xiang > > --D >
On Wed, Sep 23, 2020 at 12:23:28AM +0800, Gao Xiang wrote: > Hi Darrick, > > On Tue, Sep 22, 2020 at 09:03:28AM -0700, Darrick J. Wong wrote: > > On Tue, Sep 22, 2020 at 12:44:28PM +0800, Gao Xiang wrote: > > > On Tue, Sep 22, 2020 at 11:42:49AM +0800, Gao Xiang wrote: > > > > From: Gao Xiang <hsiangkao@redhat.com> > > > > > > > > Since commit 1c1c6ebcf52 ("xfs: Replace per-ag array with a radix > > > > tree"), there is no m_peraglock anymore, so it's hard to understand > > > > the described situation since per-ag is no longer an array and no > > > > need to reallocate, call xfs_filestream_flush() in growfs. > > > > > > > > In addition, the race condition for shrink feature is quite confusing > > > > to me currently as well. Get rid of it instead. > > > > > > > > > > (Add some words) I think I understand what the race condition could mean > > > after shrink fs is landed then, but the main point for now is inconsistent > > > between code and comment, and there is no infrastructure on shrinkfs so > > > when shrink fs is landed, the locking rule on filestream should be refined > > > or redesigned and xfs_filestream_flush() for shrinkfs which was once > > > deleted by 1c1c6ebcf52 might be restored to drain out in-flight > > > xfs_fstrm_item for these shrink AGs then. > > > > > > From the current code logic, the comment has no use and has been outdated > > > for years. Keep up with the code would be better IMO to save time. > > > > Not being familiar with the filestream code at all, I wonder, what > > replaced all that stuff? Does that need a comment? I can't really tell > > at a quick glance what coordinates growfs with filestreams. > > (try to cc Dave...) > > I'm not quite familiar with filestream as well. After several days random > glance about the constraint of shrink feature in XFS, I found such comment > and try to understand such constraint. > > Finally, I think it was useful only when perag was once an array and need > to be reallocated (before 1c1c6ebcf52). So need to close the race by the > m_peraglock (which is now deleted) and drain out in-flight AG filestream > by xfs_filestream_flush() in growfs code (I think due to pag array > reallocation). > > For now, m_peraglock and xfs_filestream_flush() in xfs_growfs_data_private() > no longer exist... and we don't need to reallocate perag array but rather > to use radix tree instead. Yeah. I guess you could shrink the comment to warn that any code wanting to /remove/ an AG would need to be careful of the racy sequence outlined in the three bullet points. OTOH others have argued against leaving comments that describe features we don't support. But maybe it's better just to kill the whole comment like you proposed? > but IMO, shrink an AG might need to restore to drain in-flight filestream, > I couldn't tell much more of it... Overall, the current comment is quite > confusing. I'd suggest it'd be better with some more reasonable comment > about this at least... Yes, you have to drain /all/ the incore state that pertains to an AG if you're going to remove the AG. --D > > Thanks, > Gao Xiang > > > > > --D > > >
On Tue, Sep 22, 2020 at 09:03:28AM -0700, Darrick J. Wong wrote: > On Tue, Sep 22, 2020 at 12:44:28PM +0800, Gao Xiang wrote: > > On Tue, Sep 22, 2020 at 11:42:49AM +0800, Gao Xiang wrote: > > > From: Gao Xiang <hsiangkao@redhat.com> > > > > > > Since commit 1c1c6ebcf52 ("xfs: Replace per-ag array with a radix > > > tree"), there is no m_peraglock anymore, so it's hard to understand > > > the described situation since per-ag is no longer an array and no > > > need to reallocate, call xfs_filestream_flush() in growfs. > > > > > > In addition, the race condition for shrink feature is quite confusing > > > to me currently as well. Get rid of it instead. > > > > > > > (Add some words) I think I understand what the race condition could mean > > after shrink fs is landed then, but the main point for now is inconsistent > > between code and comment, and there is no infrastructure on shrinkfs so > > when shrink fs is landed, the locking rule on filestream should be refined > > or redesigned and xfs_filestream_flush() for shrinkfs which was once > > deleted by 1c1c6ebcf52 might be restored to drain out in-flight > > xfs_fstrm_item for these shrink AGs then. > > > > From the current code logic, the comment has no use and has been outdated > > for years. Keep up with the code would be better IMO to save time. > > Not being familiar with the filestream code at all, I wonder, what > replaced all that stuff? Does that need a comment? I can't really tell > at a quick glance what coordinates growfs with filestreams. The filestream perag state would get trashed by the realloc of the perag array that growfs used to do. Hence the filestreams had to be flushed before growfs could realloc the array so there was no state that could be lost by a grow. The m_peraglock was needed to serialise that. Moving to the current perag tree setup meant grow no longer reallocated perag structures, so they didn't go away transiently and lose state any more, hence none of the flushing or perag locking was needed anymore. Shrink is a different matter altogether. The shrink context is going to have to flush the filestreams itself after it makes sure that new filestreams can't be created in that AG..... Cheers, Dave.
On Wed, Sep 23, 2020 at 07:26:46AM +1000, Dave Chinner wrote: > On Tue, Sep 22, 2020 at 09:03:28AM -0700, Darrick J. Wong wrote: > > On Tue, Sep 22, 2020 at 12:44:28PM +0800, Gao Xiang wrote: > > > On Tue, Sep 22, 2020 at 11:42:49AM +0800, Gao Xiang wrote: > > > > From: Gao Xiang <hsiangkao@redhat.com> > > > > > > > > Since commit 1c1c6ebcf52 ("xfs: Replace per-ag array with a radix > > > > tree"), there is no m_peraglock anymore, so it's hard to understand > > > > the described situation since per-ag is no longer an array and no > > > > need to reallocate, call xfs_filestream_flush() in growfs. > > > > > > > > In addition, the race condition for shrink feature is quite confusing > > > > to me currently as well. Get rid of it instead. > > > > > > > > > > (Add some words) I think I understand what the race condition could mean > > > after shrink fs is landed then, but the main point for now is inconsistent > > > between code and comment, and there is no infrastructure on shrinkfs so > > > when shrink fs is landed, the locking rule on filestream should be refined > > > or redesigned and xfs_filestream_flush() for shrinkfs which was once > > > deleted by 1c1c6ebcf52 might be restored to drain out in-flight > > > xfs_fstrm_item for these shrink AGs then. > > > > > > From the current code logic, the comment has no use and has been outdated > > > for years. Keep up with the code would be better IMO to save time. > > > > Not being familiar with the filestream code at all, I wonder, what > > replaced all that stuff? Does that need a comment? I can't really tell > > at a quick glance what coordinates growfs with filestreams. > > The filestream perag state would get trashed by the realloc of the > perag array that growfs used to do. Hence the filestreams had to be > flushed before growfs could realloc the array so there was no state > that could be lost by a grow. The m_peraglock was needed to > serialise that. Moving to the current perag tree setup meant grow no > longer reallocated perag structures, so they didn't go away > transiently and lose state any more, hence none of the flushing or > perag locking was needed anymore. > > Shrink is a different matter altogether. The shrink context is going > to have to flush the filestreams itself after it makes sure that new > filestreams can't be created in that AG..... Yeah, just the suggestion about this comment. (And as Darrick also mentioned before, maybe just removing the related stuff for now to avoid confusion would be better... I'm not sure...) Thanks, Gao Xiang > > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com >
On Tue, Sep 22, 2020 at 11:42:49AM +0800, Gao Xiang wrote: > From: Gao Xiang <hsiangkao@redhat.com> > > Since commit 1c1c6ebcf52 ("xfs: Replace per-ag array with a radix > tree"), there is no m_peraglock anymore, so it's hard to understand > the described situation since per-ag is no longer an array and no > need to reallocate, call xfs_filestream_flush() in growfs. > > In addition, the race condition for shrink feature is quite confusing > to me currently as well. Get rid of it instead. > > Signed-off-by: Gao Xiang <hsiangkao@redhat.com> Yes, this is all very outdated with the removal of the realloced ag array: Reviewed-by: Christoph Hellwig <hch@lst.de>
On Tue, Sep 22, 2020 at 11:42:49AM +0800, Gao Xiang wrote: > From: Gao Xiang <hsiangkao@redhat.com> > > Since commit 1c1c6ebcf52 ("xfs: Replace per-ag array with a radix > tree"), there is no m_peraglock anymore, so it's hard to understand > the described situation since per-ag is no longer an array and no > need to reallocate, call xfs_filestream_flush() in growfs. > > In addition, the race condition for shrink feature is quite confusing > to me currently as well. Get rid of it instead. > > Signed-off-by: Gao Xiang <hsiangkao@redhat.com> I'm convinced by the discussion, so: Reviewed-by: Darrick J. Wong <darrick.wong@oracle.com> --D > --- > fs/xfs/xfs_filestream.c | 34 +--------------------------------- > 1 file changed, 1 insertion(+), 33 deletions(-) > > diff --git a/fs/xfs/xfs_filestream.c b/fs/xfs/xfs_filestream.c > index 1a88025e68a3..db23e455eb91 100644 > --- a/fs/xfs/xfs_filestream.c > +++ b/fs/xfs/xfs_filestream.c > @@ -33,39 +33,7 @@ enum xfs_fstrm_alloc { > /* > * Allocation group filestream associations are tracked with per-ag atomic > * counters. These counters allow xfs_filestream_pick_ag() to tell whether a > - * particular AG already has active filestreams associated with it. The mount > - * point's m_peraglock is used to protect these counters from per-ag array > - * re-allocation during a growfs operation. When xfs_growfs_data_private() is > - * about to reallocate the array, it calls xfs_filestream_flush() with the > - * m_peraglock held in write mode. > - * > - * Since xfs_mru_cache_flush() guarantees that all the free functions for all > - * the cache elements have finished executing before it returns, it's safe for > - * the free functions to use the atomic counters without m_peraglock protection. > - * This allows the implementation of xfs_fstrm_free_func() to be agnostic about > - * whether it was called with the m_peraglock held in read mode, write mode or > - * not held at all. The race condition this addresses is the following: > - * > - * - The work queue scheduler fires and pulls a filestream directory cache > - * element off the LRU end of the cache for deletion, then gets pre-empted. > - * - A growfs operation grabs the m_peraglock in write mode, flushes all the > - * remaining items from the cache and reallocates the mount point's per-ag > - * array, resetting all the counters to zero. > - * - The work queue thread resumes and calls the free function for the element > - * it started cleaning up earlier. In the process it decrements the > - * filestreams counter for an AG that now has no references. > - * > - * With a shrinkfs feature, the above scenario could panic the system. > - * > - * All other uses of the following macros should be protected by either the > - * m_peraglock held in read mode, or the cache's internal locking exposed by the > - * interval between a call to xfs_mru_cache_lookup() and a call to > - * xfs_mru_cache_done(). In addition, the m_peraglock must be held in read mode > - * when new elements are added to the cache. > - * > - * Combined, these locking rules ensure that no associations will ever exist in > - * the cache that reference per-ag array elements that have since been > - * reallocated. > + * particular AG already has active filestreams associated with it. > */ > int > xfs_filestream_peek_ag( > -- > 2.24.0 >
diff --git a/fs/xfs/xfs_filestream.c b/fs/xfs/xfs_filestream.c index 1a88025e68a3..db23e455eb91 100644 --- a/fs/xfs/xfs_filestream.c +++ b/fs/xfs/xfs_filestream.c @@ -33,39 +33,7 @@ enum xfs_fstrm_alloc { /* * Allocation group filestream associations are tracked with per-ag atomic * counters. These counters allow xfs_filestream_pick_ag() to tell whether a - * particular AG already has active filestreams associated with it. The mount - * point's m_peraglock is used to protect these counters from per-ag array - * re-allocation during a growfs operation. When xfs_growfs_data_private() is - * about to reallocate the array, it calls xfs_filestream_flush() with the - * m_peraglock held in write mode. - * - * Since xfs_mru_cache_flush() guarantees that all the free functions for all - * the cache elements have finished executing before it returns, it's safe for - * the free functions to use the atomic counters without m_peraglock protection. - * This allows the implementation of xfs_fstrm_free_func() to be agnostic about - * whether it was called with the m_peraglock held in read mode, write mode or - * not held at all. The race condition this addresses is the following: - * - * - The work queue scheduler fires and pulls a filestream directory cache - * element off the LRU end of the cache for deletion, then gets pre-empted. - * - A growfs operation grabs the m_peraglock in write mode, flushes all the - * remaining items from the cache and reallocates the mount point's per-ag - * array, resetting all the counters to zero. - * - The work queue thread resumes and calls the free function for the element - * it started cleaning up earlier. In the process it decrements the - * filestreams counter for an AG that now has no references. - * - * With a shrinkfs feature, the above scenario could panic the system. - * - * All other uses of the following macros should be protected by either the - * m_peraglock held in read mode, or the cache's internal locking exposed by the - * interval between a call to xfs_mru_cache_lookup() and a call to - * xfs_mru_cache_done(). In addition, the m_peraglock must be held in read mode - * when new elements are added to the cache. - * - * Combined, these locking rules ensure that no associations will ever exist in - * the cache that reference per-ag array elements that have since been - * reallocated. + * particular AG already has active filestreams associated with it. */ int xfs_filestream_peek_ag(