Message ID | 20190801021752.4986-16-david@fromorbit.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | mm, xfs: non-blocking inode reclaim | expand |
On Thu, Aug 01, 2019 at 12:17:43PM +1000, Dave Chinner wrote: > From: Dave Chinner <dchinner@redhat.com> > > The CIL can pin a lot of memory and effectively defines the lower > free memory boundary of operation for XFS. The way we hang onto > log item shadow buffers "just in case" effectively doubles the > memory footprint of the CIL for dubious reasons. > > That is, we hang onto the old shadow buffer in case the next time > we log the item it will fit into the shadow buffer and we won't have > to allocate a new one. However, we only ever tend to grow dirty > objects in the CIL through relogging, so once we've allocated a > larger buffer the old buffer we set as a shadow buffer will never > get reused as the amount we log never decreases until the item is > clean. And then for buffer items we free the log item and the shadow > buffers, anyway. Inode items will hold onto their shadow buffer > until they are reclaimed - this could double the inode's memory > footprint for it's lifetime... > > Hence we should just free the old log item buffer when we replace it > with a new shadow buffer rather than storing it for later use. It's > not useful, get rid of it as early as possible. > > Signed-off-by: Dave Chinner <dchinner@redhat.com> > --- > fs/xfs/xfs_log_cil.c | 7 +++---- > 1 file changed, 3 insertions(+), 4 deletions(-) > > diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c > index fa5602d0fd7f..1863a9bdf4a9 100644 > --- a/fs/xfs/xfs_log_cil.c > +++ b/fs/xfs/xfs_log_cil.c > @@ -238,9 +238,7 @@ xfs_cil_prepare_item( > /* > * If there is no old LV, this is the first time we've seen the item in > * this CIL context and so we need to pin it. If we are replacing the > - * old_lv, then remove the space it accounts for and make it the shadow > - * buffer for later freeing. In both cases we are now switching to the > - * shadow buffer, so update the the pointer to it appropriately. > + * old_lv, then remove the space it accounts for and free it. > */ The comment above xlog_cil_alloc_shadow_bufs() needs a similar update around how we handle the old buffer when the shadow buffer is used. > if (!old_lv) { > if (lv->lv_item->li_ops->iop_pin) > @@ -251,7 +249,8 @@ xfs_cil_prepare_item( > > *diff_len -= old_lv->lv_bytes; > *diff_iovecs -= old_lv->lv_niovecs; > - lv->lv_item->li_lv_shadow = old_lv; > + kmem_free(old_lv); > + lv->lv_item->li_lv_shadow = NULL; > } So IIUC this is the case where we allocated a shadow buffer, the item was already pinned (so old_lv is still around) but we ended up using the shadow buffer for this relog. Instead of keeping the old buffer around as a new shadow, we toss it. That makes sense, but if the objective is to not leave dangling shadow buffers around as such, what about the case where we allocated a shadow buffer but didn't end up using it because old_lv was reusable? It looks like we still keep the shadow buffer around in that scenario with a similar lifetime as the swapout scenario this patch removes. Hm? Brian > > /* attach new log vector to log item */ > -- > 2.22.0 >
On Mon, Aug 05, 2019 at 02:03:01PM -0400, Brian Foster wrote: > On Thu, Aug 01, 2019 at 12:17:43PM +1000, Dave Chinner wrote: > > From: Dave Chinner <dchinner@redhat.com> > > > > The CIL can pin a lot of memory and effectively defines the lower > > free memory boundary of operation for XFS. The way we hang onto > > log item shadow buffers "just in case" effectively doubles the > > memory footprint of the CIL for dubious reasons. > > > > That is, we hang onto the old shadow buffer in case the next time > > we log the item it will fit into the shadow buffer and we won't have > > to allocate a new one. However, we only ever tend to grow dirty > > objects in the CIL through relogging, so once we've allocated a > > larger buffer the old buffer we set as a shadow buffer will never > > get reused as the amount we log never decreases until the item is > > clean. And then for buffer items we free the log item and the shadow > > buffers, anyway. Inode items will hold onto their shadow buffer > > until they are reclaimed - this could double the inode's memory > > footprint for it's lifetime... > > > > Hence we should just free the old log item buffer when we replace it > > with a new shadow buffer rather than storing it for later use. It's > > not useful, get rid of it as early as possible. > > > > Signed-off-by: Dave Chinner <dchinner@redhat.com> > > --- > > fs/xfs/xfs_log_cil.c | 7 +++---- > > 1 file changed, 3 insertions(+), 4 deletions(-) > > > > diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c > > index fa5602d0fd7f..1863a9bdf4a9 100644 > > --- a/fs/xfs/xfs_log_cil.c > > +++ b/fs/xfs/xfs_log_cil.c > > @@ -238,9 +238,7 @@ xfs_cil_prepare_item( > > /* > > * If there is no old LV, this is the first time we've seen the item in > > * this CIL context and so we need to pin it. If we are replacing the > > - * old_lv, then remove the space it accounts for and make it the shadow > > - * buffer for later freeing. In both cases we are now switching to the > > - * shadow buffer, so update the the pointer to it appropriately. > > + * old_lv, then remove the space it accounts for and free it. > > */ > > The comment above xlog_cil_alloc_shadow_bufs() needs a similar update > around how we handle the old buffer when the shadow buffer is used. *nod* > > > if (!old_lv) { > > if (lv->lv_item->li_ops->iop_pin) > > @@ -251,7 +249,8 @@ xfs_cil_prepare_item( > > > > *diff_len -= old_lv->lv_bytes; > > *diff_iovecs -= old_lv->lv_niovecs; > > - lv->lv_item->li_lv_shadow = old_lv; > > + kmem_free(old_lv); > > + lv->lv_item->li_lv_shadow = NULL; > > } > > So IIUC this is the case where we allocated a shadow buffer, the item > was already pinned (so old_lv is still around) but we ended up using the > shadow buffer for this relog. Instead of keeping the old buffer around > as a new shadow, we toss it. That makes sense, but if the objective is > to not leave dangling shadow buffers around as such, what about the case > where we allocated a shadow buffer but didn't end up using it because > old_lv was reusable? It looks like we still keep the shadow buffer > around in that scenario with a similar lifetime as the swapout scenario > this patch removes. Hm? Of the top of my head, we shouldn't allocate a new shadow buffer in that case (see xlog_cil_alloc_shadow_bufs()). i.e. we check up front if the formatted size of the item will fit in the existing buffer, and if it does we do not allocate a new shadow buffer as we just reuse the existing one. SO we should only have to free a shadow buffer when we switch them, not when we overwrite. I'll recheck this, but I'm pretty sure overwrite won't leave a shadow buffer around. Cheers, Dave.
On Tue, Aug 06, 2019 at 09:33:26AM +1000, Dave Chinner wrote: > On Mon, Aug 05, 2019 at 02:03:01PM -0400, Brian Foster wrote: > > On Thu, Aug 01, 2019 at 12:17:43PM +1000, Dave Chinner wrote: > > > From: Dave Chinner <dchinner@redhat.com> > > > > > > The CIL can pin a lot of memory and effectively defines the lower > > > free memory boundary of operation for XFS. The way we hang onto > > > log item shadow buffers "just in case" effectively doubles the > > > memory footprint of the CIL for dubious reasons. > > > > > > That is, we hang onto the old shadow buffer in case the next time > > > we log the item it will fit into the shadow buffer and we won't have > > > to allocate a new one. However, we only ever tend to grow dirty > > > objects in the CIL through relogging, so once we've allocated a > > > larger buffer the old buffer we set as a shadow buffer will never > > > get reused as the amount we log never decreases until the item is > > > clean. And then for buffer items we free the log item and the shadow > > > buffers, anyway. Inode items will hold onto their shadow buffer > > > until they are reclaimed - this could double the inode's memory > > > footprint for it's lifetime... > > > > > > Hence we should just free the old log item buffer when we replace it > > > with a new shadow buffer rather than storing it for later use. It's > > > not useful, get rid of it as early as possible. > > > > > > Signed-off-by: Dave Chinner <dchinner@redhat.com> > > > --- > > > fs/xfs/xfs_log_cil.c | 7 +++---- > > > 1 file changed, 3 insertions(+), 4 deletions(-) > > > > > > diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c > > > index fa5602d0fd7f..1863a9bdf4a9 100644 > > > --- a/fs/xfs/xfs_log_cil.c > > > +++ b/fs/xfs/xfs_log_cil.c > > > @@ -238,9 +238,7 @@ xfs_cil_prepare_item( > > > /* > > > * If there is no old LV, this is the first time we've seen the item in > > > * this CIL context and so we need to pin it. If we are replacing the > > > - * old_lv, then remove the space it accounts for and make it the shadow > > > - * buffer for later freeing. In both cases we are now switching to the > > > - * shadow buffer, so update the the pointer to it appropriately. > > > + * old_lv, then remove the space it accounts for and free it. > > > */ > > > > The comment above xlog_cil_alloc_shadow_bufs() needs a similar update > > around how we handle the old buffer when the shadow buffer is used. > > *nod* > > > > > > if (!old_lv) { > > > if (lv->lv_item->li_ops->iop_pin) > > > @@ -251,7 +249,8 @@ xfs_cil_prepare_item( > > > > > > *diff_len -= old_lv->lv_bytes; > > > *diff_iovecs -= old_lv->lv_niovecs; > > > - lv->lv_item->li_lv_shadow = old_lv; > > > + kmem_free(old_lv); > > > + lv->lv_item->li_lv_shadow = NULL; > > > } > > > > So IIUC this is the case where we allocated a shadow buffer, the item > > was already pinned (so old_lv is still around) but we ended up using the > > shadow buffer for this relog. Instead of keeping the old buffer around > > as a new shadow, we toss it. That makes sense, but if the objective is > > to not leave dangling shadow buffers around as such, what about the case > > where we allocated a shadow buffer but didn't end up using it because > > old_lv was reusable? It looks like we still keep the shadow buffer > > around in that scenario with a similar lifetime as the swapout scenario > > this patch removes. Hm? > > Of the top of my head, we shouldn't allocate a new shadow buffer in > that case (see xlog_cil_alloc_shadow_bufs()). i.e. we check up front > if the formatted size of the item will fit in the existing buffer, > and if it does we do not allocate a new shadow buffer as we just > reuse the existing one. SO we should only have to free a shadow > buffer when we switch them, not when we overwrite. > We have such a check in xlog_cil_insert_format_items(), so we'd reuse ->li_lv if it will suffice even if we have a shadow buffer available. > I'll recheck this, but I'm pretty sure overwrite won't leave a > shadow buffer around. > But before that we have the following logic: static void xlog_cil_alloc_shadow_bufs( ... if (!lip->li_lv_shadow || buf_size > lip->li_lv_shadow->lv_size) { ... lv = kmem_alloc_large(buf_size, KM_SLEEP | KM_NOFS); ... lip->li_lv_shadow = lv; } else { <reuse shadow> } ... } ... which always allocates a shadow buffer if one doesn't exist. We don't look at the currently used (lip->li_lv) buffer at all here. IIUC, that has to do with the TOCTOU race described in the big comment above the function.. hm? Brian > Cheers, > > Dave. > -- > Dave Chinner > david@fromorbit.com
On Tue, Aug 06, 2019 at 08:57:27AM -0400, Brian Foster wrote: > On Tue, Aug 06, 2019 at 09:33:26AM +1000, Dave Chinner wrote: > > I'll recheck this, but I'm pretty sure overwrite won't leave a > > shadow buffer around. > > > > But before that we have the following logic: > > static void > xlog_cil_alloc_shadow_bufs( > ... > > if (!lip->li_lv_shadow || > buf_size > lip->li_lv_shadow->lv_size) { > ... > lv = kmem_alloc_large(buf_size, KM_SLEEP | KM_NOFS); > ... > lip->li_lv_shadow = lv; > } else { > <reuse shadow> > } > ... > } > > ... which always allocates a shadow buffer if one doesn't exist. We > don't look at the currently used (lip->li_lv) buffer at all here. IIUC, > that has to do with the TOCTOU race described in the big comment above > the function.. hm? You might be right there. I haven't had a chance to follow up on this from yesterday yet, so I'll keep this in mind when I look at it again. Cheers, Dave.
diff --git a/fs/xfs/xfs_log_cil.c b/fs/xfs/xfs_log_cil.c index fa5602d0fd7f..1863a9bdf4a9 100644 --- a/fs/xfs/xfs_log_cil.c +++ b/fs/xfs/xfs_log_cil.c @@ -238,9 +238,7 @@ xfs_cil_prepare_item( /* * If there is no old LV, this is the first time we've seen the item in * this CIL context and so we need to pin it. If we are replacing the - * old_lv, then remove the space it accounts for and make it the shadow - * buffer for later freeing. In both cases we are now switching to the - * shadow buffer, so update the the pointer to it appropriately. + * old_lv, then remove the space it accounts for and free it. */ if (!old_lv) { if (lv->lv_item->li_ops->iop_pin) @@ -251,7 +249,8 @@ xfs_cil_prepare_item( *diff_len -= old_lv->lv_bytes; *diff_iovecs -= old_lv->lv_niovecs; - lv->lv_item->li_lv_shadow = old_lv; + kmem_free(old_lv); + lv->lv_item->li_lv_shadow = NULL; } /* attach new log vector to log item */