Message ID | 20200519081424.103318-1-hare@suse.de (mailing list archive) |
---|---|
Series | dm-zoned: improve cache performance |
On Tue, May 19 2020 at 4:14am -0400,
Hannes Reinecke <hare@suse.de> wrote:

> Hi all,
>
> here's an update to dm-zoned to separate out cache zones.
> In the update to metadata version 2 the regular drive was split
> in emulated zones, which were handled just like 'normal' random
> write zones.
> This causes a performance drop once these emulated zones have
> been mapped, as typically the random zones from the zoned drive
> will perform noticeably slower than those from the regular drive.
> (After all, that was kinda the idea of using a regular disk in
> the first place ...)
>
> So in this patchset I've introduced a separate 'cache' zone type,
> allowing us to differentiate between emulated and real zones.
> With that we can switch the allocation mode to use only cache
> zones, and use random zones similar to sequential write zones.
> That avoids the performance issue noted above.
>
> I've also found that the sequential write zones perform noticeably
> better on writes (which is all we're caching anyway), so I've
> added another patch switching the allocation routine from preferring
> sequential write zones for reclaim.
>
> This patchset also contains some minor fixes like removing an unused
> variable etc.
>
> As usual, comments and reviews are welcome.
>
> Changes to v1:
> - Include reviews from Damien

I'll take a look at this series now, but I'm still waiting for formal
Reviewed-by: or Acked-by: from Damien before I'll stage for 5.8 --
current development window is coming to a close though.

> - Introduce allocation flags
> - Terminate reclaim on contention
> - Rework original patch series
>
> Hannes Reinecke (6):
>   dm-zoned: return NULL if dmz_get_zone_for_reclaim() fails to find a zone
>   dm-zoned: separate random and cache zones
>   dm-zoned: reclaim random zones when idle
>   dm-zoned: start reclaim with sequential zones
>   dm-zoned: terminate reclaim on congestion
>   dm-zoned: remove unused variable in dmz_do_reclaim()

FYI I folded the last 6/6 patch back into the original commit that
introduced the unused variable (via rebase).

Thanks,
Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
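The allocation policy described in the cover letter can be sketched in plain user-space C. This is an illustration only, not code from the series: `enum zone_type`, `struct zone`, `find_free()`, `has_cache_zones()` and `alloc_buffer_zone()` are hypothetical names. The sketch assumes that buffer allocation uses cache zones exclusively whenever any exist, and falls back to random zones only on a setup with no cache zones at all.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical zone types for illustration; not the kernel's definitions. */
enum zone_type { ZONE_CACHE, ZONE_RND, ZONE_SEQ };

struct zone {
	enum zone_type type;
	int mapped;	/* non-zero once the zone holds data */
};

/* Return the index of the first unmapped zone of type `want`, or -1. */
static int find_free(const struct zone *zones, size_t n, enum zone_type want)
{
	for (size_t i = 0; i < n; i++)
		if (zones[i].type == want && !zones[i].mapped)
			return (int)i;
	return -1;
}

/* Return non-zero if the zone set contains any cache zone, mapped or not. */
static int has_cache_zones(const struct zone *zones, size_t n)
{
	for (size_t i = 0; i < n; i++)
		if (zones[i].type == ZONE_CACHE)
			return 1;
	return 0;
}

/*
 * Allocate a zone for buffering random writes.
 * Assumption: with cache zones present, only cache zones are used;
 * random zones serve as buffers only when no cache zone exists.
 * Returns the zone index and marks it mapped, or -1 if none is free.
 */
static int alloc_buffer_zone(struct zone *zones, size_t n)
{
	enum zone_type want;
	int idx;

	assert(zones != NULL || n == 0);
	want = has_cache_zones(zones, n) ? ZONE_CACHE : ZONE_RND;
	idx = find_free(zones, n, want);
	if (idx >= 0)
		zones[idx].mapped = 1;
	return idx;
}
```

With a dual-drive layout, a call such as `alloc_buffer_zone(zones, n)` returns cache zones until they run out and then reports -1 instead of spilling into random zones; under this assumption, freeing cache zones is left to reclaim.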
On 2020/05/19 17:14, Hannes Reinecke wrote:
> Hi all,
>
> here's an update to dm-zoned to separate out cache zones.
> In the update to metadata version 2 the regular drive was split
> in emulated zones, which were handled just like 'normal' random
> write zones.
> This causes a performance drop once these emulated zones have
> been mapped, as typically the random zones from the zoned drive
> will perform noticeably slower than those from the regular drive.
> (After all, that was kinda the idea of using a regular disk in
> the first place ...)
>
> So in this patchset I've introduced a separate 'cache' zone type,
> allowing us to differentiate between emulated and real zones.
> With that we can switch the allocation mode to use only cache
> zones, and use random zones similar to sequential write zones.
> That avoids the performance issue noted above.
>
> I've also found that the sequential write zones perform noticeably
> better on writes (which is all we're caching anyway), so I've
> added another patch switching the allocation routine from preferring
> sequential write zones for reclaim.
>
> This patchset also contains some minor fixes like removing an unused
> variable etc.
>
> As usual, comments and reviews are welcome.

I ran this overnight with no problems. Throughput results attached.
Reclaim seems to be a little too aggressive as it triggers very early. But we
can tune that later if really needed: the combination of ext4 writing all over
the place and the faster cache zones on SSD may simply result in a percentage of
free cache zones becoming low very quickly, in which case, reclaim is working
exactly as expected :)

Mike,

With the NVMe io_opt fix patch applied, the alignment warning for the target
limits is gone.

For the entire series:

Tested-by: Damien Le Moal <damien.lemoal@wdc.com>

>
> Changes to v1:
> - Include reviews from Damien
> - Introduce allocation flags
> - Terminate reclaim on contention
> - Rework original patch series
>
> Hannes Reinecke (6):
>   dm-zoned: return NULL if dmz_get_zone_for_reclaim() fails to find a
>     zone
>   dm-zoned: separate random and cache zones
>   dm-zoned: reclaim random zones when idle
>   dm-zoned: start reclaim with sequential zones
>   dm-zoned: terminate reclaim on congestion
>   dm-zoned: remove unused variable in dmz_do_reclaim()
>
>  drivers/md/dm-zoned-metadata.c | 140 ++++++++++++++++++++++++++++++-----------
>  drivers/md/dm-zoned-reclaim.c  | 109 ++++++++++++++++++++------------
>  drivers/md/dm-zoned-target.c   |  19 ++++--
>  drivers/md/dm-zoned.h          |  13 +++-
>  4 files changed, 196 insertions(+), 85 deletions(-)
>
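The remark that reclaim starts early once the share of free cache zones drops can be pictured as a small threshold check. What follows is a minimal sketch, assuming two percentage thresholds and an idle flag. The function names, the macro names and the 30/50 values are hypothetical and are not taken from the series.

```c
#include <assert.h>

/* Hypothetical thresholds, in percent of free (unmapped) cache zones. */
#define RECLAIM_LOW_FREE_PCT	30
#define RECLAIM_HIGH_FREE_PCT	50

/* Percentage of cache zones that are still free. */
static unsigned int free_pct(unsigned int nr_free, unsigned int nr_total)
{
	assert(nr_free <= nr_total);
	if (nr_total == 0)
		return 100;	/* no cache zones, so nothing to reclaim */
	return (unsigned int)((unsigned long long)nr_free * 100 / nr_total);
}

/* Decide whether reclaim should run now. */
static int should_reclaim(unsigned int nr_free, unsigned int nr_total, int idle)
{
	unsigned int pct = free_pct(nr_free, nr_total);

	if (pct <= RECLAIM_LOW_FREE_PCT)
		return 1;	/* few free zones left: reclaim regardless of load */
	if (idle && pct <= RECLAIM_HIGH_FREE_PCT)
		return 1;	/* opportunistic reclaim while the target is idle */
	return 0;
}
```

Under these assumed thresholds, a workload that maps cache zones quickly crosses the 50% mark soon after start, so `should_reclaim()` fires during any idle period. That matches the "triggers very early" behavior reported above without implying a bug.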
On Tue, May 19 2020 at 6:36pm -0400,
Damien Le Moal <Damien.LeMoal@wdc.com> wrote:

> On 2020/05/19 17:14, Hannes Reinecke wrote:
> > Hi all,
> >
> > here's an update to dm-zoned to separate out cache zones.
> > In the update to metadata version 2 the regular drive was split
> > in emulated zones, which were handled just like 'normal' random
> > write zones.
> > This causes a performance drop once these emulated zones have
> > been mapped, as typically the random zones from the zoned drive
> > will perform noticeably slower than those from the regular drive.
> > (After all, that was kinda the idea of using a regular disk in
> > the first place ...)
> >
> > So in this patchset I've introduced a separate 'cache' zone type,
> > allowing us to differentiate between emulated and real zones.
> > With that we can switch the allocation mode to use only cache
> > zones, and use random zones similar to sequential write zones.
> > That avoids the performance issue noted above.
> >
> > I've also found that the sequential write zones perform noticeably
> > better on writes (which is all we're caching anyway), so I've
> > added another patch switching the allocation routine from preferring
> > sequential write zones for reclaim.
> >
> > This patchset also contains some minor fixes like removing an unused
> > variable etc.
> >
> > As usual, comments and reviews are welcome.
>
> I ran this overnight with no problems. Throughput results attached.
> Reclaim seems to be a little too aggressive as it triggers very early. But we
> can tune that later if really needed: the combination of ext4 writing all over
> the place and the faster cache zones on SSD may simply result in a percentage of
> free cache zones becoming low very quickly, in which case, reclaim is working
> exactly as expected :)

I've staged this series for 5.8 in linux-next

Just to make sure no regressions due to all the metadata2 changes: Did
you happen to verify all worked as expected without using an extra
drive?

> Mike,
>
> With the NVMe io_opt fix patch applied, the alignment warning for the target
> limits is gone.

OK

Thanks,
Mike
On 2020/05/21 3:53, Mike Snitzer wrote:
> On Tue, May 19 2020 at 6:36pm -0400,
> Damien Le Moal <Damien.LeMoal@wdc.com> wrote:
>
>> On 2020/05/19 17:14, Hannes Reinecke wrote:
>>> Hi all,
>>>
>>> here's an update to dm-zoned to separate out cache zones.
>>> In the update to metadata version 2 the regular drive was split
>>> in emulated zones, which were handled just like 'normal' random
>>> write zones.
>>> This causes a performance drop once these emulated zones have
>>> been mapped, as typically the random zones from the zoned drive
>>> will perform noticeably slower than those from the regular drive.
>>> (After all, that was kinda the idea of using a regular disk in
>>> the first place ...)
>>>
>>> So in this patchset I've introduced a separate 'cache' zone type,
>>> allowing us to differentiate between emulated and real zones.
>>> With that we can switch the allocation mode to use only cache
>>> zones, and use random zones similar to sequential write zones.
>>> That avoids the performance issue noted above.
>>>
>>> I've also found that the sequential write zones perform noticeably
>>> better on writes (which is all we're caching anyway), so I've
>>> added another patch switching the allocation routine from preferring
>>> sequential write zones for reclaim.
>>>
>>> This patchset also contains some minor fixes like removing an unused
>>> variable etc.
>>>
>>> As usual, comments and reviews are welcome.
>>
>> I ran this overnight with no problems. Throughput results attached.
>> Reclaim seems to be a little too aggressive as it triggers very early. But we
>> can tune that later if really needed: the combination of ext4 writing all over
>> the place and the faster cache zones on SSD may simply result in a percentage of
>> free cache zones becoming low very quickly, in which case, reclaim is working
>> exactly as expected :)
>
> I've staged this series for 5.8 in linux-next
>
> Just to make sure no regressions due to all the metadata2 changes: Did
> you happen to verify all worked as expected without using an extra
> drive?

Yes, I did. Single and dual drive both work fine with v2 metadata. I will retest
the case of a V1 format using V2 code to be extra sure.

>
>> Mike,
>>
>> With the NVMe io_opt fix patch applied, the alignment warning for the target
>> limits is gone.
>
> OK
>
> Thanks,
> Mike
>
>
On 2020/05/21 3:53, Mike Snitzer wrote:
> On Tue, May 19 2020 at 6:36pm -0400,
> Damien Le Moal <Damien.LeMoal@wdc.com> wrote:
>
>> On 2020/05/19 17:14, Hannes Reinecke wrote:
>>> Hi all,
>>>
>>> here's an update to dm-zoned to separate out cache zones.
>>> In the update to metadata version 2 the regular drive was split
>>> in emulated zones, which were handled just like 'normal' random
>>> write zones.
>>> This causes a performance drop once these emulated zones have
>>> been mapped, as typically the random zones from the zoned drive
>>> will perform noticeably slower than those from the regular drive.
>>> (After all, that was kinda the idea of using a regular disk in
>>> the first place ...)
>>>
>>> So in this patchset I've introduced a separate 'cache' zone type,
>>> allowing us to differentiate between emulated and real zones.
>>> With that we can switch the allocation mode to use only cache
>>> zones, and use random zones similar to sequential write zones.
>>> That avoids the performance issue noted above.
>>>
>>> I've also found that the sequential write zones perform noticeably
>>> better on writes (which is all we're caching anyway), so I've
>>> added another patch switching the allocation routine from preferring
>>> sequential write zones for reclaim.
>>>
>>> This patchset also contains some minor fixes like removing an unused
>>> variable etc.
>>>
>>> As usual, comments and reviews are welcome.
>>
>> I ran this overnight with no problems. Throughput results attached.
>> Reclaim seems to be a little too aggressive as it triggers very early. But we
>> can tune that later if really needed: the combination of ext4 writing all over
>> the place and the faster cache zones on SSD may simply result in a percentage of
>> free cache zones becoming low very quickly, in which case, reclaim is working
>> exactly as expected :)
>
> I've staged this series for 5.8 in linux-next
>
> Just to make sure no regressions due to all the metadata2 changes: Did
> you happen to verify all worked as expected without using an extra
> drive?

Mike,

Checked again: no problems detected with backward compatibility for single drive
V1 formatted disks. And as I already signaled, I tested both single and dual
drive setups with V2 metadata. Everything is good for me, not seeing any
problem/regression.

Thanks!

>
>> Mike,
>>
>> With the NVMe io_opt fix patch applied, the alignment warning for the target
>> limits is gone.
>
> OK
>
> Thanks,
> Mike
>
>