[PATCHv2,0/6] dm-zoned: improve cache performance

Message ID 20200519081424.103318-1-hare@suse.de (mailing list archive)

Message

Hannes Reinecke May 19, 2020, 8:14 a.m. UTC
Hi all,

here's an update to dm-zoned to separate out cache zones.
In the update to metadata version 2 the regular drive was split
in emulated zones, which were handled just like 'normal' random
write zones.
This causes a performance drop once these emulated zones have
been mapped, as typically the random zones from the zoned drive
will perform noticeably slower than those from the regular drive.
(After all, that was kinda the idea of using a regular disk in
the first place ...)

So in this patchset I've introduced a separate 'cache' zone type,
allowing us to differentiate between emulated and real zones.
With that we can switch the allocation mode to use only cache
zones, and use random zones similar to sequential write zones.
That avoids the performance issue noted above.
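
The allocation split described above can be sketched roughly as follows.
This is a minimal, standalone illustration and not the code from the
patches: the enum, the struct and both helper names are hypothetical, and
the "sequential first, then random" order for data zones is an assumption
based only on the wording of this cover letter.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical zone types; the names are illustrative, not the driver's. */
enum zone_type { ZONE_CACHE, ZONE_RND, ZONE_SEQ };

struct zone {
	enum zone_type type;
	int in_use;
};

/* Claim and return the first free zone of the wanted type, or NULL. */
static struct zone *alloc_zone(struct zone *zones, size_t nr,
			       enum zone_type want)
{
	assert(zones != NULL);
	for (size_t i = 0; i < nr; i++) {
		if (zones[i].type == want && !zones[i].in_use) {
			zones[i].in_use = 1;
			return &zones[i];
		}
	}
	return NULL;
}

/* Buffering writes uses cache zones only; there is no fallback here. */
static struct zone *alloc_buffer_zone(struct zone *zones, size_t nr)
{
	return alloc_zone(zones, nr, ZONE_CACHE);
}

/*
 * Data zones: random zones are treated like sequential write zones.
 * Assumption: sequential zones are tried first, random zones second.
 */
static struct zone *alloc_data_zone(struct zone *zones, size_t nr)
{
	struct zone *z = alloc_zone(zones, nr, ZONE_SEQ);

	return z ? z : alloc_zone(zones, nr, ZONE_RND);
}
```

With one zone of each type, alloc_buffer_zone() hands out the cache zone
once and then returns NULL, while alloc_data_zone() hands out the
sequential zone before the random one.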

I've also found that the sequential write zones perform noticeably
better on writes (which is all we're caching anyway), so I've
added another patch switching the allocation routine to prefer
sequential write zones for reclaim.

This patchset also contains some minor fixes like removing an unused
variable etc.

As usual, comments and reviews are welcome.

Changes to v1:
- Include reviews from Damien
- Introduce allocation flags
- Terminate reclaim on contention
- Rework original patch series

Hannes Reinecke (6):
  dm-zoned: return NULL if dmz_get_zone_for_reclaim() fails to find a
    zone
  dm-zoned: separate random and cache zones
  dm-zoned: reclaim random zones when idle
  dm-zoned: start reclaim with sequential zones
  dm-zoned: terminate reclaim on congestion
  dm-zoned: remove unused variable in dmz_do_reclaim()

 drivers/md/dm-zoned-metadata.c | 140 ++++++++++++++++++++++++++++++-----------
 drivers/md/dm-zoned-reclaim.c  | 109 ++++++++++++++++++++------------
 drivers/md/dm-zoned-target.c   |  19 ++++--
 drivers/md/dm-zoned.h          |  13 +++-
 4 files changed, 196 insertions(+), 85 deletions(-)

Comments

Mike Snitzer May 19, 2020, 5:36 p.m. UTC | #1
On Tue, May 19 2020 at  4:14am -0400,
Hannes Reinecke <hare@suse.de> wrote:

> Hi all,
> 
> here's an update to dm-zoned to separate out cache zones.
> In the update to metadata version 2 the regular drive was split
> in emulated zones, which were handled just like 'normal' random
> write zones.
> This causes a performance drop once these emulated zones have
> been mapped, as typically the random zones from the zoned drive
> will perform noticeably slower than those from the regular drive.
> (After all, that was kinda the idea of using a regular disk in
> the first place ...)
> 
> So in this patchset I've introduced a separate 'cache' zone type,
> allowing us to differentiate between emulated and real zones.
> With that we can switch the allocation mode to use only cache
> zones, and use random zones similar to sequential write zones.
> That avoids the performance issue noted above.
> 
> I've also found that the sequential write zones perform noticeably
> better on writes (which is all we're caching anyway), so I've
> added another patch switching the allocation routine to prefer
> sequential write zones for reclaim.
> 
> This patchset also contains some minor fixes like removing an unused
> variable etc.
> 
> As usual, comments and reviews are welcome.
> 
> Changes to v1:
> - Include reviews from Damien

I'll take a look at this series now, but I'm still waiting for formal
Reviewed-by: or Acked-by: from Damien before I'll stage for 5.8 --
current development window is coming to a close though.

> - Introduce allocation flags
> - Terminate reclaim on contention
> - Rework original patch series
> 
> Hannes Reinecke (6):
>   dm-zoned: return NULL if dmz_get_zone_for_reclaim() fails to find a zone
>   dm-zoned: separate random and cache zones
>   dm-zoned: reclaim random zones when idle
>   dm-zoned: start reclaim with sequential zones
>   dm-zoned: terminate reclaim on congestion
>   dm-zoned: remove unused variable in dmz_do_reclaim()

FYI I folded the last 6/6 patch back into the original commit that
introduced the unused variable (via rebase).

Thanks,
Mike

--
dm-devel mailing list
dm-devel@redhat.com
https://www.redhat.com/mailman/listinfo/dm-devel
Damien Le Moal May 19, 2020, 10:36 p.m. UTC | #2
On 2020/05/19 17:14, Hannes Reinecke wrote:
> Hi all,
> 
> here's an update to dm-zoned to separate out cache zones.
> In the update to metadata version 2 the regular drive was split
> in emulated zones, which were handled just like 'normal' random
> write zones.
> This causes a performance drop once these emulated zones have
> been mapped, as typically the random zones from the zoned drive
> will perform noticeably slower than those from the regular drive.
> (After all, that was kinda the idea of using a regular disk in
> the first place ...)
> 
> So in this patchset I've introduced a separate 'cache' zone type,
> allowing us to differentiate between emulated and real zones.
> With that we can switch the allocation mode to use only cache
> zones, and use random zones similar to sequential write zones.
> That avoids the performance issue noted above.
> 
> I've also found that the sequential write zones perform noticeably
> better on writes (which is all we're caching anyway), so I've
> added another patch switching the allocation routine to prefer
> sequential write zones for reclaim.
> 
> This patchset also contains some minor fixes like removing an unused
> variable etc.
> 
> As usual, comments and reviews are welcome.

I ran this overnight with no problems. Throughput results attached.
Reclaim seems to be a little too aggressive as it triggers very early. But we
can tune that later if really needed: the combination of ext4 writing all over
the place and the faster cache zones on SSD may simply result in a percentage of
free cache zones becoming low very quickly, in which case, reclaim is working
exactly as expected :)
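
A reclaim trigger based on the percentage of free cache zones could look
roughly like the sketch below. It only illustrates the idea discussed
above: the threshold value and both function names are hypothetical and
are not taken from the dm-zoned sources.

```c
#include <assert.h>

/*
 * Hypothetical threshold: reclaim starts once the share of free cache
 * zones drops to this percentage or below.
 */
#define RECLAIM_LOW_FREE_PERCENT 30u

/* Percentage of free zones, rounded down. Assumes small zone counts. */
static unsigned int free_percent(unsigned int nr_free, unsigned int nr_total)
{
	assert(nr_total > 0);
	assert(nr_free <= nr_total);
	return nr_free * 100u / nr_total;
}

/* Non-zero when reclaim should run for the given zone counts. */
static int should_reclaim(unsigned int nr_free, unsigned int nr_total)
{
	return free_percent(nr_free, nr_total) <= RECLAIM_LOW_FREE_PERCENT;
}
```

With few and fast cache zones, a scattered write pattern uses up free
zones quickly, so a check of this kind fires early, which matches the
behaviour observed in the test run.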

Mike,

With the NVMe io_opt fix patch applied, the alignment warning for the target
limits is gone.

For the entire series:

Tested-by: Damien Le Moal <damien.lemoal@wdc.com>


> 
> Changes to v1:
> - Include reviews from Damien
> - Introduce allocation flags
> - Terminate reclaim on contention
> - Rework original patch series
> 
> Hannes Reinecke (6):
>   dm-zoned: return NULL if dmz_get_zone_for_reclaim() fails to find a
>     zone
>   dm-zoned: separate random and cache zones
>   dm-zoned: reclaim random zones when idle
>   dm-zoned: start reclaim with sequential zones
>   dm-zoned: terminate reclaim on congestion
>   dm-zoned: remove unused variable in dmz_do_reclaim()
> 
>  drivers/md/dm-zoned-metadata.c | 140 ++++++++++++++++++++++++++++++-----------
>  drivers/md/dm-zoned-reclaim.c  | 109 ++++++++++++++++++++------------
>  drivers/md/dm-zoned-target.c   |  19 ++++--
>  drivers/md/dm-zoned.h          |  13 +++-
>  4 files changed, 196 insertions(+), 85 deletions(-)
>
Mike Snitzer May 20, 2020, 6:53 p.m. UTC | #3
On Tue, May 19 2020 at  6:36pm -0400,
Damien Le Moal <Damien.LeMoal@wdc.com> wrote:

> On 2020/05/19 17:14, Hannes Reinecke wrote:
> > Hi all,
> > 
> > here's an update to dm-zoned to separate out cache zones.
> > In the update to metadata version 2 the regular drive was split
> > in emulated zones, which were handled just like 'normal' random
> > write zones.
> > This causes a performance drop once these emulated zones have
> > been mapped, as typically the random zones from the zoned drive
> > will perform noticeably slower than those from the regular drive.
> > (After all, that was kinda the idea of using a regular disk in
> > the first place ...)
> > 
> > So in this patchset I've introduced a separate 'cache' zone type,
> > allowing us to differentiate between emulated and real zones.
> > With that we can switch the allocation mode to use only cache
> > zones, and use random zones similar to sequential write zones.
> > That avoids the performance issue noted above.
> > 
> > I've also found that the sequential write zones perform noticeably
> > better on writes (which is all we're caching anyway), so I've
> > added another patch switching the allocation routine to prefer
> > sequential write zones for reclaim.
> > 
> > This patchset also contains some minor fixes like removing an unused
> > variable etc.
> > 
> > As usual, comments and reviews are welcome.
> 
> I ran this overnight with no problems. Throughput results attached.
> Reclaim seems to be a little too aggressive as it triggers very early. But we
> can tune that later if really needed: the combination of ext4 writing all over
> the place and the faster cache zones on SSD may simply result in a percentage of
> free cache zones becoming low very quickly, in which case, reclaim is working
> exactly as expected :)

I've staged this series for 5.8 in linux-next

Just to make sure no regressions due to all the metadata2 changes: Did
you happen to verify all worked as expected without using an extra
drive?

> Mike,
> 
> With the NVMe io_opt fix patch applied, the alignment warning for the target
> limits is gone.

OK

Thanks,
Mike

Damien Le Moal May 20, 2020, 11:59 p.m. UTC | #4
On 2020/05/21 3:53, Mike Snitzer wrote:
> On Tue, May 19 2020 at  6:36pm -0400,
> Damien Le Moal <Damien.LeMoal@wdc.com> wrote:
> 
>> On 2020/05/19 17:14, Hannes Reinecke wrote:
>>> Hi all,
>>>
>>> here's an update to dm-zoned to separate out cache zones.
>>> In the update to metadata version 2 the regular drive was split
>>> in emulated zones, which were handled just like 'normal' random
>>> write zones.
>>> This causes a performance drop once these emulated zones have
>>> been mapped, as typically the random zones from the zoned drive
>>> will perform noticeably slower than those from the regular drive.
>>> (After all, that was kinda the idea of using a regular disk in
>>> the first place ...)
>>>
>>> So in this patchset I've introduced a separate 'cache' zone type,
>>> allowing us to differentiate between emulated and real zones.
>>> With that we can switch the allocation mode to use only cache
>>> zones, and use random zones similar to sequential write zones.
>>> That avoids the performance issue noted above.
>>>
>>> I've also found that the sequential write zones perform noticeably
>>> better on writes (which is all we're caching anyway), so I've
>>> added another patch switching the allocation routine to prefer
>>> sequential write zones for reclaim.
>>>
>>> This patchset also contains some minor fixes like removing an unused
>>> variable etc.
>>>
>>> As usual, comments and reviews are welcome.
>>
>> I ran this overnight with no problems. Throughput results attached.
>> Reclaim seems to be a little too aggressive as it triggers very early. But we
>> can tune that later if really needed: the combination of ext4 writing all over
>> the place and the faster cache zones on SSD may simply result in a percentage of
>> free cache zones becoming low very quickly, in which case, reclaim is working
>> exactly as expected :)
> 
> I've staged this series for 5.8 in linux-next
> 
> Just to make sure no regressions due to all the metadata2 changes: Did
> you happen to verify all worked as expected without using an extra
> drive?

Yes, I did.  Single and dual drive both work fine with v2 metadata.
I will retest the case of a V1 format using V2 code to be extra sure.

> 
>> Mike,
>>
>> With the NVMe io_opt fix patch applied, the alignment warning for the target
>> limits is gone.
> 
> OK
> 
> Thanks,
> Mike
> 
>
Damien Le Moal May 21, 2020, 7:56 a.m. UTC | #5
On 2020/05/21 3:53, Mike Snitzer wrote:
> On Tue, May 19 2020 at  6:36pm -0400,
> Damien Le Moal <Damien.LeMoal@wdc.com> wrote:
> 
>> On 2020/05/19 17:14, Hannes Reinecke wrote:
>>> Hi all,
>>>
>>> here's an update to dm-zoned to separate out cache zones.
>>> In the update to metadata version 2 the regular drive was split
>>> in emulated zones, which were handled just like 'normal' random
>>> write zones.
>>> This causes a performance drop once these emulated zones have
>>> been mapped, as typically the random zones from the zoned drive
>>> will perform noticeably slower than those from the regular drive.
>>> (After all, that was kinda the idea of using a regular disk in
>>> the first place ...)
>>>
>>> So in this patchset I've introduced a separate 'cache' zone type,
>>> allowing us to differentiate between emulated and real zones.
>>> With that we can switch the allocation mode to use only cache
>>> zones, and use random zones similar to sequential write zones.
>>> That avoids the performance issue noted above.
>>>
>>> I've also found that the sequential write zones perform noticeably
>>> better on writes (which is all we're caching anyway), so I've
>>> added another patch switching the allocation routine to prefer
>>> sequential write zones for reclaim.
>>>
>>> This patchset also contains some minor fixes like removing an unused
>>> variable etc.
>>>
>>> As usual, comments and reviews are welcome.
>>
>> I ran this overnight with no problems. Throughput results attached.
>> Reclaim seems to be a little too aggressive as it triggers very early. But we
>> can tune that later if really needed: the combination of ext4 writing all over
>> the place and the faster cache zones on SSD may simply result in a percentage of
>> free cache zones becoming low very quickly, in which case, reclaim is working
>> exactly as expected :)
> 
> I've staged this series for 5.8 in linux-next
> 
> Just to make sure no regressions due to all the metadata2 changes: Did
> you happen to verify all worked as expected without using an extra
> drive?

Mike,

Checked again: no problems detected with backward compatibility for single drive
V1 formatted disks. And as I already signaled, I tested both single and dual
drive setups with V2 metadata. Everything is good for me, not seeing any
problem/regression.

Thanks !

> 
>> Mike,
>>
>> With the NVMe io_opt fix patch applied, the alignment warning for the target
>> limits is gone.
> 
> OK
> 
> Thanks,
> Mike
> 
>