Message ID | 20250106142439.216598-1-dlemoal@kernel.org |
---|---|
Series | New zoned loop block device driver |
On 1/6/25 7:24 AM, Damien Le Moal wrote:
> The first patch implements the new "zloop" zoned block device driver
> which allows creating zoned block devices using one regular file per
> zone as backing storage.

Couldn't we do this with ublk and keep most of this stuff in userspace
rather than need a whole new loop driver?
On Mon, Jan 06, 2025 at 07:54:05AM -0700, Jens Axboe wrote:
> On 1/6/25 7:24 AM, Damien Le Moal wrote:
> > The first patch implements the new "zloop" zoned block device driver
> > which allows creating zoned block devices using one regular file per
> > zone as backing storage.
>
> Couldn't we do this with ublk and keep most of this stuff in userspace
> rather than need a whole new loop driver?

I'm pretty sure we could do that. But dealing with ublk is a complete
pain, especially when setting it up and tearing it down all the time for
testing, and would require a lot more code, so why? As-is I can directly
add this to xfstests for the much needed large file system testing that
currently doesn't work for zoned file systems, which is why I started
writing this code (before Damien gladly took over and polished it).
On 1/6/25 8:21 AM, Christoph Hellwig wrote:
> On Mon, Jan 06, 2025 at 07:54:05AM -0700, Jens Axboe wrote:
>> On 1/6/25 7:24 AM, Damien Le Moal wrote:
>>> The first patch implements the new "zloop" zoned block device driver
>>> which allows creating zoned block devices using one regular file per
>>> zone as backing storage.
>>
>> Couldn't we do this with ublk and keep most of this stuff in userspace
>> rather than need a whole new loop driver?
>
> I'm pretty sure we could do that. But dealing with ublk is a complete
> pain, especially when setting it up and tearing it down all the time for
> testing, and would require a lot more code, so why? As-is I can directly
> add this to xfstests for the much needed large file system testing that
> currently doesn't work for zoned file systems, which is why I started
> writing this code (before Damien gladly took over and polished it).

A lot more code where? Not in the kernel. And now we're stuck with a new
driver for a relatively niche use case. Seems like a bad tradeoff to me.
On Mon, Jan 06, 2025 at 08:24:06AM -0700, Jens Axboe wrote:
> A lot more code where?

Very good and relevant question. Some random new repo that no one knows
about? Not very helpful. xfstests itself? Maybe, but that would just
mean other users have to fork it.

> Not in the kernel. And now we're stuck with a new
> driver for a relatively niche use case. Seems like a bad tradeoff to me.

Seriously, if you can't trust Damien and me to maintain a little driver
using completely standard interfaces without any magic, you'll have
different problems keeping the block layer alive :)
On 1/6/25 8:32 AM, Christoph Hellwig wrote:
> On Mon, Jan 06, 2025 at 08:24:06AM -0700, Jens Axboe wrote:
>> A lot more code where?
>
> Very good and relevant question. Some random new repo that no one knows
> about? Not very helpful. xfstests itself? Maybe, but that would just
> mean other users have to fork it.

Why would they have to fork it? Just put it in xfstests itself. These
are very weak reasons, imho.

>> Not in the kernel. And now we're stuck with a new
>> driver for a relatively niche use case. Seems like a bad tradeoff to me.
>
> Seriously, if you can't trust Damien and me to maintain a little driver
> using completely standard interfaces without any magic, you'll have
> different problems keeping the block layer alive :)

Asking "why do we need this driver, when we can accomplish the same with
existing stuff" is a valid question, and I'm a bit puzzled why we can't
just have a reasonable discussion about this. If that simple question
can't be asked, and answered suitably, then something is really amiss.
On Mon, Jan 06, 2025 at 08:38:26AM -0700, Jens Axboe wrote:
> On 1/6/25 8:32 AM, Christoph Hellwig wrote:
> > On Mon, Jan 06, 2025 at 08:24:06AM -0700, Jens Axboe wrote:
> >> A lot more code where?
> >
> > Very good and relevant question. Some random new repo that no one knows
> > about? Not very helpful. xfstests itself? Maybe, but that would just
> > mean other users have to fork it.
>
> Why would they have to fork it? Just put it in xfstests itself. These
> are very weak reasons, imho.

Because that way other users can't use it. Damien has already mentioned
some. And someone would actually have to write that hypothetical thing.

> >> Not in the kernel. And now we're stuck with a new
> >> driver for a relatively niche use case. Seems like a bad tradeoff to me.
> >
> > Seriously, if you can't trust Damien and me to maintain a little driver
> > using completely standard interfaces without any magic, you'll have
> > different problems keeping the block layer alive :)
>
> Asking "why do we need this driver, when we can accomplish the same with
> existing stuff"

There is no "existing stuff".

> is a valid question, and I'm a bit puzzled why we can't
> just have a reasonable discussion about this.

I think this is a valid and reasonable discussion. But maybe we're
just not on the same page. I don't know anything existing and usable,
maybe I've just not found it?
On 1/6/25 8:44 AM, Christoph Hellwig wrote:
> On Mon, Jan 06, 2025 at 08:38:26AM -0700, Jens Axboe wrote:
>> On 1/6/25 8:32 AM, Christoph Hellwig wrote:
>>> On Mon, Jan 06, 2025 at 08:24:06AM -0700, Jens Axboe wrote:
>>>> A lot more code where?
>>>
>>> Very good and relevant question. Some random new repo that no one knows
>>> about? Not very helpful. xfstests itself? Maybe, but that would just
>>> mean other users have to fork it.
>>
>> Why would they have to fork it? Just put it in xfstests itself. These
>> are very weak reasons, imho.
>
> Because that way other users can't use it. Damien has already mentioned
> some.

If it's actually useful to others, then it can become a standalone
thing. Really nothing new there.

> And someone would actually have to write that hypothetical thing.

That is certainly true.

>>>> Not in the kernel. And now we're stuck with a new
>>>> driver for a relatively niche use case. Seems like a bad tradeoff to me.
>>>
>>> Seriously, if you can't trust Damien and me to maintain a little driver
>>> using completely standard interfaces without any magic, you'll have
>>> different problems keeping the block layer alive :)
>>
>> Asking "why do we need this driver, when we can accomplish the same with
>> existing stuff"
>
> There is no "existing stuff"

Right, that's true on both sides now. Yes this kernel driver has been
written, but in practice there is no existing stuff.

>> is a valid question, and I'm a bit puzzled why we can't
>> just have a reasonable discussion about this.
>
> I think this is a valid and reasonable discussion. But maybe we're
> just not on the same page. I don't know anything existing and usable,
> maybe I've just not found it?

Not that I'm aware of, it was just a suggestion/thought that we could
utilize an existing driver for this, rather than have a separate one.
Yes the proposed one is pretty simple and not large, and maintaining it
isn't a big deal, but it's still a new driver and hence why I was asking
"why can't we just use ublk for this". That also keeps the code mostly
in userspace which is nice, rather than needing kernel changes for new
features, changes, etc.
On Mon, Jan 06, 2025 at 10:38:24AM -0700, Jens Axboe wrote:
> > just not on the same page. I don't know anything existing and usable,
> > maybe I've just not found it?
>
> Not that I'm aware of, it was just a suggestion/thought that we could
> utilize an existing driver for this, rather than have a separate one.
> Yes the proposed one is pretty simple and not large, and maintaining it
> isn't a big deal, but it's still a new driver and hence why I was asking
> "why can't we just use ublk for this". That also keeps the code mostly
> in userspace which is nice, rather than needing kernel changes for new
> features, changes, etc.

Well, the reason to do a kernel driver rather than a ublk back end
boils down to a few things:

- writing highly concurrent code is actually a lot simpler in the kernel
  than in userspace because we have the right primitives for it
- these primitives tend to actually be a lot faster than those available
  in glibc as well
- the double context switch into the kernel and back for a ublk device
  backed by a file system will actually show up for some xfstests that
  do a lot of synchronous ops
- having an in-tree kernel driver that you just configure / unconfigure
  from the shell is a lot easier to use than a daemon that needs to
  be running. Especially from xfstests or other test suites that do
  a lot of per-test setup and teardown
- the kernel actually has really nice infrastructure for block drivers.
  I'm pretty sure doing this in userspace would actually be more
  code, while being harder to use and lower performance.

So we could go both ways, but the kernel version was pretty obviously
the preferred one to me. Maybe that's a little biased by doing a lot
of kernel work, and having run into a lot of problems and performance
issues with the SCSI target user backend lately.
On 1/7/25 02:38, Jens Axboe wrote:
>> I think this is a valid and reasonable discussion. But maybe we're
>> just not on the same page. I don't know anything existing and usable,
>> maybe I've just not found it?
>
> Not that I'm aware of, it was just a suggestion/thought that we could
> utilize an existing driver for this, rather than have a separate one.
> Yes the proposed one is pretty simple and not large, and maintaining it
> isn't a big deal, but it's still a new driver and hence why I was asking
> "why can't we just use ublk for this". That also keeps the code mostly
> in userspace which is nice, rather than needing kernel changes for new
> features, changes, etc.

I did consider ublk at some point but did not switch to it because a
ublk backend driver to do the same as zloop in userspace would need a
lot more code to be efficient. And even then, as Christoph already
mentioned, we would still have performance suffer from the context
switches. But that performance point was not the primary stopper
though as this driver is not intended for production use but rather to
be the simplest possible setup that can be used in CI systems to test
zoned file systems (among other zone related things).

A kernel-based implementation is simpler and the configuration
interface literally needs only a single echo bash command to add or
remove devices. This allows minimal VM configurations with no
dependencies on user tools/libraries to run these zoned devices, which
is what we wanted.

I completely agree about the user-space vs kernel tradeoff you
mentioned. I did consider it but the code simplicity and ease of use
in practice won for us and I chose to stick with the kernel driver
approach.

Note that if you are OK with this, I need to send a V2 to correct the
Kconfig description which currently shows an invalid configuration
command example.
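The single-echo interface Damien describes would look roughly like the
sketch below. The control node and parameter names here are assumptions
for illustration only; the authoritative syntax is in the series'
documentation and in the Kconfig text that the mentioned V2 corrects.

```sh
# Hypothetical sketch of the configuration flow being described; the
# /dev/zloop-control node and the parameter names are assumptions, not
# the driver's verified syntax.
mkdir -p /var/local/zloop/0        # backing directory, one file per zone

# Creating a device is a single echo...
echo "add id=0,capacity_mb=16384,zone_size_mb=256,base_dir=/var/local/zloop" \
        > /dev/zloop-control

ls -l /dev/zloop0                  # the emulated zoned block device

# ...and so is removing it.
echo "remove id=0" > /dev/zloop-control
```

Nothing beyond a shell is needed in the test VM, which is the "no
dependencies on user tools/libraries" point made above.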
On 1/6/25 6:08 PM, Damien Le Moal wrote:
> On 1/7/25 02:38, Jens Axboe wrote:
>>> I think this is a valid and reasonable discussion. But maybe we're
>>> just not on the same page. I don't know anything existing and usable,
>>> maybe I've just not found it?
>>
>> Not that I'm aware of, it was just a suggestion/thought that we could
>> utilize an existing driver for this, rather than have a separate one.
>> Yes the proposed one is pretty simple and not large, and maintaining it
>> isn't a big deal, but it's still a new driver and hence why I was asking
>> "why can't we just use ublk for this". That also keeps the code mostly
>> in userspace which is nice, rather than needing kernel changes for new
>> features, changes, etc.
>
> I did consider ublk at some point but did not switch to it because a
> ublk backend driver to do the same as zloop in userspace would need a
> lot more code to be efficient. And even then, as Christoph already
> mentioned, we would still have performance suffer from the context
> switches. But that performance point was not the primary stopper

I don't buy this context switch argument at all. Why would it mean more
sleeping? There's absolutely zero reason why a ublk solution wouldn't be
at least as performant as the kernel one. And why would it need "a lot
more code to be efficient"?

> though as this driver is not intended for production use but rather to
> be the simplest possible setup that can be used in CI systems to test
> zoned file systems (among other zone related things).

Right, that too.

> A kernel-based implementation is simpler and the configuration
> interface literally needs only a single echo bash command to add or
> remove devices. This allows minimal VM configurations with no
> dependencies on user tools/libraries to run these zoned devices, which
> is what we wanted.
>
> I completely agree about the user-space vs kernel tradeoff you
> mentioned. I did consider it but the code simplicity and ease of use
> in practice won for us and I chose to stick with the kernel driver
> approach.
>
> Note that if you are OK with this, I need to send a V2 to correct the
> Kconfig description which currently shows an invalid configuration
> command example.

Sure, I'm not totally against it, even if I think the arguments are
very weak, and in some places also just wrong. It's not like it's a
huge driver.
On 1/6/25 11:05 AM, Christoph Hellwig wrote:
> On Mon, Jan 06, 2025 at 10:38:24AM -0700, Jens Axboe wrote:
>>> just not on the same page. I don't know anything existing and usable,
>>> maybe I've just not found it?
>>
>> Not that I'm aware of, it was just a suggestion/thought that we could
>> utilize an existing driver for this, rather than have a separate one.
>> Yes the proposed one is pretty simple and not large, and maintaining it
>> isn't a big deal, but it's still a new driver and hence why I was asking
>> "why can't we just use ublk for this". That also keeps the code mostly
>> in userspace which is nice, rather than needing kernel changes for new
>> features, changes, etc.
>
> Well, the reason to do a kernel driver rather than a ublk back end
> boils down to a few things:
>
> - writing highly concurrent code is actually a lot simpler in the kernel
>   than in userspace because we have the right primitives for it
> - these primitives tend to actually be a lot faster than those available
>   in glibc as well

That's certainly true.

> - the double context switch into the kernel and back for a ublk device
>   backed by a file system will actually show up for some xfstests that
>   do a lot of synchronous ops

Like I replied to Damien, that's mostly a bogus argument. If you're
doing sync stuff, you can do that with a single system call. If you're
building up depth, then it doesn't matter.

> - having an in-tree kernel driver that you just configure / unconfigure
>   from the shell is a lot easier to use than a daemon that needs to
>   be running. Especially from xfstests or other test suites that do
>   a lot of per-test setup and teardown

This is always true when it's a new piece of userspace, but not
necessarily true once the use case has been established.

> - the kernel actually has really nice infrastructure for block drivers.
>   I'm pretty sure doing this in userspace would actually be more
>   code, while being harder to use and lower performance.

That's very handwavy...

> So we could go both ways, but the kernel version was pretty obviously
> the preferred one to me. Maybe that's a little biased by doing a lot
> of kernel work, and having run into a lot of problems and performance
> issues with the SCSI target user backend lately.

Sure, that is understandable.
On Mon, Jan 06, 2025 at 04:21:18PM +0100, Christoph Hellwig wrote:
> On Mon, Jan 06, 2025 at 07:54:05AM -0700, Jens Axboe wrote:
> > On 1/6/25 7:24 AM, Damien Le Moal wrote:
> > > The first patch implements the new "zloop" zoned block device driver
> > > which allows creating zoned block devices using one regular file per
> > > zone as backing storage.
> >
> > Couldn't we do this with ublk and keep most of this stuff in userspace
> > rather than need a whole new loop driver?
>
> I'm pretty sure we could do that. But dealing with ublk is a complete
> pain, especially when setting it up and tearing it down all the time for
> testing, and would require a lot more code, so why? As-is I can directly

You can link with libublk or add it to rublk, which already supports a
zoned ramdisk, then install rublk from crates.io directly to set up the
test.

Forking a new loop driver could add much more pain since you may have to
address everything we have fixed for loop; please look at 'git log loop'.

Thanks,
Ming
On Mon, Jan 06, 2025 at 04:44:33PM +0100, Christoph Hellwig wrote:
> On Mon, Jan 06, 2025 at 08:38:26AM -0700, Jens Axboe wrote:
> > On 1/6/25 8:32 AM, Christoph Hellwig wrote:
> > > On Mon, Jan 06, 2025 at 08:24:06AM -0700, Jens Axboe wrote:
> > >> A lot more code where?
> > >
> > > Very good and relevant question. Some random new repo that no one knows
> > > about? Not very helpful. xfstests itself? Maybe, but that would just
> > > mean other users have to fork it.
> >
> > Why would they have to fork it? Just put it in xfstests itself. These
> > are very weak reasons, imho.
>
> Because that way other users can't use it. Damien has already mentioned
> some.

- cargo install rublk
- rublk add zoned

Then you can set up xfstests over the ublk/zoned disk; also, Fedora 42
is starting to ship rublk.

Thanks,
Ming
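To make the flow Ming suggests concrete, here is a minimal sketch. The
device names and the xfstests variables are assumptions about a typical
local.config and may not match what rublk actually creates on a given
system.

```sh
# Assumed flow: rublk creates /dev/ublkb0, /dev/ublkb1, etc.; check what
# 'rublk add zoned' actually reports on your system.
cargo install rublk
rublk add zoned        # test device
rublk add zoned        # scratch device

# Illustrative xfstests configuration pointing at the ublk devices.
cat >> local.config <<'EOF'
TEST_DEV=/dev/ublkb0
SCRATCH_DEV=/dev/ublkb1
EOF
```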
On 1/8/25 11:29 AM, Ming Lei wrote:
> On Mon, Jan 06, 2025 at 04:21:18PM +0100, Christoph Hellwig wrote:
>> On Mon, Jan 06, 2025 at 07:54:05AM -0700, Jens Axboe wrote:
>>> On 1/6/25 7:24 AM, Damien Le Moal wrote:
>>>> The first patch implements the new "zloop" zoned block device driver
>>>> which allows creating zoned block devices using one regular file per
>>>> zone as backing storage.
>>>
>>> Couldn't we do this with ublk and keep most of this stuff in userspace
>>> rather than need a whole new loop driver?
>>
>> I'm pretty sure we could do that. But dealing with ublk is a complete
>> pain, especially when setting it up and tearing it down all the time for
>> testing, and would require a lot more code, so why? As-is I can directly
>
> You can link with libublk or add it to rublk, which already supports a
> zoned ramdisk, then install rublk from crates.io directly to set up the
> test.

Thanks, but memory backing is not what we want. We need to emulate large
drives for FS tests (to catch problems such as overflows), and for that,
a file-based storage backing is better.

> Forking a new loop driver could add much more pain since you may have to
> address everything we have fixed for loop; please look at 'git log loop'.

Which is why Christoph started with the kernel driver approach in the
first place: to avoid such issues and difficulties.
On 1/8/25 6:08 AM, Jens Axboe wrote:
>> A kernel-based implementation is simpler and the configuration
>> interface literally needs only a single echo bash command to add or
>> remove devices. This allows minimal VM configurations with no
>> dependencies on user tools/libraries to run these zoned devices, which
>> is what we wanted.
>>
>> I completely agree about the user-space vs kernel tradeoff you
>> mentioned. I did consider it but the code simplicity and ease of use
>> in practice won for us and I chose to stick with the kernel driver
>> approach.
>>
>> Note that if you are OK with this, I need to send a V2 to correct the
>> Kconfig description which currently shows an invalid configuration
>> command example.
>
> Sure, I'm not totally against it, even if I think the arguments are
> very weak, and in some places also just wrong. It's not like it's a
> huge driver.

I am not going to try contesting that our arguments are somewhat weak.
Yes, if we spend enough time on it, we could eventually get something
workable with ublk. But with that said, when you spend your days
developing and testing stuff for zoned storage, having a super easy to
use emulation setup for VMs without any userspace dependencies does a
world of good for productivity. That is a strong argument for those
involved, I think.

So may I send a V2 to get it queued up?
On Tue, Jan 07, 2025 at 02:08:20PM -0700, Jens Axboe wrote:
> > ublk backend driver to do the same as zloop in userspace would need a
> > lot more code to be efficient. And even then, as Christoph already
> > mentioned, we would still have performance suffer from the context
> > switches. But that performance point was not the primary stopper
>
> I don't buy this context switch argument at all.

The zloop write goes straight from kblockd into the filesystem. ublk
switches to userspace, which goes back to the kernel when the file
system writes. Similar double context switch on the completion side.

> Why would it mean more
> sleeping?

?

> There's absolutely zero reason why a ublk solution wouldn't be at
> least as performant as the kernel one.

Well, prove it. From having worked on similar schemes in the past I
highly doubt it.

> And why would it need "a lot more code to be efficient"?

Because we don't have all the nice locking and even infrastructure in
userspace that we have in the kernel.
On Wed, Jan 08, 2025 at 10:29:57AM +0800, Ming Lei wrote:
> You can link with libublk or add it to rublk, which already supports a
> zoned ramdisk, then install rublk from crates.io directly to set up the
> test.

Ramdisks are nicely supported in null_blk already. And Rust crates are a
massive pain as they tend to not be packaged nicely. Exactly what I do
not want to depend on.

> Forking a new loop driver could add much more pain since you may have to
> address everything we have fixed for loop; please look at 'git log loop'.

The biggest problem with the loop driver is the historic baggage in the
user interface. That's sidestepped by this driver (and even for
conventional devices a loop-ng doing the same might be nice, but that's
a separate story).
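For comparison, the zoned-ramdisk setup that null_blk already covers is
also a one-liner; the parameter values below are illustrative only.

```sh
# Zoned, memory-backed emulation using null_blk module parameters;
# zone_size is in MiB, values chosen purely for illustration.
modprobe null_blk nr_devices=1 memory_backed=1 zoned=1 zone_size=256
ls -l /dev/nullb0
```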
On Tue, Jan 07, 2025 at 02:10:45PM -0700, Jens Axboe wrote:
> > - the double context switch into the kernel and back for a ublk device
> >   backed by a file system will actually show up for some xfstests that
> >   do a lot of synchronous ops
>
> Like I replied to Damien, that's mostly a bogus argument. If you're
> doing sync stuff, you can do that with a single system call. If you're
> building up depth, then it doesn't matter.

How do I do a single system call to retrieve the request from the kernel
and execute it on the file system after examining it?
On Wed, Jan 8, 2025 at 1:07 PM Damien Le Moal <dlemoal@kernel.org> wrote:
>
> On 1/8/25 11:29 AM, Ming Lei wrote:
> > On Mon, Jan 06, 2025 at 04:21:18PM +0100, Christoph Hellwig wrote:
> >> On Mon, Jan 06, 2025 at 07:54:05AM -0700, Jens Axboe wrote:
> >>> On 1/6/25 7:24 AM, Damien Le Moal wrote:
> >>>> The first patch implements the new "zloop" zoned block device driver
> >>>> which allows creating zoned block devices using one regular file per
> >>>> zone as backing storage.
> >>>
> >>> Couldn't we do this with ublk and keep most of this stuff in userspace
> >>> rather than need a whole new loop driver?
> >>
> >> I'm pretty sure we could do that. But dealing with ublk is a complete
> >> pain, especially when setting it up and tearing it down all the time for
> >> testing, and would require a lot more code, so why? As-is I can directly
> >
> > You can link with libublk or add it to rublk, which already supports a
> > zoned ramdisk, then install rublk from crates.io directly to set up the
> > test.
>
> Thanks, but memory backing is not what we want. We need to emulate large
> drives for FS tests (to catch problems such as overflows), and for that,
> a file-based storage backing is better.

It is backed by virtual memory, which can be big enough because of swap, and
it is also easy to extend to file-backed support since zloop doesn't store
zone metadata, which is similar to the ram-backed zoned device, actually.

Unlike loop, zloop can only serve test purposes, because each zone's
metadata is always reset when adding a new device.

Thanks,
Ming
On Wed, Jan 08, 2025 at 04:13:01PM +0800, Ming Lei wrote:
> It is backed by virtual memory, which can be big enough because of swap, and

Good luck getting halfway decent performance out of swapping for a 50TB
data set. Or even a partially filled one, which really is the use case
here, so it might only be a TB or so.

> it is also easy to extend to file-backed support since zloop doesn't store
> zone metadata, which is similar to the ram-backed zoned device, actually.

No, zloop does store the write pointer in the file size of each zone.
That's sorta the whole point because it enables things like mount and
even power fail testing.

All of this is mentioned explicitly in the commit logs, documentation and
code comments, so claiming something else here feels a bit uninformed.
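A small sketch of the point being made here: because each zone is one
regular file, the write pointer falls out of the backing file's size.
The directory layout and zone file names below are assumptions for
illustration, not the driver's actual format.

```sh
# Sketch of the "write pointer == backing file size" idea; the layout
# /var/local/zloop/<dev-id>/<zone-number> is an assumption.
zone_size=$((256 * 1024 * 1024))       # assumed zone size: 256 MiB
zone_file=/var/local/zloop/0/000042    # assumed per-zone backing file

wp=$(stat -c %s "$zone_file")          # bytes already written to this zone
if [ "$wp" -eq 0 ]; then
        echo "zone 42: empty"
elif [ "$wp" -lt "$zone_size" ]; then
        echo "zone 42: open, write pointer at byte offset $wp"
else
        echo "zone 42: full"
fi
```

Since that state lives in the backing files themselves, it survives
removing and re-adding a device over the same directory, which is
presumably what makes the mount and power-fail testing mentioned above
possible.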
On Wed, Jan 08, 2025 at 10:09:12AM +0100, Christoph Hellwig wrote:
> On Wed, Jan 08, 2025 at 04:13:01PM +0800, Ming Lei wrote:
> > It is backed by virtual memory, which can be big enough because of swap, and
>
> Good luck getting halfway decent performance out of swapping for a 50TB
> data set. Or even a partially filled one, which really is the use case
> here, so it might only be a TB or so.
>
> > it is also easy to extend to file-backed support since zloop doesn't store
> > zone metadata, which is similar to the ram-backed zoned device, actually.
>
> No, zloop does store the write pointer in the file size of each zone.
> That's sorta the whole point because it enables things like mount and
> even power fail testing.
>
> All of this is mentioned explicitly in the commit logs, documentation and
> code comments, so claiming something else here feels a bit uninformed.

OK, that looks like one smart idea. It is easy to extend rublk/zoned in
this way with io_uring IO emulation. :-)

Thanks,
Ming
On Wed, Jan 08, 2025 at 10:47:57AM +0800, Ming Lei wrote:
> > > Why would they have to fork it? Just put it in xfstests itself. These
> > > are very weak reasons, imho.
> >
> > Because that way other users can't use it. Damien has already mentioned
> > some.
>
> - cargo install rublk
> - rublk add zoned
>
> Then you can set up xfstests over the ublk/zoned disk; also, Fedora 42
> is starting to ship rublk.

Um, I build xfstests on Debian Stable; other people build xfstests on
enterprise Linux distributions (e.g., RHEL). It'd be really nice if we
don't add a Rust dependency to xfstests anytime soon. Or at least, have
a way of skipping tests that have a Rust dependency if xfstests is built
on a system that doesn't have Rust, and not add a Rust dependency to
existing tests, so that we don't suddenly lose a lot of test coverage
all in the name of adding Rust....

- Ted