[0/1] block io layer filters api

Message ID: 1598555619-14792-1-git-send-email-sergei.shtepa@veeam.com

Message

Sergei Shtepa Aug. 27, 2020, 7:13 p.m. UTC
Hello everyone! We are requesting your comments and suggestions.

We propose a new kernel API that should be beneficial to the out-of-tree
kernel modules of multiple backup vendors: a block layer filter API.

Functionality:
* Provide a callback to intercept bio requests. The main purpose is to
allow block-level snapshots for devices that do not support them (for
example, non-LVM block devices) and to implement changed block tracking
for faster incremental backups without system reconfiguration or reboot,
but there could be other use cases that we have not thought of (see the
sketch after this list).
* Allow multiple filters to work at the same time. The order in which a
request is intercepted is determined by each filter's altitude.
* When a new block device appears, send a synchronous request to every
registered filter so it can start filtering the device.
* When a block device is permanently deleted or disappears, send a
synchronous request so the filters can stop filtering the device.
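
To make the intent more concrete, below is a minimal sketch of the kind of
callback interface we have in mind. The identifiers (blk_filter_ops,
blk_filter_register and so on) are illustrative only and do not necessarily
match the patch itself; please refer to blk-filter.h in the series for the
actual interface.

#include <linux/genhd.h>
#include <linux/bio.h>

/* Illustrative sketch only -- hypothetical names, not the patch itself. */
struct blk_filter_ops {
	/* Called for every bio submitted to a filtered block device. */
	void (*filter_bio)(struct gendisk *disk, struct bio *bio, void *ctx);
	/* Called synchronously when a new block device appears. */
	int (*add_device)(struct gendisk *disk, void *ctx);
	/* Called synchronously when a block device is removed. */
	void (*del_device)(struct gendisk *disk, void *ctx);
};

/*
 * A filter registers once with a unique altitude; when several filters
 * are registered, each bio passes through them in altitude order.
 */
int blk_filter_register(const struct blk_filter_ops *ops,
			unsigned int altitude, void *ctx);
void blk_filter_unregister(unsigned int altitude);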

Why dm-snap and dm-era are not the solution:
Device mapper must be set up in advance, and backup vendors usually have
very little ability to change the existing setup, or to convince users to
modify it, at the time of software installation.
One of the most common setups is still a plain block device without LVM,
formatted with ext4.
Convincing users to redeploy or reconfigure a machine just to make block
level snapshot/backup software work is a challenging task.

As of now, commit c62b37d96b6e has removed make_request_fn from
struct request_queue, so our out-of-tree module [1] can no longer
hook/replace it to intercept bio requests. The fops pointer in
struct gendisk is declared const and cannot be hooked either.
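
For context, this is roughly how our out-of-tree module used to intercept
bios on kernels that still had make_request_fn; the snippet is simplified
and omits the locking and reference counting a real tracker needs:

#include <linux/blkdev.h>

static make_request_fn *original_make_request_fn;

static blk_qc_t tracker_make_request(struct request_queue *q, struct bio *bio)
{
	/* snapshot copy-on-write and change tracking would happen here */
	return original_make_request_fn(q, bio);
}

static void tracker_attach(struct request_queue *q)
{
	/* Replace the queue's entry point; impossible since c62b37d96b6e. */
	original_make_request_fn = q->make_request_fn;
	q->make_request_fn = tracker_make_request;
}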

We would appreciate your feedback!

[1] https://github.com/veeam/veeamsnap

Sergei Shtepa (1):
  block io layer filters api

 block/Kconfig               |  11 ++
 block/Makefile              |   1 +
 block/blk-core.c            |  11 +-
 block/blk-filter-internal.h |  34 +++++
 block/blk-filter.c          | 288 ++++++++++++++++++++++++++++++++++++
 block/genhd.c               |  24 +++
 include/linux/blk-filter.h  |  41 +++++
 include/linux/genhd.h       |   2 +
 8 files changed, 410 insertions(+), 2 deletions(-)
 create mode 100644 block/blk-filter-internal.h
 create mode 100644 block/blk-filter.c
 create mode 100644 include/linux/blk-filter.h

Comments

Damien Le Moal Aug. 28, 2020, 7:24 a.m. UTC | #1
On 2020/08/28 4:14, Sergei Shtepa wrote:
> Hello everyone! We are requesting your comments and suggestions.
> 
> We propose a new kernel API that should be beneficial to the out-of-tree
> kernel modules of multiple backup vendors: a block layer filter API.
> 
> Functionality:
> * Provide a callback to intercept bio requests. The main purpose is to
> allow block-level snapshots for devices that do not support them (for
> example, non-LVM block devices) and to implement changed block tracking
> for faster incremental backups without system reconfiguration or reboot,
> but there could be other use cases that we have not thought of (see the
> sketch after this list).
> * Allow multiple filters to work at the same time. The order in which a
> request is intercepted is determined by each filter's altitude.
> * When a new block device appears, send a synchronous request to every
> registered filter so it can start filtering the device.
> * When a block device is permanently deleted or disappears, send a
> synchronous request so the filters can stop filtering the device.
> 
> Why dm-snap and dm-era are not the solution:
> Device mapper must be set up in advance, and backup vendors usually have
> very little ability to change the existing setup, or to convince users to
> modify it, at the time of software installation.
> One of the most common setups is still a plain block device without LVM,
> formatted with ext4.
> Convincing users to redeploy or reconfigure a machine just to make block
> level snapshot/backup software work is a challenging task.

And convincing said users to change their kernel is not challenging? In my
experience, that is even harder than trying to get them to change their
configuration.

> As of now, commit c62b37d96b6e has removed make_request_fn from
> struct request_queue, so our out-of-tree module [1] can no longer
> hook/replace it to intercept bio requests. The fops pointer in
> struct gendisk is declared const and cannot be hooked either.
> 
> We would appreciate your feedback!

Upstream your out-of-tree module?

> [1] https://github.com/veeam/veeamsnap
> 
> Sergei Shtepa (1):
>   block io layer filters api
> 
>  block/Kconfig               |  11 ++
>  block/Makefile              |   1 +
>  block/blk-core.c            |  11 +-
>  block/blk-filter-internal.h |  34 +++++
>  block/blk-filter.c          | 288 ++++++++++++++++++++++++++++++++++++
>  block/genhd.c               |  24 +++
>  include/linux/blk-filter.h  |  41 +++++
>  include/linux/genhd.h       |   2 +
>  8 files changed, 410 insertions(+), 2 deletions(-)
>  create mode 100644 block/blk-filter-internal.h
>  create mode 100644 block/blk-filter.c
>  create mode 100644 include/linux/blk-filter.h
>
Jens Axboe Aug. 28, 2020, 1:54 p.m. UTC | #2
On 8/27/20 1:13 PM, Sergei Shtepa wrote:
> Hello everyone! We are requesting your comments and suggestions.
> 
> We propose a new kernel API that should be beneficial to the out-of-tree
> kernel modules of multiple backup vendors: a block layer filter API.

That's just a non-starter, I'm afraid. We generally don't carry
infrastructure in the kernel for out-of-tree modules, that includes
even exports of existing code.

If there's a strong use case *in* the kernel, then such functionality
could be entertained.
Sergei Shtepa Sept. 1, 2020, 1:29 p.m. UTC | #3
On 08/28/2020 16:54, Jens Axboe wrote:
> On 8/27/20 1:13 PM, Sergei Shtepa wrote:
> > Hello everyone! We are requesting your comments and suggestions.
> > 
> > We propose a new kernel API that should be beneficial to the out-of-tree
> > kernel modules of multiple backup vendors: a block layer filter API.
> 
> That's just a non-starter, I'm afraid. We generally don't carry
> infrastructure in the kernel for out-of-tree modules, that includes
> even exports of existing code.
> 
> If there's a strong use case *in* the kernel, then such functionality
> could be entertained.
> 
> -- 
> Jens Axboe
>

To be honest, we've always dreamed of including our out-of-tree module in
the kernel itself - so if you're open to that, that is great news indeed!

We've spent some time before responding to estimate how long it will take
us to update the current source code to meet the kernel's coding requirements.
It looks like we will need 2-4 months of development and QC, and possibly
much more to act on your feedback thereafter.
This is a lot of work, but we are fully committed to it if you are willing
to include this module in the kernel.

However, the same time requirement also presents a big immediate problem -
until this is done, over a hundred thousand Linux users will not be able
to protect their servers running the impacted kernels
(our backup agent is free).
They will be forced to stop using the new version of the kernel
(or take the risk of data loss).

Given that, is there any chance that you could accept the proposed patch now,
to restore users' ability to protect their Linux machines - and buy us time
to deliver a compliant module for inclusion in the kernel?
Jens Axboe Sept. 1, 2020, 2:49 p.m. UTC | #4
On 9/1/20 7:29 AM, Sergei Shtepa wrote:
> On 08/28/2020 16:54, Jens Axboe wrote:
>> On 8/27/20 1:13 PM, Sergei Shtepa wrote:
>>> Hello everyone! We are requesting your comments and suggestions.
>>>
>>> We propose a new kernel API that should be beneficial to the out-of-tree
>>> kernel modules of multiple backup vendors: a block layer filter API.
>>
>> That's just a non-starter, I'm afraid. We generally don't carry
>> infrastructure in the kernel for out-of-tree modules, that includes
>> even exports of existing code.
>>
>> If there's a strong use case *in* the kernel, then such functionality
>> could be entertained.
>>
>> -- 
>> Jens Axboe
>>
> 
> To be honest, we've always dreamed of including our out-of-tree module
> in the kernel itself - so if you're open to that, that is great news
> indeed!

We're always open to that, provided that a promise is made to maintain
the in-kernel version. Sometimes we see submissions that end up being an
over-the-wall kind of dump, and then the vendor only maintains their own
out-of-tree version after the fact and points customers at that one too.
For those cases we don't want the driver, as it just becomes a
maintenance hassle for us.

So if you are serious about this, it's important to set and manage
internal expectations on how the driver is developed and maintained
going forward. The upstream driver *must* be the canonical version, and
if you want and need to have versions for older kernels available, then
it should be based on backports of the current in-tree driver.

> We've spent some time before responding to estimate how long it will
> take us to update the current source code to meet the kernel's coding
> requirements. It looks like we will need 2-4 months of development and
> QC, and possibly much more to act on your feedback thereafter. This is
> a lot of work, but we are fully committed to it if you are willing to
> include this module in the kernel.

Honestly I don't think that is that much work, and I wouldn't personally
be too worried about that being successful. Complications are mostly
around APIs, since an in-tree driver might need to change how you
communicate with the driver. So yes, it'll be some work, but the
important part is how we treat the maintenance of it after all that is
said and done.

> However, the same time requirement also presents a big immediate
> problem - until this is done, over a hundred thousand Linux users
> will not be able to protect their servers running the impacted
> kernels (our backup agent is free). They will be forced to stop using
> the new version of the kernel (or take the risk of data loss).

You have plenty of time to get this done before it becomes a problem.
It's not like the current patches are going to -stable.

> Given that, is there any chance that you could accept the proposed patch
> now, to restore users' ability to protect their Linux machines - and buy
> us time to deliver a compliant module for inclusion in the kernel?

I'm afraid not; we simply cannot allow exposing internals like that for
a use case that isn't currently covered by existing in-tree drivers.
Bart Van Assche Sept. 1, 2020, 3:34 p.m. UTC | #5
On 2020-09-01 06:29, Sergei Shtepa wrote:
> However, the same time requirement also presents a big immediate problem -
> until this is done, over a hundred thousand Linux users will not be able
> to protect their servers running the impacted kernels
> (our backup agent is free).
> They will be forced to stop using the new version of the kernel
> (or take the risk of data loss).

How can backup software work at the block layer level without any cooperation
from higher layers? How is it guaranteed, for example, that backups are
crash-consistent?

What happens if the network connection with the backup server is lost? How
much time is needed to recover after the network connection has been restored?
Do the contents of the entire block device have to be resent to the backup
server?

Thanks,

Bart.
Sergei Shtepa Sept. 1, 2020, 4:35 p.m. UTC | #6
Jens, thank you so much for your prompt response and clarification on your
position.

We’re totally committed to having the upstream driver as the canonical version,
and to maintaining it. 

It was probably a mistake not to build it that way from the start, considering
it has always been our vision, but frankly speaking we were simply too shy to
go this route when we first started 4 years ago, with just a handful of users
and an unclear demand and future. But now, in 2020, it’s a different story.

We’ll get started on the in-tree kernel module right away, and will reconvene
once ready.