mbox series

[v2,0/3] Fix dm-crypt zoned block device support

Message ID 20210417023323.852530-1-damien.lemoal@wdc.com (mailing list archive)
Headers show
Series Fix dm-crypt zoned block device support | expand

Message

Damien Le Moal April 17, 2021, 2:33 a.m. UTC
Mike,

Zone append BIOs (REQ_OP_ZONE_APPEND) always specify the start sector
of the zone to be written instead of the actual location sector to
write. The write location is determined by the device and returned to
the host upon completion of the operation.

This interface, while simple and efficient for writing into sequential
zones of a zoned block device, is incompatible with the use of sector
values to calculate a cypher block IV. All data written in a zone is
encrypted using an IV calculated from the first sectors of the zone,
but read operation will specify any sector within the zone, resulting
in an IV mismatch between encryption and decryption. Reads fail in that
case.

Using a single sector value (e.g. the zone start sector) for all read
and writes into a zone can solve this problem, but at the cost of
weakening the cypher chosen by the user. Emulating zone append using
regular writes would be another potential solution, but it is complex
and would add a lot of overhead.

Instead, to solve this problem, explicitly disable support for zone
append operations in dm-crypt if the target was setup using a cypher IV
mode using sector values. The null and random IV modes can still be used
with zone append operations. This lack of support for zone append is
exposed to the user by setting the dm-crypt target queue limit
max_zone_append_sectors to 0. This change is done in patches 1 and 2.

Patch 3 fixes zonefs to fall back to using regular write when
max_zone_append_sectors is 0 (Note: I can take this patch through the
zonefs tree. But since I have nothing else for an eventual rc8 and next
cycle, you can take it too. No chance of conflict).

Overall, these changes do not break user space:
1) There is no interface allowing a user to use zone append write
without a file system. So applications using directly a raw dm-crypt
device will continue working using regular write operations.
2) btrfs zoned support was added in 5.12. Anybody trying btrfs-zoned on
top of dm-crypt would have faced the read failures already. So there
are no existing deployments to preserve. Same for zonefs.

For file systems, using zone append with encryption will need to be
supported within the file system (e.g. fscrypt). In this case, cypher IV
calculation can rely for instance on file block offsets as these are
known before a zone append operation write these blocks to disk at
unknown locations.

Reviews and comments are very much welcome.

Changes from v1:
* Addressed Johannes comments by renaming the CRYPT_IV_NO_SECTORS flag
  to CRYPT_IV_ZONE_APPEND to avoid a double negation with !test_bit().
  This also clarifies that the flag is used solely to check zone append
  support.
* Removed btrfs patch (former patch 3) as David is taking that patch
  through the btrfs tree misc-next branch.
* Added reviewed-by, Fixes and Cc tags.

Damien Le Moal (3):
  dm: Introduce zone append support control
  dm crypt: Fix zoned block device support
  zonefs: fix synchronous write to sequential zone files

 drivers/md/dm-crypt.c         | 49 ++++++++++++++++++++++++++++-------
 drivers/md/dm-table.c         | 41 +++++++++++++++++++++++++++++
 fs/zonefs/super.c             | 16 +++++++++---
 fs/zonefs/zonefs.h            |  2 ++
 include/linux/device-mapper.h |  6 +++++
 5 files changed, 101 insertions(+), 13 deletions(-)

Comments

Mikulas Patocka April 19, 2021, 12:52 p.m. UTC | #1
On Sat, 17 Apr 2021, Damien Le Moal wrote:

> Mike,
> 
> Zone append BIOs (REQ_OP_ZONE_APPEND) always specify the start sector
> of the zone to be written instead of the actual location sector to
> write. The write location is determined by the device and returned to
> the host upon completion of the operation.

I'd like to ask what's the reason for this semantics? Why can't users of 
the zoned device supply real sector numbers?

> This interface, while simple and efficient for writing into sequential
> zones of a zoned block device, is incompatible with the use of sector
> values to calculate a cypher block IV. All data written in a zone is
> encrypted using an IV calculated from the first sectors of the zone,
> but read operation will specify any sector within the zone, resulting
> in an IV mismatch between encryption and decryption. Reads fail in that
> case.

I would say that it is incompatible with all dm targets - even the linear 
target is changing the sector number and so it may redirect the write 
outside of the range specified in dm-table and cause corruption.

Instead of complicating device mapper with imperfect support, I would just 
disable REQ_OP_ZONE_APPEND on device mapper at all.

Mikulas
Damien Le Moal April 19, 2021, 1:02 p.m. UTC | #2
On 2021/04/19 21:52, Mikulas Patocka wrote:
> 
> 
> On Sat, 17 Apr 2021, Damien Le Moal wrote:
> 
>> Mike,
>>
>> Zone append BIOs (REQ_OP_ZONE_APPEND) always specify the start sector
>> of the zone to be written instead of the actual location sector to
>> write. The write location is determined by the device and returned to
>> the host upon completion of the operation.
> 
> I'd like to ask what's the reason for this semantics? Why can't users of 
> the zoned device supply real sector numbers?

They can, if they use regular write commands :)

Zone append was designed to address sequential write ordering difficulties on
the host. Unlike regular writes which must be delivered to the device in
sequential order, zone append commands can be sent to the device in any order.
The device will process the write at the WP position when the command is
executed and return the first sector written. This command makes it easy on the
host in multi-queue environment and avoids the need for serializing commands &
locking zones for writing. So very efficient performance.

>> This interface, while simple and efficient for writing into sequential
>> zones of a zoned block device, is incompatible with the use of sector
>> values to calculate a cypher block IV. All data written in a zone is
>> encrypted using an IV calculated from the first sectors of the zone,
>> but read operation will specify any sector within the zone, resulting
>> in an IV mismatch between encryption and decryption. Reads fail in that
>> case.
> 
> I would say that it is incompatible with all dm targets - even the linear 
> target is changing the sector number and so it may redirect the write 
> outside of the range specified in dm-table and cause corruption.

DM remapping of BIO sectors is zone compatible because target entries must be
zone aligned. In the case of zone append, the BIO sector always point to the
start sector of the target zone. DM sector remapping will remap that to another
zone start as all zones are the same size. No issue here. We extensively use
dm-linear for various test environment to reduce the size of the device tested
(to speed up tests). I am confident there are no problems there.

> Instead of complicating device mapper with imperfect support, I would just 
> disable REQ_OP_ZONE_APPEND on device mapper at all.

That was my initial approach, but for dm-crypt only since other targets that
support zoned devices are fine. However, this breaks zoned block device
requirement that zone append be supported so that users are presented with a
uniform interface for different devices. So while simple to do, disabling zone
append is far from ideal.

> 
> Mikulas
> 
>
Mikulas Patocka April 19, 2021, 1:55 p.m. UTC | #3
On Mon, 19 Apr 2021, Damien Le Moal wrote:

> > I would say that it is incompatible with all dm targets - even the linear 
> > target is changing the sector number and so it may redirect the write 
> > outside of the range specified in dm-table and cause corruption.
> 
> DM remapping of BIO sectors is zone compatible because target entries must be
> zone aligned. In the case of zone append, the BIO sector always point to the
> start sector of the target zone. DM sector remapping will remap that to another
> zone start as all zones are the same size. No issue here. We extensively use
> dm-linear for various test environment to reduce the size of the device tested
> (to speed up tests). I am confident there are no problems there.
> 
> > Instead of complicating device mapper with imperfect support, I would just 
> > disable REQ_OP_ZONE_APPEND on device mapper at all.
> 
> That was my initial approach, but for dm-crypt only since other targets that
> support zoned devices are fine. However, this breaks zoned block device
> requirement that zone append be supported so that users are presented with a
> uniform interface for different devices. So while simple to do, disabling zone
> append is far from ideal.

So, we could enable it for the linear target and disable for all other 
targets?

I talked with Milan about it and he doesn't want to add more bloat to the 
crypt target. I agree with him.

Mikulas
Milan Broz April 19, 2021, 2:31 p.m. UTC | #4
On 19/04/2021 15:55, Mikulas Patocka wrote:
> 
> 
> On Mon, 19 Apr 2021, Damien Le Moal wrote:
> 
>>> I would say that it is incompatible with all dm targets - even the linear 
>>> target is changing the sector number and so it may redirect the write 
>>> outside of the range specified in dm-table and cause corruption.
>>
>> DM remapping of BIO sectors is zone compatible because target entries must be
>> zone aligned. In the case of zone append, the BIO sector always point to the
>> start sector of the target zone. DM sector remapping will remap that to another
>> zone start as all zones are the same size. No issue here. We extensively use
>> dm-linear for various test environment to reduce the size of the device tested
>> (to speed up tests). I am confident there are no problems there.
>>
>>> Instead of complicating device mapper with imperfect support, I would just 
>>> disable REQ_OP_ZONE_APPEND on device mapper at all.
>>
>> That was my initial approach, but for dm-crypt only since other targets that
>> support zoned devices are fine. However, this breaks zoned block device
>> requirement that zone append be supported so that users are presented with a
>> uniform interface for different devices. So while simple to do, disabling zone
>> append is far from ideal.
> 
> So, we could enable it for the linear target and disable for all other 
> targets?
> 
> I talked with Milan about it and he doesn't want to add more bloat to the 
> crypt target. I agree with him.

This is all fine even for dm-crypt IF the tweaking is unique for the sector position
(it can be something just derived from the sector offset in principle).

For FDE, we must never allow writing sectors to different positions with the same
tweak (IV) and key - there are real attacks based on this issue.

So zones can do any recalculation and reshuffling it wants if sector tweak
in dm-crypt is unique.
(Another solution would be to use different keys for different areas, but that
is not possible with dm-crypt or FDE in general, but fs encryption can do that.)

If you want dm-crypt to support zones properly, there is a need for emulation
of the real sector offset - because that is what IV expects now.

And I think such emulation should be in DM core, not in dm-crypt itself, because other
targets can need the same functionality (I guess that dm-integrity journal
has a problem with that already, Mikulas will know more).

For online reencryption we also use multiple targets in the table that dynamically moves
(linear combined with dm-crypt), so dm-crypt must support all commands as
dm-linear to make this work.

I hope I understand the problem correctly; all I want is to so avoid patching
the wrong place (dmcrypt crypto) because that problem will appear elsewhere later.
Also for security it would be nice to not add exceptions to encryption
code - it is always recipe for disaster.

Milan