mbox series

[rdma-next,00/13] Add RDMA inline crypto support

Message ID cover.1673873422.git.leon@kernel.org (mailing list archive)
Headers show
Series Add RDMA inline crypto support | expand

Message

Leon Romanovsky Jan. 16, 2023, 1:05 p.m. UTC
From Israel,

The purpose of this patchset is to add support for inline
encryption/decryption of the data at storage protocols like nvmf over
RDMA (at a similar way like integrity is used via unique mkey).

This patchset adds support for plaintext keys. The patches were tested
on BF-3 HW with fscrypt tool to test this feature, which showed reduce
in CPU utilization when comparing at 64k or more IO size. The CPU utilization
was improved by more than 50% comparing to the SW only solution at this case.

How to configure fscrypt to enable plaintext keys:
 # mkfs.ext4 -O encrypt /dev/nvme0n1
 # mount /dev/nvme0n1 /mnt/crypto -o inlinecrypt
 # head -c 64 /dev/urandom > /tmp/master_key
 # fscryptctl add_key /mnt/crypto/ < /tmp/master_key
 # mkdir /mnt/crypto/test1
 # fscryptctl set_policy 152c41b2ea39fa3d90ea06448456e7fb /mnt/crypto/test1
   ** “152c41b2ea39fa3d90ea06448456e7fb” is the output of the
      “fscryptctl add_key” command.
 # echo foo > /mnt/crypto/test1/foo

Notes:
 - At plaintext mode only, the user set a master key and the fscrypt
   driver derived from it the DEK and the key identifier.
 - 152c41b2ea39fa3d90ea06448456e7fb is the derived key identifier
 - Only on the first IO, nvme-rdma gets a callback to load the derived DEK. 

There is no special configuration to support crypto at nvme modules.

Thanks

Israel Rukshin (13):
  net/mlx5: Introduce crypto IFC bits and structures
  net/mlx5: Introduce crypto capabilities macro
  RDMA: Split kernel-only create QP flags from uverbs create QP flags
  RDMA/core: Add cryptographic device capabilities
  RDMA/core: Add DEK management API
  RDMA/core: Introduce MR type for crypto operations
  RDMA/core: Add support for creating crypto enabled QPs
  RDMA/mlx5: Add cryptographic device capabilities
  RDMA/mlx5: Add DEK management API
  RDMA/mlx5: Add AES-XTS crypto support
  nvme: Introduce a local variable
  nvme: Add crypto profile at nvme controller
  nvme-rdma: Add inline encryption support

 drivers/infiniband/core/device.c              |   3 +
 drivers/infiniband/core/mr_pool.c             |   2 +
 drivers/infiniband/core/uverbs_std_types_qp.c |  12 +-
 drivers/infiniband/core/verbs.c               |  91 ++++++-
 drivers/infiniband/hw/bnxt_re/ib_verbs.c      |   2 +-
 drivers/infiniband/hw/mlx4/mlx4_ib.h          |   4 +-
 drivers/infiniband/hw/mlx4/qp.c               |   4 +-
 drivers/infiniband/hw/mlx5/Makefile           |   1 +
 drivers/infiniband/hw/mlx5/crypto.c           | 115 +++++++++
 drivers/infiniband/hw/mlx5/crypto.h           |  46 ++++
 drivers/infiniband/hw/mlx5/main.c             |   5 +
 drivers/infiniband/hw/mlx5/mlx5_ib.h          |   5 +-
 drivers/infiniband/hw/mlx5/mr.c               |  33 +++
 drivers/infiniband/hw/mlx5/qp.c               |  15 +-
 drivers/infiniband/hw/mlx5/wr.c               | 164 +++++++++++-
 drivers/infiniband/hw/vmw_pvrdma/pvrdma_qp.c  |   2 +-
 drivers/net/ethernet/mellanox/mlx5/core/fw.c  |   6 +
 .../net/ethernet/mellanox/mlx5/core/main.c    |   1 +
 drivers/nvme/host/core.c                      |  10 +-
 drivers/nvme/host/nvme.h                      |   4 +
 drivers/nvme/host/rdma.c                      | 236 +++++++++++++++++-
 include/linux/mlx5/device.h                   |   4 +
 include/linux/mlx5/mlx5_ifc.h                 |  36 ++-
 include/rdma/crypto.h                         | 118 +++++++++
 include/rdma/ib_verbs.h                       |  46 +++-
 include/trace/events/rdma_core.h              |  33 +++
 26 files changed, 954 insertions(+), 44 deletions(-)
 create mode 100644 drivers/infiniband/hw/mlx5/crypto.c
 create mode 100644 drivers/infiniband/hw/mlx5/crypto.h
 create mode 100644 include/rdma/crypto.h

Comments

Eric Biggers Jan. 18, 2023, 6:47 a.m. UTC | #1
Hi Leon,

On Mon, Jan 16, 2023 at 03:05:47PM +0200, Leon Romanovsky wrote:
> >From Israel,
> 
> The purpose of this patchset is to add support for inline
> encryption/decryption of the data at storage protocols like nvmf over
> RDMA (at a similar way like integrity is used via unique mkey).
> 
> This patchset adds support for plaintext keys. The patches were tested
> on BF-3 HW with fscrypt tool to test this feature, which showed reduce
> in CPU utilization when comparing at 64k or more IO size. The CPU utilization
> was improved by more than 50% comparing to the SW only solution at this case.
> 
> How to configure fscrypt to enable plaintext keys:
>  # mkfs.ext4 -O encrypt /dev/nvme0n1
>  # mount /dev/nvme0n1 /mnt/crypto -o inlinecrypt
>  # head -c 64 /dev/urandom > /tmp/master_key
>  # fscryptctl add_key /mnt/crypto/ < /tmp/master_key
>  # mkdir /mnt/crypto/test1
>  # fscryptctl set_policy 152c41b2ea39fa3d90ea06448456e7fb /mnt/crypto/test1
>    ** “152c41b2ea39fa3d90ea06448456e7fb” is the output of the
>       “fscryptctl add_key” command.
>  # echo foo > /mnt/crypto/test1/foo
> 
> Notes:
>  - At plaintext mode only, the user set a master key and the fscrypt
>    driver derived from it the DEK and the key identifier.
>  - 152c41b2ea39fa3d90ea06448456e7fb is the derived key identifier
>  - Only on the first IO, nvme-rdma gets a callback to load the derived DEK. 
> 
> There is no special configuration to support crypto at nvme modules.
> 
> Thanks

Very interesting work!  Can you Cc me on future versions?

I'm glad to see that this hardware allows all 16 IV bytes to be specified.

Does it also handle programming and evicting keys efficiently?

Also, just checking: have you tested that the ciphertext that this inline
encryption hardware produces is correct?  That's always super important to test.
There are xfstests that test for it, e.g. generic/582.  Another way to test it
is to just manually test whether encrypted files that were created when the
filesystem was mounted with '-o inlinecrypt' show the same contents when the
filesystem is *not* mounted with '-o inlinecrypt' (or vice versa).

- Eric
Chaitanya Kulkarni Jan. 18, 2023, 7:14 a.m. UTC | #2
Eric,

>> Notes:
>>   - At plaintext mode only, the user set a master key and the fscrypt
>>     driver derived from it the DEK and the key identifier.
>>   - 152c41b2ea39fa3d90ea06448456e7fb is the derived key identifier
>>   - Only on the first IO, nvme-rdma gets a callback to load the derived DEK.
>>
>> There is no special configuration to support crypto at nvme modules.
>>
>> Thanks
> 
> Very interesting work!  Can you Cc me on future versions?
> 
> I'm glad to see that this hardware allows all 16 IV bytes to be specified.
> 
> Does it also handle programming and evicting keys efficiently?
> 
> Also, just checking: have you tested that the ciphertext that this inline
> encryption hardware produces is correct?  That's always super important to test.
> There are xfstests that test for it, e.g. generic/582.  Another way to test it
> is to just manually test whether encrypted files that were created when the
> filesystem was mounted with '-o inlinecrypt' show the same contents when the
> filesystem is *not* mounted with '-o inlinecrypt' (or vice versa).
> 
> - Eric
> 

I'm wondering which are the xfstests that needs to run in order
to establish the correctness/stability apart from generic/582
this work ?

-ck
Eric Biggers Jan. 18, 2023, 7:17 a.m. UTC | #3
On Wed, Jan 18, 2023 at 07:14:30AM +0000, Chaitanya Kulkarni wrote:
> Eric,
> 
> >> Notes:
> >>   - At plaintext mode only, the user set a master key and the fscrypt
> >>     driver derived from it the DEK and the key identifier.
> >>   - 152c41b2ea39fa3d90ea06448456e7fb is the derived key identifier
> >>   - Only on the first IO, nvme-rdma gets a callback to load the derived DEK.
> >>
> >> There is no special configuration to support crypto at nvme modules.
> >>
> >> Thanks
> > 
> > Very interesting work!  Can you Cc me on future versions?
> > 
> > I'm glad to see that this hardware allows all 16 IV bytes to be specified.
> > 
> > Does it also handle programming and evicting keys efficiently?
> > 
> > Also, just checking: have you tested that the ciphertext that this inline
> > encryption hardware produces is correct?  That's always super important to test.
> > There are xfstests that test for it, e.g. generic/582.  Another way to test it
> > is to just manually test whether encrypted files that were created when the
> > filesystem was mounted with '-o inlinecrypt' show the same contents when the
> > filesystem is *not* mounted with '-o inlinecrypt' (or vice versa).
> > 
> > - Eric
> > 
> 
> I'm wondering which are the xfstests that needs to run in order
> to establish the correctness/stability apart from generic/582
> this work ?
> 

See https://docs.kernel.org/filesystems/fscrypt.html#tests.

- Eric
Christoph Hellwig Jan. 18, 2023, 7:36 a.m. UTC | #4
On Mon, Jan 16, 2023 at 03:05:47PM +0200, Leon Romanovsky wrote:
> >From Israel,
> 
> The purpose of this patchset is to add support for inline
> encryption/decryption of the data at storage protocols like nvmf over
> RDMA (at a similar way like integrity is used via unique mkey).
> 
> This patchset adds support for plaintext keys. The patches were tested
> on BF-3 HW with fscrypt tool to test this feature, which showed reduce
> in CPU utilization when comparing at 64k or more IO size. The CPU utilization
> was improved by more than 50% comparing to the SW only solution at this case.

One thing that needs to be solved before we can look into this is the
interaction with protection information (or integrity data in Linux
terms).  Currently inline encryption and protection information are
mutally incompatible.
Leon Romanovsky Jan. 18, 2023, 8:22 a.m. UTC | #5
On Tue, Jan 17, 2023 at 10:47:44PM -0800, Eric Biggers wrote:
> Hi Leon,
> 
> On Mon, Jan 16, 2023 at 03:05:47PM +0200, Leon Romanovsky wrote:
> > >From Israel,
> > 
> > The purpose of this patchset is to add support for inline
> > encryption/decryption of the data at storage protocols like nvmf over
> > RDMA (at a similar way like integrity is used via unique mkey).
> > 
> > This patchset adds support for plaintext keys. The patches were tested
> > on BF-3 HW with fscrypt tool to test this feature, which showed reduce
> > in CPU utilization when comparing at 64k or more IO size. The CPU utilization
> > was improved by more than 50% comparing to the SW only solution at this case.
> > 
> > How to configure fscrypt to enable plaintext keys:
> >  # mkfs.ext4 -O encrypt /dev/nvme0n1
> >  # mount /dev/nvme0n1 /mnt/crypto -o inlinecrypt
> >  # head -c 64 /dev/urandom > /tmp/master_key
> >  # fscryptctl add_key /mnt/crypto/ < /tmp/master_key
> >  # mkdir /mnt/crypto/test1
> >  # fscryptctl set_policy 152c41b2ea39fa3d90ea06448456e7fb /mnt/crypto/test1
> >    ** “152c41b2ea39fa3d90ea06448456e7fb” is the output of the
> >       “fscryptctl add_key” command.
> >  # echo foo > /mnt/crypto/test1/foo
> > 
> > Notes:
> >  - At plaintext mode only, the user set a master key and the fscrypt
> >    driver derived from it the DEK and the key identifier.
> >  - 152c41b2ea39fa3d90ea06448456e7fb is the derived key identifier
> >  - Only on the first IO, nvme-rdma gets a callback to load the derived DEK. 
> > 
> > There is no special configuration to support crypto at nvme modules.
> > 
> > Thanks
> 
> Very interesting work!  Can you Cc me on future versions?

Sure

> 
> I'm glad to see that this hardware allows all 16 IV bytes to be specified.
> 
> Does it also handle programming and evicting keys efficiently?

"efficiently" is a very subjective term. We are using FW command
interface to program keys and this interface can do hundreds/thousands
commands per-second.

Thanks
Israel Rukshin Jan. 18, 2023, 8:58 a.m. UTC | #6
Hi Eric,

On 1/18/2023 8:47 AM, Eric Biggers wrote:
> Hi Leon,
>
> On Mon, Jan 16, 2023 at 03:05:47PM +0200, Leon Romanovsky wrote:
>> >From Israel,
>>
>> The purpose of this patchset is to add support for inline
>> encryption/decryption of the data at storage protocols like nvmf over
>> RDMA (at a similar way like integrity is used via unique mkey).
>>
>> This patchset adds support for plaintext keys. The patches were tested
>> on BF-3 HW with fscrypt tool to test this feature, which showed reduce
>> in CPU utilization when comparing at 64k or more IO size. The CPU utilization
>> was improved by more than 50% comparing to the SW only solution at this case.
>>
>> How to configure fscrypt to enable plaintext keys:
>>   # mkfs.ext4 -O encrypt /dev/nvme0n1
>>   # mount /dev/nvme0n1 /mnt/crypto -o inlinecrypt
>>   # head -c 64 /dev/urandom > /tmp/master_key
>>   # fscryptctl add_key /mnt/crypto/ < /tmp/master_key
>>   # mkdir /mnt/crypto/test1
>>   # fscryptctl set_policy 152c41b2ea39fa3d90ea06448456e7fb /mnt/crypto/test1
>>     ** “152c41b2ea39fa3d90ea06448456e7fb” is the output of the
>>        “fscryptctl add_key” command.
>>   # echo foo > /mnt/crypto/test1/foo
>>
>> Notes:
>>   - At plaintext mode only, the user set a master key and the fscrypt
>>     driver derived from it the DEK and the key identifier.
>>   - 152c41b2ea39fa3d90ea06448456e7fb is the derived key identifier
>>   - Only on the first IO, nvme-rdma gets a callback to load the derived DEK.
>>
>> There is no special configuration to support crypto at nvme modules.
>>
>> Thanks
> Very interesting work!  Can you Cc me on future versions?
>
> I'm glad to see that this hardware allows all 16 IV bytes to be specified.
>
> Does it also handle programming and evicting keys efficiently?
>
> Also, just checking: have you tested that the ciphertext that this inline
> encryption hardware produces is correct?  That's always super important to test.
> There are xfstests that test for it, e.g. generic/582.  Another way to test it
> is to just manually test whether encrypted files that were created when the
> filesystem was mounted with '-o inlinecrypt' show the same contents when the
> filesystem is *not* mounted with '-o inlinecrypt' (or vice versa).
sure, I ran the manual test of comparing the encrypted files content 
with and without the '-o inlinecrypt' at the mount command.
>
> - Eric
  - Israel
Max Gurtovoy Jan. 18, 2023, 2:20 p.m. UTC | #7
On 18/01/2023 9:36, Christoph Hellwig wrote:
> On Mon, Jan 16, 2023 at 03:05:47PM +0200, Leon Romanovsky wrote:
>> >From Israel,
>>
>> The purpose of this patchset is to add support for inline
>> encryption/decryption of the data at storage protocols like nvmf over
>> RDMA (at a similar way like integrity is used via unique mkey).
>>
>> This patchset adds support for plaintext keys. The patches were tested
>> on BF-3 HW with fscrypt tool to test this feature, which showed reduce
>> in CPU utilization when comparing at 64k or more IO size. The CPU utilization
>> was improved by more than 50% comparing to the SW only solution at this case.
> One thing that needs to be solved before we can look into this is the
> interaction with protection information (or integrity data in Linux
> terms).  Currently inline encryption and protection information are
> mutally incompatible.

Well, for sure this should be solved. Not sure that it should be done 
before this series.

This patch set doesn't break/change existing behavior of PI and 
Encryption offloads so the above can be done incrementally.
It's already big enough series and 2 subsystems are involved.

Today, if one would like to run both features using a capable device 
then the PI will be offloaded (no SW fallback) and the Encryption will 
be using SW fallback.
There should be a serious instrumentation in the block layer to make 
both operations offloaded.
We'll start looking into it. Any suggestions and designs are welcomed.

We also prepared patches to extend the blk lavel encryption (in addition 
to fscrypt exist today).
Sagi Grimberg Jan. 23, 2023, 11:27 a.m. UTC | #8
>  From Israel,
> 
> The purpose of this patchset is to add support for inline
> encryption/decryption of the data at storage protocols like nvmf over
> RDMA (at a similar way like integrity is used via unique mkey).
> 
> This patchset adds support for plaintext keys. The patches were tested
> on BF-3 HW with fscrypt tool to test this feature, which showed reduce
> in CPU utilization when comparing at 64k or more IO size. The CPU utilization
> was improved by more than 50% comparing to the SW only solution at this case.
> 
> How to configure fscrypt to enable plaintext keys:
>   # mkfs.ext4 -O encrypt /dev/nvme0n1
>   # mount /dev/nvme0n1 /mnt/crypto -o inlinecrypt
>   # head -c 64 /dev/urandom > /tmp/master_key
>   # fscryptctl add_key /mnt/crypto/ < /tmp/master_key
>   # mkdir /mnt/crypto/test1
>   # fscryptctl set_policy 152c41b2ea39fa3d90ea06448456e7fb /mnt/crypto/test1
>     ** “152c41b2ea39fa3d90ea06448456e7fb” is the output of the
>        “fscryptctl add_key” command.
>   # echo foo > /mnt/crypto/test1/foo
> 
> Notes:
>   - At plaintext mode only, the user set a master key and the fscrypt
>     driver derived from it the DEK and the key identifier.
>   - 152c41b2ea39fa3d90ea06448456e7fb is the derived key identifier
>   - Only on the first IO, nvme-rdma gets a callback to load the derived DEK.
> 
> There is no special configuration to support crypto at nvme modules.

Hey, this looks sane to me in a very first glance.

Few high level questions:
- what happens with multipathing? when if not all devices are
capable. SW fallback?
- Does the crypt stuff stay intact when bio is requeued?

I'm assuming you tested this with multipathing? This is not very
useful if it is incompatible with it.
Israel Rukshin Jan. 23, 2023, 12:57 p.m. UTC | #9
Hi Sagi,

On 1/23/2023 1:27 PM, Sagi Grimberg wrote:
>
>>  From Israel,
>>
>> The purpose of this patchset is to add support for inline
>> encryption/decryption of the data at storage protocols like nvmf over
>> RDMA (at a similar way like integrity is used via unique mkey).
>>
>> This patchset adds support for plaintext keys. The patches were tested
>> on BF-3 HW with fscrypt tool to test this feature, which showed reduce
>> in CPU utilization when comparing at 64k or more IO size. The CPU 
>> utilization
>> was improved by more than 50% comparing to the SW only solution at 
>> this case.
>>
>> How to configure fscrypt to enable plaintext keys:
>>   # mkfs.ext4 -O encrypt /dev/nvme0n1
>>   # mount /dev/nvme0n1 /mnt/crypto -o inlinecrypt
>>   # head -c 64 /dev/urandom > /tmp/master_key
>>   # fscryptctl add_key /mnt/crypto/ < /tmp/master_key
>>   # mkdir /mnt/crypto/test1
>>   # fscryptctl set_policy 152c41b2ea39fa3d90ea06448456e7fb 
>> /mnt/crypto/test1
>>     ** “152c41b2ea39fa3d90ea06448456e7fb” is the output of the
>>        “fscryptctl add_key” command.
>>   # echo foo > /mnt/crypto/test1/foo
>>
>> Notes:
>>   - At plaintext mode only, the user set a master key and the fscrypt
>>     driver derived from it the DEK and the key identifier.
>>   - 152c41b2ea39fa3d90ea06448456e7fb is the derived key identifier
>>   - Only on the first IO, nvme-rdma gets a callback to load the 
>> derived DEK.
>>
>> There is no special configuration to support crypto at nvme modules.
>
> Hey, this looks sane to me in a very first glance.
>
> Few high level questions:
> - what happens with multipathing? when if not all devices are
> capable. SW fallback?
SW fallback happens every time the device doesn't support the specific 
crypto request (which include data-unit-size, mode and dun_bytes).
So with multipathing, one path uses the HW crypto offload and the other 
one uses the SW fallback.
> - Does the crypt stuff stay intact when bio is requeued?
Yes, the crypto ctx is copied when cloning the bio.
>
> I'm assuming you tested this with multipathing? This is not very
> useful if it is incompatible with it.
Yes, sure.
You can see the call to  blk_crypto_reprogram_all_keys(), which is 
called when the controller was reconnected after port toggling.

- Israel
Christoph Hellwig Jan. 30, 2023, 12:35 p.m. UTC | #10
On Wed, Jan 18, 2023 at 04:20:27PM +0200, Max Gurtovoy wrote:
> Today, if one would like to run both features using a capable device then 
> the PI will be offloaded (no SW fallback) and the Encryption will be using 
> SW fallback.

It's not just running both - it's offering both as currently registering
these fatures is mutally incompatible.
Christoph Hellwig Jan. 30, 2023, 12:36 p.m. UTC | #11
On Mon, Jan 23, 2023 at 02:57:18PM +0200, Israel Rukshin wrote:
>> - what happens with multipathing? when if not all devices are
>> capable. SW fallback?
> SW fallback happens every time the device doesn't support the specific 
> crypto request (which include data-unit-size, mode and dun_bytes).
> So with multipathing, one path uses the HW crypto offload and the other one 
> uses the SW fallback.

That's a big no-go.  The blk-crypto-fallback code is just a toy example
and not actually safe to use in prodution.  Most importantly it just
kmallocs a bio clone and pages for it without any mempool that guarantees
forward progress.
Max Gurtovoy Jan. 30, 2023, 2:33 p.m. UTC | #12
On 30/01/2023 14:35, Christoph Hellwig wrote:
> On Wed, Jan 18, 2023 at 04:20:27PM +0200, Max Gurtovoy wrote:
>> Today, if one would like to run both features using a capable device then
>> the PI will be offloaded (no SW fallback) and the Encryption will be using
>> SW fallback.
> It's not just running both - it's offering both as currently registering
> these fatures is mutally incompatible.

For sure we would like to offer both, but as mentioned before, this will 
require blk layer instrumentation and ib_core instrumentation to some 
pipeline_mr or something like that.
We are interested in doing this IO path but doing all in 1 series 
doesn't sounds possible.

First we would like to have at least the mutual exclusive solution since 
there are users that may want only one type of acceleration.

re-designing, both blk layer and ib_core for the PI + CRYPTO case in 
this series is not feasible IMO.
Sagi Grimberg Feb. 14, 2023, 10:01 a.m. UTC | #13
>>> Today, if one would like to run both features using a capable device 
>>> then
>>> the PI will be offloaded (no SW fallback) and the Encryption will be 
>>> using
>>> SW fallback.
>> It's not just running both - it's offering both as currently registering
>> these fatures is mutally incompatible.
> 
> For sure we would like to offer both, but as mentioned before, this will 
> require blk layer instrumentation and ib_core instrumentation to some 
> pipeline_mr or something like that.

Can you explain what is needed in the block layer that prevents queueing
a request that has PI and crypto ctx?

As for the rdma portion, I don't know enough if this is something
that is supported by the HW and lacks a SW interface, or inherently
unsupported in HW...