diff mbox series

[14/17] btrfs: directly call into crypto framework for checsumming

Message ID 20190510111547.15310-15-jthumshirn@suse.de (mailing list archive)
State New, archived
Headers show
Series Add support for SHA-256 checksums | expand

Commit Message

Johannes Thumshirn May 10, 2019, 11:15 a.m. UTC
Currently btrfs_csum_data() relied on the crc32c() wrapper around the crypto
framework for calculating the CRCs.

As we have our own crypto_shash structure in the fs_info now, we can
directly call into the crypto framework without going trough the wrapper.

Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
---
 fs/btrfs/disk-io.c | 28 ++++++++++++++++++++++++++--
 1 file changed, 26 insertions(+), 2 deletions(-)

Comments

Chris Mason May 10, 2019, 1:45 p.m. UTC | #1
On 10 May 2019, at 7:15, Johannes Thumshirn wrote:

> Currently btrfs_csum_data() relied on the crc32c() wrapper around the 
> crypto
> framework for calculating the CRCs.
>
> As we have our own crypto_shash structure in the fs_info now, we can
> directly call into the crypto framework without going trough the 
> wrapper.

It has been a while since I dug through the cryptoapi internals.  I have 
one vague and somewhat unfounded concern, where we're defining the disk 
format to be whatever is returned from looking up sha256 or crc32c in 
the crypto tables.  I'm not really sure how to resolve this since we 
obviously don't want our own sha256 implementation.

I'm a little concerned about about btrfs_csum_data() and 
btrfs_csum_final() below.  We're using two different 
SHASH_DESC_ON_STACK() and then overwriting internals (shash_desc_ctx()) 
with the assumption that whatever we're doing is going to be the same as 
using the same shash_desc struct for both the update and final calls.  I 
think we should be either using or adding a helper to the crypto api for 
this.  We're digging too deep into cryptoapi private structs with the 
current usage.

Otherwise, thanks for doing this, it looks great overall.

-chris

>
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  fs/btrfs/disk-io.c | 28 ++++++++++++++++++++++++++--
>  1 file changed, 26 insertions(+), 2 deletions(-)
>
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index fb0b06b7b9f6..0c9ac7b45ce8 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -253,12 +253,36 @@ struct extent_map *btree_get_extent(struct 
> btrfs_inode *inode,
>  u32 btrfs_csum_data(struct btrfs_fs_info *fs_info, const char *data,
>  		    u32 seed, size_t len)
>  {
> -	return crc32c(seed, data, len);
> +	SHASH_DESC_ON_STACK(shash, fs_info->csum_shash);
> +	u32 *ctx = (u32 *)shash_desc_ctx(shash);
> +	u32 retval;
> +	int err;
> +
> +	shash->tfm = fs_info->csum_shash;
> +	shash->flags = 0;
> +	*ctx = seed;
> +
> +	err = crypto_shash_update(shash, data, len);
> +	ASSERT(err);
> +
> +	retval = *ctx;
> +	barrier_data(ctx);
> +
> +	return retval;
>  }
>
>  void btrfs_csum_final(struct btrfs_fs_info *fs_info, u32 crc, u8 
> *result)
>  {
> -	put_unaligned_le32(~crc, result);
> +	SHASH_DESC_ON_STACK(shash, fs_info->csum_shash);
> +	u32 *ctx = (u32 *)shash_desc_ctx(shash);
> +	int err;
> +
> +	shash->tfm = fs_info->csum_shash;
> +	shash->flags = 0;
> +	*ctx = crc;
> +
> +	err = crypto_shash_final(shash, result);
> +	ASSERT(err);
>  }
>
>  /*
> -- 
> 2.16.4
Chris Mason May 10, 2019, 1:54 p.m. UTC | #2
On 10 May 2019, at 9:45, Chris Mason wrote:

> On 10 May 2019, at 7:15, Johannes Thumshirn wrote:
>
>> Currently btrfs_csum_data() relied on the crc32c() wrapper around the 
>> crypto
>> framework for calculating the CRCs.
>>
>> As we have our own crypto_shash structure in the fs_info now, we can
>> directly call into the crypto framework without going trough the 
>> wrapper.
>
> It has been a while since I dug through the cryptoapi internals.  I 
> have one vague and somewhat unfounded concern, where we're defining 
> the disk format to be whatever is returned from looking up sha256 or 
> crc32c in the crypto tables.  I'm not really sure how to resolve this 
> since we obviously don't want our own sha256 implementation.
>
> I'm a little concerned about about btrfs_csum_data() and 
> btrfs_csum_final() below.  We're using two different 
> SHASH_DESC_ON_STACK() and then overwriting internals 
> (shash_desc_ctx()) with the assumption that whatever we're doing is 
> going to be the same as using the same shash_desc struct for both the 
> update and final calls.  I think we should be either using or adding a 
> helper to the crypto api for this.  We're digging too deep into 
> cryptoapi private structs with the current usage.

Looking at the callers of btrfs_csum_final() and btrfs_csum_data(), lets 
just pass the shash_desc from the callers.  That way you don't have to 
poke at memcpy and shash_desc_ctx().

I'm a little worried about stack usage (at least 360 bytes), but worst 
case we can do some percpu gymnastics.

-chris
Johannes Thumshirn May 13, 2019, 7:17 a.m. UTC | #3
On Fri, May 10, 2019 at 01:54:30PM +0000, Chris Mason wrote:
> Looking at the callers of btrfs_csum_final() and btrfs_csum_data(), lets 
> just pass the shash_desc from the callers.  That way you don't have to 
> poke at memcpy and shash_desc_ctx().
 
I wanted to avoid spreading knowledge about the crypto api to deep into the
filesystem. I'd actually loved to have something akin to ubifs'
ubifs_info::log_hash but wasn't really sure how to handle concurrency.

> I'm a little worried about stack usage (at least 360 bytes), but worst 
> case we can do some percpu gymnastics.
> 
> -chris
David Sterba May 13, 2019, 1 p.m. UTC | #4
On Fri, May 10, 2019 at 01:15:44PM +0200, Johannes Thumshirn wrote:
> Currently btrfs_csum_data() relied on the crc32c() wrapper around the crypto
> framework for calculating the CRCs.
> 
> As we have our own crypto_shash structure in the fs_info now, we can
> directly call into the crypto framework without going trough the wrapper.
> 
> Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> ---
>  fs/btrfs/disk-io.c | 28 ++++++++++++++++++++++++++--
>  1 file changed, 26 insertions(+), 2 deletions(-)
> 
> diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> index fb0b06b7b9f6..0c9ac7b45ce8 100644
> --- a/fs/btrfs/disk-io.c
> +++ b/fs/btrfs/disk-io.c
> @@ -253,12 +253,36 @@ struct extent_map *btree_get_extent(struct btrfs_inode *inode,
>  u32 btrfs_csum_data(struct btrfs_fs_info *fs_info, const char *data,
>  		    u32 seed, size_t len)
>  {
> -	return crc32c(seed, data, len);
> +	SHASH_DESC_ON_STACK(shash, fs_info->csum_shash);
> +	u32 *ctx = (u32 *)shash_desc_ctx(shash);
> +	u32 retval;
> +	int err;
> +
> +	shash->tfm = fs_info->csum_shash;
> +	shash->flags = 0;
> +	*ctx = seed;
> +
> +	err = crypto_shash_update(shash, data, len);
> +	ASSERT(err);
> +
> +	retval = *ctx;
> +	barrier_data(ctx);
> +
> +	return retval;

I unfortunatelly had a dive into the crypto API calls and since then I'm
biased against using it (too much indirection and extra memory
references). So my idea of this function is:

switch (fs_info->csum_type) {
case CRC32:
	... calculate crc32c;
	break;
case SHA256:
	... calculate sha56;
	break;
}

with direct calls to the hash primitives rather than thravelling trough
the shash.
Johannes Thumshirn May 13, 2019, 1:01 p.m. UTC | #5
On Mon, May 13, 2019 at 03:00:03PM +0200, David Sterba wrote:
> On Fri, May 10, 2019 at 01:15:44PM +0200, Johannes Thumshirn wrote:
> > Currently btrfs_csum_data() relied on the crc32c() wrapper around the crypto
> > framework for calculating the CRCs.
> > 
> > As we have our own crypto_shash structure in the fs_info now, we can
> > directly call into the crypto framework without going trough the wrapper.
> > 
> > Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> > ---
> >  fs/btrfs/disk-io.c | 28 ++++++++++++++++++++++++++--
> >  1 file changed, 26 insertions(+), 2 deletions(-)
> > 
> > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> > index fb0b06b7b9f6..0c9ac7b45ce8 100644
> > --- a/fs/btrfs/disk-io.c
> > +++ b/fs/btrfs/disk-io.c
> > @@ -253,12 +253,36 @@ struct extent_map *btree_get_extent(struct btrfs_inode *inode,
> >  u32 btrfs_csum_data(struct btrfs_fs_info *fs_info, const char *data,
> >  		    u32 seed, size_t len)
> >  {
> > -	return crc32c(seed, data, len);
> > +	SHASH_DESC_ON_STACK(shash, fs_info->csum_shash);
> > +	u32 *ctx = (u32 *)shash_desc_ctx(shash);
> > +	u32 retval;
> > +	int err;
> > +
> > +	shash->tfm = fs_info->csum_shash;
> > +	shash->flags = 0;
> > +	*ctx = seed;
> > +
> > +	err = crypto_shash_update(shash, data, len);
> > +	ASSERT(err);
> > +
> > +	retval = *ctx;
> > +	barrier_data(ctx);
> > +
> > +	return retval;
> 
> I unfortunatelly had a dive into the crypto API calls and since then I'm
> biased against using it (too much indirection and extra memory
> references). So my idea of this function is:
> 
> switch (fs_info->csum_type) {
> case CRC32:
> 	... calculate crc32c;
> 	break;
> case SHA256:
> 	... calculate sha56;
> 	break;
> }
> 
> with direct calls to the hash primitives rather than thravelling trough
> the shash.

well at least the crc3c2() call we use does nothing else (from
lib/libcrc32c.c):

u32 crc32c(u32 crc, const void *address, unsigned int length)
{
	SHASH_DESC_ON_STACK(shash, tfm);
	u32 ret, *ctx = (u32 *)shash_desc_ctx(shash);
	int err;

	shash->tfm = tfm;
	shash->flags = 0;
	*ctx = crc;

	err = crypto_shash_update(shash, address, length);
	BUG_ON(err);

	ret = *ctx;
	barrier_data(ctx);
	return ret;
}

EXPORT_SYMBOL(crc32c);
Chris Mason May 13, 2019, 1:55 p.m. UTC | #6
On 13 May 2019, at 3:17, Johannes Thumshirn wrote:

> On Fri, May 10, 2019 at 01:54:30PM +0000, Chris Mason wrote:
>> Looking at the callers of btrfs_csum_final() and btrfs_csum_data(), 
>> lets
>> just pass the shash_desc from the callers.  That way you don't have 
>> to
>> poke at memcpy and shash_desc_ctx().
>
> I wanted to avoid spreading knowledge about the crypto api to deep 
> into the
> filesystem. I'd actually loved to have something akin to ubifs'
> ubifs_info::log_hash but wasn't really sure how to handle concurrency.

It's much better to be explicit that we're using the crypto API than to 
muck around in crypto api internal data structs.  It's just a few 
callers, and not major surgery to change it around some time later if we 
need to.

-chris
David Sterba May 13, 2019, 2:30 p.m. UTC | #7
On Mon, May 13, 2019 at 03:01:11PM +0200, Johannes Thumshirn wrote:
> On Mon, May 13, 2019 at 03:00:03PM +0200, David Sterba wrote:
> > On Fri, May 10, 2019 at 01:15:44PM +0200, Johannes Thumshirn wrote:
> > > Currently btrfs_csum_data() relied on the crc32c() wrapper around the crypto
> > > framework for calculating the CRCs.
> > > 
> > > As we have our own crypto_shash structure in the fs_info now, we can
> > > directly call into the crypto framework without going trough the wrapper.
> > > 
> > > Signed-off-by: Johannes Thumshirn <jthumshirn@suse.de>
> > > ---
> > >  fs/btrfs/disk-io.c | 28 ++++++++++++++++++++++++++--
> > >  1 file changed, 26 insertions(+), 2 deletions(-)
> > > 
> > > diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
> > > index fb0b06b7b9f6..0c9ac7b45ce8 100644
> > > --- a/fs/btrfs/disk-io.c
> > > +++ b/fs/btrfs/disk-io.c
> > > @@ -253,12 +253,36 @@ struct extent_map *btree_get_extent(struct btrfs_inode *inode,
> > >  u32 btrfs_csum_data(struct btrfs_fs_info *fs_info, const char *data,
> > >  		    u32 seed, size_t len)
> > >  {
> > > -	return crc32c(seed, data, len);
> > > +	SHASH_DESC_ON_STACK(shash, fs_info->csum_shash);
> > > +	u32 *ctx = (u32 *)shash_desc_ctx(shash);
> > > +	u32 retval;
> > > +	int err;
> > > +
> > > +	shash->tfm = fs_info->csum_shash;
> > > +	shash->flags = 0;
> > > +	*ctx = seed;
> > > +
> > > +	err = crypto_shash_update(shash, data, len);
> > > +	ASSERT(err);
> > > +
> > > +	retval = *ctx;
> > > +	barrier_data(ctx);
> > > +
> > > +	return retval;
> > 
> > I unfortunatelly had a dive into the crypto API calls and since then I'm
> > biased against using it (too much indirection and extra memory
> > references). So my idea of this function is:
> > 
> > switch (fs_info->csum_type) {
> > case CRC32:
> > 	... calculate crc32c;
> > 	break;
> > case SHA256:
> > 	... calculate sha56;
> > 	break;
> > }
> > 
> > with direct calls to the hash primitives rather than thravelling trough
> > the shash.
> 
> well at least the crc3c2() call we use does nothing else (from
> lib/libcrc32c.c):

Oh I see, and we actually can't avoid the crypto api because we want to
let it select the implementation based on acceleration hw or use the
fallback. Ok then.
Johannes Thumshirn May 14, 2019, 12:46 p.m. UTC | #8
On Fri, May 10, 2019 at 01:45:47PM +0000, Chris Mason wrote:
> I'm a little concerned about about btrfs_csum_data() and 
> btrfs_csum_final() below.  We're using two different 
> SHASH_DESC_ON_STACK() and then overwriting internals (shash_desc_ctx()) 
> with the assumption that whatever we're doing is going to be the same as 
> using the same shash_desc struct for both the update and final calls.  I 
> think we should be either using or adding a helper to the crypto api for 
> this.  We're digging too deep into cryptoapi private structs with the 
> current usage.
 
I think I found the solution. Instead of doing memset() + memcpy() +
crypto_shash_update() + crypto_shash_final(), I could call
crypto_shash_digest() which would wrap all of them for me in both the crc32c
and sha256 case.

Let me have a look and throw it in some testing.

> Otherwise, thanks for doing this, it looks great overall.

Thanks :)
diff mbox series

Patch

diff --git a/fs/btrfs/disk-io.c b/fs/btrfs/disk-io.c
index fb0b06b7b9f6..0c9ac7b45ce8 100644
--- a/fs/btrfs/disk-io.c
+++ b/fs/btrfs/disk-io.c
@@ -253,12 +253,36 @@  struct extent_map *btree_get_extent(struct btrfs_inode *inode,
 u32 btrfs_csum_data(struct btrfs_fs_info *fs_info, const char *data,
 		    u32 seed, size_t len)
 {
-	return crc32c(seed, data, len);
+	SHASH_DESC_ON_STACK(shash, fs_info->csum_shash);
+	u32 *ctx = (u32 *)shash_desc_ctx(shash);
+	u32 retval;
+	int err;
+
+	shash->tfm = fs_info->csum_shash;
+	shash->flags = 0;
+	*ctx = seed;
+
+	err = crypto_shash_update(shash, data, len);
+	ASSERT(err);
+
+	retval = *ctx;
+	barrier_data(ctx);
+
+	return retval;
 }
 
 void btrfs_csum_final(struct btrfs_fs_info *fs_info, u32 crc, u8 *result)
 {
-	put_unaligned_le32(~crc, result);
+	SHASH_DESC_ON_STACK(shash, fs_info->csum_shash);
+	u32 *ctx = (u32 *)shash_desc_ctx(shash);
+	int err;
+
+	shash->tfm = fs_info->csum_shash;
+	shash->flags = 0;
+	*ctx = crc;
+
+	err = crypto_shash_final(shash, result);
+	ASSERT(err);
 }
 
 /*