
generic/692: Generalize the test for non-4k merkle tree block sizes

Message ID 20230111175314.73346-1-ojaswin@linux.ibm.com (mailing list archive)
State New, archived
Series generic/692: Generalize the test for non-4k merkle tree block sizes

Commit Message

Ojaswin Mujoo Jan. 11, 2023, 5:53 p.m. UTC
Due to the assumption that the Merkle tree block size is 4k, the size calculated
for the second test caused it to take far too long to hit EFBIG with bigger
block sizes like 64k. Fix this by generalizing the calculation.

Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
---
 tests/generic/692 | 21 +++++++++++++++------
 1 file changed, 15 insertions(+), 6 deletions(-)

Comments

Eric Biggers Jan. 11, 2023, 8:51 p.m. UTC | #1
On Wed, Jan 11, 2023 at 11:23:14PM +0530, Ojaswin Mujoo wrote:
> Due to the assumption that the Merkle tree block size is 4k, the size calculated
> for the second test caused it to take far too long to hit EFBIG with bigger
> block sizes like 64k. Fix this by generalizing the calculation.
> 
> Signed-off-by: Ojaswin Mujoo <ojaswin@linux.ibm.com>
> ---
>  tests/generic/692 | 21 +++++++++++++++------
>  1 file changed, 15 insertions(+), 6 deletions(-)
> 
> diff --git a/tests/generic/692 b/tests/generic/692
> index d6da734b..0a354802 100755
> --- a/tests/generic/692
> +++ b/tests/generic/692
> @@ -54,15 +54,24 @@ _fsv_enable $fsv_file |& _filter_scratch
>  # (MAX) to be in the middle of L0 -- ideally near the beginning of L0 so that we
>  # don't have to write many blocks before getting an error.
>  #
> -# With SHA-256 and 4K blocks, there are 128 hashes per block.  Thus, ignoring
> -# padding, L0 is 1/128 of the file size while the other levels in total are
> -# 1/128**2 + 1/128**3 + 1/128**4 + ... = 1/16256 of the file size.  So still
> +# For example, with SHA-256 and 4K blocks, there are 128 hashes per block. Thus,
> +# ignoring padding, L0 is 1/128 of the file size while the other levels in total
> +# are 1/128**2 + 1/128**3 + 1/128**4 + ... = 1/16256 of the file size. So still
>  # ignoring padding, for L0 start exactly at MAX, the file size must be s such
> -# that s + s/16256 = MAX, i.e. s = MAX * (16256/16257).  Then to get a file size
> +# that s + s/16256 = MAX, i.e. s = MAX * (16256/16257). Then to get a file size
>  # where MAX occurs *near* the start of L0 rather than *at* the start, we can
>  # just subtract an overestimate of the padding: 64K after the file contents,
> -# then 4K per level, where the consideration of 8 levels is sufficient.
> -sz=$(echo "scale=20; $max_sz * (16256/16257) - 65536 - 4096*8" | $BC -q | cut -d. -f1)
> +# then 4K per level, where the consideration of 8 levels is sufficient. Below
> +# code generalizes this logic for all merkle tree sizes.
> +bs=$FSV_BLOCK_SIZE
> +hash_size=32   # SHA-256
> +hash_per_block=$(echo "scale=20; $bs/($hash_size)" | $BC -q)
> +a=$(echo "scale=20; 1/($hash_per_block^2)" | $BC -q)
> +r=$(echo "scale=20; 1/$hash_per_block" | $BC -q)
> +treesize_without_l1=$(echo "scale=20; $a/(1-$r)" | $BC -q)
> +sz=$(echo "scale=20; $max_sz/(1+$treesize_without_l1)" | $BC -q)
> +# adjust $sz so we are more likely to hit EFBIG while building level 1
> +sz=$(echo "scale=20; $sz - 65536 - $bs*8" | $BC -q | cut -d. -f1)
>  _fsv_scratch_begin_subtest "still too big: fail on first invalid merkle block"
>  truncate -s $sz $fsv_file
>  _fsv_enable $fsv_file |& _filter_scratch

Thanks!  I'd like to improve the explanation of the calculation, and fix up a
few other things, so I ended up just sending out an updated version of this
patch --- I hope that's okay with you.  Can you take a look?
https://lore.kernel.org/r/20230111204739.77828-1-ebiggers@kernel.org

- Eric

Patch

diff --git a/tests/generic/692 b/tests/generic/692
index d6da734b..0a354802 100755
--- a/tests/generic/692
+++ b/tests/generic/692
@@ -54,15 +54,24 @@  _fsv_enable $fsv_file |& _filter_scratch
 # (MAX) to be in the middle of L0 -- ideally near the beginning of L0 so that we
 # don't have to write many blocks before getting an error.
 #
-# With SHA-256 and 4K blocks, there are 128 hashes per block.  Thus, ignoring
-# padding, L0 is 1/128 of the file size while the other levels in total are
-# 1/128**2 + 1/128**3 + 1/128**4 + ... = 1/16256 of the file size.  So still
+# For example, with SHA-256 and 4K blocks, there are 128 hashes per block. Thus,
+# ignoring padding, L0 is 1/128 of the file size while the other levels in total
+# are 1/128**2 + 1/128**3 + 1/128**4 + ... = 1/16256 of the file size. So still
 # ignoring padding, for L0 start exactly at MAX, the file size must be s such
-# that s + s/16256 = MAX, i.e. s = MAX * (16256/16257).  Then to get a file size
+# that s + s/16256 = MAX, i.e. s = MAX * (16256/16257). Then to get a file size
 # where MAX occurs *near* the start of L0 rather than *at* the start, we can
 # just subtract an overestimate of the padding: 64K after the file contents,
-# then 4K per level, where the consideration of 8 levels is sufficient.
-sz=$(echo "scale=20; $max_sz * (16256/16257) - 65536 - 4096*8" | $BC -q | cut -d. -f1)
+# then 4K per level, where the consideration of 8 levels is sufficient. Below
+# code generalizes this logic for all merkle tree sizes.
+bs=$FSV_BLOCK_SIZE
+hash_size=32   # SHA-256
+hash_per_block=$(echo "scale=20; $bs/($hash_size)" | $BC -q)
+a=$(echo "scale=20; 1/($hash_per_block^2)" | $BC -q)
+r=$(echo "scale=20; 1/$hash_per_block" | $BC -q)
+treesize_without_l1=$(echo "scale=20; $a/(1-$r)" | $BC -q)
+sz=$(echo "scale=20; $max_sz/(1+$treesize_without_l1)" | $BC -q)
+# adjust $sz so we are more likely to hit EFBIG while building level 1
+sz=$(echo "scale=20; $sz - 65536 - $bs*8" | $BC -q | cut -d. -f1)
 _fsv_scratch_begin_subtest "still too big: fail on first invalid merkle block"
 truncate -s $sz $fsv_file
 _fsv_enable $fsv_file |& _filter_scratch
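
The geometric-series calculation the patch introduces can be sanity-checked outside the test. Below is a minimal Python sketch of the same arithmetic; the `max_sz` value and the helper name are illustrative assumptions, not taken from the test:

```python
# Illustrative sanity check (not part of xfstests): recompute the file
# size the test derives, for 4K and 64K Merkle tree block sizes.

HASH_SIZE = 32  # SHA-256 digest size in bytes

def efbig_file_size(max_sz, block_size):
    """File size s such that s plus the Merkle levels above L0 sits just
    under max_sz, minus an overestimate of padding (64K after the
    contents plus one block per level, 8 levels being sufficient)."""
    hashes_per_block = block_size / HASH_SIZE
    a = 1 / hashes_per_block ** 2   # first term: L1 as a fraction of s
    r = 1 / hashes_per_block        # common ratio between levels
    tree_above_l0 = a / (1 - r)     # L1 + L2 + ... as a fraction of s
    s = max_sz / (1 + tree_above_l0)
    return int(s - 65536 - block_size * 8)

max_sz = 1 << 44   # hypothetical filesystem max file size (16 TiB)
sz_4k = efbig_file_size(max_sz, 4096)
sz_64k = efbig_file_size(max_sz, 65536)

# With 4K blocks, 1 / (1 + 1/16256) == 16256/16257, so this matches the
# old hard-coded formula from the test:
old_4k = int(max_sz * (16256 / 16257) - 65536 - 4096 * 8)
print(sz_4k, sz_64k, old_4k)
```

For 4K Merkle blocks the generalized formula reduces to the old `16256/16257` factor. For 64K blocks the levels above L0 are a far smaller fraction of the file, so the old constant undersized the file and placed MAX much deeper into L0, which is why many more Merkle blocks had to be written before EFBIG.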