[v2,3/4] btrfs: Add zstd support
diff mbox

Message ID 15FC42E2-5823-4AF4-A945-0F7C60E7751C@fb.com
State New
Headers show

Commit Message

Nick Terrell July 11, 2017, 4:57 a.m. UTC
On 7/10/17, 5:36 AM, "Austin S. Hemmelgarn" <ahferroin7@gmail.com> wrote:
> On 2017-07-07 23:07, Adam Borowski wrote:

>> On Sat, Jul 08, 2017 at 01:40:18AM +0200, Adam Borowski wrote:

>>> On Fri, Jul 07, 2017 at 11:17:49PM +0000, Nick Terrell wrote:

>>>> On 7/6/17, 9:32 AM, "Adam Borowski" <kilobyte@angband.pl> wrote:

>>>>> Got a reproducible crash on amd64:

>>>

>>>> Thanks for the bug report Adam! I'm looking into the failure, and haven't

>>>> been able to reproduce it yet. I've built my kernel from your tree, and

>>>> I ran your script with the kernel.tar tarball 100 times, but haven't gotten

>>>> a failure yet.

>>>

>>>> I have a few questions to guide my debugging.

>>>>

>>>> - How many cores are you running with? I’ve run the script with 1, 2, and 4 cores.

>>>> - Which version of gcc are you using to compile the kernel? I’m using gcc-6.2.0-5ubuntu12.

>>>> - Are the failures always in exactly the same place, and does it fail 100%

>>>>    of the time or just regularly?

>>>

>>> 6 cores -- all on bare metal.  gcc-7.1.0-9.

>>> Lemme try with gcc-6, a different config or in a VM.

>>

>> I've tried the following:

>> * gcc-6, defconfig (+btrfs obviously)

>> * gcc-7, defconfig

>> * gcc-6, my regular config

>> * gcc-7, my regular config

>> * gcc-7, debug + UBSAN + etc

>> * gcc-7, defconfig, qemu-kvm with only 1 core

>>

>> Every build with gcc-7 reproduces the crash, every with gcc-6 does not.

>>

> Got a GCC7 tool-chain built, and I can confirm this here too, tested 

> with various numbers of cores ranging from 1-32 in a QEMU+KVM VM, with 

> various combinations of debug options and other config switches.


The problem is caused by a gcc-7 bug [1]. It miscompiles
ZSTD_wildcopy(void *dst, void const *src, ptrdiff_t len) when len is 0.
It only happens when it can't analyze ZSTD_copy8(), which is the case in
the kernel, because memcpy() is implemented with inline assembly. The
generated code is slow anyways, so I propose this workaround, which will
be included in the next patch set. I've confirmed that it fixes the bug for
me. This alternative implementation is also 10-20x faster, and compiles to
the same x86 assembly as the original ZSTD_wildcopy() with the userland
memcpy() implementation [2].

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=81388#add_comment
[2] https://godbolt.org/g/q5YpLx

Signed-off-by: Nick Terrell <terrelln@fb.com>

---
 lib/zstd/zstd_internal.h | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

--
2.9.3

Comments

Adam Borowski July 12, 2017, 3:38 a.m. UTC | #1
On Tue, Jul 11, 2017 at 06:01:38AM +0000, Nick Terrell wrote:
> On 7/10/17, 9:57 PM, "Nick Terrell" <terrelln@fb.com> wrote:
> > The problem is caused by a gcc-7 bug [1]. It miscompiles
> > ZSTD_wildcopy(void *dst, void const *src, ptrdiff_t len) when len is 0.
>
> Sorry, my patch still triggered the gcc bug, I used the wrong compiler.
> This patch works, and runs about the same speed as before the patch for
> small inputs, and slightly faster for larger inputs (100+ bytes). I'll
> look for a faster workaround if benchmarks show it matters.

Confirmed, the fix stops the crash for me.  Yay!

> --- a/lib/zstd/zstd_internal.h
> +++ b/lib/zstd/zstd_internal.h
> @@ -139,12 +139,8 @@ static void ZSTD_copy8(void *dst, const void *src) { memcpy(dst, src, 8); }
>  #define WILDCOPY_OVERLENGTH 8
>  ZSTD_STATIC void ZSTD_wildcopy(void *dst, const void *src, ptrdiff_t length)
>  {
> -	const BYTE *ip = (const BYTE *)src;
> -	BYTE *op = (BYTE *)dst;
> -	BYTE *const oend = op + length;
> -	do
> -		COPY8(op, ip)
> -	while (op < oend);
> +	if (length > 0)
> +		memcpy(dst, src, length);
>  }
>  
>  ZSTD_STATIC void ZSTD_wildcopy_e(void *dst, const void *src, void *dstEnd) /* should be faster for decoding, but strangely, not verified on all platform */

Patch
diff mbox

diff --git a/lib/zstd/zstd_internal.h b/lib/zstd/zstd_internal.h
index 6748719..ade0365 100644
--- a/lib/zstd/zstd_internal.h
+++ b/lib/zstd/zstd_internal.h
@@ -126,7 +126,9 @@  static const U32 OF_defaultNormLog = OF_DEFAULTNORMLOG;
 /*-*******************************************
 *  Shared functions to include for inlining
 *********************************************/
-static void ZSTD_copy8(void *dst, const void *src) { memcpy(dst, src, 8); }
+static void ZSTD_copy8(void *dst, const void *src) {
+       ZSTD_write64(dst, ZSTD_read64(src));
+}
 #define COPY8(d, s)               \
        {                         \
                ZSTD_copy8(d, s); \