multifd: Copy pages before compressing them with zlib

Message ID 20220704164112.2890137-1-iii@linux.ibm.com (mailing list archive)
State New, archived
Series multifd: Copy pages before compressing them with zlib

Commit Message

Ilya Leoshkevich July 4, 2022, 4:41 p.m. UTC
zlib_send_prepare() compresses pages of a running VM. zlib does not
make any thread-safety guarantees with respect to changing deflate()
input concurrently with deflate() [1].

One can observe problems due to this with the IBM zEnterprise Data
Compression accelerator-capable zlib [2]. When the hardware
acceleration is enabled, migration/multifd/tcp/plain/zlib test fails
intermittently [3] due to sliding window corruption. The accelerator's
architecture explicitly discourages concurrent accesses [4]:

    Page 26-57, "Other Conditions":

    As observed by this CPU, other CPUs, and channel
    programs, references to the parameter block, first,
    second, and third operands may be multiple-access
    references, accesses to these storage locations are
    not necessarily block-concurrent, and the sequence
    of these accesses or references is undefined.

Mark Adler pointed out that vanilla zlib performs double fetches under
certain circumstances as well [5]; therefore, we need to copy data
before passing it to deflate().

[1] https://zlib.net/manual.html
[2] https://github.com/madler/zlib/pull/410
[3] https://lists.nongnu.org/archive/html/qemu-devel/2022-03/msg03988.html
[4] http://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf
[5] https://gitlab.com/qemu-project/qemu/-/issues/1099

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
---

v1: https://lists.gnu.org/archive/html/qemu-devel/2022-03/msg06841.html
v1 -> v2: Rebase, mention Mark Adler's reply in the commit message.

 migration/multifd-zlib.c | 35 ++++++++++++++++++++++-------------
 1 file changed, 22 insertions(+), 13 deletions(-)
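
A minimal standalone sketch of the pattern the patch applies (identifiers
here are illustrative, not QEMU's; the actual change is in the diff below):

    #include <stdint.h>
    #include <string.h>
    #include <zlib.h>

    /*
     * Snapshot a page that another thread may still be writing, then hand
     * the stable private copy to deflate(). "zs" must already have been
     * set up with deflateInit(); "scratch" is caller-provided and at
     * least page_size bytes.
     */
    static int compress_page_copy(z_stream *zs, uint8_t *scratch,
                                  const uint8_t *live_page, size_t page_size,
                                  uint8_t *out, size_t out_len)
    {
        memcpy(scratch, live_page, page_size);  /* snapshot the racy page */

        zs->next_in = scratch;      /* deflate() never reads live memory */
        zs->avail_in = page_size;
        zs->next_out = out;
        zs->avail_out = out_len;

        return deflate(zs, Z_SYNC_FLUSH);       /* Z_OK on success */
    }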

Comments

Juan Quintela July 4, 2022, 4:51 p.m. UTC | #1
Ilya Leoshkevich <iii@linux.ibm.com> wrote:
> zlib_send_prepare() compresses pages of a running VM. zlib does not
> make any thread-safety guarantees with respect to changing deflate()
> input concurrently with deflate() [1].
>
> One can observe problems due to this with the IBM zEnterprise Data
> Compression accelerator capable zlib [2]. When the hardware
> acceleration is enabled, migration/multifd/tcp/plain/zlib test fails
> intermittently [3] due to sliding window corruption. The accelerator's
> architecture explicitly discourages concurrent accesses [4]:
>
>     Page 26-57, "Other Conditions":
>
>     As observed by this CPU, other CPUs, and channel
>     programs, references to the parameter block, first,
>     second, and third operands may be multiple-access
>     references, accesses to these storage locations are
>     not necessarily block-concurrent, and the sequence
>     of these accesses or references is undefined.
>
> Mark Adler pointed out that vanilla zlib performs double fetches under
> certain circumstances as well [5], therefore we need to copy data
> before passing it to deflate().
>
> [1] https://zlib.net/manual.html
> [2] https://github.com/madler/zlib/pull/410
> [3] https://lists.nongnu.org/archive/html/qemu-devel/2022-03/msg03988.html
> [4] http://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf
> [5] https://gitlab.com/qemu-project/qemu/-/issues/1099
>
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>

Reviewed-by: Juan Quintela <quintela@redhat.com>

And now I wonder if we need this for zstd.

That said, the (non-multifd) compression code has always operated the
other way, sniff.
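
For comparison, the same copy-before-compress pattern applied to zstd's
streaming API might look roughly like this (a sketch only, not QEMU's
multifd-zstd code; whether libzstd re-reads its input, and therefore needs
the copy at all, is exactly the open question above):

    #include <stdint.h>
    #include <string.h>
    #include <zstd.h>

    /*
     * Sketch: snapshot the page, then compress the private copy.
     * "cctx" comes from ZSTD_createCCtx(); "scratch" is caller-provided
     * and at least page_size bytes. All names are placeholders.
     */
    static size_t zstd_compress_page_copy(ZSTD_CCtx *cctx, uint8_t *scratch,
                                          const uint8_t *live_page,
                                          size_t page_size,
                                          uint8_t *dst, size_t dst_len)
    {
        memcpy(scratch, live_page, page_size);    /* snapshot first */

        ZSTD_inBuffer in = { scratch, page_size, 0 };
        ZSTD_outBuffer out = { dst, dst_len, 0 };
        size_t ret = ZSTD_compressStream2(cctx, &out, &in, ZSTD_e_flush);

        return ZSTD_isError(ret) ? ret : out.pos; /* bytes produced */
    }
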
Dr. David Alan Gilbert July 5, 2022, 3:27 p.m. UTC | #2
* Ilya Leoshkevich (iii@linux.ibm.com) wrote:
> zlib_send_prepare() compresses pages of a running VM. zlib does not
> make any thread-safety guarantees with respect to changing deflate()
> input concurrently with deflate() [1].
> 
> One can observe problems due to this with the IBM zEnterprise Data
> Compression accelerator capable zlib [2]. When the hardware
> acceleration is enabled, migration/multifd/tcp/plain/zlib test fails
> intermittently [3] due to sliding window corruption. The accelerator's
> architecture explicitly discourages concurrent accesses [4]:
> 
>     Page 26-57, "Other Conditions":
> 
>     As observed by this CPU, other CPUs, and channel
>     programs, references to the parameter block, first,
>     second, and third operands may be multiple-access
>     references, accesses to these storage locations are
>     not necessarily block-concurrent, and the sequence
>     of these accesses or references is undefined.
> 
> Mark Adler pointed out that vanilla zlib performs double fetches under
> certain circumstances as well [5], therefore we need to copy data
> before passing it to deflate().

Thanks for fixing that!

> [1] https://zlib.net/manual.html
> [2] https://github.com/madler/zlib/pull/410
> [3] https://lists.nongnu.org/archive/html/qemu-devel/2022-03/msg03988.html
> [4] http://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf
> [5] https://gitlab.com/qemu-project/qemu/-/issues/1099
> 
> Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> ---
> 
> v1: https://lists.gnu.org/archive/html/qemu-devel/2022-03/msg06841.html
> v1 -> v2: Rebase, mention Mark Adler's reply in the commit message.
> 
>  migration/multifd-zlib.c | 35 ++++++++++++++++++++++-------------
>  1 file changed, 22 insertions(+), 13 deletions(-)
> 
> diff --git a/migration/multifd-zlib.c b/migration/multifd-zlib.c
> index 3a7ae44485..b6b22b7d1f 100644
> --- a/migration/multifd-zlib.c
> +++ b/migration/multifd-zlib.c
> @@ -27,6 +27,8 @@ struct zlib_data {
>      uint8_t *zbuff;
>      /* size of compressed buffer */
>      uint32_t zbuff_len;
> +    /* uncompressed buffer */
> +    uint8_t buf[];
>  };
>  
>  /* Multifd zlib compression */
> @@ -43,9 +45,18 @@ struct zlib_data {
>   */
>  static int zlib_send_setup(MultiFDSendParams *p, Error **errp)
>  {
> -    struct zlib_data *z = g_new0(struct zlib_data, 1);
> -    z_stream *zs = &z->zs;
> +    /* This is the maximum size of the compressed buffer */
> +    uint32_t zbuff_len = compressBound(MULTIFD_PACKET_SIZE);
> +    size_t buf_len = qemu_target_page_size();
> +    struct zlib_data *z;
> +    z_stream *zs;
>  
> +    z = g_try_malloc0(sizeof(struct zlib_data) + buf_len + zbuff_len);

So I think this works; but wouldn't life be easier if you just used
separate malloc's for the buffers?  You've got a lot of hairy pointer
maths below that would go away if they were separate.

Dave

> +    if (!z) {
> +        error_setg(errp, "multifd %u: out of memory for zlib_data", p->id);
> +        return -1;
> +    }
> +    zs = &z->zs;
>      zs->zalloc = Z_NULL;
>      zs->zfree = Z_NULL;
>      zs->opaque = Z_NULL;
> @@ -54,15 +65,8 @@ static int zlib_send_setup(MultiFDSendParams *p, Error **errp)
>          error_setg(errp, "multifd %u: deflate init failed", p->id);
>          return -1;
>      }
> -    /* This is the maxium size of the compressed buffer */
> -    z->zbuff_len = compressBound(MULTIFD_PACKET_SIZE);
> -    z->zbuff = g_try_malloc(z->zbuff_len);
> -    if (!z->zbuff) {
> -        deflateEnd(&z->zs);
> -        g_free(z);
> -        error_setg(errp, "multifd %u: out of memory for zbuff", p->id);
> -        return -1;
> -    }
> +    z->zbuff_len = zbuff_len;
> +    z->zbuff = z->buf + buf_len;
>      p->data = z;
>      return 0;
>  }
> @@ -80,7 +84,6 @@ static void zlib_send_cleanup(MultiFDSendParams *p, Error **errp)
>      struct zlib_data *z = p->data;
>  
>      deflateEnd(&z->zs);
> -    g_free(z->zbuff);
>      z->zbuff = NULL;
>      g_free(p->data);
>      p->data = NULL;
> @@ -114,8 +117,14 @@ static int zlib_send_prepare(MultiFDSendParams *p, Error **errp)
>              flush = Z_SYNC_FLUSH;
>          }
>  
> +        /*
> +         * Since the VM might be running, the page may be changing concurrently
> +         * with compression. zlib does not guarantee that this is safe,
> +         * therefore copy the page before calling deflate().
> +         */
> +        memcpy(z->buf, p->pages->block->host + p->normal[i], page_size);
>          zs->avail_in = page_size;
> -        zs->next_in = p->pages->block->host + p->normal[i];
> +        zs->next_in = z->buf;
>  
>          zs->avail_out = available;
>          zs->next_out = z->zbuff + out_size;
> -- 
> 2.35.3
>
Peter Maydell July 5, 2022, 4 p.m. UTC | #3
On Mon, 4 Jul 2022 at 17:43, Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>
> zlib_send_prepare() compresses pages of a running VM. zlib does not
> make any thread-safety guarantees with respect to changing deflate()
> input concurrently with deflate() [1].
>
> One can observe problems due to this with the IBM zEnterprise Data
> Compression accelerator capable zlib [2]. When the hardware
> acceleration is enabled, migration/multifd/tcp/plain/zlib test fails
> intermittently [3] due to sliding window corruption. The accelerator's
> architecture explicitly discourages concurrent accesses [4]:
>
>     Page 26-57, "Other Conditions":
>
>     As observed by this CPU, other CPUs, and channel
>     programs, references to the parameter block, first,
>     second, and third operands may be multiple-access
>     references, accesses to these storage locations are
>     not necessarily block-concurrent, and the sequence
>     of these accesses or references is undefined.
>
> Mark Adler pointed out that vanilla zlib performs double fetches under
> certain circumstances as well [5], therefore we need to copy data
> before passing it to deflate().
>
> [1] https://zlib.net/manual.html
> [2] https://github.com/madler/zlib/pull/410
> [3] https://lists.nongnu.org/archive/html/qemu-devel/2022-03/msg03988.html
> [4] http://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf
> [5] https://gitlab.com/qemu-project/qemu/-/issues/1099

Is this [5] the wrong link? It's to our issue tracker, not zlib's
or a zlib mailing list thread, and it doesn't contain any messages
from Mark Adler.

thanks
-- PMM
Dr. David Alan Gilbert July 5, 2022, 4:16 p.m. UTC | #4
* Peter Maydell (peter.maydell@linaro.org) wrote:
> On Mon, 4 Jul 2022 at 17:43, Ilya Leoshkevich <iii@linux.ibm.com> wrote:
> >
> > zlib_send_prepare() compresses pages of a running VM. zlib does not
> > make any thread-safety guarantees with respect to changing deflate()
> > input concurrently with deflate() [1].
> >
> > One can observe problems due to this with the IBM zEnterprise Data
> > Compression accelerator capable zlib [2]. When the hardware
> > acceleration is enabled, migration/multifd/tcp/plain/zlib test fails
> > intermittently [3] due to sliding window corruption. The accelerator's
> > architecture explicitly discourages concurrent accesses [4]:
> >
> >     Page 26-57, "Other Conditions":
> >
> >     As observed by this CPU, other CPUs, and channel
> >     programs, references to the parameter block, first,
> >     second, and third operands may be multiple-access
> >     references, accesses to these storage locations are
> >     not necessarily block-concurrent, and the sequence
> >     of these accesses or references is undefined.
> >
> > Mark Adler pointed out that vanilla zlib performs double fetches under
> > certain circumstances as well [5], therefore we need to copy data
> > before passing it to deflate().
> >
> > [1] https://zlib.net/manual.html
> > [2] https://github.com/madler/zlib/pull/410
> > [3] https://lists.nongnu.org/archive/html/qemu-devel/2022-03/msg03988.html
> > [4] http://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf
> > [5] https://gitlab.com/qemu-project/qemu/-/issues/1099
> 
> Is this [5] the wrong link? It's to our issue tracker, not zlib's
> or a zlib mailing list thread, and it doesn't contain any messages
> from Mark Adler.

Looking at Mark's message, I'm not seeing that it was cc'd to the lists.
I did however ask him to update zlib's docs to describe the requirement.

Dave

> thanks
> -- PMM
>
Christian Borntraeger July 5, 2022, 4:27 p.m. UTC | #5
Am 05.07.22 um 18:16 schrieb Dr. David Alan Gilbert:
> * Peter Maydell (peter.maydell@linaro.org) wrote:
>> On Mon, 4 Jul 2022 at 17:43, Ilya Leoshkevich <iii@linux.ibm.com> wrote:
>>>
>>> zlib_send_prepare() compresses pages of a running VM. zlib does not
>>> make any thread-safety guarantees with respect to changing deflate()
>>> input concurrently with deflate() [1].
>>>
>>> One can observe problems due to this with the IBM zEnterprise Data
>>> Compression accelerator capable zlib [2]. When the hardware
>>> acceleration is enabled, migration/multifd/tcp/plain/zlib test fails
>>> intermittently [3] due to sliding window corruption. The accelerator's
>>> architecture explicitly discourages concurrent accesses [4]:
>>>
>>>      Page 26-57, "Other Conditions":
>>>
>>>      As observed by this CPU, other CPUs, and channel
>>>      programs, references to the parameter block, first,
>>>      second, and third operands may be multiple-access
>>>      references, accesses to these storage locations are
>>>      not necessarily block-concurrent, and the sequence
>>>      of these accesses or references is undefined.
>>>
>>> Mark Adler pointed out that vanilla zlib performs double fetches under
>>> certain circumstances as well [5], therefore we need to copy data
>>> before passing it to deflate().
>>>
>>> [1] https://zlib.net/manual.html
>>> [2] https://github.com/madler/zlib/pull/410
>>> [3] https://lists.nongnu.org/archive/html/qemu-devel/2022-03/msg03988.html
>>> [4] http://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf
>>> [5] https://gitlab.com/qemu-project/qemu/-/issues/1099
>>
>> Is this [5] the wrong link? It's to our issue tracker, not zlib's
>> or a zlib mailing list thread, and it doesn't contain any messages
>> from Mark Adler.
> 
> Looking at Mark's message, I'm not seeing that it was cc'd to the lists.
> I did however ask him to update zlib's docs to describe the requirement.


Can you maybe forward the message here?
Dr. David Alan Gilbert July 5, 2022, 4:33 p.m. UTC | #6
* Christian Borntraeger (borntraeger@de.ibm.com) wrote:
> Am 05.07.22 um 18:16 schrieb Dr. David Alan Gilbert:
> > * Peter Maydell (peter.maydell@linaro.org) wrote:
> > > On Mon, 4 Jul 2022 at 17:43, Ilya Leoshkevich <iii@linux.ibm.com> wrote:
> > > > 
> > > > zlib_send_prepare() compresses pages of a running VM. zlib does not
> > > > make any thread-safety guarantees with respect to changing deflate()
> > > > input concurrently with deflate() [1].
> > > > 
> > > > One can observe problems due to this with the IBM zEnterprise Data
> > > > Compression accelerator capable zlib [2]. When the hardware
> > > > acceleration is enabled, migration/multifd/tcp/plain/zlib test fails
> > > > intermittently [3] due to sliding window corruption. The accelerator's
> > > > architecture explicitly discourages concurrent accesses [4]:
> > > > 
> > > >      Page 26-57, "Other Conditions":
> > > > 
> > > >      As observed by this CPU, other CPUs, and channel
> > > >      programs, references to the parameter block, first,
> > > >      second, and third operands may be multiple-access
> > > >      references, accesses to these storage locations are
> > > >      not necessarily block-concurrent, and the sequence
> > > >      of these accesses or references is undefined.
> > > > 
> > > > Mark Adler pointed out that vanilla zlib performs double fetches under
> > > > certain circumstances as well [5], therefore we need to copy data
> > > > before passing it to deflate().
> > > > 
> > > > [1] https://zlib.net/manual.html
> > > > [2] https://github.com/madler/zlib/pull/410
> > > > [3] https://lists.nongnu.org/archive/html/qemu-devel/2022-03/msg03988.html
> > > > [4] http://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf
> > > > [5] https://gitlab.com/qemu-project/qemu/-/issues/1099
> > > 
> > > Is this [5] the wrong link? It's to our issue tracker, not zlib's
> > > or a zlib mailing list thread, and it doesn't contain any messages
> > > from Mark Adler.
> > 
> > Looking at Mark's message, I'm not seeing that it was cc'd to the lists.
> > I did however ask him to update zlib's docs to describe the requirement.
> 
> 
> Can you maybe forward the message here?

Let's see, it looks OK from the content; here's a copy of my reply with
the thread in it.  I've added Mark to the cc here so he knows.

Dave

<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
* Mark Adler (zlib@madler.net) wrote:
> Dave,
> 
> d), which should also result in an invalid check value (CRC-32 or Adler-32). I suppose you could call that b).
> 
> To get c), the input data would need to be read exactly once. However there is a case in deflate when writing a stored block where the input is accessed twice — once to copy to the output, and then a second time to fill in the sliding window. If the data changes between those two, then the sliding window does not reflect the data written, which can propagate to incorrect matches downstream of the modified data.
> 
> That is the only place I see that. The impact would usually be c), but if you are trying to compress incompressible data followed by compressible data, you will have stored blocks followed by dynamic blocks with matches to the incorrect data. Your testing would likely not expose that. (I tried to compile the linked test, but went down a rat hole to find include files and gave up.)

OK - thanks for your clarification!
I've created:
  https://gitlab.com/qemu-project/qemu/-/issues/1099

as a reminder we need to fix this in qemu somewhere.

Could you please add a note to the zlib docs somewhere to make this
explicit.

Thanks again,

Dave

> Mark
> 
> 
> > On Jun 30, 2022, at 9:26 AM, Dr. David Alan Gilbert <dgilbert@redhat.com> wrote:
> > 
> > * Mark Adler (zlib@madler.net <mailto:zlib@madler.net>) wrote:
> >> Ilya,
> >> 
> >> What exactly do you mean by “concurrently”? What is an example of this?
> > 
> > In qemu's live migration we have a thread that shuffles the contents of
> > guest memory out over the network. The guest is still
> > running at the time and changing the contents of the memory we're
> > saving.
> > Fortunately we keep a 'modified' flag so that if the guest does modify
> > it while we're saving, we know about it and will send the block again.
> > 
> > Zlib (and zstd) have recently been forcibly inserted into this; so zlib
> > may be compressing a page of memory that changes.
> > 
> >> If you mean modifying the data provided to deflate() before deflate() has returned, then that is certainly not safe.
> > 
> > So a question is what does 'not safe' mean:
> > a) It explodes and segs
> > b) It produces an invalid stream
> > c) It produces a valid stream but the data for the modified block may
> > be garbage
> > d) It produces a valid stream but the data for the modified block and
> > other blocks may be garbage.
> > 
> > The qemu live migration code is happy with (c) because it'll retransmit
> > a stable block later. So far with the software zlib libraries we've
> > seen that be fine; I think Ilya is finding something like (b) or (d) on
> > their compression hardware.
> > 
> > Dave
> > 
> >> 
> >> Mark
> >> 
> >> 
> >>> On Jun 22, 2022, at 2:04 AM, Ilya Leoshkevich <iii@linux.ibm.com> wrote:
> >>> 
> >>> [resending with a smaller cc: list in order to pass the
> >>> zlib-devel mailing list moderation process]
> >>> 
> >>> Hello zlib developers,
> >>> 
> >>> I've been investigating a problem in the QEMU test suite on IBM Z [1]
> >>> [2] in connection with the IBM Z compression accelerator patch [3].
> >>> 
> >>> The problem is that a QEMU thread compresses data that is being
> >>> modified by another QEMU thread. zlib manual [4] does not state that
> >>> this is safe, however, the current stable zlib in fact tolerates it.
> >>> 
> >>> The accelerator, however, does not: not only what it compresses ends up
> >>> being unpredictable - QEMU actually resolves this just fine -
> >>> but the accelerator's internal state also ends up being corrupted.
> >>> 
> >>> I have a design question in connection to this: does zlib guarantee
> >>> that modifying deflate() input concurrently with deflate() is safe?
> >>> Or does it reserve the right to change this in the future versions?
> >>> 
> >>> Cc:ing zlib-ng folks for their opinion as well.
> >>> 
> >>> [1] https://lists.gnu.org/archive/html/qemu-devel/2022-03/msg06841.html
> >>> [2] https://lists.gnu.org/archive/html/qemu-devel/2022-04/msg00329.html
> >>> [3] https://github.com/madler/zlib/pull/410
> >>> [4] https://zlib.net/manual.html
> >>> 
> >>> Best regards,
> >>> Ilya
> >> 
> > -- 
> > Dr. David Alan Gilbert / dgilbert@redhat.com <mailto:dgilbert@redhat.com> / Manchester, UK
> 
<><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><><>
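
To make Mark's stored-block scenario concrete, here is a toy illustration
of the access pattern he describes (this is not zlib's code, just the
shape of the hazard):

    #include <stdint.h>
    #include <string.h>

    /*
     * When deflate() emits a stored block it reads the input twice: once
     * to copy the bytes to the output, and once to fill the sliding
     * window used for later matches.
     */
    static void stored_block_double_fetch(uint8_t *out, uint8_t *window,
                                          const uint8_t *src, size_t len)
    {
        memcpy(out, src, len);     /* read #1: bytes actually emitted */
        /* ...a concurrent writer can modify src at this point... */
        memcpy(window, src, len);  /* read #2: bytes kept in the window */
        /*
         * If src changed between the two reads, the window disagrees with
         * what was emitted, and later matches against the window yield a
         * stream that does not decompress to the data that was sent.
         */
    }
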
Ilya Leoshkevich July 5, 2022, 5:22 p.m. UTC | #7
On Tue, 2022-07-05 at 16:27 +0100, Dr. David Alan Gilbert wrote:
> * Ilya Leoshkevich (iii@linux.ibm.com) wrote:
> > zlib_send_prepare() compresses pages of a running VM. zlib does not
> > make any thread-safety guarantees with respect to changing
> > deflate()
> > input concurrently with deflate() [1].
> > 
> > One can observe problems due to this with the IBM zEnterprise Data
> > Compression accelerator capable zlib [2]. When the hardware
> > acceleration is enabled, migration/multifd/tcp/plain/zlib test
> > fails
> > intermittently [3] due to sliding window corruption. The
> > accelerator's
> > architecture explicitly discourages concurrent accesses [4]:
> > 
> >     Page 26-57, "Other Conditions":
> > 
> >     As observed by this CPU, other CPUs, and channel
> >     programs, references to the parameter block, first,
> >     second, and third operands may be multiple-access
> >     references, accesses to these storage locations are
> >     not necessarily block-concurrent, and the sequence
> >     of these accesses or references is undefined.
> > 
> > Mark Adler pointed out that vanilla zlib performs double fetches
> > under
> > certain circumstances as well [5], therefore we need to copy data
> > before passing it to deflate().
> 
> Thanks for fixing that!
> 
> > [1] https://zlib.net/manual.html
> > [2] https://github.com/madler/zlib/pull/410
> > [3]
> > https://lists.nongnu.org/archive/html/qemu-devel/2022-03/msg03988.html
> > [4] http://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf
> > [5] https://gitlab.com/qemu-project/qemu/-/issues/1099
> > 
> > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > ---
> > 
> > v1:
> > https://lists.gnu.org/archive/html/qemu-devel/2022-03/msg06841.html
> > v1 -> v2: Rebase, mention Mark Adler's reply in the commit message.
> > 
> >  migration/multifd-zlib.c | 35 ++++++++++++++++++++++-------------
> >  1 file changed, 22 insertions(+), 13 deletions(-)
> > 
> > diff --git a/migration/multifd-zlib.c b/migration/multifd-zlib.c
> > index 3a7ae44485..b6b22b7d1f 100644
> > --- a/migration/multifd-zlib.c
> > +++ b/migration/multifd-zlib.c
> > @@ -27,6 +27,8 @@ struct zlib_data {
> >      uint8_t *zbuff;
> >      /* size of compressed buffer */
> >      uint32_t zbuff_len;
> > +    /* uncompressed buffer */
> > +    uint8_t buf[];
> >  };
> >  
> >  /* Multifd zlib compression */
> > @@ -43,9 +45,18 @@ struct zlib_data {
> >   */
> >  static int zlib_send_setup(MultiFDSendParams *p, Error **errp)
> >  {
> > -    struct zlib_data *z = g_new0(struct zlib_data, 1);
> > -    z_stream *zs = &z->zs;
> > +    /* This is the maximum size of the compressed buffer */
> > +    uint32_t zbuff_len = compressBound(MULTIFD_PACKET_SIZE);
> > +    size_t buf_len = qemu_target_page_size();
> > +    struct zlib_data *z;
> > +    z_stream *zs;
> >  
> > +    z = g_try_malloc0(sizeof(struct zlib_data) + buf_len +
> > zbuff_len);
> 
> So I think this works; but wouldn't life be easier if you just used
> separate malloc's for the buffers?  You've got a lot of hairy pointer
> maths below that would go away if they were separate.
> 
> Dave

I was trying to avoid an (IMHO equally hairy) error handling sequence
here. But I don't mind changing this if an alternative would be more
maintainable.

Best regards,
Ilya
Dr. David Alan Gilbert July 5, 2022, 5:32 p.m. UTC | #8
* Ilya Leoshkevich (iii@linux.ibm.com) wrote:
> On Tue, 2022-07-05 at 16:27 +0100, Dr. David Alan Gilbert wrote:
> > * Ilya Leoshkevich (iii@linux.ibm.com) wrote:
> > > zlib_send_prepare() compresses pages of a running VM. zlib does not
> > > make any thread-safety guarantees with respect to changing
> > > deflate()
> > > input concurrently with deflate() [1].
> > > 
> > > One can observe problems due to this with the IBM zEnterprise Data
> > > Compression accelerator capable zlib [2]. When the hardware
> > > acceleration is enabled, migration/multifd/tcp/plain/zlib test
> > > fails
> > > intermittently [3] due to sliding window corruption. The
> > > accelerator's
> > > architecture explicitly discourages concurrent accesses [4]:
> > > 
> > >     Page 26-57, "Other Conditions":
> > > 
> > >     As observed by this CPU, other CPUs, and channel
> > >     programs, references to the parameter block, first,
> > >     second, and third operands may be multiple-access
> > >     references, accesses to these storage locations are
> > >     not necessarily block-concurrent, and the sequence
> > >     of these accesses or references is undefined.
> > > 
> > > Mark Adler pointed out that vanilla zlib performs double fetches
> > > under
> > > certain circumstances as well [5], therefore we need to copy data
> > > before passing it to deflate().
> > 
> > Thanks for fixing that!
> > 
> > > [1] https://zlib.net/manual.html
> > > [2] https://github.com/madler/zlib/pull/410
> > > [3]
> > > https://lists.nongnu.org/archive/html/qemu-devel/2022-03/msg03988.html
> > > [4] http://publibfp.dhe.ibm.com/epubs/pdf/a227832c.pdf
> > > [5] https://gitlab.com/qemu-project/qemu/-/issues/1099
> > > 
> > > Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
> > > ---
> > > 
> > > v1:
> > > https://lists.gnu.org/archive/html/qemu-devel/2022-03/msg06841.html
> > > v1 -> v2: Rebase, mention Mark Adler's reply in the commit message.
> > > 
> > >  migration/multifd-zlib.c | 35 ++++++++++++++++++++++-------------
> > >  1 file changed, 22 insertions(+), 13 deletions(-)
> > > 
> > > diff --git a/migration/multifd-zlib.c b/migration/multifd-zlib.c
> > > index 3a7ae44485..b6b22b7d1f 100644
> > > --- a/migration/multifd-zlib.c
> > > +++ b/migration/multifd-zlib.c
> > > @@ -27,6 +27,8 @@ struct zlib_data {
> > >      uint8_t *zbuff;
> > >      /* size of compressed buffer */
> > >      uint32_t zbuff_len;
> > > +    /* uncompressed buffer */
> > > +    uint8_t buf[];
> > >  };
> > >  
> > >  /* Multifd zlib compression */
> > > @@ -43,9 +45,18 @@ struct zlib_data {
> > >   */
> > >  static int zlib_send_setup(MultiFDSendParams *p, Error **errp)
> > >  {
> > > -    struct zlib_data *z = g_new0(struct zlib_data, 1);
> > > -    z_stream *zs = &z->zs;
> > > +    /* This is the maximum size of the compressed buffer */
> > > +    uint32_t zbuff_len = compressBound(MULTIFD_PACKET_SIZE);
> > > +    size_t buf_len = qemu_target_page_size();
> > > +    struct zlib_data *z;
> > > +    z_stream *zs;
> > >  
> > > +    z = g_try_malloc0(sizeof(struct zlib_data) + buf_len +
> > > zbuff_len);
> > 
> > So I think this works; but wouldn't life be easier if you just used
> > separate malloc's for the buffers?  You've got a lot of hairy pointer
> > maths below that would go away if they were separate.
> > 
> > Dave
> 
> I was trying to avoid an (IMHO equally hairy) error handling sequence
> here. But I don't mind changing this if an alternative would be more
> maintainable.

It's probably worth trying; I bet it works out a lot simpler.
Remember that g_free(NULL) is safe; so if you want to do a cleanup
of a bunch of pointers you can do:
  g_free(a);
  g_free(b);
  g_free(c);

even if some combination of those hadn't been allocated yet.

Dave

> Best regards,
> Ilya
>
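
Sketched out, Dave's suggestion might look like this against the v2 patch,
assuming struct zlib_data keeps a plain "uint8_t *buf" pointer instead of
the flexible array member (untested and illustrative only; the
migrate_multifd_zlib_level() argument is an assumption about surrounding
code the quoted diff does not show):

    static int zlib_send_setup(MultiFDSendParams *p, Error **errp)
    {
        struct zlib_data *z = g_new0(struct zlib_data, 1);
        z_stream *zs = &z->zs;

        zs->zalloc = Z_NULL;
        zs->zfree = Z_NULL;
        zs->opaque = Z_NULL;
        if (deflateInit(zs, migrate_multifd_zlib_level()) != Z_OK) {
            g_free(z);
            error_setg(errp, "multifd %u: deflate init failed", p->id);
            return -1;
        }
        /* This is the maximum size of the compressed buffer */
        z->zbuff_len = compressBound(MULTIFD_PACKET_SIZE);
        z->zbuff = g_try_malloc(z->zbuff_len);
        z->buf = g_try_malloc(qemu_target_page_size());
        if (!z->zbuff || !z->buf) {
            deflateEnd(&z->zs);
            /* g_free(NULL) is a no-op, so one error path covers both */
            g_free(z->zbuff);
            g_free(z->buf);
            g_free(z);
            error_setg(errp, "multifd %u: out of memory for buffers", p->id);
            return -1;
        }
        p->data = z;
        return 0;
    }

The cleanup side would then g_free(z->buf) next to the restored
g_free(z->zbuff), and zlib_send_prepare() would keep the memcpy() into
z->buf unchanged.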

Patch

diff --git a/migration/multifd-zlib.c b/migration/multifd-zlib.c
index 3a7ae44485..b6b22b7d1f 100644
--- a/migration/multifd-zlib.c
+++ b/migration/multifd-zlib.c
@@ -27,6 +27,8 @@  struct zlib_data {
     uint8_t *zbuff;
     /* size of compressed buffer */
     uint32_t zbuff_len;
+    /* uncompressed buffer */
+    uint8_t buf[];
 };
 
 /* Multifd zlib compression */
@@ -43,9 +45,18 @@  struct zlib_data {
  */
 static int zlib_send_setup(MultiFDSendParams *p, Error **errp)
 {
-    struct zlib_data *z = g_new0(struct zlib_data, 1);
-    z_stream *zs = &z->zs;
+    /* This is the maximum size of the compressed buffer */
+    uint32_t zbuff_len = compressBound(MULTIFD_PACKET_SIZE);
+    size_t buf_len = qemu_target_page_size();
+    struct zlib_data *z;
+    z_stream *zs;
 
+    z = g_try_malloc0(sizeof(struct zlib_data) + buf_len + zbuff_len);
+    if (!z) {
+        error_setg(errp, "multifd %u: out of memory for zlib_data", p->id);
+        return -1;
+    }
+    zs = &z->zs;
     zs->zalloc = Z_NULL;
     zs->zfree = Z_NULL;
     zs->opaque = Z_NULL;
@@ -54,15 +65,8 @@  static int zlib_send_setup(MultiFDSendParams *p, Error **errp)
         error_setg(errp, "multifd %u: deflate init failed", p->id);
         return -1;
     }
-    /* This is the maxium size of the compressed buffer */
-    z->zbuff_len = compressBound(MULTIFD_PACKET_SIZE);
-    z->zbuff = g_try_malloc(z->zbuff_len);
-    if (!z->zbuff) {
-        deflateEnd(&z->zs);
-        g_free(z);
-        error_setg(errp, "multifd %u: out of memory for zbuff", p->id);
-        return -1;
-    }
+    z->zbuff_len = zbuff_len;
+    z->zbuff = z->buf + buf_len;
     p->data = z;
     return 0;
 }
@@ -80,7 +84,6 @@  static void zlib_send_cleanup(MultiFDSendParams *p, Error **errp)
     struct zlib_data *z = p->data;
 
     deflateEnd(&z->zs);
-    g_free(z->zbuff);
     z->zbuff = NULL;
     g_free(p->data);
     p->data = NULL;
@@ -114,8 +117,14 @@  static int zlib_send_prepare(MultiFDSendParams *p, Error **errp)
             flush = Z_SYNC_FLUSH;
         }
 
+        /*
+         * Since the VM might be running, the page may be changing concurrently
+         * with compression. zlib does not guarantee that this is safe,
+         * therefore copy the page before calling deflate().
+         */
+        memcpy(z->buf, p->pages->block->host + p->normal[i], page_size);
         zs->avail_in = page_size;
-        zs->next_in = p->pages->block->host + p->normal[i];
+        zs->next_in = z->buf;
 
         zs->avail_out = available;
         zs->next_out = z->zbuff + out_size;