[for-2.12,v2] qemu-iotests: update 185 output

Message ID	5c93e912-9d6a-214c-e4fb-8a4e6fc6e7f9@redhat.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org> To: Stefan Hajnoczi <stefanha@redhat.com>, qemu-devel@nongnu.org References: <20180404150146.23281-1-stefanha@redhat.com> From: Max Reitz <mreitz@redhat.com> Openpgp: preference=signencrypt Autocrypt: addr=mreitz@redhat.com; prefer-encrypt=mutual; keydata= xsBNBFXOJlcBCADEyyhOTsoa/2ujoTRAJj4MKA21dkxxELVj3cuILpLTmtachWj7QW+TVG8U /PsMCFbpwsQR7oEy8eHHZwuGQsNpEtNC2G/L8Yka0BIBzv7dEgrPzIu+W3anZXQW4702+uES U29G8TP/NGfXRRHGlbBIH9KNUnOSUD2vRtpOLXkWsV5CN6vQFYgQfFvmp5ZpPeUe6xNplu8V mcTw8OSEDW/ZnxJc8TekCKZSpdzYoxfzjm7xGmZqB18VFwgJZlIibt1HE0EB4w5GsD7x5ekh awIe3RwoZgZDLQMdOitJ1tUc8aqaxvgA4tz6J6st8D8pS//m1gAoYJWGwwIVj1DjTYLtABEB AAHNHU1heCBSZWl0eiA8bXJlaXR6QHJlZGhhdC5jb20+wsCTBBMBCAA9AhsDBQkSzAMABQsJ CAcCBhUICQoLAgQWAgMBAh4BAheABQJVzie5FRhoa3A6Ly9rZXlzLmdudXBnLm5ldAAKCRD0 B9sAYdXPQDcIB/9uNkbYEex1rHKz3mr12uxYMwLOOFY9fstP5aoVJQ1nWQVB6m2cfKGdcRe1 2/nFaHSNAzT0NnKz2MjhZVmcrpyd2Gp2QyISCfb1FbT82GMtXFj1wiHmPb3CixYmWGQUUh+I AvUqsevLA+WihgBUyaJq/vuDVM1/K9Un+w+Tz5vpeMidlIsTYhcsMhn0L9wlCjoucljvbDy/ 8C9L2DUdgi3XTa0ORKeflUhdL4gucWoAMrKX2nmPjBMKLgU7WLBc8AtV+84b9OWFML6NEyo4 4cP7cM/07VlJK53pqNg5cHtnWwjHcbpGkQvx6RUx6F1My3y52vM24rNUA3+ligVEgPYBzsBN BFXOJlcBCADAmcVUNTWT6yLWQHvxZ0o47KCP8OcLqD+67T0RCe6d0LP8GsWtrJdeDIQk+T+F xO7DolQPS6iQ6Ak2/lJaPX8L0BkEAiMuLCKFU6Bn3lFOkrQeKp3u05wCSV1iKnhg0UPji9V2 W5eNfy8F4ZQHpeGUGy+liGXlxqkeRVhLyevUqfU0WgNqAJpfhHSGpBgihUupmyUg7lfUPeRM DzAN1pIqoFuxnN+BRHdAecpsLcbR8sQddXmDg9BpSKozO/JyBmaS1RlquI8HERQoe6EynJhd 64aICHDfj61rp+/0jTIcevxIIAzW70IadoS/y3DVIkuhncgDBvGbF3aBtjrJVP+5ABEBAAHC wGUEGAEIAA8FAlXOJlcCGwwFCRLMAwAACgkQ9AfbAGHVz0CbFwf9F/PXxQR9i4N0iipISYjU sxVdjJOM2TMut+ZZcQ6NSMvhZ0ogQxJ+iEQ5OjnIputKvPVd5U7WRh+4lF1lB/NQGrGZQ1ic alkj6ocscQyFwfib+xIe9w8TG1CVGkII7+TbS5pXHRxZH1niaRpoi/hYtgzkuOPp35jJyqT/ /ELbqQTDAWcqtJhzxKLE/ugcOMK520dJDeb6x2xVES+S5LXby0D4juZlvUj+1fwZu+7Io5+B bkhSVPb/QdOVTpnz7zWNyNw+OONo1aBUKkhq2UIByYXgORPFnbfMY7QWHcjpBVw9MgC4tGeF R4bv+1nAMMxKmb5VvQCExr0eFhJUAHAhVg== Message-ID: <5c93e912-9d6a-214c-e4fb-8a4e6fc6e7f9@redhat.com> Date: Wed, 4 Apr 2018 18:16:12 +0200 User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101 Thunderbird/52.6.0 MIME-Version: 1.0 In-Reply-To: <20180404150146.23281-1-stefanha@redhat.com> Content-Type: text/plain; charset=utf-8 Content-Language: en-US Content-Transfer-Encoding: 7bit Subject: Re: [Qemu-devel] [PATCH for-2.12 v2] qemu-iotests: update 185 output Precedence: list Cc: Kevin Wolf <kwolf@redhat.com>, QingFeng Hao <haoqf@linux.vnet.ibm.com>, qemu-block@nongnu.org Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>

Message ID

5c93e912-9d6a-214c-e4fb-8a4e6fc6e7f9@redhat.com (mailing list archive)

State

New, archived

Headers

To: Stefan Hajnoczi <stefanha@redhat.com>, qemu-devel@nongnu.org
References: <20180404150146.23281-1-stefanha@redhat.com>
From: Max Reitz <mreitz@redhat.com>
Openpgp: preference=signencrypt
Autocrypt: addr=mreitz@redhat.com; prefer-encrypt=mutual; keydata=
	xsBNBFXOJlcBCADEyyhOTsoa/2ujoTRAJj4MKA21dkxxELVj3cuILpLTmtachWj7QW+TVG8U
	/PsMCFbpwsQR7oEy8eHHZwuGQsNpEtNC2G/L8Yka0BIBzv7dEgrPzIu+W3anZXQW4702+uES
	U29G8TP/NGfXRRHGlbBIH9KNUnOSUD2vRtpOLXkWsV5CN6vQFYgQfFvmp5ZpPeUe6xNplu8V
	mcTw8OSEDW/ZnxJc8TekCKZSpdzYoxfzjm7xGmZqB18VFwgJZlIibt1HE0EB4w5GsD7x5ekh
	awIe3RwoZgZDLQMdOitJ1tUc8aqaxvgA4tz6J6st8D8pS//m1gAoYJWGwwIVj1DjTYLtABEB
	AAHNHU1heCBSZWl0eiA8bXJlaXR6QHJlZGhhdC5jb20+wsCTBBMBCAA9AhsDBQkSzAMABQsJ
	CAcCBhUICQoLAgQWAgMBAh4BAheABQJVzie5FRhoa3A6Ly9rZXlzLmdudXBnLm5ldAAKCRD0
	B9sAYdXPQDcIB/9uNkbYEex1rHKz3mr12uxYMwLOOFY9fstP5aoVJQ1nWQVB6m2cfKGdcRe1
	2/nFaHSNAzT0NnKz2MjhZVmcrpyd2Gp2QyISCfb1FbT82GMtXFj1wiHmPb3CixYmWGQUUh+I
	AvUqsevLA+WihgBUyaJq/vuDVM1/K9Un+w+Tz5vpeMidlIsTYhcsMhn0L9wlCjoucljvbDy/
	8C9L2DUdgi3XTa0ORKeflUhdL4gucWoAMrKX2nmPjBMKLgU7WLBc8AtV+84b9OWFML6NEyo4
	4cP7cM/07VlJK53pqNg5cHtnWwjHcbpGkQvx6RUx6F1My3y52vM24rNUA3+ligVEgPYBzsBN
	BFXOJlcBCADAmcVUNTWT6yLWQHvxZ0o47KCP8OcLqD+67T0RCe6d0LP8GsWtrJdeDIQk+T+F
	xO7DolQPS6iQ6Ak2/lJaPX8L0BkEAiMuLCKFU6Bn3lFOkrQeKp3u05wCSV1iKnhg0UPji9V2
	W5eNfy8F4ZQHpeGUGy+liGXlxqkeRVhLyevUqfU0WgNqAJpfhHSGpBgihUupmyUg7lfUPeRM
	DzAN1pIqoFuxnN+BRHdAecpsLcbR8sQddXmDg9BpSKozO/JyBmaS1RlquI8HERQoe6EynJhd
	64aICHDfj61rp+/0jTIcevxIIAzW70IadoS/y3DVIkuhncgDBvGbF3aBtjrJVP+5ABEBAAHC
	wGUEGAEIAA8FAlXOJlcCGwwFCRLMAwAACgkQ9AfbAGHVz0CbFwf9F/PXxQR9i4N0iipISYjU
	sxVdjJOM2TMut+ZZcQ6NSMvhZ0ogQxJ+iEQ5OjnIputKvPVd5U7WRh+4lF1lB/NQGrGZQ1ic
	alkj6ocscQyFwfib+xIe9w8TG1CVGkII7+TbS5pXHRxZH1niaRpoi/hYtgzkuOPp35jJyqT/
	/ELbqQTDAWcqtJhzxKLE/ugcOMK520dJDeb6x2xVES+S5LXby0D4juZlvUj+1fwZu+7Io5+B
	bkhSVPb/QdOVTpnz7zWNyNw+OONo1aBUKkhq2UIByYXgORPFnbfMY7QWHcjpBVw9MgC4tGeF
	R4bv+1nAMMxKmb5VvQCExr0eFhJUAHAhVg==
Message-ID: <5c93e912-9d6a-214c-e4fb-8a4e6fc6e7f9@redhat.com>
Date: Wed, 4 Apr 2018 18:16:12 +0200
User-Agent: Mozilla/5.0 (X11; Linux x86_64; rv:52.0) Gecko/20100101
	Thunderbird/52.6.0
MIME-Version: 1.0
In-Reply-To: <20180404150146.23281-1-stefanha@redhat.com>
Content-Type: text/plain; charset=utf-8
Content-Language: en-US
Content-Transfer-Encoding: 7bit
Subject: Re: [Qemu-devel] [PATCH for-2.12 v2] qemu-iotests: update 185 output
Precedence: list
Cc: Kevin Wolf <kwolf@redhat.com>, QingFeng Hao <haoqf@linux.vnet.ibm.com>, 
	qemu-block@nongnu.org
Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org
Sender: "Qemu-devel"
	<qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>

Commit Message

Max Reitz April 4, 2018, 4:16 p.m. UTC

On 2018-04-04 17:01, Stefan Hajnoczi wrote:
> Commit 4486e89c219c0d1b9bd8dfa0b1dd5b0d51ff2268 ("vl: introduce
> vm_shutdown()") added a bdrv_drain_all() call.  As a side-effect of the
> drain operation the block job iterates one more time than before.  The
> 185 output no longer matches and the test is failing now.
> 
> It may be possible to avoid the superfluous block job iteration, but
> that type of patch is not suitable late in the QEMU 2.12 release cycle.
> 
> This patch simply updates the 185 output file.  The new behavior is
> correct, just not optimal, so make the test pass again.
> 
> Fixes: 4486e89c219c0d1b9bd8dfa0b1dd5b0d51ff2268 ("vl: introduce vm_shutdown()")
> Cc: Kevin Wolf <kwolf@redhat.com>
> Cc: QingFeng Hao <haoqf@linux.vnet.ibm.com>
> Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> ---
>  tests/qemu-iotests/185     | 10 ++++++----
>  tests/qemu-iotests/185.out | 12 +++++++-----
>  2 files changed, 13 insertions(+), 9 deletions(-)

On tmpfs, this isn't enough to let the test pass.  There, the active
commit job finishes before the quit is sent, resulting in this diff:

"event": "BLOCK_JOB_READY", "data": {"device": "disk", "len": 4194304,
"offset": 4194304, "speed": 65536, "type": "commit"}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
"event": "BLOCK_JOB_COMPLETED", "data": {"device": "disk", "len":
4194304, "offset": 4194304, "speed": 65536, "type": "commit"}}

 === Start mirror job and exit qemu ===

This seems to be independent of whether there is actually data on
TEST_IMG (the commit source), so something doesn't seem quite right with
the block job throttling here...?

Max

Comments

Stefan Hajnoczi April 9, 2018, 7:24 a.m. UTC | #1

On Wed, Apr 04, 2018 at 06:16:12PM +0200, Max Reitz wrote:
> On 2018-04-04 17:01, Stefan Hajnoczi wrote:
> > Commit 4486e89c219c0d1b9bd8dfa0b1dd5b0d51ff2268 ("vl: introduce
> > vm_shutdown()") added a bdrv_drain_all() call.  As a side-effect of the
> > drain operation the block job iterates one more time than before.  The
> > 185 output no longer matches and the test is failing now.
> > 
> > It may be possible to avoid the superfluous block job iteration, but
> > that type of patch is not suitable late in the QEMU 2.12 release cycle.
> > 
> > This patch simply updates the 185 output file.  The new behavior is
> > correct, just not optimal, so make the test pass again.
> > 
> > Fixes: 4486e89c219c0d1b9bd8dfa0b1dd5b0d51ff2268 ("vl: introduce vm_shutdown()")
> > Cc: Kevin Wolf <kwolf@redhat.com>
> > Cc: QingFeng Hao <haoqf@linux.vnet.ibm.com>
> > Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
> > ---
> >  tests/qemu-iotests/185     | 10 ++++++----
> >  tests/qemu-iotests/185.out | 12 +++++++-----
> >  2 files changed, 13 insertions(+), 9 deletions(-)
> 
> On tmpfs, this isn't enough to let the test pass.  There, the active
> commit job finishes before the quit is sent, resulting in this diff:
> 
> --- tests/qemu-iotests/185.out      2018-04-04 18:10:02.015935435 +0200
> +++ tests/qemu-iotests/185.out.bad  2018-04-04 18:10:21.045473817 +0200
> @@ -26,9 +26,9 @@
> 
>  {"return": {}}
>  {"return": {}}
> +{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> "event": "BLOCK_JOB_READY", "data": {"device": "disk", "len": 4194304,
> "offset": 4194304, "speed": 65536, "type": "commit"}}
>  {"return": {}}
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> "event": "SHUTDOWN", "data": {"guest": false}}
> -{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> "event": "BLOCK_JOB_READY", "data": {"device": "disk", "len": 4194304,
> "offset": 4194304, "speed": 65536, "type": "commit"}}
>  {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
> "event": "BLOCK_JOB_COMPLETED", "data": {"device": "disk", "len":
> 4194304, "offset": 4194304, "speed": 65536, "type": "commit"}}
> 
>  === Start mirror job and exit qemu ===
> 
> This seems to be independent of whether there is actually data on
> TEST_IMG (the commit source), so something doesn't seem quite right with
> the block job throttling here...?

That is a race condition.  I can reproduce it on XFS too.

Will see if I can figure it out...

Stefan

Stefan Hajnoczi April 10, 2018, 8:11 a.m. UTC | #2

On Wed, Apr 04, 2018 at 06:16:12PM +0200, Max Reitz wrote:
> On 2018-04-04 17:01, Stefan Hajnoczi wrote:
>  === Start mirror job and exit qemu ===
> 
> This seems to be independent of whether there is actually data on
> TEST_IMG (the commit source), so something doesn't seem quite right with
> the block job throttling here...?

I've been playing around with this test failure.  Let's leave it broken
(!) in QEMU 2.12 because this test has uncovered a block job ratelimit
issue that's not a regression.  The fix shouldn't be rushed.

block/mirror.c calculates delay_ns but then discards it:

  static void coroutine_fn mirror_run(void *opaque)
  {
      ...

      for (;;) {
          ...delay_ns is calculated based on job speed...

          ret = 0;
          trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
          if (block_job_is_cancelled(&s->common) && s->common.force) {
              break;
          } else if (!should_complete) {
              delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
	        ^--- we discard it!
              block_job_sleep_ns(&s->common, delay_ns);
          }
          s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
      }

      ...
  }

This is why the mirror-based tests are completing even though we'd
expect them to only reach the "next" iteration due to the job speed.

Increasing the 4 MB write operation in the test to >16 MB (the mirror
buffer size) doesn't solve the problem because the next QMP command will
race with the job's 0 ns sleep.  It would just make the test unreliable.

I'll work on the following for QEMU 2.13:

1. Avoid spurious block_job_enter() from block_job_drain().  This
   eliminates the extra block job iteration that changed the output in
   the first place.

2. Honor the job speed in block/mirror.c.

The end result will be that the test output will not require changes.

Stefan

Kevin Wolf April 10, 2018, 2:35 p.m. UTC | #3

Am 10.04.2018 um 10:11 hat Stefan Hajnoczi geschrieben:
> On Wed, Apr 04, 2018 at 06:16:12PM +0200, Max Reitz wrote:
> > On 2018-04-04 17:01, Stefan Hajnoczi wrote:
> >  === Start mirror job and exit qemu ===
> > 
> > This seems to be independent of whether there is actually data on
> > TEST_IMG (the commit source), so something doesn't seem quite right with
> > the block job throttling here...?
> 
> I've been playing around with this test failure.  Let's leave it broken
> (!) in QEMU 2.12 because this test has uncovered a block job ratelimit
> issue that's not a regression.  The fix shouldn't be rushed.

Makes sense.

Thanks, applied to the block branch.

Kevin

Max Reitz April 11, 2018, 10:32 a.m. UTC | #4

On 2018-04-10 10:11, Stefan Hajnoczi wrote:
> On Wed, Apr 04, 2018 at 06:16:12PM +0200, Max Reitz wrote:
>> On 2018-04-04 17:01, Stefan Hajnoczi wrote:
>>  === Start mirror job and exit qemu ===
>>
>> This seems to be independent of whether there is actually data on
>> TEST_IMG (the commit source), so something doesn't seem quite right with
>> the block job throttling here...?
> 
> I've been playing around with this test failure.  Let's leave it broken
> (!) in QEMU 2.12 because this test has uncovered a block job ratelimit
> issue that's not a regression.  The fix shouldn't be rushed.

OK for me.

> block/mirror.c calculates delay_ns but then discards it:
> 
>   static void coroutine_fn mirror_run(void *opaque)
>   {
>       ...
> 
>       for (;;) {
>           ...delay_ns is calculated based on job speed...
> 
>           ret = 0;
>           trace_mirror_before_sleep(s, cnt, s->synced, delay_ns);
>           if (block_job_is_cancelled(&s->common) && s->common.force) {
>               break;
>           } else if (!should_complete) {
>               delay_ns = (s->in_flight == 0 && cnt == 0 ? SLICE_TIME : 0);
> 	        ^--- we discard it!
>               block_job_sleep_ns(&s->common, delay_ns);
>           }
>           s->last_pause_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
>       }
> 
>       ...
>   }

Uh, nice.

> This is why the mirror-based tests are completing even though we'd
> expect them to only reach the "next" iteration due to the job speed.
> 
> Increasing the 4 MB write operation in the test to >16 MB (the mirror
> buffer size) doesn't solve the problem because the next QMP command will
> race with the job's 0 ns sleep.  It would just make the test unreliable.
> 
> I'll work on the following for QEMU 2.13:
> 
> 1. Avoid spurious block_job_enter() from block_job_drain().  This
>    eliminates the extra block job iteration that changed the output in
>    the first place.
> 
> 2. Honor the job speed in block/mirror.c.
> 
> The end result will be that the test output will not require changes.

Thanks!

Max

--- tests/qemu-iotests/185.out      2018-04-04 18:10:02.015935435 +0200
+++ tests/qemu-iotests/185.out.bad  2018-04-04 18:10:21.045473817 +0200
@@ -26,9 +26,9 @@ 

 {"return": {}}
 {"return": {}}
+{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
"event": "BLOCK_JOB_READY", "data": {"device": "disk", "len": 4194304,
"offset": 4194304, "speed": 65536, "type": "commit"}}
 {"return": {}}
 {"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},
"event": "SHUTDOWN", "data": {"guest": false}}
-{"timestamp": {"seconds":  TIMESTAMP, "microseconds":  TIMESTAMP},

[for-2.12,v2] qemu-iotests: update 185 output

Commit Message

Comments

Patch