[for-2.8] iotests: Do not rely on unavailable domains in 162
diff mbox

Message ID 20160819131202.19369-1-mreitz@redhat.com
State New
Headers show

Commit Message

Max Reitz Aug. 19, 2016, 1:12 p.m. UTC
There are some (mostly ISP-specific) name servers who will redirect
non-existing domains to special hosts. In this case, we will get a
different error message when trying to connect to such a host, which
breaks test 162.

162 needed this specific error message so it can confirm that qemu was
indeed trying to connect to the user-specified port. However, we can
also confirm this by setting up a local NBD server on exactly that port;
so we can fix the issue by doing just that.

Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/162     | 10 ++++++----
 tests/qemu-iotests/162.out |  2 +-
 2 files changed, 7 insertions(+), 5 deletions(-)

Comments

Sascha Silbe Aug. 23, 2016, 11:44 a.m. UTC | #1
Dear Max,

Max Reitz <mreitz@redhat.com> writes:

[tests/qemu-iotests/162]
[...]
> +# (We need to set up a server here, because the error message for "Connection
> +#  refused" does not contain the destination port)
> +$QEMU_NBD -p 42424 -f raw null-co:// &
> +sleep 0.5
> +$QEMU_IMG info 'json:{"driver": "nbd", "host": "localhost", "port": 42424}' \
> +    | grep '^image'

Using a fixed port number means multiple users won't be able to run this
in parallel. That it's only open for a short time actually makes the
issue a bit worse as it's hard to understand just why the test failed
intermittently.

Is there a way to have qemu-nbd use a random port and print the port
number?

Sascha
Max Reitz Aug. 23, 2016, 12:55 p.m. UTC | #2
On 2016-08-23 at 07:44, Sascha Silbe wrote:
> Dear Max,
>
> Max Reitz <mreitz@redhat.com> writes:
>
> [tests/qemu-iotests/162]
> [...]
>> +# (We need to set up a server here, because the error message for "Connection
>> +#  refused" does not contain the destination port)
>> +$QEMU_NBD -p 42424 -f raw null-co:// &
>> +sleep 0.5
>> +$QEMU_IMG info 'json:{"driver": "nbd", "host": "localhost", "port": 42424}' \
>> +    | grep '^image'
>
> Using a fixed port number means multiple users won't be able to run this
> in parallel. That it's only open for a short time actually makes the
> issue a bit worse as it's hard to understand just why the test failed
> intermittently.
>
> Is there a way to have qemu-nbd use a random port and print the port
> number?

Good idea. We can just let the script generate a random port; 
$(($RANDOM+32768)) should do the trick.

Max
Sascha Silbe Aug. 23, 2016, 2:17 p.m. UTC | #3
Dear Max,

Max Reitz <mreitz@redhat.com> writes:

> On 2016-08-23 at 07:44, Sascha Silbe wrote:
>> Max Reitz <mreitz@redhat.com> writes:
[tests/qemu-iotests/162]
>> Using a fixed port number means multiple users won't be able to run this
>> in parallel. That it's only open for a short time actually makes the
>> issue a bit worse as it's hard to understand just why the test failed
>> intermittently.
>>
>> Is there a way to have qemu-nbd use a random port and print the port
>> number?
>
> Good idea. We can just let the script generate a random port; 
> $(($RANDOM+32768)) should do the trick.

Which will fail just the same as the original version if anything (not
just qemu-nbd) is already occupying the port you happened to
choose.

Maybe we should just fix this part:

>>> +# (We need to set up a server here, because the error message for "Connection
>>> +#  refused" does not contain the destination port)

Including the port number in the "Connection refused" error message is a
useful diagnostic, especially if it's a non-default port.

Sascha
Max Reitz Aug. 23, 2016, 3:19 p.m. UTC | #4
On 2016-08-23 at 10:17, Sascha Silbe wrote:
> Dear Max,
>
> Max Reitz <mreitz@redhat.com> writes:
>
>> On 2016-08-23 at 07:44, Sascha Silbe wrote:
>>> Max Reitz <mreitz@redhat.com> writes:
> [tests/qemu-iotests/162]
>>> Using a fixed port number means multiple users won't be able to run this
>>> in parallel. That it's only open for a short time actually makes the
>>> issue a bit worse as it's hard to understand just why the test failed
>>> intermittently.
>>>
>>> Is there a way to have qemu-nbd use a random port and print the port
>>> number?
>>
>> Good idea. We can just let the script generate a random port;
>> $(($RANDOM+32768)) should do the trick.
>
> Which will fail just the same as the original version if anything (not
> just qemu-nbd) is already occupying the port you happened to
> choose.

But I think figuring that out from the error log will be rather trivial:

 > +Failed to bind socket: Address already in use

> Maybe we should just fix this part:
>
>>>> +# (We need to set up a server here, because the error message for "Connection
>>>> +#  refused" does not contain the destination port)
>
> Including the port number in the "Connection refused" error message is a
> useful diagnostic, especially if it's a non-default port.

This would need to be included in the socket code itself, because it 
wouldn't make sense to do this for NBD alone. However, doing that is not 
trivial, because at the point where the error message is generated we 
only have a generic addrinfo structure which I'm not really inclined to 
add unparsing code for, just so this test (for developers) does not fail 
with a very low probability.

While I do agree that it would be useful diagnostics in principle, in 
practice this is kind of rendered obsolete by the fact that we generally 
do print the URL (the "file name") anyway. The same applies for any 
other protocol: When raw-posix cannot open some file, its Error object 
will not contain the file name but just "No such file or directory" or 
something like that. The file name is supplied by the caller, e.g. qemu-img.

The thing is just that this test tests whether we can trust NBD to 
correctly interpret that file name. So we can't trust this information, 
but a normal user usually can (if they can't, it's a bug).

So adding this information would in my opinion only really be useful for 
this very test, and I'm not so sure whether the effort is really worth it.

If you think this test failing is really an issue that we should fix, 
then I'd rather put the qemu-nbd launch inside of a loop, retrying until 
it succeeds with some random port.

Max
Sascha Silbe Aug. 23, 2016, 7:20 p.m. UTC | #5
Dear Max,

Max Reitz <mreitz@redhat.com> writes:

> On 2016-08-23 at 10:17, Sascha Silbe wrote:
>> Max Reitz <mreitz@redhat.com> writes:
[tests/qemu-iotests/162]
>>> Good idea. We can just let the script generate a random port;
>>> $(($RANDOM+32768)) should do the trick.
>>
>> Which will fail just the same as the original version if anything (not
>> just qemu-nbd) is already occupying the port you happened to
>> choose.
>
> But I think figuring that out from the error log will be rather trivial:
>
>  > +Failed to bind socket: Address already in use

Once a human starts digging into the logs, yes.


[...]
> If you think this test failing is really an issue that we should fix, 
> then I'd rather put the qemu-nbd launch inside of a loop, retrying until 
> it succeeds with some random port.

Seems to be the least ugly approach for now. And yeah, fixing it would
be important IMO. Having automated tests and CI systems fail randomly is
really annoying. People stop running the tests resp. ignore the CI mails
pretty quickly if there are false positives.

Sascha

Patch
diff mbox

diff --git a/tests/qemu-iotests/162 b/tests/qemu-iotests/162
index 0b43ea3..e38e99a 100755
--- a/tests/qemu-iotests/162
+++ b/tests/qemu-iotests/162
@@ -43,10 +43,12 @@  echo '=== NBD ==='
 $QEMU_IMG info 'json:{"driver": "nbd", "host": 42}'
 
 # And this should not treat @port as if it had not been specified
-# (We cannot use localhost with an invalid port here, but we need to use a
-#  non-existing domain, because otherwise the error message will not contain
-#  the port)
-$QEMU_IMG info 'json:{"driver": "nbd", "host": "does.not.exist.example.com", "port": 42}'
+# (We need to set up a server here, because the error message for "Connection
+#  refused" does not contain the destination port)
+$QEMU_NBD -p 42424 -f raw null-co:// &
+sleep 0.5
+$QEMU_IMG info 'json:{"driver": "nbd", "host": "localhost", "port": 42424}' \
+    | grep '^image'
 
 # This is a test for NBD's bdrv_refresh_filename() implementation: It expects
 # either host or path to be set, but it must not assume that they are set to
diff --git a/tests/qemu-iotests/162.out b/tests/qemu-iotests/162.out
index 9bba723..0654e3d 100644
--- a/tests/qemu-iotests/162.out
+++ b/tests/qemu-iotests/162.out
@@ -2,7 +2,7 @@  QA output created by 162
 
 === NBD ===
 qemu-img: Could not open 'json:{"driver": "nbd", "host": 42}': Failed to connect socket: Invalid argument
-qemu-img: Could not open 'json:{"driver": "nbd", "host": "does.not.exist.example.com", "port": 42}': address resolution failed for does.not.exist.example.com:42: Name or service not known
+image: nbd://localhost:42424
 image: nbd+unix://?socket=42
 
 === SSH ===