[v2] iotests: Do not rely on unavailable domains in 162
diff mbox

Message ID 1471987467-31313-1-git-send-email-mreitz@redhat.com
State New
Headers show

Commit Message

Max Reitz Aug. 23, 2016, 9:24 p.m. UTC
There are some (mostly ISP-specific) name servers who will redirect
non-existing domains to special hosts. In this case, we will get a
different error message when trying to connect to such a host, which
breaks test 162.

162 needed this specific error message so it can confirm that qemu was
indeed trying to connect to the user-specified port. However, we can
also confirm this by setting up a local NBD server on exactly that port;
so we can fix the issue by doing just that.

Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
---
v2:
- Use a loop and random ports for starting qemu-nbd so we don't collide
  with ports which are already in use [Sascha]
---
 tests/qemu-iotests/162     | 25 +++++++++++++++++++++----
 tests/qemu-iotests/162.out |  2 +-
 2 files changed, 22 insertions(+), 5 deletions(-)

Comments

Sascha Silbe Aug. 25, 2016, 3:44 p.m. UTC | #1
Dear Max,

Max Reitz <mreitz@redhat.com> writes:

[tests/qemu-iotests/162]
[...]
> +while true; do
> +    port=$((RANDOM + 32768))
> +    $QEMU_NBD -p $port -f raw null-co:// &> /dev/null &
> +    nbd_pid=$!
> +    sleep 0.5
> +
> +    # Check whether the process is still alive
> +    # (which is the case if the server has been created successfully)
> +    if kill -0 $nbd_pid &> /dev/null; then
> +        break
> +    fi
> +done

Apart from being inherently racy, the chosen sleep time of 0.5s is also
rather short in practice when running tests on a busy or slow
host. Since in this case 162 believes start-up worked, it will still run
into the original problem.

Traditionally daemons fork and wait for their child to finish
initialisation. Only once the child succeeded (or failed) will they exit
themselves. That solves exactly the race condition above.

If we don't have / want to introduce a good way to discover whether
qemu-nbd start-up was successful, we need to check and wait for either
qemu-nbd exiting (⇒ failed, try different port) or qemu-nbd having
successfully acquired the port (⇒ success, continue with
test). Unfortunately all the ways to discover whether a specific process
is listening on a port are non-portable (no pun intended).

Sorry to be a bother, but qemu-iotests failing intermittently on slow
hosts is really a PITA. It's hard to tell the difference between the
test randomly failing because the test is racy (→ false positive, not a
bug) and the test randomly failing because the implementation is racy (→
true positive, bug).

Sascha

Patch
diff mbox

diff --git a/tests/qemu-iotests/162 b/tests/qemu-iotests/162
index 0b43ea3..c60a247 100755
--- a/tests/qemu-iotests/162
+++ b/tests/qemu-iotests/162
@@ -43,10 +43,27 @@  echo '=== NBD ==='
 $QEMU_IMG info 'json:{"driver": "nbd", "host": 42}'
 
 # And this should not treat @port as if it had not been specified
-# (We cannot use localhost with an invalid port here, but we need to use a
-#  non-existing domain, because otherwise the error message will not contain
-#  the port)
-$QEMU_IMG info 'json:{"driver": "nbd", "host": "does.not.exist.example.com", "port": 42}'
+# (We need to set up a server here, because the error message for "Connection
+#  refused" does not contain the destination port)
+
+# Launching qemu-nbd is done in a loop: We try to set up an NBD server on some
+# random port and continue until success, i.e. until we have found a port that
+# is not in use yet.
+while true; do
+    port=$((RANDOM + 32768))
+    $QEMU_NBD -p $port -f raw null-co:// &> /dev/null &
+    nbd_pid=$!
+    sleep 0.5
+
+    # Check whether the process is still alive
+    # (which is the case if the server has been created successfully)
+    if kill -0 $nbd_pid &> /dev/null; then
+        break
+    fi
+done
+
+$QEMU_IMG info "json:{'driver': 'nbd', 'host': 'localhost', 'port': $port}" \
+    | grep '^image' | sed -e "s/$port/PORT/"
 
 # This is a test for NBD's bdrv_refresh_filename() implementation: It expects
 # either host or path to be set, but it must not assume that they are set to
diff --git a/tests/qemu-iotests/162.out b/tests/qemu-iotests/162.out
index 9bba723..3c5be2c 100644
--- a/tests/qemu-iotests/162.out
+++ b/tests/qemu-iotests/162.out
@@ -2,7 +2,7 @@  QA output created by 162
 
 === NBD ===
 qemu-img: Could not open 'json:{"driver": "nbd", "host": 42}': Failed to connect socket: Invalid argument
-qemu-img: Could not open 'json:{"driver": "nbd", "host": "does.not.exist.example.com", "port": 42}': address resolution failed for does.not.exist.example.com:42: Name or service not known
+image: nbd://localhost:PORT
 image: nbd+unix://?socket=42
 
 === SSH ===