[3/3] iotests: Allow 147 to be run concurrently

Message ID	20181221234750.23577-4-mreitz@redhat.com (mailing list archive)
State	New, archived
Headers	show Return-Path: <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org> From: Max Reitz <mreitz@redhat.com> To: qemu-block@nongnu.org Date: Sat, 22 Dec 2018 00:47:50 +0100 Message-Id: <20181221234750.23577-4-mreitz@redhat.com> In-Reply-To: <20181221234750.23577-1-mreitz@redhat.com> References: <20181221234750.23577-1-mreitz@redhat.com> MIME-Version: 1.0 Content-Transfer-Encoding: quoted-printable Subject: [Qemu-devel] [PATCH 3/3] iotests: Allow 147 to be run concurrently Precedence: list Cc: Kevin Wolf <kwolf@redhat.com>, qemu-devel@nongnu.org, Max Reitz <mreitz@redhat.com> Errors-To: qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org Sender: "Qemu-devel" <qemu-devel-bounces+patchwork-qemu-devel=patchwork.kernel.org@nongnu.org>
Series	iotests: Allow 147 to be run concurrently \| expand [0/3] iotests: Allow 147 to be run concurrently [1/3] iotests.py: Add qemu_nbd_pipe() [2/3] iotests: Bind qemu-nbd to localhost in 147 [3/3] iotests: Allow 147 to be run concurrently

Max Reitz Dec. 21, 2018, 11:47 p.m. UTC

To do this, we need to allow creating the NBD server on various ports
instead of a single one (which may not even work if you run just one
instance, because something entirely else might be using that port).

So we just pick a random port in [32768, 32768 + 1024) and try to create
a server there.  If that fails, we just retry until something sticks.

For the IPv6 test, we need a different range, though (just above that
one).  This is because "localhost" resolves to both 127.0.0.1 and ::1.
This means that if you bind to it, it will bind to both, if possible, or
just one if the other is already in use.  Therefore, if the IPv6 test
has already taken [::1]:some_port and we then try to take
localhost:some_port, that will work -- only the second server will be
bound to 127.0.0.1:some_port alone and not [::1]:some_port in addition.
So we have two different servers on the same port, one for IPv4 and one
for IPv6.

But when we then try to connect to the server through
localhost:some_port, we will always end up at the IPv6 one (as long as
it is up), and this may not be the one we want.

Thus, we must make sure not to create an IPv6-only NBD server on the
same port as a normal "dual-stack" NBD server -- which is done by using
distinct port ranges, as explained above.

Signed-off-by: Max Reitz <mreitz@redhat.com>
---
 tests/qemu-iotests/147 | 98 +++++++++++++++++++++++++++++-------------
 1 file changed, 68 insertions(+), 30 deletions(-)

John Snow Jan. 21, 2019, 8:50 p.m. UTC | #1

On 12/21/18 6:47 PM, Max Reitz wrote:
> To do this, we need to allow creating the NBD server on various ports
> instead of a single one (which may not even work if you run just one
> instance, because something entirely else might be using that port).
> 
> So we just pick a random port in [32768, 32768 + 1024) and try to create
> a server there.  If that fails, we just retry until something sticks.
> 
> For the IPv6 test, we need a different range, though (just above that
> one).  This is because "localhost" resolves to both 127.0.0.1 and ::1.
> This means that if you bind to it, it will bind to both, if possible, or
> just one if the other is already in use.  Therefore, if the IPv6 test
> has already taken [::1]:some_port and we then try to take
> localhost:some_port, that will work -- only the second server will be
> bound to 127.0.0.1:some_port alone and not [::1]:some_port in addition.
> So we have two different servers on the same port, one for IPv4 and one
> for IPv6.
> 
> But when we then try to connect to the server through
> localhost:some_port, we will always end up at the IPv6 one (as long as
> it is up), and this may not be the one we want.
> 
> Thus, we must make sure not to create an IPv6-only NBD server on the
> same port as a normal "dual-stack" NBD server -- which is done by using
> distinct port ranges, as explained above.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  tests/qemu-iotests/147 | 98 +++++++++++++++++++++++++++++-------------
>  1 file changed, 68 insertions(+), 30 deletions(-)
> 
> diff --git a/tests/qemu-iotests/147 b/tests/qemu-iotests/147
> index 3e10a9969e..82513279b0 100755
> --- a/tests/qemu-iotests/147
> +++ b/tests/qemu-iotests/147
> @@ -19,13 +19,17 @@
>  #
>  
>  import os
> +import random
>  import socket
>  import stat
>  import time
>  import iotests
> -from iotests import cachemode, imgfmt, qemu_img, qemu_nbd
> +from iotests import cachemode, imgfmt, qemu_img, qemu_nbd, qemu_nbd_pipe
>  
> -NBD_PORT = 10811
> +NBD_PORT_START      = 32768
> +NBD_PORT_END        = NBD_PORT_START + 1024
> +NBD_IPV6_PORT_START = NBD_PORT_END
> +NBD_IPV6_PORT_END   = NBD_IPV6_PORT_START + 1024
>  
>  test_img = os.path.join(iotests.test_dir, 'test.img')
>  unix_socket = os.path.join(iotests.test_dir, 'nbd.socket')
> @@ -88,17 +92,29 @@ class QemuNBD(NBDBlockdevAddBase):
>          except OSError:
>              pass
>  
> +    def _try_server_up(self, *args):
> +        status, msg = qemu_nbd_pipe('-f', imgfmt, test_img, *args)
> +        if status == 0:
> +            return True
> +        if 'Address already in use' in msg:
> +            return False

My only half-empty question is if we are guaranteed to have a locale and
environment such that this string will always be present when we
encounter that specific error?

I suppose even if not, that it'll just fail hard and it's no worse than
what we had before.

> +        self.fail(msg)
> +
>      def _server_up(self, *args):
> -        self.assertEqual(qemu_nbd('-f', imgfmt, test_img, *args), 0)
> +        self.assertTrue(self._try_server_up(*args))
>  
>      def test_inet(self):
> -        self._server_up('-b', 'localhost', '-p', str(NBD_PORT))
> +        while True:
> +            nbd_port = random.randrange(NBD_PORT_START, NBD_PORT_END)
> +            if self._try_server_up('-b', 'localhost', '-p', str(nbd_port)):
> +                break

Why not just iterate over the range so that once the range is exhausted
we're guaranteed to fail?

> +
>          address = { 'type': 'inet',
>                      'data': {
>                          'host': 'localhost',
> -                        'port': str(NBD_PORT)
> +                        'port': str(nbd_port)
>                      } }
> -        self.client_test('nbd://localhost:%i' % NBD_PORT,
> +        self.client_test('nbd://localhost:%i' % nbd_port,
>                           flatten_sock_addr(address))
>  
>      def test_unix(self):
> @@ -130,8 +146,13 @@ class BuiltinNBD(NBDBlockdevAddBase):
>          except OSError:
>              pass
>  
> -    def _server_up(self, address, export_name=None, export_name2=None):
> +    # Returns False on EADDRINUSE; fails an assertion on other errors.
> +    # Returns True on success.
> +    def _try_server_up(self, address, export_name=None, export_name2=None):
>          result = self.server.qmp('nbd-server-start', addr=address)
> +        if 'error' in result and \
> +           'Address already in use' in result['error']['desc']:
> +            return False
>          self.assert_qmp(result, 'return', {})
>  
>          if export_name is None:
> @@ -146,20 +167,28 @@ class BuiltinNBD(NBDBlockdevAddBase):
>                                       name=export_name2)
>              self.assert_qmp(result, 'return', {})
>  
> +        return True
> +
> +    def _server_up(self, address, export_name=None, export_name2=None):
> +        self.assertTrue(self._try_server_up(address, export_name, export_name2))
>  
>      def _server_down(self):
>          result = self.server.qmp('nbd-server-stop')
>          self.assert_qmp(result, 'return', {})
>  
>      def do_test_inet(self, export_name=None):
> -        address = { 'type': 'inet',
> -                    'data': {
> -                        'host': 'localhost',
> -                        'port': str(NBD_PORT)
> -                    } }
> -        self._server_up(address, export_name)
> +        while True:
> +            nbd_port = random.randrange(NBD_PORT_START, NBD_PORT_END)
> +            address = { 'type': 'inet',
> +                        'data': {
> +                            'host': 'localhost',
> +                            'port': str(nbd_port)
> +                        } }
> +            if self._try_server_up(address, export_name):
> +                break
> +
>          export_name = export_name or 'nbd-export'
> -        self.client_test('nbd://localhost:%i/%s' % (NBD_PORT, export_name),
> +        self.client_test('nbd://localhost:%i/%s' % (nbd_port, export_name),
>                           flatten_sock_addr(address), export_name)
>          self._server_down()
>  
> @@ -173,15 +202,19 @@ class BuiltinNBD(NBDBlockdevAddBase):
>          self.do_test_inet('shadow')
>  
>      def test_inet_two_exports(self):
> -        address = { 'type': 'inet',
> -                    'data': {
> -                        'host': 'localhost',
> -                        'port': str(NBD_PORT)
> -                    } }
> -        self._server_up(address, 'exp1', 'exp2')
> -        self.client_test('nbd://localhost:%i/%s' % (NBD_PORT, 'exp1'),
> +        while True:
> +            nbd_port = random.randrange(NBD_PORT_START, NBD_PORT_END)
> +            address = { 'type': 'inet',
> +                        'data': {
> +                            'host': 'localhost',
> +                            'port': str(nbd_port)
> +                        } }
> +            if self._try_server_up(address, 'exp1', 'exp2'):
> +                break
> +
> +        self.client_test('nbd://localhost:%i/%s' % (nbd_port, 'exp1'),
>                           flatten_sock_addr(address), 'exp1', 'node1', False)
> -        self.client_test('nbd://localhost:%i/%s' % (NBD_PORT, 'exp2'),
> +        self.client_test('nbd://localhost:%i/%s' % (nbd_port, 'exp2'),
>                           flatten_sock_addr(address), 'exp2', 'node2', False)
>          result = self.vm.qmp('blockdev-del', node_name='node1')
>          self.assert_qmp(result, 'return', {})
> @@ -197,20 +230,25 @@ class BuiltinNBD(NBDBlockdevAddBase):
>          except socket.gaierror:
>              # IPv6 not available, skip
>              return
> -        address = { 'type': 'inet',
> -                    'data': {
> -                        'host': '::1',
> -                        'port': str(NBD_PORT),
> -                        'ipv4': False,
> -                        'ipv6': True
> -                    } }
> +
> +        while True:
> +            nbd_port = random.randrange(NBD_IPV6_PORT_START, NBD_IPV6_PORT_END)
> +            address = { 'type': 'inet',
> +                        'data': {
> +                            'host': '::1',
> +                            'port': str(nbd_port),
> +                            'ipv4': False,
> +                            'ipv6': True
> +                        } }
> +            if self._try_server_up(address):
> +                break
> +
>          filename = { 'driver': 'raw',
>                       'file': {
>                           'driver': 'nbd',
>                           'export': 'nbd-export',
>                           'server': flatten_sock_addr(address)
>                       } }
> -        self._server_up(address)
>          self.client_test(filename, flatten_sock_addr(address), 'nbd-export')
>          self._server_down()
>  
> 

I guess my only other observation is that we have a lot of "while True"
loops now, is it worth creating some kind of helper that does the dirty
work of finding a serviceable port or nah?


If none of my observations spark joy:

1-3: Reviewed-By: John Snow <jsnow@redhat.com>

Eric Blake Jan. 21, 2019, 9:02 p.m. UTC | #2

On 12/21/18 5:47 PM, Max Reitz wrote:
> To do this, we need to allow creating the NBD server on various ports
> instead of a single one (which may not even work if you run just one
> instance, because something entirely else might be using that port).

Can you instead reuse the ideas from nbd_server_set_tcp_port() from
qemu-iotests/common.nbd?

> 
> So we just pick a random port in [32768, 32768 + 1024) and try to create
> a server there.  If that fails, we just retry until something sticks.

That has the advantage of checking whether a port is actually in use
(using 'ss' - although it does limit the test to Linux-only; perhaps
using socat instead of ss could make the test portable to non-Linux?)

> 
> For the IPv6 test, we need a different range, though (just above that
> one).  This is because "localhost" resolves to both 127.0.0.1 and ::1.
> This means that if you bind to it, it will bind to both, if possible, or
> just one if the other is already in use.  Therefore, if the IPv6 test
> has already taken [::1]:some_port and we then try to take
> localhost:some_port, that will work -- only the second server will be
> bound to 127.0.0.1:some_port alone and not [::1]:some_port in addition.
> So we have two different servers on the same port, one for IPv4 and one
> for IPv6.
> 
> But when we then try to connect to the server through
> localhost:some_port, we will always end up at the IPv6 one (as long as
> it is up), and this may not be the one we want.
> 
> Thus, we must make sure not to create an IPv6-only NBD server on the
> same port as a normal "dual-stack" NBD server -- which is done by using
> distinct port ranges, as explained above.
> 
> Signed-off-by: Max Reitz <mreitz@redhat.com>
> ---
>  tests/qemu-iotests/147 | 98 +++++++++++++++++++++++++++++-------------
>  1 file changed, 68 insertions(+), 30 deletions(-)
> 

> @@ -88,17 +92,29 @@ class QemuNBD(NBDBlockdevAddBase):
>          except OSError:
>              pass
>  
> +    def _try_server_up(self, *args):
> +        status, msg = qemu_nbd_pipe('-f', imgfmt, test_img, *args)
> +        if status == 0:
> +            return True
> +        if 'Address already in use' in msg:
> +            return False
> +        self.fail(msg)

Do you actually need to attempt a qemu-nbd process, if you take my
suggestion of using ss to probe for an unused port?  And if not, do we
still need qemu_nbd_pipe() added earlier in the series?


> -        address = { 'type': 'inet',
> -                    'data': {
> -                        'host': 'localhost',
> -                        'port': str(NBD_PORT)
> -                    } }
> -        self._server_up(address, export_name)
> +        while True:
> +            nbd_port = random.randrange(NBD_PORT_START, NBD_PORT_END)

common.nbd just iterates, instead of trying random ports.

Max Reitz Jan. 23, 2019, 1:05 p.m. UTC | #3

On 21.01.19 21:50, John Snow wrote:
> 
> 
> On 12/21/18 6:47 PM, Max Reitz wrote:
>> To do this, we need to allow creating the NBD server on various ports
>> instead of a single one (which may not even work if you run just one
>> instance, because something entirely else might be using that port).
>>
>> So we just pick a random port in [32768, 32768 + 1024) and try to create
>> a server there.  If that fails, we just retry until something sticks.
>>
>> For the IPv6 test, we need a different range, though (just above that
>> one).  This is because "localhost" resolves to both 127.0.0.1 and ::1.
>> This means that if you bind to it, it will bind to both, if possible, or
>> just one if the other is already in use.  Therefore, if the IPv6 test
>> has already taken [::1]:some_port and we then try to take
>> localhost:some_port, that will work -- only the second server will be
>> bound to 127.0.0.1:some_port alone and not [::1]:some_port in addition.
>> So we have two different servers on the same port, one for IPv4 and one
>> for IPv6.
>>
>> But when we then try to connect to the server through
>> localhost:some_port, we will always end up at the IPv6 one (as long as
>> it is up), and this may not be the one we want.
>>
>> Thus, we must make sure not to create an IPv6-only NBD server on the
>> same port as a normal "dual-stack" NBD server -- which is done by using
>> distinct port ranges, as explained above.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>  tests/qemu-iotests/147 | 98 +++++++++++++++++++++++++++++-------------
>>  1 file changed, 68 insertions(+), 30 deletions(-)
>>
>> diff --git a/tests/qemu-iotests/147 b/tests/qemu-iotests/147
>> index 3e10a9969e..82513279b0 100755
>> --- a/tests/qemu-iotests/147
>> +++ b/tests/qemu-iotests/147
>> @@ -19,13 +19,17 @@
>>  #
>>  
>>  import os
>> +import random
>>  import socket
>>  import stat
>>  import time
>>  import iotests
>> -from iotests import cachemode, imgfmt, qemu_img, qemu_nbd
>> +from iotests import cachemode, imgfmt, qemu_img, qemu_nbd, qemu_nbd_pipe
>>  
>> -NBD_PORT = 10811
>> +NBD_PORT_START      = 32768
>> +NBD_PORT_END        = NBD_PORT_START + 1024
>> +NBD_IPV6_PORT_START = NBD_PORT_END
>> +NBD_IPV6_PORT_END   = NBD_IPV6_PORT_START + 1024
>>  
>>  test_img = os.path.join(iotests.test_dir, 'test.img')
>>  unix_socket = os.path.join(iotests.test_dir, 'nbd.socket')
>> @@ -88,17 +92,29 @@ class QemuNBD(NBDBlockdevAddBase):
>>          except OSError:
>>              pass
>>  
>> +    def _try_server_up(self, *args):
>> +        status, msg = qemu_nbd_pipe('-f', imgfmt, test_img, *args)
>> +        if status == 0:
>> +            return True
>> +        if 'Address already in use' in msg:
>> +            return False
> 
> My only half-empty question is if we are guaranteed to have a locale and
> environment such that this string will always be present when we
> encounter that specific error?
> 
> I suppose even if not, that it'll just fail hard and it's no worse than
> what we had before.

I can tell you at least that my locale is de_DE.UTF-8.

>> +        self.fail(msg)
>> +
>>      def _server_up(self, *args):
>> -        self.assertEqual(qemu_nbd('-f', imgfmt, test_img, *args), 0)
>> +        self.assertTrue(self._try_server_up(*args))
>>  
>>      def test_inet(self):
>> -        self._server_up('-b', 'localhost', '-p', str(NBD_PORT))
>> +        while True:
>> +            nbd_port = random.randrange(NBD_PORT_START, NBD_PORT_END)
>> +            if self._try_server_up('-b', 'localhost', '-p', str(nbd_port)):
>> +                break
> 
> Why not just iterate over the range so that once the range is exhausted
> we're guaranteed to fail?

Hm.  With this approach chances are higher that the first port you try
works, so it does have a benefit.  I'm not sure whether it's actually a
problem that this will pretty much break down if everything in the range
is allocated, because I can't see that happening.

Iterating over a range wouldn't make the code easier because I would
have to account for the error case. O:-)

>> +
>>          address = { 'type': 'inet',
>>                      'data': {
>>                          'host': 'localhost',
>> -                        'port': str(NBD_PORT)
>> +                        'port': str(nbd_port)
>>                      } }
>> -        self.client_test('nbd://localhost:%i' % NBD_PORT,
>> +        self.client_test('nbd://localhost:%i' % nbd_port,
>>                           flatten_sock_addr(address))
>>  
>>      def test_unix(self):
>> @@ -130,8 +146,13 @@ class BuiltinNBD(NBDBlockdevAddBase):
>>          except OSError:
>>              pass
>>  
>> -    def _server_up(self, address, export_name=None, export_name2=None):
>> +    # Returns False on EADDRINUSE; fails an assertion on other errors.
>> +    # Returns True on success.
>> +    def _try_server_up(self, address, export_name=None, export_name2=None):
>>          result = self.server.qmp('nbd-server-start', addr=address)
>> +        if 'error' in result and \
>> +           'Address already in use' in result['error']['desc']:
>> +            return False
>>          self.assert_qmp(result, 'return', {})
>>  
>>          if export_name is None:
>> @@ -146,20 +167,28 @@ class BuiltinNBD(NBDBlockdevAddBase):
>>                                       name=export_name2)
>>              self.assert_qmp(result, 'return', {})
>>  
>> +        return True
>> +
>> +    def _server_up(self, address, export_name=None, export_name2=None):
>> +        self.assertTrue(self._try_server_up(address, export_name, export_name2))
>>  
>>      def _server_down(self):
>>          result = self.server.qmp('nbd-server-stop')
>>          self.assert_qmp(result, 'return', {})
>>  
>>      def do_test_inet(self, export_name=None):
>> -        address = { 'type': 'inet',
>> -                    'data': {
>> -                        'host': 'localhost',
>> -                        'port': str(NBD_PORT)
>> -                    } }
>> -        self._server_up(address, export_name)
>> +        while True:
>> +            nbd_port = random.randrange(NBD_PORT_START, NBD_PORT_END)
>> +            address = { 'type': 'inet',
>> +                        'data': {
>> +                            'host': 'localhost',
>> +                            'port': str(nbd_port)
>> +                        } }
>> +            if self._try_server_up(address, export_name):
>> +                break
>> +
>>          export_name = export_name or 'nbd-export'
>> -        self.client_test('nbd://localhost:%i/%s' % (NBD_PORT, export_name),
>> +        self.client_test('nbd://localhost:%i/%s' % (nbd_port, export_name),
>>                           flatten_sock_addr(address), export_name)
>>          self._server_down()
>>  
>> @@ -173,15 +202,19 @@ class BuiltinNBD(NBDBlockdevAddBase):
>>          self.do_test_inet('shadow')
>>  
>>      def test_inet_two_exports(self):
>> -        address = { 'type': 'inet',
>> -                    'data': {
>> -                        'host': 'localhost',
>> -                        'port': str(NBD_PORT)
>> -                    } }
>> -        self._server_up(address, 'exp1', 'exp2')
>> -        self.client_test('nbd://localhost:%i/%s' % (NBD_PORT, 'exp1'),
>> +        while True:
>> +            nbd_port = random.randrange(NBD_PORT_START, NBD_PORT_END)
>> +            address = { 'type': 'inet',
>> +                        'data': {
>> +                            'host': 'localhost',
>> +                            'port': str(nbd_port)
>> +                        } }
>> +            if self._try_server_up(address, 'exp1', 'exp2'):
>> +                break
>> +
>> +        self.client_test('nbd://localhost:%i/%s' % (nbd_port, 'exp1'),
>>                           flatten_sock_addr(address), 'exp1', 'node1', False)
>> -        self.client_test('nbd://localhost:%i/%s' % (NBD_PORT, 'exp2'),
>> +        self.client_test('nbd://localhost:%i/%s' % (nbd_port, 'exp2'),
>>                           flatten_sock_addr(address), 'exp2', 'node2', False)
>>          result = self.vm.qmp('blockdev-del', node_name='node1')
>>          self.assert_qmp(result, 'return', {})
>> @@ -197,20 +230,25 @@ class BuiltinNBD(NBDBlockdevAddBase):
>>          except socket.gaierror:
>>              # IPv6 not available, skip
>>              return
>> -        address = { 'type': 'inet',
>> -                    'data': {
>> -                        'host': '::1',
>> -                        'port': str(NBD_PORT),
>> -                        'ipv4': False,
>> -                        'ipv6': True
>> -                    } }
>> +
>> +        while True:
>> +            nbd_port = random.randrange(NBD_IPV6_PORT_START, NBD_IPV6_PORT_END)
>> +            address = { 'type': 'inet',
>> +                        'data': {
>> +                            'host': '::1',
>> +                            'port': str(nbd_port),
>> +                            'ipv4': False,
>> +                            'ipv6': True
>> +                        } }
>> +            if self._try_server_up(address):
>> +                break
>> +
>>          filename = { 'driver': 'raw',
>>                       'file': {
>>                           'driver': 'nbd',
>>                           'export': 'nbd-export',
>>                           'server': flatten_sock_addr(address)
>>                       } }
>> -        self._server_up(address)
>>          self.client_test(filename, flatten_sock_addr(address), 'nbd-export')
>>          self._server_down()
>>  
>>
> 
> I guess my only other observation is that we have a lot of "while True"
> loops now, is it worth creating some kind of helper that does the dirty
> work of finding a serviceable port or nah?

Seems reasonable, I'll see how it looks.

> If none of my observations spark joy:
> 
> 1-3: Reviewed-By: John Snow <jsnow@redhat.com>

Thanks!

Max

Max Reitz Jan. 23, 2019, 1:12 p.m. UTC | #4

On 21.01.19 22:02, Eric Blake wrote:
> On 12/21/18 5:47 PM, Max Reitz wrote:
>> To do this, we need to allow creating the NBD server on various ports
>> instead of a single one (which may not even work if you run just one
>> instance, because something entirely else might be using that port).
> 
> Can you instead reuse the ideas from nbd_server_set_tcp_port() from
> qemu-iotests/common.nbd?
> 
>>
>> So we just pick a random port in [32768, 32768 + 1024) and try to create
>> a server there.  If that fails, we just retry until something sticks.
> 
> That has the advantage of checking whether a port is actually in use
> (using 'ss' - although it does limit the test to Linux-only; perhaps
> using socat instead of ss could make the test portable to non-Linux?)

But doesn't that give you race conditions?  That's the point of this
series, so you can run multiple instances of 147 concurrently.

>> For the IPv6 test, we need a different range, though (just above that
>> one).  This is because "localhost" resolves to both 127.0.0.1 and ::1.
>> This means that if you bind to it, it will bind to both, if possible, or
>> just one if the other is already in use.  Therefore, if the IPv6 test
>> has already taken [::1]:some_port and we then try to take
>> localhost:some_port, that will work -- only the second server will be
>> bound to 127.0.0.1:some_port alone and not [::1]:some_port in addition.
>> So we have two different servers on the same port, one for IPv4 and one
>> for IPv6.
>>
>> But when we then try to connect to the server through
>> localhost:some_port, we will always end up at the IPv6 one (as long as
>> it is up), and this may not be the one we want.
>>
>> Thus, we must make sure not to create an IPv6-only NBD server on the
>> same port as a normal "dual-stack" NBD server -- which is done by using
>> distinct port ranges, as explained above.
>>
>> Signed-off-by: Max Reitz <mreitz@redhat.com>
>> ---
>>  tests/qemu-iotests/147 | 98 +++++++++++++++++++++++++++++-------------
>>  1 file changed, 68 insertions(+), 30 deletions(-)
>>
> 
>> @@ -88,17 +92,29 @@ class QemuNBD(NBDBlockdevAddBase):
>>          except OSError:
>>              pass
>>  
>> +    def _try_server_up(self, *args):
>> +        status, msg = qemu_nbd_pipe('-f', imgfmt, test_img, *args)
>> +        if status == 0:
>> +            return True
>> +        if 'Address already in use' in msg:
>> +            return False
>> +        self.fail(msg)
> 
> Do you actually need to attempt a qemu-nbd process, if you take my
> suggestion of using ss to probe for an unused port?  And if not, do we
> still need qemu_nbd_pipe() added earlier in the series?
> 
> 
>> -        address = { 'type': 'inet',
>> -                    'data': {
>> -                        'host': 'localhost',
>> -                        'port': str(NBD_PORT)
>> -                    } }
>> -        self._server_up(address, export_name)
>> +        while True:
>> +            nbd_port = random.randrange(NBD_PORT_START, NBD_PORT_END)
> 
> common.nbd just iterates, instead of trying random ports.

I'm not sure which is better.  Iterating gives guaranteed termination,
trying random ports means the first one you try will usually work.

Max

Eric Blake Jan. 23, 2019, 2:33 p.m. UTC | #5

On 1/23/19 7:12 AM, Max Reitz wrote:
> On 21.01.19 22:02, Eric Blake wrote:
>> On 12/21/18 5:47 PM, Max Reitz wrote:
>>> To do this, we need to allow creating the NBD server on various ports
>>> instead of a single one (which may not even work if you run just one
>>> instance, because something entirely else might be using that port).
>>
>> Can you instead reuse the ideas from nbd_server_set_tcp_port() from
>> qemu-iotests/common.nbd?
>>
>>>
>>> So we just pick a random port in [32768, 32768 + 1024) and try to create
>>> a server there.  If that fails, we just retry until something sticks.
>>
>> That has the advantage of checking whether a port is actually in use
>> (using 'ss' - although it does limit the test to Linux-only; perhaps
>> using socat instead of ss could make the test portable to non-Linux?)
> 
> But doesn't that give you race conditions?  That's the point of this
> series, so you can run multiple instances of 147 concurrently.

Hmm - that does imply that common.nbd's use of ss IS racy because it
checks in linear fashion and has a TOCTTOU window (affects at least
iotest 233). Your observation that random probes within a range are less
susceptible (although not immune) to the race is correct.

>> Do you actually need to attempt a qemu-nbd process, if you take my
>> suggestion of using ss to probe for an unused port?  And if not, do we
>> still need qemu_nbd_pipe() added earlier in the series?
>>
>>
>>> -        address = { 'type': 'inet',
>>> -                    'data': {
>>> -                        'host': 'localhost',
>>> -                        'port': str(NBD_PORT)
>>> -                    } }
>>> -        self._server_up(address, export_name)
>>> +        while True:
>>> +            nbd_port = random.randrange(NBD_PORT_START, NBD_PORT_END)
>>
>> common.nbd just iterates, instead of trying random ports.
> 
> I'm not sure which is better.  Iterating gives guaranteed termination,
> trying random ports means the first one you try will usually work.

Is there any other way we can make the test more robust, perhaps by
using socket activation (that is, pre-open the port prior to starting
qemu_nbd, so that our code for finding a free socket is more easily
reusable), or by using Unix sockets for test 147 (that test seems to be
using TCP sockets only as a means to get to the real feature under test,
and not as the actual thing being tested)?

Hmm, and you made me realize that socket activation is NOT documented in
'man qemu-nbd'; I ought to fix that.

Max Reitz Jan. 23, 2019, 2:37 p.m. UTC | #6

On 23.01.19 15:33, Eric Blake wrote:
> On 1/23/19 7:12 AM, Max Reitz wrote:
>> On 21.01.19 22:02, Eric Blake wrote:
>>> On 12/21/18 5:47 PM, Max Reitz wrote:
>>>> To do this, we need to allow creating the NBD server on various ports
>>>> instead of a single one (which may not even work if you run just one
>>>> instance, because something entirely else might be using that port).
>>>
>>> Can you instead reuse the ideas from nbd_server_set_tcp_port() from
>>> qemu-iotests/common.nbd?
>>>
>>>>
>>>> So we just pick a random port in [32768, 32768 + 1024) and try to create
>>>> a server there.  If that fails, we just retry until something sticks.
>>>
>>> That has the advantage of checking whether a port is actually in use
>>> (using 'ss' - although it does limit the test to Linux-only; perhaps
>>> using socat instead of ss could make the test portable to non-Linux?)
>>
>> But doesn't that give you race conditions?  That's the point of this
>> series, so you can run multiple instances of 147 concurrently.
> 
> Hmm - that does imply that common.nbd's use of ss IS racy because it
> checks in linear fashion and has a TOCTTOU window (affects at least
> iotest 233). Your observation that random probes within a range are less
> susceptible (although not immune) to the race is correct.
> 
>>> Do you actually need to attempt a qemu-nbd process, if you take my
>>> suggestion of using ss to probe for an unused port?  And if not, do we
>>> still need qemu_nbd_pipe() added earlier in the series?
>>>
>>>
>>>> -        address = { 'type': 'inet',
>>>> -                    'data': {
>>>> -                        'host': 'localhost',
>>>> -                        'port': str(NBD_PORT)
>>>> -                    } }
>>>> -        self._server_up(address, export_name)
>>>> +        while True:
>>>> +            nbd_port = random.randrange(NBD_PORT_START, NBD_PORT_END)
>>>
>>> common.nbd just iterates, instead of trying random ports.
>>
>> I'm not sure which is better.  Iterating gives guaranteed termination,
>> trying random ports means the first one you try will usually work.
> 
> Is there any other way we can make the test more robust, perhaps by
> using socket activation (that is, pre-open the port prior to starting
> qemu_nbd, so that our code for finding a free socket is more easily
> reusable), or by using Unix sockets for test 147 (that test seems to be
> using TCP sockets only as a means to get to the real feature under test,
> and not as the actual thing being tested)?

147 needs TCP sockets because that interface is tested.

Making the code reusable is not too high of a priority to me, as
normally NBD tests should just use Unix sockets.  This test is just a
special case.  But for this test, I can try to put the while loop into
an own function (that gets fed an address object without data.port), as
John has proposed.

Max

> Hmm, and you made me realize that socket activation is NOT documented in
> 'man qemu-nbd'; I ought to fix that.
>

Daniel P. Berrangé Jan. 23, 2019, 2:43 p.m. UTC | #7

On Wed, Jan 23, 2019 at 08:33:41AM -0600, Eric Blake wrote:
> On 1/23/19 7:12 AM, Max Reitz wrote:
> > On 21.01.19 22:02, Eric Blake wrote:
> >> On 12/21/18 5:47 PM, Max Reitz wrote:
> >>> To do this, we need to allow creating the NBD server on various ports
> >>> instead of a single one (which may not even work if you run just one
> >>> instance, because something entirely else might be using that port).
> >>
> >> Can you instead reuse the ideas from nbd_server_set_tcp_port() from
> >> qemu-iotests/common.nbd?
> >>
> >>>
> >>> So we just pick a random port in [32768, 32768 + 1024) and try to create
> >>> a server there.  If that fails, we just retry until something sticks.
> >>
> >> That has the advantage of checking whether a port is actually in use
> >> (using 'ss' - although it does limit the test to Linux-only; perhaps
> >> using socat instead of ss could make the test portable to non-Linux?)
> > 
> > But doesn't that give you race conditions?  That's the point of this
> > series, so you can run multiple instances of 147 concurrently.
> 
> Hmm - that does imply that common.nbd's use of ss IS racy because it
> checks in linear fashion and has a TOCTTOU window (affects at least
> iotest 233). Your observation that random probes within a range are less
> susceptible (although not immune) to the race is correct.
> 
> >> Do you actually need to attempt a qemu-nbd process, if you take my
> >> suggestion of using ss to probe for an unused port?  And if not, do we
> >> still need qemu_nbd_pipe() added earlier in the series?
> >>
> >>
> >>> -        address = { 'type': 'inet',
> >>> -                    'data': {
> >>> -                        'host': 'localhost',
> >>> -                        'port': str(NBD_PORT)
> >>> -                    } }
> >>> -        self._server_up(address, export_name)
> >>> +        while True:
> >>> +            nbd_port = random.randrange(NBD_PORT_START, NBD_PORT_END)
> >>
> >> common.nbd just iterates, instead of trying random ports.
> > 
> > I'm not sure which is better.  Iterating gives guaranteed termination,
> > trying random ports means the first one you try will usually work.
> 
> Is there any other way we can make the test more robust, perhaps by
> using socket activation (that is, pre-open the port prior to starting
> qemu_nbd, so that our code for finding a free socket is more easily
> reusable), or by using Unix sockets for test 147 (that test seems to be
> using TCP sockets only as a means to get to the real feature under test,
> and not as the actual thing being tested)?

The problem with using socket activation is that you then are not getting
test coverage of the non-activation code paths which are quite significant
things we really want to be testing.

I do wonder if there's a case to be made for having iotests run inside a
container with private network namespace such that they then have a
predictable environment.  You could then simply declare that if a test
needs a TCP port, it should use  "port 9000 + $TEST_NUM". So every
test can safely run in parallel.

If the entire test harness needs to be run multiple in parallel each
run woudl be a separate container, and so again avoid clashing.


Regards,
Daniel

Max Reitz Jan. 23, 2019, 5:34 p.m. UTC | #8

On 23.01.19 14:05, Max Reitz wrote:
> On 21.01.19 21:50, John Snow wrote:

[...]

>> I guess my only other observation is that we have a lot of "while True"
>> loops now, is it worth creating some kind of helper that does the dirty
>> work of finding a serviceable port or nah?
> 
> Seems reasonable, I'll see how it looks.

If I do that, the diff stat looks to be 40+, 33-.  Is that worth it?

Max

John Snow Jan. 23, 2019, 5:47 p.m. UTC | #9

On 1/23/19 12:34 PM, Max Reitz wrote:
> On 23.01.19 14:05, Max Reitz wrote:
>> On 21.01.19 21:50, John Snow wrote:
> 
> [...]
> 
>>> I guess my only other observation is that we have a lot of "while True"
>>> loops now, is it worth creating some kind of helper that does the dirty
>>> work of finding a serviceable port or nah?
>>
>> Seems reasonable, I'll see how it looks.
> 
> If I do that, the diff stat looks to be 40+, 33-.  Is that worth it?
> 
> Max
> 

Instead of:
 1 file changed, 68 insertions(+), 30 deletions(-)

Or in addition to?

Use your own judg[e?]ment on it, I guess it's local to this one iotest
for now anyway, so if it's not an obvious win just skip it.

--js

Max Reitz Jan. 25, 2019, 3:01 p.m. UTC | #10

On 23.01.19 18:47, John Snow wrote:
> 
> 
> On 1/23/19 12:34 PM, Max Reitz wrote:
>> On 23.01.19 14:05, Max Reitz wrote:
>>> On 21.01.19 21:50, John Snow wrote:
>>
>> [...]
>>
>>>> I guess my only other observation is that we have a lot of "while True"
>>>> loops now, is it worth creating some kind of helper that does the dirty
>>>> work of finding a serviceable port or nah?
>>>
>>> Seems reasonable, I'll see how it looks.
>>
>> If I do that, the diff stat looks to be 40+, 33-.  Is that worth it?
>>
>> Max
>>
> 
> Instead of:
>  1 file changed, 68 insertions(+), 30 deletions(-)
> 
> Or in addition to?

In addition to.

> Use your own judg[e?]ment on it, I guess it's local to this one iotest
> for now anyway, so if it's not an obvious win just skip it.

OK.  I guess I'll skip it then, but let it be noted that your criticism
was listened to. :-)

Max

[3/3] iotests: Allow 147 to be run concurrently

Commit Message

Comments

Patch