libxl: force netback to wait for hotplug execution before connecting

Message ID 20220124160248.37861-1-roger.pau@citrix.com (mailing list archive)
State New, archived

Commit Message

Roger Pau Monné Jan. 24, 2022, 4:02 p.m. UTC
By writing an empty "hotplug-status" xenstore node in the backend path
libxl can force Linux netback to wait for hotplug script execution
before proceeding to the 'connected' state.

This is required so that netback doesn't skip state 2 (InitWait) and
thus blocks libxl waiting for such state in order to launch the
hotplug script (see libxl__wait_device_connection).

Reported-by: James Dingwall <james-xen@dingwall.me.uk>
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Tested-by: James Dingwall <james-xen@dingwall.me.uk>
---
Cc: Wei Liu <wei.liu@kernel.org>
Cc: Paul Durrant <paul@xen.org>
---
 tools/libs/light/libxl_nic.c | 7 +++++++
 1 file changed, 7 insertions(+)

Comments

Paul Durrant Jan. 25, 2022, 2:02 p.m. UTC | #1
On 24/01/2022 16:02, Roger Pau Monne wrote:
> By writing an empty "hotplug-status" xenstore node in the backend path
> libxl can force Linux netback to wait for hotplug script execution
> before proceeding to the 'connected' state.
> 
> This is required so that netback doesn't skip state 2 (InitWait) and
> thus blocks libxl waiting for such state in order to launch the
> hotplug script (see libxl__wait_device_connection).
> 
> Reported-by: James Dingwall <james-xen@dingwall.me.uk>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> Tested-by: James Dingwall <james-xen@dingwall.me.uk>
> ---
> Cc: Wei Liu <wei.liu@kernel.org>
> Cc: Paul Durrant <paul@xen.org>

Reviewed-by: Paul Durrant <paul@xen.org>

> ---
>   tools/libs/light/libxl_nic.c | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/tools/libs/light/libxl_nic.c b/tools/libs/light/libxl_nic.c
> index 0b45469dca..0b9e70c9d1 100644
> --- a/tools/libs/light/libxl_nic.c
> +++ b/tools/libs/light/libxl_nic.c
> @@ -248,6 +248,13 @@ static int libxl__set_xenstore_nic(libxl__gc *gc, uint32_t domid,
>       flexarray_append(ro_front, "mtu");
>       flexarray_append(ro_front, GCSPRINTF("%u", nic->mtu));
>   
> +    /*
> +     * Force backend to wait for hotplug script execution before switching to
> +     * connected state.
> +     */
> +    flexarray_append(back, "hotplug-status");
> +    flexarray_append(back, "");
> +
>       return 0;
>   }
>
Julien Grall Jan. 25, 2022, 3:32 p.m. UTC | #2
Hi,

On 24/01/2022 16:02, Roger Pau Monne wrote:
> By writing an empty "hotplug-status" xenstore node in the backend path
> libxl can force Linux netback to wait for hotplug script execution
> before proceeding to the 'connected' state.

I was actually chasing the same issue today :).

> 
> This is required so that netback doesn't skip state 2 (InitWait) and

Technically netback never skips state 2 (otherwise it would always be
reproducible). Instead, libxl may not be able to observe state 2 because
receiving a watch is asynchronous and the watch event doesn't contain the
value of the node. So the backend may have moved to Connected before the
state is read.

> thus blocks libxl waiting for such state in order to launch the
> hotplug script (see libxl__wait_device_connection).
> 
> Reported-by: James Dingwall <james-xen@dingwall.me.uk>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> Tested-by: James Dingwall <james-xen@dingwall.me.uk>

I could easily reproduce this by adding a sleep(1) before reading the key
and using 'xl network-attach ...'.

Tested-by: Julien Grall <jgrall@amazon.com>

> ---
> Cc: Wei Liu <wei.liu@kernel.org>
> Cc: Paul Durrant <paul@xen.org>
> ---
>   tools/libs/light/libxl_nic.c | 7 +++++++
>   1 file changed, 7 insertions(+)
> 
> diff --git a/tools/libs/light/libxl_nic.c b/tools/libs/light/libxl_nic.c
> index 0b45469dca..0b9e70c9d1 100644
> --- a/tools/libs/light/libxl_nic.c
> +++ b/tools/libs/light/libxl_nic.c
> @@ -248,6 +248,13 @@ static int libxl__set_xenstore_nic(libxl__gc *gc, uint32_t domid,
>       flexarray_append(ro_front, "mtu");
>       flexarray_append(ro_front, GCSPRINTF("%u", nic->mtu));
>   
> +    /*
> +     * Force backend to wait for hotplug script execution before switching to
> +     * connected state.
> +     */
> +    flexarray_append(back, "hotplug-status");
> +    flexarray_append(back, "");
> +
>       return 0;
>   }
>   

Cheers,
Roger Pau Monné Jan. 25, 2022, 4:09 p.m. UTC | #3
On Tue, Jan 25, 2022 at 03:32:16PM +0000, Julien Grall wrote:
> Hi,
> 
> On 24/01/2022 16:02, Roger Pau Monne wrote:
> > By writing an empty "hotplug-status" xenstore node in the backend path
> > libxl can force Linux netback to wait for hotplug script execution
> > before proceeding to the 'connected' state.
> 
> I was actually chasing the same issue today :).
> 
> > 
> > This is required so that netback doesn't skip state 2 (InitWait) and
> 
> Technically netback never skips state 2 (otherwise it would always be
> reproducible). Instead, libxl may not be able to observe state 2 because
> receiving a watch is asynchronous and the watch event doesn't contain the
> value of the node. So the backend may have moved to Connected before the
> state is read.

Right, might be more accurate to say it skips waiting for hotplug
script execution, and thus jumps from state 2 into 4. Note I think
it's also possible that by the time we set up the watch in libxl the
state has already been set to 4.

Thanks, Roger.
Julien Grall Jan. 25, 2022, 6:10 p.m. UTC | #4
Hi Roger,

On 25/01/2022 16:09, Roger Pau Monné wrote:
> On Tue, Jan 25, 2022 at 03:32:16PM +0000, Julien Grall wrote:
>> Hi,
>>
>> On 24/01/2022 16:02, Roger Pau Monne wrote:
>>> By writing an empty "hotplug-status" xenstore node in the backend path
>>> libxl can force Linux netback to wait for hotplug script execution
>>> before proceeding to the 'connected' state.
>>
>> I was actually chasing the same issue today :).
>>
>>>
>>> This is required so that netback doesn't skip state 2 (InitWait) and
>>
>> Technically netback never skips state 2 (otherwise it would always be
>> reproducible). Instead, libxl may not be able to observe state 2 because
>> receiving a watch is asynchronous and the watch event doesn't contain the
>> value of the node. So the backend may have moved to Connected before the
>> state is read.
> 
> Right, might be more accurate to say it skips waiting for hotplug
> script execution, and thus jumps from state 2 into 4.

I would add that the jump happens when the frontend decides to connect.

> Note I think
> it's also possible that by the time we set up the watch in libxl the
> state has already been set to 4.

Correct.

Cheers,
Wei Liu Jan. 25, 2022, 7:59 p.m. UTC | #5
On Mon, Jan 24, 2022 at 05:02:48PM +0100, Roger Pau Monne wrote:
> By writing an empty "hotplug-status" xenstore node in the backend path
> libxl can force Linux netback to wait for hotplug script execution
> before proceeding to the 'connected' state.
> 
> This is required so that netback doesn't skip state 2 (InitWait) and
> thus blocks libxl waiting for such state in order to launch the
> hotplug script (see libxl__wait_device_connection).
> 
> Reported-by: James Dingwall <james-xen@dingwall.me.uk>
> Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
> Tested-by: James Dingwall <james-xen@dingwall.me.uk>

Reviewed-by: Wei Liu <wei.liu@kernel.org>
Patch

diff --git a/tools/libs/light/libxl_nic.c b/tools/libs/light/libxl_nic.c
index 0b45469dca..0b9e70c9d1 100644
--- a/tools/libs/light/libxl_nic.c
+++ b/tools/libs/light/libxl_nic.c
@@ -248,6 +248,13 @@  static int libxl__set_xenstore_nic(libxl__gc *gc, uint32_t domid,
     flexarray_append(ro_front, "mtu");
     flexarray_append(ro_front, GCSPRINTF("%u", nic->mtu));
 
+    /*
+     * Force backend to wait for hotplug script execution before switching to
+     * connected state.
+     */
+    flexarray_append(back, "hotplug-status");
+    flexarray_append(back, "");
+
     return 0;
 }