diff mbox series

[V2] python/machine: QEMUMachine reopen_qmp_connection

Message ID 1675796605-235646-1-git-send-email-steven.sistare@oracle.com (mailing list archive)
State New, archived
Headers show
Series [V2] python/machine: QEMUMachine reopen_qmp_connection | expand

Commit Message

Steven Sistare Feb. 7, 2023, 7:03 p.m. UTC
Provide reopen_qmp_connection() to reopen a closed monitor connection.
This will be needed by cpr, because qemu exec closes the monitor socket.

Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
Reviewed-by: John Snow <jsnow@redhat.com>
---
 python/qemu/machine/machine.py | 10 ++++++++++
 1 file changed, 10 insertions(+)

Comments

John Snow Feb. 7, 2023, 8:28 p.m. UTC | #1
On Tue, Feb 7, 2023 at 2:03 PM Steve Sistare <steven.sistare@oracle.com> wrote:
>
> Provide reopen_qmp_connection() to reopen a closed monitor connection.
> This will be needed by cpr, because qemu exec closes the monitor socket.
>
> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> Reviewed-by: John Snow <jsnow@redhat.com>
> ---
>  python/qemu/machine/machine.py | 10 ++++++++++
>  1 file changed, 10 insertions(+)
>
> diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
> index ef94dcf..557209a 100644
> --- a/python/qemu/machine/machine.py
> +++ b/python/qemu/machine/machine.py
> @@ -501,6 +501,16 @@ def _close_qmp_connection(self) -> None:
>          finally:
>              self._qmp_connection = None
>
> +    def reopen_qmp_connection(self) -> None:
> +        """Close and re-open the QMP connection."""
> +        self._close_qmp_connection()
> +        self._qmp_connection = QEMUMonitorProtocol(
> +            self._monitor_address,
> +            server=True,
> +            nickname=self._name
> +        )
> +        self._qmp.accept(self._qmp_timer)
> +
>      def _early_cleanup(self) -> None:
>          """
>          Perform any cleanup that needs to happen before the VM exits.
> --
> 1.8.3.1
>

This code is still mechanically fine as far as I can tell, but I lost
the plot on why it's needed - Can you please elaborate for me on what
you mean by "qemu exec will close the socket"?

(R-B still stands, but since I am rewriting machine.py to be natively
async, I should be aware of your new use case.)

--js
Steven Sistare Feb. 7, 2023, 9:04 p.m. UTC | #2
On 2/7/2023 3:28 PM, John Snow wrote:
> On Tue, Feb 7, 2023 at 2:03 PM Steve Sistare <steven.sistare@oracle.com> wrote:
>>
>> Provide reopen_qmp_connection() to reopen a closed monitor connection.
>> This will be needed by cpr, because qemu exec closes the monitor socket.
>>
>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>> Reviewed-by: John Snow <jsnow@redhat.com>
>> ---
>>  python/qemu/machine/machine.py | 10 ++++++++++
>>  1 file changed, 10 insertions(+)
>>
>> diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
>> index ef94dcf..557209a 100644
>> --- a/python/qemu/machine/machine.py
>> +++ b/python/qemu/machine/machine.py
>> @@ -501,6 +501,16 @@ def _close_qmp_connection(self) -> None:
>>          finally:
>>              self._qmp_connection = None
>>
>> +    def reopen_qmp_connection(self) -> None:
>> +        """Close and re-open the QMP connection."""
>> +        self._close_qmp_connection()
>> +        self._qmp_connection = QEMUMonitorProtocol(
>> +            self._monitor_address,
>> +            server=True,
>> +            nickname=self._name
>> +        )
>> +        self._qmp.accept(self._qmp_timer)
>> +
>>      def _early_cleanup(self) -> None:
>>          """
>>          Perform any cleanup that needs to happen before the VM exits.
>> --
>> 1.8.3.1
>>
> 
> This code is still mechanically fine as far as I can tell, but I lost
> the plot on why it's needed - Can you please elaborate for me on what
> you mean by "qemu exec will close the socket"?
> 
> (R-B still stands, but since I am rewriting machine.py to be natively
> async, I should be aware of your new use case.)

Sure, thanks for asking.

During cpr (aka live update), the old qemu process directly execv's the new
qemu binary.  The monitor socket fd is marked cloexec, so the kernel closes it
during execv.

- Steve
John Snow Feb. 7, 2023, 9:23 p.m. UTC | #3
On Tue, Feb 7, 2023 at 4:04 PM Steven Sistare <steven.sistare@oracle.com> wrote:
>
> On 2/7/2023 3:28 PM, John Snow wrote:
> > On Tue, Feb 7, 2023 at 2:03 PM Steve Sistare <steven.sistare@oracle.com> wrote:
> >>
> >> Provide reopen_qmp_connection() to reopen a closed monitor connection.
> >> This will be needed by cpr, because qemu exec closes the monitor socket.
> >>
> >> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
> >> Reviewed-by: John Snow <jsnow@redhat.com>
> >> ---
> >>  python/qemu/machine/machine.py | 10 ++++++++++
> >>  1 file changed, 10 insertions(+)
> >>
> >> diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
> >> index ef94dcf..557209a 100644
> >> --- a/python/qemu/machine/machine.py
> >> +++ b/python/qemu/machine/machine.py
> >> @@ -501,6 +501,16 @@ def _close_qmp_connection(self) -> None:
> >>          finally:
> >>              self._qmp_connection = None
> >>
> >> +    def reopen_qmp_connection(self) -> None:
> >> +        """Close and re-open the QMP connection."""
> >> +        self._close_qmp_connection()
> >> +        self._qmp_connection = QEMUMonitorProtocol(
> >> +            self._monitor_address,
> >> +            server=True,
> >> +            nickname=self._name
> >> +        )
> >> +        self._qmp.accept(self._qmp_timer)
> >> +
> >>      def _early_cleanup(self) -> None:
> >>          """
> >>          Perform any cleanup that needs to happen before the VM exits.
> >> --
> >> 1.8.3.1
> >>
> >
> > This code is still mechanically fine as far as I can tell, but I lost
> > the plot on why it's needed - Can you please elaborate for me on what
> > you mean by "qemu exec will close the socket"?
> >
> > (R-B still stands, but since I am rewriting machine.py to be natively
> > async, I should be aware of your new use case.)
>
> Sure, thanks for asking.
>
> During cpr (aka live update), the old qemu process directly execv's the new
> qemu binary.  The monitor socket fd is marked cloexec, so the kernel closes it
> during execv.
>
> - Steve

Oho, I see.

I wonder if you are helped at all by some of Marc-Andre's recent
changes to use socketpair() for QMP connections with machine.py.
Depending on how those sockets are managed, it might be possible to
hold onto one of them without having it closed alongside the old QEMU
process. Or, if that doesn't help, double-check that it doesn't *hurt*
you either. It should be the default as of about a week ago now --
instead of using a local unix socket, we use socketpair and pass the
FD to help alleviate some race conditions.

(Again, I'm fine with this patch either way, just a heads up.)

--js
Steven Sistare Feb. 7, 2023, 9:32 p.m. UTC | #4
On 2/7/2023 4:23 PM, John Snow wrote:
> On Tue, Feb 7, 2023 at 4:04 PM Steven Sistare <steven.sistare@oracle.com> wrote:
>>
>> On 2/7/2023 3:28 PM, John Snow wrote:
>>> On Tue, Feb 7, 2023 at 2:03 PM Steve Sistare <steven.sistare@oracle.com> wrote:
>>>>
>>>> Provide reopen_qmp_connection() to reopen a closed monitor connection.
>>>> This will be needed by cpr, because qemu exec closes the monitor socket.
>>>>
>>>> Signed-off-by: Steve Sistare <steven.sistare@oracle.com>
>>>> Reviewed-by: John Snow <jsnow@redhat.com>
>>>> ---
>>>>  python/qemu/machine/machine.py | 10 ++++++++++
>>>>  1 file changed, 10 insertions(+)
>>>>
>>>> diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
>>>> index ef94dcf..557209a 100644
>>>> --- a/python/qemu/machine/machine.py
>>>> +++ b/python/qemu/machine/machine.py
>>>> @@ -501,6 +501,16 @@ def _close_qmp_connection(self) -> None:
>>>>          finally:
>>>>              self._qmp_connection = None
>>>>
>>>> +    def reopen_qmp_connection(self) -> None:
>>>> +        """Close and re-open the QMP connection."""
>>>> +        self._close_qmp_connection()
>>>> +        self._qmp_connection = QEMUMonitorProtocol(
>>>> +            self._monitor_address,
>>>> +            server=True,
>>>> +            nickname=self._name
>>>> +        )
>>>> +        self._qmp.accept(self._qmp_timer)
>>>> +
>>>>      def _early_cleanup(self) -> None:
>>>>          """
>>>>          Perform any cleanup that needs to happen before the VM exits.
>>>> --
>>>> 1.8.3.1
>>>>
>>>
>>> This code is still mechanically fine as far as I can tell, but I lost
>>> the plot on why it's needed - Can you please elaborate for me on what
>>> you mean by "qemu exec will close the socket"?
>>>
>>> (R-B still stands, but since I am rewriting machine.py to be natively
>>> async, I should be aware of your new use case.)
>>
>> Sure, thanks for asking.
>>
>> During cpr (aka live update), the old qemu process directly execv's the new
>> qemu binary.  The monitor socket fd is marked cloexec, so the kernel closes it
>> during execv.
>>
>> - Steve
> 
> Oho, I see.
> 
> I wonder if you are helped at all by some of Marc-Andre's recent
> changes to use socketpair() for QMP connections with machine.py.
> Depending on how those sockets are managed, it might be possible to
> hold onto one of them without having it closed alongside the old QEMU
> process. Or, if that doesn't help, double-check that it doesn't *hurt*
> you either. It should be the default as of about a week ago now --
> instead of using a local unix socket, we use socketpair and pass the
> FD to help alleviate some race conditions.
> 
> (Again, I'm fine with this patch either way, just a heads up.)

Thanks.  Preserving the monitor connection and state across exec requires more 
work though, because of buffering, partially processed messages, and other 
state.  I tried and abandoned that a while ago.  Re-opening has been robust.

- Steve
diff mbox series

Patch

diff --git a/python/qemu/machine/machine.py b/python/qemu/machine/machine.py
index ef94dcf..557209a 100644
--- a/python/qemu/machine/machine.py
+++ b/python/qemu/machine/machine.py
@@ -501,6 +501,16 @@  def _close_qmp_connection(self) -> None:
         finally:
             self._qmp_connection = None
 
+    def reopen_qmp_connection(self) -> None:
+        """Close and re-open the QMP connection."""
+        self._close_qmp_connection()
+        self._qmp_connection = QEMUMonitorProtocol(
+            self._monitor_address,
+            server=True,
+            nickname=self._name
+        )
+        self._qmp.accept(self._qmp_timer)
+
     def _early_cleanup(self) -> None:
         """
         Perform any cleanup that needs to happen before the VM exits.