| Message ID | 20210414170352.29927-6-eesposit@redhat.com |
| --- | --- |
| State | New, archived |
| Series | qemu_iotests: improve debugging options |
On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:
> Attaching a gdbserver implies that the qmp socket
> should wait indefinitely for an answer from QEMU.
>
> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
> ---
>  python/qemu/machine.py        |  3 +++
>  tests/qemu-iotests/iotests.py | 10 +++++++++-
>  2 files changed, 12 insertions(+), 1 deletion(-)
>
> diff --git a/python/qemu/machine.py b/python/qemu/machine.py
> index 12752142c9..d6142271c2 100644
> --- a/python/qemu/machine.py
> +++ b/python/qemu/machine.py
> @@ -409,6 +409,9 @@ def _launch(self) -> None:
>                              stderr=subprocess.STDOUT,
>                              shell=False,
>                              close_fds=False)
> +
> +        if 'gdbserver' in self._wrapper:
> +            self._qmp_timer = None

Why doesn’t __init__() evaluate this? This here doesn’t feel like the right place for it. If we want to evaluate it here, self._qmp_timer shouldn’t exist, and instead the timeout should be a _post_launch() parameter. (Which I would have nothing against, by the way.)

Also, mypy complains that this variable is a float, so iotest 297 (which runs mypy) fails.

>          self._post_launch()
>
>      def _early_cleanup(self) -> None:
> diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
> index 05d0dc0751..380527245e 100644
> --- a/tests/qemu-iotests/iotests.py
> +++ b/tests/qemu-iotests/iotests.py
> @@ -478,7 +478,10 @@ def log(msg: Msg,
>
>  class Timeout:
>      def __init__(self, seconds, errmsg="Timeout"):
> -        self.seconds = seconds
> +        if qemu_gdb:
> +            self.seconds = 3000
> +        else:
> +            self.seconds = seconds

We might as well completely disable the timeout then, that would be more honest, I think. (I.e. to make __enter__ and __exit__ no-ops.)

>          self.errmsg = errmsg
>      def __enter__(self):
>          signal.signal(signal.SIGALRM, self.timeout)
> @@ -684,6 +687,11 @@ def qmp_to_opts(self, obj):
>                  output_list += [key + '=' + obj[key]]
>          return ','.join(output_list)
>
> +    def get_qmp_events(self, wait: bool = False) -> List[QMPMessage]:
> +        if qemu_gdb:
> +            wait = 0.0

First, this is a bool. I can see that qmp.py expects a Union[bool, float], but machine.py expects just a bool. (Also, mypy complains that this specific `wait` here is a `bool`. You can see that by running iotest 297.)

Second, I don’t understand this. If the caller wants to block waiting on an event, then that should have nothing to do with whether we have gdb running or not. As far as I understand, setting wait to 0.0 is the same as wait = False, i.e. we don’t block and just return None immediately if there is no pending event.

Max

> +        return super().get_qmp_events(wait=wait)
> +
>      def get_qmp_events_filtered(self, wait=60.0):
>          result = []
>          for ev in self.get_qmp_events(wait=wait):
>
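For illustration, Max's suggestion here (disable the timeout entirely under gdb instead of stretching it to 3000 seconds) could look roughly like the sketch below. This is only a sketch of the idea, not part of the posted patch; it assumes iotests.py's existing qemu_gdb flag and its SIGALRM-based Timeout helper.

    # assumes: import signal; qemu_gdb is set by iotests.py argument handling
    class Timeout:
        def __init__(self, seconds, errmsg="Timeout"):
            self.seconds = seconds
            self.errmsg = errmsg

        def __enter__(self):
            # Under gdb, do not arm SIGALRM at all: the developer may sit
            # at a breakpoint for arbitrarily long, and the test must not
            # abort underneath them.
            if qemu_gdb:
                return self
            signal.signal(signal.SIGALRM, self.timeout)
            signal.setitimer(signal.ITIMER_REAL, self.seconds)
            return self

        def __exit__(self, exc_type, value, traceback):
            if qemu_gdb:
                return False
            signal.setitimer(signal.ITIMER_REAL, 0)
            return False

        def timeout(self, signum, frame):
            raise Exception(self.errmsg)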
On 30/04/2021 13:59, Max Reitz wrote:
> On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:
>> Attaching a gdbserver implies that the qmp socket
>> should wait indefinitely for an answer from QEMU.
>>
>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>> ---
>>  python/qemu/machine.py        |  3 +++
>>  tests/qemu-iotests/iotests.py | 10 +++++++++-
>>  2 files changed, 12 insertions(+), 1 deletion(-)
>>
>> diff --git a/python/qemu/machine.py b/python/qemu/machine.py
>> index 12752142c9..d6142271c2 100644
>> --- a/python/qemu/machine.py
>> +++ b/python/qemu/machine.py
>> @@ -409,6 +409,9 @@ def _launch(self) -> None:
>>                              stderr=subprocess.STDOUT,
>>                              shell=False,
>>                              close_fds=False)
>> +
>> +        if 'gdbserver' in self._wrapper:
>> +            self._qmp_timer = None
>
> Why doesn’t __init__() evaluate this? This here doesn’t feel like the right place for it. If we want to evaluate it here, self._qmp_timer shouldn’t exist, and instead the timeout should be a _post_launch() parameter. (Which I would have nothing against, by the way.)

Uhm.. I got another comment in a previous version where for the "event" callbacks it was better a property than passing around a parameter. Which I honestly agree.

What should __init__() do? The check here is to see if the invocation has gdb (and a couple of patches ahead also valgrind), to remove the timer.

If I understand what you mean, you want something like

def __init__(self, timer):

But from my understanding, the class hierarchy is:

QEMUMachine: in machine.py
QEMUQtestMachine(QEMUMachine): in qtest.py
VM(qtest.QEMUQtestMachine): in iotests.py

VM() is then invoked in each test. So this is not easily reachable by check.py, to pass the parameter into __init__.

>
> Also, mypy complains that this variable is a float, so iotest 297 (which runs mypy) fails.

This and all the mistakes of test 297 (mypy) are my fault: I did not have pylint-3 installed, so test 297 was skipped when I ran the tests.

>
>>          self._post_launch()
>>      def _early_cleanup(self) -> None:
>> diff --git a/tests/qemu-iotests/iotests.py
>> b/tests/qemu-iotests/iotests.py
>> index 05d0dc0751..380527245e 100644
>> --- a/tests/qemu-iotests/iotests.py
>> +++ b/tests/qemu-iotests/iotests.py
>> @@ -478,7 +478,10 @@ def log(msg: Msg,
>>  class Timeout:
>>      def __init__(self, seconds, errmsg="Timeout"):
>> -        self.seconds = seconds
>> +        if qemu_gdb:
>> +            self.seconds = 3000
>> +        else:
>> +            self.seconds = seconds
>
> We might as well completely disable the timeout then, that would be more honest, I think. (I.e. to make __enter__ and __exit__ no-ops.)
>
>>          self.errmsg = errmsg
>>      def __enter__(self):
>>          signal.signal(signal.SIGALRM, self.timeout)
>> @@ -684,6 +687,11 @@ def qmp_to_opts(self, obj):
>>                  output_list += [key + '=' + obj[key]]
>>          return ','.join(output_list)
>> +    def get_qmp_events(self, wait: bool = False) -> List[QMPMessage]:
>> +        if qemu_gdb:
>> +            wait = 0.0
>
> First, this is a bool. I can see that qmp.py expects a Union[bool, float], but machine.py expects just a bool. (Also, mypy complains that this specific `wait` here is a `bool`. You can see that by running iotest 297.)

I think here machine.py should reflect the qmp.py behavior and have a Union[bool, float] too.

>
> Second, I don’t understand this. If the caller wants to block waiting on an event, then that should have nothing to do with whether we have gdb running or not. As far as I understand, setting wait to 0.0 is the same as wait = False, i.e. we don’t block and just return None immediately if there is no pending event.

You're right, this might not be needed here. The problem I had was that calling gdb and pausing at a breakpoint or something for a while would make the QMP socket time out, thus aborting the whole test. In order to avoid that, I need to stop or delay timers.

I can't remember why I added this check here. At some point I am sure the test was failing because of socket timeout expiration, but I cannot reproduce the problem when commenting out this check above in get_qmp_events. The other check in patch 3 should be enough.

Emanuele

>
> Max
>
>> +        return super().get_qmp_events(wait=wait)
>> +
>>      def get_qmp_events_filtered(self, wait=60.0):
>>          result = []
>>          for ev in self.get_qmp_events(wait=wait):
>>
>
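Concretely, the signature widening Emanuele mentions would be a one-line change to the iotests.py override (and the corresponding get_qmp_events in machine.py). An illustrative diff only, assuming qmp.py does accept Union[bool, float] and that Union is imported from typing:

    -    def get_qmp_events(self, wait: bool = False) -> List[QMPMessage]:
    +    def get_qmp_events(self, wait: Union[bool, float] = False) -> List[QMPMessage]: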
On 30.04.21 23:03, Emanuele Giuseppe Esposito wrote:
>
>
> On 30/04/2021 13:59, Max Reitz wrote:
>> On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:
>>> Attaching a gdbserver implies that the qmp socket
>>> should wait indefinitely for an answer from QEMU.
>>>
>>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>>> ---
>>>  python/qemu/machine.py        |  3 +++
>>>  tests/qemu-iotests/iotests.py | 10 +++++++++-
>>>  2 files changed, 12 insertions(+), 1 deletion(-)
>>>
>>> diff --git a/python/qemu/machine.py b/python/qemu/machine.py
>>> index 12752142c9..d6142271c2 100644
>>> --- a/python/qemu/machine.py
>>> +++ b/python/qemu/machine.py
>>> @@ -409,6 +409,9 @@ def _launch(self) -> None:
>>>                              stderr=subprocess.STDOUT,
>>>                              shell=False,
>>>                              close_fds=False)
>>> +
>>> +        if 'gdbserver' in self._wrapper:
>>> +            self._qmp_timer = None
>>
>> Why doesn’t __init__() evaluate this? This here doesn’t feel like the right place for it. If we want to evaluate it here, self._qmp_timer shouldn’t exist, and instead the timeout should be a _post_launch() parameter. (Which I would have nothing against, by the way.)
>
> Uhm.. I got another comment in a previous version where for the "event" callbacks it was better a property than passing around a parameter. Which I honestly agree.

I think that comment was in the sense of providing a default value, which can be expressed by having a property that is set in __init__.

I don’t have anything against making this a property, but I also don’t have anything against making it a _post_launch() parameter. I could even live with both, i.e. set _qmp_timer to 15 in __init__, then have a _post_launch parameter, and pass either self._qmp_timer or None if self._wrapper includes 'gdbserver'.

What I do mind is that I don’t understand why the property is modified here. The value of self._qmp_timer is supposed to be 15 by default and None if self._wrapper includes 'gdbserver'. It should thus be changed to None the moment self._wrapper is made to include 'gdbserver'. Because self._wrapper is set only in __init__, this should happen in __init__.

> What should __init__() do? The check here is to see if the invocation has gdb (and a couple of patches ahead also valgrind), to remove the timer.
> If I understand what you mean, you want something like
> def __init__(self, timer):

Oh, no. We can optionally do that perhaps later, but what I meant is just to put this in __init__() (without adding any parameters to it):

self._qmp_timer = 15.0 if 'gdbserver' not in self._wrapper else None

I think self._qmp_timer should always reflect what timeout we are going to use when a VM is launched. So if the conditions influencing the timeout change, it should be updated immediately to reflect this. The only condition we have right now is the content of self._wrapper, which is only set in __init__, so self._qmp_timer should be set once in __init__ and not changed afterwards.

That sounds academic, but imagine what would happen if we had a set_qmp_timer() method: The timeout could be adjusted, but launch() would just ignore it and update the property, even though the conditions influencing the timeout didn’t change between set_qmp_timer() and launch().

Or if we had a get_qmp_timer(); a caller would read a timeout of 15.0 before launch(), even though the timeout is going to be None.

Therefore, I think a property should not be updated just before it is read, but instead when any condition that’s supposed to influence its value changes.

I suggested making it a parameter because updating a property when reading it sounds like it should be a parameter instead. I.e., one would say

def __init__():
    self._qmp_timeout_default = 15.0

def post_launch(qmp_timeout):
    self._qmp.accept(qmp_timeout)

def launch(self):
    ...
    qmp_timeout = None if 'gdbserver' in self._wrapper \
                  else self._qmp_timeout_default
    self.post_launch(qmp_timeout)

Which is basically the structure your patch has, which gave me the idea.

[...]

>>>          self._post_launch()
>>>      def _early_cleanup(self) -> None:
>>> diff --git a/tests/qemu-iotests/iotests.py
>>> b/tests/qemu-iotests/iotests.py
>>> index 05d0dc0751..380527245e 100644
>>> --- a/tests/qemu-iotests/iotests.py
>>> +++ b/tests/qemu-iotests/iotests.py

[...]

>>> @@ -684,6 +687,11 @@ def qmp_to_opts(self, obj):
>>>                  output_list += [key + '=' + obj[key]]
>>>          return ','.join(output_list)
>>> +    def get_qmp_events(self, wait: bool = False) -> List[QMPMessage]:
>>> +        if qemu_gdb:
>>> +            wait = 0.0

[...]

>>
>> Second, I don’t understand this. If the caller wants to block waiting on an event, then that should have nothing to do with whether we have gdb running or not. As far as I understand, setting wait to 0.0 is the same as wait = False, i.e. we don’t block and just return None immediately if there is no pending event.
>
> You're right, this might not be needed here. The problem I had was that calling gdb and pausing at a breakpoint or something for a while would make the QMP socket time out, thus aborting the whole test. In order to avoid that, I need to stop or delay timers.
>
> I can't remember why I added this check here. At some point I am sure the test was failing because of socket timeout expiration, but I cannot reproduce the problem when commenting out this check above in get_qmp_events. The other check in patch 3 should be enough.

Hm, ok. I’d guessed that you intended the wait=0.0 or wait=False to mean that we get an infinite timeout (i.e., no timeout), but that’s exactly why I didn’t get it. wait=0.0 doesn’t give an infinite timeout, but instead basically times out immediately.

Max
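To spell out the semantics Max describes, with a hypothetical vm instance of the iotests VM class (this only illustrates the behaviour discussed above; the float handling itself lives in qmp.py):

    vm.get_qmp_events(wait=False)  # return only events that are already pending
    vm.get_qmp_events(wait=True)   # block until at least one event arrives
    vm.get_qmp_events(wait=30.0)   # block, but give up after roughly 30 seconds
    vm.get_qmp_events(wait=0.0)    # a zero timeout: effectively the same as wait=False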
On 5/3/21 11:02 AM, Max Reitz wrote:
> On 30.04.21 23:03, Emanuele Giuseppe Esposito wrote:
>>
>>
>> On 30/04/2021 13:59, Max Reitz wrote:
>>> On 14.04.21 19:03, Emanuele Giuseppe Esposito wrote:
>>>> Attaching a gdbserver implies that the qmp socket
>>>> should wait indefinitely for an answer from QEMU.
>>>>
>>>> Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
>>>> ---
>>>>  python/qemu/machine.py        |  3 +++
>>>>  tests/qemu-iotests/iotests.py | 10 +++++++++-
>>>>  2 files changed, 12 insertions(+), 1 deletion(-)
>>>>
>>>> diff --git a/python/qemu/machine.py b/python/qemu/machine.py
>>>> index 12752142c9..d6142271c2 100644
>>>> --- a/python/qemu/machine.py
>>>> +++ b/python/qemu/machine.py
>>>> @@ -409,6 +409,9 @@ def _launch(self) -> None:
>>>>                              stderr=subprocess.STDOUT,
>>>>                              shell=False,
>>>>                              close_fds=False)
>>>> +
>>>> +        if 'gdbserver' in self._wrapper:
>>>> +            self._qmp_timer = None
>>>
>>> Why doesn’t __init__() evaluate this? This here doesn’t feel like the right place for it. If we want to evaluate it here, self._qmp_timer shouldn’t exist, and instead the timeout should be a _post_launch() parameter. (Which I would have nothing against, by the way.)
>>
>> Uhm.. I got another comment in a previous version where for the "event" callbacks it was better a property than passing around a parameter. Which I honestly agree.
>
> I think that comment was in the sense of providing a default value, which can be expressed by having a property that is set in __init__.

My comment was along the lines that "_post_launch()" is behaving as an event loop hook and not the sort of thing I want to pass parameters to. It's a private method, so the only possibility for someone passing a parameter to it is another class method anyway.

We have a hierarchy of things that depend on the Machine class and I didn't want to start cascading optional parameters into the subclasses. It was my intent that the information needed to run _post_launch() correctly should be known by the state of the object -- which I think should be true anyway.

> I don’t have anything against making this a property, but I also don’t have anything against making it a _post_launch() parameter. I could even live with both, i.e. set _qmp_timer to 15 in __init__, then have a _post_launch parameter, and pass either self._qmp_timer or None if self._wrapper includes 'gdbserver'.
>
> What I do mind is that I don’t understand why the property is modified here. The value of self._qmp_timer is supposed to be 15 by default and None if self._wrapper includes 'gdbserver'. It should thus be changed to None the moment self._wrapper is made to include 'gdbserver'. Because self._wrapper is set only in __init__, this should happen in __init__.
>
>> What should __init__() do? The check here is to see if the invocation has gdb (and a couple of patches ahead also valgrind), to remove the timer.
>> If I understand what you mean, you want something like
>> def __init__(self, timer):
>
> Oh, no. We can optionally do that perhaps later, but what I meant is just to put this in __init__() (without adding any parameters to it):
>
> self._qmp_timer = 15.0 if 'gdbserver' not in self._wrapper else None
>
> I think self._qmp_timer should always reflect what timeout we are going to use when a VM is launched. So if the conditions influencing the timeout change, it should be updated immediately to reflect this. The only condition we have right now is the content of self._wrapper, which is only set in __init__, so self._qmp_timer should be set once in __init__ and not changed afterwards.
>
> That sounds academic, but imagine what would happen if we had a set_qmp_timer() method: The timeout could be adjusted, but launch() would just ignore it and update the property, even though the conditions influencing the timeout didn’t change between set_qmp_timer() and launch().
>
> Or if we had a get_qmp_timer(); a caller would read a timeout of 15.0 before launch(), even though the timeout is going to be None.
>
> Therefore, I think a property should not be updated just before it is read, but instead when any condition that’s supposed to influence its value changes.

I agree with Max's reasoning here. I am also not a fan of squishing magic into this class; changing class behavior based on introspection of wrapper arguments feels like a layering violation. Maybe what you want is a subclass or a wrapper class that knows how to run QEMU using gdbserver, and changes some behaviors accordingly?

The factoring of Machine is quite bad already, admittedly, and is in need of a good spit-shine. Too many init parameters, too many state variables, too many methods that got patched in to support one specific use-case at one point or another.

At a certain point, I begin to worry about how it's possible to audit how all of these one-off features behave and interact. It's getting complex. Is it time to dream up a refactoring for how the Machine class behaves?

> I suggested making it a parameter because updating a property when reading it sounds like it should be a parameter instead. I.e., one would say
>
> def __init__():
>     self._qmp_timeout_default = 15.0
>
> def post_launch(qmp_timeout):
>     self._qmp.accept(qmp_timeout)
>
> def launch(self):
>     ...
>     qmp_timeout = None if 'gdbserver' in self._wrapper \
>                   else self._qmp_timeout_default
>     self.post_launch(qmp_timeout)
>
> Which is basically the structure your patch has, which gave me the idea.
>
> [...]
>
>>>>          self._post_launch()
>>>>      def _early_cleanup(self) -> None:
>>>> diff --git a/tests/qemu-iotests/iotests.py
>>>> b/tests/qemu-iotests/iotests.py
>>>> index 05d0dc0751..380527245e 100644
>>>> --- a/tests/qemu-iotests/iotests.py
>>>> +++ b/tests/qemu-iotests/iotests.py
>
> [...]
>
>>>> @@ -684,6 +687,11 @@ def qmp_to_opts(self, obj):
>>>>                  output_list += [key + '=' + obj[key]]
>>>>          return ','.join(output_list)
>>>> +    def get_qmp_events(self, wait: bool = False) -> List[QMPMessage]:
>>>> +        if qemu_gdb:
>>>> +            wait = 0.0
>
> [...]
>
>>>
>>> Second, I don’t understand this. If the caller wants to block waiting on an event, then that should have nothing to do with whether we have gdb running or not. As far as I understand, setting wait to 0.0 is the same as wait = False, i.e. we don’t block and just return None immediately if there is no pending event.
>>
>> You're right, this might not be needed here. The problem I had was that calling gdb and pausing at a breakpoint or something for a while would make the QMP socket time out, thus aborting the whole test. In order to avoid that, I need to stop or delay timers.
>>
>> I can't remember why I added this check here. At some point I am sure the test was failing because of socket timeout expiration, but I cannot reproduce the problem when commenting out this check above in get_qmp_events. The other check in patch 3 should be enough.
>
> Hm, ok. I’d guessed that you intended the wait=0.0 or wait=False to mean that we get an infinite timeout (i.e., no timeout), but that’s exactly why I didn’t get it. wait=0.0 doesn’t give an infinite timeout, but instead basically times out immediately.
>
> Max

Well, I suppose if we don't need it, then that makes things easier too :)
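A minimal sketch of John's subclass idea, assuming the QEMUQtestMachine base class from qtest.py and the _qmp_timer attribute introduced by this series; the class name, the gdb_port parameter, and the wrapper handling are hypothetical, not an agreed design:

    from qemu.qtest import QEMUQtestMachine

    class GDBServerMachine(QEMUQtestMachine):
        """QEMUQtestMachine variant that runs QEMU under gdbserver."""

        def __init__(self, *args, gdb_port: int = 1234, **kwargs):
            super().__init__(*args, **kwargs)
            # Prepend the gdbserver wrapper and drop the QMP accept timeout,
            # so sitting at a breakpoint cannot abort the test.
            self._wrapper = ['gdbserver', f'localhost:{gdb_port}',
                             *self._wrapper]
            self._qmp_timer = None

This keeps the gdb-specific policy out of QEMUMachine itself, which is the layering John is asking for.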
diff --git a/python/qemu/machine.py b/python/qemu/machine.py
index 12752142c9..d6142271c2 100644
--- a/python/qemu/machine.py
+++ b/python/qemu/machine.py
@@ -409,6 +409,9 @@ def _launch(self) -> None:
                             stderr=subprocess.STDOUT,
                             shell=False,
                             close_fds=False)
+
+        if 'gdbserver' in self._wrapper:
+            self._qmp_timer = None
         self._post_launch()
 
     def _early_cleanup(self) -> None:
diff --git a/tests/qemu-iotests/iotests.py b/tests/qemu-iotests/iotests.py
index 05d0dc0751..380527245e 100644
--- a/tests/qemu-iotests/iotests.py
+++ b/tests/qemu-iotests/iotests.py
@@ -478,7 +478,10 @@ def log(msg: Msg,
 
 class Timeout:
     def __init__(self, seconds, errmsg="Timeout"):
-        self.seconds = seconds
+        if qemu_gdb:
+            self.seconds = 3000
+        else:
+            self.seconds = seconds
         self.errmsg = errmsg
     def __enter__(self):
         signal.signal(signal.SIGALRM, self.timeout)
@@ -684,6 +687,11 @@ def qmp_to_opts(self, obj):
                 output_list += [key + '=' + obj[key]]
         return ','.join(output_list)
 
+    def get_qmp_events(self, wait: bool = False) -> List[QMPMessage]:
+        if qemu_gdb:
+            wait = 0.0
+        return super().get_qmp_events(wait=wait)
+
     def get_qmp_events_filtered(self, wait=60.0):
         result = []
         for ev in self.get_qmp_events(wait=wait):
Attaching a gdbserver implies that the qmp socket
should wait indefinitely for an answer from QEMU.

Signed-off-by: Emanuele Giuseppe Esposito <eesposit@redhat.com>
---
 python/qemu/machine.py        |  3 +++
 tests/qemu-iotests/iotests.py | 10 +++++++++-
 2 files changed, 12 insertions(+), 1 deletion(-)