Message ID | 20210201100903.17309-1-cfontana@suse.de (mailing list archive) |
---|---|
Headers | show |
Series | i386 cleanup PART 2 | expand |
Claudio Fontana <cfontana@suse.de> writes: > v14 -> v15: > > * change the TcgCpuOperations so that all fields of the struct are > defined unconditionally, as per original patch by Eduardo, > before moving them to a separate struct referenced by a pointer > (Richard, Eduardo). > <snip> I'm not sure if their is some instability because I keep seeing failures in the review trees, e.g: https://gitlab.com/stsquad/qemu/-/pipelines/250749404 If you look at my pipeline history you'll see it is *mostly* green: https://gitlab.com/stsquad/qemu/-/pipelines especially for the testing/next branches that I apply the reviews to. So I worry there might be something lurking in there. How have your CI runs looked?
On 2/3/21 3:22 PM, Alex Bennée wrote: > > Claudio Fontana <cfontana@suse.de> writes: > >> v14 -> v15: >> >> * change the TcgCpuOperations so that all fields of the struct are >> defined unconditionally, as per original patch by Eduardo, >> before moving them to a separate struct referenced by a pointer >> (Richard, Eduardo). >> > <snip> > > I'm not sure if their is some instability because I keep seeing failures > in the review trees, e.g: > > https://gitlab.com/stsquad/qemu/-/pipelines/250749404 > > If you look at my pipeline history you'll see it is *mostly* green: > > https://gitlab.com/stsquad/qemu/-/pipelines > > especially for the testing/next branches that I apply the reviews to. So > I worry there might be something lurking in there. How have your CI runs > looked? > My CI has looked fine, I build-tested (with make check) every single commit. Rebased last time the day before yesterday. https://gitlab.com/hw-claudio/qemu/-/pipelines
On Wed, Feb 03, 2021 at 02:22:38PM +0000, Alex Bennée wrote: > > Claudio Fontana <cfontana@suse.de> writes: > > > v14 -> v15: > > > > * change the TcgCpuOperations so that all fields of the struct are > > defined unconditionally, as per original patch by Eduardo, > > before moving them to a separate struct referenced by a pointer > > (Richard, Eduardo). > > > <snip> > > I'm not sure if their is some instability because I keep seeing failures > in the review trees, e.g: > > https://gitlab.com/stsquad/qemu/-/pipelines/250749404 Seems unrelated to this series, Eric Blake reported similar iotest failures at https://lore.kernel.org/qemu-devel/9e71568c-ce4a-f844-fbd3-a4a59f850d74@redhat.com
Claudio Fontana <cfontana@suse.de> writes: > On 2/3/21 3:22 PM, Alex Bennée wrote: >> >> Claudio Fontana <cfontana@suse.de> writes: >> >>> v14 -> v15: >>> >>> * change the TcgCpuOperations so that all fields of the struct are >>> defined unconditionally, as per original patch by Eduardo, >>> before moving them to a separate struct referenced by a pointer >>> (Richard, Eduardo). >>> >> <snip> >> >> I'm not sure if their is some instability because I keep seeing failures >> in the review trees, e.g: >> >> https://gitlab.com/stsquad/qemu/-/pipelines/250749404 >> >> If you look at my pipeline history you'll see it is *mostly* green: >> >> https://gitlab.com/stsquad/qemu/-/pipelines >> >> especially for the testing/next branches that I apply the reviews to. So >> I worry there might be something lurking in there. How have your CI runs >> looked? >> > > My CI has looked fine, I build-tested (with make check) every single commit. > > Rebased last time the day before yesterday. > > https://gitlab.com/hw-claudio/qemu/-/pipelines Yeah it did seem weird given it was a block failure. Chalk it up to paranoia ;-)
Claudio Fontana <cfontana@suse.de> writes: <snip> Final comments. I think overall this series is looking pretty good although I got a bit lost at the end when we started expanding on the AccelClass. The main yuck was the start-up ordering constraint which would be nice to remove or failing that catch with some asserts so weird bugs don't get introduced. Paolo, is it worth picking up some of the early patches to reduce the patch delta going forward?
Hi Alex, thanks for your review, On 2/3/21 5:57 PM, Alex Bennée wrote: > > Claudio Fontana <cfontana@suse.de> writes: > > <snip> > > Final comments. I think overall this series is looking pretty good > although I got a bit lost at the end when we started expanding on the > AccelClass. > The main yuck was the start-up ordering constraint which To be sure, are you referring to tcg_accel_ops_init(), ie your comments towards the end of PATCH 17? Ciao, Claudio > would be nice to remove or failing that catch with some asserts so weird > bugs don't get introduced. > > Paolo, is it worth picking up some of the early patches to reduce the > patch delta going forward? >
For patch 17 on onwards it was just seeing what the actual benefit of the derived class was - I think I get it later on but you should mention it up front. I do think we need to address the ordering constraint in 21 - are we introducing one or just formalising what has been created? If we are introducing one then can we a) do it a better way with the structuring of QOM or b) enforce it so new models don't run into unexpected bugs. On Wed, 3 Feb 2021 at 17:10, Claudio Fontana <cfontana@suse.de> wrote: > > Hi Alex, > > thanks for your review, > > On 2/3/21 5:57 PM, Alex Bennée wrote: > > > > Claudio Fontana <cfontana@suse.de> writes: > > > > <snip> > > > > Final comments. I think overall this series is looking pretty good > > although I got a bit lost at the end when we started expanding on the > > AccelClass. > > The main yuck was the start-up ordering constraint which > > To be sure, are you referring to tcg_accel_ops_init(), ie your comments towards the end of PATCH 17? > > Ciao, > > Claudio > > > would be nice to remove or failing that catch with some asserts so weird > > bugs don't get introduced. > > > > Paolo, is it worth picking up some of the early patches to reduce the > > patch delta going forward? > > >
On 2/3/21 11:07 PM, Alex Bennée wrote: > For patch 17 on onwards it was just seeing what the actual benefit of > the derived class was - I think I get it later on but you should > mention it up front. > > I do think we need to address the ordering constraint in 21 - are we > introducing one or just formalising what has been created? If we are > introducing one then can we a) do it a better way with the structuring > of QOM or b) enforce it so new models don't run into unexpected bugs. What patch 21 tried to do is to improve on the existing call method of "realizefn" for cpus. To be honest it ended up not really achieving the goal, only removing one open call to qemu_init_vcpu in the target code. The actual problem of the completely freak call order of realizefn, where the object model and device model interactions just really get in the way and create more problems than they solve, remains largely untouched. The problem is everything that has been plugged on top of realizing cpus now, which depends on the existing call order, which makes it almost impossible in my view to untangle properly. As an example, the addition of a new cpu (cpu_list_add) should theoretically be done in the common cpu code, but it can't, due to the web of dependencies of the cpu_index being already updated before the common code is reached (tcg plugins are also a blocker there IIRC, but it is by no means the only one). cpu_exec_realizefn then remains the place where this is done, which is called directly inside the target/xxx/cpu.c code. Add to it the fact that we cannot do all framework operations in hw/core/cpu.c, because of the common_ss / specific_ss code split necessity, and you get a web of constraints that is likely impossible to navigate. To answer your questions: a) we are introducing a more strict order in this patch, in the sense that implementations in target/xxx/cpu.c are not free to call qemu_init_vcpu where they please, instead the call is included in common code, triggered by the parent_realize() call. b) this is basically automatically enforced by the fact that the call is not in target/ anymore -- As can be seen by the patch, for some targets, in particular the ones requiring a cpu_reset() after qemu_init_vcpu, this slightly changes the initialization, as between qemu_init_vcpu and cpu_reset() you now have the common code: /* qdev_get_machine() can return something that's not TYPE_MACHINE * if this is one of the user-only emulators; in that case there's * no need to check the ignore_memory_transaction_failures board flag. */ if (object_dynamic_cast(machine, TYPE_MACHINE)) { ObjectClass *oc = object_get_class(machine); MachineClass *mc = MACHINE_CLASS(oc); if (mc) { cpu->ignore_memory_transaction_failures = mc->ignore_memory_transaction_failures; } } if (dev->hotplugged) { cpu_synchronize_post_init(cpu); cpu_resume(cpu); } which was executed later before. -- Only as a result of your comment I now noticed the last part about hotplug, which looks a bit scary tbh. I wonder if there is some automated test that covers cpu device hotplug? And regardless of the fact that I could not see any issue, I am tempted to drop patch 21 entirely now. Let me know what you think, Thanks, Claudio > > On Wed, 3 Feb 2021 at 17:10, Claudio Fontana <cfontana@suse.de> wrote: >> >> Hi Alex, >> >> thanks for your review, >> >> On 2/3/21 5:57 PM, Alex Bennée wrote: >>> >>> Claudio Fontana <cfontana@suse.de> writes: >>> >>> <snip> >>> >>> Final comments. I think overall this series is looking pretty good >>> although I got a bit lost at the end when we started expanding on the >>> AccelClass. >>> The main yuck was the start-up ordering constraint which >> >> To be sure, are you referring to tcg_accel_ops_init(), ie your comments towards the end of PATCH 17? >> >> Ciao, >> >> Claudio >> >>> would be nice to remove or failing that catch with some asserts so weird >>> bugs don't get introduced. >>> >>> Paolo, is it worth picking up some of the early patches to reduce the >>> patch delta going forward? >>> >> > >