Message ID | 20230825033616.3402812-1-leo.yan@linaro.org (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | [v1] tools/hotplug: systemd: Make dependency on Xen device nodes | expand |
On Fri Aug 25, 2023 at 5:36 AM CEST, Leo Yan wrote: > When system booting up, the kernel module xen_gntdev.ko is loaded and > the device node '/dev/xen/gntdev' is created; later the xenstored > service in systemd launches daemon to open this device node. > > This flow has a race condition between creating the device node in the > kernel module and using the device node in the systemd service. It's > possible that the xenstored service fails to open the device node due > to the delay of creating the device node. In the end, xenbus cannot be > used between the Dom0 kernel and the Xen hypervisor. > > To resolve this issue, we need to synchronize between udev and systemd > for the device node. There have an extra change in the udev rules for > tagging 'systemd' for Xen device nodes, which notifies device node > creating to systemd; besides udev change, this patch adds dependency in > systemd service for waiting the device node. > > Signed-off-by: Leo Yan <leo.yan@linaro.org> > --- > > The udev rules change is on github: > https://github.com/systemd/systemd/pull/28962/commits/520340dfea3b6cf9fe7a24c9238313b1a5fe8539 Let's see what they think, but I fear systemd may not be the correct upstream to carry this... Skimming through the rules that they have there, it mostly looks like they only carry rules that are relevant for systemd-related services directly. > tools/hotplug/Linux/systemd/xenstored.service.in | 4 ++-- > 1 file changed, 2 insertions(+), 2 deletions(-) > > diff --git a/tools/hotplug/Linux/systemd/xenstored.service.in b/tools/hotplug/Linux/systemd/xenstored.service.in > index 261077dc92..6e48cdb0e7 100644 > --- a/tools/hotplug/Linux/systemd/xenstored.service.in > +++ b/tools/hotplug/Linux/systemd/xenstored.service.in > @@ -1,7 +1,7 @@ > [Unit] > Description=The Xen xenstore > -Requires=proc-xen.mount > -After=proc-xen.mount > +Requires=proc-xen.mount dev-xen-gntdev.device > +After=proc-xen.mount dev-xen-gntdev.device I must admit that I am a bit confused why this is enough... During our discussion on Slack, when you quoted from your rule it included `ENV{SYSTEMD_WANTS}="xenstored.service"`. Did you drop that during later development? Was there additional reasearch involved in dropping it? My understanding was that if devices do not exist when systemd builds its transaction model, dependencies on them will just get tossed out[1]. So I would have expected that there should still be a race if the .device does not pull in the service. Did you hit the race frequently enough to be sure that this fixes it in entirety? Is there a place somewhere in the Xen or kernel code where one could add an excessive sleep in order to trigger the race more reliably? [1] "citation-needed": But I vaguely remember dealing with a similar issue and getting told this on #systemd IRC - Erik
On Fri, Aug 25, 2023 at 07:02:33AM +0200, Erik Schilling wrote: > On Fri Aug 25, 2023 at 5:36 AM CEST, Leo Yan wrote: > > When system booting up, the kernel module xen_gntdev.ko is loaded and > > the device node '/dev/xen/gntdev' is created; later the xenstored > > service in systemd launches daemon to open this device node. > > > > This flow has a race condition between creating the device node in the > > kernel module and using the device node in the systemd service. It's > > possible that the xenstored service fails to open the device node due > > to the delay of creating the device node. In the end, xenbus cannot be > > used between the Dom0 kernel and the Xen hypervisor. > > > > To resolve this issue, we need to synchronize between udev and systemd > > for the device node. There have an extra change in the udev rules for > > tagging 'systemd' for Xen device nodes, which notifies device node > > creating to systemd; besides udev change, this patch adds dependency in > > systemd service for waiting the device node. > > > > Signed-off-by: Leo Yan <leo.yan@linaro.org> > > --- > > > > The udev rules change is on github: > > https://github.com/systemd/systemd/pull/28962/commits/520340dfea3b6cf9fe7a24c9238313b1a5fe8539 > > Let's see what they think, but I fear systemd may not be the correct > upstream to carry this... Skimming through the rules that they have > there, it mostly looks like they only carry rules that are relevant for > systemd-related services directly. Yeah, I share the same concern. But seems to me, upstreaming change to the systemd is the most neat fixing. > > tools/hotplug/Linux/systemd/xenstored.service.in | 4 ++-- > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > diff --git a/tools/hotplug/Linux/systemd/xenstored.service.in b/tools/hotplug/Linux/systemd/xenstored.service.in > > index 261077dc92..6e48cdb0e7 100644 > > --- a/tools/hotplug/Linux/systemd/xenstored.service.in > > +++ b/tools/hotplug/Linux/systemd/xenstored.service.in > > @@ -1,7 +1,7 @@ > > [Unit] > > Description=The Xen xenstore > > -Requires=proc-xen.mount > > -After=proc-xen.mount > > +Requires=proc-xen.mount dev-xen-gntdev.device > > +After=proc-xen.mount dev-xen-gntdev.device > > I must admit that I am a bit confused why this is enough... During our > discussion on Slack, when you quoted from your rule it included > `ENV{SYSTEMD_WANTS}="xenstored.service"`. Did you drop that during later > development? Was there additional reasearch involved in dropping it? Yes, I dropped ENV{SYSTEMD_WANTS}="xenstored.service" and it works well at my side. My purpose is to upstream the udev rules in systemd as general as possible. As you mentioned, the "xenstored.service" is maintained in Xen but not in systemd, for this reason, I would like avoid to add "xenstored.service" into udev rules in systemd. > My understanding was that if devices do not exist when systemd builds > its transaction model, dependencies on them will just get tossed out[1]. > So I would have expected that there should still be a race if > the .device does not pull in the service. Thanks for sharing. To be honest, I don't know this part. Doesn't systemd parse the service script and let service to wait for specific .device until systemd receives notification for the .device? > Did you hit the race frequently enough to be sure that this fixes it in > entirety? I have two boards. One on board (Telechips), it's consistently reproduce this issue, and after applying the udev rules change and this patch, I confirmed the issue is dismissed entirely. > Is there a place somewhere in the Xen or kernel code where one > could add an excessive sleep in order to trigger the race more reliably? On my AVA board with running Xen, I tried to add a long delay (10 seconds) in the kernel driver 'drivers/xen/gntdev.c', but I can see systemd will wait for the kernel modules loading, and then it launches Xen services. Thus I cannot reproduce this issue on my AVA board. So in below flow: - Step 1: drivers/xen/gntdev.c registers misc device; - Step 2: udev creates device node '/dev/xen/gntdev'; - Step 3: systemd launches xenstored.service. Seems to me, the race condition happens between step 2 and step 3. If there have any delay in udev for creating device node, then the step3 for xenstored.service will fail. Thanks for review. Leo > [1] "citation-needed": But I vaguely remember dealing with a similar > issue and getting told this on #systemd IRC > > - Erik
On Fri Aug 25, 2023 at 8:36 AM CEST, Leo Yan wrote: > On Fri, Aug 25, 2023 at 07:02:33AM +0200, Erik Schilling wrote: > > On Fri Aug 25, 2023 at 5:36 AM CEST, Leo Yan wrote: > > > When system booting up, the kernel module xen_gntdev.ko is loaded and > > > the device node '/dev/xen/gntdev' is created; later the xenstored > > > service in systemd launches daemon to open this device node. > > > > > > This flow has a race condition between creating the device node in the > > > kernel module and using the device node in the systemd service. It's > > > possible that the xenstored service fails to open the device node due > > > to the delay of creating the device node. In the end, xenbus cannot be > > > used between the Dom0 kernel and the Xen hypervisor. > > > > > > To resolve this issue, we need to synchronize between udev and systemd > > > for the device node. There have an extra change in the udev rules for > > > tagging 'systemd' for Xen device nodes, which notifies device node > > > creating to systemd; besides udev change, this patch adds dependency in > > > systemd service for waiting the device node. > > > > > > Signed-off-by: Leo Yan <leo.yan@linaro.org> > > > --- > > > > > > The udev rules change is on github: > > > https://github.com/systemd/systemd/pull/28962/commits/520340dfea3b6cf9fe7a24c9238313b1a5fe8539 > > > > Let's see what they think, but I fear systemd may not be the correct > > upstream to carry this... Skimming through the rules that they have > > there, it mostly looks like they only carry rules that are relevant for > > systemd-related services directly. > > Yeah, I share the same concern. But seems to me, upstreaming change > to the systemd is the most neat fixing. > > > > tools/hotplug/Linux/systemd/xenstored.service.in | 4 ++-- > > > 1 file changed, 2 insertions(+), 2 deletions(-) > > > > > > diff --git a/tools/hotplug/Linux/systemd/xenstored.service.in b/tools/hotplug/Linux/systemd/xenstored.service.in > > > index 261077dc92..6e48cdb0e7 100644 > > > --- a/tools/hotplug/Linux/systemd/xenstored.service.in > > > +++ b/tools/hotplug/Linux/systemd/xenstored.service.in > > > @@ -1,7 +1,7 @@ > > > [Unit] > > > Description=The Xen xenstore > > > -Requires=proc-xen.mount > > > -After=proc-xen.mount > > > +Requires=proc-xen.mount dev-xen-gntdev.device > > > +After=proc-xen.mount dev-xen-gntdev.device > > > > I must admit that I am a bit confused why this is enough... During our > > discussion on Slack, when you quoted from your rule it included > > `ENV{SYSTEMD_WANTS}="xenstored.service"`. Did you drop that during later > > development? Was there additional reasearch involved in dropping it? > > Yes, I dropped ENV{SYSTEMD_WANTS}="xenstored.service" and it works well > at my side. Hm. Interesting. Could you plot the activations after boot? systemd-analyze plot > /tmp/plot.svg Seeing failure vs success but also the success cases on AVA vs Telechips may be interesting. I just did some tests on my workstation where I added dependencies on a random USB device. If that one was not plugged at all, the service still happily started. Yet, it obviously seems to do something in your case... So I fear we may not fully understand the real problem yet. I must admit that I find this case a bit under-documented. While "Wants" explicitly says that failling transactions are ignored, there is no clear statement about what happens in that case with "Requires". While skimming the docs I also realized that maybe BindsTo= would maybe be more suitable compared to Requires= here. But that is unrelated to the confusion that I have about the original problem. > My purpose is to upstream the udev rules in systemd as general as > possible. As you mentioned, the "xenstored.service" is maintained in Xen > but not in systemd, for this reason, I would like avoid to add > "xenstored.service" into udev rules in systemd. > > > My understanding was that if devices do not exist when systemd builds > > its transaction model, dependencies on them will just get tossed out[1]. > > So I would have expected that there should still be a race if > > the .device does not pull in the service. > > Thanks for sharing. To be honest, I don't know this part. Doesn't > systemd parse the service script and let service to wait for specific > .device until systemd receives notification for the .device? > > > Did you hit the race frequently enough to be sure that this fixes it in > > entirety? > > I have two boards. One on board (Telechips), it's consistently > reproduce this issue, and after applying the udev rules change and > this patch, I confirmed the issue is dismissed entirely. > > > Is there a place somewhere in the Xen or kernel code where one > > could add an excessive sleep in order to trigger the race more reliably? > > On my AVA board with running Xen, I tried to add a long delay (10 > seconds) in the kernel driver 'drivers/xen/gntdev.c', but I can see > systemd will wait for the kernel modules loading, and then it launches > Xen services. Thus I cannot reproduce this issue on my AVA board. Seeing the systemd-analyze plots would maybe help to understand the differences here... Does one or the other maybe involve some initrd stage that already creates devices? Also: What was the original error that you saw? Was /dev/xen/gntdev missing entirely? Or did interaction with it fail with some error code? Can you reproduce the race while booting with systemd.log_level=debug and export the full journal of the current boot? journalctl -b0 -o export > journal.txt > So in below flow: > > - Step 1: drivers/xen/gntdev.c registers misc device; > - Step 2: udev creates device node '/dev/xen/gntdev'; > - Step 3: systemd launches xenstored.service. > > Seems to me, the race condition happens between step 2 and step 3. If > there have any delay in udev for creating device node, then the step3 > for xenstored.service will fail. Hm... If you see systemd waiting for the module, then Step 2 + 3 should not be able to race. The service does not set DefaultDependencies=no, so xenstored.service should only run after sysinit.target is reached. Since that target depends on the module loading, everything should - in theory - work out. - Erik > > Thanks for review. > > Leo > > > [1] "citation-needed": But I vaguely remember dealing with a similar > > issue and getting told this on #systemd IRC > > > > - Erik
On Fri, Aug 25, 2023 at 09:37:58AM +0200, Erik Schilling wrote: [...] > > > > diff --git a/tools/hotplug/Linux/systemd/xenstored.service.in b/tools/hotplug/Linux/systemd/xenstored.service.in > > > > index 261077dc92..6e48cdb0e7 100644 > > > > --- a/tools/hotplug/Linux/systemd/xenstored.service.in > > > > +++ b/tools/hotplug/Linux/systemd/xenstored.service.in > > > > @@ -1,7 +1,7 @@ > > > > [Unit] > > > > Description=The Xen xenstore > > > > -Requires=proc-xen.mount > > > > -After=proc-xen.mount > > > > +Requires=proc-xen.mount dev-xen-gntdev.device > > > > +After=proc-xen.mount dev-xen-gntdev.device > > > > > > I must admit that I am a bit confused why this is enough... During our > > > discussion on Slack, when you quoted from your rule it included > > > `ENV{SYSTEMD_WANTS}="xenstored.service"`. Did you drop that during later > > > development? Was there additional reasearch involved in dropping it? > > > > Yes, I dropped ENV{SYSTEMD_WANTS}="xenstored.service" and it works well > > at my side. > > Hm. Interesting. Could you plot the activations after boot? > > systemd-analyze plot > /tmp/plot.svg > > Seeing failure vs success but also the success cases on AVA vs Telechips > may be interesting. > > I just did some tests on my workstation where I added dependencies on > a random USB device. If that one was not plugged at all, the service > still happily started. Yet, it obviously seems to do something in your > case... So I fear we may not fully understand the real problem yet. > > I must admit that I find this case a bit under-documented. While "Wants" > explicitly says that failling transactions are ignored, there is no > clear statement about what happens in that case with "Requires". > > While skimming the docs I also realized that maybe BindsTo= would maybe > be more suitable compared to Requires= here. But that is unrelated to > the confusion that I have about the original problem. Please ignore this patch and sorry for noise. Erik and me confirmed the systemd has established the dependencies properly. The dependency is (the above one depends on the blow one): xenstored.service `> sysinit.target `> systemd-modules-load.service (Load xen modules) My issue is caused by a local customzied systemd-modules-load.service which breaks the dependency between sysinit.target and systemd-modules-load.service. Very appreciate Erik's help for root cause the issue. Thanks, Leo
diff --git a/tools/hotplug/Linux/systemd/xenstored.service.in b/tools/hotplug/Linux/systemd/xenstored.service.in index 261077dc92..6e48cdb0e7 100644 --- a/tools/hotplug/Linux/systemd/xenstored.service.in +++ b/tools/hotplug/Linux/systemd/xenstored.service.in @@ -1,7 +1,7 @@ [Unit] Description=The Xen xenstore -Requires=proc-xen.mount -After=proc-xen.mount +Requires=proc-xen.mount dev-xen-gntdev.device +After=proc-xen.mount dev-xen-gntdev.device Before=libvirtd.service libvirt-guests.service RefuseManualStop=true ConditionPathExists=/proc/xen/capabilities
When system booting up, the kernel module xen_gntdev.ko is loaded and the device node '/dev/xen/gntdev' is created; later the xenstored service in systemd launches daemon to open this device node. This flow has a race condition between creating the device node in the kernel module and using the device node in the systemd service. It's possible that the xenstored service fails to open the device node due to the delay of creating the device node. In the end, xenbus cannot be used between the Dom0 kernel and the Xen hypervisor. To resolve this issue, we need to synchronize between udev and systemd for the device node. There have an extra change in the udev rules for tagging 'systemd' for Xen device nodes, which notifies device node creating to systemd; besides udev change, this patch adds dependency in systemd service for waiting the device node. Signed-off-by: Leo Yan <leo.yan@linaro.org> --- The udev rules change is on github: https://github.com/systemd/systemd/pull/28962/commits/520340dfea3b6cf9fe7a24c9238313b1a5fe8539 tools/hotplug/Linux/systemd/xenstored.service.in | 4 ++-- 1 file changed, 2 insertions(+), 2 deletions(-)