Message ID | 1368076726-11492-2-git-send-email-skannan@codeaurora.org (mailing list archive) |
---|---|
State | New, archived |
On Thu, May 9, 2013 at 1:18 PM, Saravana Kannan <skannan@codeaurora.org> wrote:
>
> The most obvious fallback of using late_initcall_sync() also doesn't work
> since the deferred probing work initiated during late_initcall() is done in
> a workqueue. So, frameworks that want to wait for all devices to finish
> probing during init will now have to wait for the deferred workqueue to
> finish its work. This patch adds a wait_for_init_deferred_probe_done() API

flush_workqueue() has been added in deferred_probe_initcall(), so it looks
like it should already be OK for your problem, doesn't it?

Thanks,
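[For reference, the deferred_probe_initcall() Ming Lei is referring to looks roughly like the sketch below - reconstructed from memory of drivers/base/dd.c of this era, not a verbatim copy. The point is that the workqueue is flushed before the late initcalls finish:

static int deferred_probe_initcall(void)
{
        /* Deferred probing is only enabled once all built-in drivers have
         * had a chance to register; then the pending list is retried and
         * the queue is drained before initcalls complete. */
        deferred_wq = create_singlethread_workqueue("deferred");
        driver_deferred_probe_enable = true;
        driver_deferred_probe_trigger();
        /* Sort as many dependencies as possible before exiting initcalls */
        flush_workqueue(deferred_wq);
        return 0;
}
late_initcall(deferred_probe_initcall);
]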
On Thu, May 9, 2013 at 11:07 AM, Ming Lei <tom.leiming@gmail.com> wrote:
> On Thu, May 9, 2013 at 1:18 PM, Saravana Kannan <skannan@codeaurora.org> wrote:
>>
>> The most obvious fallback of using late_initcall_sync() also doesn't work
>> since the deferred probing work initiated during late_initcall() is done in
>> a workqueue. So, frameworks that want to wait for all devices to finish
>> probing during init will now have to wait for the deferred workqueue to
>> finish its work. This patch adds a wait_for_init_deferred_probe_done() API
>
> flush_workqueue() has been added in deferred_probe_initcall(), so it looks
> like it should already be OK for your problem, doesn't it?

It looks like Saravana is using a kernel that already does that, based
on object bb5645e from the diff. So if he is still having a problem,
then there is probably another deferred probe that is triggered after
the deferred probe late_initcall is executed. It would be good to know
what driver is getting deferred past clearing the queue. Or has this
been rebased from an earlier kernel? It may no longer be necessary.

However, if a device shuts down resources after init has completed and
then cannot turn those resources back on when another driver requests
them, then it sounds like there is a bigger design problem. We're in a
hotplug world, and most of the time a driver cannot assume that a
resource will never get requested after initcalls have completed. It
sounds like a design bug in the driver if it cannot handle that use case.

g.
On Thu, May 09, 2013 at 12:50:46PM +0100, Grant Likely wrote:

> However, if a device shuts down resources after init has completed and
> then cannot turn those resources back on when another driver requests
> them, then it sounds like there is a bigger design problem. We're in a
> hotplug world, and most of the time a driver cannot assume that a
> resource will never get requested after initcalls have completed. It
> sounds like a design bug in the driver if it cannot handle that use case.

Even if the driver copes fine it can still be desirable to avoid the
power down/up cycle if it involves some user-visible effect - things
like blinking the display off then on, for example. That said, I am a
little suspicious about this approach; it doesn't feel as robust as it
should to go round individual callers.
On Thu, May 09, 2013 at 02:50:17PM +0100, Mark Brown wrote:
> On Thu, May 09, 2013 at 12:50:46PM +0100, Grant Likely wrote:
>
> > However, if a device shuts down resources after init has completed and
> > then cannot turn those resources back on when another driver requests
> > them, then it sounds like there is a bigger design problem. We're in a
> > hotplug world, and most of the time a driver cannot assume that a
> > resource will never get requested after initcalls have completed. It
> > sounds like a design bug in the driver if it cannot handle that use case.
>
> Even if the driver copes fine it can still be desirable to avoid the
> power down/up cycle if it involves some user-visible effect - things
> like blinking the display off then on, for example. That said, I am a
> little suspicious about this approach; it doesn't feel as robust as it
> should to go round individual callers.

What if the driver for something like your display is a module which
needs to be loaded from userland?

Where the design bug lies is in the "let's probe all the drivers and then
shut down resources which drivers haven't claimed" approach. That contains
an implied assumption: that all drivers have been loaded and probed at the
point where you shut down those resources.

That simply may not be true in today's kernel - it's not true for a start
if you have modular drivers and you have built most of the drivers as
modules (as a distro would want to with a single zImage). It's
coincidentally not true if you have deferred probing and some drivers
defer.

The real problem is this: at what point has the system actually finished
"booting" in the sense that all drivers for a platform have been
initialised? With user-loadable modules and deferred probing, that's
actually a very fuzzy concept.

As you can't really tell when that point has been reached, how can you
decide to shut down resources which "aren't being used" in a sane way
while avoiding the down/up cycle? Basically, you can't.

So, trying to solve the problem may be totally fruitless because you
can't actually solve it - you can only put a sticky plaster over it and
hope that it catches most of the problem. But the reality is that you
can't have both a shutdown of unused resources _and_ avoidance of the
down/up cycle.
On Thu, May 9, 2013 at 1:18 PM, Saravana Kannan <skannan@codeaurora.org> wrote:
> Kernel frameworks (e.g. regulator, clock, etc.) might want to do some cleanup
> work (e.g. turn off unclaimed resources) after all devices are done probing
> during kernel init. Before deferred probing was introduced, this was

Generally, system resources should be configured as disabled by default,
and drivers should request and enable only the resources they actually
need; then this sort of 'cleanup' can be avoided.

Thanks,
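[As a concrete illustration of the pattern Ming Lei describes, using the standard clk API; foo_probe() is hypothetical and not part of this patch:

#include <linux/clk.h>
#include <linux/err.h>
#include <linux/platform_device.h>

static int foo_probe(struct platform_device *pdev)
{
        struct clk *clk;
        int ret;

        /* Claim and enable only what this device actually needs. */
        clk = devm_clk_get(&pdev->dev, NULL);
        if (IS_ERR(clk))
                return PTR_ERR(clk);    /* may be -EPROBE_DEFER */

        ret = clk_prepare_enable(clk);
        if (ret)
                return ret;

        /* ... */
        return 0;
}

A driver written this way never depends on a clock that was left running by the bootloader, so it has nothing to lose from a later "turn off the unclaimed leftovers" pass.]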
On Thu, May 09, 2013 at 03:14:45PM +0100, Russell King - ARM Linux wrote:
> On Thu, May 09, 2013 at 02:50:17PM +0100, Mark Brown wrote:
>
> > Even if the driver copes fine it can still be desirable to avoid the
> > power down/up cycle if it involves some user-visible effect - things
> > like blinking the display off then on, for example. That said, I am a
> > little suspicious about this approach; it doesn't feel as robust as it
> > should to go round individual callers.
>
> What if the driver for something like your display is a module which
> needs to be loaded from userland?

That's clearly a "don't do that then" sort of thing; while we don't want
to be unhelpful there are no guarantees with this approach.

> Where the design bug lies is in the "let's probe all the drivers and then
> shut down resources which drivers haven't claimed" approach. That contains
> an implied assumption: that all drivers have been loaded and probed at the
> point where you shut down those resources.

Well, that's not really the intention - it's not a strong guarantee, and it
never has been, given things like the module case you mention.

> So, trying to solve the problem may be totally fruitless because you
> can't actually solve it - you can only put a sticky plaster over it and
> hope that it catches most of the problem. But the reality is that you
> can't have both a shutdown of unused resources _and_ avoidance of the
> down/up cycle.

Yes, exactly - all we're trying to do here is cover the 90% case, not
solve all the possible problems, since as you say that's not achievable.
There's a clear and reasonable desire to turn off resources we know
aren't in use at the current time, but equally well doing so as soon as
we start controlling the resources is pretty much guaranteed to introduce
user-visible issues on some systems, so it's a question of picking some
reasonable point after that.

Another option here which is more in tune with deferred probing and
hotplugging would be to switch the delay over to be time based rather
than initcall based; do the shutdown at some point based on the time the
last resource was registered. That still won't cover everything,
though we could make the delay tunable.
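[A rough sketch of the time-based variant Mark is floating here - and which Russell objects to below. The resource_registered() hook and the delay value are purely illustrative, not an existing API:

#include <linux/jiffies.h>
#include <linux/workqueue.h>

static void late_resource_shutdown(struct work_struct *work)
{
        /* ... turn off whatever still has no consumer ... */
}
static DECLARE_DELAYED_WORK(late_shutdown_work, late_resource_shutdown);

/* Tunable "quiet period": the shutdown only runs once no new resource
 * has been registered for shutdown_delay jiffies. */
static unsigned long shutdown_delay = 10 * HZ;

void resource_registered(void)  /* hypothetical hook in a resource core */
{
        mod_delayed_work(system_wq, &late_shutdown_work, shutdown_delay);
}
]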
On Thu, May 09, 2013 at 03:37:02PM +0100, Mark Brown wrote:
> On Thu, May 09, 2013 at 03:14:45PM +0100, Russell King - ARM Linux wrote:
> > On Thu, May 09, 2013 at 02:50:17PM +0100, Mark Brown wrote:
> >
> > > Even if the driver copes fine it can still be desirable to avoid the
> > > power down/up cycle if it involves some user-visible effect - things
> > > like blinking the display off then on, for example. That said, I am a
> > > little suspicious about this approach; it doesn't feel as robust as it
> > > should to go round individual callers.
> >
> > What if the driver for something like your display is a module which
> > needs to be loaded from userland?
>
> That's clearly a "don't do that then" sort of thing; while we don't want
> to be unhelpful there are no guarantees with this approach.

That's not a "don't do that then" thing, because in this case it's
unreasonable to say that. The display subsystems like fbdev and
DRM represent quite a sizeable chunk:

- Base DRM is around 200k.
- DRM drivers are typically around 100k each.
- Base fbdev is around 100k.

It won't take long before you're into the territory of having a
significant portion of your kernel being display drivers of one
type or another, much of which won't be usable on any one specific
platform. So to say "don't build your display drivers as modules"
is an unreasonable position to take.

> Yes, exactly - all we're trying to do here is cover the 90% case, not
> solve all the possible problems, since as you say that's not achievable.
> There's a clear and reasonable desire to turn off resources we know
> aren't in use at the current time, but equally well doing so as soon as
> we start controlling the resources is pretty much guaranteed to introduce
> user-visible issues on some systems, so it's a question of picking some
> reasonable point after that.

I beg to differ on whether it's possible to solve it completely.

> Another option here which is more in tune with deferred probing and
> hotplugging would be to switch the delay over to be time based rather
> than initcall based; do the shutdown at some point based on the time the
> last resource was registered. That still won't cover everything,
> though we could make the delay tunable.

Yuck. That's a crap design. Really, time-based stuff is crap. I've seen
this too many times with the GNOME crap in Ubuntu 12.04 - where if you
boot it off an SD card it will complain that some applets failed to start
(and sure enough, half your panel is missing.) Boot it off eSATA and
it works 100% reliably.

Time-based stuff to guess when things have finished is never a good thing
and can never be reliable.

A better solution may be to avoid the problem in kernel space altogether.
That's already been done in the past with the scsi_wait_scan module. Make
the shutdown of stuff a separate loadable module which userspace can load
at the appropriate time to trigger the shutdown of unused resources. Or
provide a different method for userspace to trigger that action.

With that kind of solution, it is possible to know that the system has
finished booting (many userspace implementations already do this, with
things like not permitting login via the network until after the system
has finished booting, despite sshd et al. already being started.)
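[A minimal sketch of the scsi_wait_scan-style module Russell is suggesting. The cleanup_unused_boot_resources() hook is hypothetical - no such common entry point exists today; userspace would simply modprobe this once it decides booting has finished:

#include <linux/module.h>

/* Hypothetical hook into the clock/regulator cores. */
extern void cleanup_unused_boot_resources(void);

static int __init boot_cleanup_init(void)
{
        cleanup_unused_boot_resources();
        return 0;
}
module_init(boot_cleanup_init);

MODULE_LICENSE("GPL");
MODULE_DESCRIPTION("Trigger release of unclaimed boot-time resources");
]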
On Thu, May 09, 2013 at 04:07:50PM +0100, Russell King - ARM Linux wrote:
> On Thu, May 09, 2013 at 03:37:02PM +0100, Mark Brown wrote:
>
> > That's clearly a "don't do that then" sort of thing; while we don't want
> > to be unhelpful there are no guarantees with this approach.
>
> That's not a "don't do that then" thing, because in this case it's
> unreasonable to say that. The display subsystems like fbdev and
> DRM represent quite a sizeable chunk:
>
> - Base DRM is around 200k.
> - DRM drivers are typically around 100k each.
> - Base fbdev is around 100k.
>
> It won't take long before you're into the territory of having a
> significant portion of your kernel being display drivers of one
> type or another, much of which won't be usable on any one specific
> platform. So to say "don't build your display drivers as modules"
> is an unreasonable position to take.

Sure, it's unhelpful for distro-style kernels. Like I say, 90%.

> > Another option here which is more in tune with deferred probing and
> > hotplugging would be to switch the delay over to be time based rather
> > than initcall based; do the shutdown at some point based on the time the
> > last resource was registered. That still won't cover everything,
> > though we could make the delay tunable.
>
> Yuck. That's a crap design. Really, time-based stuff is crap. I've seen
> this too many times with the GNOME crap in Ubuntu 12.04 - where if you
> boot it off an SD card it will complain that some applets failed to start
> (and sure enough, half your panel is missing.) Boot it off eSATA and
> it works 100% reliably.

No argument here; there's a reason I immediately went to the "make it
tunable" fudge factor.

> A better solution may be to avoid the problem in kernel space altogether.
> That's already been done in the past with the scsi_wait_scan module. Make
> the shutdown of stuff a separate loadable module which userspace can load
> at the appropriate time to trigger the shutdown of unused resources. Or
> provide a different method for userspace to trigger that action.
>
> With that kind of solution, it is possible to know that the system has
> finished booting (many userspace implementations already do this, with
> things like not permitting login via the network until after the system
> has finished booting, despite sshd et al. already being started.)

This works fine for boot, but if we're going to solve this properly and
ask userspace to make changes, we probably ought to be trying to handle
the actual hotplug case (which the current bodge doesn't cope with at
all) as well. A similar tactic of asking for handshaking from userspace
on hotplug notifications should work, though it'd be a bit more hassle
to deploy.

Or perhaps, given that it should be simple for userspace and there are
probably other uses for the information, we just put the hook in there
anyway, even if this particular problem gets a better solution later on.
On Thu, May 9, 2013 at 3:14 PM, Russell King - ARM Linux <linux@arm.linux.org.uk> wrote:
> On Thu, May 09, 2013 at 02:50:17PM +0100, Mark Brown wrote:
>> On Thu, May 09, 2013 at 12:50:46PM +0100, Grant Likely wrote:
>>
>> > However, if a device shuts down resources after init has completed and
>> > then cannot turn those resources back on when another driver requests
>> > them, then it sounds like there is a bigger design problem. We're in a
>> > hotplug world, and most of the time a driver cannot assume that a
>> > resource will never get requested after initcalls have completed. It
>> > sounds like a design bug in the driver if it cannot handle that use case.
>>
>> Even if the driver copes fine it can still be desirable to avoid the
>> power down/up cycle if it involves some user-visible effect - things
>> like blinking the display off then on, for example. That said, I am a
>> little suspicious about this approach; it doesn't feel as robust as it
>> should to go round individual callers.
>
> What if the driver for something like your display is a module which
> needs to be loaded from userland?
>
> Where the design bug lies is in the "let's probe all the drivers and then
> shut down resources which drivers haven't claimed" approach. That contains
> an implied assumption: that all drivers have been loaded and probed at the
> point where you shut down those resources.
>
> That simply may not be true in today's kernel - it's not true for a start
> if you have modular drivers and you have built most of the drivers as
> modules (as a distro would want to with a single zImage). It's
> coincidentally not true if you have deferred probing and some drivers
> defer.
>
> The real problem is this: at what point has the system actually finished
> "booting" in the sense that all drivers for a platform have been
> initialised? With user-loadable modules and deferred probing, that's
> actually a very fuzzy concept.
>
> As you can't really tell when that point has been reached, how can you
> decide to shut down resources which "aren't being used" in a sane way
> while avoiding the down/up cycle? Basically, you can't.
>
> So, trying to solve the problem may be totally fruitless because you
> can't actually solve it - you can only put a sticky plaster over it and
> hope that it catches most of the problem. But the reality is that you
> can't have both a shutdown of unused resources _and_ avoidance of the
> down/up cycle.

+1. You can /minimize/ up-down cycles with the kind of optimization
that is being attempted here, but the driver still *must* deal with
bringing resources back up if they get requested "later than you would
otherwise like".

g.
--
Grant Likely, B.Sc., P.Eng.
Secret Lab Technologies Ltd.
On 05/09/2013 04:50 AM, Grant Likely wrote:
> On Thu, May 9, 2013 at 11:07 AM, Ming Lei <tom.leiming@gmail.com> wrote:
>> On Thu, May 9, 2013 at 1:18 PM, Saravana Kannan <skannan@codeaurora.org> wrote:
>>>
>>> The most obvious fallback of using late_initcall_sync() also doesn't work
>>> since the deferred probing work initiated during late_initcall() is done in
>>> a workqueue. So, frameworks that want to wait for all devices to finish
>>> probing during init will now have to wait for the deferred workqueue to
>>> finish its work. This patch adds a wait_for_init_deferred_probe_done() API
>>
>> flush_workqueue() has been added in deferred_probe_initcall(), so it looks
>> like it should already be OK for your problem, doesn't it?
>
> It looks like Saravana is using a kernel that already does that, based
> on object bb5645e from the diff. So if he is still having a problem,
> then there is probably another deferred probe that is triggered after
> the deferred probe late_initcall is executed. It would be good to know
> what driver is getting deferred past clearing the queue. Or has this
> been rebased from an earlier kernel? It may no longer be necessary.

Sorry, it was a mindless rebase late at night. I missed the addition of the
flush_workqueue(). That takes care of my immediate needs. Sorry for wasting
your time.

But the other patches to move the clock and regulator calls to
late_initcall_sync() should still be necessary, right? Or are we going to
depend on the Makefile ordering to determine the order of the late
initcalls? (I would rather not.)

But the rest of the discussion is still interesting and relevant, and
something I would like to see resolved too. I'll reply to that part of the
discussion in one of the later emails.

Thanks,
Saravana
On Thu, May 9, 2013 at 5:52 PM, Saravana Kannan <skannan@codeaurora.org> wrote:
> On 05/09/2013 04:50 AM, Grant Likely wrote:
>>
>> On Thu, May 9, 2013 at 11:07 AM, Ming Lei <tom.leiming@gmail.com> wrote:
>>>
>>> On Thu, May 9, 2013 at 1:18 PM, Saravana Kannan <skannan@codeaurora.org> wrote:
>>>>
>>>> The most obvious fallback of using late_initcall_sync() also doesn't work
>>>> since the deferred probing work initiated during late_initcall() is done
>>>> in a workqueue. So, frameworks that want to wait for all devices to finish
>>>> probing during init will now have to wait for the deferred workqueue to
>>>> finish its work. This patch adds a wait_for_init_deferred_probe_done() API
>>>
>>> flush_workqueue() has been added in deferred_probe_initcall(), so it looks
>>> like it should already be OK for your problem, doesn't it?
>>
>> It looks like Saravana is using a kernel that already does that, based
>> on object bb5645e from the diff. So if he is still having a problem,
>> then there is probably another deferred probe that is triggered after
>> the deferred probe late_initcall is executed. It would be good to know
>> what driver is getting deferred past clearing the queue. Or has this
>> been rebased from an earlier kernel? It may no longer be necessary.
>
> Sorry, it was a mindless rebase late at night. I missed the addition of the
> flush_workqueue(). That takes care of my immediate needs. Sorry for wasting
> your time.
>
> But the other patches to move the clock and regulator calls to
> late_initcall_sync() should still be necessary, right? Or are we going to
> depend on the Makefile ordering to determine the order of the late
> initcalls? (I would rather not.)

You'll need to make sure the regulator and clock cleanups are deferred to
after all the late initcalls are complete. That will ensure that the
deferred queue has been processed.

g.
On 05/09/2013 11:09 AM, Grant Likely wrote:
> On Thu, May 9, 2013 at 5:52 PM, Saravana Kannan <skannan@codeaurora.org> wrote:
>> On 05/09/2013 04:50 AM, Grant Likely wrote:
>>>
>>> On Thu, May 9, 2013 at 11:07 AM, Ming Lei <tom.leiming@gmail.com> wrote:
>>>>
>>>> On Thu, May 9, 2013 at 1:18 PM, Saravana Kannan <skannan@codeaurora.org> wrote:
>>>>>
>>>>> The most obvious fallback of using late_initcall_sync() also doesn't work
>>>>> since the deferred probing work initiated during late_initcall() is done
>>>>> in a workqueue. So, frameworks that want to wait for all devices to finish
>>>>> probing during init will now have to wait for the deferred workqueue to
>>>>> finish its work. This patch adds a wait_for_init_deferred_probe_done() API
>>>>
>>>> flush_workqueue() has been added in deferred_probe_initcall(), so it looks
>>>> like it should already be OK for your problem, doesn't it?
>>>
>>> It looks like Saravana is using a kernel that already does that, based
>>> on object bb5645e from the diff. So if he is still having a problem,
>>> then there is probably another deferred probe that is triggered after
>>> the deferred probe late_initcall is executed. It would be good to know
>>> what driver is getting deferred past clearing the queue. Or has this
>>> been rebased from an earlier kernel? It may no longer be necessary.
>>
>> Sorry, it was a mindless rebase late at night. I missed the addition of the
>> flush_workqueue(). That takes care of my immediate needs. Sorry for wasting
>> your time.
>>
>> But the other patches to move the clock and regulator calls to
>> late_initcall_sync() should still be necessary, right? Or are we going to
>> depend on the Makefile ordering to determine the order of the late
>> initcalls? (I would rather not.)
>
> You'll need to make sure the regulator and clock cleanups are deferred to
> after all the late initcalls are complete. That will ensure that the
> deferred queue has been processed.

Ok, then I guess I'll still need to send out the other two patches.

-Saravana
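[For illustration, the kind of change those other patches amount to - a sketch, not the actual patches. If memory serves, the regulator core's init-time cleanup lives in regulator_init_complete(); the body is elided here and the clock side would be analogous:

#include <linux/init.h>

/* Run the framework's unclaimed-resource cleanup after
 * deferred_probe_initcall() has flushed the init-time deferred queue. */
static int __init regulator_init_complete(void)
{
        /* ... disable boot-on regulators that no driver has claimed ... */
        return 0;
}
/* was: late_initcall(regulator_init_complete); */
late_initcall_sync(regulator_init_complete);
]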
On 16:07 Thu 09 May, Russell King - ARM Linux wrote:
> On Thu, May 09, 2013 at 03:37:02PM +0100, Mark Brown wrote:
> > On Thu, May 09, 2013 at 03:14:45PM +0100, Russell King - ARM Linux wrote:
> > > On Thu, May 09, 2013 at 02:50:17PM +0100, Mark Brown wrote:
> > > >
> > > > Even if the driver copes fine it can still be desirable to avoid the
> > > > power down/up cycle if it involves some user-visible effect - things
> > > > like blinking the display off then on, for example. That said, I am a
> > > > little suspicious about this approach; it doesn't feel as robust as it
> > > > should to go round individual callers.
> > >
> > > What if the driver for something like your display is a module which
> > > needs to be loaded from userland?
> >
> > That's clearly a "don't do that then" sort of thing; while we don't want
> > to be unhelpful there are no guarantees with this approach.
>
> That's not a "don't do that then" thing, because in this case it's
> unreasonable to say that. The display subsystems like fbdev and
> DRM represent quite a sizeable chunk:
>
> - Base DRM is around 200k.
> - DRM drivers are typically around 100k each.
> - Base fbdev is around 100k.
>
> It won't take long before you're into the territory of having a
> significant portion of your kernel being display drivers of one
> type or another, much of which won't be usable on any one specific
> platform. So to say "don't build your display drivers as modules"
> is an unreasonable position to take.
>
> > Yes, exactly - all we're trying to do here is cover the 90% case, not
> > solve all the possible problems, since as you say that's not achievable.
> > There's a clear and reasonable desire to turn off resources we know
> > aren't in use at the current time, but equally well doing so as soon as
> > we start controlling the resources is pretty much guaranteed to introduce
> > user-visible issues on some systems, so it's a question of picking some
> > reasonable point after that.
>
> I beg to differ on whether it's possible to solve it completely.
>
> > Another option here which is more in tune with deferred probing and
> > hotplugging would be to switch the delay over to be time based rather
> > than initcall based; do the shutdown at some point based on the time the
> > last resource was registered. That still won't cover everything,
> > though we could make the delay tunable.
>
> Yuck. That's a crap design. Really, time-based stuff is crap. I've seen
> this too many times with the GNOME crap in Ubuntu 12.04 - where if you
> boot it off an SD card it will complain that some applets failed to start
> (and sure enough, half your panel is missing.) Boot it off eSATA and
> it works 100% reliably.
>
> Time-based stuff to guess when things have finished is never a good thing
> and can never be reliable.
>
> A better solution may be to avoid the problem in kernel space altogether.
> That's already been done in the past with the scsi_wait_scan module. Make
> the shutdown of stuff a separate loadable module which userspace can load
> at the appropriate time to trigger the shutdown of unused resources. Or
> provide a different method for userspace to trigger that action.
>
> With that kind of solution, it is possible to know that the system has
> finished booting (many userspace implementations already do this, with
> things like not permitting login via the network until after the system
> has finished booting, despite sshd et al. already being started.)

I agree with Russell: your bootloader sets up the splash screen, and you
have not loaded the framebuffer or DRM driver yet. If you shut down the
clock, you lose the splash screen. It's a crap way to do it.

Best Regards,
J.
On Thu, May 09, 2013 at 05:39:03PM +0100, Grant Likely wrote:

> +1. You can /minimize/ up-down cycles with the kind of optimization
> that is being attempted here, but the driver still *must* deal with
> bringing resources back up if they get requested "later than you would
> otherwise like".

Now that I think about it we should probably have a debug option like
the shared IRQ testing one which does actually disable all resources as
soon as the kernel gains control of them in order to help with test
coverage for this.
diff --git a/drivers/base/dd.c b/drivers/base/dd.c
index bb5645e..bb2b9c6 100644
--- a/drivers/base/dd.c
+++ b/drivers/base/dd.c
@@ -52,6 +52,7 @@ static DEFINE_MUTEX(deferred_probe_mutex);
 static LIST_HEAD(deferred_probe_pending_list);
 static LIST_HEAD(deferred_probe_active_list);
 static struct workqueue_struct *deferred_wq;
+DECLARE_COMPLETION(init_def_probe_done);
 
 /**
  * deferred_probe_work_func() - Retry probing devices in the active list.
@@ -105,6 +106,7 @@ static void deferred_probe_work_func(struct work_struct *work)
 		put_device(dev);
 	}
 	mutex_unlock(&deferred_probe_mutex);
+	complete_all(&init_def_probe_done);
 }
 static DECLARE_WORK(deferred_probe_work, deferred_probe_work_func);
 
@@ -179,6 +181,12 @@ static int deferred_probe_initcall(void)
 }
 late_initcall(deferred_probe_initcall);
 
+void wait_for_init_deferred_probe_done(void)
+{
+	wait_for_completion(&init_def_probe_done);
+}
+EXPORT_SYMBOL_GPL(wait_for_init_deferred_probe_done);
+
 static void driver_bound(struct device *dev)
 {
 	if (klist_node_attached(&dev->p->knode_driver)) {
diff --git a/include/linux/device.h b/include/linux/device.h
index 9d6464e..5c557f7 100644
--- a/include/linux/device.h
+++ b/include/linux/device.h
@@ -247,6 +247,7 @@ extern struct device_driver *driver_find(const char *name,
 					 struct bus_type *bus);
 extern int driver_probe_done(void);
 extern void wait_for_device_probe(void);
+extern void wait_for_init_deferred_probe_done(void);
 
 /* sysfs interface for exporting driver attributes */
Kernel frameworks (e.g. regulator, clock, etc.) might want to do some cleanup
work (e.g. turn off unclaimed resources) after all devices are done probing
during kernel init. Before deferred probing was introduced, this was
typically done using a late_initcall(). That approach already assumed that
all drivers that are compiled in register in one of the earlier initcall
levels.

With the introduction of deferred probing, even if the assumption that all
compiled-in drivers register in one of the earlier initcalls is true, there
is no longer a guarantee that all their matching devices will have completed
probing by late_initcall(). This is because the deferred probing logic only
starts attempting the deferred probes in a late_initcall() function.

The most obvious fallback of using late_initcall_sync() also doesn't work,
since the deferred probing work initiated during late_initcall() is done in
a workqueue. So, frameworks that want to wait for all devices to finish
probing during init will now have to wait for the deferred workqueue to
finish its work. This patch adds a wait_for_init_deferred_probe_done() API
that can be called from late_initcall_sync() or a workqueue started from
late_initcall_sync().

Signed-off-by: Saravana Kannan <skannan@codeaurora.org>
---
 drivers/base/dd.c      |    8 ++++++++
 include/linux/device.h |    1 +
 2 files changed, 9 insertions(+), 0 deletions(-)
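[A hypothetical caller of the kind the changelog describes; foo_disable_unused() is illustrative and not part of the patch, while wait_for_init_deferred_probe_done() is the API added above:

#include <linux/device.h>
#include <linux/init.h>

static int __init foo_disable_unused(void)
{
        /* Returns once the init-time deferred-probe pass has completed,
         * so anything still unclaimed had no built-in consumer at boot. */
        wait_for_init_deferred_probe_done();

        /* ... turn off resources no driver claimed during boot ... */
        return 0;
}
late_initcall_sync(foo_disable_unused);
]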