Message ID | 2715729.9U1nlcpFb3@vostro.rjw.lan (mailing list archive) |
---|---|
State | Superseded, archived |
Delegated to: | Rafael Wysocki |
Headers | show |
Hi Rafael, sorry for not responding to v5 of your series earlier, just sending this out now in the hope that it reaches you before your travels. On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote: > - Modify device_links_check_suppliers(), device_links_driver_bound(), > device_links_no_driver(), device_links_driver_cleanup(), device_links_busy(), > and device_links_unbind_consumers() to walk link lists under device_links_lock > (to make the new "driver presence tracking" mechanism work reliably). This change might increase boot time if drivers return -EPROBE_DEFER. It can easily happen that such drivers are probed dozens of times, and each time the lock will be acquired, blocking other drivers from probing. Granted, it's hard to measure if and in how far boot time really increases, but somehow I liked the previous approach better. The combination of a mutex with an RCU plus a spinlock seemed complicated when I reviewed the patch, but I never said anything because I came to the conclusion that the effort seemed worthwhile to keep up the performance. (BTW, kbuild test robot has complained that the usage of RCUs isn't enclosed in #ifdef CONFIG_SRCU. Just in case you haven't seen that.) > - Redefine device_link_add() to take three arguments, the supplier and consumer > pointers and flags, and to figure out the initial link state automatically. That's a good idea, however it seems to me that there's some redundancy between the dl_dev_state and device_link_state: AFAICS, at any given moment the device_link_state can be computed from the supplier's and consumer's dl_dev_state. One could argue that caching the device_link_state is cheaper than recomputing it every time it's needed, but often the dl_dev_state of one of the two devices is known, so the link can only be in a subset of all possible states, and instead of computing the device_link_state it's sufficient to just look at the other device's dl_dev_state. E.g. in device_links_check_suppliers() we have this: + if (link->status != DL_STATE_AVAILABLE) { + device_links_missing_supplier(dev); + ret = -EPROBE_DEFER; + break; + } When this function is called we know that the consumer's dl_dev_state is DL_DEV_PROBING. Of interest is only the status of the supplier. The above if-condition would seem to be equivalent to: if (link->supplier->links.status == DL_DEV_DRIVER_BOUND) { So the device_link_state seems redundant, but if it's removed from struct device_link, changes to dl_dev_state need to be synchronized between supplier and consumers. Perhaps this could be accomplished by acquiring the device_lock for the other device. E.g. when a consumer wants to probe, before changing its own dl_dev_state it would have to acquire the device_lock for all suppliers to prevent races if one of them simultaneously decides to unbind (and changes its dl_dev_state to DL_DEV_UNBINDING). Hm, could this deadlock? Also, I notice that the dl_dev_state is part of a device's "links" structure, but being able to look up a device's driver state might be useful in other cases, not just for device links. Maybe it makes sense to move the dl_dev_state one level up to struct device and name the attribute "driver_status" or somesuch (like the existing "driver" and "driver_data" attributes). > + /* Deterine the initial link state. */ ^ m Thanks, Lukas -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Oct 26, 2016 at 01:19:02PM +0200, Lukas Wunner wrote: > Hi Rafael, > > sorry for not responding to v5 of your series earlier, just sending > this out now in the hope that it reaches you before your travels. > > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote: > > - Modify device_links_check_suppliers(), device_links_driver_bound(), > > device_links_no_driver(), device_links_driver_cleanup(), device_links_busy(), > > and device_links_unbind_consumers() to walk link lists under device_links_lock > > (to make the new "driver presence tracking" mechanism work reliably). > > This change might increase boot time if drivers return -EPROBE_DEFER. "might"? Please verify this before guessing.... And don't make this more complex than needed before actually determining a real issue. thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote: > +/* > + * Device link flags. > + * > + * STATELESS: The core won't track the presence of supplier/consumer drivers. > + * AUTOREMOVE: Remove this link automatically on consumer driver unbind. > + */ > +#define DL_FLAG_STATELESS (1 << 0) > +#define DL_FLAG_AUTOREMOVE (1 << 1) Minor nit, BIT()? I'll leave this for someone to send a checkpatch cleanup patch later on :) thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Oct 27, 2016 at 05:25:51PM +0200, Greg Kroah-Hartman wrote: > On Wed, Oct 26, 2016 at 01:19:02PM +0200, Lukas Wunner wrote: > > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote: > > > - Modify device_links_check_suppliers(), device_links_driver_bound(), > > > device_links_no_driver(), device_links_driver_cleanup(), device_links_busy(), > > > and device_links_unbind_consumers() to walk link lists under device_links_lock > > > (to make the new "driver presence tracking" mechanism work reliably). > > > > This change might increase boot time if drivers return -EPROBE_DEFER. > > "might"? Please verify this before guessing.... I can't, my machine only uses device links for Thunderbolt hotplug ports, and their driver (portdrv) doesn't use deferred probing. I'm only aware of a single device in my system whose driver causes others to defer probing (apple-gmux), but they don't use device links, so the mutex is only acquired very briefly because the supplier/consumer lists are empty, hence the performance impact can't be measured on my system. But the situation may be different for Marek's or Hanjun's use cases. I'm not familiar with their systems, hence the "might". > And don't make this more complex than needed before actually determining > a real issue. As pointed out it currently *is* more complex than needed because the device_link_state is dispensable. The whole issue is moot with the changes I suggested because the mutex would only have to be taken for addition/deletion of device links, not when probing. Thanks, Lukas -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Oct 27, 2016 at 05:25:51PM +0200, Greg Kroah-Hartman wrote: > On Wed, Oct 26, 2016 at 01:19:02PM +0200, Lukas Wunner wrote: > > Hi Rafael, > > > > sorry for not responding to v5 of your series earlier, just sending > > this out now in the hope that it reaches you before your travels. > > > > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote: > > > - Modify device_links_check_suppliers(), device_links_driver_bound(), > > > device_links_no_driver(), device_links_driver_cleanup(), device_links_busy(), > > > and device_links_unbind_consumers() to walk link lists under device_links_lock > > > (to make the new "driver presence tracking" mechanism work reliably). > > > > This change might increase boot time if drivers return -EPROBE_DEFER. > > "might"? Please verify this before guessing.... > > And don't make this more complex than needed before actually determining > a real issue. As clarified by Rafael at Plumbers, this functional dependencies framework assumes your driver / subsystem supports deferred probe, if it does not support its not clear what will happen.... We have no explicit semantics to check if a driver / subsystem supports deferred probe. Luis -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote: > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > Currently, there is a problem with taking functional dependencies > between devices into account. > > What I mean by a "functional dependency" is when the driver of device > B needs device A to be functional and (generally) its driver to be > present in order to work properly. This has certain consequences > for power management (suspend/resume and runtime PM ordering) and > shutdown ordering of these devices. In general, it also implies that > the driver of A needs to be working for B to be probed successfully > and it cannot be unbound from the device before the B's driver. > > Support for representing those functional dependencies between > devices is added here to allow the driver core to track them and act > on them in certain cases where applicable. > > The argument for doing that in the driver core is that there are > quite a few distinct use cases involving device dependencies, they > are relatively hard to get right in a driver (if one wants to > address all of them properly) and it only gets worse if multiplied > by the number of drivers potentially needing to do it. How many drivers actually *need this* today for suspend / resume ? How many of these are because of ACPI firmware bugs rather than some other reason ? Can ACPI somehow be used instead by these devices to address quirks? > Morever, at > least one case (asynchronous system suspend/resume) cannot be handled > in a single driver at all, because it requires the driver of A to > wait for B to suspend (during system suspend) and the driver of B to > wait for A to resume (during system resume). Why is this all of a sudden a new issue? It seems there is quite a bit of frameworks already out there that somehow deal with some sort of device ordering / dependencies, and I am curious if they've been addressing some of these problems for a while now on their own somehow with their own solutions, is that the case? For instance can DAPM the / DRM / Audio component framework v4l async solution be used or does it already address some of these concerns ? > For this reason, represent dependencies between devices as "links", > with the help of struct device_link objects each containing pointers > to the "linked" devices, a list node for each of them, status > information, flags, and an RCU head for synchronization. > > Also add two new list heads, representing the lists of links to the > devices that depend on the given one (consumers) and to the devices > depended on by it (suppliers), and a "driver presence status" field > (needed for figuring out initial states of device links) to struct > device. > > The entire data structure consisting of all of the lists of link > objects for all devices is protected by a mutex (for link object > addition/removal and for list walks during device driver probing > and removal) and by SRCU (for list walking in other case that will > be introduced by subsequent change sets). In addition, each link > object has an internal status field whose value reflects whether or > not drivers are bound to the devices pointed to by the link or > probing/removal of their drivers is in progress etc. That field > is only modified under the device links mutex, but it may be read > outside of it in some cases (introduced by subsequent change sets), > so modifications of it are annotated with WRITE_ONCE(). > > New links are added by calling device_link_add() which takes three > arguments: pointers to the devices in question and flags. In > particular, if DL_FLAG_STATELESS is set in the flags, the link status > is not to be taken into account for this link and the driver core > will not manage it. In turn, if DL_FLAG_AUTOREMOVE is set in the > flags, the driver core will remove the link automatically when the > consumer device driver unbinds from it. > > One of the actions carried out by device_link_add() is to reorder > the lists used for device shutdown and system suspend/resume to > put the consumer device along with all of its children and all of > its consumers (and so on, recursively) to the ends of those lists > in order to ensure the right ordering between all of the supplier > and consumer devices. There's no explanation as to why this order is ensured to be correct, I think its important to document this. From our discussions at Plumbers it seems the order is ensured due to the fact that order was already implicitly provided through platform firmware (ACPI enumeration is one), adjusting order on the dpm list is just shuffling order between consumer / provider, but nothing else. In theory it works because in the simple case this should suffice however I remain unconvinced that if we have more users of this framework this simple algorithm will suffice. Having a test driver or series of test drivers that shows this would be good. As it stands there is simply an assumption that this is correct, how *strongly* do you feel that the order will *always* be correct if this is done and no cycles can be added, even if tons of drivers start using this ? > For this reason, it is not possible to create a link between two > devices if the would-be supplier device already depends on the > would-be consumer device as either a direct descendant of it or a > consumer of one of its direct descendants or one of its consumers > and so on. > > There are two types of link objects, persistent and non-persistent. > The persistent ones stay around until one of the target devices is > deleted, while the non-persistent ones are removed automatically when > the consumer driver unbinds from its device (ie. they are assumed to > be valid only as long as the consumer device has a driver bound to > it). Persistent links are created by default and non-persistent > links are created when the DL_FLAG_AUTOREMOVE flag is passed > to device_link_add(). > > Both persistent and non-persistent device links can be deleted > with an explicit call to device_link_del(). > > Links created without the DL_FLAG_STATELESS flag set are managed > by the driver core using a simple state machine. There are 5 states > each link can be in: DORMANT (unused), AVAILABLE (the supplier driver > is present and functional), CONSUMER_PROBE (the consumer driver is > probing), ACTIVE (both supplier and consumer drivers are present and > functional), and SUPPLIER_UNBIND (the supplier driver is unbinding). > The driver core updates the link state automatically depending on > what happens to the linked devices and for each link state specific > actions are taken in addition to that. > > For example, if the supplier driver unbinds from its device, the > driver core will also unbind the drivers of all of its consumers > automatically under the assumption that they cannot function > properly without the supplier. Analogously, the driver core will > only allow the consumer driver to bind to its device if the > supplier driver is present and functional (ie. the link is in > the AVAILABLE state). If that's not the case, it will rely on > the existing deferred probing mechanism to wait for the supplier > driver to become available. This assumes drivers and subsystems support deferred probe, is there a way to ensure that drivers that use this framework support it instead of possibly breaking them if they use this framework ? Luis -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Nov 07, 2016 at 10:22:50PM +0100, Luis R. Rodriguez wrote: > We have no explicit semantics to check if a driver / subsystem > supports deferred probe. Why would we need such a thing? thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Nov 08, 2016 at 07:45:41AM +0100, Greg Kroah-Hartman wrote: > On Mon, Nov 07, 2016 at 10:22:50PM +0100, Luis R. Rodriguez wrote: > > We have no explicit semantics to check if a driver / subsystem > > supports deferred probe. > > Why would we need such a thing? It depends on the impact of a driver/subsystem not properly supporting deffered probe, if this is no-op then such a need is not critical but would be good to proactively inform developers / users so they avoid its use, if this will cause issues its perhaps best to make this a no-op through a check. AFAICT reviewing implications of not supporting deferred probe on drivers/subsytsems for this framework is not clearly spelled out, if we start considering re-using this framework for probe ordering I'd hate to see issues come up without this corner case being concretely considered. Furthermore -- how does this framework compare to Andrzej's resource tracking solution? I confess I have not had a chance yet to review yet but in light of this question it would be good to know if Andrzej's framework also requires deferred probe as similar concerns would exist there as well. Luis -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Nov 08, 2016 at 08:21:04PM +0100, Luis R. Rodriguez wrote: > On Tue, Nov 08, 2016 at 07:45:41AM +0100, Greg Kroah-Hartman wrote: > > On Mon, Nov 07, 2016 at 10:22:50PM +0100, Luis R. Rodriguez wrote: > > > We have no explicit semantics to check if a driver / subsystem > > > supports deferred probe. > > > > Why would we need such a thing? > > It depends on the impact of a driver/subsystem not properly supporting > deffered probe, if this is no-op then such a need is not critical but > would be good to proactively inform developers / users so they avoid > its use, if this will cause issues its perhaps best to make this a > no-op through a check. AFAICT reviewing implications of not supporting > deferred probe on drivers/subsytsems for this framework is not clearly > spelled out, if we start considering re-using this framework for probe > ordering I'd hate to see issues come up without this corner case being > concretely considered. It should not matter to the driver core if a subsystem, or a driver, supports or does not support deferred probe. It's a quick and simple solution to a complex problem that works well. Yes, you can iterate a lot of times, but that's fine, we have time at boot to do that (and really, it is fast.) > Furthermore -- how does this framework compare to Andrzej's resource tracking > solution? I confess I have not had a chance yet to review yet but in light of > this question it would be good to know if Andrzej's framework also requires > deferred probe as similar concerns would exist there as well. I have no idea what "framework" you are talking about here, do you have a pointer to patches? thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Nov 08, 2016 at 08:43:35PM +0100, Greg Kroah-Hartman wrote: > On Tue, Nov 08, 2016 at 08:21:04PM +0100, Luis R. Rodriguez wrote: > > On Tue, Nov 08, 2016 at 07:45:41AM +0100, Greg Kroah-Hartman wrote: > > > On Mon, Nov 07, 2016 at 10:22:50PM +0100, Luis R. Rodriguez wrote: > > > > We have no explicit semantics to check if a driver / subsystem > > > > supports deferred probe. > > > > > > Why would we need such a thing? > > > > It depends on the impact of a driver/subsystem not properly supporting > > deffered probe, if this is no-op then such a need is not critical but > > would be good to proactively inform developers / users so they avoid > > its use, if this will cause issues its perhaps best to make this a > > no-op through a check. AFAICT reviewing implications of not supporting > > deferred probe on drivers/subsytsems for this framework is not clearly > > spelled out, if we start considering re-using this framework for probe > > ordering I'd hate to see issues come up without this corner case being > > concretely considered. > > It should not matter to the driver core if a subsystem, or a driver, > supports or does not support deferred probe. That was my impression as well -- however Rafael noted this as an area worth highlighting. So perhaps he can elaborate. But at least as per my notes I do have here Geert Uytterhoeven reminding us that: "Some drivers / subsystems don’t support deferred probe yet, such failures usually don’t blow up, but cause subtle malfunctioning. Example, an Ethernet phy could not get its interrupt as the primary IRQ chip had not been probed yet, it reverted to polling though. Sub-optimal." [0] [0] https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-August/003425.html This sounds more like the existing deffered probe solution has detrimental unexpected effects on some drivers / subsystems -- its not clear *why*, but its worth reviewing if there are other drivers and seeing if we need this annotation. > It's a quick and simple solution to a complex problem that works well. We all agree. The recent discussions over probe ordering are simply optimization considerations -- should the time incurred to use deferred probe be X and the time incurred when using an alternative is Y and we determine the time Y is < X we have a winning alternative. Part of my own big issue with deferred probe is a) its extremely non-deterministic, b) it seems some folks have thought about similar problems and we might really be able to do better. I'm not convinced that the functional dependencies patches are the panacea for probe ordering, and while it was not intended for that, some folks are assuming it could be. To really vet this prospect we must really consider what other subsystem have done and review other alternatives efforts. > Yes, you can iterate a > lot of times, but that's fine, we have time at boot to do that (and > really, it is fast.) Deferred probe is left for late_initcall() [1] -- this *assumes* that the driver/subsystem can be loaded so late, and as per Andrzej this solution isunacceptable/undesirable. So it would be unfair and incorrect to categorize all drivers in the same boat, in fact given this lone fact it may be worth revisiting the idea I mentioned about a capability aspect to support a late deferred probe later. We tend to assume its fine, however it does not seem to be the case. [1] drivers/base/dd.c:late_initcall(deferred_probe_initcall); > > Furthermore -- how does this framework compare to Andrzej's resource tracking > > solution? I confess I have not had a chance yet to review yet but in light of > > this question it would be good to know if Andrzej's framework also requires > > deferred probe as similar concerns would exist there as well. > > I have no idea what "framework" you are talking about here, do you have > a pointer to patches? I'm surprised given Andrzej did both Cc you on his patches [2] *and* chimed in on Rafael's patches to indicate that we likely can integrate PM concerns into his own "framework" [3]. There was no resolution to this discussion, however its not IMHO sufficient to brush off Andrzej's points in particular because Andrzej *is* indicating that his framework: - Eliminates deferred probe and resulting late_initcall(), consumer registers callbacks informing when given resources (clock, regulator, etc) becomes available - Properly handle resource disappearance (driver unbind, hotplug) - Track resources which are not vital to the device, but can influence behavior - Offers simplified resource allocation - Can be easily expanded to help with power management Granted I have not reviewed this yet but it at least was on my radar, and I do believe its worth reviewing this further given the generally expressed interest to see if we can have a common framework to address both ordering problems, suspend and probe. At a quick glance the "ghost provider" idea seems like a rather crazy idea but hey, there may be some goods in there. It was sad both Andrzej and yourself could not attend the complex dependencies tracks -- I think it would have been useful. I don't expect us to address a resolution to probe ordering immediately -- but I am in the hopes we at least can keep an open mind about the similarity of the problems and see if we can aim for a clean elegant solution that might help both. [2] https://lwn.net/Articles/625454/ [3] http://thread.gmane.org/gmane.linux.kernel/2087152 Luis -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Nov 08, 2016 at 09:58:24PM +0100, Luis R. Rodriguez wrote: > > > Furthermore -- how does this framework compare to Andrzej's resource tracking > > > solution? I confess I have not had a chance yet to review yet but in light of > > > this question it would be good to know if Andrzej's framework also requires > > > deferred probe as similar concerns would exist there as well. > > > > I have no idea what "framework" you are talking about here, do you have > > a pointer to patches? > > I'm surprised given Andrzej did both Cc you on his patches [2] *and* chimed > in on Rafael's patches to indicate that we likely can integrate PM concerns > into his own "framework" [3]. There was no resolution to this discussion, however > its not IMHO sufficient to brush off Andrzej's points in particular because > Andrzej *is* indicating that his framework: Dude, those patches were from 2014! I can't remember patches people sent to me a month ago... > - Eliminates deferred probe and resulting late_initcall(), consumer registers > callbacks informing when given resources (clock, regulator, etc) becomes > available > - Properly handle resource disappearance (driver unbind, hotplug) > - Track resources which are not vital to the device, but can influence behavior > - Offers simplified resource allocation > - Can be easily expanded to help with power management > > Granted I have not reviewed this yet but it at least was on my radar, and > I do believe its worth reviewing this further given the generally expressed > interest to see if we can have a common framework to address both ordering > problems, suspend and probe. At a quick glance the "ghost provider" idea > seems like a rather crazy idea but hey, there may be some goods in there. From what I remember, and I could be totally wrong, these patches were way too complex and required that every subsystem change their interfaces. That's not going to work out well, but read the email threads for the details... > It was sad both Andrzej and yourself could not attend the complex dependencies > tracks -- I think it would have been useful. Sometimes real-life gets in the way of work, sorry :( > I don't expect us to address a > resolution to probe ordering immediately -- but I am in the hopes we at least > can keep an open mind about the similarity of the problems and see if we can > aim for a clean elegant solution that might help both. I'll always review patches of what people come up with. thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 09.11.2016 07:45, Greg Kroah-Hartman wrote: > On Tue, Nov 08, 2016 at 09:58:24PM +0100, Luis R. Rodriguez wrote: >>>> Furthermore -- how does this framework compare to Andrzej's resource tracking >>>> solution? I confess I have not had a chance yet to review yet but in light of >>>> this question it would be good to know if Andrzej's framework also requires >>>> deferred probe as similar concerns would exist there as well. >>> I have no idea what "framework" you are talking about here, do you have >>> a pointer to patches? >> I'm surprised given Andrzej did both Cc you on his patches [2] *and* chimed >> in on Rafael's patches to indicate that we likely can integrate PM concerns >> into his own "framework" [3]. There was no resolution to this discussion, however >> its not IMHO sufficient to brush off Andrzej's points in particular because >> Andrzej *is* indicating that his framework: > Dude, those patches were from 2014! I can't remember patches people > sent to me a month ago... > >> - Eliminates deferred probe and resulting late_initcall(), consumer registers >> callbacks informing when given resources (clock, regulator, etc) becomes >> available >> - Properly handle resource disappearance (driver unbind, hotplug) >> - Track resources which are not vital to the device, but can influence behavior >> - Offers simplified resource allocation >> - Can be easily expanded to help with power management >> >> Granted I have not reviewed this yet but it at least was on my radar, and >> I do believe its worth reviewing this further given the generally expressed >> interest to see if we can have a common framework to address both ordering >> problems, suspend and probe. At a quick glance the "ghost provider" idea >> seems like a rather crazy idea but hey, there may be some goods in there. > >From what I remember, and I could be totally wrong, these patches were > way too complex and required that every subsystem change their > interfaces. That's not going to work out well, but read the email > threads for the details... I haven't seen your comment on my patches, except few general questions regarding one of earlier version of the framework. So maybe you are talking about different framework. Regarding complexity, if the subsystem have simple way of '(un)publishing' resources it just adds single calls to restrack core: restrack_up, restrack_down in proper places. Additionally it adds quite simple stuff to encapsulate resource description and allocation routines into generic *_restrack_desc structure, see for example patch adding restrack to phy framework[1]. [1]: https://lists.freedesktop.org/archives/dri-devel/2014-December/073759.html Regards Andrzej > >> It was sad both Andrzej and yourself could not attend the complex dependencies >> tracks -- I think it would have been useful. > Sometimes real-life gets in the way of work, sorry :( > >> I don't expect us to address a >> resolution to probe ordering immediately -- but I am in the hopes we at least >> can keep an open mind about the similarity of the problems and see if we can >> aim for a clean elegant solution that might help both. > I'll always review patches of what people come up with. > > thanks, > > greg k-h > > > -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Nov 09, 2016 at 10:36:54AM +0100, Andrzej Hajda wrote: > On 09.11.2016 07:45, Greg Kroah-Hartman wrote: > > On Tue, Nov 08, 2016 at 09:58:24PM +0100, Luis R. Rodriguez wrote: > >>>> Furthermore -- how does this framework compare to Andrzej's resource tracking > >>>> solution? I confess I have not had a chance yet to review yet but in light of > >>>> this question it would be good to know if Andrzej's framework also requires > >>>> deferred probe as similar concerns would exist there as well. > >>> I have no idea what "framework" you are talking about here, do you have > >>> a pointer to patches? > >> I'm surprised given Andrzej did both Cc you on his patches [2] *and* chimed > >> in on Rafael's patches to indicate that we likely can integrate PM concerns > >> into his own "framework" [3]. There was no resolution to this discussion, however > >> its not IMHO sufficient to brush off Andrzej's points in particular because > >> Andrzej *is* indicating that his framework: > > Dude, those patches were from 2014! I can't remember patches people > > sent to me a month ago... > > > >> - Eliminates deferred probe and resulting late_initcall(), consumer registers > >> callbacks informing when given resources (clock, regulator, etc) becomes > >> available > >> - Properly handle resource disappearance (driver unbind, hotplug) > >> - Track resources which are not vital to the device, but can influence behavior > >> - Offers simplified resource allocation > >> - Can be easily expanded to help with power management > >> > >> Granted I have not reviewed this yet but it at least was on my radar, and > >> I do believe its worth reviewing this further given the generally expressed > >> interest to see if we can have a common framework to address both ordering > >> problems, suspend and probe. At a quick glance the "ghost provider" idea > >> seems like a rather crazy idea but hey, there may be some goods in there. > > >From what I remember, and I could be totally wrong, these patches were > > way too complex and required that every subsystem change their > > interfaces. That's not going to work out well, but read the email > > threads for the details... > > I haven't seen your comment on my patches, except few general questions > regarding one of earlier version of the framework. > So maybe you are talking about different framework. > > Regarding complexity, if the subsystem have simple way of > '(un)publishing' resources it just adds single calls to restrack core: > restrack_up, restrack_down in proper places. > Additionally it adds quite simple stuff to encapsulate resource > description and allocation routines into generic *_restrack_desc > structure, see for example patch adding restrack to phy framework[1]. Ok, again, I have no idea what my response was to a 2 year-old patchset, again, I can't remember my response to a patchset that was sent just a month ago... update it, and repost and we can all go from there if you think it is a viable solution. thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Nov 7, 2016 at 10:22 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote: > On Thu, Oct 27, 2016 at 05:25:51PM +0200, Greg Kroah-Hartman wrote: >> On Wed, Oct 26, 2016 at 01:19:02PM +0200, Lukas Wunner wrote: >> > Hi Rafael, >> > >> > sorry for not responding to v5 of your series earlier, just sending >> > this out now in the hope that it reaches you before your travels. >> > >> > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote: >> > > - Modify device_links_check_suppliers(), device_links_driver_bound(), >> > > device_links_no_driver(), device_links_driver_cleanup(), device_links_busy(), >> > > and device_links_unbind_consumers() to walk link lists under device_links_lock >> > > (to make the new "driver presence tracking" mechanism work reliably). >> > >> > This change might increase boot time if drivers return -EPROBE_DEFER. >> >> "might"? Please verify this before guessing.... >> >> And don't make this more complex than needed before actually determining >> a real issue. > > As clarified by Rafael at Plumbers, this functional dependencies > framework assumes your driver / subsystem supports deferred probe, It isn't particularly clear what you mean by "support" here. I guess that you mean that it will allow the ->probe callback to be invoked for multiple times for the same device/driver combination without issues. If that's the case, the way the new code uses -EPROBE_DEFER doesn't interfere with this, because it will not invoke the ->probe callbacks for consumers at all until their (required) suppliers are ready. > if it does not support its not clear what will happen.... I don't see any problems here, but if you see any, please just say what they are. > We have no explicit semantics to check if a driver / subsystem > supports deferred probe. That's correct, but then do we need it? Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, Nov 9, 2016 at 4:43 PM, Rafael J. Wysocki <rafael@kernel.org> wrote: > On Mon, Nov 7, 2016 at 10:22 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote: >> On Thu, Oct 27, 2016 at 05:25:51PM +0200, Greg Kroah-Hartman wrote: >>> On Wed, Oct 26, 2016 at 01:19:02PM +0200, Lukas Wunner wrote: >>> > Hi Rafael, >>> > >>> > sorry for not responding to v5 of your series earlier, just sending >>> > this out now in the hope that it reaches you before your travels. >>> > >>> > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote: >>> > > - Modify device_links_check_suppliers(), device_links_driver_bound(), >>> > > device_links_no_driver(), device_links_driver_cleanup(), device_links_busy(), >>> > > and device_links_unbind_consumers() to walk link lists under device_links_lock >>> > > (to make the new "driver presence tracking" mechanism work reliably). >>> > >>> > This change might increase boot time if drivers return -EPROBE_DEFER. >>> >>> "might"? Please verify this before guessing.... >>> >>> And don't make this more complex than needed before actually determining >>> a real issue. >> >> As clarified by Rafael at Plumbers, this functional dependencies >> framework assumes your driver / subsystem supports deferred probe, > > It isn't particularly clear what you mean by "support" here. I noted some folks had reported issues, and you acknowledged that if deferred probe was used in some drivers and if this created an issue the same issue would be seen with this framework. AFAICT there are two possible issues to consider: 1) the one Geert Uytterhoeven noted. Again I'll note what he had mentioned [0]. "Some drivers / subsystems don’t support deferred probe yet, such failures usually don’t blow up, but cause subtle malfunctioning. Example, an Ethernet phy could not get its interrupt as the primary IRQ chip had not been probed yet, it reverted to polling though. Sub-optimal." [0] [0] https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-August/003425.html Geert can you provide more details? 2) Since deferred probe relies on late_initcall() if your driver must load earlier than this deferred probe can create an issue. Andrzej had you identified a driver that ran into this and had issues ? If not this seems like a semantics thing we should consider in extending the documentation for drivers so that driver writers are aware of this limitation. I would suppose candidates for this would be anything not using module_init() or late_initcall() on their inits and have a probe. >> if it does not support its not clear what will happen.... > > I don't see any problems here, but if you see any, please just say > what they are. > >> We have no explicit semantics to check if a driver / subsystem >> supports deferred probe. > > That's correct, but then do we need it? We can determine this by reviewing the two items above. Luis -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Nov 7, 2016 at 10:39 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote: > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote: >> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> >> >> Currently, there is a problem with taking functional dependencies >> between devices into account. >> >> What I mean by a "functional dependency" is when the driver of device >> B needs device A to be functional and (generally) its driver to be >> present in order to work properly. This has certain consequences >> for power management (suspend/resume and runtime PM ordering) and >> shutdown ordering of these devices. In general, it also implies that >> the driver of A needs to be working for B to be probed successfully >> and it cannot be unbound from the device before the B's driver. >> >> Support for representing those functional dependencies between >> devices is added here to allow the driver core to track them and act >> on them in certain cases where applicable. >> >> The argument for doing that in the driver core is that there are >> quite a few distinct use cases involving device dependencies, they >> are relatively hard to get right in a driver (if one wants to >> address all of them properly) and it only gets worse if multiplied >> by the number of drivers potentially needing to do it. > > How many drivers actually *need this* today for suspend / resume ? Admittedly, I don't have a list, but from discussions I had in the past it looked like at least a dozen. Plus, there is this pesky ACPI _DEP thing that in some cases is actually correct and (before this patchest) there is no way to follow it (a classical example here is when ACPI power management methods on one device are implemented in terms of I2C accesses and require a specific I2C controller to be present and functional for the PM of the device in question to work, but there are other cases like that). > How many of these are because of ACPI firmware bugs rather than > some other reason ? Two things here (as I said elsewhere). There is the order in which operations (eg. async suspend resume of devices during system-wide state transitions) are started and there is the order in which they are run/completed. Even if the former is right, the latter still may not be correct and there needs to be a generic way to make that happen. The "firmware bugs" above affect the first item and even if it is entirely correct (ie. all PM operations for all devices are always started in the right order, consumers before suppliers or the other way around depending on whether this is suspend or resume), there still is the second item to address. Apart from this, even if the information provided by the firmware does not lead to a device registration ordering that will produce the correct dpm_list ordering (for example), it may still not be clear whether or not that's a bug in the firmware. The device registration ordering itself is just not sufficient in general. > Can ACPI somehow be used instead by these devices to address quirks? > >> Morever, at >> least one case (asynchronous system suspend/resume) cannot be handled >> in a single driver at all, because it requires the driver of A to >> wait for B to suspend (during system suspend) and the driver of B to >> wait for A to resume (during system resume). > > Why is this all of a sudden a new issue? It seems there is quite a bit of > frameworks already out there that somehow deal with some sort of device > ordering / dependencies, and I am curious if they've been addressing some of > these problems for a while now on their own somehow with their own solutions, > is that the case? For instance can DAPM the / DRM / Audio component framework > v4l async solution be used or does it already address some of these concerns ? None of them don't address all of the problems involved in a sufficiently generic way to my knowledge. >> For this reason, represent dependencies between devices as "links", >> with the help of struct device_link objects each containing pointers >> to the "linked" devices, a list node for each of them, status >> information, flags, and an RCU head for synchronization. >> >> Also add two new list heads, representing the lists of links to the >> devices that depend on the given one (consumers) and to the devices >> depended on by it (suppliers), and a "driver presence status" field >> (needed for figuring out initial states of device links) to struct >> device. >> >> The entire data structure consisting of all of the lists of link >> objects for all devices is protected by a mutex (for link object >> addition/removal and for list walks during device driver probing >> and removal) and by SRCU (for list walking in other case that will >> be introduced by subsequent change sets). In addition, each link >> object has an internal status field whose value reflects whether or >> not drivers are bound to the devices pointed to by the link or >> probing/removal of their drivers is in progress etc. That field >> is only modified under the device links mutex, but it may be read >> outside of it in some cases (introduced by subsequent change sets), >> so modifications of it are annotated with WRITE_ONCE(). >> >> New links are added by calling device_link_add() which takes three >> arguments: pointers to the devices in question and flags. In >> particular, if DL_FLAG_STATELESS is set in the flags, the link status >> is not to be taken into account for this link and the driver core >> will not manage it. In turn, if DL_FLAG_AUTOREMOVE is set in the >> flags, the driver core will remove the link automatically when the >> consumer device driver unbinds from it. >> >> One of the actions carried out by device_link_add() is to reorder >> the lists used for device shutdown and system suspend/resume to >> put the consumer device along with all of its children and all of >> its consumers (and so on, recursively) to the ends of those lists >> in order to ensure the right ordering between all of the supplier >> and consumer devices. > > There's no explanation as to why this order is ensured to be > correct, I think its important to document this. Agreed. There is a documentation work in progress (started by Lukas). We will get to that point. > From our discussions > at Plumbers it seems the order is ensured due to the fact that order > was already implicitly provided through platform firmware (ACPI > enumeration is one), adjusting order on the dpm list is just shuffling > order between consumer / provider, but nothing else. That basically is the case. > In theory it > works because in the simple case this should suffice however I > remain unconvinced that if we have more users of this framework this > simple algorithm will suffice. Having a test driver or series of > test drivers that shows this would be good. As it stands there is simply > an assumption that this is correct, how *strongly* do you feel that > the order will *always* be correct if this is done and no cycles > can be added, even if tons of drivers start using this ? Cycles cannot be added because of the way the code works. It will just return NULL from device_link_add() if it detects a cycle (that it cannot deal with). >> For this reason, it is not possible to create a link between two >> devices if the would-be supplier device already depends on the >> would-be consumer device as either a direct descendant of it or a >> consumer of one of its direct descendants or one of its consumers >> and so on. >> >> There are two types of link objects, persistent and non-persistent. >> The persistent ones stay around until one of the target devices is >> deleted, while the non-persistent ones are removed automatically when >> the consumer driver unbinds from its device (ie. they are assumed to >> be valid only as long as the consumer device has a driver bound to >> it). Persistent links are created by default and non-persistent >> links are created when the DL_FLAG_AUTOREMOVE flag is passed >> to device_link_add(). >> >> Both persistent and non-persistent device links can be deleted >> with an explicit call to device_link_del(). >> >> Links created without the DL_FLAG_STATELESS flag set are managed >> by the driver core using a simple state machine. There are 5 states >> each link can be in: DORMANT (unused), AVAILABLE (the supplier driver >> is present and functional), CONSUMER_PROBE (the consumer driver is >> probing), ACTIVE (both supplier and consumer drivers are present and >> functional), and SUPPLIER_UNBIND (the supplier driver is unbinding). >> The driver core updates the link state automatically depending on >> what happens to the linked devices and for each link state specific >> actions are taken in addition to that. >> >> For example, if the supplier driver unbinds from its device, the >> driver core will also unbind the drivers of all of its consumers >> automatically under the assumption that they cannot function >> properly without the supplier. Analogously, the driver core will >> only allow the consumer driver to bind to its device if the >> supplier driver is present and functional (ie. the link is in >> the AVAILABLE state). If that's not the case, it will rely on >> the existing deferred probing mechanism to wait for the supplier >> driver to become available. > > This assumes drivers and subsystems support deferred probe, No, I don't think so. Where does that assumption matter, exactly? > is there a way to ensure that drivers that use this framework support > it instead of possibly breaking them if they use this framework ? The whole driver/device binding tracking thing is not mandatory in the first place. You can just tell the framework to ignore that part and just use the links for PM and shutdown. Thanks, Rafael -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Luis, On Monday 07 Nov 2016 22:39:54 Luis R. Rodriguez wrote: > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote: > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > Currently, there is a problem with taking functional dependencies > > between devices into account. > > > > What I mean by a "functional dependency" is when the driver of device > > B needs device A to be functional and (generally) its driver to be > > present in order to work properly. This has certain consequences > > for power management (suspend/resume and runtime PM ordering) and > > shutdown ordering of these devices. In general, it also implies that > > the driver of A needs to be working for B to be probed successfully > > and it cannot be unbound from the device before the B's driver. > > > > Support for representing those functional dependencies between > > devices is added here to allow the driver core to track them and act > > on them in certain cases where applicable. > > > > The argument for doing that in the driver core is that there are > > quite a few distinct use cases involving device dependencies, they > > are relatively hard to get right in a driver (if one wants to > > address all of them properly) and it only gets worse if multiplied > > by the number of drivers potentially needing to do it. > > How many drivers actually *need this* today for suspend / resume ? I don't think there's a list, but just speaking for Renesas R-Car platforms such a dependency exists from all DMA bus masters to the system IOMMUs (we're talking about dozens of devices here), as well as from the display, video codec and video processing IP cores to transparent memory access IP cores that handle burst access and compression/decompression. > How many of these are because of ACPI firmware bugs rather than > some other reason ? None from the list above, even with s/ACPI/DT/. > Can ACPI somehow be used instead by these devices to address quirks? Certainly not on ARM embedded platforms :-) > > Morever, at least one case (asynchronous system suspend/resume) cannot be > > handled in a single driver at all, because it requires the driver of A to > > wait for B to suspend (during system suspend) and the driver of B to > > wait for A to resume (during system resume). > > Why is this all of a sudden a new issue? It's not a new issue, we've just never addressed it properly so far. Various workarounds existed, with various level of success (usually more for runtime PM than system suspend/resume). > It seems there is quite a bit of frameworks already out there that somehow > deal with some sort of device ordering / dependencies, and I am curious if > they've been addressing some of these problems for a while now on their own > somehow with their own solutions, is that the case? For instance can DAPM > the / DRM / Audio component framework v4l async solution be used or does it > already address some of these concerns ? > > > For this reason, represent dependencies between devices as "links", > > with the help of struct device_link objects each containing pointers > > to the "linked" devices, a list node for each of them, status > > information, flags, and an RCU head for synchronization. > > > > Also add two new list heads, representing the lists of links to the > > devices that depend on the given one (consumers) and to the devices > > depended on by it (suppliers), and a "driver presence status" field > > (needed for figuring out initial states of device links) to struct > > device. > > > > The entire data structure consisting of all of the lists of link > > objects for all devices is protected by a mutex (for link object > > addition/removal and for list walks during device driver probing > > and removal) and by SRCU (for list walking in other case that will > > be introduced by subsequent change sets). In addition, each link > > object has an internal status field whose value reflects whether or > > not drivers are bound to the devices pointed to by the link or > > probing/removal of their drivers is in progress etc. That field > > is only modified under the device links mutex, but it may be read > > outside of it in some cases (introduced by subsequent change sets), > > so modifications of it are annotated with WRITE_ONCE(). > > > > New links are added by calling device_link_add() which takes three > > arguments: pointers to the devices in question and flags. In > > particular, if DL_FLAG_STATELESS is set in the flags, the link status > > is not to be taken into account for this link and the driver core > > will not manage it. In turn, if DL_FLAG_AUTOREMOVE is set in the > > flags, the driver core will remove the link automatically when the > > consumer device driver unbinds from it. > > > > One of the actions carried out by device_link_add() is to reorder > > the lists used for device shutdown and system suspend/resume to > > put the consumer device along with all of its children and all of > > its consumers (and so on, recursively) to the ends of those lists > > in order to ensure the right ordering between all of the supplier > > and consumer devices. > > There's no explanation as to why this order is ensured to be > correct, I think its important to document this. From our discussions > at Plumbers it seems the order is ensured due to the fact that order > was already implicitly provided through platform firmware (ACPI > enumeration is one), adjusting order on the dpm list is just shuffling > order between consumer / provider, but nothing else. In theory it > works because in the simple case this should suffice however I > remain unconvinced that if we have more users of this framework this > simple algorithm will suffice. Having a test driver or series of > test drivers that shows this would be good. As it stands there is simply > an assumption that this is correct, how *strongly* do you feel that > the order will *always* be correct if this is done and no cycles > can be added, even if tons of drivers start using this ? > > > For this reason, it is not possible to create a link between two > > devices if the would-be supplier device already depends on the > > would-be consumer device as either a direct descendant of it or a > > consumer of one of its direct descendants or one of its consumers > > and so on. > > > > There are two types of link objects, persistent and non-persistent. > > The persistent ones stay around until one of the target devices is > > deleted, while the non-persistent ones are removed automatically when > > the consumer driver unbinds from its device (ie. they are assumed to > > be valid only as long as the consumer device has a driver bound to > > it). Persistent links are created by default and non-persistent > > links are created when the DL_FLAG_AUTOREMOVE flag is passed > > to device_link_add(). > > > > Both persistent and non-persistent device links can be deleted > > with an explicit call to device_link_del(). > > > > Links created without the DL_FLAG_STATELESS flag set are managed > > by the driver core using a simple state machine. There are 5 states > > each link can be in: DORMANT (unused), AVAILABLE (the supplier driver > > is present and functional), CONSUMER_PROBE (the consumer driver is > > probing), ACTIVE (both supplier and consumer drivers are present and > > functional), and SUPPLIER_UNBIND (the supplier driver is unbinding). > > The driver core updates the link state automatically depending on > > what happens to the linked devices and for each link state specific > > actions are taken in addition to that. > > > > For example, if the supplier driver unbinds from its device, the > > driver core will also unbind the drivers of all of its consumers > > automatically under the assumption that they cannot function > > properly without the supplier. Analogously, the driver core will > > only allow the consumer driver to bind to its device if the > > supplier driver is present and functional (ie. the link is in > > the AVAILABLE state). If that's not the case, it will rely on > > the existing deferred probing mechanism to wait for the supplier > > driver to become available. > > This assumes drivers and subsystems support deferred probe, is > there a way to ensure that drivers that use this framework support > it instead of possibly breaking them if they use this framework ?
Hi Luis, On Wednesday 09 Nov 2016 16:59:30 Luis R. Rodriguez wrote: > On Wed, Nov 9, 2016 at 4:43 PM, Rafael J. Wysocki wrote: > > On Mon, Nov 7, 2016 at 10:22 PM, Luis R. Rodriguez wrote: > >> On Thu, Oct 27, 2016 at 05:25:51PM +0200, Greg Kroah-Hartman wrote: > >>> On Wed, Oct 26, 2016 at 01:19:02PM +0200, Lukas Wunner wrote: > >>>> Hi Rafael, > >>>> > >>>> sorry for not responding to v5 of your series earlier, just sending > >>>> this out now in the hope that it reaches you before your travels. > >>>> > >>>> On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote: > >>>>> - Modify device_links_check_suppliers(), > >>>>> device_links_driver_bound(), > >>>>> > >>>>> device_links_no_driver(), device_links_driver_cleanup(), > >>>>> device_links_busy(), and device_links_unbind_consumers() to walk > >>>>> link lists under device_links_lock (to make the new "driver > >>>>> presence tracking" mechanism work reliably). > >>>> > >>>> This change might increase boot time if drivers return -EPROBE_DEFER. > >>> > >>> "might"? Please verify this before guessing.... > >>> > >>> And don't make this more complex than needed before actually determining > >>> a real issue. > >> > >> As clarified by Rafael at Plumbers, this functional dependencies > >> framework assumes your driver / subsystem supports deferred probe, > > > > It isn't particularly clear what you mean by "support" here. > > I noted some folks had reported issues, and you acknowledged that if > deferred probe was used in some drivers and if this created an issue > the same issue would be seen with this framework. AFAICT there are two > possible issues to consider: > > 1) the one Geert Uytterhoeven noted. Again I'll note what he had mentioned > [0]. > > "Some drivers / subsystems don’t support deferred probe yet, such failures > usually don’t blow up, but cause subtle malfunctioning. Example, an > Ethernet phy could not get its interrupt as the primary IRQ chip had not > been probed yet, it reverted to polling though. Sub-optimal." [0] > > [0] > https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-August/003 > 425.html > > Geert can you provide more details? This is a more global issue. In many cases drivers depend on optional resources. They are able to operate in a degraded mode (reduced feature set, reduced performances, ...) when those resources are not present. They can easily determine at probe time whether those resources are present, but have no way to know, in case they're absent, whether they will be present at some point in the near future (due to another driver probing the device providing the resource for instance) or if they will never be present (for instance because the required driver is missing). In the first case it would make sense to defer probe, in the latter case deferring probe forever for missing optional resources would prevent the device from being probed successfully at all. The functional dependencies tracking patch series isn't meant to address this issue. I can imagine a framework that would notify drivers of optional resource availability after probe time, but it would come at a high cost for drivers as switching between modes of operation at runtime based on the availability of such resources would be way more complex than a mechanism based on probe deferral. > 2) Since deferred probe relies on late_initcall() if your driver must > load earlier than this deferred probe can create an issue. Andrzej had > you identified a driver that ran into this and had issues ? If not > this seems like a semantics thing we should consider in extending the > documentation for drivers so that driver writers are aware of this > limitation. I would suppose candidates for this would be anything not > using module_init() or late_initcall() on their inits and have a > probe. > > >> if it does not support its not clear what will happen.... > > > > I don't see any problems here, but if you see any, please just say > > what they are. > > > >> We have no explicit semantics to check if a driver / subsystem > >> supports deferred probe. > > > > That's correct, but then do we need it? > > We can determine this by reviewing the two items above.
On Thu, Nov 10, 2016 at 1:59 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote: > On Wed, Nov 9, 2016 at 4:43 PM, Rafael J. Wysocki <rafael@kernel.org> wrote: >> On Mon, Nov 7, 2016 at 10:22 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote: >>> As clarified by Rafael at Plumbers, this functional dependencies >>> framework assumes your driver / subsystem supports deferred probe, >> >> It isn't particularly clear what you mean by "support" here. > > I noted some folks had reported issues, and you acknowledged that if > deferred probe was used in some drivers and if this created an issue > the same issue would be seen with this framework. AFAICT there are two > possible issues to consider: > > 1) the one Geert Uytterhoeven noted. Again I'll note what he had mentioned [0]. > > "Some drivers / subsystems don’t support deferred probe yet, such failures > usually don’t blow up, but cause subtle malfunctioning. Example, an Ethernet > phy could not get its interrupt as the primary IRQ chip had not been probed > yet, it reverted to polling though. Sub-optimal." [0] > > [0] https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-August/003425.html > > Geert can you provide more details? Issue reported in "of_mdiobus_register_phy() and deferred probe" (http://lkml.iu.edu/hypermail/linux/kernel/1510.2/05770.html) Key point is: "However, of_mdiobus_register_phy() uses irq_of_parse_and_map(), which plainly ignores EPROBE_DEFER, and it just continues." At that time, the PHY driver fell back to polling, but as of commit d5c3d8465 ("net: phy: Avoid polling PHY with PHY_IGNORE_INTERRUPTS") that's no longer the case, and now the PHY fails to work completely. Workaround is "[PATCH v2] irqchip/renesas-irqc: Postpone driver initialization" (https://www.spinics.net/lists/netdev/msg403325.html), which seems to have sparked some interest in fixing the issue for good ;-) Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Nov 10, 2016 at 09:14:32AM +0200, Laurent Pinchart wrote: > Hi Luis, > > On Wednesday 09 Nov 2016 16:59:30 Luis R. Rodriguez wrote: > > On Wed, Nov 9, 2016 at 4:43 PM, Rafael J. Wysocki wrote: > > > On Mon, Nov 7, 2016 at 10:22 PM, Luis R. Rodriguez wrote: > > >> On Thu, Oct 27, 2016 at 05:25:51PM +0200, Greg Kroah-Hartman wrote: > > >>> On Wed, Oct 26, 2016 at 01:19:02PM +0200, Lukas Wunner wrote: > > >>>> Hi Rafael, > > >>>> > > >>>> sorry for not responding to v5 of your series earlier, just sending > > >>>> this out now in the hope that it reaches you before your travels. > > >>>> > > >>>> On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote: > > >>>>> - Modify device_links_check_suppliers(), > > >>>>> device_links_driver_bound(), > > >>>>> > > >>>>> device_links_no_driver(), device_links_driver_cleanup(), > > >>>>> device_links_busy(), and device_links_unbind_consumers() to walk > > >>>>> link lists under device_links_lock (to make the new "driver > > >>>>> presence tracking" mechanism work reliably). > > >>>> > > >>>> This change might increase boot time if drivers return -EPROBE_DEFER. > > >>> > > >>> "might"? Please verify this before guessing.... > > >>> > > >>> And don't make this more complex than needed before actually determining > > >>> a real issue. > > >> > > >> As clarified by Rafael at Plumbers, this functional dependencies > > >> framework assumes your driver / subsystem supports deferred probe, > > > > > > It isn't particularly clear what you mean by "support" here. > > > > I noted some folks had reported issues, and you acknowledged that if > > deferred probe was used in some drivers and if this created an issue > > the same issue would be seen with this framework. AFAICT there are two > > possible issues to consider: > > > > 1) the one Geert Uytterhoeven noted. Again I'll note what he had mentioned > > [0]. > > > > "Some drivers / subsystems don’t support deferred probe yet, such failures > > usually don’t blow up, but cause subtle malfunctioning. Example, an > > Ethernet phy could not get its interrupt as the primary IRQ chip had not > > been probed yet, it reverted to polling though. Sub-optimal." [0] > > > > [0] > > https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-August/003 > > 425.html > > > > Geert can you provide more details? > > This is a more global issue. In many cases drivers depend on optional > resources. They are able to operate in a degraded mode (reduced feature set, > reduced performances, ...) when those resources are not present. They can > easily determine at probe time whether those resources are present, but have > no way to know, in case they're absent, whether they will be present at some > point in the near future (due to another driver probing the device providing > the resource for instance) or if they will never be present (for instance > because the required driver is missing). I see thanks, so -EPROBE_DEFER assumes a late_initcall() would suffice to load all necessary requirements. > In the first case it would make sense to defer probe, So if the assumption is correct then it -EPROBE_DEFER should work. > in the latter case deferring probe forever for missing > optional resources would prevent the device from being probed successfully at > all. Right I see. And the driver core has no way to know what things *may* be needed. > The functional dependencies tracking patch series isn't meant to address this > issue. Right, however it does track functional dependencies for suspend/run time PM and we were certainly in hope this could help with probe ordering *later* in the future. This is a separate topic. But more on point, the issue here is that this framework relies on -EPROBE_DEFER -- so the issues discussed with it still exist and should be properly documented. > I can imagine a framework that would notify drivers of optional > resource availability after probe time, but it would come at a high cost for > drivers as switching between modes of operation at runtime based on the > availability of such resources would be way more complex than a mechanism > based on probe deferral. Right, I see. This is more forward looking, but -- if we had an annotation in Kconfig/turned to a mod info section, or to start off with just a driver MODULE_SUGGESTS() macro to start off with it might suffice for the driver core to request_module() annotated dependencies, such requests could be explicitly suggested as synchronous so init + probe do run together (as-is today), after which it could know that all possible drivers that needed to be loaded should now be loaded. If this sounds plausible to help, do we have drivers where we can test this on? For instance, since the functional dependency framework annotates functional dependencies for consumers/providers for suspend/resume and un time PM could such MODULE_SUGGESTS() annotations be considered on the consumers to suggest the provider drivers so their own probe yields to their providers to try first ? Luis > > 2) Since deferred probe relies on late_initcall() if your driver must > > load earlier than this deferred probe can create an issue. Andrzej had > > you identified a driver that ran into this and had issues ? If not > > this seems like a semantics thing we should consider in extending the > > documentation for drivers so that driver writers are aware of this > > limitation. I would suppose candidates for this would be anything not > > using module_init() or late_initcall() on their inits and have a > > probe. > > > > >> if it does not support its not clear what will happen.... > > > > > > I don't see any problems here, but if you see any, please just say > > > what they are. > > > > > >> We have no explicit semantics to check if a driver / subsystem > > >> supports deferred probe. > > > > > > That's correct, but then do we need it? > > > > We can determine this by reviewing the two items above. > > -- > Regards, > > Laurent Pinchart > >
On Thu, Nov 10, 2016 at 09:46:42AM +0100, Geert Uytterhoeven wrote: > On Thu, Nov 10, 2016 at 1:59 AM, Luis R. Rodriguez <mcgrof@kernel.org> wrote: > > On Wed, Nov 9, 2016 at 4:43 PM, Rafael J. Wysocki <rafael@kernel.org> wrote: > >> On Mon, Nov 7, 2016 at 10:22 PM, Luis R. Rodriguez <mcgrof@kernel.org> wrote: > >>> As clarified by Rafael at Plumbers, this functional dependencies > >>> framework assumes your driver / subsystem supports deferred probe, > >> > >> It isn't particularly clear what you mean by "support" here. > > > > I noted some folks had reported issues, and you acknowledged that if > > deferred probe was used in some drivers and if this created an issue > > the same issue would be seen with this framework. AFAICT there are two > > possible issues to consider: > > > > 1) the one Geert Uytterhoeven noted. Again I'll note what he had mentioned [0]. > > > > "Some drivers / subsystems don’t support deferred probe yet, such failures > > usually don’t blow up, but cause subtle malfunctioning. Example, an Ethernet > > phy could not get its interrupt as the primary IRQ chip had not been probed > > yet, it reverted to polling though. Sub-optimal." [0] > > > > [0] https://lists.linuxfoundation.org/pipermail/ksummit-discuss/2016-August/003425.html > > > > Geert can you provide more details? > > Issue reported in "of_mdiobus_register_phy() and deferred probe" > (http://lkml.iu.edu/hypermail/linux/kernel/1510.2/05770.html) > > Key point is: > "However, of_mdiobus_register_phy() uses irq_of_parse_and_map(), which plainly > ignores EPROBE_DEFER, and it just continues." > > At that time, the PHY driver fell back to polling, but as of commit d5c3d8465 > ("net: phy: Avoid polling PHY with PHY_IGNORE_INTERRUPTS") that's no longer the > case, and now the PHY fails to work completely. > > Workaround is "[PATCH v2] irqchip/renesas-irqc: Postpone driver initialization" > (https://www.spinics.net/lists/netdev/msg403325.html), which seems to have > sparked some interest in fixing the issue for good ;-) Ah playing with init levels. You are lucky here that this suffices, the IOMMU folks already ran out with enough levels to play with so they cannot resolve their issue this way, for instance. Luis -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Nov 10, 2016 at 11:04:07PM +0100, Luis R. Rodriguez wrote: > This is more forward looking, but -- if we had an annotation in Kconfig/turned > to a mod info section, or to start off with just a driver MODULE_SUGGESTS() macro > to start off with it might suffice for the driver core to request_module() > annotated dependencies, such requests could be explicitly suggested as > synchronous so init + probe do run together (as-is today), after which it > could know that all possible drivers that needed to be loaded should now be > loaded. If this sounds plausible to help, do we have drivers where we can > test this on? For instance, since the functional dependency framework > annotates functional dependencies for consumers/providers for suspend/resume > and un time PM could such MODULE_SUGGESTS() annotations be considered on the > consumers to suggest the provider drivers so their own probe yields to their > providers to try first ? No. Stop. First off, the "driver core" NEVER can "know" if "all possible drivers that should be loaded, are loaded. That way lies madness and impossibility. Secondly, yet-another-section isn't going to help anything here, we alredy "suggest" to userspace a bunch of stuff, so we get the needed modules loaded, at sometime in the future, if they are around, and userspace feels like it. That's the best we can ever do. Don't try to make this more difficult than it is please. DEFER works today really really well, and it's really really simple. Inter-dependancy of modules and devices connected to each other are two different things, be careful about this. thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Thu, Nov 10, 2016 at 09:05:30AM +0200, Laurent Pinchart wrote: > Hi Luis, > > On Monday 07 Nov 2016 22:39:54 Luis R. Rodriguez wrote: > > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote: > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com> > > > > > > Currently, there is a problem with taking functional dependencies > > > between devices into account. > > > > > > What I mean by a "functional dependency" is when the driver of device > > > B needs device A to be functional and (generally) its driver to be > > > present in order to work properly. This has certain consequences > > > for power management (suspend/resume and runtime PM ordering) and > > > shutdown ordering of these devices. In general, it also implies that > > > the driver of A needs to be working for B to be probed successfully > > > and it cannot be unbound from the device before the B's driver. > > > > > > Support for representing those functional dependencies between > > > devices is added here to allow the driver core to track them and act > > > on them in certain cases where applicable. > > > > > > The argument for doing that in the driver core is that there are > > > quite a few distinct use cases involving device dependencies, they > > > are relatively hard to get right in a driver (if one wants to > > > address all of them properly) and it only gets worse if multiplied > > > by the number of drivers potentially needing to do it. > > > > How many drivers actually *need this* today for suspend / resume ? > > I don't think there's a list, but just speaking for Renesas R-Car platforms > such a dependency exists from all DMA bus masters to the system IOMMUs (we're > talking about dozens of devices here), as well as from the display, video > codec and video processing IP cores to transparent memory access IP cores that > handle burst access and compression/decompression. This is very significant, thanks. > > How many of these are because of ACPI firmware bugs rather than > > some other reason ? > > None from the list above, even with s/ACPI/DT/. > > > Can ACPI somehow be used instead by these devices to address quirks? > > Certainly not on ARM embedded platforms :-) So this framework adds a generic way. > > > Morever, at least one case (asynchronous system suspend/resume) cannot be > > > handled in a single driver at all, because it requires the driver of A to > > > wait for B to suspend (during system suspend) and the driver of B to > > > wait for A to resume (during system resume). > > > > Why is this all of a sudden a new issue? > > It's not a new issue, we've just never addressed it properly so far. Various > workarounds existed, with various level of success (usually more for runtime > PM than system suspend/resume). This helps a lot. I think there has at this point been enough effort to try to inform as many folks who might have these workaround or their own solution to voice their concerns, I tried to review some of the existing solutions myself but its hard to get a quick grasp of them -- provided no one else yells out I welcome then this change as a proper evolution. I suppose the only last remaining concerns I had were aggregating on top of deferred probe and the lack of a test driver to test complex topologies using this framework to test order is never broken. For the first concern it would seem that folks using this framework would actually test and ensure device links works for their drivers, even if deferred probe is used. For the later concern, a separate test driver could be added later. Luis -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Greg, On Thursday 10 Nov 2016 23:40:54 Greg Kroah-Hartman wrote: > On Thu, Nov 10, 2016 at 11:04:07PM +0100, Luis R. Rodriguez wrote: > > This is more forward looking, but -- if we had an annotation in > > Kconfig/turned to a mod info section, or to start off with just a driver > > MODULE_SUGGESTS() macro to start off with it might suffice for the driver > > core to request_module() annotated dependencies, such requests could be > > explicitly suggested as synchronous so init + probe do run together > > (as-is today), after which it could know that all possible drivers that > > needed to be loaded should now be loaded. If this sounds plausible to > > help, do we have drivers where we can test this on? For instance, since > > the functional dependency framework annotates functional dependencies for > > consumers/providers for suspend/resume and un time PM could such > > MODULE_SUGGESTS() annotations be considered on the consumers to suggest > > the provider drivers so their own probe yields to their providers to try > > first ? > > No. > > Stop. > > First off, the "driver core" NEVER can "know" if "all possible drivers > that should be loaded, are loaded. That way lies madness and > impossibility. > > Secondly, yet-another-section isn't going to help anything here, we > alredy "suggest" to userspace a bunch of stuff, so we get the needed > modules loaded, at sometime in the future, if they are around, and > userspace feels like it. That's the best we can ever do. > > Don't try to make this more difficult than it is please. DEFER works > today really really well, and it's really really simple. > Inter-dependancy of modules and devices connected to each other are two > different things, be careful about this. One issue we don't address today is handling of optional dependencies. A simple example is an SPI controller that can use a DMA engine or work in PIO mode. At probe time the driver will request a DMA channel if the platform (ACPI, DT, platform data) specifies that DMA is available. This can fail for various reasons, one of them being that the DMA engine driver hasn't probed the DMA device yet. In that case the SPI controller driver will continue in PIO mode, ignoring the DMA engine that will later be probed. We can't defer probing of the SPI controller as the DMA engine driver might never get loaded, which would result in the SPI controller probe being deferred forever. One solution for this type of dependency issue would be to notify the SPI controller driver after probe that the DMA channel is now available. I'd like to avoid that though, as it would drastically increase the complexity of lots of drivers and create lots of race conditions. There are certain configurations that we could possibly consider as invalid. For instance if the SPI controller driver is built-in and the DMA engine driver built as a module, the user clearly shot themselves in the foot and the kernel can't be blamed. For resources that can't be built as a module (IOMMUs for instance) we thus only have to consider the case where both drivers are built-in, as the resource built-in and consumer as a module should work properly from an ordering point of view (at least as long as we don't allow asynchronous probing of built-in drivers to be delayed enough for modules to be loaded...). In this case, if the resource driver isn't available when the consumer is probed, if will never be available at the consumer can safely proceed in a degraded mode. We would thus only need to solve the probe ordering issue. I'm not sure how far these simple(r) solutions that consider certain cases as invalid would scale though, and whether we won't need a more generic solution at some point anyway.
On Fri, Nov 11, 2016 at 02:08:35AM +0200, Laurent Pinchart wrote: > Hi Greg, > > On Thursday 10 Nov 2016 23:40:54 Greg Kroah-Hartman wrote: > > On Thu, Nov 10, 2016 at 11:04:07PM +0100, Luis R. Rodriguez wrote: > > > This is more forward looking, but -- if we had an annotation in > > > Kconfig/turned to a mod info section, or to start off with just a driver > > > MODULE_SUGGESTS() macro to start off with it might suffice for the driver > > > core to request_module() annotated dependencies, such requests could be > > > explicitly suggested as synchronous so init + probe do run together > > > (as-is today), after which it could know that all possible drivers that > > > needed to be loaded should now be loaded. If this sounds plausible to > > > help, do we have drivers where we can test this on? For instance, since > > > the functional dependency framework annotates functional dependencies for > > > consumers/providers for suspend/resume and un time PM could such > > > MODULE_SUGGESTS() annotations be considered on the consumers to suggest > > > the provider drivers so their own probe yields to their providers to try > > > first ? > > > > No. > > > > Stop. > > > > First off, the "driver core" NEVER can "know" if "all possible drivers > > that should be loaded, are loaded. That way lies madness and > > impossibility. > > > > Secondly, yet-another-section isn't going to help anything here, we > > alredy "suggest" to userspace a bunch of stuff, so we get the needed > > modules loaded, at sometime in the future, if they are around, and > > userspace feels like it. That's the best we can ever do. > > > > Don't try to make this more difficult than it is please. DEFER works > > today really really well, and it's really really simple. > > Inter-dependancy of modules and devices connected to each other are two > > different things, be careful about this. > > One issue we don't address today is handling of optional dependencies. A > simple example is an SPI controller that can use a DMA engine or work in PIO > mode. At probe time the driver will request a DMA channel if the platform > (ACPI, DT, platform data) specifies that DMA is available. This can fail for > various reasons, one of them being that the DMA engine driver hasn't probed > the DMA device yet. In that case the SPI controller driver will continue in > PIO mode, ignoring the DMA engine that will later be probed. We can't defer > probing of the SPI controller as the DMA engine driver might never get loaded, > which would result in the SPI controller probe being deferred forever. > > One solution for this type of dependency issue would be to notify the SPI > controller driver after probe that the DMA channel is now available. I'd like > to avoid that though, as it would drastically increase the complexity of lots > of drivers and create lots of race conditions. > > There are certain configurations that we could possibly consider as invalid. > For instance if the SPI controller driver is built-in and the DMA engine > driver built as a module, the user clearly shot themselves in the foot and the > kernel can't be blamed. > > For resources that can't be built as a module (IOMMUs for instance) we thus > only have to consider the case where both drivers are built-in, as the > resource built-in and consumer as a module should work properly from an > ordering point of view (at least as long as we don't allow asynchronous > probing of built-in drivers to be delayed enough for modules to be loaded...). > In this case, if the resource driver isn't available when the consumer is > probed, if will never be available at the consumer can safely proceed in a > degraded mode. We would thus only need to solve the probe ordering issue. > > I'm not sure how far these simple(r) solutions that consider certain cases as > invalid would scale though, and whether we won't need a more generic solution > at some point anyway. I would love to see a generic solution that works for all of these complex cases, as I agree with you, it's complex :) But I have yet to see any such patches that implement this. As always, I am very glad to review anything that people create, but I don't have the time to work on such a solution myself at the moment. thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Nov 08, 2016 at 09:58:24PM +0100, Luis R. Rodriguez wrote: > On Tue, Nov 08, 2016 at 08:43:35PM +0100, Greg Kroah-Hartman wrote: > > Yes, you can iterate a > > lot of times, but that's fine, we have time at boot to do that (and > > really, it is fast.) > > Deferred probe is left for late_initcall() [1] -- this *assumes* that the > driver/subsystem can be loaded so late, and as per Andrzej this solution > isunacceptable/undesirable. So it would be unfair and incorrect to categorize > all drivers in the same boat, in fact given this lone fact it may be > worth revisiting the idea I mentioned about a capability aspect to support > a late deferred probe later. We tend to assume its fine, however it does > not seem to be the case. > > [1] drivers/base/dd.c:late_initcall(deferred_probe_initcall); Note that driver_deferred_probe_trigger() is called not only from this late_initcall but also from driver_bound(). In other words, whenever a driver binds successfully, deferred devices will reprobe, and devices which had to defer will not necessarily have to wait until late_initcall until they're reprobed. The late_initcall is just a way to tell devices "this is last call". Thanks, Lukas -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Nov 07, 2016 at 10:39:54PM +0100, Luis R. Rodriguez wrote: > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote: > > One of the actions carried out by device_link_add() is to reorder > > the lists used for device shutdown and system suspend/resume to > > put the consumer device along with all of its children and all of > > its consumers (and so on, recursively) to the ends of those lists > > in order to ensure the right ordering between all of the supplier > > and consumer devices. > > There's no explanation as to why this order is ensured to be > correct, I think its important to document this. From our discussions > at Plumbers it seems the order is ensured due to the fact that order > was already implicitly provided through platform firmware (ACPI > enumeration is one), adjusting order on the dpm list is just shuffling > order between consumer / provider, but nothing else. ACPI specifies a hierarchy and the order on the dpm_list and devices_kset is such that children are behind their parent. A device link specifies a dependency that exists in addition to the hierarchy, hence consumers need to be moved behind their supplier. And not only the consumers themselves but also recursively their children and consumers. Essentially the entire subtree is moved to the back. That happens in device_reorder_to_tail() in patch 2. If another device is enumerated which acts as a supplier to an existing other supplier, that other supplier and all its dependents are moved behind the newly enumerated device, and so on. That is provably correct so long as no loops are introduced in the dependency graph. That is checked by device_is_dependent(), which is called from device_link_add(), and the addition of the link is aborted if a loop is detected. Best regards, Lukas -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Hi Laurent, On Fri, Nov 11, 2016 at 1:08 AM, Laurent Pinchart <laurent.pinchart@ideasonboard.com> wrote: > On Thursday 10 Nov 2016 23:40:54 Greg Kroah-Hartman wrote: >> On Thu, Nov 10, 2016 at 11:04:07PM +0100, Luis R. Rodriguez wrote: >> Don't try to make this more difficult than it is please. DEFER works >> today really really well, and it's really really simple. >> Inter-dependancy of modules and devices connected to each other are two >> different things, be careful about this. > > One issue we don't address today is handling of optional dependencies. A > simple example is an SPI controller that can use a DMA engine or work in PIO > mode. At probe time the driver will request a DMA channel if the platform > (ACPI, DT, platform data) specifies that DMA is available. This can fail for > various reasons, one of them being that the DMA engine driver hasn't probed > the DMA device yet. In that case the SPI controller driver will continue in > PIO mode, ignoring the DMA engine that will later be probed. We can't defer > probing of the SPI controller as the DMA engine driver might never get loaded, > which would result in the SPI controller probe being deferred forever. > > One solution for this type of dependency issue would be to notify the SPI > controller driver after probe that the DMA channel is now available. I'd like > to avoid that though, as it would drastically increase the complexity of lots > of drivers and create lots of race conditions. Alternatively, the driver can request optional DMA resources at open time instead of at probe time. E.g. sh_sci does that in its uart_ops.startup() callback. Regardless of that, DMA can fail temporarily for other reasons (e.g. running out of channels). Gr{oetje,eeting}s, Geert -- Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org In personal conversations with technical people, I call myself a hacker. But when I'm talking to journalists I just say "programmer" or something like that. -- Linus Torvalds -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Nov 13, 2016 at 06:34:13PM +0100, Lukas Wunner wrote: > On Mon, Nov 07, 2016 at 10:39:54PM +0100, Luis R. Rodriguez wrote: > > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote: > > > One of the actions carried out by device_link_add() is to reorder > > > the lists used for device shutdown and system suspend/resume to > > > put the consumer device along with all of its children and all of > > > its consumers (and so on, recursively) to the ends of those lists > > > in order to ensure the right ordering between all of the supplier > > > and consumer devices. > > > > There's no explanation as to why this order is ensured to be > > correct, I think its important to document this. From our discussions > > at Plumbers it seems the order is ensured due to the fact that order > > was already implicitly provided through platform firmware (ACPI > > enumeration is one), adjusting order on the dpm list is just shuffling > > order between consumer / provider, but nothing else. > > ACPI specifies a hierarchy and the order on the dpm_list and > devices_kset is such that children are behind their parent. > > A device link specifies a dependency that exists in addition > to the hierarchy, hence consumers need to be moved behind > their supplier. And not only the consumers themselves but > also recursively their children and consumers. Essentially > the entire subtree is moved to the back. That happens in > device_reorder_to_tail() in patch 2. Ah neat, I failed to notice this full subtree tree move, its rather important. > If another device is enumerated which acts as a supplier to > an existing other supplier, that other supplier and all its > dependents are moved behind the newly enumerated device, > and so on. > > That is probably correct so long as no loops are introduced > in the dependency graph. "Probably" is what concerns me, there is no formality about the correctness of this. > That is checked by device_is_dependent(), > which is called from device_link_add(), and the addition of the > link is aborted if a loop is detected. And that is sufficient ? Luis -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Nov 13, 2016 at 11:59:42AM +0100, Greg Kroah-Hartman wrote: > On Fri, Nov 11, 2016 at 02:08:35AM +0200, Laurent Pinchart wrote: > > Hi Greg, > > > > On Thursday 10 Nov 2016 23:40:54 Greg Kroah-Hartman wrote: > > > On Thu, Nov 10, 2016 at 11:04:07PM +0100, Luis R. Rodriguez wrote: > > > > This is more forward looking, but -- if we had an annotation in > > > > Kconfig/turned to a mod info section, or to start off with just a driver > > > > MODULE_SUGGESTS() macro to start off with it might suffice for the driver > > > > core to request_module() annotated dependencies, such requests could be > > > > explicitly suggested as synchronous so init + probe do run together > > > > (as-is today), after which it could know that all possible drivers that > > > > needed to be loaded should now be loaded. If this sounds plausible to > > > > help, do we have drivers where we can test this on? For instance, since > > > > the functional dependency framework annotates functional dependencies for > > > > consumers/providers for suspend/resume and un time PM could such > > > > MODULE_SUGGESTS() annotations be considered on the consumers to suggest > > > > the provider drivers so their own probe yields to their providers to try > > > > first ? > > > > > > No. > > > > > > Stop. > > > > > > First off, the "driver core" NEVER can "know" if "all possible drivers > > > that should be loaded, are loaded. That way lies madness and > > > impossibility. At first I had discarded the generic driver problem as a slightly unrelated topic however its clear now its not. In terms of functional dependencies I agree that not providing strict order is *sometimes* desirable to help with simplicity. The generic driver problem can be described graph-wise: on a DAG, we're considering a topology with where nodes have optional superior alternatives, and what you seem to be advocating is a transition to an alternative is better and much simpler than forcing order from the start. That can only work so long as some intermediary dependencies are (for lack of a better term) soft-nodes -- where transitions to better alternatives are possible and can such transitions can be handled in software. You may have (again for lack of a better term) hard-nodes though where an entry in the DAG is required as a hard requirement immediately prior to letting another entry proceed. An example here is the x86 IOMMU drivers and dependent GPU DRM drivers, currently only link order asserts proper order given we have run out of semantics in between to ensure proper order is correct. > > > Secondly, yet-another-section isn't going to help anything here, we > > > alredy "suggest" to userspace a bunch of stuff, so we get the needed > > > modules loaded, at sometime in the future, if they are around, and > > > userspace feels like it. That's the best we can ever do. For some cases this is sufficient, for hard-nodes (term used above) though if you get the incorrect order you may in the worst case oops. > > > Don't try to make this more difficult than it is please. DEFER works > > > today really really well, and it's really really simple. It seems many disagree. What is clear is its simplicity outweighs the complexity by alternatives which have been historically considered. This is reasonable. Part of the reason probe ordering, as an optimization consideration, came up while function dependencies for runtime PM and suspend are being discussed is as we've determined this is a related problem and at least for hard-nodes this is critical to resolve. For now we have enough tools to work around problems for hard-nodes, but it would be silly for us not to start thinking about ways to improve upon this for the future. > > > Inter-dependancy of modules and devices connected to each other are two > > > different things, be careful about this. This is a *very* fair warning :) > > One issue we don't address today is handling of optional dependencies. A > > simple example is an SPI controller that can use a DMA engine or work in PIO > > mode. At probe time the driver will request a DMA channel if the platform > > (ACPI, DT, platform data) specifies that DMA is available. This can fail for > > various reasons, one of them being that the DMA engine driver hasn't probed > > the DMA device yet. In that case the SPI controller driver will continue in > > PIO mode, ignoring the DMA engine that will later be probed. We can't defer > > probing of the SPI controller as the DMA engine driver might never get loaded, > > which would result in the SPI controller probe being deferred forever. > > > > One solution for this type of dependency issue would be to notify the SPI > > controller driver after probe that the DMA channel is now available. I'd like > > to avoid that though, as it would drastically increase the complexity of lots > > of drivers and create lots of race conditions. > > > > There are certain configurations that we could possibly consider as invalid. > > For instance if the SPI controller driver is built-in and the DMA engine > > driver built as a module, the user clearly shot themselves in the foot and the > > kernel can't be blamed. > > > > For resources that can't be built as a module (IOMMUs for instance) we thus > > only have to consider the case where both drivers are built-in, as the > > resource built-in and consumer as a module should work properly from an > > ordering point of view (at least as long as we don't allow asynchronous > > probing of built-in drivers to be delayed enough for modules to be loaded...). > > In this case, if the resource driver isn't available when the consumer is > > probed, if will never be available at the consumer can safely proceed in a > > degraded mode. We would thus only need to solve the probe ordering issue. > > > > I'm not sure how far these simple(r) solutions that consider certain cases as > > invalid would scale though, and whether we won't need a more generic solution > > at some point anyway. > > I would love to see a generic solution that works for all of these > complex cases, as I agree with you, it's complex :) > > But I have yet to see any such patches that implement this. The generic driver topic is related but it certainly only part of the picture. It seems there were enough folks interested in that topic though so perhaps patches will be eventually produced for it. > As always, > I am very glad to review anything that people create, but I don't have > the time to work on such a solution myself at the moment. Part of what we tried to discuss during the complex dependencies topics at Plumbers was evaluating if some of the existing solutions for run time PM and suspend could help with probe ordering, it seems we had agreement on it, what we found though was that for many cases the use of struct device for link association is too late. Alternatives mechanisms will be considered in the future, and it seems that one path forward will be to consider expanding upon this simple functional device dependency framework. So let's see more patches! Luis -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Nov 14, 2016 at 02:48:32PM +0100, Luis R. Rodriguez wrote: > On Sun, Nov 13, 2016 at 06:34:13PM +0100, Lukas Wunner wrote: > > On Mon, Nov 07, 2016 at 10:39:54PM +0100, Luis R. Rodriguez wrote: > > > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote: > > > > One of the actions carried out by device_link_add() is to reorder > > > > the lists used for device shutdown and system suspend/resume to > > > > put the consumer device along with all of its children and all of > > > > its consumers (and so on, recursively) to the ends of those lists > > > > in order to ensure the right ordering between all of the supplier > > > > and consumer devices. > > > > > > There's no explanation as to why this order is ensured to be > > > correct, I think its important to document this. From our discussions > > > at Plumbers it seems the order is ensured due to the fact that order > > > was already implicitly provided through platform firmware (ACPI > > > enumeration is one), adjusting order on the dpm list is just shuffling > > > order between consumer / provider, but nothing else. > > > > ACPI specifies a hierarchy and the order on the dpm_list and > > devices_kset is such that children are behind their parent. > > > > A device link specifies a dependency that exists in addition > > to the hierarchy, hence consumers need to be moved behind > > their supplier. And not only the consumers themselves but > > also recursively their children and consumers. Essentially > > the entire subtree is moved to the back. That happens in > > device_reorder_to_tail() in patch 2. > > Ah neat, I failed to notice this full subtree tree move, its > rather important. > > > If another device is enumerated which acts as a supplier to > > an existing other supplier, that other supplier and all its > > dependents are moved behind the newly enumerated device, > > and so on. > > > > That is probably correct so long as no loops are introduced > > in the dependency graph. > > "Probably" is what concerns me, there is no formality about > the correctness of this. It's a typo, I meant to say "provably correct". Sorry. Quite a difference in meaning. :-) > > That is checked by device_is_dependent(), > > which is called from device_link_add(), and the addition of the > > link is aborted if a loop is detected. > > And that is sufficient ? The device links turn the device tree into a directed acyclic graph. For the dpm_list and devices_kset, that graph is flattened into a one-dimensional form such that all ancestors and suppliers of a device appear in front of that device in the lists. I'm not a graph theorist and can't provide a formal proof. I think Rafael is a Dr., maybe he can do it. :-) I merely looked at this from a practical point of view, i.e. I tried to come up with corner cases where dependencies are added that would result in incorrect ordering, and concluded that I couldn't find any. Thanks, Lukas -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Mon, Nov 14, 2016 at 7:48 AM, Lukas Wunner <lukas@wunner.de> wrote: > On Mon, Nov 14, 2016 at 02:48:32PM +0100, Luis R. Rodriguez wrote: >> On Sun, Nov 13, 2016 at 06:34:13PM +0100, Lukas Wunner wrote: >> > On Mon, Nov 07, 2016 at 10:39:54PM +0100, Luis R. Rodriguez wrote: >> > > On Mon, Oct 10, 2016 at 02:51:04PM +0200, Rafael J. Wysocki wrote: >> > > > One of the actions carried out by device_link_add() is to reorder >> > > > the lists used for device shutdown and system suspend/resume to >> > > > put the consumer device along with all of its children and all of >> > > > its consumers (and so on, recursively) to the ends of those lists >> > > > in order to ensure the right ordering between all of the supplier >> > > > and consumer devices. >> > > >> > > There's no explanation as to why this order is ensured to be >> > > correct, I think its important to document this. From our discussions >> > > at Plumbers it seems the order is ensured due to the fact that order >> > > was already implicitly provided through platform firmware (ACPI >> > > enumeration is one), adjusting order on the dpm list is just shuffling >> > > order between consumer / provider, but nothing else. >> > >> > ACPI specifies a hierarchy and the order on the dpm_list and >> > devices_kset is such that children are behind their parent. >> > >> > A device link specifies a dependency that exists in addition >> > to the hierarchy, hence consumers need to be moved behind >> > their supplier. And not only the consumers themselves but >> > also recursively their children and consumers. Essentially >> > the entire subtree is moved to the back. That happens in >> > device_reorder_to_tail() in patch 2. >> >> Ah neat, I failed to notice this full subtree tree move, its >> rather important. >> >> > If another device is enumerated which acts as a supplier to >> > an existing other supplier, that other supplier and all its >> > dependents are moved behind the newly enumerated device, >> > and so on. >> > >> > That is probably correct so long as no loops are introduced >> > in the dependency graph. >> >> "Probably" is what concerns me, there is no formality about >> the correctness of this. > > It's a typo, I meant to say "provably correct". Sorry. > > Quite a difference in meaning. :-) No sorry my own mistake -- you had written provably but I thought you had a typo, clearly you meant as you typed :) If the trees are independent then yes, I can see this working. >> > That is checked by device_is_dependent(), >> > which is called from device_link_add(), and the addition of the >> > link is aborted if a loop is detected. >> >> And that is sufficient ? > > The device links turn the device tree into a directed acyclic graph. > For the dpm_list and devices_kset, that graph is flattened into a > one-dimensional form such that all ancestors and suppliers of a > device appear in front of that device in the lists. I'm not a > graph theorist and can't provide a formal proof. I think Rafael > is a Dr., maybe he can do it. :-) If you traverse the DAG you can get a linear one-dimensional representation of the dependencies -- I'm no expert on this either, however I'm buying what you described above then, provided we do indeed avoid cycles. > I merely looked at this from a > practical point of view, i.e. I tried to come up with corner cases > where dependencies are added that would result in incorrect ordering, > and concluded that I couldn't find any. Fair enough, likewise. I think the takeaway from this discussion is a test driver trying to break this might be good to have, but also documenting how this works in practice and how we avoid the cycles is important. This was not very clear from the patches. Luis -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Index: linux-pm/drivers/base/base.h =================================================================== --- linux-pm.orig/drivers/base/base.h +++ linux-pm/drivers/base/base.h @@ -107,6 +107,9 @@ extern void bus_remove_device(struct dev extern int bus_add_driver(struct device_driver *drv); extern void bus_remove_driver(struct device_driver *drv); +extern void device_release_driver_internal(struct device *dev, + struct device_driver *drv, + struct device *parent); extern void driver_detach(struct device_driver *drv); extern int driver_probe_device(struct device_driver *drv, struct device *dev); @@ -152,3 +155,13 @@ extern int devtmpfs_init(void); #else static inline int devtmpfs_init(void) { return 0; } #endif + +/* Device links support */ +extern int device_links_read_lock(void); +extern void device_links_read_unlock(int idx); +extern int device_links_check_suppliers(struct device *dev); +extern void device_links_driver_bound(struct device *dev); +extern void device_links_driver_cleanup(struct device *dev); +extern void device_links_no_driver(struct device *dev); +extern bool device_links_busy(struct device *dev); +extern void device_links_unbind_consumers(struct device *dev); Index: linux-pm/drivers/base/core.c =================================================================== --- linux-pm.orig/drivers/base/core.c +++ linux-pm/drivers/base/core.c @@ -44,6 +44,492 @@ static int __init sysfs_deprecated_setup early_param("sysfs.deprecated", sysfs_deprecated_setup); #endif +/* Device links support. */ + +static DEFINE_MUTEX(device_links_lock); +DEFINE_STATIC_SRCU(device_links_srcu); + +int device_links_read_lock(void) +{ + return srcu_read_lock(&device_links_srcu); +} + +void device_links_read_unlock(int idx) +{ + return srcu_read_unlock(&device_links_srcu, idx); +} + +/** + * device_is_dependent - Check if one device depends on another one + * @dev: Device to check dependencies for. + * @target: Device to check against. + * + * Check if @target depends on @dev or any device dependent on it (its child or + * its consumer etc). Return 1 if that is the case or 0 otherwise. + */ +static int device_is_dependent(struct device *dev, void *target) +{ + struct device_link *link; + int ret; + + if (WARN_ON(dev == target)) + return 1; + + ret = device_for_each_child(dev, target, device_is_dependent); + if (ret) + return ret; + + list_for_each_entry(link, &dev->links.consumers, s_node) { + if (WARN_ON(link->consumer == target)) + return 1; + + ret = device_is_dependent(link->consumer, target); + if (ret) + break; + } + return ret; +} + +static int device_reorder_to_tail(struct device *dev, void *not_used) +{ + struct device_link *link; + + /* + * Devices that have not been registered yet will be put to the ends + * of the lists during the registration, so skip them here. + */ + if (device_is_registered(dev)) + devices_kset_move_last(dev); + + if (device_pm_initialized(dev)) + device_pm_move_last(dev); + + device_for_each_child(dev, NULL, device_reorder_to_tail); + list_for_each_entry(link, &dev->links.consumers, s_node) + device_reorder_to_tail(link->consumer, NULL); + + return 0; +} + +/** + * device_link_add - Create a link between two devices. + * @consumer: Consumer end of the link. + * @supplier: Supplier end of the link. + * @flags: Link flags. + * + * If the DL_FLAG_AUTOREMOVE is set, the link will be removed automatically + * when the consumer device driver unbinds from it. The combination of both + * DL_FLAG_AUTOREMOVE and DL_FLAG_STATELESS set is invalid and will cause NULL + * to be returned. + * + * A side effect of the link creation is re-ordering of dpm_list and the + * devices_kset list by moving the consumer device and all devices depending + * on it to the ends of these lists (that does not happen to devices that have + * not been registered when this function is called). + * + * The supplier device is required to be registered when this function is called + * and NULL will be returned if that is not the case. The consumer device need + * not be registerd, however. + */ +struct device_link *device_link_add(struct device *consumer, + struct device *supplier, u32 flags) +{ + struct device_link *link; + + if (!consumer || !supplier || + ((flags & DL_FLAG_STATELESS) && (flags & DL_FLAG_AUTOREMOVE))) + return NULL; + + mutex_lock(&device_links_lock); + device_pm_lock(); + + /* + * If the supplier has not been fully registered yet or there is a + * reverse dependency between the consumer and the supplier already in + * the graph, return NULL. + */ + if (!device_pm_initialized(supplier) + || device_is_dependent(consumer, supplier)) { + link = NULL; + goto out; + } + + list_for_each_entry(link, &supplier->links.consumers, s_node) + if (link->consumer == consumer) + goto out; + + link = kmalloc(sizeof(*link), GFP_KERNEL); + if (!link) + goto out; + + get_device(supplier); + link->supplier = supplier; + INIT_LIST_HEAD(&link->s_node); + get_device(consumer); + link->consumer = consumer; + INIT_LIST_HEAD(&link->c_node); + link->flags = flags; + + /* Deterine the initial link state. */ + if (flags & DL_FLAG_STATELESS) { + link->status = DL_STATE_NONE; + } else { + switch (supplier->links.status) { + case DL_DEV_DRIVER_BOUND: + switch (consumer->links.status) { + case DL_DEV_PROBING: + link->status = DL_STATE_CONSUMER_PROBE; + break; + case DL_DEV_DRIVER_BOUND: + link->status = DL_STATE_ACTIVE; + break; + default: + link->status = DL_STATE_AVAILABLE; + break; + } + break; + case DL_DEV_UNBINDING: + link->status = DL_STATE_SUPPLIER_UNBIND; + break; + default: + link->status = DL_STATE_DORMANT; + break; + } + } + + /* + * Move the consumer and all of the devices depending on it to the end + * of dpm_list and the devices_kset list. + * + * It is necessary to hold dpm_list locked throughout all that or else + * we may end up suspending with a wrong ordering of it. + */ + device_reorder_to_tail(consumer, NULL); + + list_add_tail_rcu(&link->s_node, &supplier->links.consumers); + list_add_tail_rcu(&link->c_node, &consumer->links.suppliers); + + dev_info(consumer, "Linked as a consumer to %s\n", dev_name(supplier)); + + out: + device_pm_unlock(); + mutex_unlock(&device_links_lock); + return link; +} +EXPORT_SYMBOL_GPL(device_link_add); + +static void __device_link_free_srcu(struct rcu_head *rhead) +{ + struct device_link *link; + + link = container_of(rhead, struct device_link, rcu_head); + put_device(link->consumer); + put_device(link->supplier); + kfree(link); +} + +static void __device_link_del(struct device_link *link) +{ + dev_info(link->consumer, "Dropping the link to %s\n", + dev_name(link->supplier)); + + list_del_rcu(&link->s_node); + list_del_rcu(&link->c_node); + call_srcu(&device_links_srcu, &link->rcu_head, __device_link_free_srcu); +} + +/** + * device_link_del - Delete a link between two devices. + * @link: Device link to delete. + * + * The caller must ensure proper synchronization of this function with runtime + * PM. + */ +void device_link_del(struct device_link *link) +{ + mutex_lock(&device_links_lock); + device_pm_lock(); + __device_link_del(link); + device_pm_unlock(); + mutex_unlock(&device_links_lock); +} +EXPORT_SYMBOL_GPL(device_link_del); + +static void device_links_missing_supplier(struct device *dev) +{ + struct device_link *link; + + list_for_each_entry(link, &dev->links.suppliers, c_node) + if (link->status == DL_STATE_CONSUMER_PROBE) + WRITE_ONCE(link->status, DL_STATE_AVAILABLE); +} + +/** + * device_links_check_suppliers - Check presence of supplier drivers. + * @dev: Consumer device. + * + * Check links from this device to any suppliers. Walk the list of the device's + * links to suppliers and see if all of them are available. If not, simply + * return -EPROBE_DEFER. + * + * We need to guarantee that the supplier will not go away after the check has + * been positive here. It only can go away in __device_release_driver() and + * that function checks the device's links to consumers. This means we need to + * mark the link as "consumer probe in progress" to make the supplier removal + * wait for us to complete (or bad things may happen). + * + * Links with the DL_FLAG_STATELESS flag set are ignored. + */ +int device_links_check_suppliers(struct device *dev) +{ + struct device_link *link; + int ret = 0; + + mutex_lock(&device_links_lock); + + list_for_each_entry(link, &dev->links.suppliers, c_node) { + if (link->flags & DL_FLAG_STATELESS) + continue; + + if (link->status != DL_STATE_AVAILABLE) { + device_links_missing_supplier(dev); + ret = -EPROBE_DEFER; + break; + } + WRITE_ONCE(link->status, DL_STATE_CONSUMER_PROBE); + } + dev->links.status = DL_DEV_PROBING; + + mutex_unlock(&device_links_lock); + return ret; +} + +/** + * device_links_driver_bound - Update device links after probing its driver. + * @dev: Device to update the links for. + * + * The probe has been successful, so update links from this device to any + * consumers by changing their status to "available". + * + * Also change the status of @dev's links to suppliers to "active". + * + * Links with the DL_FLAG_STATELESS flag set are ignored. + */ +void device_links_driver_bound(struct device *dev) +{ + struct device_link *link; + + mutex_lock(&device_links_lock); + + list_for_each_entry(link, &dev->links.consumers, s_node) { + if (link->flags & DL_FLAG_STATELESS) + continue; + + WARN_ON(link->status != DL_STATE_DORMANT); + WRITE_ONCE(link->status, DL_STATE_AVAILABLE); + } + + list_for_each_entry(link, &dev->links.suppliers, c_node) { + if (link->flags & DL_FLAG_STATELESS) + continue; + + WARN_ON(link->status != DL_STATE_CONSUMER_PROBE); + WRITE_ONCE(link->status, DL_STATE_ACTIVE); + } + + dev->links.status = DL_DEV_DRIVER_BOUND; + + mutex_unlock(&device_links_lock); +} + +/** + * __device_links_no_driver - Update links of a device without a driver. + * @dev: Device without a drvier. + * + * Delete all non-persistent links from this device to any suppliers. + * + * Persistent links stay around, but their status is changed to "available", + * unless they already are in the "supplier unbind in progress" state in which + * case they need not be updated. + * + * Links with the DL_FLAG_STATELESS flag set are ignored. + */ +static void __device_links_no_driver(struct device *dev) +{ + struct device_link *link, *ln; + + list_for_each_entry_safe_reverse(link, ln, &dev->links.suppliers, c_node) { + if (link->flags & DL_FLAG_STATELESS) + continue; + + if (link->flags & DL_FLAG_AUTOREMOVE) + __device_link_del(link); + else if (link->status != DL_STATE_SUPPLIER_UNBIND) + WRITE_ONCE(link->status, DL_STATE_AVAILABLE); + } + + dev->links.status = DL_DEV_NO_DRIVER; +} + +void device_links_no_driver(struct device *dev) +{ + mutex_lock(&device_links_lock); + __device_links_no_driver(dev); + mutex_unlock(&device_links_lock); +} + +/** + * device_links_driver_cleanup - Update links after driver removal. + * @dev: Device whose driver has just gone away. + * + * Update links to consumers for @dev by changing their status to "dormant" and + * invoke %__device_links_no_driver() to update links to suppliers for it as + * appropriate. + * + * Links with the DL_FLAG_STATELESS flag set are ignored. + */ +void device_links_driver_cleanup(struct device *dev) +{ + struct device_link *link; + + mutex_lock(&device_links_lock); + + list_for_each_entry(link, &dev->links.consumers, s_node) { + if (link->flags & DL_FLAG_STATELESS) + continue; + + WARN_ON(link->flags & DL_FLAG_AUTOREMOVE); + WARN_ON(link->status != DL_STATE_SUPPLIER_UNBIND); + WRITE_ONCE(link->status, DL_STATE_DORMANT); + } + + __device_links_no_driver(dev); + + mutex_unlock(&device_links_lock); +} + +/** + * device_links_busy - Check if there are any busy links to consumers. + * @dev: Device to check. + * + * Check each consumer of the device and return 'true' if its link's status + * is one of "consumer probe" or "active" (meaning that the given consumer is + * probing right now or its driver is present). Otherwise, change the link + * state to "supplier unbind" to prevent the consumer from being probed + * successfully going forward. + * + * Return 'false' if there are no probing or active consumers. + * + * Links with the DL_FLAG_STATELESS flag set are ignored. + */ +bool device_links_busy(struct device *dev) +{ + struct device_link *link; + bool ret = false; + + mutex_lock(&device_links_lock); + + list_for_each_entry(link, &dev->links.consumers, s_node) { + if (link->flags & DL_FLAG_STATELESS) + continue; + + if (link->status == DL_STATE_CONSUMER_PROBE + || link->status == DL_STATE_ACTIVE) { + ret = true; + break; + } + WRITE_ONCE(link->status, DL_STATE_SUPPLIER_UNBIND); + } + + dev->links.status = DL_DEV_UNBINDING; + + mutex_unlock(&device_links_lock); + return ret; +} + +/** + * device_links_unbind_consumers - Force unbind consumers of the given device. + * @dev: Device to unbind the consumers of. + * + * Walk the list of links to consumers for @dev and if any of them is in the + * "consumer probe" state, wait for all device probes in progress to complete + * and start over. + * + * If that's not the case, change the status of the link to "supplier unbind" + * and check if the link was in the "active" state. If so, force the consumer + * driver to unbind and start over (the consumer will not re-probe as we have + * changed the state of the link already). + * + * Links with the DL_FLAG_STATELESS flag set are ignored. + */ +void device_links_unbind_consumers(struct device *dev) +{ + struct device_link *link; + + start: + mutex_lock(&device_links_lock); + + list_for_each_entry(link, &dev->links.consumers, s_node) { + enum device_link_state status; + + if (link->flags & DL_FLAG_STATELESS) + continue; + + status = link->status; + if (status == DL_STATE_CONSUMER_PROBE) { + mutex_unlock(&device_links_lock); + + wait_for_device_probe(); + goto start; + } + WRITE_ONCE(link->status, DL_STATE_SUPPLIER_UNBIND); + if (status == DL_STATE_ACTIVE) { + struct device *consumer = link->consumer; + + get_device(consumer); + + mutex_unlock(&device_links_lock); + + device_release_driver_internal(consumer, NULL, + consumer->parent); + put_device(consumer); + goto start; + } + } + + mutex_unlock(&device_links_lock); +} + +/** + * device_links_purge - Delete existing links to other devices. + * @dev: Target device. + */ +static void device_links_purge(struct device *dev) +{ + struct device_link *link, *ln; + + /* + * Delete all of the remaining links from this device to any other + * devices (either consumers or suppliers). + */ + mutex_lock(&device_links_lock); + + list_for_each_entry_safe_reverse(link, ln, &dev->links.suppliers, c_node) { + WARN_ON(link->status == DL_STATE_ACTIVE); + __device_link_del(link); + } + + list_for_each_entry_safe_reverse(link, ln, &dev->links.consumers, s_node) { + WARN_ON(link->status != DL_STATE_DORMANT && + link->status != DL_STATE_NONE); + __device_link_del(link); + } + + mutex_unlock(&device_links_lock); +} + +/* Device links support end. */ + int (*platform_notify)(struct device *dev) = NULL; int (*platform_notify_remove)(struct device *dev) = NULL; static struct kobject *dev_kobj; @@ -711,6 +1197,9 @@ void device_initialize(struct device *de #ifdef CONFIG_GENERIC_MSI_IRQ INIT_LIST_HEAD(&dev->msi_list); #endif + INIT_LIST_HEAD(&dev->links.consumers); + INIT_LIST_HEAD(&dev->links.suppliers); + dev->links.status = DL_DEV_NO_DRIVER; } EXPORT_SYMBOL_GPL(device_initialize); @@ -1240,6 +1729,8 @@ void device_del(struct device *dev) if (dev->bus) blocking_notifier_call_chain(&dev->bus->p->bus_notifier, BUS_NOTIFY_DEL_DEVICE, dev); + + device_links_purge(dev); dpm_sysfs_remove(dev); if (parent) klist_del(&dev->p->knode_parent); Index: linux-pm/drivers/base/dd.c =================================================================== --- linux-pm.orig/drivers/base/dd.c +++ linux-pm/drivers/base/dd.c @@ -249,6 +249,7 @@ static void driver_bound(struct device * __func__, dev_name(dev)); klist_add_tail(&dev->p->knode_driver, &dev->driver->p->klist_devices); + device_links_driver_bound(dev); device_pm_check_callbacks(dev); @@ -341,6 +342,10 @@ static int really_probe(struct device *d return ret; } + ret = device_links_check_suppliers(dev); + if (ret) + return ret; + atomic_inc(&probe_count); pr_debug("bus: '%s': %s: probing driver %s with device %s\n", drv->bus->name, __func__, drv->name, dev_name(dev)); @@ -399,6 +404,7 @@ probe_failed: blocking_notifier_call_chain(&dev->bus->p->bus_notifier, BUS_NOTIFY_DRIVER_NOT_BOUND, dev); pinctrl_bind_failed: + device_links_no_driver(dev); devres_release_all(dev); driver_sysfs_remove(dev); dev->driver = NULL; @@ -756,7 +762,7 @@ EXPORT_SYMBOL_GPL(driver_attach); * __device_release_driver() must be called with @dev lock held. * When called for a USB interface, @dev->parent lock must be held as well. */ -static void __device_release_driver(struct device *dev) +static void __device_release_driver(struct device *dev, struct device *parent) { struct device_driver *drv; @@ -765,6 +771,25 @@ static void __device_release_driver(stru if (driver_allows_async_probing(drv)) async_synchronize_full(); + while (device_links_busy(dev)) { + device_unlock(dev); + if (parent) + device_unlock(parent); + + device_links_unbind_consumers(dev); + if (parent) + device_lock(parent); + + device_lock(dev); + /* + * A concurrent invocation of the same function might + * have released the driver successfully while this one + * was waiting, so check for that. + */ + if (dev->driver != drv) + return; + } + pm_runtime_get_sync(dev); driver_sysfs_remove(dev); @@ -780,6 +805,8 @@ static void __device_release_driver(stru dev->bus->remove(dev); else if (drv->remove) drv->remove(dev); + + device_links_driver_cleanup(dev); devres_release_all(dev); dev->driver = NULL; dev_set_drvdata(dev, NULL); @@ -796,16 +823,16 @@ static void __device_release_driver(stru } } -static void device_release_driver_internal(struct device *dev, - struct device_driver *drv, - struct device *parent) +void device_release_driver_internal(struct device *dev, + struct device_driver *drv, + struct device *parent) { if (parent) device_lock(parent); device_lock(dev); if (!drv || drv == dev->driver) - __device_release_driver(dev); + __device_release_driver(dev, parent); device_unlock(dev); if (parent) @@ -818,6 +845,10 @@ static void device_release_driver_intern * * Manually detach device from driver. * When called for a USB interface, @dev->parent lock must be held. + * + * If this function is to be called with @dev->parent lock held, ensure that + * the device's consumers are unbound in advance or that their locks can be + * acquired under the @dev->parent lock. */ void device_release_driver(struct device *dev) { Index: linux-pm/include/linux/device.h =================================================================== --- linux-pm.orig/include/linux/device.h +++ linux-pm/include/linux/device.h @@ -707,6 +707,79 @@ struct device_dma_parameters { }; /** + * enum device_link_state - Device link states. + * @NONE: The presence of the drivers is not being tracked. + * @DORMANT: None of the supplier/consumer drivers is present. + * @AVAILABLE: The supplier driver is present, but the consumer is not. + * @CONSUMER_PROBE: The supplier driver is present and the consumer is probing. + * @ACTIVE: Active link; both the supplier and consumer drivers are present. + * @SUPPLIER_UNBIND: The supplier driver is unbinding. + */ +enum device_link_state { + DL_STATE_NONE = -1, + DL_STATE_DORMANT = 0, + DL_STATE_AVAILABLE, + DL_STATE_CONSUMER_PROBE, + DL_STATE_ACTIVE, + DL_STATE_SUPPLIER_UNBIND, +}; + +/* + * Device link flags. + * + * STATELESS: The core won't track the presence of supplier/consumer drivers. + * AUTOREMOVE: Remove this link automatically on consumer driver unbind. + */ +#define DL_FLAG_STATELESS (1 << 0) +#define DL_FLAG_AUTOREMOVE (1 << 1) + +/** + * struct device_link - Device link representation. + * @supplier: The device on the supplier end of the link. + * @s_node: Hook to the supplier device's list of links to consumers. + * @consumer: The device on the consumer end of the link. + * @c_node: Hook to the consumer device's list of links to suppliers. + * @status: The state of the link (with respect to the presence of drivers). + * @flags: Link flags. + * @rcu_head: An RCU head to use for deferred execution of SRCU callbacks. + */ +struct device_link { + struct device *supplier; + struct list_head s_node; + struct device *consumer; + struct list_head c_node; + enum device_link_state status; + u32 flags; + struct rcu_head rcu_head; +}; + +/** + * enum dl_dev_state - Device driver presence tracking information. + * @NO_DRIVER: There is no driver attached to the device. + * @PROBING: A driver is probing. + * @DRIVER_BOUND: The driver has been bound to the device. + * @UNBINDING: The driver is unbinding from the device. + */ +enum dl_dev_state { + DL_DEV_NO_DRIVER = 0, + DL_DEV_PROBING, + DL_DEV_DRIVER_BOUND, + DL_DEV_UNBINDING, +}; + +/** + * struct dev_links_info - Device data related to device links. + * @suppliers: List of links to supplier devices. + * @consumers: List of links to consumer devices. + * @status: Driver status information. + */ +struct dev_links_info { + struct list_head suppliers; + struct list_head consumers; + enum dl_dev_state status; +}; + +/** * struct device - The basic device structure * @parent: The device's "parent" device, the device to which it is attached. * In most cases, a parent device is some sort of bus or host @@ -797,6 +870,7 @@ struct device { core doesn't touch it */ void *driver_data; /* Driver data, set and get with dev_set/get_drvdata */ + struct dev_links_info links; struct dev_pm_info power; struct dev_pm_domain *pm_domain; @@ -1113,6 +1187,10 @@ extern void device_shutdown(void); /* debugging and troubleshooting/diagnostic helpers. */ extern const char *dev_driver_string(const struct device *dev); +/* Device links interface. */ +struct device_link *device_link_add(struct device *consumer, + struct device *supplier, u32 flags); +void device_link_del(struct device_link *link); #ifdef CONFIG_PRINTK Index: linux-pm/drivers/base/power/main.c =================================================================== --- linux-pm.orig/drivers/base/power/main.c +++ linux-pm/drivers/base/power/main.c @@ -131,6 +131,7 @@ void device_pm_add(struct device *dev) dev_warn(dev, "parent %s should not be sleeping\n", dev_name(dev->parent)); list_add_tail(&dev->power.entry, &dpm_list); + dev->power.in_dpm_list = true; mutex_unlock(&dpm_list_mtx); } @@ -145,6 +146,7 @@ void device_pm_remove(struct device *dev complete_all(&dev->power.completion); mutex_lock(&dpm_list_mtx); list_del_init(&dev->power.entry); + dev->power.in_dpm_list = false; mutex_unlock(&dpm_list_mtx); device_wakeup_disable(dev); pm_runtime_remove(dev); Index: linux-pm/drivers/base/power/power.h =================================================================== --- linux-pm.orig/drivers/base/power/power.h +++ linux-pm/drivers/base/power/power.h @@ -127,6 +127,11 @@ extern void device_pm_move_after(struct extern void device_pm_move_last(struct device *); extern void device_pm_check_callbacks(struct device *dev); +static inline bool device_pm_initialized(struct device *dev) +{ + return dev->power.in_dpm_list; +} + #else /* !CONFIG_PM_SLEEP */ static inline void device_pm_sleep_init(struct device *dev) {} @@ -146,6 +151,11 @@ static inline void device_pm_move_last(s static inline void device_pm_check_callbacks(struct device *dev) {} +static inline bool device_pm_initialized(struct device *dev) +{ + return device_is_registered(dev); +} + #endif /* !CONFIG_PM_SLEEP */ static inline void device_pm_init(struct device *dev) Index: linux-pm/include/linux/pm.h =================================================================== --- linux-pm.orig/include/linux/pm.h +++ linux-pm/include/linux/pm.h @@ -559,6 +559,7 @@ struct dev_pm_info { pm_message_t power_state; unsigned int can_wakeup:1; unsigned int async_suspend:1; + bool in_dpm_list:1; /* Owned by the PM core */ bool is_prepared:1; /* Owned by the PM core */ bool is_suspended:1; /* Ditto */ bool is_noirq_suspended:1;