Message ID | 20190828163636.12967-5-thierry.reding@gmail.com (mailing list archive) |
---|---|
State | Mainlined, archived |
Commit | c1e33666c94fda365a321c471f2a82996fdf3d83 |
Headers | show |
Series | [1/5] PCI: exynos: Properly handle optional PHYs | expand |
On Wed, Aug 28, 2019 at 06:36:36PM +0200, Thierry Reding wrote: > From: Thierry Reding <treding@nvidia.com> > > devm_phy_get() can fail for a number of resides besides probe deferral. > It can for example return -ENOMEM if it runs out of memory as it tries > to allocate devres structures. Propagating only -EPROBE_DEFER is > problematic because it results in these legitimately fatal errors being > treated as "PHY not specified in DT". > > What we really want is to ignore the optional PHYs only if they have not > been specified in DT. devm_phy_optional_get() is a function that exactly > does what's required here, so use that instead. > > Cc: Ray Jui <rjui@broadcom.com> > Cc: Scott Branden <sbranden@broadcom.com> > Cc: bcm-kernel-feedback-list@broadcom.com > Signed-off-by: Thierry Reding <treding@nvidia.com> > --- > drivers/pci/controller/pcie-iproc-platform.c | 9 +++------ > 1 file changed, 3 insertions(+), 6 deletions(-) > > diff --git a/drivers/pci/controller/pcie-iproc-platform.c b/drivers/pci/controller/pcie-iproc-platform.c > index 5a3550b6bb29..9ee6200a66f4 100644 > --- a/drivers/pci/controller/pcie-iproc-platform.c > +++ b/drivers/pci/controller/pcie-iproc-platform.c > @@ -93,12 +93,9 @@ static int iproc_pcie_pltfm_probe(struct platform_device *pdev) > pcie->need_ib_cfg = of_property_read_bool(np, "dma-ranges"); > > /* PHY use is optional */ > - pcie->phy = devm_phy_get(dev, "pcie-phy"); > - if (IS_ERR(pcie->phy)) { > - if (PTR_ERR(pcie->phy) == -EPROBE_DEFER) > - return -EPROBE_DEFER; > - pcie->phy = NULL; > - } > + pcie->phy = devm_phy_optional_get(dev, "pcie-phy"); > + if (IS_ERR(pcie->phy)) > + return PTR_ERR(pcie->phy); Once you've applied Bjorn's feedback you can add: Reviewed-by: Andrew Murray <andrew.murray@arm.com> I initially thought that you forgot to check for -ENODEV - though I can see that the implementation of devm_phy_optional_get very helpfully does this for us and returns NULL instead of an error. What is also confusing is that devm_regulator_get_optional, despite its _optional suffix doesn't do this and returns an error. I wonder if devm_phy_optional_get should be changed to return NULL instead of an error instead of -ENODEV. I've copied Liam/Mark for feedback. Thanks, Andrew Murray > > ret = devm_of_pci_get_host_bridge_resources(dev, 0, 0xff, &resources, > &iobase); > -- > 2.22.0 >
On Wed, Aug 28, 2019 at 10:26:55PM +0100, Andrew Murray wrote: > I initially thought that you forgot to check for -ENODEV - though I can see > that the implementation of devm_phy_optional_get very helpfully does this for > us and returns NULL instead of an error. > What is also confusing is that devm_regulator_get_optional, despite its > _optional suffix doesn't do this and returns an error. I wonder if > devm_phy_optional_get should be changed to return NULL instead of an error > instead of -ENODEV. I've copied Liam/Mark for feedback. The regulator API has an assumption that people will write bad DTs and not describe all the regulators in the system, this is especially likely in cases where consumer drivers initially don't have regulator support and then get it added since people often only describe supplies actively used by drivers. In order to handle this gracefully the API will substitute in a dummy regulator if it sees that the regulator just isn't described in the system but a consumer requests it, this will ensure that for most simple uses the consumer will function fine even if the DT is not fully described. Since most devices won't physically work if some of their supplies are missing this is a good default assumption. If a consumer could genuinely have some missing supplies (some devices do support this for various reasons) then this support would mean that the consumer would have to have some extra property to say that the regulator is intentionally missing which would be bad. Instead what we do is let the consumer say that real systems could actually be missing the regulator and that the dummy shouldn't be used so that the consumer can handle this.
On Wed, Aug 28, 2019 at 10:49:01PM +0100, Mark Brown wrote: > On Wed, Aug 28, 2019 at 10:26:55PM +0100, Andrew Murray wrote: > > > I initially thought that you forgot to check for -ENODEV - though I can see > > that the implementation of devm_phy_optional_get very helpfully does this for > > us and returns NULL instead of an error. > > > What is also confusing is that devm_regulator_get_optional, despite its > > _optional suffix doesn't do this and returns an error. I wonder if > > devm_phy_optional_get should be changed to return NULL instead of an error > > instead of -ENODEV. I've copied Liam/Mark for feedback. > > The regulator API has an assumption that people will write bad DTs and > not describe all the regulators in the system, this is especially likely > in cases where consumer drivers initially don't have regulator support > and then get it added since people often only describe supplies actively > used by drivers. In order to handle this gracefully the API will > substitute in a dummy regulator if it sees that the regulator just isn't > drescribed in the system but a consumer requests it, this will ensure > that for most simple uses the consumer will function fine even if the DT > is not fully described. Since most devices won't physically work if > some of their supplies are missing this is a good default assumption. Right, if I understand correctly this is the behaviour when regulator_get is called (e.g. NORMAL_GET) - you get a dummy instead of an error. > > If a consumer could genuinely have some missing supplies (some devices > do support this for various reasons) then this support would mean that > the consumer would have to have some extra property to say that the > regulator is intentionally missing which would be bad. Instead what we > do is let the consumer say that real systems could actually be missing > the regulator and that the dummy shouldn't be used so that the consumer > can handle this. And if I understand correctly this is the behaviour when regulator_get_optional is called (e.g. OPTIONAL_GET) - you get -ENODEV instead of a dummy. But why do we return -ENODEV and not NULL for OPTIONAL_GET? Looking at some of the consumer drivers I can see that lots of them don't correctly handle the return value of regulator_get_optional: - some fail their probes and return upon IS_ERR(ret) - for example even if -ENODEV is returned. - some don't fail their probes and assume the regulator isn't present upon IS_ERR(ret) - yet this may not be correct as the regulator may be present but -ENOMEM was returned. Given that a common pattern is to set a consumer regulator pointer to NULL upon -ENODEV - if regulator_get_optional did this for them, then it would be more difficult for consumer drivers to get the error handling wrong and would remove some consumer boiler plate code. (Of course some consumers won't set a regulator pointer to NULL and instead test it against IS_ERR instead of NULL everywhere (IS_ERR(NULL) is false) - but such a change may be a reason to not use IS_ERR everywhere). As I understand, if a consumer wants to fail upon an absent regulator it seems the only way they can do this is call regulator_get_optional (which seems odd) and test for -ENODEV. I'm not sure if there is actually a use-case for this. I guess I'm looking here for something that can simplify consumer error handling - it's easy to get wrong and it seems that many drivers may be wrong. Thanks, Andrew Murray
On Thu, Aug 29, 2019 at 11:09:34AM +0100, Andrew Murray wrote: > On Wed, Aug 28, 2019 at 10:49:01PM +0100, Mark Brown wrote: > > On Wed, Aug 28, 2019 at 10:26:55PM +0100, Andrew Murray wrote: > > > > > I initially thought that you forgot to check for -ENODEV - though I can see > > > that the implementation of devm_phy_optional_get very helpfully does this for > > > us and returns NULL instead of an error. > > > > > What is also confusing is that devm_regulator_get_optional, despite its > > > _optional suffix doesn't do this and returns an error. I wonder if > > > devm_phy_optional_get should be changed to return NULL instead of an error > > > instead of -ENODEV. I've copied Liam/Mark for feedback. > > > > The regulator API has an assumption that people will write bad DTs and > > not describe all the regulators in the system, this is especially likely > > in cases where consumer drivers initially don't have regulator support > > and then get it added since people often only describe supplies actively > > used by drivers. In order to handle this gracefully the API will > > substitute in a dummy regulator if it sees that the regulator just isn't > > drescribed in the system but a consumer requests it, this will ensure > > that for most simple uses the consumer will function fine even if the DT > > is not fully described. Since most devices won't physically work if > > some of their supplies are missing this is a good default assumption. > > Right, if I understand correctly this is the behaviour when regulator_get > is called (e.g. NORMAL_GET) - you get a dummy instead of an error. > > > > > If a consumer could genuinely have some missing supplies (some devices > > do support this for various reasons) then this support would mean that > > the consumer would have to have some extra property to say that the > > regulator is intentionally missing which would be bad. Instead what we > > do is let the consumer say that real systems could actually be missing > > the regulator and that the dummy shouldn't be used so that the consumer > > can handle this. > > And if I understand correctly this is the behaviour when > regulator_get_optional is called (e.g. OPTIONAL_GET) - you get -ENODEV > instead of a dummy. > > But why do we return -ENODEV and not NULL for OPTIONAL_GET? > > Looking at some of the consumer drivers I can see that lots of them don't > correctly handle the return value of regulator_get_optional: > > - some fail their probes and return upon IS_ERR(ret) - for example even > if -ENODEV is returned. > > - some don't fail their probes and assume the regulator isn't present upon > IS_ERR(ret) - yet this may not be correct as the regulator may be present > but -ENOMEM was returned. > > Given that a common pattern is to set a consumer regulator pointer to NULL > upon -ENODEV - if regulator_get_optional did this for them, then it would > be more difficult for consumer drivers to get the error handling wrong and > would remove some consumer boiler plate code. > > (Of course some consumers won't set a regulator pointer to NULL and instead > test it against IS_ERR instead of NULL everywhere (IS_ERR(NULL) is false) - > but such a change may be a reason to not use IS_ERR everywhere). > > As I understand, if a consumer wants to fail upon an absent regulator > it seems the only way they can do this is call regulator_get_optional (which > seems odd) and test for -ENODEV. I'm not sure if there is actually a use-case > for this. > > I guess I'm looking here for something that can simplify consumer error > handling - it's easy to get wrong and it seems that many drivers may be wrong. Agreed. However, this requires a thorough audit of all callers of regulator_get_optional() to make sure they behave in a sane way. To further complicate things, unless we want to convert all ~100 callers in a single patch we need to convert all of them to set the regulator pointer to NULL on -ENODEV. After that we can make the change to regulator_get_optional() and only then can we remove the now obsolete boilerplate from those ~100 callers. Not impossible, but pretty time- consuming. While at it, we could also add optional variants to some of the *phy*_get() functions to convert those as well. Currently there's only optional variants for phy_get() and devm_phy_get(), but a bunch of drivers use of_phy_get() or of_phy_get_by_index(). Though especially the latter isn't very common with optional PHYs, I think. I also noticed a slightly similar pattern for GPIOs. Perhaps this would be a good task for someone with good semantic patch skills. Or perhaps something to add to the janitors' TODO list? Not sure if that's still a thing, though. Thierry
On Thu, Aug 29, 2019 at 11:09:34AM +0100, Andrew Murray wrote: > But why do we return -ENODEV and not NULL for OPTIONAL_GET? Why would we return NULL? The caller is going to have to check the error code anyway since they need to handle -EPROBE_DEFER and NULL is never a valid thing to use with the regulator API. > Looking at some of the consumer drivers I can see that lots of them don't > correctly handle the return value of regulator_get_optional: This API is *really* commonly abused, especially in the graphics subsystem which for some reason has lots of users even though I don't think I've ever seen a sensible use of the API there. As far as I can tell the graphics subsystem mostly suffers from people cut'n'pasting from the Qualcomm BSP which is full of really bad practice. It's really frustrating since I will intermittently go through and clean things up but the uses just keep coming back into the subsystem. > Given that a common pattern is to set a consumer regulator pointer to NULL > upon -ENODEV - if regulator_get_optional did this for them, then it would > be more difficult for consumer drivers to get the error handling wrong and > would remove some consumer boiler plate code. No, they'd all still be broken for probe deferral (which is commonly caused by off-chip devices) and they'd crash as soon as they try to call any further regulator API functions. > I guess I'm looking here for something that can simplify consumer error > handling - it's easy to get wrong and it seems that many drivers may be wrong. The overwhelming majority of the users just shouldn't be using this API. If there weren't devices that absolutely *need* this API it wouldn't be there in the first place since it seems like a magent for bad code, it's so disappointing how bad the majority of the consumer code is.
On Thu, Aug 29, 2019 at 12:17:29PM +0100, Mark Brown wrote: > On Thu, Aug 29, 2019 at 11:09:34AM +0100, Andrew Murray wrote: > > > But why do we return -ENODEV and not NULL for OPTIONAL_GET? > > Why would we return NULL? The caller is going to have to check the > error code anyway since they need to handle -EPROBE_DEFER and NULL is > never a valid thing to use with the regulator API. I think the idea is that consumers would do something like this: supply = regulator_get_optional(dev, "foo"); if (IS_ERR(supply)) { /* -EPROBE_DEFER handled as part of this */ return PTR_ERR(supply); } /* * Optional supply is NULL here, making it "easier" to check * whether or not to use it. */ I suppose checking using IS_ERR() is equally simple, so this mostly boils down to taste... The PHY subsystem, for instance, uses a similar approach but it goes one step further. All APIs can take NULL as their struct phy * argument and return success in that case, which makes it really trivial to deal with optional PHYs. You really don't have to care about them at all after you get NULL from phy_get_optional(). > > Looking at some of the consumer drivers I can see that lots of them don't > > correctly handle the return value of regulator_get_optional: > > This API is *really* commonly abused, especially in the graphics > subsystem which for some reason has lots of users even though I don't > think I've ever seen a sensible use of the API there. As far as I can > tell the graphics subsystem mostly suffers from people cut'n'pasting > from the Qualcomm BSP which is full of really bad practice. It's really > frustrating since I will intermittently go through and clean things up > but the uses just keep coming back into the subsystem. The intention here is to make it more difficult to abuse. Obviously if people keep copying abuses from one driver to another that's a different story. But if there was no way to abuse the API and if it automatically did the right thing, even copy/paste abuse would be difficult to pull off. > > Given that a common pattern is to set a consumer regulator pointer to NULL > > upon -ENODEV - if regulator_get_optional did this for them, then it would > > be more difficult for consumer drivers to get the error handling wrong and > > would remove some consumer boiler plate code. > > No, they'd all still be broken for probe deferral (which is commonly > caused by off-chip devices) and they'd crash as soon as they try to call > any further regulator API functions. I think what Andrew was suggesting here was to make it easier for people to determine whether or not an optional regulator was in fact requested successfully or not. Many consumers already set the optional supply to NULL and then check for that before calling any regulator API. If regulator_get_optional() returned NULL for absent optional supplies, this could be unified across all drivers. And it would allow treating NULL regulators special, if that's something you'd be willing to do. In either case, the number of abuses shows that people clearly don't understand how to use this. So there are two options: a) fix abuse every time we come across it or b) try to change the API to make it more difficult to abuse. > > I guess I'm looking here for something that can simplify consumer error > > handling - it's easy to get wrong and it seems that many drivers may be wrong. > > The overwhelming majority of the users just shouldn't be using this API. > If there weren't devices that absolutely *need* this API it wouldn't be > there in the first place since it seems like a magent for bad code, it's > so disappointing how bad the majority of the consumer code is. Yeah, I guess in many cases this API is used as a cheap way to model always-on, fixed-voltage regulators. It's pretty hard to change those after the fact, though, because they're usually happening as part of device tree bindings implementation, so by the time we notice them, they've often become ABI... Thierry
On Thu, Aug 29, 2019 at 01:46:03PM +0200, Thierry Reding wrote: > On Thu, Aug 29, 2019 at 12:17:29PM +0100, Mark Brown wrote: > > On Thu, Aug 29, 2019 at 11:09:34AM +0100, Andrew Murray wrote: > > > > > But why do we return -ENODEV and not NULL for OPTIONAL_GET? > > > > Why would we return NULL? The caller is going to have to check the > > error code anyway since they need to handle -EPROBE_DEFER and NULL is > > never a valid thing to use with the regulator API. > > I think the idea is that consumers would do something like this: > > supply = regulator_get_optional(dev, "foo"); > if (IS_ERR(supply)) { > /* -EPROBE_DEFER handled as part of this */ > return PTR_ERR(supply); > } > > /* > * Optional supply is NULL here, making it "easier" to check > * whether or not to use it. > */ Indeed. This way the error path is only for errors (if you consider that requesting an optional regulator that doesn't exist isn't an error - and if you also consider that -EPROBE_DEFER is an error, or at least a reason to bail). > > I suppose checking using IS_ERR() is equally simple, so this mostly > boils down to taste... > > The PHY subsystem, for instance, uses a similar approach but it goes one > step further. All APIs can take NULL as their struct phy * argument and > return success in that case, which makes it really trivial to deal with > optional PHYs. You really don't have to care about them at all after you > get NULL from phy_get_optional(). I can see how this makes everything very convenient for the driver, though this doesn't smell right. > > > > Looking at some of the consumer drivers I can see that lots of them don't > > > correctly handle the return value of regulator_get_optional: > > > > This API is *really* commonly abused, especially in the graphics > > subsystem which for some reason has lots of users even though I don't > > think I've ever seen a sensible use of the API there. As far as I can > > tell the graphics subsystem mostly suffers from people cut'n'pasting > > from the Qualcomm BSP which is full of really bad practice. It's really > > frustrating since I will intermittently go through and clean things up > > but the uses just keep coming back into the subsystem. > > The intention here is to make it more difficult to abuse. Obviously if > people keep copying abuses from one driver to another that's a different > story. But if there was no way to abuse the API and if it automatically > did the right thing, even copy/paste abuse would be difficult to pull > off. That's the motativation here. > > > > Given that a common pattern is to set a consumer regulator pointer to NULL > > > upon -ENODEV - if regulator_get_optional did this for them, then it would > > > be more difficult for consumer drivers to get the error handling wrong and > > > would remove some consumer boiler plate code. > > > > No, they'd all still be broken for probe deferral (which is commonly > > caused by off-chip devices) and they'd crash as soon as they try to call > > any further regulator API functions. regulator_get_optional would still return -EPROBE_DEFER and this would be caught in the above example set out by Thierry. > > I think what Andrew was suggesting here was to make it easier for people > to determine whether or not an optional regulator was in fact requested > successfully or not. Many consumers already set the optional supply to > NULL and then check for that before calling any regulator API. > > If regulator_get_optional() returned NULL for absent optional supplies, > this could be unified across all drivers. And it would allow treating > NULL regulators special, if that's something you'd be willing to do. > > In either case, the number of abuses shows that people clearly don't > understand how to use this. So there are two options: a) fix abuse every > time we come across it or b) try to change the API to make it more > difficult to abuse. Sure. I think we end up with something like: diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c index e0c0cf462004..67e2a6d7abf6 100644 --- a/drivers/regulator/core.c +++ b/drivers/regulator/core.c @@ -1868,6 +1868,9 @@ struct regulator *_regulator_get(struct device *dev, const char *id, } switch (get_type) { + case OPTIONAL_GET: + return NULL; + case NORMAL_GET: /* * Assume that a regulator is physically present and Of course there may be other approaches. > > > > I guess I'm looking here for something that can simplify consumer error > > > handling - it's easy to get wrong and it seems that many drivers may be wrong. > > > > The overwhelming majority of the users just shouldn't be using this API. > > If there weren't devices that absolutely *need* this API it wouldn't be > > there in the first place since it seems like a magent for bad code, it's > > so disappointing how bad the majority of the consumer code is. > > Yeah, I guess in many cases this API is used as a cheap way to model > always-on, fixed-voltage regulators. It's pretty hard to change those > after the fact, though, because they're usually happening as part of > device tree bindings implementation, so by the time we notice them, > they've often become ABI... Thanks, Andrew Murray > > Thierry
On Thu, Aug 29, 2019 at 12:48:06PM +0200, Thierry Reding wrote: > On Thu, Aug 29, 2019 at 11:09:34AM +0100, Andrew Murray wrote: > > On Wed, Aug 28, 2019 at 10:49:01PM +0100, Mark Brown wrote: > > > On Wed, Aug 28, 2019 at 10:26:55PM +0100, Andrew Murray wrote: > > > > > > > I initially thought that you forgot to check for -ENODEV - though I can see > > > > that the implementation of devm_phy_optional_get very helpfully does this for > > > > us and returns NULL instead of an error. > > > > > > > What is also confusing is that devm_regulator_get_optional, despite its > > > > _optional suffix doesn't do this and returns an error. I wonder if > > > > devm_phy_optional_get should be changed to return NULL instead of an error > > > > instead of -ENODEV. I've copied Liam/Mark for feedback. > > > > > > The regulator API has an assumption that people will write bad DTs and > > > not describe all the regulators in the system, this is especially likely > > > in cases where consumer drivers initially don't have regulator support > > > and then get it added since people often only describe supplies actively > > > used by drivers. In order to handle this gracefully the API will > > > substitute in a dummy regulator if it sees that the regulator just isn't > > > drescribed in the system but a consumer requests it, this will ensure > > > that for most simple uses the consumer will function fine even if the DT > > > is not fully described. Since most devices won't physically work if > > > some of their supplies are missing this is a good default assumption. > > > > Right, if I understand correctly this is the behaviour when regulator_get > > is called (e.g. NORMAL_GET) - you get a dummy instead of an error. > > > > > > > > If a consumer could genuinely have some missing supplies (some devices > > > do support this for various reasons) then this support would mean that > > > the consumer would have to have some extra property to say that the > > > regulator is intentionally missing which would be bad. Instead what we > > > do is let the consumer say that real systems could actually be missing > > > the regulator and that the dummy shouldn't be used so that the consumer > > > can handle this. > > > > And if I understand correctly this is the behaviour when > > regulator_get_optional is called (e.g. OPTIONAL_GET) - you get -ENODEV > > instead of a dummy. > > > > But why do we return -ENODEV and not NULL for OPTIONAL_GET? > > > > Looking at some of the consumer drivers I can see that lots of them don't > > correctly handle the return value of regulator_get_optional: > > > > - some fail their probes and return upon IS_ERR(ret) - for example even > > if -ENODEV is returned. > > > > - some don't fail their probes and assume the regulator isn't present upon > > IS_ERR(ret) - yet this may not be correct as the regulator may be present > > but -ENOMEM was returned. > > > > Given that a common pattern is to set a consumer regulator pointer to NULL > > upon -ENODEV - if regulator_get_optional did this for them, then it would > > be more difficult for consumer drivers to get the error handling wrong and > > would remove some consumer boiler plate code. > > > > (Of course some consumers won't set a regulator pointer to NULL and instead > > test it against IS_ERR instead of NULL everywhere (IS_ERR(NULL) is false) - > > but such a change may be a reason to not use IS_ERR everywhere). > > > > As I understand, if a consumer wants to fail upon an absent regulator > > it seems the only way they can do this is call regulator_get_optional (which > > seems odd) and test for -ENODEV. I'm not sure if there is actually a use-case > > for this. > > > > I guess I'm looking here for something that can simplify consumer error > > handling - it's easy to get wrong and it seems that many drivers may be wrong. > > Agreed. However, this requires a thorough audit of all callers of > regulator_get_optional() to make sure they behave in a sane way. To > further complicate things, unless we want to convert all ~100 callers > in a single patch we need to convert all of them to set the regulator > pointer to NULL on -ENODEV. After that we can make the change to > regulator_get_optional() and only then can we remove the now obsolete > boilerplate from those ~100 callers. Not impossible, but pretty time- > consuming. This makes sense. > > While at it, we could also add optional variants to some of the > *phy*_get() functions to convert those as well. Currently there's only > optional variants for phy_get() and devm_phy_get(), but a bunch of > drivers use of_phy_get() or of_phy_get_by_index(). Though especially the > latter isn't very common with optional PHYs, I think. > > I also noticed a slightly similar pattern for GPIOs. Perhaps this would > be a good task for someone with good semantic patch skills. Or perhaps > something to add to the janitors' TODO list? Not sure if that's still a > thing, though. Yeah I'm pretty sure this applies to several APIs. Janitors is still around - http://vger.kernel.org/vger-lists.html#kernel-janitors I think there are others too, such as the Linux Kernel Mentorship Program Thanks, Andrew Murray > > Thierry
On Thu, Aug 29, 2019 at 01:46:03PM +0200, Thierry Reding wrote: > On Thu, Aug 29, 2019 at 12:17:29PM +0100, Mark Brown wrote: > > Why would we return NULL? The caller is going to have to check the > > error code anyway since they need to handle -EPROBE_DEFER and NULL is > > never a valid thing to use with the regulator API. > I think the idea is that consumers would do something like this: > supply = regulator_get_optional(dev, "foo"); > if (IS_ERR(supply)) { > /* -EPROBE_DEFER handled as part of this */ > return PTR_ERR(supply); > } > /* > * Optional supply is NULL here, making it "easier" to check > * whether or not to use it. > */ > I suppose checking using IS_ERR() is equally simple, so this mostly > boils down to taste... Or setting a flag saying which mode to drive the chip in for example. > The PHY subsystem, for instance, uses a similar approach but it goes one > step further. All APIs can take NULL as their struct phy * argument and > return success in that case, which makes it really trivial to deal with > optional PHYs. You really don't have to care about them at all after you > get NULL from phy_get_optional(). That case really doesn't exist for the regulator API, that's the case which is handled by the dummy regulator in the regulator API - it really is rare to see devices that operate without power. I suspect the PHY case is similar in that there always is a PHY of some kind. If a consumer can be written like that then it just shouldn't be using regulator_get_optional() in the first place. > > > Looking at some of the consumer drivers I can see that lots of them don't > > > correctly handle the return value of regulator_get_optional: > > This API is *really* commonly abused, especially in the graphics > > subsystem which for some reason has lots of users even though I don't > > think I've ever seen a sensible use of the API there. As far as I can > > tell the graphics subsystem mostly suffers from people cut'n'pasting > > from the Qualcomm BSP which is full of really bad practice. It's really > > frustrating since I will intermittently go through and clean things up > > but the uses just keep coming back into the subsystem. > The intention here is to make it more difficult to abuse. Obviously if > people keep copying abuses from one driver to another that's a different > story. But if there was no way to abuse the API and if it automatically > did the right thing, even copy/paste abuse would be difficult to pull > off. I fail to see how returning NULL would change anything with regard to the API being abused, if anything I think it'd make it more likely to have people write things that break for probe deferral since it's not reminding people about error codes (that's very marginal though). > > No, they'd all still be broken for probe deferral (which is commonly > > caused by off-chip devices) and they'd crash as soon as they try to call > > any further regulator API functions. > I think what Andrew was suggesting here was to make it easier for people > to determine whether or not an optional regulator was in fact requested > successfully or not. Many consumers already set the optional supply to > NULL and then check for that before calling any regulator API. I think that really is very marginal. > If regulator_get_optional() returned NULL for absent optional supplies, > this could be unified across all drivers. And it would allow treating > NULL regulators special, if that's something you'd be willing to do. Not really, no. What we're doing at the minute at least mitigates against the problems we used to have with people just not handling errors at all and having the dummy regulator gives us some opportunity to do checks on API usage in the consumers that would otherwise not get run which helps robustness for the systems where there are real, controllable regulators providing those supplies. > In either case, the number of abuses shows that people clearly don't > understand how to use this. So there are two options: a) fix abuse every > time we come across it or b) try to change the API to make it more > difficult to abuse. I don't think there's much that can be done to avoid abuse of the API, I've thought of things like having a #define around the prototype for "yes I really meant to use this" which consumers would have to define in an effort to try to flag up to developers and reviewers that it's not normal. > > > I guess I'm looking here for something that can simplify consumer error > > > handling - it's easy to get wrong and it seems that many drivers may be wrong. > > The overwhelming majority of the users just shouldn't be using this API. > > If there weren't devices that absolutely *need* this API it wouldn't be > > there in the first place since it seems like a magent for bad code, it's > > so disappointing how bad the majority of the consumer code is. > Yeah, I guess in many cases this API is used as a cheap way to model > always-on, fixed-voltage regulators. It's pretty hard to change those > after the fact, though, because they're usually happening as part of > device tree bindings implementation, so by the time we notice them, > they've often become ABI... I don't follow here? Such users are going to be the common case for the regulator API and shouldn't have any problems using normal regulator_get(). When I say users shouldn't be using this API I mean _get_optional() specifically.
On Thu, Aug 29, 2019 at 01:08:35PM +0100, Andrew Murray wrote: > On Thu, Aug 29, 2019 at 01:46:03PM +0200, Thierry Reding wrote: > > If regulator_get_optional() returned NULL for absent optional supplies, > > this could be unified across all drivers. And it would allow treating > > NULL regulators special, if that's something you'd be willing to do. > > In either case, the number of abuses shows that people clearly don't > > understand how to use this. So there are two options: a) fix abuse every > > time we come across it or b) try to change the API to make it more > > difficult to abuse. > Sure. I think we end up with something like: > diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c > index e0c0cf462004..67e2a6d7abf6 100644 > --- a/drivers/regulator/core.c > +++ b/drivers/regulator/core.c > @@ -1868,6 +1868,9 @@ struct regulator *_regulator_get(struct device *dev, const char *id, > } > > switch (get_type) { > + case OPTIONAL_GET: > + return NULL; > + Implementing returning NULL is not hard. How returning NULL discourages people from using regulator_get_optional() when they shouldn't be using it in the first place is not clear to me.
On Thu, Aug 29, 2019 at 02:16:03PM +0100, Mark Brown wrote: > On Thu, Aug 29, 2019 at 01:08:35PM +0100, Andrew Murray wrote: > > On Thu, Aug 29, 2019 at 01:46:03PM +0200, Thierry Reding wrote: > > > > If regulator_get_optional() returned NULL for absent optional supplies, > > > this could be unified across all drivers. And it would allow treating > > > NULL regulators special, if that's something you'd be willing to do. > > > > In either case, the number of abuses shows that people clearly don't > > > understand how to use this. So there are two options: a) fix abuse every > > > time we come across it or b) try to change the API to make it more > > > difficult to abuse. > > > Sure. I think we end up with something like: > > > diff --git a/drivers/regulator/core.c b/drivers/regulator/core.c > > index e0c0cf462004..67e2a6d7abf6 100644 > > --- a/drivers/regulator/core.c > > +++ b/drivers/regulator/core.c > > @@ -1868,6 +1868,9 @@ struct regulator *_regulator_get(struct device *dev, const char *id, > > } > > > > switch (get_type) { > > + case OPTIONAL_GET: > > + return NULL; > > + > > Implementing returning NULL is not hard. How returning NULL discourages > people from using regulator_get_optional() when they shouldn't be using > it in the first place is not clear to me. I think this is the part I haven't understood until now. There are many consumer drivers that will not have a regulator specified in the DT - this may be because they are optional (possibly a rare thing) or because they don't need to be specified (because they are always on and require no software interaction)... Where they are not specified, because there is really no reason for them to be described in the DT - then these drivers should use regulator_get and be happy with a dummy regulator. This gives a benefit as if another hardware version uses the same driver but does have a regulator that needs software control then we can be confident it will work. Where regulators are really optional, then regulator_get_optional is used and -ENODEV can be used by the caller to perform a different course of action if required. (Does this use-case actually exist?) I guess I interpreted _optional as 'it's OK if you can't provide a regulator', whereas the meaning is really 'get me a regulator that may not exist'. Is my understanding correct? If so I guess another course of action would be to clean-up users of _optional and convert them to regulator_get where appropriate? Thanks, Andrew Murray
On Thu, Aug 29, 2019 at 02:03:23PM +0100, Mark Brown wrote: > On Thu, Aug 29, 2019 at 01:46:03PM +0200, Thierry Reding wrote: > > On Thu, Aug 29, 2019 at 12:17:29PM +0100, Mark Brown wrote: > > > > Why would we return NULL? The caller is going to have to check the > > > error code anyway since they need to handle -EPROBE_DEFER and NULL is > > > never a valid thing to use with the regulator API. > > > I think the idea is that consumers would do something like this: > > > supply = regulator_get_optional(dev, "foo"); > > if (IS_ERR(supply)) { > > /* -EPROBE_DEFER handled as part of this */ > > return PTR_ERR(supply); > > } > > > /* > > * Optional supply is NULL here, making it "easier" to check > > * whether or not to use it. > > */ > > > I suppose checking using IS_ERR() is equally simple, so this mostly > > boils down to taste... > > Or setting a flag saying which mode to drive the chip in for example. > > > The PHY subsystem, for instance, uses a similar approach but it goes one > > step further. All APIs can take NULL as their struct phy * argument and > > return success in that case, which makes it really trivial to deal with > > optional PHYs. You really don't have to care about them at all after you > > get NULL from phy_get_optional(). > > That case really doesn't exist for the regulator API, that's the case > which is handled by the dummy regulator in the regulator API - it really > is rare to see devices that operate without power. I suspect the PHY > case is similar in that there always is a PHY of some kind. If a > consumer can be written like that then it just shouldn't be using > regulator_get_optional() in the first place. One recent example, which is actually what spawned this series of cleanups, that comes to mind here is the case where we need regulators to supply power to a PCI device. The PCI device that consumes power is itself not listed in the device tree, because it is usually enumerated via the PCI bus. However, we still have to provide power to a PCI slot to make sure the device powers up and can be enumerated in the first place. For this particular case it's pretty obvious that the supplies are in fact required. However, the same PCI controller can also be used with onboard devices where no standard 3.3 V and 12 V need to be supplied (as in the PCIe slot case above). Instead there are usually different power supplies to power the device. Also, since these supplies are usually required to bring the device up and make it detectable on the PCI bus, these supplies are typically always on. Also, since the PCI controller is not the consumer of the power supplies, it's impossible to specify which supplies exactly are needed by the device (they'd have to be described by a binding for these devices, but drivers couldn't be requesting them because without the supplies being enabled the device would never show up on the PCI bus and the driver would never bind). Do you think those 3.3 V and 12 V regulators would qualify for use of the regulator_get_optional() API? Because in the case where the PCIe controller doesn't drive a PCIe slot, the 3.3 V and 12 V supplies really are not there. > > > > Looking at some of the consumer drivers I can see that lots of them don't > > > > correctly handle the return value of regulator_get_optional: > > > > This API is *really* commonly abused, especially in the graphics > > > subsystem which for some reason has lots of users even though I don't > > > think I've ever seen a sensible use of the API there. As far as I can > > > tell the graphics subsystem mostly suffers from people cut'n'pasting > > > from the Qualcomm BSP which is full of really bad practice. It's really > > > frustrating since I will intermittently go through and clean things up > > > but the uses just keep coming back into the subsystem. > > > The intention here is to make it more difficult to abuse. Obviously if > > people keep copying abuses from one driver to another that's a different > > story. But if there was no way to abuse the API and if it automatically > > did the right thing, even copy/paste abuse would be difficult to pull > > off. > > I fail to see how returning NULL would change anything with regard to > the API being abused, if anything I think it'd make it more likely to > have people write things that break for probe deferral since it's not > reminding people about error codes (that's very marginal though). People would of course still need to check the return value for errors. Returning NULL would only make sure that if there's really no regulator there's a standard way to signal that. And drivers could always use similar code patterns to deal with optional regulators. Right now there are two patterns: set regulator to NULL if the regulator is not available (and use !supply checks before calling regulator APIs) or leave the regulator set to whatever error pointer was returned (and use !IS_ERR() checks before calling regulator APIs). In many cases above, the drivers will continue without a regulator irrespective of the error code returned. If we return NULL in exactly only the case where the regulator is not there and an ERR_PTR() otherwise, it becomes much clearer that all errors should just be propagated to the caller (including -EPROBE_DEFER) and otherwise we can continue with appropriate NULL checks. Again, yeah, this might be marginal. > > > No, they'd all still be broken for probe deferral (which is commonly > > > caused by off-chip devices) and they'd crash as soon as they try to call > > > any further regulator API functions. > > > I think what Andrew was suggesting here was to make it easier for people > > to determine whether or not an optional regulator was in fact requested > > successfully or not. Many consumers already set the optional supply to > > NULL and then check for that before calling any regulator API. > > I think that really is very marginal. > > > If regulator_get_optional() returned NULL for absent optional supplies, > > this could be unified across all drivers. And it would allow treating > > NULL regulators special, if that's something you'd be willing to do. > > Not really, no. What we're doing at the minute at least mitigates > against the problems we used to have with people just not handling > errors at all and having the dummy regulator gives us some opportunity > to do checks on API usage in the consumers that would otherwise not get > run which helps robustness for the systems where there are real, > controllable regulators providing those supplies. > > > In either case, the number of abuses shows that people clearly don't > > understand how to use this. So there are two options: a) fix abuse every > > time we come across it or b) try to change the API to make it more > > difficult to abuse. > > I don't think there's much that can be done to avoid abuse of the API, > I've thought of things like having a #define around the prototype for > "yes I really meant to use this" which consumers would have to define in > an effort to try to flag up to developers and reviewers that it's not > normal. > > > > > I guess I'm looking here for something that can simplify consumer error > > > > handling - it's easy to get wrong and it seems that many drivers may be wrong. > > > > The overwhelming majority of the users just shouldn't be using this API. > > > If there weren't devices that absolutely *need* this API it wouldn't be > > > there in the first place since it seems like a magent for bad code, it's > > > so disappointing how bad the majority of the consumer code is. > > > Yeah, I guess in many cases this API is used as a cheap way to model > > always-on, fixed-voltage regulators. It's pretty hard to change those > > after the fact, though, because they're usually happening as part of > > device tree bindings implementation, so by the time we notice them, > > they've often become ABI... > > I don't follow here? Such users are going to be the common case for the > regulator API and shouldn't have any problems using normal regulator_get(). > When I say users shouldn't be using this API I mean _get_optional() > specifically. True, even if the regulator is specified as optional in the bindings, using regulator_get() would effectively make it optional for the driver given that it returns the dummy. Thierry
On Thu, Aug 29, 2019 at 02:43:46PM +0100, Andrew Murray wrote: > Where they are not specified, because there is really no reason for them to > be described in the DT - then these drivers should use regulator_get and > be happy with a dummy regulator. This gives a benefit as if another hardware > version uses the same driver but does have a regulator that needs software > control then we can be confident it will work. The common use case for this is that some boards have flexible power control which allows them save energy by powering off chips that are not in use, the driver doesn't super care if it actually gets powered off or not but it's nice to be able to do so. > Where regulators are really optional, then regulator_get_optional is used > and -ENODEV can be used by the caller to perform a different course of action > if required. (Does this use-case actually exist?) Yes. There are two main use cases. One is for things like MMC where there's optional supplies that can be used for some more advanced use cases but their use needs to be negotiated between the host and device, these typically enable lower power or higher performance modes at the cost of complexity. The other is for devices which have the option of using an internal regulator but can use an external one. This is typically used either for performance reasons (the external regulator might perform better but will increase board cost) or if some systems need multiple devices to be operating with the same reference voltage. > I guess I interpreted _optional as 'it's OK if you can't provide a regulator', > whereas the meaning is really 'get me a regulator that may not exist'. > Is my understanding correct? If so I guess another course of action would Yes. > be to clean-up users of _optional and convert them to regulator_get where > appropriate? Yes, I keep doing that intermittently (hence my frustration with more usages continually popping up in graphics drivers).
On Thu, Aug 29, 2019 at 04:58:20PM +0200, Thierry Reding wrote: > However, the same PCI controller can also be used with onboard devices > where no standard 3.3 V and 12 V need to be supplied (as in the PCIe > slot case above). Instead there are usually different power supplies > to power the device. Also, since these supplies are usually required to > bring the device up and make it detectable on the PCI bus, these > supplies are typically always on. Also, since the PCI controller is not > the consumer of the power supplies, it's impossible to specify which > supplies exactly are needed by the device (they'd have to be described > by a binding for these devices, but drivers couldn't be requesting them > because without the supplies being enabled the device would never show > up on the PCI bus and the driver would never bind). These on board devices that might have non-standard supplies are more of a problem as you say. This is a general problem with enumerable buses that get deployed in embedded contexts, things like clocks and GPIOs can also be required for enumeration. Ideally we'd have some pre-probe callback that could sort these things out but that's core device model work I've never found the time/enthusiasm to work on. IIRC there is already a PCI binding for DT so I guess we could do something like say it's up to the driver for the PCI device and just call probe() even without enumeration and require the device to power things up but that feels very messy. > Do you think those 3.3 V and 12 V regulators would qualify for use of > the regulator_get_optional() API? Because in the case where the PCIe > controller doesn't drive a PCIe slot, the 3.3 V and 12 V supplies really > are not there. It doesn't fill me with joy. I think the main thing is working out where to attach the supply. The cleanest and most idiomatic thing from a regulator perspective for the physical slots would be for the supplies to be attached to the PCI slot rather than the controller in the DT, even though they will get driven by the controller driver in practice. This would mean that controllers would have optional slot objects with mandatory regulators, the controller would then have to cope with powering the slot up before enumerating it but would be able to power it off if nothing's plugged in and save a bit of power, it also copes with the case where individual slots have separately controllable power. I'm not sure how this plays more generally but since the slots are a physical thing you can point to on the board it doesn't seem unreasonable. Every time I see these supplies attached to the controller for a bus it always feels a bit wrong, especially in interactions with soldered down devices. That feels cleaner than saying that the controller has a couple of optional supplies, it plays a bit better with soldered down devices and with slots having distinct supplies and it generally feels a bit more consistent. I'm not super sure I'm *happy* with this approach though, it feels a bit awkward with the device model. > > When I say users shouldn't be using this API I mean _get_optional() > > specifically. > True, even if the regulator is specified as optional in the bindings, > using regulator_get() would effectively make it optional for the driver > given that it returns the dummy. Unless the hardware can genuinely cope with there being literally no power supply there (as opposed to some fixed voltage thing) it probably shouldn't be specifying the supply as optional in the bindings either :/
diff --git a/drivers/pci/controller/pcie-iproc-platform.c b/drivers/pci/controller/pcie-iproc-platform.c index 5a3550b6bb29..9ee6200a66f4 100644 --- a/drivers/pci/controller/pcie-iproc-platform.c +++ b/drivers/pci/controller/pcie-iproc-platform.c @@ -93,12 +93,9 @@ static int iproc_pcie_pltfm_probe(struct platform_device *pdev) pcie->need_ib_cfg = of_property_read_bool(np, "dma-ranges"); /* PHY use is optional */ - pcie->phy = devm_phy_get(dev, "pcie-phy"); - if (IS_ERR(pcie->phy)) { - if (PTR_ERR(pcie->phy) == -EPROBE_DEFER) - return -EPROBE_DEFER; - pcie->phy = NULL; - } + pcie->phy = devm_phy_optional_get(dev, "pcie-phy"); + if (IS_ERR(pcie->phy)) + return PTR_ERR(pcie->phy); ret = devm_of_pci_get_host_bridge_resources(dev, 0, 0xff, &resources, &iobase);