Message ID | 20180503164009.14395-7-boris.brezillon@bootlin.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, May 3, 2018 at 11:40 AM, Boris Brezillon <boris.brezillon@bootlin.com> wrote: > The device might be described in the device tree but not connected to > the I2C bus. Update the status property so that the DRM panel logic > returns -ENODEV when someone tries to get the panel attached to this > DT node. > > Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> > --- > .../gpu/drm/panel/panel-raspberrypi-touchscreen.c | 35 ++++++++++++++++++++++ > 1 file changed, 35 insertions(+) > > diff --git a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > index 2c9c9722734f..b8fcb1acef75 100644 > --- a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > +++ b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > @@ -358,6 +358,39 @@ static const struct drm_panel_funcs rpi_touchscreen_funcs = { > .get_modes = rpi_touchscreen_get_modes, > }; > > +static void rpi_touchscreen_set_status_fail(struct i2c_client *i2c) > +{ > + struct property *newprop; > + > + newprop = kzalloc(sizeof(*newprop), GFP_KERNEL); > + if (!newprop) > + return; > + > + newprop->name = kstrdup("status", GFP_KERNEL); > + if (!newprop->name) > + goto err; > + > + newprop->value = kstrdup("fail", GFP_KERNEL); > + if (!newprop->value) > + goto err; > + > + newprop->length = sizeof("fail"); > + > + if (of_update_property(i2c->dev.of_node, newprop)) > + goto err; > + As I mentioned on irc, can you make this a common DT function. I'm not sure if it matters that we set status to fail vs. disabled. I somewhat prefer the latter as we already have other cases and I'd rather the api not pass a string in. I can't think of any reason to distinguish the difference between fail and disabled. > + /* We intentionally leak the memory we allocate here, because the new > + * OF property might live longer than the underlying dev, so no way > + * we can use devm_kzalloc() here. > + */ > + return; > + > +err: > + kfree(newprop->value); > + kfree(newprop->name); > + kfree(newprop); > +} > + > static int rpi_touchscreen_probe(struct i2c_client *i2c, > const struct i2c_device_id *id) > { > @@ -382,6 +415,7 @@ static int rpi_touchscreen_probe(struct i2c_client *i2c, > > ver = rpi_touchscreen_i2c_read(ts, REG_ID); > if (ver < 0) { > + rpi_touchscreen_set_status_fail(i2c); I've thought some more about this and I still think this should be handled in the driver core or i2c core. The reason is simple. I think the state of the system should be the same after this as if you booted with 'status = "disabled"' for this node. And that means the device should be removed completely because we don't create struct device's for disabled nodes. Rob
Hi Rob, On Thu, 3 May 2018 12:12:39 -0500 Rob Herring <robh+dt@kernel.org> wrote: > On Thu, May 3, 2018 at 11:40 AM, Boris Brezillon > <boris.brezillon@bootlin.com> wrote: > > The device might be described in the device tree but not connected to > > the I2C bus. Update the status property so that the DRM panel logic > > returns -ENODEV when someone tries to get the panel attached to this > > DT node. > > > > Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> > > --- > > .../gpu/drm/panel/panel-raspberrypi-touchscreen.c | 35 ++++++++++++++++++++++ > > 1 file changed, 35 insertions(+) > > > > diff --git a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > > index 2c9c9722734f..b8fcb1acef75 100644 > > --- a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > > +++ b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > > @@ -358,6 +358,39 @@ static const struct drm_panel_funcs rpi_touchscreen_funcs = { > > .get_modes = rpi_touchscreen_get_modes, > > }; > > > > +static void rpi_touchscreen_set_status_fail(struct i2c_client *i2c) > > +{ > > + struct property *newprop; > > + > > + newprop = kzalloc(sizeof(*newprop), GFP_KERNEL); > > + if (!newprop) > > + return; > > + > > + newprop->name = kstrdup("status", GFP_KERNEL); > > + if (!newprop->name) > > + goto err; > > + > > + newprop->value = kstrdup("fail", GFP_KERNEL); > > + if (!newprop->value) > > + goto err; > > + > > + newprop->length = sizeof("fail"); > > + > > + if (of_update_property(i2c->dev.of_node, newprop)) > > + goto err; > > + > > As I mentioned on irc, can you make this a common DT function. Yep, will move that to drivers/of/base.c and make it generic. > > I'm not sure if it matters that we set status to fail vs. disabled. I > somewhat prefer the latter as we already have other cases and I'd > rather the api not pass a string in. I can't think of any reason to > distinguish the difference between fail and disabled. Well, I just read the ePAPR doc pointed by Thierry [1] (section 2.3.4), and "fail" seemed like a good match for what we are trying to express here: "we failed to communicate with the device in the probe function and want to mark it unusable", which is a bit different from "the device was explicitly disabled by the user". Anyway, if you think "disabled" is more appropriate, I'll use that. > > > + /* We intentionally leak the memory we allocate here, because the new > > + * OF property might live longer than the underlying dev, so no way > > + * we can use devm_kzalloc() here. > > + */ > > + return; > > + > > +err: > > + kfree(newprop->value); > > + kfree(newprop->name); > > + kfree(newprop); > > +} > > + > > static int rpi_touchscreen_probe(struct i2c_client *i2c, > > const struct i2c_device_id *id) > > { > > @@ -382,6 +415,7 @@ static int rpi_touchscreen_probe(struct i2c_client *i2c, > > > > ver = rpi_touchscreen_i2c_read(ts, REG_ID); > > if (ver < 0) { > > + rpi_touchscreen_set_status_fail(i2c); > > I've thought some more about this and I still think this should be > handled in the driver core or i2c core. > > The reason is simple. I think the state of the system should be the > same after this as if you booted with 'status = "disabled"' for this > node. And that means the device should be removed completely because > we don't create struct device's for disabled nodes. That was my feeling to when first discussing the issue with Daniel and Thierry on IRC, but after digging a bit in the code I'm no longer sure this is a good idea. At least, I don't think basing the decision to disable the device (or mark it unusable) based on the return value of the probe function is a good idea. What I can do is: 1/ provide a function to change the status prop in of.h 2/ let each driver call this function if they want to 3/ let the I2C core test the status prop again after the probe function has returned an error to determine whether the device (I mean struct i2c_client/device object) should be removed Would that work for you? Regards, Boris [1]https://elinux.org/images/c/cf/Power_ePAPR_APPROVED_v1.1.pdf
On Fri, May 04, 2018 at 10:06:53AM +0200, Boris Brezillon wrote: > Hi Rob, > > On Thu, 3 May 2018 12:12:39 -0500 > Rob Herring <robh+dt@kernel.org> wrote: > > > On Thu, May 3, 2018 at 11:40 AM, Boris Brezillon > > <boris.brezillon@bootlin.com> wrote: > > > The device might be described in the device tree but not connected to > > > the I2C bus. Update the status property so that the DRM panel logic > > > returns -ENODEV when someone tries to get the panel attached to this > > > DT node. > > > > > > Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> > > > --- > > > .../gpu/drm/panel/panel-raspberrypi-touchscreen.c | 35 ++++++++++++++++++++++ > > > 1 file changed, 35 insertions(+) > > > > > > diff --git a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > > > index 2c9c9722734f..b8fcb1acef75 100644 > > > --- a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > > > +++ b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > > > @@ -358,6 +358,39 @@ static const struct drm_panel_funcs rpi_touchscreen_funcs = { > > > .get_modes = rpi_touchscreen_get_modes, > > > }; > > > > > > +static void rpi_touchscreen_set_status_fail(struct i2c_client *i2c) > > > +{ > > > + struct property *newprop; > > > + > > > + newprop = kzalloc(sizeof(*newprop), GFP_KERNEL); > > > + if (!newprop) > > > + return; > > > + > > > + newprop->name = kstrdup("status", GFP_KERNEL); > > > + if (!newprop->name) > > > + goto err; > > > + > > > + newprop->value = kstrdup("fail", GFP_KERNEL); > > > + if (!newprop->value) > > > + goto err; > > > + > > > + newprop->length = sizeof("fail"); > > > + > > > + if (of_update_property(i2c->dev.of_node, newprop)) > > > + goto err; > > > + > > > > As I mentioned on irc, can you make this a common DT function. > > Yep, will move that to drivers/of/base.c and make it generic. > > > > > I'm not sure if it matters that we set status to fail vs. disabled. I > > somewhat prefer the latter as we already have other cases and I'd > > rather the api not pass a string in. I can't think of any reason to > > distinguish the difference between fail and disabled. > > Well, I just read the ePAPR doc pointed by Thierry [1] (section 2.3.4), > and "fail" seemed like a good match for what we are trying to express > here: "we failed to communicate with the device in the probe function > and want to mark it unusable", which is a bit different from "the > device was explicitly disabled by the user". > > Anyway, if you think "disabled" is more appropriate, I'll use that. > > > > > > + /* We intentionally leak the memory we allocate here, because the new > > > + * OF property might live longer than the underlying dev, so no way > > > + * we can use devm_kzalloc() here. > > > + */ > > > + return; > > > + > > > +err: > > > + kfree(newprop->value); > > > + kfree(newprop->name); > > > + kfree(newprop); > > > +} > > > + > > > static int rpi_touchscreen_probe(struct i2c_client *i2c, > > > const struct i2c_device_id *id) > > > { > > > @@ -382,6 +415,7 @@ static int rpi_touchscreen_probe(struct i2c_client *i2c, > > > > > > ver = rpi_touchscreen_i2c_read(ts, REG_ID); > > > if (ver < 0) { > > > + rpi_touchscreen_set_status_fail(i2c); > > > > I've thought some more about this and I still think this should be > > handled in the driver core or i2c core. > > > > The reason is simple. I think the state of the system should be the > > same after this as if you booted with 'status = "disabled"' for this > > node. And that means the device should be removed completely because > > we don't create struct device's for disabled nodes. > > That was my feeling to when first discussing the issue with Daniel and > Thierry on IRC, but after digging a bit in the code I'm no longer sure > this is a good idea. At least, I don't think basing the decision to > disable the device (or mark it unusable) based on the return value of > the probe function is a good idea. I'm not so sure about that. -ENODEV seems like a very suitable error code to base that decision on. A random sampling of a handful of drivers confirms that this is primarily used to report situations where it is impossible for the device to ever be probed successfully, so might as well just remove it. At the very least I think it is worth proposing the patch and let Greg and others weigh in. Thierry
On Fri, 4 May 2018 11:47:48 +0200 Thierry Reding <thierry.reding@gmail.com> wrote: > On Fri, May 04, 2018 at 10:06:53AM +0200, Boris Brezillon wrote: > > Hi Rob, > > > > On Thu, 3 May 2018 12:12:39 -0500 > > Rob Herring <robh+dt@kernel.org> wrote: > > > > > On Thu, May 3, 2018 at 11:40 AM, Boris Brezillon > > > <boris.brezillon@bootlin.com> wrote: > > > > The device might be described in the device tree but not connected to > > > > the I2C bus. Update the status property so that the DRM panel logic > > > > returns -ENODEV when someone tries to get the panel attached to this > > > > DT node. > > > > > > > > Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> > > > > --- > > > > .../gpu/drm/panel/panel-raspberrypi-touchscreen.c | 35 ++++++++++++++++++++++ > > > > 1 file changed, 35 insertions(+) > > > > > > > > diff --git a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > > > > index 2c9c9722734f..b8fcb1acef75 100644 > > > > --- a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > > > > +++ b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > > > > @@ -358,6 +358,39 @@ static const struct drm_panel_funcs rpi_touchscreen_funcs = { > > > > .get_modes = rpi_touchscreen_get_modes, > > > > }; > > > > > > > > +static void rpi_touchscreen_set_status_fail(struct i2c_client *i2c) > > > > +{ > > > > + struct property *newprop; > > > > + > > > > + newprop = kzalloc(sizeof(*newprop), GFP_KERNEL); > > > > + if (!newprop) > > > > + return; > > > > + > > > > + newprop->name = kstrdup("status", GFP_KERNEL); > > > > + if (!newprop->name) > > > > + goto err; > > > > + > > > > + newprop->value = kstrdup("fail", GFP_KERNEL); > > > > + if (!newprop->value) > > > > + goto err; > > > > + > > > > + newprop->length = sizeof("fail"); > > > > + > > > > + if (of_update_property(i2c->dev.of_node, newprop)) > > > > + goto err; > > > > + > > > > > > As I mentioned on irc, can you make this a common DT function. > > > > Yep, will move that to drivers/of/base.c and make it generic. > > > > > > > > I'm not sure if it matters that we set status to fail vs. disabled. I > > > somewhat prefer the latter as we already have other cases and I'd > > > rather the api not pass a string in. I can't think of any reason to > > > distinguish the difference between fail and disabled. > > > > Well, I just read the ePAPR doc pointed by Thierry [1] (section 2.3.4), > > and "fail" seemed like a good match for what we are trying to express > > here: "we failed to communicate with the device in the probe function > > and want to mark it unusable", which is a bit different from "the > > device was explicitly disabled by the user". > > > > Anyway, if you think "disabled" is more appropriate, I'll use that. > > > > > > > > > + /* We intentionally leak the memory we allocate here, because the new > > > > + * OF property might live longer than the underlying dev, so no way > > > > + * we can use devm_kzalloc() here. > > > > + */ > > > > + return; > > > > + > > > > +err: > > > > + kfree(newprop->value); > > > > + kfree(newprop->name); > > > > + kfree(newprop); > > > > +} > > > > + > > > > static int rpi_touchscreen_probe(struct i2c_client *i2c, > > > > const struct i2c_device_id *id) > > > > { > > > > @@ -382,6 +415,7 @@ static int rpi_touchscreen_probe(struct i2c_client *i2c, > > > > > > > > ver = rpi_touchscreen_i2c_read(ts, REG_ID); > > > > if (ver < 0) { > > > > + rpi_touchscreen_set_status_fail(i2c); > > > > > > I've thought some more about this and I still think this should be > > > handled in the driver core or i2c core. > > > > > > The reason is simple. I think the state of the system should be the > > > same after this as if you booted with 'status = "disabled"' for this > > > node. And that means the device should be removed completely because > > > we don't create struct device's for disabled nodes. > > > > That was my feeling to when first discussing the issue with Daniel and > > Thierry on IRC, but after digging a bit in the code I'm no longer sure > > this is a good idea. At least, I don't think basing the decision to > > disable the device (or mark it unusable) based on the return value of > > the probe function is a good idea. > > I'm not so sure about that. -ENODEV seems like a very suitable error > code to base that decision on. A random sampling of a handful of drivers > confirms that this is primarily used to report situations where it is > impossible for the device to ever be probed successfully, so might as > well just remove it. It's not that easy. It has to be done from the I2C core since it's the only one who can call device_unregister() and cleanup the other bits associated with an I2C device (see i2c_unregister_device()). Now, the i2c_driver->probe() function is called from a context where I'm almost sure device_unregister() can't be called since we might still be in the device_register() path. The solution would be to queue the unregistration work to a workqueue, but I'm not even sure this is safe to do that. What if the I2C adapter exposing the device is removed in the meantime? Of course, all of this can be addressed, I'm just wondering if it's really worth the trouble (we're likely to introduce new races or other kind of bugs while doing that), especially since placing the device in a "fail" state and still keeping it around would solve the problem without requiring all the extra cleanup we're talking about here. > > At the very least I think it is worth proposing the patch and let Greg > and others weigh in. Well, if it was an easy thing to do, I guess I would have gone for this approach from the beginning, but I fear doing that will be much more complicated than we think it is (maybe I'm wrong).
On Fri, May 04, 2018 at 02:17:49PM +0200, Boris Brezillon wrote: > On Fri, 4 May 2018 11:47:48 +0200 > Thierry Reding <thierry.reding@gmail.com> wrote: > > > On Fri, May 04, 2018 at 10:06:53AM +0200, Boris Brezillon wrote: > > > Hi Rob, > > > > > > On Thu, 3 May 2018 12:12:39 -0500 > > > Rob Herring <robh+dt@kernel.org> wrote: > > > > > > > On Thu, May 3, 2018 at 11:40 AM, Boris Brezillon > > > > <boris.brezillon@bootlin.com> wrote: > > > > > The device might be described in the device tree but not connected to > > > > > the I2C bus. Update the status property so that the DRM panel logic > > > > > returns -ENODEV when someone tries to get the panel attached to this > > > > > DT node. > > > > > > > > > > Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> > > > > > --- > > > > > .../gpu/drm/panel/panel-raspberrypi-touchscreen.c | 35 ++++++++++++++++++++++ > > > > > 1 file changed, 35 insertions(+) > > > > > > > > > > diff --git a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > > > > > index 2c9c9722734f..b8fcb1acef75 100644 > > > > > --- a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > > > > > +++ b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > > > > > @@ -358,6 +358,39 @@ static const struct drm_panel_funcs rpi_touchscreen_funcs = { > > > > > .get_modes = rpi_touchscreen_get_modes, > > > > > }; > > > > > > > > > > +static void rpi_touchscreen_set_status_fail(struct i2c_client *i2c) > > > > > +{ > > > > > + struct property *newprop; > > > > > + > > > > > + newprop = kzalloc(sizeof(*newprop), GFP_KERNEL); > > > > > + if (!newprop) > > > > > + return; > > > > > + > > > > > + newprop->name = kstrdup("status", GFP_KERNEL); > > > > > + if (!newprop->name) > > > > > + goto err; > > > > > + > > > > > + newprop->value = kstrdup("fail", GFP_KERNEL); > > > > > + if (!newprop->value) > > > > > + goto err; > > > > > + > > > > > + newprop->length = sizeof("fail"); > > > > > + > > > > > + if (of_update_property(i2c->dev.of_node, newprop)) > > > > > + goto err; > > > > > + > > > > > > > > As I mentioned on irc, can you make this a common DT function. > > > > > > Yep, will move that to drivers/of/base.c and make it generic. > > > > > > > > > > > I'm not sure if it matters that we set status to fail vs. disabled. I > > > > somewhat prefer the latter as we already have other cases and I'd > > > > rather the api not pass a string in. I can't think of any reason to > > > > distinguish the difference between fail and disabled. > > > > > > Well, I just read the ePAPR doc pointed by Thierry [1] (section 2.3.4), > > > and "fail" seemed like a good match for what we are trying to express > > > here: "we failed to communicate with the device in the probe function > > > and want to mark it unusable", which is a bit different from "the > > > device was explicitly disabled by the user". > > > > > > Anyway, if you think "disabled" is more appropriate, I'll use that. > > > > > > > > > > > > + /* We intentionally leak the memory we allocate here, because the new > > > > > + * OF property might live longer than the underlying dev, so no way > > > > > + * we can use devm_kzalloc() here. > > > > > + */ > > > > > + return; > > > > > + > > > > > +err: > > > > > + kfree(newprop->value); > > > > > + kfree(newprop->name); > > > > > + kfree(newprop); > > > > > +} > > > > > + > > > > > static int rpi_touchscreen_probe(struct i2c_client *i2c, > > > > > const struct i2c_device_id *id) > > > > > { > > > > > @@ -382,6 +415,7 @@ static int rpi_touchscreen_probe(struct i2c_client *i2c, > > > > > > > > > > ver = rpi_touchscreen_i2c_read(ts, REG_ID); > > > > > if (ver < 0) { > > > > > + rpi_touchscreen_set_status_fail(i2c); > > > > > > > > I've thought some more about this and I still think this should be > > > > handled in the driver core or i2c core. > > > > > > > > The reason is simple. I think the state of the system should be the > > > > same after this as if you booted with 'status = "disabled"' for this > > > > node. And that means the device should be removed completely because > > > > we don't create struct device's for disabled nodes. > > > > > > That was my feeling to when first discussing the issue with Daniel and > > > Thierry on IRC, but after digging a bit in the code I'm no longer sure > > > this is a good idea. At least, I don't think basing the decision to > > > disable the device (or mark it unusable) based on the return value of > > > the probe function is a good idea. > > > > I'm not so sure about that. -ENODEV seems like a very suitable error > > code to base that decision on. A random sampling of a handful of drivers > > confirms that this is primarily used to report situations where it is > > impossible for the device to ever be probed successfully, so might as > > well just remove it. > > It's not that easy. It has to be done from the I2C core since it's the > only one who can call device_unregister() and cleanup the other bits > associated with an I2C device (see i2c_unregister_device()). Now, the > i2c_driver->probe() function is called from a context where I'm almost > sure device_unregister() can't be called since we might still be in the > device_register() path. The solution would be to queue the > unregistration work to a workqueue, but I'm not even sure this is safe > to do that. What if the I2C adapter exposing the device is removed in > the meantime? Of course, all of this can be addressed, I'm just > wondering if it's really worth the trouble (we're likely to introduce > new races or other kind of bugs while doing that), especially since > placing the device in a "fail" state and still keeping it around would > solve the problem without requiring all the extra cleanup we're talking > about here. I think you have to put the device status into "fail" immediately, otherwise there's a race with deferred probing. Scenario: 1. vc4 loads, panel isn't there yet -> EPROBE_DEFER. 2. rpi driver loads, notices panel isn't there, returns -ENODEV 3. i2c core fires off the worker and finishes it's ->probe callback. 4. device core starts a reprobe trigger 5. vc4 gets loaded, does the of_device_is_available check, but since that's not yet update it doesn't get the ENODEV, but still EPROBE_DEFER. 6. i2c worker disables the device and unregisters it. -> vc4 fails to load since nothing triggers another reprobe anymore. At least afaics device removal does not trigger a reprobe. -Daniel > > > > > At the very least I think it is worth proposing the patch and let Greg > > and others weigh in. > > Well, if it was an easy thing to do, I guess I would have gone for this > approach from the beginning, but I fear doing that will be much more > complicated than we think it is (maybe I'm wrong).
On Fri, 4 May 2018 16:20:17 +0200 Daniel Vetter <daniel@ffwll.ch> wrote: > On Fri, May 04, 2018 at 02:17:49PM +0200, Boris Brezillon wrote: > > On Fri, 4 May 2018 11:47:48 +0200 > > Thierry Reding <thierry.reding@gmail.com> wrote: > > > > > On Fri, May 04, 2018 at 10:06:53AM +0200, Boris Brezillon wrote: > > > > Hi Rob, > > > > > > > > On Thu, 3 May 2018 12:12:39 -0500 > > > > Rob Herring <robh+dt@kernel.org> wrote: > > > > > > > > > On Thu, May 3, 2018 at 11:40 AM, Boris Brezillon > > > > > <boris.brezillon@bootlin.com> wrote: > > > > > > The device might be described in the device tree but not connected to > > > > > > the I2C bus. Update the status property so that the DRM panel logic > > > > > > returns -ENODEV when someone tries to get the panel attached to this > > > > > > DT node. > > > > > > > > > > > > Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> > > > > > > --- > > > > > > .../gpu/drm/panel/panel-raspberrypi-touchscreen.c | 35 ++++++++++++++++++++++ > > > > > > 1 file changed, 35 insertions(+) > > > > > > > > > > > > diff --git a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > > > > > > index 2c9c9722734f..b8fcb1acef75 100644 > > > > > > --- a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > > > > > > +++ b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > > > > > > @@ -358,6 +358,39 @@ static const struct drm_panel_funcs rpi_touchscreen_funcs = { > > > > > > .get_modes = rpi_touchscreen_get_modes, > > > > > > }; > > > > > > > > > > > > +static void rpi_touchscreen_set_status_fail(struct i2c_client *i2c) > > > > > > +{ > > > > > > + struct property *newprop; > > > > > > + > > > > > > + newprop = kzalloc(sizeof(*newprop), GFP_KERNEL); > > > > > > + if (!newprop) > > > > > > + return; > > > > > > + > > > > > > + newprop->name = kstrdup("status", GFP_KERNEL); > > > > > > + if (!newprop->name) > > > > > > + goto err; > > > > > > + > > > > > > + newprop->value = kstrdup("fail", GFP_KERNEL); > > > > > > + if (!newprop->value) > > > > > > + goto err; > > > > > > + > > > > > > + newprop->length = sizeof("fail"); > > > > > > + > > > > > > + if (of_update_property(i2c->dev.of_node, newprop)) > > > > > > + goto err; > > > > > > + > > > > > > > > > > As I mentioned on irc, can you make this a common DT function. > > > > > > > > Yep, will move that to drivers/of/base.c and make it generic. > > > > > > > > > > > > > > I'm not sure if it matters that we set status to fail vs. disabled. I > > > > > somewhat prefer the latter as we already have other cases and I'd > > > > > rather the api not pass a string in. I can't think of any reason to > > > > > distinguish the difference between fail and disabled. > > > > > > > > Well, I just read the ePAPR doc pointed by Thierry [1] (section 2.3.4), > > > > and "fail" seemed like a good match for what we are trying to express > > > > here: "we failed to communicate with the device in the probe function > > > > and want to mark it unusable", which is a bit different from "the > > > > device was explicitly disabled by the user". > > > > > > > > Anyway, if you think "disabled" is more appropriate, I'll use that. > > > > > > > > > > > > > > > + /* We intentionally leak the memory we allocate here, because the new > > > > > > + * OF property might live longer than the underlying dev, so no way > > > > > > + * we can use devm_kzalloc() here. > > > > > > + */ > > > > > > + return; > > > > > > + > > > > > > +err: > > > > > > + kfree(newprop->value); > > > > > > + kfree(newprop->name); > > > > > > + kfree(newprop); > > > > > > +} > > > > > > + > > > > > > static int rpi_touchscreen_probe(struct i2c_client *i2c, > > > > > > const struct i2c_device_id *id) > > > > > > { > > > > > > @@ -382,6 +415,7 @@ static int rpi_touchscreen_probe(struct i2c_client *i2c, > > > > > > > > > > > > ver = rpi_touchscreen_i2c_read(ts, REG_ID); > > > > > > if (ver < 0) { > > > > > > + rpi_touchscreen_set_status_fail(i2c); > > > > > > > > > > I've thought some more about this and I still think this should be > > > > > handled in the driver core or i2c core. > > > > > > > > > > The reason is simple. I think the state of the system should be the > > > > > same after this as if you booted with 'status = "disabled"' for this > > > > > node. And that means the device should be removed completely because > > > > > we don't create struct device's for disabled nodes. > > > > > > > > That was my feeling to when first discussing the issue with Daniel and > > > > Thierry on IRC, but after digging a bit in the code I'm no longer sure > > > > this is a good idea. At least, I don't think basing the decision to > > > > disable the device (or mark it unusable) based on the return value of > > > > the probe function is a good idea. > > > > > > I'm not so sure about that. -ENODEV seems like a very suitable error > > > code to base that decision on. A random sampling of a handful of drivers > > > confirms that this is primarily used to report situations where it is > > > impossible for the device to ever be probed successfully, so might as > > > well just remove it. > > > > It's not that easy. It has to be done from the I2C core since it's the > > only one who can call device_unregister() and cleanup the other bits > > associated with an I2C device (see i2c_unregister_device()). Now, the > > i2c_driver->probe() function is called from a context where I'm almost > > sure device_unregister() can't be called since we might still be in the > > device_register() path. The solution would be to queue the > > unregistration work to a workqueue, but I'm not even sure this is safe > > to do that. What if the I2C adapter exposing the device is removed in > > the meantime? Of course, all of this can be addressed, I'm just > > wondering if it's really worth the trouble (we're likely to introduce > > new races or other kind of bugs while doing that), especially since > > placing the device in a "fail" state and still keeping it around would > > solve the problem without requiring all the extra cleanup we're talking > > about here. > > I think you have to put the device status into "fail" immediately, > otherwise there's a race with deferred probing. Scenario: > > 1. vc4 loads, panel isn't there yet -> EPROBE_DEFER. > 2. rpi driver loads, notices panel isn't there, returns -ENODEV > 3. i2c core fires off the worker and finishes it's ->probe callback. > 4. device core starts a reprobe trigger > 5. vc4 gets loaded, does the of_device_is_available check, but since > that's not yet update it doesn't get the ENODEV, but still EPROBE_DEFER. > 6. i2c worker disables the device and unregisters it. > > -> vc4 fails to load since nothing triggers another reprobe anymore. > > At least afaics device removal does not trigger a reprobe. Yep, you're correct. See, one more reason to keep the logic simple and let each driver change the status prop in their ->probe() function.
On Fri, 4 May 2018 16:24:04 +0200 Boris Brezillon <boris.brezillon@bootlin.com> wrote: > On Fri, 4 May 2018 16:20:17 +0200 > Daniel Vetter <daniel@ffwll.ch> wrote: > > > On Fri, May 04, 2018 at 02:17:49PM +0200, Boris Brezillon wrote: > > > On Fri, 4 May 2018 11:47:48 +0200 > > > Thierry Reding <thierry.reding@gmail.com> wrote: > > > > > > > On Fri, May 04, 2018 at 10:06:53AM +0200, Boris Brezillon wrote: > > > > > Hi Rob, > > > > > > > > > > On Thu, 3 May 2018 12:12:39 -0500 > > > > > Rob Herring <robh+dt@kernel.org> wrote: > > > > > > > > > > > On Thu, May 3, 2018 at 11:40 AM, Boris Brezillon > > > > > > <boris.brezillon@bootlin.com> wrote: > > > > > > > The device might be described in the device tree but not connected to > > > > > > > the I2C bus. Update the status property so that the DRM panel logic > > > > > > > returns -ENODEV when someone tries to get the panel attached to this > > > > > > > DT node. > > > > > > > > > > > > > > Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> > > > > > > > --- > > > > > > > .../gpu/drm/panel/panel-raspberrypi-touchscreen.c | 35 ++++++++++++++++++++++ > > > > > > > 1 file changed, 35 insertions(+) > > > > > > > > > > > > > > diff --git a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > > > > > > > index 2c9c9722734f..b8fcb1acef75 100644 > > > > > > > --- a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > > > > > > > +++ b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > > > > > > > @@ -358,6 +358,39 @@ static const struct drm_panel_funcs rpi_touchscreen_funcs = { > > > > > > > .get_modes = rpi_touchscreen_get_modes, > > > > > > > }; > > > > > > > > > > > > > > +static void rpi_touchscreen_set_status_fail(struct i2c_client *i2c) > > > > > > > +{ > > > > > > > + struct property *newprop; > > > > > > > + > > > > > > > + newprop = kzalloc(sizeof(*newprop), GFP_KERNEL); > > > > > > > + if (!newprop) > > > > > > > + return; > > > > > > > + > > > > > > > + newprop->name = kstrdup("status", GFP_KERNEL); > > > > > > > + if (!newprop->name) > > > > > > > + goto err; > > > > > > > + > > > > > > > + newprop->value = kstrdup("fail", GFP_KERNEL); > > > > > > > + if (!newprop->value) > > > > > > > + goto err; > > > > > > > + > > > > > > > + newprop->length = sizeof("fail"); > > > > > > > + > > > > > > > + if (of_update_property(i2c->dev.of_node, newprop)) > > > > > > > + goto err; > > > > > > > + > > > > > > > > > > > > As I mentioned on irc, can you make this a common DT function. > > > > > > > > > > Yep, will move that to drivers/of/base.c and make it generic. > > > > > > > > > > > > > > > > > I'm not sure if it matters that we set status to fail vs. disabled. I > > > > > > somewhat prefer the latter as we already have other cases and I'd > > > > > > rather the api not pass a string in. I can't think of any reason to > > > > > > distinguish the difference between fail and disabled. > > > > > > > > > > Well, I just read the ePAPR doc pointed by Thierry [1] (section 2.3.4), > > > > > and "fail" seemed like a good match for what we are trying to express > > > > > here: "we failed to communicate with the device in the probe function > > > > > and want to mark it unusable", which is a bit different from "the > > > > > device was explicitly disabled by the user". > > > > > > > > > > Anyway, if you think "disabled" is more appropriate, I'll use that. > > > > > > > > > > > > > > > > > > + /* We intentionally leak the memory we allocate here, because the new > > > > > > > + * OF property might live longer than the underlying dev, so no way > > > > > > > + * we can use devm_kzalloc() here. > > > > > > > + */ > > > > > > > + return; > > > > > > > + > > > > > > > +err: > > > > > > > + kfree(newprop->value); > > > > > > > + kfree(newprop->name); > > > > > > > + kfree(newprop); > > > > > > > +} > > > > > > > + > > > > > > > static int rpi_touchscreen_probe(struct i2c_client *i2c, > > > > > > > const struct i2c_device_id *id) > > > > > > > { > > > > > > > @@ -382,6 +415,7 @@ static int rpi_touchscreen_probe(struct i2c_client *i2c, > > > > > > > > > > > > > > ver = rpi_touchscreen_i2c_read(ts, REG_ID); > > > > > > > if (ver < 0) { > > > > > > > + rpi_touchscreen_set_status_fail(i2c); > > > > > > > > > > > > I've thought some more about this and I still think this should be > > > > > > handled in the driver core or i2c core. > > > > > > > > > > > > The reason is simple. I think the state of the system should be the > > > > > > same after this as if you booted with 'status = "disabled"' for this > > > > > > node. And that means the device should be removed completely because > > > > > > we don't create struct device's for disabled nodes. > > > > > > > > > > That was my feeling to when first discussing the issue with Daniel and > > > > > Thierry on IRC, but after digging a bit in the code I'm no longer sure > > > > > this is a good idea. At least, I don't think basing the decision to > > > > > disable the device (or mark it unusable) based on the return value of > > > > > the probe function is a good idea. > > > > > > > > I'm not so sure about that. -ENODEV seems like a very suitable error > > > > code to base that decision on. A random sampling of a handful of drivers > > > > confirms that this is primarily used to report situations where it is > > > > impossible for the device to ever be probed successfully, so might as > > > > well just remove it. > > > > > > It's not that easy. It has to be done from the I2C core since it's the > > > only one who can call device_unregister() and cleanup the other bits > > > associated with an I2C device (see i2c_unregister_device()). Now, the > > > i2c_driver->probe() function is called from a context where I'm almost > > > sure device_unregister() can't be called since we might still be in the > > > device_register() path. The solution would be to queue the > > > unregistration work to a workqueue, but I'm not even sure this is safe > > > to do that. What if the I2C adapter exposing the device is removed in > > > the meantime? Of course, all of this can be addressed, I'm just > > > wondering if it's really worth the trouble (we're likely to introduce > > > new races or other kind of bugs while doing that), especially since > > > placing the device in a "fail" state and still keeping it around would > > > solve the problem without requiring all the extra cleanup we're talking > > > about here. > > > > I think you have to put the device status into "fail" immediately, > > otherwise there's a race with deferred probing. Scenario: > > > > 1. vc4 loads, panel isn't there yet -> EPROBE_DEFER. > > 2. rpi driver loads, notices panel isn't there, returns -ENODEV > > 3. i2c core fires off the worker and finishes it's ->probe callback. > > 4. device core starts a reprobe trigger > > 5. vc4 gets loaded, does the of_device_is_available check, but since > > that's not yet update it doesn't get the ENODEV, but still EPROBE_DEFER. > > 6. i2c worker disables the device and unregisters it. > > > > -> vc4 fails to load since nothing triggers another reprobe anymore. > > > > At least afaics device removal does not trigger a reprobe. > > Yep, you're correct. See, one more reason to keep the logic simple and > let each driver change the status prop in their ->probe() function. Hm, actually it does not work even with my solution because the only thing that forces a new attempt on all deferred-probe devices is when a new device is bound to a driver, which will not happen if the rpi-panel ->probe() function returns -ENODEV.
On Fri, May 4, 2018 at 9:20 AM, Daniel Vetter <daniel@ffwll.ch> wrote: > On Fri, May 04, 2018 at 02:17:49PM +0200, Boris Brezillon wrote: >> On Fri, 4 May 2018 11:47:48 +0200 >> Thierry Reding <thierry.reding@gmail.com> wrote: >> >> > On Fri, May 04, 2018 at 10:06:53AM +0200, Boris Brezillon wrote: >> > > Hi Rob, >> > > >> > > On Thu, 3 May 2018 12:12:39 -0500 >> > > Rob Herring <robh+dt@kernel.org> wrote: >> > > >> > > > On Thu, May 3, 2018 at 11:40 AM, Boris Brezillon >> > > > <boris.brezillon@bootlin.com> wrote: >> > > > > The device might be described in the device tree but not connected to >> > > > > the I2C bus. Update the status property so that the DRM panel logic >> > > > > returns -ENODEV when someone tries to get the panel attached to this >> > > > > DT node. >> > > > > >> > > > > Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> >> > > > > --- >> > > > > .../gpu/drm/panel/panel-raspberrypi-touchscreen.c | 35 ++++++++++++++++++++++ >> > > > > 1 file changed, 35 insertions(+) >> > > > > >> > > > > diff --git a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c >> > > > > index 2c9c9722734f..b8fcb1acef75 100644 >> > > > > --- a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c >> > > > > +++ b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c >> > > > > @@ -358,6 +358,39 @@ static const struct drm_panel_funcs rpi_touchscreen_funcs = { >> > > > > .get_modes = rpi_touchscreen_get_modes, >> > > > > }; >> > > > > >> > > > > +static void rpi_touchscreen_set_status_fail(struct i2c_client *i2c) >> > > > > +{ >> > > > > + struct property *newprop; >> > > > > + >> > > > > + newprop = kzalloc(sizeof(*newprop), GFP_KERNEL); >> > > > > + if (!newprop) >> > > > > + return; >> > > > > + >> > > > > + newprop->name = kstrdup("status", GFP_KERNEL); >> > > > > + if (!newprop->name) >> > > > > + goto err; >> > > > > + >> > > > > + newprop->value = kstrdup("fail", GFP_KERNEL); >> > > > > + if (!newprop->value) >> > > > > + goto err; >> > > > > + >> > > > > + newprop->length = sizeof("fail"); >> > > > > + >> > > > > + if (of_update_property(i2c->dev.of_node, newprop)) >> > > > > + goto err; >> > > > > + >> > > > >> > > > As I mentioned on irc, can you make this a common DT function. >> > > >> > > Yep, will move that to drivers/of/base.c and make it generic. >> > > >> > > > >> > > > I'm not sure if it matters that we set status to fail vs. disabled. I >> > > > somewhat prefer the latter as we already have other cases and I'd >> > > > rather the api not pass a string in. I can't think of any reason to >> > > > distinguish the difference between fail and disabled. >> > > >> > > Well, I just read the ePAPR doc pointed by Thierry [1] (section 2.3.4), >> > > and "fail" seemed like a good match for what we are trying to express >> > > here: "we failed to communicate with the device in the probe function >> > > and want to mark it unusable", which is a bit different from "the >> > > device was explicitly disabled by the user". >> > > >> > > Anyway, if you think "disabled" is more appropriate, I'll use that. >> > > >> > > > >> > > > > + /* We intentionally leak the memory we allocate here, because the new >> > > > > + * OF property might live longer than the underlying dev, so no way >> > > > > + * we can use devm_kzalloc() here. >> > > > > + */ >> > > > > + return; >> > > > > + >> > > > > +err: >> > > > > + kfree(newprop->value); >> > > > > + kfree(newprop->name); >> > > > > + kfree(newprop); >> > > > > +} >> > > > > + >> > > > > static int rpi_touchscreen_probe(struct i2c_client *i2c, >> > > > > const struct i2c_device_id *id) >> > > > > { >> > > > > @@ -382,6 +415,7 @@ static int rpi_touchscreen_probe(struct i2c_client *i2c, >> > > > > >> > > > > ver = rpi_touchscreen_i2c_read(ts, REG_ID); >> > > > > if (ver < 0) { >> > > > > + rpi_touchscreen_set_status_fail(i2c); >> > > > >> > > > I've thought some more about this and I still think this should be >> > > > handled in the driver core or i2c core. >> > > > >> > > > The reason is simple. I think the state of the system should be the >> > > > same after this as if you booted with 'status = "disabled"' for this >> > > > node. And that means the device should be removed completely because >> > > > we don't create struct device's for disabled nodes. >> > > >> > > That was my feeling to when first discussing the issue with Daniel and >> > > Thierry on IRC, but after digging a bit in the code I'm no longer sure >> > > this is a good idea. At least, I don't think basing the decision to >> > > disable the device (or mark it unusable) based on the return value of >> > > the probe function is a good idea. >> > >> > I'm not so sure about that. -ENODEV seems like a very suitable error >> > code to base that decision on. A random sampling of a handful of drivers >> > confirms that this is primarily used to report situations where it is >> > impossible for the device to ever be probed successfully, so might as >> > well just remove it. >> >> It's not that easy. It has to be done from the I2C core since it's the >> only one who can call device_unregister() and cleanup the other bits >> associated with an I2C device (see i2c_unregister_device()). Now, the >> i2c_driver->probe() function is called from a context where I'm almost >> sure device_unregister() can't be called since we might still be in the >> device_register() path. The solution would be to queue the >> unregistration work to a workqueue, but I'm not even sure this is safe >> to do that. What if the I2C adapter exposing the device is removed in >> the meantime? Of course, all of this can be addressed, I'm just >> wondering if it's really worth the trouble (we're likely to introduce >> new races or other kind of bugs while doing that), especially since >> placing the device in a "fail" state and still keeping it around would >> solve the problem without requiring all the extra cleanup we're talking >> about here. > > I think you have to put the device status into "fail" immediately, > otherwise there's a race with deferred probing. Scenario: > > 1. vc4 loads, panel isn't there yet -> EPROBE_DEFER. > 2. rpi driver loads, notices panel isn't there, returns -ENODEV Step 3a: i2c core updates device DT status property Step 3b: i2c core fires off worker for device_unregister That solves the race, right? > 3. i2c core fires off the worker and finishes it's ->probe callback. > 4. device core starts a reprobe trigger > 5. vc4 gets loaded, does the of_device_is_available check, but since > that's not yet update it doesn't get the ENODEV, but still EPROBE_DEFER. > 6. i2c worker disables the device and unregisters it. > > -> vc4 fails to load since nothing triggers another reprobe anymore. > > At least afaics device removal does not trigger a reprobe. > -Daniel > >> >> > >> > At the very least I think it is worth proposing the patch and let Greg >> > and others weigh in. >> >> Well, if it was an easy thing to do, I guess I would have gone for this >> approach from the beginning, but I fear doing that will be much more >> complicated than we think it is (maybe I'm wrong). > > -- > Daniel Vetter > Software Engineer, Intel Corporation > http://blog.ffwll.ch
On Fri, 4 May 2018 14:29:27 -0500 Rob Herring <robh+dt@kernel.org> wrote: > On Fri, May 4, 2018 at 9:20 AM, Daniel Vetter <daniel@ffwll.ch> wrote: > > On Fri, May 04, 2018 at 02:17:49PM +0200, Boris Brezillon wrote: > >> On Fri, 4 May 2018 11:47:48 +0200 > >> Thierry Reding <thierry.reding@gmail.com> wrote: > >> > >> > On Fri, May 04, 2018 at 10:06:53AM +0200, Boris Brezillon wrote: > >> > > Hi Rob, > >> > > > >> > > On Thu, 3 May 2018 12:12:39 -0500 > >> > > Rob Herring <robh+dt@kernel.org> wrote: > >> > > > >> > > > On Thu, May 3, 2018 at 11:40 AM, Boris Brezillon > >> > > > <boris.brezillon@bootlin.com> wrote: > >> > > > > The device might be described in the device tree but not connected to > >> > > > > the I2C bus. Update the status property so that the DRM panel logic > >> > > > > returns -ENODEV when someone tries to get the panel attached to this > >> > > > > DT node. > >> > > > > > >> > > > > Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> > >> > > > > --- > >> > > > > .../gpu/drm/panel/panel-raspberrypi-touchscreen.c | 35 ++++++++++++++++++++++ > >> > > > > 1 file changed, 35 insertions(+) > >> > > > > > >> > > > > diff --git a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > >> > > > > index 2c9c9722734f..b8fcb1acef75 100644 > >> > > > > --- a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > >> > > > > +++ b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c > >> > > > > @@ -358,6 +358,39 @@ static const struct drm_panel_funcs rpi_touchscreen_funcs = { > >> > > > > .get_modes = rpi_touchscreen_get_modes, > >> > > > > }; > >> > > > > > >> > > > > +static void rpi_touchscreen_set_status_fail(struct i2c_client *i2c) > >> > > > > +{ > >> > > > > + struct property *newprop; > >> > > > > + > >> > > > > + newprop = kzalloc(sizeof(*newprop), GFP_KERNEL); > >> > > > > + if (!newprop) > >> > > > > + return; > >> > > > > + > >> > > > > + newprop->name = kstrdup("status", GFP_KERNEL); > >> > > > > + if (!newprop->name) > >> > > > > + goto err; > >> > > > > + > >> > > > > + newprop->value = kstrdup("fail", GFP_KERNEL); > >> > > > > + if (!newprop->value) > >> > > > > + goto err; > >> > > > > + > >> > > > > + newprop->length = sizeof("fail"); > >> > > > > + > >> > > > > + if (of_update_property(i2c->dev.of_node, newprop)) > >> > > > > + goto err; > >> > > > > + > >> > > > > >> > > > As I mentioned on irc, can you make this a common DT function. > >> > > > >> > > Yep, will move that to drivers/of/base.c and make it generic. > >> > > > >> > > > > >> > > > I'm not sure if it matters that we set status to fail vs. disabled. I > >> > > > somewhat prefer the latter as we already have other cases and I'd > >> > > > rather the api not pass a string in. I can't think of any reason to > >> > > > distinguish the difference between fail and disabled. > >> > > > >> > > Well, I just read the ePAPR doc pointed by Thierry [1] (section 2.3.4), > >> > > and "fail" seemed like a good match for what we are trying to express > >> > > here: "we failed to communicate with the device in the probe function > >> > > and want to mark it unusable", which is a bit different from "the > >> > > device was explicitly disabled by the user". > >> > > > >> > > Anyway, if you think "disabled" is more appropriate, I'll use that. > >> > > > >> > > > > >> > > > > + /* We intentionally leak the memory we allocate here, because the new > >> > > > > + * OF property might live longer than the underlying dev, so no way > >> > > > > + * we can use devm_kzalloc() here. > >> > > > > + */ > >> > > > > + return; > >> > > > > + > >> > > > > +err: > >> > > > > + kfree(newprop->value); > >> > > > > + kfree(newprop->name); > >> > > > > + kfree(newprop); > >> > > > > +} > >> > > > > + > >> > > > > static int rpi_touchscreen_probe(struct i2c_client *i2c, > >> > > > > const struct i2c_device_id *id) > >> > > > > { > >> > > > > @@ -382,6 +415,7 @@ static int rpi_touchscreen_probe(struct i2c_client *i2c, > >> > > > > > >> > > > > ver = rpi_touchscreen_i2c_read(ts, REG_ID); > >> > > > > if (ver < 0) { > >> > > > > + rpi_touchscreen_set_status_fail(i2c); > >> > > > > >> > > > I've thought some more about this and I still think this should be > >> > > > handled in the driver core or i2c core. > >> > > > > >> > > > The reason is simple. I think the state of the system should be the > >> > > > same after this as if you booted with 'status = "disabled"' for this > >> > > > node. And that means the device should be removed completely because > >> > > > we don't create struct device's for disabled nodes. > >> > > > >> > > That was my feeling to when first discussing the issue with Daniel and > >> > > Thierry on IRC, but after digging a bit in the code I'm no longer sure > >> > > this is a good idea. At least, I don't think basing the decision to > >> > > disable the device (or mark it unusable) based on the return value of > >> > > the probe function is a good idea. > >> > > >> > I'm not so sure about that. -ENODEV seems like a very suitable error > >> > code to base that decision on. A random sampling of a handful of drivers > >> > confirms that this is primarily used to report situations where it is > >> > impossible for the device to ever be probed successfully, so might as > >> > well just remove it. > >> > >> It's not that easy. It has to be done from the I2C core since it's the > >> only one who can call device_unregister() and cleanup the other bits > >> associated with an I2C device (see i2c_unregister_device()). Now, the > >> i2c_driver->probe() function is called from a context where I'm almost > >> sure device_unregister() can't be called since we might still be in the > >> device_register() path. The solution would be to queue the > >> unregistration work to a workqueue, but I'm not even sure this is safe > >> to do that. What if the I2C adapter exposing the device is removed in > >> the meantime? Of course, all of this can be addressed, I'm just > >> wondering if it's really worth the trouble (we're likely to introduce > >> new races or other kind of bugs while doing that), especially since > >> placing the device in a "fail" state and still keeping it around would > >> solve the problem without requiring all the extra cleanup we're talking > >> about here. > > > > I think you have to put the device status into "fail" immediately, > > otherwise there's a race with deferred probing. Scenario: > > > > 1. vc4 loads, panel isn't there yet -> EPROBE_DEFER. > > 2. rpi driver loads, notices panel isn't there, returns -ENODEV > > Step 3a: i2c core updates device DT status property > Step 3b: i2c core fires off worker for device_unregister > > That solves the race, right? Yep, that should solve this particular race, but we still have a problem with the deferred-probe trigger. The only thing that forces all devices that have seen -EPROBE_DEFER to be probed again is when a device is bound to a driver, which in our case won't happen because we return -ENODEV. If we're lucky, something else in the system will trigger that after the I2C core has changed the status prop, and if we're not, the DRM dev will never but probed again.
diff --git a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c index 2c9c9722734f..b8fcb1acef75 100644 --- a/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c +++ b/drivers/gpu/drm/panel/panel-raspberrypi-touchscreen.c @@ -358,6 +358,39 @@ static const struct drm_panel_funcs rpi_touchscreen_funcs = { .get_modes = rpi_touchscreen_get_modes, }; +static void rpi_touchscreen_set_status_fail(struct i2c_client *i2c) +{ + struct property *newprop; + + newprop = kzalloc(sizeof(*newprop), GFP_KERNEL); + if (!newprop) + return; + + newprop->name = kstrdup("status", GFP_KERNEL); + if (!newprop->name) + goto err; + + newprop->value = kstrdup("fail", GFP_KERNEL); + if (!newprop->value) + goto err; + + newprop->length = sizeof("fail"); + + if (of_update_property(i2c->dev.of_node, newprop)) + goto err; + + /* We intentionally leak the memory we allocate here, because the new + * OF property might live longer than the underlying dev, so no way + * we can use devm_kzalloc() here. + */ + return; + +err: + kfree(newprop->value); + kfree(newprop->name); + kfree(newprop); +} + static int rpi_touchscreen_probe(struct i2c_client *i2c, const struct i2c_device_id *id) { @@ -382,6 +415,7 @@ static int rpi_touchscreen_probe(struct i2c_client *i2c, ver = rpi_touchscreen_i2c_read(ts, REG_ID); if (ver < 0) { + rpi_touchscreen_set_status_fail(i2c); dev_err(dev, "Atmel I2C read failed: %d\n", ver); return -ENODEV; } @@ -391,6 +425,7 @@ static int rpi_touchscreen_probe(struct i2c_client *i2c, case 0xc3: /* ver 2 */ break; default: + rpi_touchscreen_set_status_fail(i2c); dev_err(dev, "Unknown Atmel firmware revision: 0x%02x\n", ver); return -ENODEV; }
The device might be described in the device tree but not connected to the I2C bus. Update the status property so that the DRM panel logic returns -ENODEV when someone tries to get the panel attached to this DT node. Signed-off-by: Boris Brezillon <boris.brezillon@bootlin.com> --- .../gpu/drm/panel/panel-raspberrypi-touchscreen.c | 35 ++++++++++++++++++++++ 1 file changed, 35 insertions(+)