Message ID | 20241231-hotplug-drm-bridge-v5-4-173065a1ece1@bootlin.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | Add support for hot-pluggable DRM bridges | expand |
Hi-- On 12/31/24 2:39 AM, Luca Ceresoli wrote: > Document in detail the new refcounted bridges as well as the "legacy" way. > > Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com> > > --- > > This patch was added in v5. > --- > Documentation/gpu/drm-kms-helpers.rst | 6 ++ > drivers/gpu/drm/drm_bridge.c | 122 ++++++++++++++++++++++++++++++++++ > 2 files changed, 128 insertions(+) > > diff --git a/Documentation/gpu/drm-kms-helpers.rst b/Documentation/gpu/drm-kms-helpers.rst > index 8cf2f041af4704875910ce8228ae04615d0f21bd..ca2cfef2101988933e1464fe146997c1a661a117 100644 > --- a/Documentation/gpu/drm-kms-helpers.rst > +++ b/Documentation/gpu/drm-kms-helpers.rst > @@ -151,6 +151,12 @@ Overview > .. kernel-doc:: drivers/gpu/drm/drm_bridge.c > :doc: overview > > +Bridge lifecycle > +---------------- > + > +.. kernel-doc:: drivers/gpu/drm/drm_bridge.c > + :doc: bridge lifecycle > + > Display Driver Integration > -------------------------- > > diff --git a/drivers/gpu/drm/drm_bridge.c b/drivers/gpu/drm/drm_bridge.c > index 6255ef59f73d8041a8cb7f2c6e23e5a67d1ae926..e9f138aa5b3270b4e3a1a56dc8d4b7e5f993c929 100644 > --- a/drivers/gpu/drm/drm_bridge.c > +++ b/drivers/gpu/drm/drm_bridge.c > @@ -60,6 +60,128 @@ > * encoder chain. > */ > > +/** > + * DOC: bridge lifecycle > + * > + * Allocation, initializion and teardown of a bridge can be implemented in initialization > + * one of two ways: *refcounted* mode and *legacy* mode. > + * > + * In **refcounted** mode: > + * > + * - each &struct drm_bridge is reference counted since its instantiation > + * - any code taking a pointer to a bridge has get and put APIs to refcount > + * it and so ensure the bridge won't be deallocated while using it > + * - deallocation is done when the last put happens and the refcount drops > + * to zero > + * - the driver instantiating the bridge also holds a reference, but the > + * allocated struct can survive it > + * > + * A bridge using refcounted mode is called a *refcounted bridge*. > + * > + * In **legacy** mode the &struct drm_bridge lifetime is tied to the device > + * instantiating it: it is allocated on probe and freed on removal. Any > + * other kernel entities holding a pointer to the bridge could incur in > + * use-after-free in case the bridge is deallocated at runtime. > + * > + * Legacy mode used to be the only one until refcounted bridges were > + * introduced, hance the name. It is still fine in case the bridges are a hence > + * fixed part of the pipeline, i.e. if the bridges are removed only when > + * tearing down the entire card. Refcounted bridges support both that case > + * and the case of more dynamic hardware with bridges that can be removed > + * at runtime without tearing down the entire card. > + * > + * Usage of refcounted bridges happens in two sides: the driver > + * implementing the bridge and the code using the bridge. > + * > + * For *drivers implemeting the bridge*, in both refcounted and legacy implementing > + * modes the common and expected pattern is that the driver declares a > + * driver-specific struct embedding a &struct drm_bridge. E.g.::
Hello Randy, On Tue, 31 Dec 2024 09:54:41 -0800 Randy Dunlap <rdunlap@infradead.org> wrote: > Hi-- > > On 12/31/24 2:39 AM, Luca Ceresoli wrote: > > Document in detail the new refcounted bridges as well as the "legacy" way. > > > > Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com> > > > > --- ... > > + * Allocation, initializion and teardown of a bridge can be implemented in > > initialization Thanks for reviewing! I fixed locally the typos you spotted. Luca
Hi, Most of these comments affect your earlier patches, but let's work on the API-level view. On Tue, Dec 31, 2024 at 11:39:58AM +0100, Luca Ceresoli wrote: > + * When using refcounted mode, the driver should allocate ``struct > + * my_bridge`` using regular allocation (as opposed to ``devm_`` or > + * ``drmm_`` allocation), call drm_bridge_init() immediately afterwards to > + * transfer lifecycle management to the DRM bridge core, and implement a > + * ``.destroy`` function to deallocate the ``struct my_bridge``, as in this > + * example:: > + * > + * static void my_bridge_destroy(struct drm_bridge *bridge) > + * { > + * kfree(container_of(bridge, struct my_bridge, bridge)); > + * } > + * > + * static const struct drm_bridge_funcs my_bridge_funcs = { > + * .destroy = my_bridge_destroy, > + * ... > + * }; > + * > + * static int my_bridge_probe(...) > + * { > + * struct my_bridge *mybr; > + * int err; > + * > + * mybr = kzalloc(sizeof(*mybr), GFP_KERNEL); > + * if (!mybr) > + * return -ENOMEM; > + * > + * err = drm_bridge_init(dev, &mybr->bridge, &my_bridge_funcs); > + * if (err) > + * return err; > + * > + * ... > + * drm_bridge_add(); > + * ... > + * } > + * > + * static void my_bridge_remove() > + * { > + * struct my_bridge *mybr = ...; > + * drm_bridge_remove(&mybr->bridge); > + * // ... NO kfree here! > + * } I'm a bit worried there, since that API is pretty difficult to get right, and we don't have anything to catch bad patterns. Let's take a step back. What we're trying to solve here is: 1) We want to avoid any dangling pointers to a bridge if the bridge device is removed. 2) To do so, we need to switch to reference counted allocations and pointers. 3) Most bridges structures are allocated through devm_kzalloc, and they one that aren't are freed at remove time anyway, so the allocated structure will be gone when the device is removed. 4) To properly track users, each user that will use a drm_bridge needs to take a reference. AFAIU, the destroy introduction and the on-purpose omission of kfree in remove is to solve 3. Introducing a function that allocates the drm_bridge container struct (like drmm_encoder_alloc for example), take a reference, register a devm kfree action, and return the pointer to the driver structure would solve that too pretty nicely. So, something like: struct driver_priv { struct drm_bridge bridge; ... } static int driver_probe(...) { struct driver_priv *priv; struct drm_bridge *bridge; .... priv = devm_drm_bridge_alloc(dev, struct driver_priv, bridge); if (IS_ERR(priv)) return ERR_PTR(priv); bridge = &priv->bridge; ... drm_bridge_add(bridge); } Would work just as well. I also don't think we need explicit (at least for the common case) drm_bridge_get and drm_bridge_put calls for bridge users. drm_bridge_attach and drm_bridge_detach can get/put the reference directly. And we'll also need some flag in drm_bridge to indicate that the device is gone, similar to what drm_dev_enter()/drm_dev_exit() provides, because now your bridge driver sticks around for much longer than your device so the expectation that your device managed resources (clocks, registers, etc.) are always going to be around. Maxime
On Mon, 6 Jan 2025 at 12:39, Maxime Ripard <mripard@kernel.org> wrote: > > Hi, > > Most of these comments affect your earlier patches, but let's work on > the API-level view. > > On Tue, Dec 31, 2024 at 11:39:58AM +0100, Luca Ceresoli wrote: > > + * When using refcounted mode, the driver should allocate ``struct > > + * my_bridge`` using regular allocation (as opposed to ``devm_`` or > > + * ``drmm_`` allocation), call drm_bridge_init() immediately afterwards to > > + * transfer lifecycle management to the DRM bridge core, and implement a > > + * ``.destroy`` function to deallocate the ``struct my_bridge``, as in this > > + * example:: > > + * > > + * static void my_bridge_destroy(struct drm_bridge *bridge) > > + * { > > + * kfree(container_of(bridge, struct my_bridge, bridge)); > > + * } > > + * > > + * static const struct drm_bridge_funcs my_bridge_funcs = { > > + * .destroy = my_bridge_destroy, > > + * ... > > + * }; > > + * > > + * static int my_bridge_probe(...) > > + * { > > + * struct my_bridge *mybr; > > + * int err; > > + * > > + * mybr = kzalloc(sizeof(*mybr), GFP_KERNEL); > > + * if (!mybr) > > + * return -ENOMEM; > > + * > > + * err = drm_bridge_init(dev, &mybr->bridge, &my_bridge_funcs); > > + * if (err) > > + * return err; > > + * > > + * ... > > + * drm_bridge_add(); > > + * ... > > + * } > > + * > > + * static void my_bridge_remove() > > + * { > > + * struct my_bridge *mybr = ...; > > + * drm_bridge_remove(&mybr->bridge); > > + * // ... NO kfree here! > > + * } > > I'm a bit worried there, since that API is pretty difficult to get > right, and we don't have anything to catch bad patterns. > > Let's take a step back. What we're trying to solve here is: > > 1) We want to avoid any dangling pointers to a bridge if the bridge > device is removed. > > 2) To do so, we need to switch to reference counted allocations and > pointers. > > 3) Most bridges structures are allocated through devm_kzalloc, and they > one that aren't are freed at remove time anyway, so the allocated > structure will be gone when the device is removed. > > 4) To properly track users, each user that will use a drm_bridge needs > to take a reference. 5) Handle the disappearing next_bridge problem: probe() function gets a pointer to the next bridge, but then for some reasons (e.g. because of the other device being removed or because of some probe deferral) the next_bridge driver gets unbdound and the next_bridge becomes unusable before a call to drm_bridge_attach(). > > AFAIU, the destroy introduction and the on-purpose omission of kfree in > remove is to solve 3. > > Introducing a function that allocates the drm_bridge container struct > (like drmm_encoder_alloc for example), take a reference, register a devm > kfree action, and return the pointer to the driver structure would solve > that too pretty nicely. > > So, something like: > > > struct driver_priv { > struct drm_bridge bridge; > > ... > } > > static int driver_probe(...) > { > struct driver_priv *priv; > struct drm_bridge *bridge; > > .... > > priv = devm_drm_bridge_alloc(dev, struct driver_priv, bridge); Ah... And devm-cleanup will just drop a reference to that data, freeing it when all refs are cleaned? Nice idea. > if (IS_ERR(priv)) > return ERR_PTR(priv); > bridge = &priv->bridge; > > ... > > drm_bridge_add(bridge); > } > > Would work just as well. > > I also don't think we need explicit (at least for the common case) > drm_bridge_get and drm_bridge_put calls for bridge users. > drm_bridge_attach and drm_bridge_detach can get/put the reference > directly. As I wrote previously, I think drm_bridge_attach() might be too late for that. It sounds like drm_of_get_panel_or_bridge() and of_drm_find_bridge should increment the refcount, possibly adding a devres action to put the reference. > And we'll also need some flag in drm_bridge to indicate that the device > is gone, similar to what drm_dev_enter()/drm_dev_exit() provides, > because now your bridge driver sticks around for much longer than your > device so the expectation that your device managed resources (clocks, > registers, etc.) are always going to be around.
On Mon, Jan 06, 2025 at 02:24:00PM +0200, Dmitry Baryshkov wrote: > On Mon, 6 Jan 2025 at 12:39, Maxime Ripard <mripard@kernel.org> wrote: > > > > Hi, > > > > Most of these comments affect your earlier patches, but let's work on > > the API-level view. > > > > On Tue, Dec 31, 2024 at 11:39:58AM +0100, Luca Ceresoli wrote: > > > + * When using refcounted mode, the driver should allocate ``struct > > > + * my_bridge`` using regular allocation (as opposed to ``devm_`` or > > > + * ``drmm_`` allocation), call drm_bridge_init() immediately afterwards to > > > + * transfer lifecycle management to the DRM bridge core, and implement a > > > + * ``.destroy`` function to deallocate the ``struct my_bridge``, as in this > > > + * example:: > > > + * > > > + * static void my_bridge_destroy(struct drm_bridge *bridge) > > > + * { > > > + * kfree(container_of(bridge, struct my_bridge, bridge)); > > > + * } > > > + * > > > + * static const struct drm_bridge_funcs my_bridge_funcs = { > > > + * .destroy = my_bridge_destroy, > > > + * ... > > > + * }; > > > + * > > > + * static int my_bridge_probe(...) > > > + * { > > > + * struct my_bridge *mybr; > > > + * int err; > > > + * > > > + * mybr = kzalloc(sizeof(*mybr), GFP_KERNEL); > > > + * if (!mybr) > > > + * return -ENOMEM; > > > + * > > > + * err = drm_bridge_init(dev, &mybr->bridge, &my_bridge_funcs); > > > + * if (err) > > > + * return err; > > > + * > > > + * ... > > > + * drm_bridge_add(); > > > + * ... > > > + * } > > > + * > > > + * static void my_bridge_remove() > > > + * { > > > + * struct my_bridge *mybr = ...; > > > + * drm_bridge_remove(&mybr->bridge); > > > + * // ... NO kfree here! > > > + * } > > > > I'm a bit worried there, since that API is pretty difficult to get > > right, and we don't have anything to catch bad patterns. > > > > Let's take a step back. What we're trying to solve here is: > > > > 1) We want to avoid any dangling pointers to a bridge if the bridge > > device is removed. > > > > 2) To do so, we need to switch to reference counted allocations and > > pointers. > > > > 3) Most bridges structures are allocated through devm_kzalloc, and they > > one that aren't are freed at remove time anyway, so the allocated > > structure will be gone when the device is removed. > > > > 4) To properly track users, each user that will use a drm_bridge needs > > to take a reference. > > 5) Handle the disappearing next_bridge problem: probe() function gets > a pointer to the next bridge, but then for some reasons (e.g. because > of the other device being removed or because of some probe deferral) > the next_bridge driver gets unbdound and the next_bridge becomes > unusable before a call to drm_bridge_attach(). Oh, right. We need to plumb it in drm_of_find_bridge somehow too. > > AFAIU, the destroy introduction and the on-purpose omission of kfree in > > remove is to solve 3. > > > > Introducing a function that allocates the drm_bridge container struct > > (like drmm_encoder_alloc for example), take a reference, register a devm > > kfree action, and return the pointer to the driver structure would solve > > that too pretty nicely. > > > > So, something like: > > > > > > struct driver_priv { > > struct drm_bridge bridge; > > > > ... > > } > > > > static int driver_probe(...) > > { > > struct driver_priv *priv; > > struct drm_bridge *bridge; > > > > .... > > > > priv = devm_drm_bridge_alloc(dev, struct driver_priv, bridge); > > Ah... And devm-cleanup will just drop a reference to that data, > freeing it when all refs are cleaned? Nice idea. Yup. > > if (IS_ERR(priv)) > > return ERR_PTR(priv); > > bridge = &priv->bridge; > > > > ... > > > > drm_bridge_add(bridge); > > } > > > > Would work just as well. > > > > I also don't think we need explicit (at least for the common case) > > drm_bridge_get and drm_bridge_put calls for bridge users. > > drm_bridge_attach and drm_bridge_detach can get/put the reference > > directly. > > As I wrote previously, I think drm_bridge_attach() might be too late for that. > It sounds like drm_of_get_panel_or_bridge() and of_drm_find_bridge > should increment the refcount, possibly adding a devres action to put > the reference. We probably need both. drm_bridge_attach adds the bridge pointer to the encoder bridge_chain list, so if we had something like bridge = drm_of_find_bridge(); drm_bridge_attach(encoder, bridge); drm_bridge_put(bridge); We could have a dangling pointer. Maxime
On Mon, Jan 06, 2025 at 03:49:48PM +0100, Maxime Ripard wrote: > On Mon, Jan 06, 2025 at 02:24:00PM +0200, Dmitry Baryshkov wrote: > > On Mon, 6 Jan 2025 at 12:39, Maxime Ripard <mripard@kernel.org> wrote: > > > > > > Hi, > > > > > > Most of these comments affect your earlier patches, but let's work on > > > the API-level view. > > > > > > On Tue, Dec 31, 2024 at 11:39:58AM +0100, Luca Ceresoli wrote: > > > if (IS_ERR(priv)) > > > return ERR_PTR(priv); > > > bridge = &priv->bridge; > > > > > > ... > > > > > > drm_bridge_add(bridge); > > > } > > > > > > Would work just as well. > > > > > > I also don't think we need explicit (at least for the common case) > > > drm_bridge_get and drm_bridge_put calls for bridge users. > > > drm_bridge_attach and drm_bridge_detach can get/put the reference > > > directly. > > > > As I wrote previously, I think drm_bridge_attach() might be too late for that. > > It sounds like drm_of_get_panel_or_bridge() and of_drm_find_bridge > > should increment the refcount, possibly adding a devres action to put > > the reference. > > We probably need both. drm_bridge_attach adds the bridge pointer to the > encoder bridge_chain list, so if we had something like > > bridge = drm_of_find_bridge(); > drm_bridge_attach(encoder, bridge); > drm_bridge_put(bridge); > > We could have a dangling pointer. Yes... So, both drm_bridge_attach and drm_of_find_bridge() should take the refcount. Just as an idea, it might be nice to add refcounting to bridges_show(), so that we can easily verify that refcounting works correctly.
On Tue, Jan 07, 2025 at 12:35:15PM +0200, Dmitry Baryshkov wrote: > On Mon, Jan 06, 2025 at 03:49:48PM +0100, Maxime Ripard wrote: > > On Mon, Jan 06, 2025 at 02:24:00PM +0200, Dmitry Baryshkov wrote: > > > On Mon, 6 Jan 2025 at 12:39, Maxime Ripard <mripard@kernel.org> wrote: > > > > > > > > Hi, > > > > > > > > Most of these comments affect your earlier patches, but let's work on > > > > the API-level view. > > > > > > > > On Tue, Dec 31, 2024 at 11:39:58AM +0100, Luca Ceresoli wrote: > > > > > if (IS_ERR(priv)) > > > > return ERR_PTR(priv); > > > > bridge = &priv->bridge; > > > > > > > > ... > > > > > > > > drm_bridge_add(bridge); > > > > } > > > > > > > > Would work just as well. > > > > > > > > I also don't think we need explicit (at least for the common case) > > > > drm_bridge_get and drm_bridge_put calls for bridge users. > > > > drm_bridge_attach and drm_bridge_detach can get/put the reference > > > > directly. > > > > > > As I wrote previously, I think drm_bridge_attach() might be too late for that. > > > It sounds like drm_of_get_panel_or_bridge() and of_drm_find_bridge > > > should increment the refcount, possibly adding a devres action to put > > > the reference. > > > > We probably need both. drm_bridge_attach adds the bridge pointer to the > > encoder bridge_chain list, so if we had something like > > > > bridge = drm_of_find_bridge(); > > drm_bridge_attach(encoder, bridge); > > drm_bridge_put(bridge); > > > > We could have a dangling pointer. > > Yes... So, both drm_bridge_attach and drm_of_find_bridge() should take > the refcount. > > Just as an idea, it might be nice to add refcounting to bridges_show(), > so that we can easily verify that refcounting works correctly. Yep, it looks like a good idea indeed. Maxime
Hi Maxime, Dmitry, thanks both for the useful review! On Mon, 6 Jan 2025 14:24:00 +0200 Dmitry Baryshkov <dmitry.baryshkov@linaro.org> wrote: > On Mon, 6 Jan 2025 at 12:39, Maxime Ripard <mripard@kernel.org> wrote: > > > > Hi, > > > > Most of these comments affect your earlier patches, but let's work on > > the API-level view. > > > > On Tue, Dec 31, 2024 at 11:39:58AM +0100, Luca Ceresoli wrote: > > > + * When using refcounted mode, the driver should allocate ``struct > > > + * my_bridge`` using regular allocation (as opposed to ``devm_`` or > > > + * ``drmm_`` allocation), call drm_bridge_init() immediately afterwards to > > > + * transfer lifecycle management to the DRM bridge core, and implement a > > > + * ``.destroy`` function to deallocate the ``struct my_bridge``, as in this > > > + * example:: > > > + * > > > + * static void my_bridge_destroy(struct drm_bridge *bridge) > > > + * { > > > + * kfree(container_of(bridge, struct my_bridge, bridge)); > > > + * } > > > + * > > > + * static const struct drm_bridge_funcs my_bridge_funcs = { > > > + * .destroy = my_bridge_destroy, > > > + * ... > > > + * }; > > > + * > > > + * static int my_bridge_probe(...) > > > + * { > > > + * struct my_bridge *mybr; > > > + * int err; > > > + * > > > + * mybr = kzalloc(sizeof(*mybr), GFP_KERNEL); > > > + * if (!mybr) > > > + * return -ENOMEM; > > > + * > > > + * err = drm_bridge_init(dev, &mybr->bridge, &my_bridge_funcs); > > > + * if (err) > > > + * return err; > > > + * > > > + * ... > > > + * drm_bridge_add(); > > > + * ... > > > + * } > > > + * > > > + * static void my_bridge_remove() > > > + * { > > > + * struct my_bridge *mybr = ...; > > > + * drm_bridge_remove(&mybr->bridge); > > > + * // ... NO kfree here! > > > + * } > > > > I'm a bit worried there, since that API is pretty difficult to get > > right, and we don't have anything to catch bad patterns. > > > > Let's take a step back. What we're trying to solve here is: > > > > 1) We want to avoid any dangling pointers to a bridge if the bridge > > device is removed. > > > > 2) To do so, we need to switch to reference counted allocations and > > pointers. > > > > 3) Most bridges structures are allocated through devm_kzalloc, and they > > one that aren't are freed at remove time anyway, so the allocated > > structure will be gone when the device is removed. > > > > 4) To properly track users, each user that will use a drm_bridge needs > > to take a reference. > > 5) Handle the disappearing next_bridge problem: probe() function gets > a pointer to the next bridge, but then for some reasons (e.g. because > of the other device being removed or because of some probe deferral) > the next_bridge driver gets unbdound and the next_bridge becomes > unusable before a call to drm_bridge_attach(). > > > > > AFAIU, the destroy introduction and the on-purpose omission of kfree in > > remove is to solve 3. > > > > Introducing a function that allocates the drm_bridge container struct > > (like drmm_encoder_alloc for example), take a reference, register a devm > > kfree action, and return the pointer to the driver structure would solve > > that too pretty nicely. > > > > So, something like: > > > > > > struct driver_priv { > > struct drm_bridge bridge; > > > > ... > > } > > > > static int driver_probe(...) > > { > > struct driver_priv *priv; > > struct drm_bridge *bridge; > > > > .... > > > > priv = devm_drm_bridge_alloc(dev, struct driver_priv, bridge); > > Ah... And devm-cleanup will just drop a reference to that data, > freeing it when all refs are cleaned? Nice idea. I like the idea. It's basically a macro wrapping the calls to kzalloc() + drm_bridge_init() that I proposed in this series. I had thought about such an idea initially but I haven't seen such a macro in drm_connector.h I didn't follow the idea. I don't love the _alloc name though because it will be doing much more than allocating. What about devm_drm_bridge_new()? I understand _alloc is coherent with the drmm_encoder_alloc() and I could survive that... but what about renaming that one to drmm_encoder_new()? Or maybe _create instead of _new, because _new is used for atomic states, in opposition to _old. > > And we'll also need some flag in drm_bridge to indicate that the device > > is gone, similar to what drm_dev_enter()/drm_dev_exit() provides, > > because now your bridge driver sticks around for much longer than your > > device so the expectation that your device managed resources (clocks, > > registers, etc.) are always going to be around. Yes, makes sense too. That should be a drm_bridge_enter/exit(), and drm_bridge.c will need to be sprinkled with them I guess. Luca
On Tue, 7 Jan 2025 12:35:15 +0200 Dmitry Baryshkov <dmitry.baryshkov@linaro.org> wrote: > On Mon, Jan 06, 2025 at 03:49:48PM +0100, Maxime Ripard wrote: > > On Mon, Jan 06, 2025 at 02:24:00PM +0200, Dmitry Baryshkov wrote: > > > On Mon, 6 Jan 2025 at 12:39, Maxime Ripard <mripard@kernel.org> wrote: > > > > > > > > Hi, > > > > > > > > Most of these comments affect your earlier patches, but let's work on > > > > the API-level view. > > > > > > > > On Tue, Dec 31, 2024 at 11:39:58AM +0100, Luca Ceresoli wrote: > > > > > if (IS_ERR(priv)) > > > > return ERR_PTR(priv); > > > > bridge = &priv->bridge; > > > > > > > > ... > > > > > > > > drm_bridge_add(bridge); > > > > } > > > > > > > > Would work just as well. > > > > > > > > I also don't think we need explicit (at least for the common case) > > > > drm_bridge_get and drm_bridge_put calls for bridge users. > > > > drm_bridge_attach and drm_bridge_detach can get/put the reference > > > > directly. > > > > > > As I wrote previously, I think drm_bridge_attach() might be too late for that. > > > It sounds like drm_of_get_panel_or_bridge() and of_drm_find_bridge > > > should increment the refcount, possibly adding a devres action to put > > > the reference. > > > > We probably need both. drm_bridge_attach adds the bridge pointer to the > > encoder bridge_chain list, so if we had something like > > > > bridge = drm_of_find_bridge(); > > drm_bridge_attach(encoder, bridge); > > drm_bridge_put(bridge); > > > > We could have a dangling pointer. > > Yes... So, both drm_bridge_attach and drm_of_find_bridge() should take > the refcount. Exactly, just like drm_bridge_add/remove() do. > Just as an idea, it might be nice to add refcounting to bridges_show(), > so that we can easily verify that refcounting works correctly. Good point, cheap and useful, adding that too! Luca
On Wed, Jan 08, 2025 at 04:24:29PM +0100, Luca Ceresoli wrote: > Hi Maxime, Dmitry, > > thanks both for the useful review! > > On Mon, 6 Jan 2025 14:24:00 +0200 > Dmitry Baryshkov <dmitry.baryshkov@linaro.org> wrote: > > > On Mon, 6 Jan 2025 at 12:39, Maxime Ripard <mripard@kernel.org> wrote: > > > > > > Hi, > > > > > > Most of these comments affect your earlier patches, but let's work on > > > the API-level view. > > > > > > On Tue, Dec 31, 2024 at 11:39:58AM +0100, Luca Ceresoli wrote: > > > > + * When using refcounted mode, the driver should allocate ``struct > > > > + * my_bridge`` using regular allocation (as opposed to ``devm_`` or > > > > + * ``drmm_`` allocation), call drm_bridge_init() immediately afterwards to > > > > + * transfer lifecycle management to the DRM bridge core, and implement a > > > > + * ``.destroy`` function to deallocate the ``struct my_bridge``, as in this > > > > + * example:: > > > > + * > > > > + * static void my_bridge_destroy(struct drm_bridge *bridge) > > > > + * { > > > > + * kfree(container_of(bridge, struct my_bridge, bridge)); > > > > + * } > > > > + * > > > > + * static const struct drm_bridge_funcs my_bridge_funcs = { > > > > + * .destroy = my_bridge_destroy, > > > > + * ... > > > > + * }; > > > > + * > > > > + * static int my_bridge_probe(...) > > > > + * { > > > > + * struct my_bridge *mybr; > > > > + * int err; > > > > + * > > > > + * mybr = kzalloc(sizeof(*mybr), GFP_KERNEL); > > > > + * if (!mybr) > > > > + * return -ENOMEM; > > > > + * > > > > + * err = drm_bridge_init(dev, &mybr->bridge, &my_bridge_funcs); > > > > + * if (err) > > > > + * return err; > > > > + * > > > > + * ... > > > > + * drm_bridge_add(); > > > > + * ... > > > > + * } > > > > + * > > > > + * static void my_bridge_remove() > > > > + * { > > > > + * struct my_bridge *mybr = ...; > > > > + * drm_bridge_remove(&mybr->bridge); > > > > + * // ... NO kfree here! > > > > + * } > > > > > > I'm a bit worried there, since that API is pretty difficult to get > > > right, and we don't have anything to catch bad patterns. > > > > > > Let's take a step back. What we're trying to solve here is: > > > > > > 1) We want to avoid any dangling pointers to a bridge if the bridge > > > device is removed. > > > > > > 2) To do so, we need to switch to reference counted allocations and > > > pointers. > > > > > > 3) Most bridges structures are allocated through devm_kzalloc, and they > > > one that aren't are freed at remove time anyway, so the allocated > > > structure will be gone when the device is removed. > > > > > > 4) To properly track users, each user that will use a drm_bridge needs > > > to take a reference. > > > > 5) Handle the disappearing next_bridge problem: probe() function gets > > a pointer to the next bridge, but then for some reasons (e.g. because > > of the other device being removed or because of some probe deferral) > > the next_bridge driver gets unbdound and the next_bridge becomes > > unusable before a call to drm_bridge_attach(). > > > > > > > > AFAIU, the destroy introduction and the on-purpose omission of kfree in > > > remove is to solve 3. > > > > > > Introducing a function that allocates the drm_bridge container struct > > > (like drmm_encoder_alloc for example), take a reference, register a devm > > > kfree action, and return the pointer to the driver structure would solve > > > that too pretty nicely. > > > > > > So, something like: > > > > > > > > > struct driver_priv { > > > struct drm_bridge bridge; > > > > > > ... > > > } > > > > > > static int driver_probe(...) > > > { > > > struct driver_priv *priv; > > > struct drm_bridge *bridge; > > > > > > .... > > > > > > priv = devm_drm_bridge_alloc(dev, struct driver_priv, bridge); > > > > Ah... And devm-cleanup will just drop a reference to that data, > > freeing it when all refs are cleaned? Nice idea. > > I like the idea. It's basically a macro wrapping the calls to kzalloc() > + drm_bridge_init() that I proposed in this series. I had thought about > such an idea initially but I haven't seen such a macro in > drm_connector.h I didn't follow the idea. > > I don't love the _alloc name though because it will be doing much more > than allocating. What about devm_drm_bridge_new()? > > I understand _alloc is coherent with the drmm_encoder_alloc() and I > could survive that... but what about renaming that one to > drmm_encoder_new()? alloc is used pretty much every where for allocation + init, see CRTC, planes, connectors, etc. It might be unfortunate, but I don't think we should change that convention. > Or maybe _create instead of _new, because _new is used for atomic > states, in opposition to _old. > > > > And we'll also need some flag in drm_bridge to indicate that the device > > > is gone, similar to what drm_dev_enter()/drm_dev_exit() provides, > > > because now your bridge driver sticks around for much longer than your > > > device so the expectation that your device managed resources (clocks, > > > registers, etc.) are always going to be around. > > Yes, makes sense too. That should be a drm_bridge_enter/exit(), and > drm_bridge.c will need to be sprinkled with them I guess. The users would be the drivers, most likely. There's not much we can do at the framework level, unfortunately. Maxime
diff --git a/Documentation/gpu/drm-kms-helpers.rst b/Documentation/gpu/drm-kms-helpers.rst index 8cf2f041af4704875910ce8228ae04615d0f21bd..ca2cfef2101988933e1464fe146997c1a661a117 100644 --- a/Documentation/gpu/drm-kms-helpers.rst +++ b/Documentation/gpu/drm-kms-helpers.rst @@ -151,6 +151,12 @@ Overview .. kernel-doc:: drivers/gpu/drm/drm_bridge.c :doc: overview +Bridge lifecycle +---------------- + +.. kernel-doc:: drivers/gpu/drm/drm_bridge.c + :doc: bridge lifecycle + Display Driver Integration -------------------------- diff --git a/drivers/gpu/drm/drm_bridge.c b/drivers/gpu/drm/drm_bridge.c index 6255ef59f73d8041a8cb7f2c6e23e5a67d1ae926..e9f138aa5b3270b4e3a1a56dc8d4b7e5f993c929 100644 --- a/drivers/gpu/drm/drm_bridge.c +++ b/drivers/gpu/drm/drm_bridge.c @@ -60,6 +60,128 @@ * encoder chain. */ +/** + * DOC: bridge lifecycle + * + * Allocation, initializion and teardown of a bridge can be implemented in + * one of two ways: *refcounted* mode and *legacy* mode. + * + * In **refcounted** mode: + * + * - each &struct drm_bridge is reference counted since its instantiation + * - any code taking a pointer to a bridge has get and put APIs to refcount + * it and so ensure the bridge won't be deallocated while using it + * - deallocation is done when the last put happens and the refcount drops + * to zero + * - the driver instantiating the bridge also holds a reference, but the + * allocated struct can survive it + * + * A bridge using refcounted mode is called a *refcounted bridge*. + * + * In **legacy** mode the &struct drm_bridge lifetime is tied to the device + * instantiating it: it is allocated on probe and freed on removal. Any + * other kernel entities holding a pointer to the bridge could incur in + * use-after-free in case the bridge is deallocated at runtime. + * + * Legacy mode used to be the only one until refcounted bridges were + * introduced, hance the name. It is still fine in case the bridges are a + * fixed part of the pipeline, i.e. if the bridges are removed only when + * tearing down the entire card. Refcounted bridges support both that case + * and the case of more dynamic hardware with bridges that can be removed + * at runtime without tearing down the entire card. + * + * Usage of refcounted bridges happens in two sides: the driver + * implementing the bridge and the code using the bridge. + * + * For *drivers implemeting the bridge*, in both refcounted and legacy + * modes the common and expected pattern is that the driver declares a + * driver-specific struct embedding a &struct drm_bridge. E.g.:: + * + * struct my_bridge { + * ... + * struct drm_bridge bridge; + * ... + * }; + * + * When using refcounted mode, the driver should allocate ``struct + * my_bridge`` using regular allocation (as opposed to ``devm_`` or + * ``drmm_`` allocation), call drm_bridge_init() immediately afterwards to + * transfer lifecycle management to the DRM bridge core, and implement a + * ``.destroy`` function to deallocate the ``struct my_bridge``, as in this + * example:: + * + * static void my_bridge_destroy(struct drm_bridge *bridge) + * { + * kfree(container_of(bridge, struct my_bridge, bridge)); + * } + * + * static const struct drm_bridge_funcs my_bridge_funcs = { + * .destroy = my_bridge_destroy, + * ... + * }; + * + * static int my_bridge_probe(...) + * { + * struct my_bridge *mybr; + * int err; + * + * mybr = kzalloc(sizeof(*mybr), GFP_KERNEL); + * if (!mybr) + * return -ENOMEM; + * + * err = drm_bridge_init(dev, &mybr->bridge, &my_bridge_funcs); + * if (err) + * return err; + * + * ... + * drm_bridge_add(); + * ... + * } + * + * static void my_bridge_remove() + * { + * struct my_bridge *mybr = ...; + * drm_bridge_remove(&mybr->bridge); + * // ... NO kfree here! + * } + * + * In legacy mode, the driver can either use ``devm_`` allocation or + * equivalently free ``struct my_bridge`` in their remove function:: + * + * static int my_bridge_probe(...) + * { + * struct my_bridge *mybr; + * + * mybr = devm_kzalloc(dev, sizeof(*mybr), GFP_KERNEL); + * if (!mybr) + * return -ENOMEM; + * + * ... + * drm_bridge_add(); + * ... + * } + * + * static void my_bridge_remove() + * { + * struct my_bridge *mybr = ...; + * drm_bridge_remove(&mybr->bridge); + * // kfree(mybr) if not using devm_*() for allocation + * } + * + * The *code using the bridge* is all the code taking a &struct drm_bridge + * pointer, including other bridges, encoders and the DRM core. As the + * bridge could be removed at any time, such code can incur in + * use-after-free. To void that, it has to call drm_bridge_get() when + * taking a pointer and drm_bridge_put() after it has done using it. This + * will extend the allocation lifetime of the bridge struct until the last + * reference has been put, potentially after the bridge device has been + * removed from the kernel. + * + * Calling drm_bridge_get() and drm_bridge_put() on a bridge that is not + * refcounted does nothing, so code using these two APIs will work both on + * refcounted bridges and non-refcounted ones. + */ + /** * DOC: display driver integration *
Document in detail the new refcounted bridges as well as the "legacy" way. Signed-off-by: Luca Ceresoli <luca.ceresoli@bootlin.com> --- This patch was added in v5. --- Documentation/gpu/drm-kms-helpers.rst | 6 ++ drivers/gpu/drm/drm_bridge.c | 122 ++++++++++++++++++++++++++++++++++ 2 files changed, 128 insertions(+)