Message ID | 20230106045537.1243887-1-wenst@chromium.org (mailing list archive) |
---|---|
State | Superseded |
Series | platform/chromeos: cros_ec: Use per-device lockdep key |
On Fri, Jan 06, 2023 at 12:55:37PM +0800, Chen-Yu Tsai wrote:
> Lockdep reports a bogus possible deadlock on MT8192 Chromebooks due to
> the following lock sequences:
>
>   1. lock(i2c_register_adapter) [1]; lock(&ec_dev->lock)
>   2. lock(&ec_dev->lock); lock(prepare_lock);
>
> The actual dependency chains are much longer. The shortened version
> looks somewhat like:
>
>   1. cros-ec-rpmsg on mtk-scp
>      ec_dev->lock -> prepare_lock
>   2. In rt5682_i2c_probe() on native I2C bus:
>      prepare_lock -> regmap->lock -> (possibly) i2c_adapter->bus_lock
>   3. In rt5682_i2c_probe() on native I2C bus:
>      regmap->lock -> i2c_adapter->bus_lock
>   4. In sbs_probe() on cros-ec-i2c (passthrough) I2C bus on cros-ec
>      i2c_adapter->bus_lock -> ec_dev->lock
>
> While lockdep is correct that the shared lockdep classes have a circular
> dependency, it is bogus because
>
>   a) 2+3 happen on a native I2C bus
>   b) 4 happens on the actual EC on ChromeOS devices
>   c) 1 happens on the SCP coprocessor on MediaTek Chromebooks that just
>      happen to expose a cros-ec interface, but do not have a passthrough
>      I2C bus
>
> In short, the "dependencies" are actually on different devices.

Path of 4 looks weird to me.

Could you point out where sbs_probe() gets to acquire ec_dev->lock?

I may misunderstand: I thought there is no such I2C bus for passthrough
from kernel's point of view (as the bus and devices behind the EC).
See also [2].

[2]: https://elixir.bootlin.com/linux/v6.2-rc2/source/drivers/platform/chrome/cros_ec.c#L241

On a related note, for the commit title: s/chromeos/chrome/ if it gets
chance to have next version.

On Fri, Jan 6, 2023 at 5:08 PM Tzung-Bi Shih <tzungbi@kernel.org> wrote:
>
> On Fri, Jan 06, 2023 at 12:55:37PM +0800, Chen-Yu Tsai wrote:
> > Lockdep reports a bogus possible deadlock on MT8192 Chromebooks due to
> > the following lock sequences:
> >
> >   1. lock(i2c_register_adapter) [1]; lock(&ec_dev->lock)
> >   2. lock(&ec_dev->lock); lock(prepare_lock);
> >
> > The actual dependency chains are much longer. The shortened version
> > looks somewhat like:
> >
> >   1. cros-ec-rpmsg on mtk-scp
> >      ec_dev->lock -> prepare_lock
> >   2. In rt5682_i2c_probe() on native I2C bus:
> >      prepare_lock -> regmap->lock -> (possibly) i2c_adapter->bus_lock
> >   3. In rt5682_i2c_probe() on native I2C bus:
> >      regmap->lock -> i2c_adapter->bus_lock
> >   4. In sbs_probe() on cros-ec-i2c (passthrough) I2C bus on cros-ec
> >      i2c_adapter->bus_lock -> ec_dev->lock
> >
> > While lockdep is correct that the shared lockdep classes have a circular
> > dependency, it is bogus because
> >
> >   a) 2+3 happen on a native I2C bus
> >   b) 4 happens on the actual EC on ChromeOS devices
> >   c) 1 happens on the SCP coprocessor on MediaTek Chromebooks that just
> >      happen to expose a cros-ec interface, but do not have a passthrough
> >      I2C bus
> >
> > In short, the "dependencies" are actually on different devices.
>
> Path of 4 looks weird to me.
>
> Could you point out where sbs_probe() gets to acquire ec_dev->lock?

sbs_probe() calls sbs_get_battery_presence_and_health(), which

-> does an I2C transfer. This SBS instance is connected on the I2C bus
   on the EC, so the I2C transfer

   -> acquires i2c_adapter->bus_lock, and
   -> calls ec_i2c_xfer(), which

      -> calls cros_ec_cmd_xfer_status(), which

         -> calls cros_ec_cmd_xfer(), which

            -> acquires ec_dev->lock

> I may misunderstand: I thought there is no such I2C bus for passthrough
> from kernel's point of view (as the bus and devices behind the EC).
> See also [2].

It is an I2C adapter on the EC, also known as i2c-cros-ec-tunnel.
Passthrough probably isn't the right word.

> [2]: https://elixir.bootlin.com/linux/v6.2-rc2/source/drivers/platform/chrome/cros_ec.c#L241
>
>
> On a related note, for the commit title: s/chromeos/chrome/ if it gets
> chance to have next version.

OK.

Thanks
ChenYu

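The call chain above is where the bus_lock -> ec_dev->lock ordering comes from. As a rough userspace sketch (not kernel code: the `tracker` type and the `track_*` and `demo_*` names are made up for illustration, and real lockdep bookkeeping is considerably more involved), nesting the acquisitions the same way records a single ordering edge:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_HELD  4
#define MAX_EDGES 8

/* "taken" was acquired while "held" was the innermost held lock. */
struct edge {
	const char *held;
	const char *taken;
};

/* Hypothetical toy tracker: a stack of held locks plus recorded edges. */
struct tracker {
	const char *held[MAX_HELD];
	size_t depth;
	struct edge edges[MAX_EDGES];
	size_t nr_edges;
};

static void track_acquire(struct tracker *t, const char *name)
{
	assert(t->depth < MAX_HELD);
	if (t->depth > 0) {
		/* Record an edge from the innermost held lock to the new one. */
		assert(t->nr_edges < MAX_EDGES);
		t->edges[t->nr_edges].held = t->held[t->depth - 1];
		t->edges[t->nr_edges].taken = name;
		t->nr_edges++;
	}
	t->held[t->depth++] = name;
}

static void track_release(struct tracker *t, const char *name)
{
	/* Locks must be released in reverse order of acquisition here. */
	assert(t->depth > 0 && strcmp(t->held[t->depth - 1], name) == 0);
	t->depth--;
}

/* Stands in for cros_ec_cmd_xfer(), which takes ec_dev->lock. */
static void demo_ec_cmd_xfer(struct tracker *t)
{
	track_acquire(t, "ec_dev->lock");
	track_release(t, "ec_dev->lock");
}

/* Stands in for ec_i2c_xfer() -> cros_ec_cmd_xfer_status(). */
static void demo_ec_i2c_xfer(struct tracker *t)
{
	demo_ec_cmd_xfer(t);
}

/* Stands in for the I2C transfer, which holds the adapter bus lock. */
static void demo_i2c_transfer(struct tracker *t)
{
	track_acquire(t, "i2c_adapter->bus_lock");
	demo_ec_i2c_xfer(t);
	track_release(t, "i2c_adapter->bus_lock");
}

/* Stands in for sbs_probe() reading battery presence over I2C. */
static void demo_sbs_probe(struct tracker *t)
{
	demo_i2c_transfer(t);
}
```

Calling `demo_sbs_probe()` on a zero-initialized `struct tracker` leaves one edge, from `i2c_adapter->bus_lock` to `ec_dev->lock`, which corresponds to chain 4 in the commit message.
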
On Sat, Jan 07, 2023 at 01:43:57PM +0800, Chen-Yu Tsai wrote:
> On Fri, Jan 6, 2023 at 5:08 PM Tzung-Bi Shih <tzungbi@kernel.org> wrote:
> >
> > On Fri, Jan 06, 2023 at 12:55:37PM +0800, Chen-Yu Tsai wrote:
> > > Lockdep reports a bogus possible deadlock on MT8192 Chromebooks due to
> > > the following lock sequences:
> > >
> > >   1. lock(i2c_register_adapter) [1]; lock(&ec_dev->lock)
> > >   2. lock(&ec_dev->lock); lock(prepare_lock);
> > >
> > > The actual dependency chains are much longer. The shortened version
> > > looks somewhat like:
> > >
> > >   1. cros-ec-rpmsg on mtk-scp
> > >      ec_dev->lock -> prepare_lock
> > >   2. In rt5682_i2c_probe() on native I2C bus:
> > >      prepare_lock -> regmap->lock -> (possibly) i2c_adapter->bus_lock
> > >   3. In rt5682_i2c_probe() on native I2C bus:
> > >      regmap->lock -> i2c_adapter->bus_lock
> > >   4. In sbs_probe() on cros-ec-i2c (passthrough) I2C bus on cros-ec
> > >      i2c_adapter->bus_lock -> ec_dev->lock
> > >
> > > While lockdep is correct that the shared lockdep classes have a circular
> > > dependency, it is bogus because
> > >
> > >   a) 2+3 happen on a native I2C bus
> > >   b) 4 happens on the actual EC on ChromeOS devices
> > >   c) 1 happens on the SCP coprocessor on MediaTek Chromebooks that just
> > >      happen to expose a cros-ec interface, but do not have a passthrough
> > >      I2C bus
> > >
> > > In short, the "dependencies" are actually on different devices.
> >
> > Path of 4 looks weird to me.
> >
> > Could you point out where sbs_probe() gets to acquire ec_dev->lock?
>
> sbs_probe() calls sbs_get_battery_presence_and_health(), which
>
> -> does an I2C transfer. This SBS instance is connected on the I2C bus
>    on the EC, so the I2C transfer
>
>    -> acquires i2c_adapter->bus_lock, and

I see.

Another question: the i2c_adapter here should be different from the native
I2C bus in 2 and 3. Did they really form the circular dependencies?

On Mon, Jan 9, 2023 at 1:46 PM Tzung-Bi Shih <tzungbi@kernel.org> wrote:
>
> On Sat, Jan 07, 2023 at 01:43:57PM +0800, Chen-Yu Tsai wrote:
> > On Fri, Jan 6, 2023 at 5:08 PM Tzung-Bi Shih <tzungbi@kernel.org> wrote:
> > >
> > > On Fri, Jan 06, 2023 at 12:55:37PM +0800, Chen-Yu Tsai wrote:
> > > > Lockdep reports a bogus possible deadlock on MT8192 Chromebooks due to
> > > > the following lock sequences:
> > > >
> > > >   1. lock(i2c_register_adapter) [1]; lock(&ec_dev->lock)
> > > >   2. lock(&ec_dev->lock); lock(prepare_lock);
> > > >
> > > > The actual dependency chains are much longer. The shortened version
> > > > looks somewhat like:
> > > >
> > > >   1. cros-ec-rpmsg on mtk-scp
> > > >      ec_dev->lock -> prepare_lock
> > > >   2. In rt5682_i2c_probe() on native I2C bus:
> > > >      prepare_lock -> regmap->lock -> (possibly) i2c_adapter->bus_lock
> > > >   3. In rt5682_i2c_probe() on native I2C bus:
> > > >      regmap->lock -> i2c_adapter->bus_lock
> > > >   4. In sbs_probe() on cros-ec-i2c (passthrough) I2C bus on cros-ec
> > > >      i2c_adapter->bus_lock -> ec_dev->lock
> > > >
> > > > While lockdep is correct that the shared lockdep classes have a circular
> > > > dependency, it is bogus because
> > > >
> > > >   a) 2+3 happen on a native I2C bus
> > > >   b) 4 happens on the actual EC on ChromeOS devices
> > > >   c) 1 happens on the SCP coprocessor on MediaTek Chromebooks that just
> > > >      happen to expose a cros-ec interface, but do not have a passthrough
> > > >      I2C bus
> > > >
> > > > In short, the "dependencies" are actually on different devices.
> > >
> > > Path of 4 looks weird to me.
> > >
> > > Could you point out where sbs_probe() gets to acquire ec_dev->lock?
> >
> > sbs_probe() calls sbs_get_battery_presence_and_health(), which
> >
> > -> does an I2C transfer. This SBS instance is connected on the I2C bus
> >    on the EC, so the I2C transfer
> >
> >    -> acquires i2c_adapter->bus_lock, and
>
> I see.
>
> Another question: the i2c_adapter here should be different from the native
> I2C bus in 2 and 3. Did they really form the circular dependencies?

That's why it's a false positive. lockdep normally doesn't track individual
instances, only classes of locks. The class is declared as part of the
mutex_init() macro.

ChenYu

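To illustrate the point about classes versus instances, here is a minimal userspace sketch. It is not the kernel implementation: the `demo_*` names and the simplified structures are assumptions for illustration. It mimics only the idea that an init macro plants one static key per call site, so every lock initialized from that site shares a class unless the class is overridden with a per-device key.

```c
#include <assert.h>
#include <stddef.h>

/* Userspace stand-ins only; the real types live in the kernel. */
struct demo_class_key {
	char unused;
};

struct demo_mutex {
	const struct demo_class_key *key; /* lock class of this instance */
};

static void demo_mutex_init_key(struct demo_mutex *m,
				const struct demo_class_key *key)
{
	assert(m != NULL && key != NULL);
	m->key = key;
}

/*
 * One static key per expansion site: every mutex initialized through
 * the same line of source ends up in the same class.
 */
#define demo_mutex_init(m)                              \
	do {                                            \
		static struct demo_class_key site_key;  \
		demo_mutex_init_key((m), &site_key);    \
	} while (0)

/* Hypothetical, heavily trimmed device structure. */
struct demo_ec_device {
	struct demo_class_key lockdep_key; /* per-device key */
	struct demo_mutex lock;
};

/* Shared class: all devices get the call-site key. */
static void demo_register_shared(struct demo_ec_device *ec)
{
	demo_mutex_init(&ec->lock);
}

/* Per-device class: the call-site key is replaced after init. */
static void demo_register_per_device(struct demo_ec_device *ec)
{
	demo_mutex_init(&ec->lock);
	ec->lock.key = &ec->lockdep_key; /* models lockdep_set_class() */
}
```

Two devices passed through `demo_register_shared()` end up with the same key pointer, while two devices passed through `demo_register_per_device()` end up with distinct ones, which is the distinction the per-device key in the patch relies on.
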
On Mon, Jan 09, 2023 at 02:19:38PM +0800, Chen-Yu Tsai wrote:
> On Mon, Jan 9, 2023 at 1:46 PM Tzung-Bi Shih <tzungbi@kernel.org> wrote:
> >
> > On Sat, Jan 07, 2023 at 01:43:57PM +0800, Chen-Yu Tsai wrote:
> > > On Fri, Jan 6, 2023 at 5:08 PM Tzung-Bi Shih <tzungbi@kernel.org> wrote:
> > > >
> > > > On Fri, Jan 06, 2023 at 12:55:37PM +0800, Chen-Yu Tsai wrote:
> > > > > Lockdep reports a bogus possible deadlock on MT8192 Chromebooks due to
> > > > > the following lock sequences:
> > > > >
> > > > >   1. lock(i2c_register_adapter) [1]; lock(&ec_dev->lock)
> > > > >   2. lock(&ec_dev->lock); lock(prepare_lock);
> > > > >
> > > > > The actual dependency chains are much longer. The shortened version
> > > > > looks somewhat like:
> > > > >
> > > > >   1. cros-ec-rpmsg on mtk-scp
> > > > >      ec_dev->lock -> prepare_lock
> > > > >   2. In rt5682_i2c_probe() on native I2C bus:
> > > > >      prepare_lock -> regmap->lock -> (possibly) i2c_adapter->bus_lock
> > > > >   3. In rt5682_i2c_probe() on native I2C bus:
> > > > >      regmap->lock -> i2c_adapter->bus_lock
> > > > >   4. In sbs_probe() on cros-ec-i2c (passthrough) I2C bus on cros-ec
> > > > >      i2c_adapter->bus_lock -> ec_dev->lock
> > > > >
> > > > > While lockdep is correct that the shared lockdep classes have a circular
> > > > > dependency, it is bogus because
> > > > >
> > > > >   a) 2+3 happen on a native I2C bus
> > > > >   b) 4 happens on the actual EC on ChromeOS devices
> > > > >   c) 1 happens on the SCP coprocessor on MediaTek Chromebooks that just
> > > > >      happen to expose a cros-ec interface, but do not have a passthrough
> > > > >      I2C bus
> > > > >
> > > > > In short, the "dependencies" are actually on different devices.
> > > >
> > > > Path of 4 looks weird to me.
> > > >
> > > > Could you point out where sbs_probe() gets to acquire ec_dev->lock?
> > >
> > > sbs_probe() calls sbs_get_battery_presence_and_health(), which
> > >
> > > -> does an I2C transfer. This SBS instance is connected on the I2C bus
> > >    on the EC, so the I2C transfer
> > >
> > >    -> acquires i2c_adapter->bus_lock, and
> >
> > I see.
> >
> > Another question: the i2c_adapter here should be different from the native
> > I2C bus in 2 and 3. Did they really form the circular dependencies?
>
> That's why it's a false positive. lockdep normally doesn't track individual
> instances, only classes of locks. The class is declared as part of the
> mutex_init() macro.

Is the following understanding correct:
It has 2 ways to break the "fake" circular dependencies: separate lockdep key
for i2c_adapter vs. ec_dev. The patch adopts the latter one because it has
limited impact for other I2C-related drivers.

On Mon, Jan 9, 2023 at 3:30 PM Tzung-Bi Shih <tzungbi@kernel.org> wrote:
>
> On Mon, Jan 09, 2023 at 02:19:38PM +0800, Chen-Yu Tsai wrote:
> > On Mon, Jan 9, 2023 at 1:46 PM Tzung-Bi Shih <tzungbi@kernel.org> wrote:
> > >
> > > On Sat, Jan 07, 2023 at 01:43:57PM +0800, Chen-Yu Tsai wrote:
> > > > On Fri, Jan 6, 2023 at 5:08 PM Tzung-Bi Shih <tzungbi@kernel.org> wrote:
> > > > >
> > > > > On Fri, Jan 06, 2023 at 12:55:37PM +0800, Chen-Yu Tsai wrote:
> > > > > > Lockdep reports a bogus possible deadlock on MT8192 Chromebooks due to
> > > > > > the following lock sequences:
> > > > > >
> > > > > >   1. lock(i2c_register_adapter) [1]; lock(&ec_dev->lock)
> > > > > >   2. lock(&ec_dev->lock); lock(prepare_lock);
> > > > > >
> > > > > > The actual dependency chains are much longer. The shortened version
> > > > > > looks somewhat like:
> > > > > >
> > > > > >   1. cros-ec-rpmsg on mtk-scp
> > > > > >      ec_dev->lock -> prepare_lock
> > > > > >   2. In rt5682_i2c_probe() on native I2C bus:
> > > > > >      prepare_lock -> regmap->lock -> (possibly) i2c_adapter->bus_lock
> > > > > >   3. In rt5682_i2c_probe() on native I2C bus:
> > > > > >      regmap->lock -> i2c_adapter->bus_lock
> > > > > >   4. In sbs_probe() on cros-ec-i2c (passthrough) I2C bus on cros-ec
> > > > > >      i2c_adapter->bus_lock -> ec_dev->lock
> > > > > >
> > > > > > While lockdep is correct that the shared lockdep classes have a circular
> > > > > > dependency, it is bogus because
> > > > > >
> > > > > >   a) 2+3 happen on a native I2C bus
> > > > > >   b) 4 happens on the actual EC on ChromeOS devices
> > > > > >   c) 1 happens on the SCP coprocessor on MediaTek Chromebooks that just
> > > > > >      happen to expose a cros-ec interface, but do not have a passthrough
> > > > > >      I2C bus
> > > > > >
> > > > > > In short, the "dependencies" are actually on different devices.
> > > > >
> > > > > Path of 4 looks weird to me.
> > > > >
> > > > > Could you point out where sbs_probe() gets to acquire ec_dev->lock?
> > > >
> > > > sbs_probe() calls sbs_get_battery_presence_and_health(), which
> > > >
> > > > -> does an I2C transfer. This SBS instance is connected on the I2C bus
> > > >    on the EC, so the I2C transfer
> > > >
> > > >    -> acquires i2c_adapter->bus_lock, and
> > >
> > > I see.
> > >
> > > Another question: the i2c_adapter here should be different from the native
> > > I2C bus in 2 and 3. Did they really form the circular dependencies?
> >
> > That's why it's a false positive. lockdep normally doesn't track individual
> > instances, only classes of locks. The class is declared as part of the
> > mutex_init() macro.
>
> Is the following understanding correct:
> It has 2 ways to break the "fake" circular dependencies: separate lockdep key
> for i2c_adapter vs. ec_dev. The patch adopts the latter one because it has
> limited impact for other I2C-related drivers.

That's correct.

On Mon, Jan 09, 2023 at 03:35:08PM +0800, Chen-Yu Tsai wrote:
> On Mon, Jan 9, 2023 at 3:30 PM Tzung-Bi Shih <tzungbi@kernel.org> wrote:
> >
> > On Mon, Jan 09, 2023 at 02:19:38PM +0800, Chen-Yu Tsai wrote:
> > > On Mon, Jan 9, 2023 at 1:46 PM Tzung-Bi Shih <tzungbi@kernel.org> wrote:
> > > >
> > > > On Sat, Jan 07, 2023 at 01:43:57PM +0800, Chen-Yu Tsai wrote:
> > > > > On Fri, Jan 6, 2023 at 5:08 PM Tzung-Bi Shih <tzungbi@kernel.org> wrote:
> > > > > >
> > > > > > On Fri, Jan 06, 2023 at 12:55:37PM +0800, Chen-Yu Tsai wrote:
> > > > > > > Lockdep reports a bogus possible deadlock on MT8192 Chromebooks due to
> > > > > > > the following lock sequences:
> > > > > > >
> > > > > > >   1. lock(i2c_register_adapter) [1]; lock(&ec_dev->lock)
> > > > > > >   2. lock(&ec_dev->lock); lock(prepare_lock);
> > > > > > >
> > > > > > > The actual dependency chains are much longer. The shortened version
> > > > > > > looks somewhat like:
> > > > > > >
> > > > > > >   1. cros-ec-rpmsg on mtk-scp
> > > > > > >      ec_dev->lock -> prepare_lock
> > > > > > >   2. In rt5682_i2c_probe() on native I2C bus:
> > > > > > >      prepare_lock -> regmap->lock -> (possibly) i2c_adapter->bus_lock
> > > > > > >   3. In rt5682_i2c_probe() on native I2C bus:
> > > > > > >      regmap->lock -> i2c_adapter->bus_lock
> > > > > > >   4. In sbs_probe() on cros-ec-i2c (passthrough) I2C bus on cros-ec
> > > > > > >      i2c_adapter->bus_lock -> ec_dev->lock
> > > > > > >
> > > > > > > While lockdep is correct that the shared lockdep classes have a circular
> > > > > > > dependency, it is bogus because
> > > > > > >
> > > > > > >   a) 2+3 happen on a native I2C bus
> > > > > > >   b) 4 happens on the actual EC on ChromeOS devices
> > > > > > >   c) 1 happens on the SCP coprocessor on MediaTek Chromebooks that just
> > > > > > >      happen to expose a cros-ec interface, but do not have a passthrough
> > > > > > >      I2C bus
> > > > > > >
> > > > > > > In short, the "dependencies" are actually on different devices.
> > > > > >
> > > > > > Path of 4 looks weird to me.
> > > > > >
> > > > > > Could you point out where sbs_probe() gets to acquire ec_dev->lock?
> > > > >
> > > > > sbs_probe() calls sbs_get_battery_presence_and_health(), which
> > > > >
> > > > > -> does an I2C transfer. This SBS instance is connected on the I2C bus
> > > > >    on the EC, so the I2C transfer
> > > > >
> > > > >    -> acquires i2c_adapter->bus_lock, and
> > > >
> > > > I see.
> > > >
> > > > Another question: the i2c_adapter here should be different from the native
> > > > I2C bus in 2 and 3. Did they really form the circular dependencies?
> > >
> > > That's why it's a false positive. lockdep normally doesn't track individual
> > > instances, only classes of locks. The class is declared as part of the
> > > mutex_init() macro.
> >
> > Is the following understanding correct:
> > It has 2 ways to break the "fake" circular dependencies: separate lockdep key
> > for i2c_adapter vs. ec_dev. The patch adopts the latter one because it has
> > limited impact for other I2C-related drivers.
>
> That's correct.

Thanks for the explanation. The patch looks good to me.

Just realized a kernel-doc warning after applying the patch:

$ ./scripts/kernel-doc -none include/linux/platform_data/cros_ec_proto.h
include/linux/platform_data/cros_ec_proto.h:199: warning: Function parameter or member 'lockdep_key' not described in 'cros_ec_device'

Please fix the warning and commit title.

diff --git a/drivers/platform/chrome/cros_ec.c b/drivers/platform/chrome/cros_ec.c
index ec733f683f34..4ae57820afd5 100644
--- a/drivers/platform/chrome/cros_ec.c
+++ b/drivers/platform/chrome/cros_ec.c
@@ -198,12 +198,14 @@ int cros_ec_register(struct cros_ec_device *ec_dev)
 	if (!ec_dev->dout)
 		return -ENOMEM;
 
+	lockdep_register_key(&ec_dev->lockdep_key);
 	mutex_init(&ec_dev->lock);
+	lockdep_set_class(&ec_dev->lock, &ec_dev->lockdep_key);
 
 	err = cros_ec_query_all(ec_dev);
 	if (err) {
 		dev_err(dev, "Cannot identify the EC: error %d\n", err);
-		return err;
+		goto destroy_mutex;
 	}
 
 	if (ec_dev->irq > 0) {
@@ -215,7 +217,7 @@ int cros_ec_register(struct cros_ec_device *ec_dev)
 		if (err) {
 			dev_err(dev, "Failed to request IRQ %d: %d\n",
 				ec_dev->irq, err);
-			return err;
+			goto destroy_mutex;
 		}
 	}
 
@@ -226,7 +228,8 @@ int cros_ec_register(struct cros_ec_device *ec_dev)
 	if (IS_ERR(ec_dev->ec)) {
 		dev_err(ec_dev->dev,
 			"Failed to create CrOS EC platform device\n");
-		return PTR_ERR(ec_dev->ec);
+		err = PTR_ERR(ec_dev->ec);
+		goto destroy_mutex;
 	}
 
 	if (ec_dev->max_passthru) {
@@ -292,6 +295,9 @@ int cros_ec_register(struct cros_ec_device *ec_dev)
 exit:
 	platform_device_unregister(ec_dev->ec);
 	platform_device_unregister(ec_dev->pd);
+destroy_mutex:
+	mutex_destroy(&ec_dev->lock);
+	lockdep_unregister_key(&ec_dev->lockdep_key);
 	return err;
 }
 EXPORT_SYMBOL(cros_ec_register);
@@ -309,6 +315,8 @@ void cros_ec_unregister(struct cros_ec_device *ec_dev)
 	if (ec_dev->pd)
 		platform_device_unregister(ec_dev->pd);
 	platform_device_unregister(ec_dev->ec);
+	mutex_destroy(&ec_dev->lock);
+	lockdep_unregister_key(&ec_dev->lockdep_key);
 }
 EXPORT_SYMBOL(cros_ec_unregister);
 
diff --git a/include/linux/platform_data/cros_ec_proto.h b/include/linux/platform_data/cros_ec_proto.h
index e43107e0bee1..677d2eae1692 100644
--- a/include/linux/platform_data/cros_ec_proto.h
+++ b/include/linux/platform_data/cros_ec_proto.h
@@ -9,6 +9,7 @@
 #define __LINUX_CROS_EC_PROTO_H
 
 #include <linux/device.h>
+#include <linux/lockdep_types.h>
 #include <linux/mutex.h>
 #include <linux/notifier.h>
 
@@ -160,6 +161,7 @@ struct cros_ec_device {
 			struct cros_ec_command *msg);
 	int (*pkt_xfer)(struct cros_ec_device *ec,
 			struct cros_ec_command *msg);
+	struct lock_class_key lockdep_key;
 	struct mutex lock;
 	u8 mkbp_event_supported;
 	bool host_sleep_v1;

Lockdep reports a bogus possible deadlock on MT8192 Chromebooks due to
the following lock sequences:

  1. lock(i2c_register_adapter) [1]; lock(&ec_dev->lock)
  2. lock(&ec_dev->lock); lock(prepare_lock);

The actual dependency chains are much longer. The shortened version
looks somewhat like:

  1. cros-ec-rpmsg on mtk-scp
     ec_dev->lock -> prepare_lock
  2. In rt5682_i2c_probe() on native I2C bus:
     prepare_lock -> regmap->lock -> (possibly) i2c_adapter->bus_lock
  3. In rt5682_i2c_probe() on native I2C bus:
     regmap->lock -> i2c_adapter->bus_lock
  4. In sbs_probe() on cros-ec-i2c (passthrough) I2C bus on cros-ec
     i2c_adapter->bus_lock -> ec_dev->lock

While lockdep is correct that the shared lockdep classes have a circular
dependency, it is bogus because

  a) 2+3 happen on a native I2C bus
  b) 4 happens on the actual EC on ChromeOS devices
  c) 1 happens on the SCP coprocessor on MediaTek Chromebooks that just
     happen to expose a cros-ec interface, but do not have a passthrough
     I2C bus

In short, the "dependencies" are actually on different devices.

Set up a per-device lockdep key for cros_ec devices so lockdep can tell
the two instances apart. This helps with getting rid of the bogus
lockdep warning. For ChromeOS devices that only have one cros-ec
instance this doesn't change anything.

Also add a missing mutex_destroy, just to make the teardown complete.

[1] This is likely the per I2C bus lock with shared lockdep class

Signed-off-by: Chen-Yu Tsai <wenst@chromium.org>
---
 drivers/platform/chrome/cros_ec.c           | 14 +++++++++++---
 include/linux/platform_data/cros_ec_proto.h |  2 ++
 2 files changed, 13 insertions(+), 3 deletions(-)
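
To make the effect of the per-device key concrete, here is a toy userspace model of class-level dependency tracking. It is only a sketch under stated assumptions: the `dep_*` helpers, the `CLS_*` class IDs and `chains_form_cycle()` are hypothetical names, and the graph search is a plain depth-first search rather than anything lockdep actually runs. It encodes the four shortened chains from the commit message and checks for a cycle when both cros-ec instances share one class versus when each has its own.

```c
#include <assert.h>
#include <stdbool.h>
#include <string.h>

#define MAX_CLASSES 8

/* edge[a][b] means "class b was taken while class a was held". */
struct dep_graph {
	bool edge[MAX_CLASSES][MAX_CLASSES];
};

static void dep_graph_init(struct dep_graph *g)
{
	memset(g, 0, sizeof(*g));
}

static void dep_add(struct dep_graph *g, int held, int taken)
{
	assert(held >= 0 && held < MAX_CLASSES);
	assert(taken >= 0 && taken < MAX_CLASSES);
	g->edge[held][taken] = true;
}

/* Depth-first search: is 'target' reachable from 'node'? */
static bool reaches(const struct dep_graph *g, int node, int target,
		    bool *seen)
{
	if (node == target)
		return true;
	if (seen[node])
		return false;
	seen[node] = true;
	for (int next = 0; next < MAX_CLASSES; next++)
		if (g->edge[node][next] && reaches(g, next, target, seen))
			return true;
	return false;
}

/* A cycle exists if some edge a->b also has a path from b back to a. */
static bool dep_has_cycle(const struct dep_graph *g)
{
	for (int a = 0; a < MAX_CLASSES; a++)
		for (int b = 0; b < MAX_CLASSES; b++) {
			bool seen[MAX_CLASSES] = { false };

			if (g->edge[a][b] && reaches(g, b, a, seen))
				return true;
		}
	return false;
}

/* Hypothetical class IDs for the locks named in the commit message. */
enum { CLS_PREPARE, CLS_REGMAP, CLS_I2C_BUS, CLS_EC_SCP, CLS_EC_MAIN };

/* Feed the four shortened chains into the graph and look for a cycle. */
static bool chains_form_cycle(int ec_on_scp, int ec_on_main)
{
	struct dep_graph g;

	dep_graph_init(&g);
	dep_add(&g, ec_on_scp, CLS_PREPARE);   /* chain 1 */
	dep_add(&g, CLS_PREPARE, CLS_REGMAP);  /* chain 2 */
	dep_add(&g, CLS_REGMAP, CLS_I2C_BUS);  /* chains 2 and 3 */
	dep_add(&g, CLS_I2C_BUS, ec_on_main);  /* chain 4 */
	return dep_has_cycle(&g);
}
```

Passing the same class for both instances, as in `chains_form_cycle(CLS_EC_MAIN, CLS_EC_MAIN)`, closes the loop, whereas `chains_form_cycle(CLS_EC_SCP, CLS_EC_MAIN)` does not. Note that the I2C bus lock class stays shared in both cases, matching the choice of keying ec_dev rather than i2c_adapter.
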