diff mbox

[v1,3/7] platform/mellanox: mlxreg-hotplug: add extra cycle for hotplug work queue

Message ID 1522144927-56512-4-git-send-email-vadimp@mellanox.com (mailing list archive)
State Superseded, archived
Delegated to: Darren Hart
Headers show

Commit Message

Vadim Pasternak March 27, 2018, 10:02 a.m. UTC
It adds missed logic for signal acknowledge, by adding an extra run for
work queue in case a signal is received, but no specific signal assertion
is detected. Such case theoretically can happen for example in case
several units are removed or inserted at the same time. In this situation
acknowledge for some signal can be missed at signal top aggreagation
status level. The extra run will allow to handler to acknowledge the
missed signal.

The interrupt handling flow performs the next steps:
(1)
Enter mlxreg_hotplug_work_handler due to signal assertion.
Aggregation status register is changed for example from 0xff to 0xfd
(event signal group related to bit 1).
(2)
Mask aggregation interrupts, read aggregation status register and save it
(0xfd) in aggr_cache, then traverse down to handle signal from groups
related to the changed bit.
(3)
Read and mask group related signal.
Acknowledge and unmask group related signal (acknowledge should clear
aggregation status register from 0xfd back to 0xff).
(4)
Re-schedule work queue for the immediate execution.
(5)
Enter mlxreg_hotplug_work_handler due to re-scheduling.
Aggregation status is changed from previous 0xfd to 0xff.
Go over steps (2) - (5) and in case no new signal assertion
is detected - unmask aggregation interrupts.

The possible race could happen in case new signal from the same group is
asserted after step (3) and prior step (5). In such case aggregation
status will change back from 0xff to 0xfd and the value read from the
aggregation status register will be the same as a value saved in
aggr_cache. As a result the handler will not traverse down and signal
will stay unhandled.
The fix will enforce handler to traverse down in case the signal is
received, but signal assertion is not detected.

Fixes: 1f976f6978bf ("platform/x86: Move Mellanox platform hotplug driver to platform/mellanox")
Signed-off-by: Vadim Pasternak <vadimp@mellanox.com>
---
 drivers/platform/mellanox/mlxreg-hotplug.c | 17 +++++++++++++++++
 1 file changed, 17 insertions(+)

Comments

Darren Hart April 13, 2018, 4:47 p.m. UTC | #1
On Tue, Mar 27, 2018 at 10:02:03AM +0000, Vadim Pasternak wrote:
> It adds missed logic for signal acknowledge, by adding an extra run for
> work queue in case a signal is received, but no specific signal assertion
> is detected. Such case theoretically can happen for example in case
> several units are removed or inserted at the same time. In this situation
> acknowledge for some signal can be missed at signal top aggreagation
> status level.

Why can they be missed? What does "signal top aggregation status level"
mean? I'm asking to confirm that we are fixing this at the right place,
and not just applying a suboptimal bandaid by running the workqueue
more.

...

> 
> Fixes: 1f976f6978bf ("platform/x86: Move Mellanox platform hotplug driver to platform/mellanox")

You didn't mention above how this commit caused this - how did moving
the driver create this problem? Does this need to go to stable? I'm
assuming not as you've called it theoretical - not something you've
observed in practice?

...

>  static int mlxreg_hotplug_device_create(struct mlxreg_hotplug_priv_data *priv,
> @@ -410,6 +413,18 @@ static void mlxreg_hotplug_work_handler(struct work_struct *work)
>  	aggr_asserted = priv->aggr_cache ^ regval;
>  	priv->aggr_cache = regval;
>  
> +	/*
> +	 * Handler is invoked, but no assertion is detected at top aggregation
> +	 * status level. Set aggr_asserted to mask value to allow handler extra
> +	 * run over all relevant signals to recover any missed signal.
> +	 */
> +	if (priv->not_asserted == MLXREG_HOTPLUG_NOT_ASSERT) {
> +		priv->not_asserted = 0;
> +		aggr_asserted = pdata->mask;
> +	}
> +	if (!aggr_asserted)

We seem to check aggr_asserted in several places in this routine now.
Looks like something we could simplify. If you check it here, can you
drop the check lower in the routine? Can you remove it from the for loop
if conditional entirely? Please consider how to simplify.
Vadim Pasternak April 13, 2018, 7:39 p.m. UTC | #2
> -----Original Message-----
> From: Darren Hart [mailto:dvhart@infradead.org]
> Sent: Friday, April 13, 2018 7:47 PM
> To: Vadim Pasternak <vadimp@mellanox.com>
> Cc: andy.shevchenko@gmail.com; gregkh@linuxfoundation.org; linux-
> kernel@vger.kernel.org; platform-driver-x86@vger.kernel.org; jiri@resnulli.us;
> Michael Shych <michaelsh@mellanox.com>; ivecera@redhat.com
> Subject: Re: [PATCH v1 3/7] platform/mellanox: mlxreg-hotplug: add extra cycle
> for hotplug work queue
> 
> On Tue, Mar 27, 2018 at 10:02:03AM +0000, Vadim Pasternak wrote:
> > It adds missed logic for signal acknowledge, by adding an extra run
> > for work queue in case a signal is received, but no specific signal
> > assertion is detected. Such case theoretically can happen for example
> > in case several units are removed or inserted at the same time. In
> > this situation acknowledge for some signal can be missed at signal top
> > aggreagation status level.
> 
> Why can they be missed? What does "signal top aggregation status level"
> mean? I'm asking to confirm that we are fixing this at the right place, and not
> just applying a suboptimal bandaid by running the workqueue more.
> 

Hi Darren,

Thank for review.

It could happen within the next flow:

The signal routing flow is as following (f.e. for of FANi removing):
 - FAN status and event registers related bit is changed;
 -- intermediate aggregation status register is changed;
 --- top aggregation status register is changed;
 ---- interrupt routed to CPU and interrupt handler is invoked.

When interrupt handler is invoked it follows the next simple logic (f.e
FAN3 is removed):
 (a1) mask top aggregation interrupt mask register;
 (a2) read top aggregation interrupt status register and test to which
      underling group belongs a signal (FANs in this case and is changed
	  from 0xff to 0xfb and 0xfb is saved as a last status value);
   (b1) mask FANs interrupt mask register;
   (b2) read FANs status register and test which FAN has been changed (FAN3 in
        this example);
     (c1) perform relevant action;
              <--------------- (FAN2 is removed at this point)
   (b3) clear FANs interrupt event register to acknowledge FAN3 signal;
   (b4) unmask FANs interrupt mask register
 (a3) unmask top aggregation interrupt mask register;
 
 An interrupt handler is invoked, since FAN2 interrupt is not acknowledge.
 It should set top aggregation interrupt status register bit 6 (0xfb).
 In step (a2)
 (a2) read top aggregation interrupt and comparing it with saved value doesn't
      show change (same 0xfb) and after (a2) execution jumps to (a3) and
	  signal leaved unhandled.

> ...
> 
> >
> > Fixes: 1f976f6978bf ("platform/x86: Move Mellanox platform hotplug
> > driver to platform/mellanox")
> 
> You didn't mention above how this commit caused this - how did moving the
> driver create this problem? 

Actually I should reference to 
07b89c2b2a5e ("platform/x86: Introduce support for Mellanox hotplug driver")
which was initial driver commit, before it has been relocated. 

Does this need to go to stable? I'm assuming not as
> you've called it theoretical - not something you've observed in practice?
> 

It's not necessary to go to stable.

> ...
> 
> >  static int mlxreg_hotplug_device_create(struct
> > mlxreg_hotplug_priv_data *priv, @@ -410,6 +413,18 @@ static void
> mlxreg_hotplug_work_handler(struct work_struct *work)
> >  	aggr_asserted = priv->aggr_cache ^ regval;
> >  	priv->aggr_cache = regval;
> >
> > +	/*
> > +	 * Handler is invoked, but no assertion is detected at top aggregation
> > +	 * status level. Set aggr_asserted to mask value to allow handler extra
> > +	 * run over all relevant signals to recover any missed signal.
> > +	 */
> > +	if (priv->not_asserted == MLXREG_HOTPLUG_NOT_ASSERT) {
> > +		priv->not_asserted = 0;
> > +		aggr_asserted = pdata->mask;
> > +	}
> > +	if (!aggr_asserted)
> 
> We seem to check aggr_asserted in several places in this routine now.
> Looks like something we could simplify. If you check it here, can you drop the
> check lower in the routine? Can you remove it from the for loop if conditional
> entirely? Please consider how to simplify.

OK, will review this code.

> 
> --
> Darren Hart
> VMware Open Source Technology Center
diff mbox

Patch

diff --git a/drivers/platform/mellanox/mlxreg-hotplug.c b/drivers/platform/mellanox/mlxreg-hotplug.c
index b56953a..ced81b7 100644
--- a/drivers/platform/mellanox/mlxreg-hotplug.c
+++ b/drivers/platform/mellanox/mlxreg-hotplug.c
@@ -55,6 +55,7 @@ 
 #define MLXREG_HOTPLUG_RST_CNTR		3
 
 #define MLXREG_HOTPLUG_ATTRS_MAX	24
+#define MLXREG_HOTPLUG_NOT_ASSERT	3
 
 /**
  * struct mlxreg_hotplug_priv_data - platform private data:
@@ -74,6 +75,7 @@ 
  * @mask: top aggregation interrupt common mask;
  * @aggr_cache: last value of aggregation register status;
  * @after_probe: flag indication probing completion;
+ * @not_asserted: number of entries in workqueue with no signal assertion;
  */
 struct mlxreg_hotplug_priv_data {
 	int irq;
@@ -93,6 +95,7 @@  struct mlxreg_hotplug_priv_data {
 	u32 mask;
 	u32 aggr_cache;
 	bool after_probe;
+	u8 not_asserted;
 };
 
 static int mlxreg_hotplug_device_create(struct mlxreg_hotplug_priv_data *priv,
@@ -410,6 +413,18 @@  static void mlxreg_hotplug_work_handler(struct work_struct *work)
 	aggr_asserted = priv->aggr_cache ^ regval;
 	priv->aggr_cache = regval;
 
+	/*
+	 * Handler is invoked, but no assertion is detected at top aggregation
+	 * status level. Set aggr_asserted to mask value to allow handler extra
+	 * run over all relevant signals to recover any missed signal.
+	 */
+	if (priv->not_asserted == MLXREG_HOTPLUG_NOT_ASSERT) {
+		priv->not_asserted = 0;
+		aggr_asserted = pdata->mask;
+	}
+	if (!aggr_asserted)
+		goto unmask_event;
+
 	/* Handle topology and health configuration changes. */
 	for (i = 0; i < pdata->counter; i++, item++) {
 		if (aggr_asserted & item->aggr_mask) {
@@ -441,6 +456,8 @@  static void mlxreg_hotplug_work_handler(struct work_struct *work)
 		return;
 	}
 
+unmask_event:
+	priv->not_asserted++;
 	/* Unmask aggregation event (no need acknowledge). */
 	ret = regmap_write(priv->regmap, pdata->cell +
 			   MLXREG_HOTPLUG_AGGR_MASK_OFF, pdata->mask);