Message ID | 20181030093139.10226-2-kurt@linutronix.de (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | net: xlinx: mdio: recheck condition after timeout | expand |
From: Kurt Kanzenbach <kurt@linutronix.de> Date: Tue, 30 Oct 2018 10:31:38 +0100 > The function could report a false positive if it gets preempted between reading > the XAE_MDIO_MCR_OFFSET register and checking for the timeout. In such a case, > the condition has to be rechecked to avoid false positives. > > Therefore, check for expected condition even after the timeout occurred. > > Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> ... > if (time_before_eq(end, jiffies)) { > - WARN_ON(1); > - return -ETIMEDOUT; > + val = axienet_ior(lp, XAE_MDIO_MCR_OFFSET); > + break; > } > + > udelay(1); > } > - return 0; > + if (val & XAE_MDIO_MCR_READY_MASK) > + return 0; > + > + WARN_ON(1); > + return -ETIMEDOUT; You are not fundamentally changing the situation at all. The condtion could change right after your last read of XAR_MDIO_MCR_OFFSET, which is the same thing that happens before your modifications to this code. It sounds more like the timeout is slightly too short, and that's the real problem that causes whatever behavior you think you are fixing here. I'm not applying this.
On Tue, Oct 30, 2018 at 11:25:11AM -0700, David Miller wrote: > From: Kurt Kanzenbach <kurt@linutronix.de> > Date: Tue, 30 Oct 2018 10:31:38 +0100 > > > The function could report a false positive if it gets preempted between reading > > the XAE_MDIO_MCR_OFFSET register and checking for the timeout. In such a case, > > the condition has to be rechecked to avoid false positives. > > > > Therefore, check for expected condition even after the timeout occurred. > > > > Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> > ... > > if (time_before_eq(end, jiffies)) { > > - WARN_ON(1); > > - return -ETIMEDOUT; > > + val = axienet_ior(lp, XAE_MDIO_MCR_OFFSET); > > + break; > > } > > + > > udelay(1); > > } > > - return 0; > > + if (val & XAE_MDIO_MCR_READY_MASK) > > + return 0; > > + > > + WARN_ON(1); > > + return -ETIMEDOUT; > > You are not fundamentally changing the situation at all. > > The condtion could change right after your last read of > XAR_MDIO_MCR_OFFSET, which is the same thing that happens before your > modifications to this code. That's true. The problem is different: If the current task gets preempted by a higher priority task between checking the condition and the timeout code, then a timeout might be falsely detected. Consider the following events: loop: check mdio condition ------------------------ task with real time priority may run for a long time ------------------------ check for timeout wait That's why I've added the recheck of the condition in the timeout case. > > It sounds more like the timeout is slightly too short, and that's the > real problem that causes whatever behavior you think you are fixing > here. The timeout value is not the problem here. Thanks, Kurt
On 2018-10-30 11:25:11 [-0700], David Miller wrote: > From: Kurt Kanzenbach <kurt@linutronix.de> > Date: Tue, 30 Oct 2018 10:31:38 +0100 > > > The function could report a false positive if it gets preempted between reading > > the XAE_MDIO_MCR_OFFSET register and checking for the timeout. In such a case, > > the condition has to be rechecked to avoid false positives. > > > > Therefore, check for expected condition even after the timeout occurred. > > > > Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> > ... > > if (time_before_eq(end, jiffies)) { > > - WARN_ON(1); > > - return -ETIMEDOUT; > > + val = axienet_ior(lp, XAE_MDIO_MCR_OFFSET); > > + break; > > } > > + > > udelay(1); > > } > > - return 0; > > + if (val & XAE_MDIO_MCR_READY_MASK) > > + return 0; > > + > > + WARN_ON(1); > > + return -ETIMEDOUT; > > You are not fundamentally changing the situation at all. > The condtion could change right after your last read of > XAR_MDIO_MCR_OFFSET, which is the same thing that happens before your > modifications to this code. > > It sounds more like the timeout is slightly too short, and that's the > real problem that causes whatever behavior you think you are fixing > here. There is a timeout of two jiffies. If the condition is not true within those two jiffies it will attempt to check condition one last time after the timeout occured. If the task got preempted after the reading from the register but before the timeout it is possible that the task gets back on the CPU after the timeout occured. And since the timeout occured it won't check if the condition changed: Time 0 +---+ | c | Check for condition (false) | c | | c | | c | | c | | P | Task gets preempted | | | O | Condition is true, task still preempted, no check | | 2 | T | The timeout is true | | | | | | | p | Task gets back on the CPU, no re-check of condition In the last step, there is no checking of the condition after the timeout occured and it wrongly assumes that the condition is not true. Increasing the timeout would help as long as the task gets not preempted past the new timeout. The same pattern (check condition after timeout) is also used in wait_event_timeout() or readx_poll_timeout(). Would you prefer to refactor this with readx_poll_timeout() instead? > I'm not applying this. Please reconsider. Sebastian
diff --git a/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c b/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c index 757a3b37ae8a..4f13125e4942 100644 --- a/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c +++ b/drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c @@ -21,15 +21,26 @@ int axienet_mdio_wait_until_ready(struct axienet_local *lp) { unsigned long end = jiffies + 2; - while (!(axienet_ior(lp, XAE_MDIO_MCR_OFFSET) & - XAE_MDIO_MCR_READY_MASK)) { + u32 val; + + while (1) { + val = axienet_ior(lp, XAE_MDIO_MCR_OFFSET); + + if (val & XAE_MDIO_MCR_READY_MASK) + return 0; + if (time_before_eq(end, jiffies)) { - WARN_ON(1); - return -ETIMEDOUT; + val = axienet_ior(lp, XAE_MDIO_MCR_OFFSET); + break; } + udelay(1); } - return 0; + if (val & XAE_MDIO_MCR_READY_MASK) + return 0; + + WARN_ON(1); + return -ETIMEDOUT; } /**
The function could report a false positive if it gets preempted between reading the XAE_MDIO_MCR_OFFSET register and checking for the timeout. In such a case, the condition has to be rechecked to avoid false positives. Therefore, check for expected condition even after the timeout occurred. Signed-off-by: Kurt Kanzenbach <kurt@linutronix.de> --- drivers/net/ethernet/xilinx/xilinx_axienet_mdio.c | 21 ++++++++++++++++----- 1 file changed, 16 insertions(+), 5 deletions(-)