Message ID | 20201010222351.7323-3-Sergey.Semin@baikalelectronics.ru (mailing list archive) |
---|---|
State | Superseded |
Series | usb: dwc3: ulpi: Fix UPLI registers read/write ops |
Hi,

Serge Semin <Sergey.Semin@baikalelectronics.ru> writes:
> Originally the procedure of the ULPI transaction finish detection has been
> developed as a simple busy-loop with just decrementing counter and no
> delays. It's wrong since on different systems the loop will take a
> different time to complete. So if the system bus and CPU are fast enough
> to overtake the ULPI bus and the companion PHY reaction, then we'll get to
> take a false timeout error. Fix this by converting the busy-loop procedure
> to take the standard bus speed, address value and the registers access
> mode into account for the busy-loop delay calculation.
>
> Here is the way the fix works. It's known that the ULPI bus is clocked
> with 60MHz signal. In accordance with [1] the ULPI bus protocol is created
> so to spend 5 and 6 clock periods for immediate register write and read
> operations respectively, and 6 and 7 clock periods - for the extended
> register writes and reads. Based on that we can easily pre-calculate the
> time which will be needed for the controller to perform a requested IO
> operation. Note we'll still preserve the attempts counter in case if the
> DWC USB3 controller has got some internals delays.
>
> [1] UTMI+ Low Pin Interface (ULPI) Specification, Revision 1.1,
>     October 20, 2004, pp. 30 - 36.
>
> Fixes: 88bc9d194ff6 ("usb: dwc3: add ULPI interface support")
> Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
> ---
>  drivers/usb/dwc3/ulpi.c | 18 +++++++++++++++---
>  1 file changed, 15 insertions(+), 3 deletions(-)
>
> diff --git a/drivers/usb/dwc3/ulpi.c b/drivers/usb/dwc3/ulpi.c
> index 20f5d9aba317..0dbc826355a5 100644
> --- a/drivers/usb/dwc3/ulpi.c
> +++ b/drivers/usb/dwc3/ulpi.c
> @@ -7,6 +7,8 @@
>   * Author: Heikki Krogerus <heikki.krogerus@linux.intel.com>
>   */
>
> +#include <linux/delay.h>
> +#include <linux/time64.h>
>  #include <linux/ulpi/regs.h>
>
>  #include "core.h"
> @@ -17,12 +19,22 @@
>  		DWC3_GUSB2PHYACC_ADDR(ULPI_ACCESS_EXTENDED) | \
>  		DWC3_GUSB2PHYACC_EXTEND_ADDR(a) : DWC3_GUSB2PHYACC_ADDR(a))
>
> -static int dwc3_ulpi_busyloop(struct dwc3 *dwc)
> +#define DWC3_ULPI_BASE_DELAY	DIV_ROUND_UP(NSEC_PER_SEC, 60000000L)
> +
> +static int dwc3_ulpi_busyloop(struct dwc3 *dwc, u8 addr, bool read)
>  {
> +	unsigned long ns = 5L * DWC3_ULPI_BASE_DELAY;
>  	unsigned count = 1000;
>  	u32 reg;
>
> +	if (addr >= ULPI_EXT_VENDOR_SPECIFIC)
> +		ns += DWC3_ULPI_BASE_DELAY;
> +
> +	if (read)
> +		ns += DWC3_ULPI_BASE_DELAY;
> +
>  	while (count--) {
> +		ndelay(ns);

could we allow for a sleep here instead of a delay? Also, I wonder if
you need to make this so complex or should we just take the larger
access time of 7 clock cycles.
On Tue, Oct 27, 2020 at 11:18:51AM +0200, Felipe Balbi wrote:
>
> Hi,
>
> Serge Semin <Sergey.Semin@baikalelectronics.ru> writes:
>
> > Originally the procedure of the ULPI transaction finish detection has been
> > developed as a simple busy-loop with just decrementing counter and no
> > delays. It's wrong since on different systems the loop will take a
> > different time to complete. So if the system bus and CPU are fast enough
> > to overtake the ULPI bus and the companion PHY reaction, then we'll get to
> > take a false timeout error. Fix this by converting the busy-loop procedure
> > to take the standard bus speed, address value and the registers access
> > mode into account for the busy-loop delay calculation.
> >
> > Here is the way the fix works. It's known that the ULPI bus is clocked
> > with 60MHz signal. In accordance with [1] the ULPI bus protocol is created
> > so to spend 5 and 6 clock periods for immediate register write and read
> > operations respectively, and 6 and 7 clock periods - for the extended
> > register writes and reads. Based on that we can easily pre-calculate the
> > time which will be needed for the controller to perform a requested IO
> > operation. Note we'll still preserve the attempts counter in case if the
> > DWC USB3 controller has got some internals delays.
> >
> > [1] UTMI+ Low Pin Interface (ULPI) Specification, Revision 1.1,
> >     October 20, 2004, pp. 30 - 36.
> >
> > Fixes: 88bc9d194ff6 ("usb: dwc3: add ULPI interface support")
> > Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
> > ---
> >  drivers/usb/dwc3/ulpi.c | 18 +++++++++++++++---
> >  1 file changed, 15 insertions(+), 3 deletions(-)
> >
> > diff --git a/drivers/usb/dwc3/ulpi.c b/drivers/usb/dwc3/ulpi.c
> > index 20f5d9aba317..0dbc826355a5 100644
> > --- a/drivers/usb/dwc3/ulpi.c
> > +++ b/drivers/usb/dwc3/ulpi.c
> > @@ -7,6 +7,8 @@
> >   * Author: Heikki Krogerus <heikki.krogerus@linux.intel.com>
> >   */
> >
> > +#include <linux/delay.h>
> > +#include <linux/time64.h>
> >  #include <linux/ulpi/regs.h>
> >
> >  #include "core.h"
> > @@ -17,12 +19,22 @@
> >  		DWC3_GUSB2PHYACC_ADDR(ULPI_ACCESS_EXTENDED) | \
> >  		DWC3_GUSB2PHYACC_EXTEND_ADDR(a) : DWC3_GUSB2PHYACC_ADDR(a))
> >
> > -static int dwc3_ulpi_busyloop(struct dwc3 *dwc)
> > +#define DWC3_ULPI_BASE_DELAY	DIV_ROUND_UP(NSEC_PER_SEC, 60000000L)
> > +
> > +static int dwc3_ulpi_busyloop(struct dwc3 *dwc, u8 addr, bool read)
> >  {
> > +	unsigned long ns = 5L * DWC3_ULPI_BASE_DELAY;
> >  	unsigned count = 1000;
> >  	u32 reg;
> >
> > +	if (addr >= ULPI_EXT_VENDOR_SPECIFIC)
> > +		ns += DWC3_ULPI_BASE_DELAY;
> > +
> > +	if (read)
> > +		ns += DWC3_ULPI_BASE_DELAY;
> > +
> >  	while (count--) {
> > +		ndelay(ns);
>
> could we allow for a sleep here instead of a delay?

The kernel ULPI-bus API isn't clear about that. I couldn't find an
example of the ULPI-bus accessors being used in atomic context, nor one
of them being implemented with sleeping methods, so there is no certain
answer to your question. Anyway, I added an ms-sleep in a later patch to
fix the suspend-regression problem. I thought that was reasonable since I
couldn't find an example of the accessors being used in atomic context.

As for this patch, I wouldn't suggest replacing the ndelay with sleeping
here, since 5-7 ref clock ticks are enough to finish the transaction in
the vast majority of cases. That's just roughly 80 - 115 ns, which is far
shorter than anything the sleeping procedures can provide.

> Also, I wonder if
> you need to make this so complex or should we just take the larger
> access time of 7 clock cycles.

I wouldn't say it's complex. Here I've implemented a simple calculation
of the time needed to finish the ULPI-bus commands in accordance with the
number of ticks they normally require. As for the while loop, alas, we
can't get rid of it here, for the reason I've described in patch 3 of the
series.

-Sergey

>
> --
> balbi
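To illustrate the "delay first, then poll, with a bounded number of attempts" pattern discussed in this reply, here is a minimal userspace sketch. It is not the driver code: `struct fake_phy`, `fake_readl()`, `fake_ndelay()` and `ulpi_poll()` are hypothetical stand-ins, the `PHYACC_DONE` bit position is chosen for illustration only, and returning `-ETIMEDOUT` once the attempts run out is an assumption, since the quoted hunk does not show the end of the function.

```c
#include <errno.h>
#include <stdint.h>

#define PHYACC_DONE	(1u << 24)	/* illustrative "transaction done" bit */
#define POLL_ATTEMPTS	1000u		/* same attempt count as in the patch */

/* Hypothetical stand-in for the controller: completes after N register reads. */
struct fake_phy {
	unsigned int done_after;	/* 0 means "never completes" */
	unsigned int reads;		/* register reads performed so far */
	unsigned long waited_ns;	/* total simulated delay */
};

/* Simulated register read: reports DONE once enough reads have happened. */
static uint32_t fake_readl(struct fake_phy *phy)
{
	phy->reads++;
	if (phy->done_after && phy->reads >= phy->done_after)
		return PHYACC_DONE;
	return 0;
}

/* Simulated ndelay(): only accounts for the time instead of spinning. */
static void fake_ndelay(struct fake_phy *phy, unsigned long ns)
{
	phy->waited_ns += ns;
}

/* Wait one expected transaction time, then check the DONE bit; retry if needed. */
static int ulpi_poll(struct fake_phy *phy, unsigned long ns)
{
	unsigned int count = POLL_ATTEMPTS;

	while (count--) {
		fake_ndelay(phy, ns);
		if (fake_readl(phy) & PHYACC_DONE)
			return 0;
	}

	return -ETIMEDOUT;	/* assumed error code for an exhausted loop */
}
```

In this model a transaction that finishes within the first delay costs a single wait, while a controller that never signals completion is abandoned after 1000 waits of `ns` nanoseconds each.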
diff --git a/drivers/usb/dwc3/ulpi.c b/drivers/usb/dwc3/ulpi.c
index 20f5d9aba317..0dbc826355a5 100644
--- a/drivers/usb/dwc3/ulpi.c
+++ b/drivers/usb/dwc3/ulpi.c
@@ -7,6 +7,8 @@
  * Author: Heikki Krogerus <heikki.krogerus@linux.intel.com>
  */
 
+#include <linux/delay.h>
+#include <linux/time64.h>
 #include <linux/ulpi/regs.h>
 
 #include "core.h"
@@ -17,12 +19,22 @@
 		DWC3_GUSB2PHYACC_ADDR(ULPI_ACCESS_EXTENDED) | \
 		DWC3_GUSB2PHYACC_EXTEND_ADDR(a) : DWC3_GUSB2PHYACC_ADDR(a))
 
-static int dwc3_ulpi_busyloop(struct dwc3 *dwc)
+#define DWC3_ULPI_BASE_DELAY	DIV_ROUND_UP(NSEC_PER_SEC, 60000000L)
+
+static int dwc3_ulpi_busyloop(struct dwc3 *dwc, u8 addr, bool read)
 {
+	unsigned long ns = 5L * DWC3_ULPI_BASE_DELAY;
 	unsigned count = 1000;
 	u32 reg;
 
+	if (addr >= ULPI_EXT_VENDOR_SPECIFIC)
+		ns += DWC3_ULPI_BASE_DELAY;
+
+	if (read)
+		ns += DWC3_ULPI_BASE_DELAY;
+
 	while (count--) {
+		ndelay(ns);
 		reg = dwc3_readl(dwc->regs, DWC3_GUSB2PHYACC(0));
 		if (reg & DWC3_GUSB2PHYACC_DONE)
 			return 0;
@@ -47,7 +59,7 @@ static int dwc3_ulpi_read(struct device *dev, u8 addr)
 	reg = DWC3_GUSB2PHYACC_NEWREGREQ | DWC3_ULPI_ADDR(addr);
 	dwc3_writel(dwc->regs, DWC3_GUSB2PHYACC(0), reg);
 
-	ret = dwc3_ulpi_busyloop(dwc);
+	ret = dwc3_ulpi_busyloop(dwc, addr, true);
 	if (ret)
 		return ret;
 
@@ -71,7 +83,7 @@ static int dwc3_ulpi_write(struct device *dev, u8 addr, u8 val)
 	reg |= DWC3_GUSB2PHYACC_WRITE | val;
 	dwc3_writel(dwc->regs, DWC3_GUSB2PHYACC(0), reg);
 
-	return dwc3_ulpi_busyloop(dwc);
+	return dwc3_ulpi_busyloop(dwc, addr, false);
 }
 
 static const struct ulpi_ops dwc3_ulpi_ops = {
Originally the procedure of the ULPI transaction finish detection has been
developed as a simple busy-loop with just decrementing counter and no
delays. It's wrong since on different systems the loop will take a
different time to complete. So if the system bus and CPU are fast enough
to overtake the ULPI bus and the companion PHY reaction, then we'll get to
take a false timeout error. Fix this by converting the busy-loop procedure
to take the standard bus speed, address value and the registers access
mode into account for the busy-loop delay calculation.

Here is the way the fix works. It's known that the ULPI bus is clocked
with 60MHz signal. In accordance with [1] the ULPI bus protocol is created
so to spend 5 and 6 clock periods for immediate register write and read
operations respectively, and 6 and 7 clock periods - for the extended
register writes and reads. Based on that we can easily pre-calculate the
time which will be needed for the controller to perform a requested IO
operation. Note we'll still preserve the attempts counter in case if the
DWC USB3 controller has got some internals delays.

[1] UTMI+ Low Pin Interface (ULPI) Specification, Revision 1.1,
    October 20, 2004, pp. 30 - 36.

Fixes: 88bc9d194ff6 ("usb: dwc3: add ULPI interface support")
Signed-off-by: Serge Semin <Sergey.Semin@baikalelectronics.ru>
---
 drivers/usb/dwc3/ulpi.c | 18 +++++++++++++++---
 1 file changed, 15 insertions(+), 3 deletions(-)
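
The delay arithmetic from the commit message can be checked in userspace with a small sketch like the one below. It mirrors the patch's formula but is not the driver code: `ulpi_access_delay_ns()` and `ulpi_worst_case_wait_ns()` are hypothetical helpers, the kernel macros are redefined locally, and `ULPI_EXT_VENDOR_SPECIFIC` is assumed to be 0x80 as in `linux/ulpi/regs.h`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Local copies of kernel helpers so the sketch builds outside the kernel. */
#define NSEC_PER_SEC			1000000000L
#define DIV_ROUND_UP(n, d)		(((n) + (d) - 1) / (d))

#define ULPI_CLK_HZ			60000000L	/* ULPI bus clock from the commit message */
#define ULPI_TICK_NS			DIV_ROUND_UP(NSEC_PER_SEC, ULPI_CLK_HZ)
#define ULPI_EXT_VENDOR_SPECIFIC	0x80		/* assumed start of the extended register set */
#define ULPI_POLL_ATTEMPTS		1000UL		/* attempt counter kept by the patch */

/* One 60 MHz period is 16.67 ns; rounding up to whole nanoseconds gives 17. */
static_assert(ULPI_TICK_NS == 17, "60 MHz period rounds up to 17 ns");

/*
 * Hypothetical helper: delay in ns before each completion check.
 * 5 ticks for an immediate write, plus one tick for an extended
 * address, plus one tick for a read.
 */
static unsigned long ulpi_access_delay_ns(uint8_t addr, bool read)
{
	unsigned long ns = 5UL * ULPI_TICK_NS;

	if (addr >= ULPI_EXT_VENDOR_SPECIFIC)
		ns += ULPI_TICK_NS;

	if (read)
		ns += ULPI_TICK_NS;

	return ns;
}

/* Hypothetical helper: upper bound on the total wait if DONE never shows up. */
static unsigned long ulpi_worst_case_wait_ns(uint8_t addr, bool read)
{
	return ULPI_POLL_ATTEMPTS * ulpi_access_delay_ns(addr, read);
}
```

Because each tick is rounded up to 17 ns, the computed delays are 85 ns (immediate write), 102 ns (immediate read or extended write) and 119 ns (extended read), slightly above the exact 83.3 to 116.7 ns that 5 to 7 periods of a 60 MHz clock take. Erring on the long side is the safe direction for a wait.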