Message ID | 20190901142303.27815-1-repk@triplefau.lt (mailing list archive) |
---|---|
State | Superseded, archived |
Headers | show |
Series | PCI: aardvark: Don't rely on jiffies while holding spinlock | expand |
On Sun, Sep 01, 2019 at 04:23:03PM +0200, Remi Pommarel wrote: > advk_pcie_wait_pio() can be called while holding a spinlock (from > pci_bus_read_config_dword()), then depends on jiffies in order to > timeout while polling on PIO state registers. In the case the PIO > transaction failed, the timeout will never happen and will also cause > the cpu to stall. > > This decrements a variable and wait instead of using jiffies. > > Signed-off-by: Remi Pommarel <repk@triplefau.lt> > --- > drivers/pci/controller/pci-aardvark.c | 10 +++++----- > 1 file changed, 5 insertions(+), 5 deletions(-) Thomas, I would like to merge this patch and a couple of others from Remi, may I ask you please to review them ? https://patchwork.ozlabs.org/user/todo/linux-pci/?series=&submitter=67495&state=&q=&archive= Thanks, Lorenzo > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > index fc0fe4d4de49..1fa6d04ad7aa 100644 > --- a/drivers/pci/controller/pci-aardvark.c > +++ b/drivers/pci/controller/pci-aardvark.c > @@ -175,7 +175,8 @@ > (PCIE_CONF_BUS(bus) | PCIE_CONF_DEV(PCI_SLOT(devfn)) | \ > PCIE_CONF_FUNC(PCI_FUNC(devfn)) | PCIE_CONF_REG(where)) > > -#define PIO_TIMEOUT_MS 1 > +#define PIO_RETRY_CNT 10 > +#define PIO_RETRY_DELAY 100 /* 100 us*/ > > #define LINK_WAIT_MAX_RETRIES 10 > #define LINK_WAIT_USLEEP_MIN 90000 > @@ -383,17 +384,16 @@ static void advk_pcie_check_pio_status(struct advk_pcie *pcie) > static int advk_pcie_wait_pio(struct advk_pcie *pcie) > { > struct device *dev = &pcie->pdev->dev; > - unsigned long timeout; > + size_t i; > > - timeout = jiffies + msecs_to_jiffies(PIO_TIMEOUT_MS); > - > - while (time_before(jiffies, timeout)) { > + for (i = 0; i < PIO_RETRY_CNT; ++i) { > u32 start, isr; > > start = advk_readl(pcie, PIO_START); > isr = advk_readl(pcie, PIO_ISR); > if (!start && isr) > return 0; > + udelay(PIO_RETRY_DELAY); > } > > dev_err(dev, "config read/write timed out\n"); > -- > 2.20.1 >
On Sun, Sep 01, 2019 at 04:23:03PM +0200, Remi Pommarel wrote: > advk_pcie_wait_pio() can be called while holding a spinlock (from > pci_bus_read_config_dword()), then depends on jiffies in order to > timeout while polling on PIO state registers. In the case the PIO > transaction failed, the timeout will never happen and will also cause > the cpu to stall. > > This decrements a variable and wait instead of using jiffies. > > Signed-off-by: Remi Pommarel <repk@triplefau.lt> Reviewed-by: Andrew Murray <andrew.murray@arm.com> > --- > drivers/pci/controller/pci-aardvark.c | 10 +++++----- > 1 file changed, 5 insertions(+), 5 deletions(-) > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > index fc0fe4d4de49..1fa6d04ad7aa 100644 > --- a/drivers/pci/controller/pci-aardvark.c > +++ b/drivers/pci/controller/pci-aardvark.c > @@ -175,7 +175,8 @@ > (PCIE_CONF_BUS(bus) | PCIE_CONF_DEV(PCI_SLOT(devfn)) | \ > PCIE_CONF_FUNC(PCI_FUNC(devfn)) | PCIE_CONF_REG(where)) > > -#define PIO_TIMEOUT_MS 1 > +#define PIO_RETRY_CNT 10 > +#define PIO_RETRY_DELAY 100 /* 100 us*/ > > #define LINK_WAIT_MAX_RETRIES 10 > #define LINK_WAIT_USLEEP_MIN 90000 > @@ -383,17 +384,16 @@ static void advk_pcie_check_pio_status(struct advk_pcie *pcie) > static int advk_pcie_wait_pio(struct advk_pcie *pcie) > { > struct device *dev = &pcie->pdev->dev; > - unsigned long timeout; > + size_t i; > > - timeout = jiffies + msecs_to_jiffies(PIO_TIMEOUT_MS); > - > - while (time_before(jiffies, timeout)) { > + for (i = 0; i < PIO_RETRY_CNT; ++i) { > u32 start, isr; > > start = advk_readl(pcie, PIO_START); > isr = advk_readl(pcie, PIO_ISR); > if (!start && isr) > return 0; > + udelay(PIO_RETRY_DELAY); > } > > dev_err(dev, "config read/write timed out\n"); > -- > 2.20.1 >
Hello Remi, Thanks for the patch, I have a few comments/questions below. On Sun, 1 Sep 2019 16:23:03 +0200 Remi Pommarel <repk@triplefau.lt> wrote: > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > index fc0fe4d4de49..1fa6d04ad7aa 100644 > --- a/drivers/pci/controller/pci-aardvark.c > +++ b/drivers/pci/controller/pci-aardvark.c > @@ -175,7 +175,8 @@ > (PCIE_CONF_BUS(bus) | PCIE_CONF_DEV(PCI_SLOT(devfn)) | \ > PCIE_CONF_FUNC(PCI_FUNC(devfn)) | PCIE_CONF_REG(where)) > > -#define PIO_TIMEOUT_MS 1 > +#define PIO_RETRY_CNT 10 > +#define PIO_RETRY_DELAY 100 /* 100 us*/ > > #define LINK_WAIT_MAX_RETRIES 10 > #define LINK_WAIT_USLEEP_MIN 90000 > @@ -383,17 +384,16 @@ static void advk_pcie_check_pio_status(struct advk_pcie *pcie) > static int advk_pcie_wait_pio(struct advk_pcie *pcie) > { > struct device *dev = &pcie->pdev->dev; > - unsigned long timeout; > + size_t i; Is it common to use a size_t for a loop counter ? > > - timeout = jiffies + msecs_to_jiffies(PIO_TIMEOUT_MS); > - > - while (time_before(jiffies, timeout)) { > + for (i = 0; i < PIO_RETRY_CNT; ++i) { I find it more common to use post-increment for loop counters rather than pre-increment, but that's a really nitpick and I don't care much. > u32 start, isr; > > start = advk_readl(pcie, PIO_START); > isr = advk_readl(pcie, PIO_ISR); > if (!start && isr) > return 0; > + udelay(PIO_RETRY_DELAY); But the bigger issue is that this change causes a 100us delay at *every* single PIO read or write operation. Indeed, at the first iteration of the loop, the PIO operation has not completed, so you will always hit the udelay(100) a first time, and it's only at the second iteration of the loop that the PIO operation has completed (for successful PIO operations of course, which don't hit the timeout). I took a measurement around wait_pio() with sched_clock before and after the patch. Before the patch, I have measurements like this (in nanoseconds): [ 1.562801] time = 6000 [ 1.565310] time = 6000 [ 1.567809] time = 6080 [ 1.570327] time = 6080 [ 1.572836] time = 6080 [ 1.575339] time = 6080 [ 1.577858] time = 2720 [ 1.580366] time = 2720 [ 1.582862] time = 6000 [ 1.585377] time = 2720 [ 1.587890] time = 2720 [ 1.590393] time = 2720 So it takes a few microseconds for each PIO operation. With your patch applied: [ 2.267291] time = 101680 [ 2.270002] time = 100880 [ 2.272852] time = 100800 [ 2.275573] time = 100880 [ 2.278285] time = 100800 [ 2.281005] time = 100880 [ 2.283722] time = 100800 [ 2.286444] time = 100880 [ 2.289264] time = 100880 [ 2.291981] time = 100800 [ 2.294690] time = 100800 [ 2.297405] time = 100800 We're jumping to 100us for every PIO read/write operation. To be honest, I don't know if this is very important, there are not that many PIO operations, and they are not used in any performance hot path. But I thought it was worth pointing out the additional delay caused by this implementation change. Best regards, Thomas
Hi Thomas, Thanks for the review. On Wed, Sep 25, 2019 at 11:33:51AM +0200, Thomas Petazzoni wrote: > Hello Remi, > > Thanks for the patch, I have a few comments/questions below. > > On Sun, 1 Sep 2019 16:23:03 +0200 > Remi Pommarel <repk@triplefau.lt> wrote: > > > diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c > > index fc0fe4d4de49..1fa6d04ad7aa 100644 > > --- a/drivers/pci/controller/pci-aardvark.c > > +++ b/drivers/pci/controller/pci-aardvark.c > > @@ -175,7 +175,8 @@ > > (PCIE_CONF_BUS(bus) | PCIE_CONF_DEV(PCI_SLOT(devfn)) | \ > > PCIE_CONF_FUNC(PCI_FUNC(devfn)) | PCIE_CONF_REG(where)) > > > > -#define PIO_TIMEOUT_MS 1 > > +#define PIO_RETRY_CNT 10 > > +#define PIO_RETRY_DELAY 100 /* 100 us*/ > > > > #define LINK_WAIT_MAX_RETRIES 10 > > #define LINK_WAIT_USLEEP_MIN 90000 > > @@ -383,17 +384,16 @@ static void advk_pcie_check_pio_status(struct advk_pcie *pcie) > > static int advk_pcie_wait_pio(struct advk_pcie *pcie) > > { > > struct device *dev = &pcie->pdev->dev; > > - unsigned long timeout; > > + size_t i; > > Is it common to use a size_t for a loop counter ? It was for me but seem not to be used that much. I can change that to an int. > > > > - timeout = jiffies + msecs_to_jiffies(PIO_TIMEOUT_MS); > > - > > - while (time_before(jiffies, timeout)) { > > + for (i = 0; i < PIO_RETRY_CNT; ++i) { > > I find it more common to use post-increment for loop counters rather > than pre-increment, but that's a really nitpick and I don't care much. > Will change that to post-increment. > > u32 start, isr; > > > > start = advk_readl(pcie, PIO_START); > > isr = advk_readl(pcie, PIO_ISR); > > if (!start && isr) > > return 0; > > + udelay(PIO_RETRY_DELAY); > > But the bigger issue is that this change causes a 100us delay at > *every* single PIO read or write operation. > > Indeed, at the first iteration of the loop, the PIO operation has not > completed, so you will always hit the udelay(100) a first time, and > it's only at the second iteration of the loop that the PIO operation > has completed (for successful PIO operations of course, which don't hit > the timeout). > > I took a measurement around wait_pio() with sched_clock before and > after the patch. Before the patch, I have measurements like this (in > nanoseconds): > > [ 1.562801] time = 6000 > [ 1.565310] time = 6000 > [ 1.567809] time = 6080 > [ 1.570327] time = 6080 > [ 1.572836] time = 6080 > [ 1.575339] time = 6080 > [ 1.577858] time = 2720 > [ 1.580366] time = 2720 > [ 1.582862] time = 6000 > [ 1.585377] time = 2720 > [ 1.587890] time = 2720 > [ 1.590393] time = 2720 > > So it takes a few microseconds for each PIO operation. > > With your patch applied: > > [ 2.267291] time = 101680 > [ 2.270002] time = 100880 > [ 2.272852] time = 100800 > [ 2.275573] time = 100880 > [ 2.278285] time = 100800 > [ 2.281005] time = 100880 > [ 2.283722] time = 100800 > [ 2.286444] time = 100880 > [ 2.289264] time = 100880 > [ 2.291981] time = 100800 > [ 2.294690] time = 100800 > [ 2.297405] time = 100800 > > We're jumping to 100us for every PIO read/write operation. To be > honest, I don't know if this is very important, there are not that many > PIO operations, and they are not used in any performance hot path. But > I thought it was worth pointing out the additional delay caused by this > implementation change. Good catch thanks for the measurements, will move to a 2us delay.
diff --git a/drivers/pci/controller/pci-aardvark.c b/drivers/pci/controller/pci-aardvark.c index fc0fe4d4de49..1fa6d04ad7aa 100644 --- a/drivers/pci/controller/pci-aardvark.c +++ b/drivers/pci/controller/pci-aardvark.c @@ -175,7 +175,8 @@ (PCIE_CONF_BUS(bus) | PCIE_CONF_DEV(PCI_SLOT(devfn)) | \ PCIE_CONF_FUNC(PCI_FUNC(devfn)) | PCIE_CONF_REG(where)) -#define PIO_TIMEOUT_MS 1 +#define PIO_RETRY_CNT 10 +#define PIO_RETRY_DELAY 100 /* 100 us*/ #define LINK_WAIT_MAX_RETRIES 10 #define LINK_WAIT_USLEEP_MIN 90000 @@ -383,17 +384,16 @@ static void advk_pcie_check_pio_status(struct advk_pcie *pcie) static int advk_pcie_wait_pio(struct advk_pcie *pcie) { struct device *dev = &pcie->pdev->dev; - unsigned long timeout; + size_t i; - timeout = jiffies + msecs_to_jiffies(PIO_TIMEOUT_MS); - - while (time_before(jiffies, timeout)) { + for (i = 0; i < PIO_RETRY_CNT; ++i) { u32 start, isr; start = advk_readl(pcie, PIO_START); isr = advk_readl(pcie, PIO_ISR); if (!start && isr) return 0; + udelay(PIO_RETRY_DELAY); } dev_err(dev, "config read/write timed out\n");
advk_pcie_wait_pio() can be called while holding a spinlock (from pci_bus_read_config_dword()), then depends on jiffies in order to timeout while polling on PIO state registers. In the case the PIO transaction failed, the timeout will never happen and will also cause the cpu to stall. This decrements a variable and wait instead of using jiffies. Signed-off-by: Remi Pommarel <repk@triplefau.lt> --- drivers/pci/controller/pci-aardvark.c | 10 +++++----- 1 file changed, 5 insertions(+), 5 deletions(-)