Message ID | 1497956694-11784-1-git-send-email-thunder.leizhen@huawei.com (mailing list archive) |
---|---|
State | New, archived |
On 20/06/17 12:04, Zhen Lei wrote:
> This function is protected by a spinlock, and the latter performs a memory
> barrier implicitly, so we can safely use writel_relaxed. In fact, the dmb
> operation lengthens the time spent holding the lock, which indirectly
> increases lock contention under stress.

If you remove the DSB between writing the commands (to Normal memory)
and writing the pointer (to Device memory), how can you guarantee that
the complete command is visible to the SMMU and it isn't going to try to
consume stale memory contents? The spinlock is irrelevant since it's
taken *before* the command is written.

Robin.

> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
> ---
>  drivers/iommu/arm-smmu-v3.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
> index 380969a..d2fbee3 100644
> --- a/drivers/iommu/arm-smmu-v3.c
> +++ b/drivers/iommu/arm-smmu-v3.c
> @@ -728,7 +728,7 @@ static void queue_inc_prod(struct arm_smmu_queue *q)
>  	u32 prod = (Q_WRP(q, q->prod) | Q_IDX(q, q->prod)) + 1;
>
>  	q->prod = Q_OVF(q, q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
> -	writel(q->prod, q->prod_reg);
> +	writel_relaxed(q->prod, q->prod_reg);
>  }
>
>  /*
> --
> 2.5.0
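The ordering concern raised above can be illustrated with a userspace analogy. This is only a sketch under stated assumptions, not the driver's code: `demo_queue`, `demo_queue_insert`, `demo_queue_consume`, `QUEUE_ENTRIES` and `CMD_DWORDS` are hypothetical names, a C11 atomic stands in for the PROD register, and a second CPU thread would stand in for the SMMU. In the kernel, `writel()` orders earlier memory writes before the MMIO write, whereas `writel_relaxed()` does not give that guarantee; the release store below plays the role of the former.

```c
#include <stdatomic.h>
#include <stdint.h>
#include <string.h>

#define QUEUE_ENTRIES 8u /* hypothetical queue depth */
#define CMD_DWORDS    2u /* hypothetical command size in 64-bit words */

struct demo_queue {
	uint64_t ents[QUEUE_ENTRIES][CMD_DWORDS]; /* stands in for the queue in Normal memory */
	_Atomic uint32_t prod_reg;                /* stands in for the PROD register */
	uint32_t prod;                            /* software copy of the producer index */
};

/* Producer: fill the slot first, then publish the new index. */
static void demo_queue_insert(struct demo_queue *q, const uint64_t *cmd)
{
	memcpy(q->ents[q->prod % QUEUE_ENTRIES], cmd,
	       CMD_DWORDS * sizeof(uint64_t));
	q->prod++;
	/*
	 * Release ordering: the slot contents written above become visible
	 * to any observer that acquire-loads the new index. With
	 * memory_order_relaxed here, an observer could see the new index
	 * while still reading stale slot contents.
	 */
	atomic_store_explicit(&q->prod_reg, q->prod, memory_order_release);
}

/* Consumer: returns 1 and copies a command if one is available at 'cons'. */
static int demo_queue_consume(struct demo_queue *q, uint32_t cons,
			      uint64_t *out)
{
	uint32_t prod = atomic_load_explicit(&q->prod_reg,
					     memory_order_acquire);

	if (cons == prod)
		return 0; /* nothing published yet */
	memcpy(out, q->ents[cons % QUEUE_ENTRIES],
	       CMD_DWORDS * sizeof(uint64_t));
	return 1;
}
```

Note that a lock taken before `demo_queue_insert()` would not change this picture: the ordering that matters is between the slot write and the index write, both of which happen inside the critical section.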
On 2017/6/20 19:35, Robin Murphy wrote:
> On 20/06/17 12:04, Zhen Lei wrote:
>> This function is protected by a spinlock, and the latter performs a memory
>> barrier implicitly, so we can safely use writel_relaxed. In fact, the dmb
>> operation lengthens the time spent holding the lock, which indirectly
>> increases lock contention under stress.
>
> If you remove the DSB between writing the commands (to Normal memory)
> and writing the pointer (to Device memory), how can you guarantee that
> the complete command is visible to the SMMU and it isn't going to try to
> consume stale memory contents? The spinlock is irrelevant since it's
> taken *before* the command is written.

OK, I see, thanks. Let me see if there are any other methods. And I think
this should probably be handled by hardware.

>
> Robin.
>
>> Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
>> ---
>>  drivers/iommu/arm-smmu-v3.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
>> index 380969a..d2fbee3 100644
>> --- a/drivers/iommu/arm-smmu-v3.c
>> +++ b/drivers/iommu/arm-smmu-v3.c
>> @@ -728,7 +728,7 @@ static void queue_inc_prod(struct arm_smmu_queue *q)
>>  	u32 prod = (Q_WRP(q, q->prod) | Q_IDX(q, q->prod)) + 1;
>>
>>  	q->prod = Q_OVF(q, q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
>> -	writel(q->prod, q->prod_reg);
>> +	writel_relaxed(q->prod, q->prod_reg);
>>  }
>>
>>  /*
>> --
>> 2.5.0
On Wed, Jun 21, 2017 at 09:28:23AM +0800, Leizhen (ThunderTown) wrote:
> On 2017/6/20 19:35, Robin Murphy wrote:
> > On 20/06/17 12:04, Zhen Lei wrote:
> >> This function is protected by a spinlock, and the latter performs a memory
> >> barrier implicitly, so we can safely use writel_relaxed. In fact, the dmb
> >> operation lengthens the time spent holding the lock, which indirectly
> >> increases lock contention under stress.
> >
> > If you remove the DSB between writing the commands (to Normal memory)
> > and writing the pointer (to Device memory), how can you guarantee that
> > the complete command is visible to the SMMU and it isn't going to try to
> > consume stale memory contents? The spinlock is irrelevant since it's
> > taken *before* the command is written.
> OK, I see, thanks. Let me see if there are any other methods. And I think
> this should probably be handled by hardware.

FWIW, I did use the _relaxed variants wherever I could when I wrote the
driver. There might, of course, be bugs, but it's not like the normal case
for drivers where the author didn't consider the _relaxed accessors
initially.

Will
On 2017/6/21 17:08, Will Deacon wrote:
> On Wed, Jun 21, 2017 at 09:28:23AM +0800, Leizhen (ThunderTown) wrote:
>> On 2017/6/20 19:35, Robin Murphy wrote:
>>> On 20/06/17 12:04, Zhen Lei wrote:
>>>> This function is protected by a spinlock, and the latter performs a memory
>>>> barrier implicitly, so we can safely use writel_relaxed. In fact, the dmb
>>>> operation lengthens the time spent holding the lock, which indirectly
>>>> increases lock contention under stress.
>>>
>>> If you remove the DSB between writing the commands (to Normal memory)
>>> and writing the pointer (to Device memory), how can you guarantee that
>>> the complete command is visible to the SMMU and it isn't going to try to
>>> consume stale memory contents? The spinlock is irrelevant since it's
>>> taken *before* the command is written.
>> OK, I see, thanks. Let me see if there are any other methods. And I think
>> this should probably be handled by hardware.
>
> FWIW, I did use the _relaxed variants wherever I could when I wrote the
> driver. There might, of course, be bugs, but it's not like the normal case
> for drivers where the author didn't consider the _relaxed accessors
> initially.

Good news: I have a new idea and will post v2 later.

>
> Will
On 2017/6/26 21:29, Leizhen (ThunderTown) wrote:
> On 2017/6/21 17:08, Will Deacon wrote:
>> On Wed, Jun 21, 2017 at 09:28:23AM +0800, Leizhen (ThunderTown) wrote:
>>> On 2017/6/20 19:35, Robin Murphy wrote:
>>>> On 20/06/17 12:04, Zhen Lei wrote:
>>>>> This function is protected by a spinlock, and the latter performs a memory
>>>>> barrier implicitly, so we can safely use writel_relaxed. In fact, the dmb
>>>>> operation lengthens the time spent holding the lock, which indirectly
>>>>> increases lock contention under stress.
>>>>
>>>> If you remove the DSB between writing the commands (to Normal memory)
>>>> and writing the pointer (to Device memory), how can you guarantee that
>>>> the complete command is visible to the SMMU and it isn't going to try to
>>>> consume stale memory contents? The spinlock is irrelevant since it's
>>>> taken *before* the command is written.
>>> OK, I see, thanks. Let me see if there are any other methods. And I think
>>> this should probably be handled by hardware.
>>
>> FWIW, I did use the _relaxed variants wherever I could when I wrote the
>> driver. There might, of course, be bugs, but it's not like the normal case
>> for drivers where the author didn't consider the _relaxed accessors
>> initially.
> Good news: I have a new idea and will post v2 later.

I have just sent:
[PATCH 0/5] arm-smmu: performance optimization
[PATCH 1/5] iommu/arm-smmu-v3: put off the execution of TLBI* to reduce lock confliction

>
>>
>> Will
diff --git a/drivers/iommu/arm-smmu-v3.c b/drivers/iommu/arm-smmu-v3.c
index 380969a..d2fbee3 100644
--- a/drivers/iommu/arm-smmu-v3.c
+++ b/drivers/iommu/arm-smmu-v3.c
@@ -728,7 +728,7 @@ static void queue_inc_prod(struct arm_smmu_queue *q)
 	u32 prod = (Q_WRP(q, q->prod) | Q_IDX(q, q->prod)) + 1;

 	q->prod = Q_OVF(q, q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
-	writel(q->prod, q->prod_reg);
+	writel_relaxed(q->prod, q->prod_reg);
 }

 /*
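For readers unfamiliar with the index arithmetic in queue_inc_prod(), here is a minimal userspace sketch of the two assignment lines from the hunk. The macro definitions and the `demo_q` struct are assumptions made for illustration (the thread does not show them): the low `max_n_shift` bits are taken to be the index, the next bit the wrap flag, and bit 31 an overflow flag. The register write is left out entirely.

```c
#include <stdint.h>

/* Hypothetical stand-in for the fields of struct arm_smmu_queue used here. */
struct demo_q {
	uint32_t prod;        /* producer value: overflow flag | wrap bit | index */
	uint32_t max_n_shift; /* log2 of the number of queue entries */
};

/* Assumed bit layout; the thread does not show these definitions. */
#define Q_IDX(q, p)     ((p) & ((1u << (q)->max_n_shift) - 1))
#define Q_WRP(q, p)     ((p) & (1u << (q)->max_n_shift))
#define Q_OVERFLOW_FLAG (1u << 31)
#define Q_OVF(q, p)     ((p) & Q_OVERFLOW_FLAG)

/* Same arithmetic as the two assignment lines in the hunk above. */
static void demo_inc_prod(struct demo_q *q)
{
	/* Increment index and wrap bit together, so a carry out of the
	 * index toggles the wrap bit. */
	uint32_t prod = (Q_WRP(q, q->prod) | Q_IDX(q, q->prod)) + 1;

	/* Keep the overflow flag untouched and drop any carry past the
	 * wrap bit. */
	q->prod = Q_OVF(q, q->prod) | Q_WRP(q, prod) | Q_IDX(q, prod);
}
```

With a four-entry queue (`max_n_shift = 2`) under these assumptions, incrementing from index 3 yields index 0 with the wrap bit set, and a further four increments clear the wrap bit again.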
This function is protected by a spinlock, and the latter performs a memory
barrier implicitly, so we can safely use writel_relaxed. In fact, the dmb
operation lengthens the time spent holding the lock, which indirectly
increases lock contention under stress.

Signed-off-by: Zhen Lei <thunder.leizhen@huawei.com>
---
 drivers/iommu/arm-smmu-v3.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--
2.5.0