Message ID | 1471525542-14969-4-git-send-email-will.deacon@arm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Thu, Aug 18, 2016 at 9:05 AM, Will Deacon <will.deacon@arm.com> wrote: > Enabling stalling faults can result in hardware deadlock on poorly > designed systems, particularly those with a PCI root complex upstream of > the SMMU. > > Although it's not really Linux's job to save hardware integrators from > their own misfortune, it *is* our job to stop userspace (e.g. VFIO > clients) from hosing the system for everybody else, even if they might > already be required to have elevated privileges. > > Given that the fault handling code currently executes entirely in IRQ > context, there is nothing that can sensibly be done to recover from > things like page faults anyway, so let's rip this code out for now and > avoid the potential for deadlock. Hi Will, so, I'd like to re-introduce this feature, I *guess* as some sort of opt-in quirk (ie. disabled by default unless something in DT tells you otherwise?? But I'm open to suggestions. I'm not entirely sure what hw was having problems due to this feature.) On newer snapdragon devices we are using arm-smmu for the GPU, and halting the GPU so the driver's fault handler can dump some GPU state on faults is enormously helpful for debugging and tracking down where in the gpu cmdstream the fault was triggered. In addition, we will eventually want the ability to update pagetables from fault handler and resuming the faulting transition. Some additional comments below.. > Cc: <stable@vger.kernel.org> > Reported-by: Matt Evans <matt.evans@arm.com> > Signed-off-by: Will Deacon <will.deacon@arm.com> > --- > drivers/iommu/arm-smmu.c | 34 +++++++--------------------------- > 1 file changed, 7 insertions(+), 27 deletions(-) > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > index 4f49fe29f202..2db74ebc3240 100644 > --- a/drivers/iommu/arm-smmu.c > +++ b/drivers/iommu/arm-smmu.c > @@ -686,8 +686,7 @@ static struct iommu_gather_ops arm_smmu_gather_ops = { > > static irqreturn_t arm_smmu_context_fault(int irq, void *dev) > { > - int flags, ret; > - u32 fsr, fsynr, resume; > + u32 fsr, fsynr; > unsigned long iova; > struct iommu_domain *domain = dev; > struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); > @@ -701,34 +700,15 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev) > if (!(fsr & FSR_FAULT)) > return IRQ_NONE; > > - if (fsr & FSR_IGN) > - dev_err_ratelimited(smmu->dev, > - "Unexpected context fault (fsr 0x%x)\n", > - fsr); > - > fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0); > - flags = fsynr & FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ; > - > iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR); > - if (!report_iommu_fault(domain, smmu->dev, iova, flags)) { > - ret = IRQ_HANDLED; > - resume = RESUME_RETRY; > - } else { > - dev_err_ratelimited(smmu->dev, > - "Unhandled context fault: iova=0x%08lx, fsynr=0x%x, cb=%d\n", > - iova, fsynr, cfg->cbndx); I would like to decouple this dev_err_ratelimit() print from the RESUME_RETRY vs RESUME_TERMINATE behaviour. I need the ability to indicate by return from my fault handler whether to resume or terminate. But I already have my own ratelimted prints and would prefer not to spam dmesg twice. I'm thinking about report_iommu_fault() returning: 0 => RESUME_RETRY -EFAULT => RESUME_TERMINATE but don't print anything else (or specifically -ENOSYS?) => RESUME_TERMINATE and print thoughts? > - ret = IRQ_NONE; > - resume = RESUME_TERMINATE; > - } > - > - /* Clear the faulting FSR */ > - writel(fsr, cb_base + ARM_SMMU_CB_FSR); > > - /* Retry or terminate any stalled transactions */ > - if (fsr & FSR_SS) > - writel_relaxed(resume, cb_base + ARM_SMMU_CB_RESUME); This might be a bug in qcom's implementation of the smmu spec, but seems like we don't have SS bit set, yet we still require RESUME reg to be written, otherwise gpu is perma-wedged. Maybe topic for a separate quirk? I'm not sure if writing RESUME reg on other hw when SS bit is not set is likely to cause problems? If not I suppose we could just unconditionally write it. Anyways, I'm not super-familiar w/ arm-smmu so suggestions welcome.. in between debugging freedreno I'll try to put together some patches. BR, -R > + dev_err_ratelimited(smmu->dev, > + "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d\n", > + fsr, iova, fsynr, cfg->cbndx); > > - return ret; > + writel(fsr, cb_base + ARM_SMMU_CB_FSR); > + return IRQ_HANDLED; > } > > static irqreturn_t arm_smmu_global_fault(int irq, void *dev) > @@ -837,7 +817,7 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain, > } > > /* SCTLR */ > - reg = SCTLR_CFCFG | SCTLR_CFIE | SCTLR_CFRE | SCTLR_M | SCTLR_EAE_SBOP; > + reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_M | SCTLR_EAE_SBOP; > if (stage1) > reg |= SCTLR_S1_ASIDPNE; > #ifdef __BIG_ENDIAN > -- > 2.1.4 > > _______________________________________________ > iommu mailing list > iommu@lists.linux-foundation.org > https://lists.linuxfoundation.org/mailman/listinfo/iommu
On Tue, Dec 06, 2016 at 06:30:21PM -0500, Rob Clark wrote: > On Thu, Aug 18, 2016 at 9:05 AM, Will Deacon <will.deacon@arm.com> wrote: > > Enabling stalling faults can result in hardware deadlock on poorly > > designed systems, particularly those with a PCI root complex upstream of > > the SMMU. > > > > Although it's not really Linux's job to save hardware integrators from > > their own misfortune, it *is* our job to stop userspace (e.g. VFIO > > clients) from hosing the system for everybody else, even if they might > > already be required to have elevated privileges. > > > > Given that the fault handling code currently executes entirely in IRQ > > context, there is nothing that can sensibly be done to recover from > > things like page faults anyway, so let's rip this code out for now and > > avoid the potential for deadlock. > > Hi Will, > > so, I'd like to re-introduce this feature, I *guess* as some sort of > opt-in quirk (ie. disabled by default unless something in DT tells you > otherwise?? But I'm open to suggestions. I'm not entirely sure what > hw was having problems due to this feature.) > > On newer snapdragon devices we are using arm-smmu for the GPU, and > halting the GPU so the driver's fault handler can dump some GPU state > on faults is enormously helpful for debugging and tracking down where > in the gpu cmdstream the fault was triggered. In addition, we will > eventually want the ability to update pagetables from fault handler > and resuming the faulting transition. > > Some additional comments below.. > > > Cc: <stable@vger.kernel.org> > > Reported-by: Matt Evans <matt.evans@arm.com> > > Signed-off-by: Will Deacon <will.deacon@arm.com> > > --- > > drivers/iommu/arm-smmu.c | 34 +++++++--------------------------- > > 1 file changed, 7 insertions(+), 27 deletions(-) > > > > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c > > index 4f49fe29f202..2db74ebc3240 100644 > > --- a/drivers/iommu/arm-smmu.c > > +++ b/drivers/iommu/arm-smmu.c > > @@ -686,8 +686,7 @@ static struct iommu_gather_ops arm_smmu_gather_ops = { > > > > static irqreturn_t arm_smmu_context_fault(int irq, void *dev) > > { > > - int flags, ret; > > - u32 fsr, fsynr, resume; > > + u32 fsr, fsynr; > > unsigned long iova; > > struct iommu_domain *domain = dev; > > struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); > > @@ -701,34 +700,15 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev) > > if (!(fsr & FSR_FAULT)) > > return IRQ_NONE; > > > > - if (fsr & FSR_IGN) > > - dev_err_ratelimited(smmu->dev, > > - "Unexpected context fault (fsr 0x%x)\n", > > - fsr); > > - > > fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0); > > - flags = fsynr & FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ; > > - > > iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR); > > - if (!report_iommu_fault(domain, smmu->dev, iova, flags)) { > > - ret = IRQ_HANDLED; > > - resume = RESUME_RETRY; > > - } else { > > - dev_err_ratelimited(smmu->dev, > > - "Unhandled context fault: iova=0x%08lx, fsynr=0x%x, cb=%d\n", > > - iova, fsynr, cfg->cbndx); > > I would like to decouple this dev_err_ratelimit() print from the > RESUME_RETRY vs RESUME_TERMINATE behaviour. I need the ability to > indicate by return from my fault handler whether to resume or > terminate. But I already have my own ratelimted prints and would > prefer not to spam dmesg twice. > > I'm thinking about report_iommu_fault() returning: > > 0 => RESUME_RETRY > -EFAULT => RESUME_TERMINATE but don't print > anything else (or specifically -ENOSYS?) => RESUME_TERMINATE and print > > thoughts? > > > - ret = IRQ_NONE; > > - resume = RESUME_TERMINATE; > > - } > > - > > - /* Clear the faulting FSR */ > > - writel(fsr, cb_base + ARM_SMMU_CB_FSR); > > > > - /* Retry or terminate any stalled transactions */ > > - if (fsr & FSR_SS) > > - writel_relaxed(resume, cb_base + ARM_SMMU_CB_RESUME); > > This might be a bug in qcom's implementation of the smmu spec, but > seems like we don't have SS bit set, yet we still require RESUME reg > to be written, otherwise gpu is perma-wedged. Maybe topic for a > separate quirk? I'm not sure if writing RESUME reg on other hw when > SS bit is not set is likely to cause problems? If not I suppose we > could just unconditionally write it. > > Anyways, I'm not super-familiar w/ arm-smmu so suggestions welcome.. > in between debugging freedreno I'll try to put together some patches. From what I can tell we need SCTLR_CFCFG to make the stall happen otherwise the operation just gets terminated immediately and *then* we get notification but by then the system keeps going. I think SCTLR_HUPCF helps control that behavior (i.e. we don't go off faulting through eternity) but I don't know how it works. From my very unlearned understanding I think we do want to set CFCFG and then stall and let the interrupt handler decide to retry/terminate. Jordan
Hi Rob, >> > Although it's not really Linux's job to save hardware integrators from >> > their own misfortune, it *is* our job to stop userspace (e.g. VFIO >> > clients) from hosing the system for everybody else, even if they might >> > already be required to have elevated privileges. >> > >> > Given that the fault handling code currently executes entirely in IRQ >> > context, there is nothing that can sensibly be done to recover from >> > things like page faults anyway, so let's rip this code out for now and >> > avoid the potential for deadlock. >> >> Hi Will, >> >> so, I'd like to re-introduce this feature, I *guess* as some sort of >> opt-in quirk (ie. disabled by default unless something in DT tells you >> otherwise?? But I'm open to suggestions. I'm not entirely sure what >> hw was having problems due to this feature.) >> >> On newer snapdragon devices we are using arm-smmu for the GPU, and >> halting the GPU so the driver's fault handler can dump some GPU state >> on faults is enormously helpful for debugging and tracking down where >> in the gpu cmdstream the fault was triggered. In addition, we will >> eventually want the ability to update pagetables from fault handler >> and resuming the faulting transition. >> >> Some additional comments below.. >> >> > Cc: <stable@vger.kernel.org> >> > Reported-by: Matt Evans <matt.evans@arm.com> >> > Signed-off-by: Will Deacon <will.deacon@arm.com> >> > --- >> > drivers/iommu/arm-smmu.c | 34 +++++++--------------------------- >> > 1 file changed, 7 insertions(+), 27 deletions(-) >> > >> > diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c >> > index 4f49fe29f202..2db74ebc3240 100644 >> > --- a/drivers/iommu/arm-smmu.c >> > +++ b/drivers/iommu/arm-smmu.c >> > @@ -686,8 +686,7 @@ static struct iommu_gather_ops arm_smmu_gather_ops = { >> > >> > static irqreturn_t arm_smmu_context_fault(int irq, void *dev) >> > { >> > - int flags, ret; >> > - u32 fsr, fsynr, resume; >> > + u32 fsr, fsynr; >> > unsigned long iova; >> > struct iommu_domain *domain = dev; >> > struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); >> > @@ -701,34 +700,15 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev) >> > if (!(fsr & FSR_FAULT)) >> > return IRQ_NONE; >> > >> > - if (fsr & FSR_IGN) >> > - dev_err_ratelimited(smmu->dev, >> > - "Unexpected context fault (fsr 0x%x)\n", >> > - fsr); >> > - >> > fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0); >> > - flags = fsynr & FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ; >> > - >> > iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR); >> > - if (!report_iommu_fault(domain, smmu->dev, iova, flags)) { >> > - ret = IRQ_HANDLED; >> > - resume = RESUME_RETRY; >> > - } else { >> > - dev_err_ratelimited(smmu->dev, >> > - "Unhandled context fault: iova=0x%08lx, fsynr=0x%x, cb=%d\n", >> > - iova, fsynr, cfg->cbndx); >> >> I would like to decouple this dev_err_ratelimit() print from the >> RESUME_RETRY vs RESUME_TERMINATE behaviour. I need the ability to >> indicate by return from my fault handler whether to resume or >> terminate. But I already have my own ratelimted prints and would >> prefer not to spam dmesg twice. >> >> I'm thinking about report_iommu_fault() returning: >> >> 0 => RESUME_RETRY >> -EFAULT => RESUME_TERMINATE but don't print >> anything else (or specifically -ENOSYS?) => RESUME_TERMINATE and print >> >> thoughts? >> >> > - ret = IRQ_NONE; >> > - resume = RESUME_TERMINATE; >> > - } >> > - >> > - /* Clear the faulting FSR */ >> > - writel(fsr, cb_base + ARM_SMMU_CB_FSR); >> > >> > - /* Retry or terminate any stalled transactions */ >> > - if (fsr & FSR_SS) >> > - writel_relaxed(resume, cb_base + ARM_SMMU_CB_RESUME); >> >> This might be a bug in qcom's implementation of the smmu spec, but >> seems like we don't have SS bit set, yet we still require RESUME reg >> to be written, otherwise gpu is perma-wedged. Maybe topic for a >> separate quirk? I'm not sure if writing RESUME reg on other hw when >> SS bit is not set is likely to cause problems? If not I suppose we >> could just unconditionally write it. >> >> Anyways, I'm not super-familiar w/ arm-smmu so suggestions welcome.. >> in between debugging freedreno I'll try to put together some patches. > >From what I can tell we need SCTLR_CFCFG to make the stall happen otherwise >the operation just gets terminated immediately and *then* we get notification >but by then the system keeps going. > Yes thats right, SCTLR_CFCFG was removed as a part of this patch. >I think SCTLR_HUPCF helps control that behavior (i.e. we don't go off faulting >through eternity) but I don't know how it works. > >From my very unlearned understanding I think we do want to set CFCFG and then >stall and let the interrupt handler decide to retry/terminate. Yes, infact i was thinking of adding this as a new patch, but now that this was a reverted one. As you said, it would be good to know the hw which was having problem with this and probably having this has a quirk otherwise. Also i see that FSR_SS is implemented by the qcom and the downstream code uses it. Regards, Sricharan
Hi Rob, On Tue, Dec 06, 2016 at 06:30:21PM -0500, Rob Clark wrote: > On Thu, Aug 18, 2016 at 9:05 AM, Will Deacon <will.deacon@arm.com> wrote: > > Enabling stalling faults can result in hardware deadlock on poorly > > designed systems, particularly those with a PCI root complex upstream of > > the SMMU. > > > > Although it's not really Linux's job to save hardware integrators from > > their own misfortune, it *is* our job to stop userspace (e.g. VFIO > > clients) from hosing the system for everybody else, even if they might > > already be required to have elevated privileges. > > > > Given that the fault handling code currently executes entirely in IRQ > > context, there is nothing that can sensibly be done to recover from > > things like page faults anyway, so let's rip this code out for now and > > avoid the potential for deadlock. > > so, I'd like to re-introduce this feature, I *guess* as some sort of > opt-in quirk (ie. disabled by default unless something in DT tells you > otherwise?? But I'm open to suggestions. I'm not entirely sure what > hw was having problems due to this feature.) > > On newer snapdragon devices we are using arm-smmu for the GPU, and > halting the GPU so the driver's fault handler can dump some GPU state > on faults is enormously helpful for debugging and tracking down where > in the gpu cmdstream the fault was triggered. In addition, we will > eventually want the ability to update pagetables from fault handler > and resuming the faulting transition. I'm not against reintroducing this, but it would certainly need to be opt-in, as you suggest. If we want to try to use stall faults to enable demand paging for DMA, then that means running core mm code to resolve the fault and we can't do that in irq context. Consequently, we have to hand this off to a thread, which means the hardware must allow the SS bit to remain set without immediately reasserting the interrupt line. Furthermore, we can't handle multiple faults on a context-bank, so we'd need to restrict ourselves to one device (i.e. faulting stream) per domain (CB). I think that means we want both specific compatible strings to identify the SS bit behaviour, but also a way to opt-in for the stall model as a separate property to indicate that the SoC integration can support this without e.g. deadlocking. Will
Hi Will, >On Tue, Dec 06, 2016 at 06:30:21PM -0500, Rob Clark wrote: >> On Thu, Aug 18, 2016 at 9:05 AM, Will Deacon <will.deacon@arm.com> wrote: >> > Enabling stalling faults can result in hardware deadlock on poorly >> > designed systems, particularly those with a PCI root complex upstream of >> > the SMMU. >> > >> > Although it's not really Linux's job to save hardware integrators from >> > their own misfortune, it *is* our job to stop userspace (e.g. VFIO >> > clients) from hosing the system for everybody else, even if they might >> > already be required to have elevated privileges. >> > >> > Given that the fault handling code currently executes entirely in IRQ >> > context, there is nothing that can sensibly be done to recover from >> > things like page faults anyway, so let's rip this code out for now and >> > avoid the potential for deadlock. >> >> so, I'd like to re-introduce this feature, I *guess* as some sort of >> opt-in quirk (ie. disabled by default unless something in DT tells you >> otherwise?? But I'm open to suggestions. I'm not entirely sure what >> hw was having problems due to this feature.) >> >> On newer snapdragon devices we are using arm-smmu for the GPU, and >> halting the GPU so the driver's fault handler can dump some GPU state >> on faults is enormously helpful for debugging and tracking down where >> in the gpu cmdstream the fault was triggered. In addition, we will >> eventually want the ability to update pagetables from fault handler >> and resuming the faulting transition. > >I'm not against reintroducing this, but it would certainly need to be >opt-in, as you suggest. If we want to try to use stall faults to enable >demand paging for DMA, then that means running core mm code to resolve >the fault and we can't do that in irq context. Consequently, we have to >hand this off to a thread, which means the hardware must allow the SS >bit to remain set without immediately reasserting the interrupt line. >Furthermore, we can't handle multiple faults on a context-bank, so we'd >need to restrict ourselves to one device (i.e. faulting stream) per >domain (CB). > >I think that means we want both specific compatible strings to identify >the SS bit behaviour, but also a way to opt-in for the stall model as a >separate property to indicate that the SoC integration can support this >without e.g. deadlocking. > To understand the reason on the need for the quirk based on SS bit behavior, if the platform supports stall model and enabled, then SS bit should be implemented and remain set until the RESUME register is written back, means same behavior always ? Regards, Sricharan
On Mon, Dec 19, 2016 at 02:33:36PM +0530, Sricharan wrote: > >On Tue, Dec 06, 2016 at 06:30:21PM -0500, Rob Clark wrote: > >> On Thu, Aug 18, 2016 at 9:05 AM, Will Deacon <will.deacon@arm.com> wrote: > >> > Enabling stalling faults can result in hardware deadlock on poorly > >> > designed systems, particularly those with a PCI root complex upstream of > >> > the SMMU. > >> > > >> > Although it's not really Linux's job to save hardware integrators from > >> > their own misfortune, it *is* our job to stop userspace (e.g. VFIO > >> > clients) from hosing the system for everybody else, even if they might > >> > already be required to have elevated privileges. > >> > > >> > Given that the fault handling code currently executes entirely in IRQ > >> > context, there is nothing that can sensibly be done to recover from > >> > things like page faults anyway, so let's rip this code out for now and > >> > avoid the potential for deadlock. > >> > >> so, I'd like to re-introduce this feature, I *guess* as some sort of > >> opt-in quirk (ie. disabled by default unless something in DT tells you > >> otherwise?? But I'm open to suggestions. I'm not entirely sure what > >> hw was having problems due to this feature.) > >> > >> On newer snapdragon devices we are using arm-smmu for the GPU, and > >> halting the GPU so the driver's fault handler can dump some GPU state > >> on faults is enormously helpful for debugging and tracking down where > >> in the gpu cmdstream the fault was triggered. In addition, we will > >> eventually want the ability to update pagetables from fault handler > >> and resuming the faulting transition. > > > >I'm not against reintroducing this, but it would certainly need to be > >opt-in, as you suggest. If we want to try to use stall faults to enable > >demand paging for DMA, then that means running core mm code to resolve > >the fault and we can't do that in irq context. Consequently, we have to > >hand this off to a thread, which means the hardware must allow the SS > >bit to remain set without immediately reasserting the interrupt line. > >Furthermore, we can't handle multiple faults on a context-bank, so we'd > >need to restrict ourselves to one device (i.e. faulting stream) per > >domain (CB). > > > >I think that means we want both specific compatible strings to identify > >the SS bit behaviour, but also a way to opt-in for the stall model as a > >separate property to indicate that the SoC integration can support this > >without e.g. deadlocking. > > > > To understand the reason on the need for the quirk based on SS bit behavior, > if the platform supports stall model and enabled, then SS bit should be implemented > and remain set until the RESUME register is written back, means same behavior > always ? The behaviour of the SS bit is IMPLEMENTATION DEFINED per the architecture, so we need to know which way the given implementation chose to go. If we want to support paging, then we absolutely need a way to return from the interrupt handler without having handled the stall (i.e. without having written to the RESUME register). That means that we mustn't take the same interrupt immediately, otherwise we'll end up getting stuck in an infinite fault. One hacky option would be to mask the interrupt at the GIC, but that adds an additional requirement of one interrupt per context bank, which isn't typically implemented in my experience. Will
On Fri, Dec 16, 2016 at 6:54 AM, Will Deacon <will.deacon@arm.com> wrote: > Hi Rob, > > On Tue, Dec 06, 2016 at 06:30:21PM -0500, Rob Clark wrote: >> On Thu, Aug 18, 2016 at 9:05 AM, Will Deacon <will.deacon@arm.com> wrote: >> > Enabling stalling faults can result in hardware deadlock on poorly >> > designed systems, particularly those with a PCI root complex upstream of >> > the SMMU. >> > >> > Although it's not really Linux's job to save hardware integrators from >> > their own misfortune, it *is* our job to stop userspace (e.g. VFIO >> > clients) from hosing the system for everybody else, even if they might >> > already be required to have elevated privileges. >> > >> > Given that the fault handling code currently executes entirely in IRQ >> > context, there is nothing that can sensibly be done to recover from >> > things like page faults anyway, so let's rip this code out for now and >> > avoid the potential for deadlock. >> >> so, I'd like to re-introduce this feature, I *guess* as some sort of >> opt-in quirk (ie. disabled by default unless something in DT tells you >> otherwise?? But I'm open to suggestions. I'm not entirely sure what >> hw was having problems due to this feature.) >> >> On newer snapdragon devices we are using arm-smmu for the GPU, and >> halting the GPU so the driver's fault handler can dump some GPU state >> on faults is enormously helpful for debugging and tracking down where >> in the gpu cmdstream the fault was triggered. In addition, we will >> eventually want the ability to update pagetables from fault handler >> and resuming the faulting transition. > > I'm not against reintroducing this, but it would certainly need to be > opt-in, as you suggest. If we want to try to use stall faults to enable > demand paging for DMA, then that means running core mm code to resolve > the fault and we can't do that in irq context. Consequently, we have to > hand this off to a thread, which means the hardware must allow the SS > bit to remain set without immediately reasserting the interrupt line. > Furthermore, we can't handle multiple faults on a context-bank, so we'd > need to restrict ourselves to one device (i.e. faulting stream) per > domain (CB). > > I think that means we want both specific compatible strings to identify > the SS bit behaviour, but also a way to opt-in for the stall model as a > separate property to indicate that the SoC integration can support this > without e.g. deadlocking. > How do you feel about a short-term step to keep things in irq context, but enable stall mode? I'm debugging an issue, where it appears that the gpu cannot handle a non-stalled fault (triggers some fairly bizarre follow-on problems). So I think even if we don't add a fault handler callback, we still want to set the CFCFG bit and RESUME_TERMINATE in the fault handler on this hardware. BR, -R
On Wed, Sep 13, 2017 at 03:31:20PM -0400, Rob Clark wrote: > On Fri, Dec 16, 2016 at 6:54 AM, Will Deacon <will.deacon@arm.com> wrote: > > Hi Rob, > > > > On Tue, Dec 06, 2016 at 06:30:21PM -0500, Rob Clark wrote: > >> On Thu, Aug 18, 2016 at 9:05 AM, Will Deacon <will.deacon@arm.com> wrote: > >> > Enabling stalling faults can result in hardware deadlock on poorly > >> > designed systems, particularly those with a PCI root complex upstream of > >> > the SMMU. > >> > > >> > Although it's not really Linux's job to save hardware integrators from > >> > their own misfortune, it *is* our job to stop userspace (e.g. VFIO > >> > clients) from hosing the system for everybody else, even if they might > >> > already be required to have elevated privileges. > >> > > >> > Given that the fault handling code currently executes entirely in IRQ > >> > context, there is nothing that can sensibly be done to recover from > >> > things like page faults anyway, so let's rip this code out for now and > >> > avoid the potential for deadlock. > >> > >> so, I'd like to re-introduce this feature, I *guess* as some sort of > >> opt-in quirk (ie. disabled by default unless something in DT tells you > >> otherwise?? But I'm open to suggestions. I'm not entirely sure what > >> hw was having problems due to this feature.) > >> > >> On newer snapdragon devices we are using arm-smmu for the GPU, and > >> halting the GPU so the driver's fault handler can dump some GPU state > >> on faults is enormously helpful for debugging and tracking down where > >> in the gpu cmdstream the fault was triggered. In addition, we will > >> eventually want the ability to update pagetables from fault handler > >> and resuming the faulting transition. > > > > I'm not against reintroducing this, but it would certainly need to be > > opt-in, as you suggest. If we want to try to use stall faults to enable > > demand paging for DMA, then that means running core mm code to resolve > > the fault and we can't do that in irq context. Consequently, we have to > > hand this off to a thread, which means the hardware must allow the SS > > bit to remain set without immediately reasserting the interrupt line. > > Furthermore, we can't handle multiple faults on a context-bank, so we'd > > need to restrict ourselves to one device (i.e. faulting stream) per > > domain (CB). > > > > I think that means we want both specific compatible strings to identify > > the SS bit behaviour, but also a way to opt-in for the stall model as a > > separate property to indicate that the SoC integration can support this > > without e.g. deadlocking. > > > > How do you feel about a short-term step to keep things in irq context, > but enable stall mode? I'm debugging an issue, where it appears that > the gpu cannot handle a non-stalled fault (triggers some fairly > bizarre follow-on problems). So I think even if we don't add a fault > handler callback, we still want to set the CFCFG bit and > RESUME_TERMINATE in the fault handler on this hardware. Hmm, colour me unconvinced. Why does enabling stalls fix this problem? I'd rather we bite the bullet and implement things properly on top of a workqueue so that you can build on the same basic infrastructure as the SVM work that Jean-Philippe is looking at, particular as you also have a use-case for running fault code in non-atomic context. Will
On Mon, Sep 18, 2017 at 1:33 PM, Will Deacon <will.deacon@arm.com> wrote: > On Wed, Sep 13, 2017 at 03:31:20PM -0400, Rob Clark wrote: >> On Fri, Dec 16, 2016 at 6:54 AM, Will Deacon <will.deacon@arm.com> wrote: >> > Hi Rob, >> > >> > On Tue, Dec 06, 2016 at 06:30:21PM -0500, Rob Clark wrote: >> >> On Thu, Aug 18, 2016 at 9:05 AM, Will Deacon <will.deacon@arm.com> wrote: >> >> > Enabling stalling faults can result in hardware deadlock on poorly >> >> > designed systems, particularly those with a PCI root complex upstream of >> >> > the SMMU. >> >> > >> >> > Although it's not really Linux's job to save hardware integrators from >> >> > their own misfortune, it *is* our job to stop userspace (e.g. VFIO >> >> > clients) from hosing the system for everybody else, even if they might >> >> > already be required to have elevated privileges. >> >> > >> >> > Given that the fault handling code currently executes entirely in IRQ >> >> > context, there is nothing that can sensibly be done to recover from >> >> > things like page faults anyway, so let's rip this code out for now and >> >> > avoid the potential for deadlock. >> >> >> >> so, I'd like to re-introduce this feature, I *guess* as some sort of >> >> opt-in quirk (ie. disabled by default unless something in DT tells you >> >> otherwise?? But I'm open to suggestions. I'm not entirely sure what >> >> hw was having problems due to this feature.) >> >> >> >> On newer snapdragon devices we are using arm-smmu for the GPU, and >> >> halting the GPU so the driver's fault handler can dump some GPU state >> >> on faults is enormously helpful for debugging and tracking down where >> >> in the gpu cmdstream the fault was triggered. In addition, we will >> >> eventually want the ability to update pagetables from fault handler >> >> and resuming the faulting transition. >> > >> > I'm not against reintroducing this, but it would certainly need to be >> > opt-in, as you suggest. If we want to try to use stall faults to enable >> > demand paging for DMA, then that means running core mm code to resolve >> > the fault and we can't do that in irq context. Consequently, we have to >> > hand this off to a thread, which means the hardware must allow the SS >> > bit to remain set without immediately reasserting the interrupt line. >> > Furthermore, we can't handle multiple faults on a context-bank, so we'd >> > need to restrict ourselves to one device (i.e. faulting stream) per >> > domain (CB). >> > >> > I think that means we want both specific compatible strings to identify >> > the SS bit behaviour, but also a way to opt-in for the stall model as a >> > separate property to indicate that the SoC integration can support this >> > without e.g. deadlocking. >> > >> >> How do you feel about a short-term step to keep things in irq context, >> but enable stall mode? I'm debugging an issue, where it appears that >> the gpu cannot handle a non-stalled fault (triggers some fairly >> bizarre follow-on problems). So I think even if we don't add a fault >> handler callback, we still want to set the CFCFG bit and >> RESUME_TERMINATE in the fault handler on this hardware. > > Hmm, colour me unconvinced. Why does enabling stalls fix this problem? > I'd rather we bite the bullet and implement things properly on top of > a workqueue so that you can build on the same basic infrastructure as > the SVM work that Jean-Philippe is looking at, particular as you also > have a use-case for running fault code in non-atomic context. > So, it seems like setting either CFCFG or HUPCF (which is what downstream android kernel apparently does) avoids this issue. It seems like some sort of hw bug, but not sure if in the iommu or the gpu.. you'd probably have to ask someone @qcom about the details, but without setting either of these bits, it seems that other concurrent memory transactions to the one triggering fault and up returning bogus values to the GPU. So the CP (which is reading cmdstream somewhat far ahead of the DRAW or COMPUTE shader that triggers a fault) ends up reading bogus cmdstream values and triggers all sorts of spectacular fireworks. BR, -R
diff --git a/drivers/iommu/arm-smmu.c b/drivers/iommu/arm-smmu.c index 4f49fe29f202..2db74ebc3240 100644 --- a/drivers/iommu/arm-smmu.c +++ b/drivers/iommu/arm-smmu.c @@ -686,8 +686,7 @@ static struct iommu_gather_ops arm_smmu_gather_ops = { static irqreturn_t arm_smmu_context_fault(int irq, void *dev) { - int flags, ret; - u32 fsr, fsynr, resume; + u32 fsr, fsynr; unsigned long iova; struct iommu_domain *domain = dev; struct arm_smmu_domain *smmu_domain = to_smmu_domain(domain); @@ -701,34 +700,15 @@ static irqreturn_t arm_smmu_context_fault(int irq, void *dev) if (!(fsr & FSR_FAULT)) return IRQ_NONE; - if (fsr & FSR_IGN) - dev_err_ratelimited(smmu->dev, - "Unexpected context fault (fsr 0x%x)\n", - fsr); - fsynr = readl_relaxed(cb_base + ARM_SMMU_CB_FSYNR0); - flags = fsynr & FSYNR0_WNR ? IOMMU_FAULT_WRITE : IOMMU_FAULT_READ; - iova = readq_relaxed(cb_base + ARM_SMMU_CB_FAR); - if (!report_iommu_fault(domain, smmu->dev, iova, flags)) { - ret = IRQ_HANDLED; - resume = RESUME_RETRY; - } else { - dev_err_ratelimited(smmu->dev, - "Unhandled context fault: iova=0x%08lx, fsynr=0x%x, cb=%d\n", - iova, fsynr, cfg->cbndx); - ret = IRQ_NONE; - resume = RESUME_TERMINATE; - } - - /* Clear the faulting FSR */ - writel(fsr, cb_base + ARM_SMMU_CB_FSR); - /* Retry or terminate any stalled transactions */ - if (fsr & FSR_SS) - writel_relaxed(resume, cb_base + ARM_SMMU_CB_RESUME); + dev_err_ratelimited(smmu->dev, + "Unhandled context fault: fsr=0x%x, iova=0x%08lx, fsynr=0x%x, cb=%d\n", + fsr, iova, fsynr, cfg->cbndx); - return ret; + writel(fsr, cb_base + ARM_SMMU_CB_FSR); + return IRQ_HANDLED; } static irqreturn_t arm_smmu_global_fault(int irq, void *dev) @@ -837,7 +817,7 @@ static void arm_smmu_init_context_bank(struct arm_smmu_domain *smmu_domain, } /* SCTLR */ - reg = SCTLR_CFCFG | SCTLR_CFIE | SCTLR_CFRE | SCTLR_M | SCTLR_EAE_SBOP; + reg = SCTLR_CFIE | SCTLR_CFRE | SCTLR_M | SCTLR_EAE_SBOP; if (stage1) reg |= SCTLR_S1_ASIDPNE; #ifdef __BIG_ENDIAN
Enabling stalling faults can result in hardware deadlock on poorly designed systems, particularly those with a PCI root complex upstream of the SMMU. Although it's not really Linux's job to save hardware integrators from their own misfortune, it *is* our job to stop userspace (e.g. VFIO clients) from hosing the system for everybody else, even if they might already be required to have elevated privileges. Given that the fault handling code currently executes entirely in IRQ context, there is nothing that can sensibly be done to recover from things like page faults anyway, so let's rip this code out for now and avoid the potential for deadlock. Cc: <stable@vger.kernel.org> Reported-by: Matt Evans <matt.evans@arm.com> Signed-off-by: Will Deacon <will.deacon@arm.com> --- drivers/iommu/arm-smmu.c | 34 +++++++--------------------------- 1 file changed, 7 insertions(+), 27 deletions(-)