Message ID | 1429653035-19424-3-git-send-email-mcgrof@do-not-panic.com (mailing list archive) |
---|---|
State | Rejected |
Headers | show |
On Tue, Apr 21, 2015 at 02:50:35PM -0700, Luis R. Rodriguez wrote: > - if (qib_wc_pat) { > - resource_size_t vl15off; > - /* > - * We do not set WC on the VL15 buffers to avoid > - * a rare problem with unaligned writes from > - * interrupt-flushed store buffers, so we need > - * to map those separately here. We can't solve > - * this for the rarely used mtrr case. > - */ > - ret = init_chip_wc_pat(dd, 0); > - if (ret) > - goto bail; > + /* > + * We do not set WC on the VL15 buffers to avoid > + * a rare problem with unaligned writes from > + * interrupt-flushed store buffers, so we need > + * to map those separately here. We can't solve > + * this for the rarely used mtrr case. > + */ This is a small change in behavior, but it doesn't seem important.. Mike, what do you think about adding: if (dd->wc_cookie) dev_err(.., "Using this device without CPU PAT support is known to be broken"); or similar.. Jason -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, 2015-04-21 at 14:50 -0700, Luis R. Rodriguez wrote: This: > + /* MTRR was used if this is non-zero */ > + if (!dd->wc_cookie) > vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); And this: > + dd->wc_cookie = arch_phys_wc_add(pioaddr, piolen); > + if (dd->wc_cookie < 0) > + ret = -EINVAL; don't agree on what wc_cookie will be on error.
On Wed, Apr 22, 2015 at 09:54:38AM -0400, Doug Ledford wrote: > On Tue, 2015-04-21 at 14:50 -0700, Luis R. Rodriguez wrote: > > This: > > + /* MTRR was used if this is non-zero */ > > + if (!dd->wc_cookie) > > vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); > > And this: > > + dd->wc_cookie = arch_phys_wc_add(pioaddr, piolen); > > + if (dd->wc_cookie < 0) > > + ret = -EINVAL; > > don't agree on what wc_cookie will be on error. Can you elaborate? The one below is the one that starts things, and arch_phys_wc_add() will return 0 on PAT systems. For non-PAT systems it will return a number > 0 *iff* a valid MTRR was added. It will return negative onloy on error then. The change above is meant to replace a check put in place to see if PAT was enabled. The way we replace this is to ensure that arch_phys_wc_add() returned 0. If you disagree it'd be great if you can elaborate why. Luis -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2015-04-22 at 17:33 +0200, Luis R. Rodriguez wrote: > On Wed, Apr 22, 2015 at 09:54:38AM -0400, Doug Ledford wrote: > > On Tue, 2015-04-21 at 14:50 -0700, Luis R. Rodriguez wrote: > > > > This: > > > + /* MTRR was used if this is non-zero */ > > > + if (!dd->wc_cookie) > > > vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); > > > > And this: > > > + dd->wc_cookie = arch_phys_wc_add(pioaddr, piolen); > > > + if (dd->wc_cookie < 0) > > > + ret = -EINVAL; > > > > don't agree on what wc_cookie will be on error. > > Can you elaborate? The one below is the one that starts things, > and arch_phys_wc_add() will return 0 on PAT systems. For non-PAT > systems it will return a number > 0 *iff* a valid MTRR was added. > It will return negative onloy on error then. > > The change above is meant to replace a check put in place to see > if PAT was enabled. The way we replace this is to ensure that > arch_phys_wc_add() returned 0. > > If you disagree it'd be great if you can elaborate why. Maybe I'm missing something, but in qib_enable_wc() you store the return from arch_phys_wc_add into wc_cookie. That return is negative, so you return from qib_enable_wc() to qib_init_one(), they see the ret value, they print out a warning about bad performance, then they clear the return value and continue with device initialization. In all of this though, wc_cookie is never cleared and so it still has the error condition in it. Then, much later at run time, you call mmap_piobufs() and you check the contents of wc_cookie, and if it's non-0 (which is still will be), you do the wrong thing, right? And what about at shutdown when you call qib_disable_wc() and your cookie still has an error code in it as well?
On Wed, Apr 22, 2015 at 12:57:18PM -0400, Doug Ledford wrote: > On Wed, 2015-04-22 at 17:33 +0200, Luis R. Rodriguez wrote: > > On Wed, Apr 22, 2015 at 09:54:38AM -0400, Doug Ledford wrote: > > > On Tue, 2015-04-21 at 14:50 -0700, Luis R. Rodriguez wrote: > > > > > > This: > > > > + /* MTRR was used if this is non-zero */ > > > > + if (!dd->wc_cookie) > > > > vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); > > > > > > And this: > > > > + dd->wc_cookie = arch_phys_wc_add(pioaddr, piolen); > > > > + if (dd->wc_cookie < 0) > > > > + ret = -EINVAL; > > > > > > don't agree on what wc_cookie will be on error. > > > > Can you elaborate? The one below is the one that starts things, > > and arch_phys_wc_add() will return 0 on PAT systems. For non-PAT > > systems it will return a number > 0 *iff* a valid MTRR was added. > > It will return negative onloy on error then. > > > > The change above is meant to replace a check put in place to see > > if PAT was enabled. The way we replace this is to ensure that > > arch_phys_wc_add() returned 0. > > > > If you disagree it'd be great if you can elaborate why. > > Maybe I'm missing something, but in qib_enable_wc() you store the return > from arch_phys_wc_add into wc_cookie. That return is negative, If and only if the system was non-PAT and mtrr_add() failed. > so you > return from qib_enable_wc() to qib_init_one(), they see the ret value, > they print out a warning about bad performance, then they clear the > return value and continue with device initialization. > > In all of this though, wc_cookie is never cleared and so it still has > the error condition in it. Then, much later at run time, you call > mmap_piobufs() and you check the contents of wc_cookie, and if it's > non-0 (which is still will be), you do the wrong thing, right? Originally the code had it to run pgprot_writecombine() if PAT was going to be used. After the code changes we check for !cookie which will be true when cookie is 0 only. In case the cookie was an error, that is if mtrr_add() failed, then this code would not run because (!negative) is false. The goal was to trigger a run if the cookie was 0, which can only happen if PAT was enabled. Please let me know, I'd like to get this right too. > And what > about at shutdown when you call qib_disable_wc() and your cookie still > has an error code in it as well? Well fortunately arch_phys_wc_del(negative) and arch_phys_wc_del(0) will be a no-op. Its what helps us remove so much clutter. Luis -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Wed, 2015-04-22 at 19:37 +0200, Luis R. Rodriguez wrote: > On Wed, Apr 22, 2015 at 12:57:18PM -0400, Doug Ledford wrote: > > On Wed, 2015-04-22 at 17:33 +0200, Luis R. Rodriguez wrote: > > > On Wed, Apr 22, 2015 at 09:54:38AM -0400, Doug Ledford wrote: > > > > On Tue, 2015-04-21 at 14:50 -0700, Luis R. Rodriguez wrote: > > > > > > > > This: > > > > > + /* MTRR was used if this is non-zero */ > > > > > + if (!dd->wc_cookie) > > > > > vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); > > > > > > > > And this: > > > > > + dd->wc_cookie = arch_phys_wc_add(pioaddr, piolen); > > > > > + if (dd->wc_cookie < 0) > > > > > + ret = -EINVAL; > > > > > > > > don't agree on what wc_cookie will be on error. > > > > > > Can you elaborate? The one below is the one that starts things, > > > and arch_phys_wc_add() will return 0 on PAT systems. For non-PAT > > > systems it will return a number > 0 *iff* a valid MTRR was added. > > > It will return negative onloy on error then. > > > > > > The change above is meant to replace a check put in place to see > > > if PAT was enabled. The way we replace this is to ensure that > > > arch_phys_wc_add() returned 0. > > > > > > If you disagree it'd be great if you can elaborate why. > > > > Maybe I'm missing something, but in qib_enable_wc() you store the return > > from arch_phys_wc_add into wc_cookie. That return is negative, > > If and only if the system was non-PAT and mtrr_add() failed. > > > so you > > return from qib_enable_wc() to qib_init_one(), they see the ret value, > > they print out a warning about bad performance, then they clear the > > return value and continue with device initialization. > > > > In all of this though, wc_cookie is never cleared and so it still has > > the error condition in it. Then, much later at run time, you call > > mmap_piobufs() and you check the contents of wc_cookie, and if it's > > non-0 (which is still will be), you do the wrong thing, right? > > Originally the code had it to run pgprot_writecombine() if PAT was going to be > used. After the code changes we check for !cookie which will be true when > cookie is 0 only. In case the cookie was an error, that is if mtrr_add() > failed, then this code would not run because (!negative) is false. The goal was > to trigger a run if the cookie was 0, which can only happen if PAT was enabled. OK, the logic works, but as much as anything, it's the comment that's misleading. The code would be clearer with a comment like this: /* We used PAT if wc_cookie == 0 */ if (!dd->wc_cookie) { That would be more accurate as well since the original comment didn't account for the possible error code in wc_cookie, so it's possible you didn't use either PAT or wc if you have that error code. > Please let me know, I'd like to get this right too. > > > And what > > about at shutdown when you call qib_disable_wc() and your cookie still > > has an error code in it as well? > > Well fortunately arch_phys_wc_del(negative) and arch_phys_wc_del(0) will be > a no-op. Its what helps us remove so much clutter. OK.
On Wed, Apr 22, 2015 at 01:48:27PM -0400, Doug Ledford wrote: > On Wed, 2015-04-22 at 19:37 +0200, Luis R. Rodriguez wrote: > > On Wed, Apr 22, 2015 at 12:57:18PM -0400, Doug Ledford wrote: > > > On Wed, 2015-04-22 at 17:33 +0200, Luis R. Rodriguez wrote: > > > > On Wed, Apr 22, 2015 at 09:54:38AM -0400, Doug Ledford wrote: > > > > > On Tue, 2015-04-21 at 14:50 -0700, Luis R. Rodriguez wrote: > > > > > > > > > > This: > > > > > > + /* MTRR was used if this is non-zero */ > > > > > > + if (!dd->wc_cookie) > > > > > > vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); > > > > > > > > > > And this: > > > > > > + dd->wc_cookie = arch_phys_wc_add(pioaddr, piolen); > > > > > > + if (dd->wc_cookie < 0) > > > > > > + ret = -EINVAL; > > > > > > > > > > don't agree on what wc_cookie will be on error. > > > > > > > > Can you elaborate? The one below is the one that starts things, > > > > and arch_phys_wc_add() will return 0 on PAT systems. For non-PAT > > > > systems it will return a number > 0 *iff* a valid MTRR was added. > > > > It will return negative onloy on error then. > > > > > > > > The change above is meant to replace a check put in place to see > > > > if PAT was enabled. The way we replace this is to ensure that > > > > arch_phys_wc_add() returned 0. > > > > > > > > If you disagree it'd be great if you can elaborate why. > > > > > > Maybe I'm missing something, but in qib_enable_wc() you store the return > > > from arch_phys_wc_add into wc_cookie. That return is negative, > > > > If and only if the system was non-PAT and mtrr_add() failed. > > > > > so you > > > return from qib_enable_wc() to qib_init_one(), they see the ret value, > > > they print out a warning about bad performance, then they clear the > > > return value and continue with device initialization. > > > > > > In all of this though, wc_cookie is never cleared and so it still has > > > the error condition in it. Then, much later at run time, you call > > > mmap_piobufs() and you check the contents of wc_cookie, and if it's > > > non-0 (which is still will be), you do the wrong thing, right? > > > > Originally the code had it to run pgprot_writecombine() if PAT was going to be > > used. After the code changes we check for !cookie which will be true when > > cookie is 0 only. In case the cookie was an error, that is if mtrr_add() > > failed, then this code would not run because (!negative) is false. The goal was > > to trigger a run if the cookie was 0, which can only happen if PAT was enabled. > > OK, the logic works, but as much as anything, it's the comment that's > misleading. The code would be clearer with a comment like this: > > /* We used PAT if wc_cookie == 0 */ > if (!dd->wc_cookie) { > > That would be more accurate as well since the original comment didn't > account for the possible error code in wc_cookie, so it's possible you > didn't use either PAT or wc if you have that error code. Fair enough, will send a v5 follow up with the comment enhanced, but will leave the first patch in this series as-is. Luis -- To unsubscribe from this list: send the line "unsubscribe linux-rdma" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/drivers/infiniband/hw/qib/qib.h b/drivers/infiniband/hw/qib/qib.h index ffd48bf..ba5173e2 100644 --- a/drivers/infiniband/hw/qib/qib.h +++ b/drivers/infiniband/hw/qib/qib.h @@ -1136,7 +1136,6 @@ extern struct qib_devdata *qib_lookup(int unit); extern u32 qib_cpulist_count; extern unsigned long *qib_cpulist; -extern unsigned qib_wc_pat; extern unsigned qib_cc_table_size; int qib_init(struct qib_devdata *, int); int init_chip_wc_pat(struct qib_devdata *dd, u32); diff --git a/drivers/infiniband/hw/qib/qib_file_ops.c b/drivers/infiniband/hw/qib/qib_file_ops.c index 9ea6c44..9868499 100644 --- a/drivers/infiniband/hw/qib/qib_file_ops.c +++ b/drivers/infiniband/hw/qib/qib_file_ops.c @@ -835,7 +835,8 @@ static int mmap_piobufs(struct vm_area_struct *vma, vma->vm_flags &= ~VM_MAYREAD; vma->vm_flags |= VM_DONTCOPY | VM_DONTEXPAND; - if (qib_wc_pat) + /* MTRR was used if this is non-zero */ + if (!dd->wc_cookie) vma->vm_page_prot = pgprot_writecombine(vma->vm_page_prot); ret = io_remap_pfn_range(vma, vma->vm_start, phys >> PAGE_SHIFT, diff --git a/drivers/infiniband/hw/qib/qib_iba6120.c b/drivers/infiniband/hw/qib/qib_iba6120.c index 0d2ba59..4b927809 100644 --- a/drivers/infiniband/hw/qib/qib_iba6120.c +++ b/drivers/infiniband/hw/qib/qib_iba6120.c @@ -3315,11 +3315,9 @@ static int init_6120_variables(struct qib_devdata *dd) qib_6120_config_ctxts(dd); qib_set_ctxtcnt(dd); - if (qib_wc_pat) { - ret = init_chip_wc_pat(dd, 0); - if (ret) - goto bail; - } + ret = init_chip_wc_pat(dd, 0); + if (ret) + goto bail; set_6120_baseaddrs(dd); /* set chip access pointers now */ ret = 0; diff --git a/drivers/infiniband/hw/qib/qib_iba7220.c b/drivers/infiniband/hw/qib/qib_iba7220.c index 22affda..00b2af2 100644 --- a/drivers/infiniband/hw/qib/qib_iba7220.c +++ b/drivers/infiniband/hw/qib/qib_iba7220.c @@ -4126,11 +4126,9 @@ static int qib_init_7220_variables(struct qib_devdata *dd) qib_7220_config_ctxts(dd); qib_set_ctxtcnt(dd); /* needed for PAT setup */ - if (qib_wc_pat) { - ret = init_chip_wc_pat(dd, 0); - if (ret) - goto bail; - } + ret = init_chip_wc_pat(dd, 0); + if (ret) + goto bail; set_7220_baseaddrs(dd); /* set chip access pointers now */ ret = 0; diff --git a/drivers/infiniband/hw/qib/qib_iba7322.c b/drivers/infiniband/hw/qib/qib_iba7322.c index ef97b71..f32b462 100644 --- a/drivers/infiniband/hw/qib/qib_iba7322.c +++ b/drivers/infiniband/hw/qib/qib_iba7322.c @@ -6429,6 +6429,7 @@ static int qib_init_7322_variables(struct qib_devdata *dd) unsigned features, pidx, sbufcnt; int ret, mtu; u32 sbufs, updthresh; + resource_size_t vl15off; /* pport structs are contiguous, allocated after devdata */ ppd = (struct qib_pportdata *)(dd + 1); @@ -6677,29 +6678,27 @@ static int qib_init_7322_variables(struct qib_devdata *dd) qib_7322_config_ctxts(dd); qib_set_ctxtcnt(dd); - if (qib_wc_pat) { - resource_size_t vl15off; - /* - * We do not set WC on the VL15 buffers to avoid - * a rare problem with unaligned writes from - * interrupt-flushed store buffers, so we need - * to map those separately here. We can't solve - * this for the rarely used mtrr case. - */ - ret = init_chip_wc_pat(dd, 0); - if (ret) - goto bail; + /* + * We do not set WC on the VL15 buffers to avoid + * a rare problem with unaligned writes from + * interrupt-flushed store buffers, so we need + * to map those separately here. We can't solve + * this for the rarely used mtrr case. + */ + ret = init_chip_wc_pat(dd, 0); + if (ret) + goto bail; - /* vl15 buffers start just after the 4k buffers */ - vl15off = dd->physaddr + (dd->piobufbase >> 32) + - dd->piobcnt4k * dd->align4k; - dd->piovl15base = ioremap_nocache(vl15off, - NUM_VL15_BUFS * dd->align4k); - if (!dd->piovl15base) { - ret = -ENOMEM; - goto bail; - } + /* vl15 buffers start just after the 4k buffers */ + vl15off = dd->physaddr + (dd->piobufbase >> 32) + + dd->piobcnt4k * dd->align4k; + dd->piovl15base = ioremap_nocache(vl15off, + NUM_VL15_BUFS * dd->align4k); + if (!dd->piovl15base) { + ret = -ENOMEM; + goto bail; } + qib_7322_set_baseaddrs(dd); /* set chip access pointers now */ ret = 0; diff --git a/drivers/infiniband/hw/qib/qib_init.c b/drivers/infiniband/hw/qib/qib_init.c index 2ee3695..7e00470 100644 --- a/drivers/infiniband/hw/qib/qib_init.c +++ b/drivers/infiniband/hw/qib/qib_init.c @@ -91,15 +91,6 @@ MODULE_PARM_DESC(krcvqs, "number of kernel receive queues per IB port"); unsigned qib_cc_table_size; module_param_named(cc_table_size, qib_cc_table_size, uint, S_IRUGO); MODULE_PARM_DESC(cc_table_size, "Congestion control table entries 0 (CCA disabled - default), min = 128, max = 1984"); -/* - * qib_wc_pat parameter: - * 0 is WC via MTRR - * 1 is WC via PAT - * If PAT initialization fails, code reverts back to MTRR - */ -unsigned qib_wc_pat = 1; /* default (1) is to use PAT, not MTRR */ -module_param_named(wc_pat, qib_wc_pat, uint, S_IRUGO); -MODULE_PARM_DESC(wc_pat, "enable write-combining via PAT mechanism"); static void verify_interrupt(unsigned long); @@ -1377,8 +1368,7 @@ static void cleanup_device_data(struct qib_devdata *dd) spin_unlock(&dd->pport[pidx].cc_shadow_lock); } - if (!qib_wc_pat) - qib_disable_wc(dd); + qib_disable_wc(dd); if (dd->pioavailregs_dma) { dma_free_coherent(&dd->pcidev->dev, PAGE_SIZE, @@ -1547,14 +1537,12 @@ static int qib_init_one(struct pci_dev *pdev, const struct pci_device_id *ent) goto bail; } - if (!qib_wc_pat) { - ret = qib_enable_wc(dd); - if (ret) { - qib_dev_err(dd, - "Write combining not enabled (err %d): performance may be poor\n", - -ret); - ret = 0; - } + ret = qib_enable_wc(dd); + if (ret) { + qib_dev_err(dd, + "Write combining not enabled (err %d): performance may be poor\n", + -ret); + ret = 0; } qib_verify_pioperf(dd); diff --git a/drivers/infiniband/hw/qib/qib_wc_x86_64.c b/drivers/infiniband/hw/qib/qib_wc_x86_64.c index fe0850a..6d61ef9 100644 --- a/drivers/infiniband/hw/qib/qib_wc_x86_64.c +++ b/drivers/infiniband/hw/qib/qib_wc_x86_64.c @@ -116,21 +116,9 @@ int qib_enable_wc(struct qib_devdata *dd) } if (!ret) { - int cookie; - - cookie = mtrr_add(pioaddr, piolen, MTRR_TYPE_WRCOMB, 1); - if (cookie < 0) { - { - qib_devinfo(dd->pcidev, - "mtrr_add() WC for PIO bufs failed (%d)\n", - cookie); - ret = -EINVAL; - } - } else { - dd->wc_cookie = cookie; - dd->wc_base = (unsigned long) pioaddr; - dd->wc_len = (unsigned long) piolen; - } + dd->wc_cookie = arch_phys_wc_add(pioaddr, piolen); + if (dd->wc_cookie < 0) + ret = -EINVAL; } return ret; @@ -142,18 +130,7 @@ int qib_enable_wc(struct qib_devdata *dd) */ void qib_disable_wc(struct qib_devdata *dd) { - if (dd->wc_cookie) { - int r; - - r = mtrr_del(dd->wc_cookie, dd->wc_base, - dd->wc_len); - if (r < 0) - qib_devinfo(dd->pcidev, - "mtrr_del(%lx, %lx, %lx) failed: %d\n", - dd->wc_cookie, dd->wc_base, - dd->wc_len, r); - dd->wc_cookie = 0; /* even on failure */ - } + arch_phys_wc_del(dd->wc_cookie); } /**