[2/2] IB/hfi1: Do not warn on lid conversions for OPA

Message ID 20171129125948.31113.59826.stgit@phlsvslse11.ph.intel.com (mailing list archive)
State Not Applicable
Commit Message

Marciniszyn, Mike Nov. 29, 2017, 12:59 p.m. UTC
From: Don Hiatt <don.hiatt@intel.com>

Upstream commit 4988be5813ff2afdc0d8bfa315ef34a577d3efbf.

On OPA devices, opa_local_smp_check() will receive 32-bit LIDs when the
LID is extended. In such cases, it is okay to lose the upper 16 bits of
the LID, as this information is obtained elsewhere. Mask out the upper
16 bits before calling ib_lid_cpu16() so that it does not issue a warning.

[75920.148985] ------------[ cut here ]------------
[75920.154651] WARNING: CPU: 0 PID: 1718 at ./include/rdma/ib_verbs.h:3788 hfi1_process_mad+0x1c1f/0x1c80 [hfi1]
[75920.166192] Modules linked in: ib_ipoib hfi1(E) rdmavt(E) rdma_ucm(E) ib_ucm(E) rdma_cm(E) ib_cm(E) iw_cm(E) ib_umad(E) ib_uverbs(E) ib_core(E) libiscsi scsi_transport_iscsi dm_mirror dm_region_hash dm_log dm_mod dax x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel mei_me ipmi_si iTCO_wdt iTCO_vendor_support crypto_simd ipmi_devintf pcspkr mei sg i2c_i801 glue_helper lpc_ich shpchp ioatdma mfd_core wmi ipmi_msghandler cryptd acpi_power_meter acpi_pad nfsd auth_rpcgss nfs_acl lockd grace sunrpc ip_tables xfs libcrc32c sd_mod mgag200 drm_kms_helper syscopyarea sysfillrect sysimgblt fb_sys_fops ttm drm igb ptp ahci libahci pps_core crc32c_intel libata dca i2c_algo_bit i2c_core [last unloaded: ib_core]
[75920.246331] CPU: 0 PID: 1718 Comm: kworker/0:1H Tainted: G        W I E   4.13.0-rc7+ #1
[75920.255907] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS SE5C610.86B.01.01.0008.021120151325 02/11/2015
[75920.268158] Workqueue: ib-comp-wq ib_cq_poll_work [ib_core]
[75920.274934] task: ffff88084a718000 task.stack: ffffc9000a424000
[75920.282123] RIP: 0010:hfi1_process_mad+0x1c1f/0x1c80 [hfi1]
[75920.288881] RSP: 0018:ffffc9000a427c38 EFLAGS: 00010206
[75920.295265] RAX: 0000000000010001 RBX: ffff8808361420e8 RCX: ffff880837811d80
[75920.303784] RDX: 0000000000000002 RSI: 0000000000007fff RDI: ffff880837811d80
[75920.312302] RBP: ffffc9000a427d38 R08: 0000000000000000 R09: ffff8808361420e8
[75920.320819] R10: ffff88083841f0e8 R11: ffffc9000a427da8 R12: 0000000000000001
[75920.329335] R13: ffff880837810000 R14: 0000000000000000 R15: ffff88084f1a4800
[75920.337849] FS:  0000000000000000(0000) GS:ffff88085f400000(0000) knlGS:0000000000000000
[75920.347450] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[75920.354405] CR2: 00007f9e4b3d9000 CR3: 0000000001c09000 CR4: 00000000001406f0
[75920.362947] Call Trace:
[75920.366257]  ? ib_mad_recv_done+0x258/0x9b0 [ib_core]
[75920.372457]  ? ib_mad_recv_done+0x258/0x9b0 [ib_core]
[75920.378652]  ? __kmalloc+0x1df/0x210
[75920.383229]  ib_mad_recv_done+0x305/0x9b0 [ib_core]
[75920.389270]  __ib_process_cq+0x5d/0xb0 [ib_core]
[75920.395032]  ib_cq_poll_work+0x20/0x60 [ib_core]
[75920.400777]  process_one_work+0x149/0x360
[75920.405836]  worker_thread+0x4d/0x3c0
[75920.410505]  kthread+0x109/0x140
[75920.414681]  ? rescuer_thread+0x380/0x380
[75920.419731]  ? kthread_park+0x60/0x60
[75920.424406]  ret_from_fork+0x25/0x30
[75920.428972] Code: 4c 89 9d 58 ff ff ff 49 89 45 00 66 b8 00 02 49 89 45 08 e8 44 27 89 e0 4c 8b 9d 58 ff ff ff e9 d8 f6 ff ff 0f ff e9 55 e7 ff ff <0f> ff e9 3b e5 ff ff 0f ff 0f 1f 84 00 00 00 00 00 e9 4b e9 ff
[75921.451269] ---[ end trace cf26df27c9597265 ]---

Cc: <stable@vger.kernel.org> # 4.14.x
Fixes: 62ede7779904 ("Add OPA extended LID support")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Signed-off-by: Don Hiatt <don.hiatt@intel.com>
Signed-off-by: Dennis Dalessandro <dennis.dalessandro@intel.com>
Reviewed-by: Leon Romanovsky <leonro@mellanox.com>
Signed-off-by: Doug Ledford <dledford@redhat.com>
---
 drivers/infiniband/hw/hfi1/mad.c |    7 +++++--
 1 file changed, 5 insertions(+), 2 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Jason Gunthorpe Nov. 29, 2017, 6:43 p.m. UTC | #1
On Wed, Nov 29, 2017 at 07:59:49AM -0500, Mike Marciniszyn wrote:
> From: Don Hiatt <don.hiatt@intel.com>
> 
> Upstream commit 4988be5813ff2afdc0d8bfa315ef34a577d3efbf.
> 
> On OPA devices opa_local_smp_check will receive 32Bit LIDs when the LID
> is Extended. In such cases, it is okay to lose the upper 16 bits of the
> LID as this information is obtained elsewhere. Do not issue a warning
> when calling ib_lid_cpu16() in this case by masking out the upper 16Bits.

The concept for the 32 bit lids was to keep it 32 bit inside the
kernel, ingress_pkey_table_fail puts the lid into
err_info_constraint.slid which is already 32 bits, so this cannot be
the right place for this.

These truncations should only appear at the uapi boundary.

Jason
Greg KH Dec. 4, 2017, 12:13 p.m. UTC | #2
On Wed, Nov 29, 2017 at 11:43:24AM -0700, Jason Gunthorpe wrote:
> On Wed, Nov 29, 2017 at 07:59:49AM -0500, Mike Marciniszyn wrote:
> > From: Don Hiatt <don.hiatt@intel.com>
> > 
> > Upstream commit 4988be5813ff2afdc0d8bfa315ef34a577d3efbf.
> > 
> > On OPA devices opa_local_smp_check will receive 32Bit LIDs when the LID
> > is Extended. In such cases, it is okay to lose the upper 16 bits of the
> > LID as this information is obtained elsewhere. Do not issue a warning
> > when calling ib_lid_cpu16() in this case by masking out the upper 16Bits.
> 
> The concept for the 32 bit lids was to keep it 32 bit inside the
> kernel, ingress_pkey_table_fail puts the lid into
> err_info_constraint.slid which is already 32 bits, so this cannot be
> the right place for this.
> 
> These truncations should only appear at the uapi boundary.

Ok, but that's what the change in Linus's tree does.  If you object to
that, please fix it there :)

thanks,

greg k-h
Hiatt, Don Dec. 4, 2017, 5:28 p.m. UTC | #3
On 12/4/2017 4:13 AM, Greg KH wrote:
> On Wed, Nov 29, 2017 at 11:43:24AM -0700, Jason Gunthorpe wrote:
>> On Wed, Nov 29, 2017 at 07:59:49AM -0500, Mike Marciniszyn wrote:
>>> From: Don Hiatt <don.hiatt@intel.com>
>>>
>>> Upstream commit 4988be5813ff2afdc0d8bfa315ef34a577d3efbf.
>>>
>>> On OPA devices opa_local_smp_check will receive 32Bit LIDs when the LID
>>> is Extended. In such cases, it is okay to lose the upper 16 bits of the
>>> LID as this information is obtained elsewhere. Do not issue a warning
>>> when calling ib_lid_cpu16() in this case by masking out the upper 16Bits.
>> The concept for the 32 bit lids was to keep it 32 bit inside the
>> kernel, ingress_pkey_table_fail puts the lid into
>> err_info_constraint.slid which is already 32 bits, so this cannot be
>> the right place for this.
>>
>> These truncations should only appear at the uapi boundary.
> Ok, but that's what the change in Linus's tree does.  If you object to
> that, please fix it there :)
Hi Jason,

I have changes for this that I can submit as a follow on patch, please
let me know if this is how you'd like to proceed.

don


> thanks,
>
> greg k-h

Jason Gunthorpe Dec. 4, 2017, 6:04 p.m. UTC | #4
On Mon, Dec 04, 2017 at 09:28:07AM -0800, Don Hiatt wrote:
> 
> 
> On 12/4/2017 4:13 AM, Greg KH wrote:
> >On Wed, Nov 29, 2017 at 11:43:24AM -0700, Jason Gunthorpe wrote:
> >>On Wed, Nov 29, 2017 at 07:59:49AM -0500, Mike Marciniszyn wrote:
> >>>From: Don Hiatt <don.hiatt@intel.com>
> >>>
> >>>Upstream commit 4988be5813ff2afdc0d8bfa315ef34a577d3efbf.
> >>>
> >>>On OPA devices opa_local_smp_check will receive 32Bit LIDs when the LID
> >>>is Extended. In such cases, it is okay to lose the upper 16 bits of the
> >>>LID as this information is obtained elsewhere. Do not issue a warning
> >>>when calling ib_lid_cpu16() in this case by masking out the upper 16Bits.
> >>The concept for the 32 bit lids was to keep it 32 bit inside the
> >>kernel, ingress_pkey_table_fail puts the lid into
> >>err_info_constraint.slid which is already 32 bits, so this cannot be
> >>the right place for this.
> >>
> >>These truncations should only appear at the uapi boundary.
> >Ok, but that's what the change in Linus's tree does.  If you object to
> >that, please fix it there :)
> Hi Jason,
> 
> I have changes for this that I can submit as a follow on patch,
> please let me know if this is how you'd like to proceed.

Yes, please send a follow-up patch to mainline.

Jason

Patch

diff --git a/drivers/infiniband/hw/hfi1/mad.c b/drivers/infiniband/hw/hfi1/mad.c
index f4c0ffc..07b80fa 100644
--- a/drivers/infiniband/hw/hfi1/mad.c
+++ b/drivers/infiniband/hw/hfi1/mad.c
@@ -4293,7 +4293,6 @@  static int opa_local_smp_check(struct hfi1_ibport *ibp,
 			       const struct ib_wc *in_wc)
 {
 	struct hfi1_pportdata *ppd = ppd_from_ibp(ibp);
-	u16 slid = ib_lid_cpu16(in_wc->slid);
 	u16 pkey;
 
 	if (in_wc->pkey_index >= ARRAY_SIZE(ppd->pkeys))
@@ -4320,7 +4319,11 @@  static int opa_local_smp_check(struct hfi1_ibport *ibp,
 	 */
 	if (pkey == LIM_MGMT_P_KEY || pkey == FULL_MGMT_P_KEY)
 		return 0;
-	ingress_pkey_table_fail(ppd, pkey, slid);
+	/*
+	 * On OPA devices it is okay to lose the upper 16 bits of LID as this
+	 * information is obtained elsewhere. Mask off the upper 16 bits.
+	 */
+	ingress_pkey_table_fail(ppd, pkey, ib_lid_cpu16(0xFFFF & in_wc->slid));
 	return 1;
 }
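
The effect of the mask in the hunk above can be sketched in user space.
This is only an illustration, not the kernel code: model_lid_cpu16(),
model_lid_cpu16_masked(), lid_warn_count and lid_demo() are hypothetical
names, the warning is modeled as a counter plus a message on stderr
rather than a WARN splat, and 0x10001 is just an example LID with a bit
set above bit 15. The sketch assumes ib_lid_cpu16() warns whenever any of
the upper 16 bits of its argument are set and then truncates to 16 bits.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

/* Counts warnings; stands in for the kernel's WARN machinery (assumption). */
static unsigned int lid_warn_count;

/*
 * Hypothetical user-space model of ib_lid_cpu16(): narrow a 32-bit LID
 * to 16 bits and complain if any of the upper 16 bits are set.
 */
static uint16_t model_lid_cpu16(uint32_t lid)
{
	if (lid & 0xFFFF0000u) {
		lid_warn_count++;
		fprintf(stderr, "warning: LID 0x%x does not fit in 16 bits\n",
			(unsigned int)lid);
	}
	return (uint16_t)lid;
}

/* Mask first, as the patch does, so the narrowing never warns. */
static uint16_t model_lid_cpu16_masked(uint32_t lid)
{
	return model_lid_cpu16(0xFFFFu & lid);
}

/* Compare the unmasked and masked conversions of one extended LID. */
static void lid_demo(void)
{
	const uint32_t extended_lid = 0x10001u; /* example value only */

	lid_warn_count = 0;

	/* Unmasked: the low 16 bits come back, but a warning is raised. */
	assert(model_lid_cpu16(extended_lid) == 0x0001u);
	assert(lid_warn_count == 1);

	/* Masked: same 16-bit result, and no additional warning. */
	assert(model_lid_cpu16_masked(extended_lid) == 0x0001u);
	assert(lid_warn_count == 1);
}
```

Calling lid_demo() from a test program's main() exercises both paths.
Both conversions return the same 16-bit value; the mask changes only
whether the warning fires. The sketch does not address the objection
raised in the comments that the LID should stay 32 bits inside the
kernel.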