diff mbox

RDMA/netlink: OOPs in rdma_nl_rcv_msg() from misinterpreted flag

Message ID 20171024123957.32207.70888.stgit@phlsvslse11.ph.intel.com (mailing list archive)
State Accepted
Headers show

Commit Message

Michael J. Ruhl Oct. 24, 2017, 12:41 p.m. UTC
From: Michael J. Ruhl <michael.j.ruhl@intel.com>

rdma_nl_rcv_msg() checks to see if it should use the .dump() callback
or the .doit() callback.  The check is done with this check:

if (flags & NLM_F_DUMP) ...

The NLM_F_DUMP flag is two bits (NLM_F_ROOT | NLM_F_MATCH).

When an RDMA_NL_LS message (response) is received, the bit used for
indicating an error is the same bit as NLM_F_ROOT.

NLM_F_ROOT == (0x100) == RDMA_NL_LS_F_ERR.

ibacm sends a response with the RDMA_NL_LS_F_ERR bit set if an error
occurs in the service.  The current code then misinterprets the
NLM_F_DUMP bit and trys to call the .dump() callback.

If the .dump() callback for the specified request is not available
(which is true for the RDMA_NL_LS messages) the following Oops occurs:

[ 4555.960256] BUG: unable to handle kernel NULL pointer dereference at
   (null)
[ 4555.969046] IP:           (null)
[ 4555.972664] PGD 10543f1067 P4D 10543f1067 PUD 1033f93067 PMD 0
[ 4555.979287] Oops: 0010 [#1] SMP
[ 4555.982809] Modules linked in: rpcrdma ib_isert iscsi_target_mod
target_core_mod ib_iser libiscsi scsi_transport_iscsi ib_ipoib rdma_ucm ib_ucm
ib_uverbs ib_umad rdma_cm ib_cm iw_cm dm_mirror dm_region_hash dm_log dm_mod
dax sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass
crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd
glue_helper cryptd hfi1 rdmavt iTCO_wdt iTCO_vendor_support ib_core mei_me
lpc_ich pcspkr mei ioatdma sg shpchp i2c_i801 mfd_core wmi ipmi_si ipmi_devintf
ipmi_msghandler acpi_power_meter acpi_pad nfsd auth_rpcgss nfs_acl lockd grace
sunrpc ip_tables ext4 mbcache jbd2 sd_mod mgag200 drm_kms_helper syscopyarea
sysfillrect sysimgblt fb_sys_fops ttm igb ahci crc32c_intel ptp libahci
pps_core drm dca libata i2c_algo_bit i2c_core
[ 4556.061190] CPU: 54 PID: 9841 Comm: ibacm Tainted: G          I
4.14.0-rc2+ #6
[ 4556.069667] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS
SE5C610.86B.01.01.0008.021120151325 02/11/2015
[ 4556.081339] task: ffff880855f42d00 task.stack: ffffc900246b4000
[ 4556.087967] RIP: 0010:          (null)
[ 4556.092166] RSP: 0018:ffffc900246b7bc8 EFLAGS: 00010246
[ 4556.098018] RAX: ffffffff81dbe9e0 RBX: ffff881058bb1000 RCX:
0000000000000000
[ 4556.105997] RDX: 0000000000001100 RSI: ffff881058bb1320 RDI:
ffff881056362000
[ 4556.113984] RBP: ffffc900246b7bf8 R08: 0000000000000ec0 R09:
0000000000001100
[ 4556.121971] R10: ffff8810573a5000 R11: 0000000000000000 R12:
ffff881056362000
[ 4556.129957] R13: 0000000000000ec0 R14: ffff881058bb1320 R15:
0000000000000ec0
[ 4556.137945] FS:  00007fe0ba5a38c0(0000) GS:ffff88105f080000(0000)
knlGS:0000000000000000
[ 4556.147000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[ 4556.153433] CR2: 0000000000000000 CR3: 0000001056f5d003 CR4:
00000000001606e0
[ 4556.161419] Call Trace:
[ 4556.164167]  ? netlink_dump+0x12c/0x290
[ 4556.168468]  __netlink_dump_start+0x186/0x1f0
[ 4556.173357]  rdma_nl_rcv_msg+0x193/0x1b0 [ib_core]
[ 4556.178724]  rdma_nl_rcv+0xdc/0x130 [ib_core]
[ 4556.183604]  netlink_unicast+0x181/0x240
[ 4556.187998]  netlink_sendmsg+0x2c2/0x3b0
[ 4556.192392]  sock_sendmsg+0x38/0x50
[ 4556.196299]  SYSC_sendto+0x102/0x190
[ 4556.200308]  ? __audit_syscall_entry+0xaf/0x100
[ 4556.205387]  ? syscall_trace_enter+0x1d0/0x2b0
[ 4556.210366]  ? __audit_syscall_exit+0x209/0x290
[ 4556.215442]  SyS_sendto+0xe/0x10
[ 4556.219060]  do_syscall_64+0x67/0x1b0
[ 4556.223165]  entry_SYSCALL64_slow_path+0x25/0x25
[ 4556.228328] RIP: 0033:0x7fe0b9db2a63
[ 4556.232333] RSP: 002b:00007ffc55edc260 EFLAGS: 00000293 ORIG_RAX:
000000000000002c
[ 4556.240808] RAX: ffffffffffffffda RBX: 0000000000000010 RCX:
00007fe0b9db2a63
[ 4556.248796] RDX: 0000000000000010 RSI: 00007ffc55edc280 RDI:
000000000000000d
[ 4556.256782] RBP: 00007ffc55edc670 R08: 00007ffc55edc270 R09:
000000000000000c
[ 4556.265321] R10: 0000000000000000 R11: 0000000000000293 R12:
00007ffc55edc280
[ 4556.273846] R13: 000000000260b400 R14: 000000000000000d R15:
0000000000000001
[ 4556.282368] Code:  Bad RIP value.
[ 4556.286629] RIP:           (null) RSP: ffffc900246b7bc8
[ 4556.293013] CR2: 0000000000000000
[ 4556.297292] ---[ end trace 8d67abcfd10ec209 ]---
[ 4556.305465] Kernel panic - not syncing: Fatal exception
[ 4556.313786] Kernel Offset: disabled
[ 4556.321563] ---[ end Kernel panic - not syncing: Fatal exception
[ 4556.328960] ------------[ cut here ]------------

Special case RDMA_NL_LS response messages to call the appropriate
callback.

Additionally, make sure that the .dump() callback is not NULL
before calling it.

Fixes: 647c75ac59a48a54 ("RDMA/netlink: Convert LS to doit callback")
Reviewed-by: Mike Marciniszyn <mike.marciniszyn@intel.com>
Reviewed-by: Kaike Wan <kaike.wan@intel.com>
Reviewed-by: Alex Estrin <alex.estrin@intel.com>
Signed-off-by: Michael J. Ruhl <michael.j.ruhl@intel.com>
---
 drivers/infiniband/core/netlink.c |   13 ++++++++++++-
 1 files changed, 12 insertions(+), 1 deletions(-)


--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Leon Romanovsky Oct. 24, 2017, 2:41 p.m. UTC | #1
On Tue, Oct 24, 2017 at 08:41:01AM -0400, Michael J. Ruhl wrote:
> From: Michael J. Ruhl <michael.j.ruhl@intel.com>
>
> rdma_nl_rcv_msg() checks to see if it should use the .dump() callback
> or the .doit() callback.  The check is done with this check:
>
> if (flags & NLM_F_DUMP) ...
>
> The NLM_F_DUMP flag is two bits (NLM_F_ROOT | NLM_F_MATCH).
>
> When an RDMA_NL_LS message (response) is received, the bit used for
> indicating an error is the same bit as NLM_F_ROOT.
>
> NLM_F_ROOT == (0x100) == RDMA_NL_LS_F_ERR.
>
> ibacm sends a response with the RDMA_NL_LS_F_ERR bit set if an error
> occurs in the service.  The current code then misinterprets the
> NLM_F_DUMP bit and trys to call the .dump() callback.
>
> If the .dump() callback for the specified request is not available
> (which is true for the RDMA_NL_LS messages) the following Oops occurs:
>
> [ 4555.960256] BUG: unable to handle kernel NULL pointer dereference at
>    (null)
> [ 4555.969046] IP:           (null)
> [ 4555.972664] PGD 10543f1067 P4D 10543f1067 PUD 1033f93067 PMD 0
> [ 4555.979287] Oops: 0010 [#1] SMP
> [ 4555.982809] Modules linked in: rpcrdma ib_isert iscsi_target_mod
> target_core_mod ib_iser libiscsi scsi_transport_iscsi ib_ipoib rdma_ucm ib_ucm
> ib_uverbs ib_umad rdma_cm ib_cm iw_cm dm_mirror dm_region_hash dm_log dm_mod
> dax sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm irqbypass
> crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel crypto_simd
> glue_helper cryptd hfi1 rdmavt iTCO_wdt iTCO_vendor_support ib_core mei_me
> lpc_ich pcspkr mei ioatdma sg shpchp i2c_i801 mfd_core wmi ipmi_si ipmi_devintf
> ipmi_msghandler acpi_power_meter acpi_pad nfsd auth_rpcgss nfs_acl lockd grace
> sunrpc ip_tables ext4 mbcache jbd2 sd_mod mgag200 drm_kms_helper syscopyarea
> sysfillrect sysimgblt fb_sys_fops ttm igb ahci crc32c_intel ptp libahci
> pps_core drm dca libata i2c_algo_bit i2c_core
> [ 4556.061190] CPU: 54 PID: 9841 Comm: ibacm Tainted: G          I
> 4.14.0-rc2+ #6
> [ 4556.069667] Hardware name: Intel Corporation S2600WT2/S2600WT2, BIOS
> SE5C610.86B.01.01.0008.021120151325 02/11/2015
> [ 4556.081339] task: ffff880855f42d00 task.stack: ffffc900246b4000
> [ 4556.087967] RIP: 0010:          (null)
> [ 4556.092166] RSP: 0018:ffffc900246b7bc8 EFLAGS: 00010246
> [ 4556.098018] RAX: ffffffff81dbe9e0 RBX: ffff881058bb1000 RCX:
> 0000000000000000
> [ 4556.105997] RDX: 0000000000001100 RSI: ffff881058bb1320 RDI:
> ffff881056362000
> [ 4556.113984] RBP: ffffc900246b7bf8 R08: 0000000000000ec0 R09:
> 0000000000001100
> [ 4556.121971] R10: ffff8810573a5000 R11: 0000000000000000 R12:
> ffff881056362000
> [ 4556.129957] R13: 0000000000000ec0 R14: ffff881058bb1320 R15:
> 0000000000000ec0
> [ 4556.137945] FS:  00007fe0ba5a38c0(0000) GS:ffff88105f080000(0000)
> knlGS:0000000000000000
> [ 4556.147000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 4556.153433] CR2: 0000000000000000 CR3: 0000001056f5d003 CR4:
> 00000000001606e0
> [ 4556.161419] Call Trace:
> [ 4556.164167]  ? netlink_dump+0x12c/0x290
> [ 4556.168468]  __netlink_dump_start+0x186/0x1f0
> [ 4556.173357]  rdma_nl_rcv_msg+0x193/0x1b0 [ib_core]
> [ 4556.178724]  rdma_nl_rcv+0xdc/0x130 [ib_core]
> [ 4556.183604]  netlink_unicast+0x181/0x240
> [ 4556.187998]  netlink_sendmsg+0x2c2/0x3b0
> [ 4556.192392]  sock_sendmsg+0x38/0x50
> [ 4556.196299]  SYSC_sendto+0x102/0x190
> [ 4556.200308]  ? __audit_syscall_entry+0xaf/0x100
> [ 4556.205387]  ? syscall_trace_enter+0x1d0/0x2b0
> [ 4556.210366]  ? __audit_syscall_exit+0x209/0x290
> [ 4556.215442]  SyS_sendto+0xe/0x10
> [ 4556.219060]  do_syscall_64+0x67/0x1b0
> [ 4556.223165]  entry_SYSCALL64_slow_path+0x25/0x25
> [ 4556.228328] RIP: 0033:0x7fe0b9db2a63
> [ 4556.232333] RSP: 002b:00007ffc55edc260 EFLAGS: 00000293 ORIG_RAX:
> 000000000000002c
> [ 4556.240808] RAX: ffffffffffffffda RBX: 0000000000000010 RCX:
> 00007fe0b9db2a63
> [ 4556.248796] RDX: 0000000000000010 RSI: 00007ffc55edc280 RDI:
> 000000000000000d
> [ 4556.256782] RBP: 00007ffc55edc670 R08: 00007ffc55edc270 R09:
> 000000000000000c
> [ 4556.265321] R10: 0000000000000000 R11: 0000000000000293 R12:
> 00007ffc55edc280
> [ 4556.273846] R13: 000000000260b400 R14: 000000000000000d R15:
> 0000000000000001
> [ 4556.282368] Code:  Bad RIP value.
> [ 4556.286629] RIP:           (null) RSP: ffffc900246b7bc8
> [ 4556.293013] CR2: 0000000000000000
> [ 4556.297292] ---[ end trace 8d67abcfd10ec209 ]---
> [ 4556.305465] Kernel panic - not syncing: Fatal exception
> [ 4556.313786] Kernel Offset: disabled
> [ 4556.321563] ---[ end Kernel panic - not syncing: Fatal exception
> [ 4556.328960] ------------[ cut here ]------------
>
> Special case RDMA_NL_LS response messages to call the appropriate
> callback.
>
> Additionally, make sure that the .dump() callback is not NULL
> before calling it.

Please don't add them here, it is dead code in wrong place.
First it is unachievable paths, and second they need to be checked
in is_nl_valid() function.

Thanks.
Shiraz Saleem Oct. 24, 2017, 2:42 p.m. UTC | #2
On Tue, Oct 24, 2017 at 08:41:01AM -0400, Michael J. Ruhl wrote:
> From: Michael J. Ruhl <michael.j.ruhl@intel.com>
> 
> ---
>  drivers/infiniband/core/netlink.c |   13 ++++++++++++-
>  1 files changed, 12 insertions(+), 1 deletions(-)
> 
> diff --git a/drivers/infiniband/core/netlink.c b/drivers/infiniband/core/netlink.c
> index b12e587..1fb72c3 100644
> --- a/drivers/infiniband/core/netlink.c
> +++ b/drivers/infiniband/core/netlink.c
> @@ -175,13 +175,24 @@ static int rdma_nl_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh,
>  	    !netlink_capable(skb, CAP_NET_ADMIN))
>  		return -EPERM;
>  
> +	/*
> +	 * LS responses overload the 0x100 (NLM_F_ROOT) flag.  Don't
> +	 * mistakenly call the .dump() function.
> +	 */
> +	if (index == RDMA_NL_LS) {
> +		if (cb_table[op].doit)
> +			return cb_table[op].doit(skb, nlh, extack);
> +		return -EINVAL;
> +	}
>  	/* FIXME: Convert IWCM to properly handle doit callbacks */
>  	if ((nlh->nlmsg_flags & NLM_F_DUMP) || index == RDMA_NL_RDMA_CM ||
>  	    index == RDMA_NL_IWCM) {
>  		struct netlink_dump_control c = {
>  			.dump = cb_table[op].dump,
>  		};
> -		return netlink_dump_start(nls, skb, nlh, &c);
> +		if (c.dump)
> +			return netlink_dump_start(nls, skb, nlh, &c);
> +		return -EINVAL;
>  	}
>  
>  	if (cb_table[op].doit)
>

Do you neccessarily need the non-null checks for cb_table[op].doit and c.dump?

Otherwise, looks good.

Reviewed-by: Shiraz Saleem <shiraz.saleem@intel.com> 
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Michael J. Ruhl Oct. 24, 2017, 2:52 p.m. UTC | #3
> -----Original Message-----
> From: Leon Romanovsky [mailto:leon@kernel.org]
> Sent: Tuesday, October 24, 2017 10:42 AM
> To: Ruhl, Michael J <michael.j.ruhl@intel.com>
> Cc: linux-rdma@vger.kernel.org
> Subject: Re: [PATCH] RDMA/netlink: OOPs in rdma_nl_rcv_msg() from
> misinterpreted flag
> 
> On Tue, Oct 24, 2017 at 08:41:01AM -0400, Michael J. Ruhl wrote:
> > From: Michael J. Ruhl <michael.j.ruhl@intel.com>
> >
> > rdma_nl_rcv_msg() checks to see if it should use the .dump() callback
> > or the .doit() callback.  The check is done with this check:
> >
> > if (flags & NLM_F_DUMP) ...
> >
> > The NLM_F_DUMP flag is two bits (NLM_F_ROOT | NLM_F_MATCH).
> >
> > When an RDMA_NL_LS message (response) is received, the bit used for
> > indicating an error is the same bit as NLM_F_ROOT.
> >
> > NLM_F_ROOT == (0x100) == RDMA_NL_LS_F_ERR.
> >
> > ibacm sends a response with the RDMA_NL_LS_F_ERR bit set if an error
> > occurs in the service.  The current code then misinterprets the
> > NLM_F_DUMP bit and trys to call the .dump() callback.
> >
> > If the .dump() callback for the specified request is not available
> > (which is true for the RDMA_NL_LS messages) the following Oops occurs:
> >
> > [ 4555.960256] BUG: unable to handle kernel NULL pointer dereference at
> >    (null)
> > [ 4555.969046] IP:           (null)
> > [ 4555.972664] PGD 10543f1067 P4D 10543f1067 PUD 1033f93067 PMD 0
> > [ 4555.979287] Oops: 0010 [#1] SMP
> > [ 4555.982809] Modules linked in: rpcrdma ib_isert iscsi_target_mod
> > target_core_mod ib_iser libiscsi scsi_transport_iscsi ib_ipoib rdma_ucm
> ib_ucm
> > ib_uverbs ib_umad rdma_cm ib_cm iw_cm dm_mirror dm_region_hash
> dm_log dm_mod
> > dax sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm
> irqbypass
> > crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel
> crypto_simd
> > glue_helper cryptd hfi1 rdmavt iTCO_wdt iTCO_vendor_support ib_core
> mei_me
> > lpc_ich pcspkr mei ioatdma sg shpchp i2c_i801 mfd_core wmi ipmi_si
> ipmi_devintf
> > ipmi_msghandler acpi_power_meter acpi_pad nfsd auth_rpcgss nfs_acl lockd
> grace
> > sunrpc ip_tables ext4 mbcache jbd2 sd_mod mgag200 drm_kms_helper
> syscopyarea
> > sysfillrect sysimgblt fb_sys_fops ttm igb ahci crc32c_intel ptp libahci
> > pps_core drm dca libata i2c_algo_bit i2c_core
> > [ 4556.061190] CPU: 54 PID: 9841 Comm: ibacm Tainted: G          I
> > 4.14.0-rc2+ #6
> > [ 4556.069667] Hardware name: Intel Corporation S2600WT2/S2600WT2,
> BIOS
> > SE5C610.86B.01.01.0008.021120151325 02/11/2015
> > [ 4556.081339] task: ffff880855f42d00 task.stack: ffffc900246b4000
> > [ 4556.087967] RIP: 0010:          (null)
> > [ 4556.092166] RSP: 0018:ffffc900246b7bc8 EFLAGS: 00010246
> > [ 4556.098018] RAX: ffffffff81dbe9e0 RBX: ffff881058bb1000 RCX:
> > 0000000000000000
> > [ 4556.105997] RDX: 0000000000001100 RSI: ffff881058bb1320 RDI:
> > ffff881056362000
> > [ 4556.113984] RBP: ffffc900246b7bf8 R08: 0000000000000ec0 R09:
> > 0000000000001100
> > [ 4556.121971] R10: ffff8810573a5000 R11: 0000000000000000 R12:
> > ffff881056362000
> > [ 4556.129957] R13: 0000000000000ec0 R14: ffff881058bb1320 R15:
> > 0000000000000ec0
> > [ 4556.137945] FS:  00007fe0ba5a38c0(0000) GS:ffff88105f080000(0000)
> > knlGS:0000000000000000
> > [ 4556.147000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [ 4556.153433] CR2: 0000000000000000 CR3: 0000001056f5d003 CR4:
> > 00000000001606e0
> > [ 4556.161419] Call Trace:
> > [ 4556.164167]  ? netlink_dump+0x12c/0x290
> > [ 4556.168468]  __netlink_dump_start+0x186/0x1f0
> > [ 4556.173357]  rdma_nl_rcv_msg+0x193/0x1b0 [ib_core]
> > [ 4556.178724]  rdma_nl_rcv+0xdc/0x130 [ib_core]
> > [ 4556.183604]  netlink_unicast+0x181/0x240
> > [ 4556.187998]  netlink_sendmsg+0x2c2/0x3b0
> > [ 4556.192392]  sock_sendmsg+0x38/0x50
> > [ 4556.196299]  SYSC_sendto+0x102/0x190
> > [ 4556.200308]  ? __audit_syscall_entry+0xaf/0x100
> > [ 4556.205387]  ? syscall_trace_enter+0x1d0/0x2b0
> > [ 4556.210366]  ? __audit_syscall_exit+0x209/0x290
> > [ 4556.215442]  SyS_sendto+0xe/0x10
> > [ 4556.219060]  do_syscall_64+0x67/0x1b0
> > [ 4556.223165]  entry_SYSCALL64_slow_path+0x25/0x25
> > [ 4556.228328] RIP: 0033:0x7fe0b9db2a63
> > [ 4556.232333] RSP: 002b:00007ffc55edc260 EFLAGS: 00000293 ORIG_RAX:
> > 000000000000002c
> > [ 4556.240808] RAX: ffffffffffffffda RBX: 0000000000000010 RCX:
> > 00007fe0b9db2a63
> > [ 4556.248796] RDX: 0000000000000010 RSI: 00007ffc55edc280 RDI:
> > 000000000000000d
> > [ 4556.256782] RBP: 00007ffc55edc670 R08: 00007ffc55edc270 R09:
> > 000000000000000c
> > [ 4556.265321] R10: 0000000000000000 R11: 0000000000000293 R12:
> > 00007ffc55edc280
> > [ 4556.273846] R13: 000000000260b400 R14: 000000000000000d R15:
> > 0000000000000001
> > [ 4556.282368] Code:  Bad RIP value.
> > [ 4556.286629] RIP:           (null) RSP: ffffc900246b7bc8
> > [ 4556.293013] CR2: 0000000000000000
> > [ 4556.297292] ---[ end trace 8d67abcfd10ec209 ]---
> > [ 4556.305465] Kernel panic - not syncing: Fatal exception
> > [ 4556.313786] Kernel Offset: disabled
> > [ 4556.321563] ---[ end Kernel panic - not syncing: Fatal exception
> > [ 4556.328960] ------------[ cut here ]------------
> >
> > Special case RDMA_NL_LS response messages to call the appropriate
> > callback.
> >
> > Additionally, make sure that the .dump() callback is not NULL
> > before calling it.
> 
> Please don't add them here, it is dead code in wrong place.
> First it is unachievable paths, and second they need to be checked
> in is_nl_valid() function.

The check in is_nl_valid() is for .doit OR .done.  There is no context based on the index.
So if a specific index has one or the other is_nl_valid is true.

This is what caused the Oops (index only had .doit callbacks, but .dump path
was chosen).  So I don't think that these are unachievable paths.

The is_nl_valid() function is incomplete.  This shows that packets can be crafted
to cause the wrong path to be taken, and if there is no callback function we will
have the same issue.  Specific index validation is needed, and should probably be
addressed with a different patch.

Thanks,

M
> Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Leon Romanovsky Oct. 24, 2017, 3:19 p.m. UTC | #4
On Tue, Oct 24, 2017 at 02:52:31PM +0000, Ruhl, Michael J wrote:
> > -----Original Message-----
> > From: Leon Romanovsky [mailto:leon@kernel.org]
> > Sent: Tuesday, October 24, 2017 10:42 AM
> > To: Ruhl, Michael J <michael.j.ruhl@intel.com>
> > Cc: linux-rdma@vger.kernel.org
> > Subject: Re: [PATCH] RDMA/netlink: OOPs in rdma_nl_rcv_msg() from
> > misinterpreted flag
> >
> > On Tue, Oct 24, 2017 at 08:41:01AM -0400, Michael J. Ruhl wrote:
> > > From: Michael J. Ruhl <michael.j.ruhl@intel.com>
> > >
> > > rdma_nl_rcv_msg() checks to see if it should use the .dump() callback
> > > or the .doit() callback.  The check is done with this check:
> > >
> > > if (flags & NLM_F_DUMP) ...
> > >
> > > The NLM_F_DUMP flag is two bits (NLM_F_ROOT | NLM_F_MATCH).
> > >
> > > When an RDMA_NL_LS message (response) is received, the bit used for
> > > indicating an error is the same bit as NLM_F_ROOT.
> > >
> > > NLM_F_ROOT == (0x100) == RDMA_NL_LS_F_ERR.
> > >
> > > ibacm sends a response with the RDMA_NL_LS_F_ERR bit set if an error
> > > occurs in the service.  The current code then misinterprets the
> > > NLM_F_DUMP bit and trys to call the .dump() callback.
> > >
> > > If the .dump() callback for the specified request is not available
> > > (which is true for the RDMA_NL_LS messages) the following Oops occurs:
> > >
> > > [ 4555.960256] BUG: unable to handle kernel NULL pointer dereference at
> > >    (null)
> > > [ 4555.969046] IP:           (null)
> > > [ 4555.972664] PGD 10543f1067 P4D 10543f1067 PUD 1033f93067 PMD 0
> > > [ 4555.979287] Oops: 0010 [#1] SMP
> > > [ 4555.982809] Modules linked in: rpcrdma ib_isert iscsi_target_mod
> > > target_core_mod ib_iser libiscsi scsi_transport_iscsi ib_ipoib rdma_ucm
> > ib_ucm
> > > ib_uverbs ib_umad rdma_cm ib_cm iw_cm dm_mirror dm_region_hash
> > dm_log dm_mod
> > > dax sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm
> > irqbypass
> > > crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel
> > crypto_simd
> > > glue_helper cryptd hfi1 rdmavt iTCO_wdt iTCO_vendor_support ib_core
> > mei_me
> > > lpc_ich pcspkr mei ioatdma sg shpchp i2c_i801 mfd_core wmi ipmi_si
> > ipmi_devintf
> > > ipmi_msghandler acpi_power_meter acpi_pad nfsd auth_rpcgss nfs_acl lockd
> > grace
> > > sunrpc ip_tables ext4 mbcache jbd2 sd_mod mgag200 drm_kms_helper
> > syscopyarea
> > > sysfillrect sysimgblt fb_sys_fops ttm igb ahci crc32c_intel ptp libahci
> > > pps_core drm dca libata i2c_algo_bit i2c_core
> > > [ 4556.061190] CPU: 54 PID: 9841 Comm: ibacm Tainted: G          I
> > > 4.14.0-rc2+ #6
> > > [ 4556.069667] Hardware name: Intel Corporation S2600WT2/S2600WT2,
> > BIOS
> > > SE5C610.86B.01.01.0008.021120151325 02/11/2015
> > > [ 4556.081339] task: ffff880855f42d00 task.stack: ffffc900246b4000
> > > [ 4556.087967] RIP: 0010:          (null)
> > > [ 4556.092166] RSP: 0018:ffffc900246b7bc8 EFLAGS: 00010246
> > > [ 4556.098018] RAX: ffffffff81dbe9e0 RBX: ffff881058bb1000 RCX:
> > > 0000000000000000
> > > [ 4556.105997] RDX: 0000000000001100 RSI: ffff881058bb1320 RDI:
> > > ffff881056362000
> > > [ 4556.113984] RBP: ffffc900246b7bf8 R08: 0000000000000ec0 R09:
> > > 0000000000001100
> > > [ 4556.121971] R10: ffff8810573a5000 R11: 0000000000000000 R12:
> > > ffff881056362000
> > > [ 4556.129957] R13: 0000000000000ec0 R14: ffff881058bb1320 R15:
> > > 0000000000000ec0
> > > [ 4556.137945] FS:  00007fe0ba5a38c0(0000) GS:ffff88105f080000(0000)
> > > knlGS:0000000000000000
> > > [ 4556.147000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > [ 4556.153433] CR2: 0000000000000000 CR3: 0000001056f5d003 CR4:
> > > 00000000001606e0
> > > [ 4556.161419] Call Trace:
> > > [ 4556.164167]  ? netlink_dump+0x12c/0x290
> > > [ 4556.168468]  __netlink_dump_start+0x186/0x1f0
> > > [ 4556.173357]  rdma_nl_rcv_msg+0x193/0x1b0 [ib_core]
> > > [ 4556.178724]  rdma_nl_rcv+0xdc/0x130 [ib_core]
> > > [ 4556.183604]  netlink_unicast+0x181/0x240
> > > [ 4556.187998]  netlink_sendmsg+0x2c2/0x3b0
> > > [ 4556.192392]  sock_sendmsg+0x38/0x50
> > > [ 4556.196299]  SYSC_sendto+0x102/0x190
> > > [ 4556.200308]  ? __audit_syscall_entry+0xaf/0x100
> > > [ 4556.205387]  ? syscall_trace_enter+0x1d0/0x2b0
> > > [ 4556.210366]  ? __audit_syscall_exit+0x209/0x290
> > > [ 4556.215442]  SyS_sendto+0xe/0x10
> > > [ 4556.219060]  do_syscall_64+0x67/0x1b0
> > > [ 4556.223165]  entry_SYSCALL64_slow_path+0x25/0x25
> > > [ 4556.228328] RIP: 0033:0x7fe0b9db2a63
> > > [ 4556.232333] RSP: 002b:00007ffc55edc260 EFLAGS: 00000293 ORIG_RAX:
> > > 000000000000002c
> > > [ 4556.240808] RAX: ffffffffffffffda RBX: 0000000000000010 RCX:
> > > 00007fe0b9db2a63
> > > [ 4556.248796] RDX: 0000000000000010 RSI: 00007ffc55edc280 RDI:
> > > 000000000000000d
> > > [ 4556.256782] RBP: 00007ffc55edc670 R08: 00007ffc55edc270 R09:
> > > 000000000000000c
> > > [ 4556.265321] R10: 0000000000000000 R11: 0000000000000293 R12:
> > > 00007ffc55edc280
> > > [ 4556.273846] R13: 000000000260b400 R14: 000000000000000d R15:
> > > 0000000000000001
> > > [ 4556.282368] Code:  Bad RIP value.
> > > [ 4556.286629] RIP:           (null) RSP: ffffc900246b7bc8
> > > [ 4556.293013] CR2: 0000000000000000
> > > [ 4556.297292] ---[ end trace 8d67abcfd10ec209 ]---
> > > [ 4556.305465] Kernel panic - not syncing: Fatal exception
> > > [ 4556.313786] Kernel Offset: disabled
> > > [ 4556.321563] ---[ end Kernel panic - not syncing: Fatal exception
> > > [ 4556.328960] ------------[ cut here ]------------
> > >
> > > Special case RDMA_NL_LS response messages to call the appropriate
> > > callback.
> > >
> > > Additionally, make sure that the .dump() callback is not NULL
> > > before calling it.
> >
> > Please don't add them here, it is dead code in wrong place.
> > First it is unachievable paths, and second they need to be checked
> > in is_nl_valid() function.
>
> The check in is_nl_valid() is for .doit OR .done.  There is no context based on the index.
> So if a specific index has one or the other is_nl_valid is true.
>
> This is what caused the Oops (index only had .doit callbacks, but .dump path
> was chosen).  So I don't think that these are unachievable paths.
>
> The is_nl_valid() function is incomplete.  This shows that packets can be crafted
> to cause the wrong path to be taken, and if there is no callback function we will
> have the same issue.  Specific index validation is needed, and should probably be
> addressed with a different patch.

.doit exists for RDMA_NL_LS and for other .dumpit exists, so no actual
check is needed.

if you want to be on the safe side, please extend is_nl_valid(). IMHO,
it will keep code cleaner, the checks in one localized place and execution
is not mixed with the validity checks.

Thanks

>
> Thanks,
>
> M
> > Thanks.
Michael J. Ruhl Oct. 24, 2017, 3:42 p.m. UTC | #5
> -----Original Message-----
> From: Leon Romanovsky [mailto:leon@kernel.org]
> Sent: Tuesday, October 24, 2017 11:20 AM
> To: Ruhl, Michael J <michael.j.ruhl@intel.com>
> Cc: linux-rdma@vger.kernel.org
> Subject: Re: [PATCH] RDMA/netlink: OOPs in rdma_nl_rcv_msg() from
> misinterpreted flag
> 
> On Tue, Oct 24, 2017 at 02:52:31PM +0000, Ruhl, Michael J wrote:
> > > -----Original Message-----
> > > From: Leon Romanovsky [mailto:leon@kernel.org]
> > > Sent: Tuesday, October 24, 2017 10:42 AM
> > > To: Ruhl, Michael J <michael.j.ruhl@intel.com>
> > > Cc: linux-rdma@vger.kernel.org
> > > Subject: Re: [PATCH] RDMA/netlink: OOPs in rdma_nl_rcv_msg() from
> > > misinterpreted flag
> > >
> > > On Tue, Oct 24, 2017 at 08:41:01AM -0400, Michael J. Ruhl wrote:
> > > > From: Michael J. Ruhl <michael.j.ruhl@intel.com>
> > > >
> > > > rdma_nl_rcv_msg() checks to see if it should use the .dump() callback
> > > > or the .doit() callback.  The check is done with this check:
> > > >
> > > > if (flags & NLM_F_DUMP) ...
> > > >
> > > > The NLM_F_DUMP flag is two bits (NLM_F_ROOT | NLM_F_MATCH).
> > > >
> > > > When an RDMA_NL_LS message (response) is received, the bit used for
> > > > indicating an error is the same bit as NLM_F_ROOT.
> > > >
> > > > NLM_F_ROOT == (0x100) == RDMA_NL_LS_F_ERR.
> > > >
> > > > ibacm sends a response with the RDMA_NL_LS_F_ERR bit set if an error
> > > > occurs in the service.  The current code then misinterprets the
> > > > NLM_F_DUMP bit and trys to call the .dump() callback.
> > > >
> > > > If the .dump() callback for the specified request is not available
> > > > (which is true for the RDMA_NL_LS messages) the following Oops occurs:
> > > >
> > > > [ 4555.960256] BUG: unable to handle kernel NULL pointer dereference at
> > > >    (null)
> > > > [ 4555.969046] IP:           (null)
> > > > [ 4555.972664] PGD 10543f1067 P4D 10543f1067 PUD 1033f93067 PMD 0
> > > > [ 4555.979287] Oops: 0010 [#1] SMP
> > > > [ 4555.982809] Modules linked in: rpcrdma ib_isert iscsi_target_mod
> > > > target_core_mod ib_iser libiscsi scsi_transport_iscsi ib_ipoib rdma_ucm
> > > ib_ucm
> > > > ib_uverbs ib_umad rdma_cm ib_cm iw_cm dm_mirror dm_region_hash
> > > dm_log dm_mod
> > > > dax sb_edac x86_pkg_temp_thermal intel_powerclamp coretemp kvm
> > > irqbypass
> > > > crct10dif_pclmul crc32_pclmul ghash_clmulni_intel pcbc aesni_intel
> > > crypto_simd
> > > > glue_helper cryptd hfi1 rdmavt iTCO_wdt iTCO_vendor_support ib_core
> > > mei_me
> > > > lpc_ich pcspkr mei ioatdma sg shpchp i2c_i801 mfd_core wmi ipmi_si
> > > ipmi_devintf
> > > > ipmi_msghandler acpi_power_meter acpi_pad nfsd auth_rpcgss nfs_acl
> lockd
> > > grace
> > > > sunrpc ip_tables ext4 mbcache jbd2 sd_mod mgag200 drm_kms_helper
> > > syscopyarea
> > > > sysfillrect sysimgblt fb_sys_fops ttm igb ahci crc32c_intel ptp libahci
> > > > pps_core drm dca libata i2c_algo_bit i2c_core
> > > > [ 4556.061190] CPU: 54 PID: 9841 Comm: ibacm Tainted: G          I
> > > > 4.14.0-rc2+ #6
> > > > [ 4556.069667] Hardware name: Intel Corporation S2600WT2/S2600WT2,
> > > BIOS
> > > > SE5C610.86B.01.01.0008.021120151325 02/11/2015
> > > > [ 4556.081339] task: ffff880855f42d00 task.stack: ffffc900246b4000
> > > > [ 4556.087967] RIP: 0010:          (null)
> > > > [ 4556.092166] RSP: 0018:ffffc900246b7bc8 EFLAGS: 00010246
> > > > [ 4556.098018] RAX: ffffffff81dbe9e0 RBX: ffff881058bb1000 RCX:
> > > > 0000000000000000
> > > > [ 4556.105997] RDX: 0000000000001100 RSI: ffff881058bb1320 RDI:
> > > > ffff881056362000
> > > > [ 4556.113984] RBP: ffffc900246b7bf8 R08: 0000000000000ec0 R09:
> > > > 0000000000001100
> > > > [ 4556.121971] R10: ffff8810573a5000 R11: 0000000000000000 R12:
> > > > ffff881056362000
> > > > [ 4556.129957] R13: 0000000000000ec0 R14: ffff881058bb1320 R15:
> > > > 0000000000000ec0
> > > > [ 4556.137945] FS:  00007fe0ba5a38c0(0000) GS:ffff88105f080000(0000)
> > > > knlGS:0000000000000000
> > > > [ 4556.147000] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > > > [ 4556.153433] CR2: 0000000000000000 CR3: 0000001056f5d003 CR4:
> > > > 00000000001606e0
> > > > [ 4556.161419] Call Trace:
> > > > [ 4556.164167]  ? netlink_dump+0x12c/0x290
> > > > [ 4556.168468]  __netlink_dump_start+0x186/0x1f0
> > > > [ 4556.173357]  rdma_nl_rcv_msg+0x193/0x1b0 [ib_core]
> > > > [ 4556.178724]  rdma_nl_rcv+0xdc/0x130 [ib_core]
> > > > [ 4556.183604]  netlink_unicast+0x181/0x240
> > > > [ 4556.187998]  netlink_sendmsg+0x2c2/0x3b0
> > > > [ 4556.192392]  sock_sendmsg+0x38/0x50
> > > > [ 4556.196299]  SYSC_sendto+0x102/0x190
> > > > [ 4556.200308]  ? __audit_syscall_entry+0xaf/0x100
> > > > [ 4556.205387]  ? syscall_trace_enter+0x1d0/0x2b0
> > > > [ 4556.210366]  ? __audit_syscall_exit+0x209/0x290
> > > > [ 4556.215442]  SyS_sendto+0xe/0x10
> > > > [ 4556.219060]  do_syscall_64+0x67/0x1b0
> > > > [ 4556.223165]  entry_SYSCALL64_slow_path+0x25/0x25
> > > > [ 4556.228328] RIP: 0033:0x7fe0b9db2a63
> > > > [ 4556.232333] RSP: 002b:00007ffc55edc260 EFLAGS: 00000293
> ORIG_RAX:
> > > > 000000000000002c
> > > > [ 4556.240808] RAX: ffffffffffffffda RBX: 0000000000000010 RCX:
> > > > 00007fe0b9db2a63
> > > > [ 4556.248796] RDX: 0000000000000010 RSI: 00007ffc55edc280 RDI:
> > > > 000000000000000d
> > > > [ 4556.256782] RBP: 00007ffc55edc670 R08: 00007ffc55edc270 R09:
> > > > 000000000000000c
> > > > [ 4556.265321] R10: 0000000000000000 R11: 0000000000000293 R12:
> > > > 00007ffc55edc280
> > > > [ 4556.273846] R13: 000000000260b400 R14: 000000000000000d R15:
> > > > 0000000000000001
> > > > [ 4556.282368] Code:  Bad RIP value.
> > > > [ 4556.286629] RIP:           (null) RSP: ffffc900246b7bc8
> > > > [ 4556.293013] CR2: 0000000000000000
> > > > [ 4556.297292] ---[ end trace 8d67abcfd10ec209 ]---
> > > > [ 4556.305465] Kernel panic - not syncing: Fatal exception
> > > > [ 4556.313786] Kernel Offset: disabled
> > > > [ 4556.321563] ---[ end Kernel panic - not syncing: Fatal exception
> > > > [ 4556.328960] ------------[ cut here ]------------
> > > >
> > > > Special case RDMA_NL_LS response messages to call the appropriate
> > > > callback.
> > > >
> > > > Additionally, make sure that the .dump() callback is not NULL
> > > > before calling it.
> > >
> > > Please don't add them here, it is dead code in wrong place.
> > > First it is unachievable paths, and second they need to be checked
> > > in is_nl_valid() function.
> >
> > The check in is_nl_valid() is for .doit OR .done.  There is no context based on
> the index.
> > So if a specific index has one or the other is_nl_valid is true.
> >
> > This is what caused the Oops (index only had .doit callbacks, but .dump path
> > was chosen).  So I don't think that these are unachievable paths.
> >
> > The is_nl_valid() function is incomplete.  This shows that packets can be
> crafted
> > to cause the wrong path to be taken, and if there is no callback function we
> will
> > have the same issue.  Specific index validation is needed, and should probably
> be
> > addressed with a different patch.
> 
> .doit exists for RDMA_NL_LS and for other .dumpit exists, so no actual
> check is needed.
> 
> if you want to be on the safe side, please extend is_nl_valid(). IMHO,
> it will keep code cleaner, the checks in one localized place and execution
> is not mixed with the validity checks.

Leon,

I do not disagree with you on this.  My patch is to fix the Oops, and make
sure it can't happen with the current code.

Extending is_nl_valid() to be more complete seems to be out of scope for
the current issue of making the Oops go away.

I can probably work on is_nl_valid(), but I will need much more time to make sure
that it is correct, and don't feel that I can get it done with in the appropriate
time frame (i.e. sometime this week).

Mike

 
> Thanks
> 
> >
> > Thanks,
> >
> > M
> > > Thanks.
--
To unsubscribe from this list: send the line "unsubscribe linux-rdma" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Doug Ledford Oct. 24, 2017, 4:31 p.m. UTC | #6
On Tue, 2017-10-24 at 08:41 -0400, Michael J. Ruhl wrote:
> From: Michael J. Ruhl <michael.j.ruhl@intel.com>
> 
> rdma_nl_rcv_msg() checks to see if it should use the .dump() callback
> or the .doit() callback.  The check is done with this check:
> 
> if (flags & NLM_F_DUMP) ...
> 
> The NLM_F_DUMP flag is two bits (NLM_F_ROOT | NLM_F_MATCH).
> 
> When an RDMA_NL_LS message (response) is received, the bit used for
> indicating an error is the same bit as NLM_F_ROOT.
> 
> NLM_F_ROOT == (0x100) == RDMA_NL_LS_F_ERR.

What are the remaining flags in the failing error case?

Or to be more specific,

> 
>  	/* FIXME: Convert IWCM to properly handle doit callbacks */
>  	if ((nlh->nlmsg_flags & NLM_F_DUMP)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
This test is technically faulty.  Since NLM_F_DUMP is a multi-bit flag,
it must be ((nlh->nlmsg_flags & NLM_F_DUMP) == NLM_F_DUMP) to be
technically correct.  So, my question then becomes, if we correct this
test, will the RDMA_NL_LS_F_ERR return message still trigger this wrong
path?  I'd rather have a technically correct fix to this if statement
than a special case of the index value if possible.
Doug Ledford Oct. 25, 2017, 6:57 p.m. UTC | #7
On Tue, 2017-10-24 at 15:42 +0000, Ruhl, Michael J wrote:
> > > have the same issue.  Specific index validation is needed, and
> > > should probably
> > 
> > be
> > > addressed with a different patch.
> > 
> > .doit exists for RDMA_NL_LS and for other .dumpit exists, so no
> > actual
> > check is needed.
> > 
> > if you want to be on the safe side, please extend is_nl_valid().
> > IMHO,
> > it will keep code cleaner, the checks in one localized place and
> > execution
> > is not mixed with the validity checks.
> 
> Leon,
> 
> I do not disagree with you on this.  My patch is to fix the Oops, and
> make
> sure it can't happen with the current code.
> 
> Extending is_nl_valid() to be more complete seems to be out of scope
> for
> the current issue of making the Oops go away.
> 
> I can probably work on is_nl_valid(), but I will need much more time
> to make sure
> that it is correct, and don't feel that I can get it done with in the
> appropriate
> time frame (i.e. sometime this week).

I agree with Michael that tweaking this code to be optimal versus
plainly failsafe is a patch for another day.  I've taken this one. 
Thanks Michael.
Leon Romanovsky Oct. 25, 2017, 7:06 p.m. UTC | #8
On Wed, Oct 25, 2017 at 02:57:20PM -0400, Doug Ledford wrote:
> On Tue, 2017-10-24 at 15:42 +0000, Ruhl, Michael J wrote:
> > > > have the same issue.  Specific index validation is needed, and
> > > > should probably
> > >
> > > be
> > > > addressed with a different patch.
> > >
> > > .doit exists for RDMA_NL_LS and for other .dumpit exists, so no
> > > actual
> > > check is needed.
> > >
> > > if you want to be on the safe side, please extend is_nl_valid().
> > > IMHO,
> > > it will keep code cleaner, the checks in one localized place and
> > > execution
> > > is not mixed with the validity checks.
> >
> > Leon,
> >
> > I do not disagree with you on this.  My patch is to fix the Oops, and
> > make
> > sure it can't happen with the current code.
> >
> > Extending is_nl_valid() to be more complete seems to be out of scope
> > for
> > the current issue of making the Oops go away.
> >
> > I can probably work on is_nl_valid(), but I will need much more time
> > to make sure
> > that it is correct, and don't feel that I can get it done with in the
> > appropriate
> > time frame (i.e. sometime this week).
>
> I agree with Michael that tweaking this code to be optimal versus
> plainly failsafe is a patch for another day.  I've taken this one.
> Thanks Michael.

Too bad, you actually suggested very good solution, it is unclear who
will now implement it.

Thanks

>
> --
> Doug Ledford <dledford@redhat.com>
>     GPG KeyID: B826A3330E572FDD
>     Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
>
Doug Ledford Oct. 25, 2017, 7:17 p.m. UTC | #9
On Wed, 2017-10-25 at 22:06 +0300, Leon Romanovsky wrote:
> > I agree with Michael that tweaking this code to be optimal versus
> > plainly failsafe is a patch for another day.  I've taken this one.
> > Thanks Michael.
> 
> Too bad, you actually suggested very good solution, it is unclear who
> will now implement it.

I may do it myself.  I've been looking at the code and I don't like the
if construct because I don't feel it conveys the purpose of the code
clearly.  Implicit in the if construct is embedded knowledge about
which providers do what.  I'd rather have a switch construct on the
index with the flow split to those that have only .dump, those that
have only .doit, and those that have both.  The current if doesn't make
it clear that this is how the netlink services are structured.
Leon Romanovsky Oct. 25, 2017, 7:32 p.m. UTC | #10
On Wed, Oct 25, 2017 at 03:17:28PM -0400, Doug Ledford wrote:
> On Wed, 2017-10-25 at 22:06 +0300, Leon Romanovsky wrote:
> > > I agree with Michael that tweaking this code to be optimal versus
> > > plainly failsafe is a patch for another day.  I've taken this one.
> > > Thanks Michael.
> >
> > Too bad, you actually suggested very good solution, it is unclear who
> > will now implement it.
>
> I may do it myself.  I've been looking at the code and I don't like the
> if construct because I don't feel it conveys the purpose of the code
> clearly.  Implicit in the if construct is embedded knowledge about
> which providers do what.  I'd rather have a switch construct on the
> index with the flow split to those that have only .dump, those that
> have only .doit, and those that have both.  The current if doesn't make
> it clear that this is how the netlink services are structured.

Most probably you are right, I just need to see the code to be sure.

The original "if" was outcome of my wrong expectation that someone will
help me to modernize IWCM. Such modernization was supposed to remove
nasty "|| index .. " checks.

So my final goal is to see the following code:
if(dump_flag)
	return do_dump

return do_doit

Thanks

>
> --
> Doug Ledford <dledford@redhat.com>
>     GPG KeyID: B826A3330E572FDD
>     Key fingerprint = AE6B 1BDA 122B 23B4 265B  1274 B826 A333 0E57 2FDD
>
diff mbox

Patch

diff --git a/drivers/infiniband/core/netlink.c b/drivers/infiniband/core/netlink.c
index b12e587..1fb72c3 100644
--- a/drivers/infiniband/core/netlink.c
+++ b/drivers/infiniband/core/netlink.c
@@ -175,13 +175,24 @@  static int rdma_nl_rcv_msg(struct sk_buff *skb, struct nlmsghdr *nlh,
 	    !netlink_capable(skb, CAP_NET_ADMIN))
 		return -EPERM;
 
+	/*
+	 * LS responses overload the 0x100 (NLM_F_ROOT) flag.  Don't
+	 * mistakenly call the .dump() function.
+	 */
+	if (index == RDMA_NL_LS) {
+		if (cb_table[op].doit)
+			return cb_table[op].doit(skb, nlh, extack);
+		return -EINVAL;
+	}
 	/* FIXME: Convert IWCM to properly handle doit callbacks */
 	if ((nlh->nlmsg_flags & NLM_F_DUMP) || index == RDMA_NL_RDMA_CM ||
 	    index == RDMA_NL_IWCM) {
 		struct netlink_dump_control c = {
 			.dump = cb_table[op].dump,
 		};
-		return netlink_dump_start(nls, skb, nlh, &c);
+		if (c.dump)
+			return netlink_dump_start(nls, skb, nlh, &c);
+		return -EINVAL;
 	}
 
 	if (cb_table[op].doit)