Message ID | 20220523055056.2078994-1-liuyacan@corp.netease.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 8c3b8dc5cc9bf6d273ebe18b16e2d6882bcfb36d |
Delegated to: | Netdev Maintainers |
Series | [net] net/smc: fix listen processing for SMC-Rv2 |
Hello:

This patch was applied to netdev/net.git (master)
by David S. Miller <davem@davemloft.net>:

On Mon, 23 May 2022 13:50:56 +0800 you wrote:
> From: liuyacan <liuyacan@corp.netease.com>
>
> In the process of checking whether RDMAv2 is available, the current
> implementation first sets ini->smcrv2.ib_dev_v2, and then allocates
> smc buf desc, but the latter may fail. Unfortunately, the caller
> will only check the former. In this case, a NULL pointer reference
> will occur in smc_clc_send_confirm_accept() when accessing
> conn->rmb_desc.
>
> [...]

Here is the summary with links:
  - [net] net/smc: fix listen processing for SMC-Rv2
    https://git.kernel.org/netdev/net/c/8c3b8dc5cc9b

You are awesome, thank you!
On 23/05/2022 07:50, liuyacan@corp.netease.com wrote:
> From: liuyacan <liuyacan@corp.netease.com>
>
> In the process of checking whether RDMAv2 is available, the current
> implementation first sets ini->smcrv2.ib_dev_v2, and then allocates
> smc buf desc, but the latter may fail. Unfortunately, the caller
> will only check the former. In this case, a NULL pointer reference
> will occur in smc_clc_send_confirm_accept() when accessing
> conn->rmb_desc.
>
> This patch does two things:
> 1. Use the return code to determine whether V2 is available.
> 2. If the return code is NODEV, continue to check whether V1 is
>    available.
>
> Fixes: e49300a6bf62 ("net/smc: add listen processing for SMC-Rv2")
> Signed-off-by: liuyacan <liuyacan@corp.netease.com>
> ---

I am not happy with this patch. You are right that this is a problem,
but the fix should be much simpler: set ini->smcrv2.ib_dev_v2 = NULL in
smc_find_rdma_v2_device_serv() after the not_found label, just like it is
done in a similar way for the ISM device in smc_find_ism_v1_device_serv().

Your patch changes many more things, and beside that you eliminated the calls
to smc_find_ism_store_rc() completely, which is not correct.

Since your patch was already applied (btw. 3:20 hours after you submitted it),
please revert it and resend. Thank you.
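To make the bug and the suggested one-line fix concrete, here is a minimal user-space model in C. It is only a sketch: `struct init_info`, `find_v2_device()`, `caller_thinks_v2_ok()` and the two flags are hypothetical stand-ins, not the kernel's definitions. The model only mirrors the pattern described in the thread, where the device pointer is set before a later step that may fail and the caller checks nothing but the pointer.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins for the kernel types; not the real definitions. */
struct ib_dev { int id; };

struct init_info {
	struct ib_dev *ib_dev_v2;	/* models ini->smcrv2.ib_dev_v2 */
};

struct ib_dev demo_dev = { 1 };

/* Models the V2 lookup: the device pointer is set first, then a later
 * step (buffer allocation, controlled by buf_alloc_ok) may fail.
 */
void find_v2_device(struct init_info *ini, int buf_alloc_ok,
		    int clear_on_failure)
{
	assert(ini != NULL);

	ini->ib_dev_v2 = &demo_dev;	/* side effect happens early */
	if (buf_alloc_ok)
		return;

	/* failure path, corresponding to the not_found label */
	if (clear_on_failure)
		ini->ib_dev_v2 = NULL;	/* the one-line fix suggested above */
}

/* Models the caller, which only looks at the pointer. */
int caller_thinks_v2_ok(const struct init_info *ini)
{
	assert(ini != NULL);
	return ini->ib_dev_v2 != NULL;
}
```

With `clear_on_failure` set to 0 the caller still sees a non-NULL pointer after a failed allocation, which is the situation that leads to the NULL dereference later on. With it set to 1 the stale pointer is gone and the caller falls through to the next check.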
> > From: liuyacan <liuyacan@corp.netease.com>
> >
> > In the process of checking whether RDMAv2 is available, the current
> > implementation first sets ini->smcrv2.ib_dev_v2, and then allocates
> > smc buf desc, but the latter may fail. Unfortunately, the caller
> > will only check the former. In this case, a NULL pointer reference
> > will occur in smc_clc_send_confirm_accept() when accessing
> > conn->rmb_desc.
> >
> > This patch does two things:
> > 1. Use the return code to determine whether V2 is available.
> > 2. If the return code is NODEV, continue to check whether V1 is
> >    available.
> >
> > Fixes: e49300a6bf62 ("net/smc: add listen processing for SMC-Rv2")
> > Signed-off-by: liuyacan <liuyacan@corp.netease.com>
> > ---
>
> I am not happy with this patch. You are right that this is a problem,
> but the fix should be much simpler: set ini->smcrv2.ib_dev_v2 = NULL in
> smc_find_rdma_v2_device_serv() after the not_found label, just like it is
> done in a similar way for the ISM device in smc_find_ism_v1_device_serv().
>
> Your patch changes many more things, and beside that you eliminated the calls
> to smc_find_ism_store_rc() completely, which is not correct.
>
> Since your patch was already applied (btw. 3:20 hours after you submitted it),
> please revert it and resend. Thank you.

I also have considered this way, one question is that do we need to do more roll
back work before V1 check?

Specifically, In smc_find_rdma_v2_device_serv(), there are the following steps:

1. smc_listen_rdma_init()
   1.1 smc_conn_create()
   1.2 smc_buf_create()   --> may fail
2. smc_listen_rdma_reg()  --> may fail

When later steps fail, Do we need to roll back previous steps?

Thank you.
On 23/05/2022 14:12, liuyacan@corp.netease.com wrote:
>>> From: liuyacan <liuyacan@corp.netease.com>
>>>
>>> In the process of checking whether RDMAv2 is available, the current
>>> implementation first sets ini->smcrv2.ib_dev_v2, and then allocates
>>> smc buf desc, but the latter may fail. Unfortunately, the caller
>>> will only check the former. In this case, a NULL pointer reference
>>> will occur in smc_clc_send_confirm_accept() when accessing
>>> conn->rmb_desc.
>>>
>>> This patch does two things:
>>> 1. Use the return code to determine whether V2 is available.
>>> 2. If the return code is NODEV, continue to check whether V1 is
>>>    available.
>>>
>>> Fixes: e49300a6bf62 ("net/smc: add listen processing for SMC-Rv2")
>>> Signed-off-by: liuyacan <liuyacan@corp.netease.com>
>>> ---
>>
>> I am not happy with this patch. You are right that this is a problem,
>> but the fix should be much simpler: set ini->smcrv2.ib_dev_v2 = NULL in
>> smc_find_rdma_v2_device_serv() after the not_found label, just like it is
>> done in a similar way for the ISM device in smc_find_ism_v1_device_serv().
>>
>> Your patch changes many more things, and beside that you eliminated the calls
>> to smc_find_ism_store_rc() completely, which is not correct.
>>
>> Since your patch was already applied (btw. 3:20 hours after you submitted it),
>> please revert it and resend. Thank you.
>
> I also have considered this way, one question is that do we need to do more roll
> back work before V1 check?
>
> Specifically, In smc_find_rdma_v2_device_serv(), there are the following steps:
>
> 1. smc_listen_rdma_init()
>    1.1 smc_conn_create()
>    1.2 smc_buf_create()   --> may fail
> 2. smc_listen_rdma_reg()  --> may fail
>
> When later steps fail, Do we need to roll back previous steps?

That is a good question and I think that is a different problem for another patch.
smc_listen_rdma_init() maybe should call smc_conn_abort() similar to what smc_listen_ism_init()
does in this situation. And when smc_listen_rdma_reg() fails ... hmm we need to think about this.

We will also discuss this here in our team.
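The rollback question can be pictured with the usual goto-based unwinding pattern. The following is a user-space sketch under stated assumptions, not a proposal for the actual fix: `struct conn_model`, `step()` and `listen_rdma_setup()` are hypothetical, each numbered step only stands in for the kernel function named in the comment, and the thread leaves open what the right undo for a failed smc_listen_rdma_reg() is.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical model of the steps listed above; the flags record
 * which resources are currently held.
 */
struct conn_model {
	int conn_created;
	int buf_created;
	int registered;
};

/* Each step succeeds unless its number matches fail_at. */
static int step(int nr, int fail_at)
{
	return nr == fail_at ? -1 : 0;
}

int listen_rdma_setup(struct conn_model *c, int fail_at)
{
	int rc;

	assert(c != NULL);

	rc = step(1, fail_at);		/* stands in for smc_conn_create() */
	if (rc)
		return rc;
	c->conn_created = 1;

	rc = step(2, fail_at);		/* stands in for smc_buf_create() */
	if (rc)
		goto abort_conn;
	c->buf_created = 1;

	rc = step(3, fail_at);		/* stands in for smc_listen_rdma_reg() */
	if (rc)
		goto free_buf;
	c->registered = 1;
	return 0;

free_buf:
	c->buf_created = 0;		/* undo step 2 */
abort_conn:
	c->conn_created = 0;		/* undo step 1, an smc_conn_abort()-like cleanup */
	return rc;
}
```

Whatever step fails, the model ends with no resources held, so a subsequent V1 attempt would start from a clean state.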
> >>> From: liuyacan <liuyacan@corp.netease.com>
> >>>
> >>> In the process of checking whether RDMAv2 is available, the current
> >>> implementation first sets ini->smcrv2.ib_dev_v2, and then allocates
> >>> smc buf desc, but the latter may fail. Unfortunately, the caller
> >>> will only check the former. In this case, a NULL pointer reference
> >>> will occur in smc_clc_send_confirm_accept() when accessing
> >>> conn->rmb_desc.
> >>>
> >>> This patch does two things:
> >>> 1. Use the return code to determine whether V2 is available.
> >>> 2. If the return code is NODEV, continue to check whether V1 is
> >>>    available.
> >>>
> >>> Fixes: e49300a6bf62 ("net/smc: add listen processing for SMC-Rv2")
> >>> Signed-off-by: liuyacan <liuyacan@corp.netease.com>
> >>> ---
> >>
> >> I am not happy with this patch. You are right that this is a problem,
> >> but the fix should be much simpler: set ini->smcrv2.ib_dev_v2 = NULL in
> >> smc_find_rdma_v2_device_serv() after the not_found label, just like it is
> >> done in a similar way for the ISM device in smc_find_ism_v1_device_serv().
> >>
> >> Your patch changes many more things, and beside that you eliminated the calls
> >> to smc_find_ism_store_rc() completely, which is not correct.
> >>
> >> Since your patch was already applied (btw. 3:20 hours after you submitted it),
> >> please revert it and resend. Thank you.
> >
> > I also have considered this way, one question is that do we need to do more roll
> > back work before V1 check?
> >
> > Specifically, In smc_find_rdma_v2_device_serv(), there are the following steps:
> >
> > 1. smc_listen_rdma_init()
> >    1.1 smc_conn_create()
> >    1.2 smc_buf_create()   --> may fail
> > 2. smc_listen_rdma_reg()  --> may fail
> >
> > When later steps fail, Do we need to roll back previous steps?
>
> That is a good question and I think that is a different problem for another patch.
> smc_listen_rdma_init() maybe should call smc_conn_abort() similar to what smc_listen_ism_init()
> does in this situation. And when smc_listen_rdma_reg() fails ... hmm we need to think about this.
>
> We will also discuss this here in our team.

Ok, I will revert this patch and resend a simpler one.

Thank you.
diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index 45a24d242..d3de54b70 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -2093,13 +2093,13 @@ static int smc_listen_rdma_reg(struct smc_sock *new_smc, bool local_first)
 	return 0;
 }
 
-static void smc_find_rdma_v2_device_serv(struct smc_sock *new_smc,
-					 struct smc_clc_msg_proposal *pclc,
-					 struct smc_init_info *ini)
+static int smc_find_rdma_v2_device_serv(struct smc_sock *new_smc,
+					struct smc_clc_msg_proposal *pclc,
+					struct smc_init_info *ini)
 {
 	struct smc_clc_v2_extension *smc_v2_ext;
 	u8 smcr_version;
-	int rc;
+	int rc = 0;
 
 	if (!(ini->smcr_version & SMC_V2) || !smcr_indicated(ini->smc_type_v2))
 		goto not_found;
@@ -2117,26 +2117,31 @@ static void smc_find_rdma_v2_device_serv(struct smc_sock *new_smc,
 	ini->smcrv2.saddr = new_smc->clcsock->sk->sk_rcv_saddr;
 	ini->smcrv2.daddr = smc_ib_gid_to_ipv4(smc_v2_ext->roce);
 	rc = smc_find_rdma_device(new_smc, ini);
-	if (rc) {
-		smc_find_ism_store_rc(rc, ini);
+	if (rc)
 		goto not_found;
-	}
+
 	if (!ini->smcrv2.uses_gateway)
 		memcpy(ini->smcrv2.nexthop_mac, pclc->lcl.mac, ETH_ALEN);
 
 	smcr_version = ini->smcr_version;
 	ini->smcr_version = SMC_V2;
 	rc = smc_listen_rdma_init(new_smc, ini);
-	if (!rc)
-		rc = smc_listen_rdma_reg(new_smc, ini->first_contact_local);
-	if (!rc)
-		return;
-	ini->smcr_version = smcr_version;
-	smc_find_ism_store_rc(rc, ini);
+	if (rc) {
+		ini->smcr_version = smcr_version;
+		goto not_found;
+	}
+	rc = smc_listen_rdma_reg(new_smc, ini->first_contact_local);
+	if (rc) {
+		ini->smcr_version = smcr_version;
+		goto not_found;
+	}
+	return 0;
 
 not_found:
+	rc = rc ?: SMC_CLC_DECL_NOSMCDEV;
 	ini->smcr_version &= ~SMC_V2;
 	ini->check_smcrv2 = false;
+	return rc;
 }
 
 static int smc_find_rdma_v1_device_serv(struct smc_sock *new_smc,
@@ -2169,6 +2174,7 @@ static int smc_listen_find_device(struct smc_sock *new_smc,
 				  struct smc_init_info *ini)
 {
 	int prfx_rc;
+	int rc;
 
 	/* check for ISM device matching V2 proposed device */
 	smc_find_ism_v2_device_serv(new_smc, pclc, ini);
@@ -2196,14 +2202,18 @@ static int smc_listen_find_device(struct smc_sock *new_smc,
 		return ini->rc ?: SMC_CLC_DECL_NOSMCDDEV;
 
 	/* check if RDMA V2 is available */
-	smc_find_rdma_v2_device_serv(new_smc, pclc, ini);
-	if (ini->smcrv2.ib_dev_v2)
+	rc = smc_find_rdma_v2_device_serv(new_smc, pclc, ini);
+	if (!rc)
 		return 0;
 
+	/* skip V1 check if V2 is unavailable for non-Device reason */
+	if (rc != SMC_CLC_DECL_NOSMCDEV &&
+	    rc != SMC_CLC_DECL_NOSMCRDEV &&
+	    rc != SMC_CLC_DECL_NOSMCDDEV)
+		return rc;
+
 	/* check if RDMA V1 is available */
 	if (!prfx_rc) {
-		int rc;
-
 		rc = smc_find_rdma_v1_device_serv(new_smc, pclc, ini);
 		smc_find_ism_store_rc(rc, ini);
 		return (!rc) ? 0 : ini->rc;
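The fallback condition added in the last hunk can be read as a small predicate. The sketch below restates it in user-space C. The enum values are placeholders chosen for illustration (the real SMC_CLC_DECL_* constants are defined in net/smc/smc_clc.h), and `DECL_MEM` and `v2_failure_allows_v1()` are hypothetical names that do not appear in the patch.

```c
#include <assert.h>

/* Placeholder decline codes for illustration only; the real
 * SMC_CLC_DECL_* values live in the kernel's net/smc/smc_clc.h.
 */
enum decl_code {
	DECL_OK = 0,
	DECL_NOSMCDEV,	/* no SMC device found */
	DECL_NOSMCRDEV,	/* no SMC-R device found */
	DECL_NOSMCDDEV,	/* no SMC-D device found */
	DECL_MEM,	/* hypothetical non-device failure */
};

/* Returns 1 when a failed V2 lookup should still fall through to the
 * V1 check, i.e. when the failure only says "no suitable device".
 */
int v2_failure_allows_v1(int v2_rc)
{
	assert(v2_rc != DECL_OK);	/* only meaningful after a failure */

	return v2_rc == DECL_NOSMCDEV ||
	       v2_rc == DECL_NOSMCRDEV ||
	       v2_rc == DECL_NOSMCDDEV;
}
```

Any other failure code makes the listen path return immediately with that code instead of trying V1, which is the behavioural change the comment in the hunk describes.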