Message ID | 20231012123729.29307-1-dust.li@linux.alibaba.com (mailing list archive) |
---|---|
State | Accepted |
Commit | 4abbd2e3c1db671fa1286390f1310aec78386f1d |
Delegated to: | Netdev Maintainers |
Series | [net] net/smc: return the right falback reason when prefix checks fail |
On 12.10.23 14:37, Dust Li wrote:
> In smc_listen_work(), if smc_listen_prfx_check() failed, the real
> reason, SMC_CLC_DECL_DIFFPREFIX, was dropped and
> SMC_CLC_DECL_NOSMCDEV was returned.
>
> Although this is also a kind of SMC_CLC_DECL_NOSMCDEV, returning
> the real reason is much friendlier for debugging.
>
> Fixes: e49300a6bf62 ("net/smc: add listen processing for SMC-Rv2")
> Signed-off-by: Dust Li <dust.li@linux.alibaba.com>

As you point out, the current code is not really wrong, so I am not sure
whether this should be a fix for net or rather a debug improvement for
net-next.

> ---
>  net/smc/af_smc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index bacdd971615e..21d4476b937b 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -2361,7 +2361,7 @@ static int smc_listen_find_device(struct smc_sock *new_smc,
>  		smc_find_ism_store_rc(rc, ini);
>  		return (!rc) ? 0 : ini->rc;
>  	}
> -	return SMC_CLC_DECL_NOSMCDEV;
> +	return prfx_rc;
>  }
>
>  /* listen worker: finish RDMA setup */

For the code change:
Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>

On 12.10.23 15:05, Alexandra Winter wrote:
>
>
> On 12.10.23 14:37, Dust Li wrote:
>> In smc_listen_work(), if smc_listen_prfx_check() failed, the real
>> reason, SMC_CLC_DECL_DIFFPREFIX, was dropped and
>> SMC_CLC_DECL_NOSMCDEV was returned.
>>
>> Although this is also a kind of SMC_CLC_DECL_NOSMCDEV, returning
>> the real reason is much friendlier for debugging.
>>
>> Fixes: e49300a6bf62 ("net/smc: add listen processing for SMC-Rv2")
>> Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
>
> As you point out, the current code is not really wrong, so I am not sure
> whether this should be a fix for net or rather a debug improvement for
> net-next.
>

The return code was not precise, and we already have a more appropriate
return code to use. IMO, it was wrong. I'm for net.

Reviewed-by: Wenjia Zhang <wenjia@linux.ibm.com>

On Thu, Oct 12, 2023 at 03:05:20PM +0200, Alexandra Winter wrote:
>
>
>On 12.10.23 14:37, Dust Li wrote:
>> In smc_listen_work(), if smc_listen_prfx_check() failed, the real
>> reason, SMC_CLC_DECL_DIFFPREFIX, was dropped and
>> SMC_CLC_DECL_NOSMCDEV was returned.
>>
>> Although this is also a kind of SMC_CLC_DECL_NOSMCDEV, returning
>> the real reason is much friendlier for debugging.
>>
>> Fixes: e49300a6bf62 ("net/smc: add listen processing for SMC-Rv2")
>> Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
>
>As you point out, the current code is not really wrong, so I am not sure
>whether this should be a fix for net or rather a debug improvement for
>net-next.

To be honest, I was a bit conflicted about which branch this should go to.
But after checking the code before e49300a6bf62 ("net/smc: add listen
processing for SMC-Rv2"), I discovered the previous behavior was to return
SMC_CLC_DECL_DIFFPREFIX. Therefore, I have decided it should be considered
a fix.

I should have mentioned this in the commit message.

Best regards,
Dust

>
>> ---
>>  net/smc/af_smc.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
>> index bacdd971615e..21d4476b937b 100644
>> --- a/net/smc/af_smc.c
>> +++ b/net/smc/af_smc.c
>> @@ -2361,7 +2361,7 @@ static int smc_listen_find_device(struct smc_sock *new_smc,
>>  		smc_find_ism_store_rc(rc, ini);
>>  		return (!rc) ? 0 : ini->rc;
>>  	}
>> -	return SMC_CLC_DECL_NOSMCDEV;
>> +	return prfx_rc;
>>  }
>>
>>  /* listen worker: finish RDMA setup */
>
>For the code change:
>Reviewed-by: Alexandra Winter <wintera@linux.ibm.com>

On 2023/10/12 20:37, Dust Li wrote:
> In smc_listen_work(), if smc_listen_prfx_check() failed, the real
> reason, SMC_CLC_DECL_DIFFPREFIX, was dropped and
> SMC_CLC_DECL_NOSMCDEV was returned.
>
> Although this is also a kind of SMC_CLC_DECL_NOSMCDEV, returning
> the real reason is much friendlier for debugging.
>
> Fixes: e49300a6bf62 ("net/smc: add listen processing for SMC-Rv2")
> Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
> ---
>  net/smc/af_smc.c | 2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
> index bacdd971615e..21d4476b937b 100644
> --- a/net/smc/af_smc.c
> +++ b/net/smc/af_smc.c
> @@ -2361,7 +2361,7 @@ static int smc_listen_find_device(struct smc_sock *new_smc,
>  		smc_find_ism_store_rc(rc, ini);
>  		return (!rc) ? 0 : ini->rc;
>  	}
> -	return SMC_CLC_DECL_NOSMCDEV;
> +	return prfx_rc;
>  }
>
>  /* listen worker: finish RDMA setup */

Inspired by this fix, I wonder whether it is suitable to store the first
decline reason rather than the real decline reason that caused
smc_listen_find_device() to return.

For example, when running SMC between two peers with only RDMA devices,
in smc_listen_find_device():

1. call smc_find_ism_v2_device_serv() and find that no ISMv2 can be used.
   The reason code will be stored as SMC_CLC_DECL_NOSMCD2DEV.

...

2. call smc_find_rdma_v1_device_serv() and find an RDMA device, but somehow
   it fails to create buffers. Users should be informed that
   SMC_CLC_DECL_MEM occurred, but the reason code returned is
   SMC_CLC_DECL_NOSMCD2DEV.

I think users may be confused about why the peer declines with this reason
and wonder what happened when trying to use SMC-R.

Thanks,
Wen Gu

On 2023/10/13 16:00, Wen Gu wrote:
>
>
> On 2023/10/12 20:37, Dust Li wrote:
>
>> In smc_listen_work(), if smc_listen_prfx_check() failed, the real
>> reason, SMC_CLC_DECL_DIFFPREFIX, was dropped and
>> SMC_CLC_DECL_NOSMCDEV was returned.
>>
>> Although this is also a kind of SMC_CLC_DECL_NOSMCDEV, returning
>> the real reason is much friendlier for debugging.
>>
>> Fixes: e49300a6bf62 ("net/smc: add listen processing for SMC-Rv2")
>> Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
>> ---
>>  net/smc/af_smc.c | 2 +-
>>  1 file changed, 1 insertion(+), 1 deletion(-)
>>
>> diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
>> index bacdd971615e..21d4476b937b 100644
>> --- a/net/smc/af_smc.c
>> +++ b/net/smc/af_smc.c
>> @@ -2361,7 +2361,7 @@ static int smc_listen_find_device(struct smc_sock *new_smc,
>>  		smc_find_ism_store_rc(rc, ini);
>>  		return (!rc) ? 0 : ini->rc;
>>  	}
>> -	return SMC_CLC_DECL_NOSMCDEV;
>> +	return prfx_rc;
>>  }
>>  /* listen worker: finish RDMA setup */
> Inspired by this fix, I wonder whether it is suitable to store the first
> decline reason rather than the real decline reason that caused
> smc_listen_find_device() to return.
>
> For example, when running SMC between two peers with only RDMA devices,
> in smc_listen_find_device():
>
> 1. call smc_find_ism_v2_device_serv() and find that no ISMv2 can be used.
>    The reason code will be stored as SMC_CLC_DECL_NOSMCD2DEV.
>
> ...
>
> 2. call smc_find_rdma_v1_device_serv() and find an RDMA device, but somehow
>    it fails to create buffers. Users should be informed that
>    SMC_CLC_DECL_MEM occurred, but the reason code returned is
>    SMC_CLC_DECL_NOSMCD2DEV.
>
> I think users may be confused about why the peer declines with this reason
> and wonder what happened when trying to use SMC-R.

Yes, the reason code here confuses me too. I think it is caused by
incorrect use of the function smc_find_ism_store_rc(). I'm working on
a fix.

>
>
> Thanks,
> Wen Gu
>

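The two-step example discussed above can be replayed with a small model. This is only a sketch: the `demo_*` names and the enum values are hypothetical placeholders, not the kernel's definitions, and the "first reason wins" helper merely assumes the behavior described in the thread (a stored reason is not overwritten by later ones). The "last reason wins" variant is included purely for contrast and is not the fix proposed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder reason codes; NOT the kernel's numeric values. */
enum demo_decl {
	DEMO_OK = 0,
	DEMO_DECL_NOSMCD2DEV,	/* stands in for SMC_CLC_DECL_NOSMCD2DEV */
	DEMO_DECL_MEM,		/* stands in for SMC_CLC_DECL_MEM */
};

/* Hypothetical stand-in for the per-connection init info. */
struct demo_init_info {
	int rc;		/* decline reason remembered so far */
};

/* "First reason wins": record rc only if nothing is stored yet. */
static void demo_store_first_rc(int rc, struct demo_init_info *ini)
{
	assert(ini != NULL);
	if (!ini->rc)
		ini->rc = rc;
}

/* Contrast only: "last reason wins", any non-zero rc overwrites. */
static void demo_store_last_rc(int rc, struct demo_init_info *ini)
{
	assert(ini != NULL);
	if (rc)
		ini->rc = rc;
}

/* Replays the two steps of the example with the chosen store policy. */
static int demo_find_device(void (*store)(int, struct demo_init_info *))
{
	struct demo_init_info ini = { .rc = DEMO_OK };

	store(DEMO_DECL_NOSMCD2DEV, &ini);	/* step 1: no ISMv2 device */
	store(DEMO_DECL_MEM, &ini);		/* step 2: buffer creation failed */
	return ini.rc;
}
```

With the first policy, `demo_find_device(demo_store_first_rc)` yields the ISMv2 reason even though the buffer failure ended the search, which is the confusion described above; with the second policy the buffer failure is reported instead.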
Hello:

This patch was applied to netdev/net.git (main)
by Jakub Kicinski <kuba@kernel.org>:

On Thu, 12 Oct 2023 20:37:29 +0800 you wrote:
> In smc_listen_work(), if smc_listen_prfx_check() failed, the real
> reason, SMC_CLC_DECL_DIFFPREFIX, was dropped and
> SMC_CLC_DECL_NOSMCDEV was returned.
>
> Although this is also a kind of SMC_CLC_DECL_NOSMCDEV, returning
> the real reason is much friendlier for debugging.
>
> [...]

Here is the summary with links:
  - [net] net/smc: return the right falback reason when prefix checks fail
    https://git.kernel.org/netdev/net/c/4abbd2e3c1db

You are awesome, thank you!

diff --git a/net/smc/af_smc.c b/net/smc/af_smc.c
index bacdd971615e..21d4476b937b 100644
--- a/net/smc/af_smc.c
+++ b/net/smc/af_smc.c
@@ -2361,7 +2361,7 @@ static int smc_listen_find_device(struct smc_sock *new_smc,
 		smc_find_ism_store_rc(rc, ini);
 		return (!rc) ? 0 : ini->rc;
 	}
-	return SMC_CLC_DECL_NOSMCDEV;
+	return prfx_rc;
 }
 
 /* listen worker: finish RDMA setup */

In smc_listen_work(), if smc_listen_prfx_check() failed, the real
reason, SMC_CLC_DECL_DIFFPREFIX, was dropped and
SMC_CLC_DECL_NOSMCDEV was returned.

Although this is also a kind of SMC_CLC_DECL_NOSMCDEV, returning
the real reason is much friendlier for debugging.

Fixes: e49300a6bf62 ("net/smc: add listen processing for SMC-Rv2")
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
---
 net/smc/af_smc.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)
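
The effect of the one-line change can be illustrated with a reduced model of the function tail. This is a sketch under assumptions: the `sketch_*` names and enum values are hypothetical placeholders, and it assumes that the block ending just before the changed line is only entered when the prefix check succeeded, which the hunk itself does not show.

```c
#include <assert.h>

/* Placeholder reason codes; NOT the kernel's numeric values. */
enum sketch_decl {
	SKETCH_OK = 0,
	SKETCH_DECL_NOSMCDEV,	/* stands in for SMC_CLC_DECL_NOSMCDEV */
	SKETCH_DECL_DIFFPREFIX,	/* stands in for SMC_CLC_DECL_DIFFPREFIX */
};

/*
 * Reduced model of the end of the device search.
 * prfx_rc: result of the prefix check (0 on success).
 * v1_rc:   result of the last device lookup, used only when the
 *          prefix check succeeded (assumption, see lead-in).
 * patched: 0 models the old return value, non-zero the new one.
 */
static int sketch_find_device_tail(int prfx_rc, int v1_rc, int patched)
{
	assert(prfx_rc == SKETCH_OK || prfx_rc == SKETCH_DECL_DIFFPREFIX);

	if (!prfx_rc)
		return v1_rc;	/* simplified: the hunk returns 0 or ini->rc */

	/* Prefix check failed: old code hid the reason, new code keeps it. */
	return patched ? prfx_rc : SKETCH_DECL_NOSMCDEV;
}
```

In this model a failed prefix check is reported as the generic "no device" reason before the patch and as the prefix mismatch itself afterwards, while the success path is unchanged.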