Message ID:    20211228090325.27263-2-dust.li@linux.alibaba.com (mailing list archive)
State:         Accepted
Commit:        90cee52f2e780345d3629e278291aea5ac74f40f
Delegated to:  Netdev Maintainers
Series:        net/smc: fix kernel panic caused by race of smc_sock
On 28/12/2021 10:03, Dust Li wrote:
> We found smc_llc_send_link_delete_all() sometimes waits for the 2s
> timeout when testing with RDMA link up/down. This can happen when an
> smc_link is in the ACTIVATING state while the underlying QP is still
> in the RESET or RTR state, in which it cannot send any messages out.

I see your point, but why did you need to introduce a new wrapper
instead of extending the existing smc_link_usable() wrapper? With that,
and without any comments, a reader of the code does not know why there
are two different functions and what the reason is for having both of
them.
On Wed, Dec 29, 2021 at 01:36:06PM +0100, Karsten Graul wrote:
>On 28/12/2021 10:03, Dust Li wrote:
>> We found smc_llc_send_link_delete_all() sometimes waits for the 2s
>> timeout when testing with RDMA link up/down. This can happen when an
>> smc_link is in the ACTIVATING state while the underlying QP is still
>> in the RESET or RTR state, in which it cannot send any messages out.
>
>I see your point, but why did you need to introduce a new wrapper
>instead of extending the existing smc_link_usable() wrapper? With that,
>and without any comments, a reader of the code does not know why there
>are two different functions and what the reason is for having both of
>them.

Sorry for that, I should have added some comments to those wrappers to
better explain this.

The reason we need two wrappers here is that on the client side the QP
is not moved into RTS directly, but in two stages. Between RTR and RTS,
we still need smc_link_usable().

For example: for the client side in first contact, the QP is still in
RTR before it receives the LLC_CONFIRM message. So in functions like
smc_llc_wait() or smc_llc_event_handler(), we can't use
smc_link_sendable(), or we couldn't even receive the LLC_CONFIRM
message anymore.

So the meaning of the two wrappers is:

  smc_link_usable():   the link is ready to receive RDMA messages
  smc_link_sendable(): the link is ready to send RDMA messages

Places that need to send any CDC or LLC message should use
smc_link_sendable(); otherwise, use smc_link_usable() instead.

I saw David has already applied this to net, should I send another
patch to add some comments?
On 30/12/2021 04:02, dust.li wrote:
> On Thu, Dec 29, 2021 at 01:36:06PM +0100, Karsten Graul wrote:
>> On 28/12/2021 10:03, Dust Li wrote:
> I saw David has already applied this to net, should I send another
> patch to add some comments?

You could send a follow-on patch with your additional information, which
I find is very helpful! Thanks.
On Thu, Dec 30, 2021 at 07:55:01PM +0100, Karsten Graul wrote:
>On 30/12/2021 04:02, dust.li wrote:
>> On Wed, Dec 29, 2021 at 01:36:06PM +0100, Karsten Graul wrote:
>>> On 28/12/2021 10:03, Dust Li wrote:
>> I saw David has already applied this to net, should I send another
>> patch to add some comments?
>
>You could send a follow-on patch with your additional information, which
>I find is very helpful! Thanks.

Sure, will do

>
>--
>Karsten
On 31/12/2021 04:15, dust.li wrote:
> On Thu, Dec 30, 2021 at 07:55:01PM +0100, Karsten Graul wrote:
>> On 30/12/2021 04:02, dust.li wrote:
>>> On Wed, Dec 29, 2021 at 01:36:06PM +0100, Karsten Graul wrote:
>>>> On 28/12/2021 10:03, Dust Li wrote:
>>> I saw David has already applied this to net, should I send another
>>> patch to add some comments?
>>
>> You could send a follow-on patch with your additional information, which
>> I find is very helpful! Thanks.
>
> Sure, will do

Thank you!
diff --git a/net/smc/smc_core.c b/net/smc/smc_core.c
index 387d28b2f8dd..55ca175e8d57 100644
--- a/net/smc/smc_core.c
+++ b/net/smc/smc_core.c
@@ -647,7 +647,7 @@ static void smcr_lgr_link_deactivate_all(struct smc_link_group *lgr)
 	for (i = 0; i < SMC_LINKS_PER_LGR_MAX; i++) {
 		struct smc_link *lnk = &lgr->lnk[i];
 
-		if (smc_link_usable(lnk))
+		if (smc_link_sendable(lnk))
 			lnk->state = SMC_LNK_INACTIVE;
 	}
 	wake_up_all(&lgr->llc_msg_waiter);
diff --git a/net/smc/smc_core.h b/net/smc/smc_core.h
index 59cef3b830d8..d63b08274197 100644
--- a/net/smc/smc_core.h
+++ b/net/smc/smc_core.h
@@ -415,6 +415,12 @@ static inline bool smc_link_usable(struct smc_link *lnk)
 	return true;
 }
 
+static inline bool smc_link_sendable(struct smc_link *lnk)
+{
+	return smc_link_usable(lnk) &&
+		lnk->qp_attr.cur_qp_state == IB_QPS_RTS;
+}
+
 static inline bool smc_link_active(struct smc_link *lnk)
 {
 	return lnk->state == SMC_LNK_ACTIVE;
diff --git a/net/smc/smc_llc.c b/net/smc/smc_llc.c
index b102680296b8..3e9fd8a3124c 100644
--- a/net/smc/smc_llc.c
+++ b/net/smc/smc_llc.c
@@ -1630,7 +1630,7 @@ void smc_llc_send_link_delete_all(struct smc_link_group *lgr, bool ord, u32 rsn)
 	delllc.reason = htonl(rsn);
 
 	for (i = 0; i < SMC_LINKS_PER_LGR_MAX; i++) {
-		if (!smc_link_usable(&lgr->lnk[i]))
+		if (!smc_link_sendable(&lgr->lnk[i]))
 			continue;
 		if (!smc_llc_send_message_wait(&lgr->lnk[i], &delllc))
 			break;
diff --git a/net/smc/smc_wr.c b/net/smc/smc_wr.c
index 600ab5889227..aaf3028a6fe9 100644
--- a/net/smc/smc_wr.c
+++ b/net/smc/smc_wr.c
@@ -188,7 +188,7 @@ void smc_wr_tx_cq_handler(struct ib_cq *ib_cq, void *cq_context)
 static inline int smc_wr_tx_get_free_slot_index(struct smc_link *link, u32 *idx)
 {
 	*idx = link->wr_tx_cnt;
-	if (!smc_link_usable(link))
+	if (!smc_link_sendable(link))
 		return -ENOLINK;
 	for_each_clear_bit(*idx, link->wr_tx_mask, link->wr_tx_cnt) {
 		if (!test_and_set_bit(*idx, link->wr_tx_mask))
@@ -231,7 +231,7 @@ int smc_wr_tx_get_free_slot(struct smc_link *link,
 	} else {
 		rc = wait_event_interruptible_timeout(
 			link->wr_tx_wait,
-			!smc_link_usable(link) ||
+			!smc_link_sendable(link) ||
 			lgr->terminating ||
 			(smc_wr_tx_get_free_slot_index(link, &idx) != -EBUSY),
 			SMC_WR_TX_WAIT_FREE_SLOT_TIME);
diff --git a/net/smc/smc_wr.h b/net/smc/smc_wr.h
index f353311e6f84..48ed9b08ac7a 100644
--- a/net/smc/smc_wr.h
+++ b/net/smc/smc_wr.h
@@ -62,7 +62,7 @@ static inline void smc_wr_tx_set_wr_id(atomic_long_t *wr_tx_id, long val)
 static inline bool smc_wr_tx_link_hold(struct smc_link *link)
 {
-	if (!smc_link_usable(link))
+	if (!smc_link_sendable(link))
 		return false;
 	atomic_inc(&link->wr_tx_refcnt);
 	return true;
We found smc_llc_send_link_delete_all() sometimes waits for the 2s
timeout when testing with RDMA link up/down. This can happen when an
smc_link is in the ACTIVATING state while the underlying QP is still in
the RESET or RTR state, in which it cannot send any messages out.

smc_llc_send_link_delete_all() uses smc_link_usable() to check whether
the link is usable. If the QP is still in the RESET or RTR state but the
smc_link is ACTIVATING, this LLC message will always fail without any
CQE entering the CQ, and we will always wait 2s before timing out, since
we cannot send any messages through the QP before it enters RTS.

Add a wrapper smc_link_sendable() which checks the state of the QP along
with the link state, and replace smc_link_usable() with
smc_link_sendable() in all LLC & CDC message sending routines.

Fixes: 5f08318f617b ("smc: connection data control (CDC)")
Signed-off-by: Dust Li <dust.li@linux.alibaba.com>
---
 net/smc/smc_core.c | 2 +-
 net/smc/smc_core.h | 6 ++++++
 net/smc/smc_llc.c  | 2 +-
 net/smc/smc_wr.c   | 4 ++--
 net/smc/smc_wr.h   | 2 +-
 5 files changed, 11 insertions(+), 5 deletions(-)