Message ID | 20231117111657.16266-1-lirongqing@baidu.com (mailing list archive)
---|---
State | Changes Requested
Delegated to | Netdev Maintainers
Series | [net-next,v3] net/smc: avoid atomic_set and smp_wmb in the tx path when possible
On 17.11.23 12:16, Li RongQing wrote:
> There is a rare possibility that conn->tx_pushing is not 1. Since
> tx_pushing is only ever checked against 1, move the setting of
> tx_pushing to 1 to after atomic_dec_and_test() returns false, to
> avoid the atomic_set() and smp_wmb() in the tx path.
>
> Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
> Signed-off-by: Li RongQing <lirongqing@baidu.com>
> ---
> diff v3: improvements in the commit body and comments
> diff v2: fix a typo in commit body and add net-next subject-prefix
>
>  net/smc/smc_tx.c | 7 ++++---
>  1 file changed, 4 insertions(+), 3 deletions(-)
>
> diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
> index 3b0ff3b..2c2933f 100644
> --- a/net/smc/smc_tx.c
> +++ b/net/smc/smc_tx.c
> @@ -667,8 +667,6 @@ int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
>  		return 0;
>
>  again:
> -	atomic_set(&conn->tx_pushing, 1);
> -	smp_wmb(); /* Make sure tx_pushing is 1 before real send */
>  	rc = __smc_tx_sndbuf_nonempty(conn);
>
>  	/* We need to check whether someone else have added some data into
> @@ -677,8 +675,11 @@ int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
>  	 * If so, we need to push again to prevent those data hang in the send
>  	 * queue.
>  	 */
> -	if (unlikely(!atomic_dec_and_test(&conn->tx_pushing)))
> +	if (unlikely(!atomic_dec_and_test(&conn->tx_pushing))) {
> +		atomic_set(&conn->tx_pushing, 1);
> +		smp_wmb(); /* Make sure tx_pushing is 1 before send again */
>  		goto again;
> +	}
>
>  	return rc;
>  }

It seems to me that the purpose of conn->tx_pushing is
a) to serve as a mutex, so that only one thread per conn will call
   __smc_tx_sndbuf_nonempty();
b) to repeat, in case some other thread has added data to the sndbuf
   concurrently.

I agree that this patch does not change the behaviour of this function
and removes an atomic_set() in the likely path.

I wonder, however: all callers of smc_tx_sndbuf_nonempty() must hold
the socket lock. So how can we ever run into a concurrency situation?
Is this handling of conn->tx_pushing necessary at all?
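For orientation, here is how the whole function would read with this patch applied. This is a sketch rather than verbatim kernel source: the prologue above the first hunk (the atomic_inc_return() gate) is an assumption reconstructed from the "return 0;" context line of the diff, and the comments paraphrase the two roles Alexandra describes above.

```c
/* Sketch only -- not verbatim net/smc/smc_tx.c. The atomic_inc_return()
 * gate is assumed from the "return 0;" context line of the hunk; the
 * rest mirrors the patched code.
 */
int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
{
	int rc;

	/* Role (a), mutex: only the first caller proceeds. A concurrent
	 * caller merely bumps the counter and returns, leaving
	 * tx_pushing > 1 as a "push again" signal for the active pusher.
	 */
	if (atomic_inc_return(&conn->tx_pushing) > 1)
		return 0;

again:
	rc = __smc_tx_sndbuf_nonempty(conn);

	/* Role (b), repeat: if someone raced in above, the decrement does
	 * not reach zero, so reset the counter to 1 and push once more.
	 * After the patch, atomic_set()/smp_wmb() run only in this
	 * unlikely branch instead of on every pass through the loop.
	 */
	if (unlikely(!atomic_dec_and_test(&conn->tx_pushing))) {
		atomic_set(&conn->tx_pushing, 1);
		smp_wmb(); /* Make sure tx_pushing is 1 before send again */
		goto again;
	}

	return rc;
}
```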
On Fri, Nov 17, 2023 at 01:27:57PM +0100, Alexandra Winter wrote:
>
> On 17.11.23 12:16, Li RongQing wrote:
>> There is a rare possibility that conn->tx_pushing is not 1. Since
>> tx_pushing is only ever checked against 1, move the setting of
>> tx_pushing to 1 to after atomic_dec_and_test() returns false, to
>> avoid the atomic_set() and smp_wmb() in the tx path.
>>
>> Reviewed-by: Dust Li <dust.li@linux.alibaba.com>
>> Signed-off-by: Li RongQing <lirongqing@baidu.com>
>> ---
>> diff v3: improvements in the commit body and comments
>> diff v2: fix a typo in commit body and add net-next subject-prefix
>>
>>  net/smc/smc_tx.c | 7 ++++---
>>  1 file changed, 4 insertions(+), 3 deletions(-)
>>
>> diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
>> index 3b0ff3b..2c2933f 100644
>> --- a/net/smc/smc_tx.c
>> +++ b/net/smc/smc_tx.c
>> @@ -667,8 +667,6 @@ int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
>>  		return 0;
>>
>>  again:
>> -	atomic_set(&conn->tx_pushing, 1);
>> -	smp_wmb(); /* Make sure tx_pushing is 1 before real send */
>>  	rc = __smc_tx_sndbuf_nonempty(conn);
>>
>>  	/* We need to check whether someone else have added some data into
>> @@ -677,8 +675,11 @@ int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
>>  	 * If so, we need to push again to prevent those data hang in the send
>>  	 * queue.
>>  	 */
>> -	if (unlikely(!atomic_dec_and_test(&conn->tx_pushing)))
>> +	if (unlikely(!atomic_dec_and_test(&conn->tx_pushing))) {
>> +		atomic_set(&conn->tx_pushing, 1);
>> +		smp_wmb(); /* Make sure tx_pushing is 1 before send again */
>>  		goto again;
>> +	}
>>
>>  	return rc;
>>  }
>
> It seems to me that the purpose of conn->tx_pushing is
> a) to serve as a mutex, so that only one thread per conn will call
>    __smc_tx_sndbuf_nonempty();
> b) to repeat, in case some other thread has added data to the sndbuf
>    concurrently.
>
> I agree that this patch does not change the behaviour of this function
> and removes an atomic_set() in the likely path.
>
> I wonder, however: all callers of smc_tx_sndbuf_nonempty() must hold
> the socket lock. So how can we ever run into a concurrency situation?
> Is this handling of conn->tx_pushing necessary at all?

Hi Sandy,

Overall, I think you are right. But there is something we need to take care of.

Before commit 6b88af839d20 ("net/smc: don't send in the BH context if
sock_owned_by_user"), we used to call smc_tx_pending() in the soft IRQ
without checking sock_owned_by_user(), which caused a race condition
because bh_lock_sock() does not honor lock_sock(). To address this issue,
I added the tx_pushing mechanism. However, with commit 6b88af839d20,
we now defer the transmission if the sock lock is held by the user.
Therefore, there should no longer be a race condition. Nevertheless, if
we remove the tx_pushing mechanism, we must always remember not to call
smc_tx_sndbuf_nonempty() in the soft IRQ while the user holds the sock lock.

Thanks
Dust
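To illustrate the deferral Dust refers to, here is a minimal sketch of the common BH-side pattern. It is generic and hypothetical, not the actual code of commit 6b88af839d20: the function name, the SMC_TX_PUSH_PENDING flag, and the conn->flags field are invented for illustration; only bh_lock_sock(), sock_owned_by_user(), and smc_tx_pending() are real identifiers from the discussion.

```c
/* Hypothetical sketch of the BH-side deferral pattern. The function
 * name, the SMC_TX_PUSH_PENDING bit, and the conn->flags field are
 * invented, not real SMC identifiers.
 */
static void smc_bh_tx_work(struct sock *sk, struct smc_connection *conn)
{
	bh_lock_sock(sk);
	if (!sock_owned_by_user(sk)) {
		/* user space is not inside lock_sock(): pushing here
		 * cannot race with the user-context tx path
		 */
		smc_tx_pending(conn);
	} else {
		/* defer: record the pending push so the lock owner can
		 * finish it when it calls release_sock()
		 */
		set_bit(SMC_TX_PUSH_PENDING, &conn->flags);
	}
	bh_unlock_sock(sk);
}
```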
On 20.11.23 04:20, Dust Li wrote:
>> It seems to me that the purpose of conn->tx_pushing is
>> a) to serve as a mutex, so that only one thread per conn will call
>>    __smc_tx_sndbuf_nonempty();
>> b) to repeat, in case some other thread has added data to the sndbuf
>>    concurrently.
>>
>> I agree that this patch does not change the behaviour of this function
>> and removes an atomic_set() in the likely path.
>>
>> I wonder, however: all callers of smc_tx_sndbuf_nonempty() must hold
>> the socket lock. So how can we ever run into a concurrency situation?
>> Is this handling of conn->tx_pushing necessary at all?
> Hi Sandy,
>
> Overall, I think you are right. But there is something we need to take care of.
>
> Before commit 6b88af839d20 ("net/smc: don't send in the BH context if
> sock_owned_by_user"), we used to call smc_tx_pending() in the soft IRQ
> without checking sock_owned_by_user(), which caused a race condition
> because bh_lock_sock() does not honor lock_sock(). To address this issue,
> I added the tx_pushing mechanism. However, with commit 6b88af839d20,
> we now defer the transmission if the sock lock is held by the user.
> Therefore, there should no longer be a race condition. Nevertheless, if
> we remove the tx_pushing mechanism, we must always remember not to call
> smc_tx_sndbuf_nonempty() in the soft IRQ while the user holds the sock lock.
>
> Thanks
> Dust

ok, I understand.
So whoever is willing to give it a try and simplify smc_tx_sndbuf_nonempty()
should remember to document that requirement/precondition.
Maybe in a Function context section of a kernel-doc function description?
(as described in https://docs.kernel.org/doc-guide/kernel-doc.html)
Although smc_tx_sndbuf_nonempty() is not exported, this format is helpful.
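As a sketch of what such a kernel-doc comment could look like (this comment does not exist in the tree; the wording is only a suggestion):

```c
/**
 * smc_tx_sndbuf_nonempty() - push pending data in the connection's send buffer
 * @conn: connection to push data for
 *
 * Context: Caller must hold the socket lock. Must not be called from
 *          softirq while user space owns the socket lock (see commit
 *          6b88af839d20).
 *
 * Return: 0 on success, a negative error code otherwise.
 */
```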
On 20.11.23 10:11, Alexandra Winter wrote:
>
> On 20.11.23 04:20, Dust Li wrote:
>>> It seems to me that the purpose of conn->tx_pushing is
>>> a) to serve as a mutex, so that only one thread per conn will call
>>>    __smc_tx_sndbuf_nonempty();
>>> b) to repeat, in case some other thread has added data to the sndbuf
>>>    concurrently.
>>>
>>> I agree that this patch does not change the behaviour of this function
>>> and removes an atomic_set() in the likely path.
>>>
>>> I wonder, however: all callers of smc_tx_sndbuf_nonempty() must hold
>>> the socket lock. So how can we ever run into a concurrency situation?
>>> Is this handling of conn->tx_pushing necessary at all?
>> Hi Sandy,
>>
>> Overall, I think you are right. But there is something we need to take care of.
>>
>> Before commit 6b88af839d20 ("net/smc: don't send in the BH context if
>> sock_owned_by_user"), we used to call smc_tx_pending() in the soft IRQ
>> without checking sock_owned_by_user(), which caused a race condition
>> because bh_lock_sock() does not honor lock_sock(). To address this issue,
>> I added the tx_pushing mechanism. However, with commit 6b88af839d20,
>> we now defer the transmission if the sock lock is held by the user.
>> Therefore, there should no longer be a race condition. Nevertheless, if
>> we remove the tx_pushing mechanism, we must always remember not to call
>> smc_tx_sndbuf_nonempty() in the soft IRQ while the user holds the sock lock.
>>
>> Thanks
>> Dust
>
> ok, I understand.
> So whoever is willing to give it a try and simplify smc_tx_sndbuf_nonempty()
> should remember to document that requirement/precondition.
> Maybe in a Function context section of a kernel-doc function description?
> (as described in https://docs.kernel.org/doc-guide/kernel-doc.html)
> Although smc_tx_sndbuf_nonempty() is not exported, this format is helpful.

Tony Lu's <tonylu@linux.alibaba.com> mail address has been corrupted in this
whole thread. Please reply to this message (corrected address), or take care
if replying to other messages in this thread.
On Mon, Nov 20, 2023 at 10:17:15AM +0100, Alexandra Winter wrote:
>
> On 20.11.23 10:11, Alexandra Winter wrote:
> >
> > On 20.11.23 04:20, Dust Li wrote:
> >>> It seems to me that the purpose of conn->tx_pushing is
> >>> a) to serve as a mutex, so that only one thread per conn will call
> >>>    __smc_tx_sndbuf_nonempty();
> >>> b) to repeat, in case some other thread has added data to the sndbuf
> >>>    concurrently.
> >>>
> >>> I agree that this patch does not change the behaviour of this function
> >>> and removes an atomic_set() in the likely path.
> >>>
> >>> I wonder, however: all callers of smc_tx_sndbuf_nonempty() must hold
> >>> the socket lock. So how can we ever run into a concurrency situation?
> >>> Is this handling of conn->tx_pushing necessary at all?
> >> Hi Sandy,
> >>
> >> Overall, I think you are right. But there is something we need to take care of.
> >>
> >> Before commit 6b88af839d20 ("net/smc: don't send in the BH context if
> >> sock_owned_by_user"), we used to call smc_tx_pending() in the soft IRQ
> >> without checking sock_owned_by_user(), which caused a race condition
> >> because bh_lock_sock() does not honor lock_sock(). To address this issue,
> >> I added the tx_pushing mechanism. However, with commit 6b88af839d20,
> >> we now defer the transmission if the sock lock is held by the user.
> >> Therefore, there should no longer be a race condition. Nevertheless, if
> >> we remove the tx_pushing mechanism, we must always remember not to call
> >> smc_tx_sndbuf_nonempty() in the soft IRQ while the user holds the sock lock.
> >>
> >> Thanks
> >> Dust
> >
> > ok, I understand.
> > So whoever is willing to give it a try and simplify smc_tx_sndbuf_nonempty()
> > should remember to document that requirement/precondition.
> > Maybe in a Function context section of a kernel-doc function description?
> > (as described in https://docs.kernel.org/doc-guide/kernel-doc.html)
> > Although smc_tx_sndbuf_nonempty() is not exported, this format is helpful.
>
> Tony Lu's <tonylu@linux.alibaba.com> mail address has been corrupted in this
> whole thread. Please reply to this message (corrected address), or take care
> if replying to other messages in this thread.

Yes, that's true. Thanks, Alexandra.

Please use my correct address, RongQing: Tony Lu <tonylu@linux.alibaba.com>.
On Mon, Nov 20, 2023 at 10:11:17AM +0100, Alexandra Winter wrote:
>
> On 20.11.23 04:20, Dust Li wrote:
>>> It seems to me that the purpose of conn->tx_pushing is
>>> a) to serve as a mutex, so that only one thread per conn will call
>>>    __smc_tx_sndbuf_nonempty();
>>> b) to repeat, in case some other thread has added data to the sndbuf
>>>    concurrently.
>>>
>>> I agree that this patch does not change the behaviour of this function
>>> and removes an atomic_set() in the likely path.
>>>
>>> I wonder, however: all callers of smc_tx_sndbuf_nonempty() must hold
>>> the socket lock. So how can we ever run into a concurrency situation?
>>> Is this handling of conn->tx_pushing necessary at all?
>> Hi Sandy,
>>
>> Overall, I think you are right. But there is something we need to take care of.
>>
>> Before commit 6b88af839d20 ("net/smc: don't send in the BH context if
>> sock_owned_by_user"), we used to call smc_tx_pending() in the soft IRQ
>> without checking sock_owned_by_user(), which caused a race condition
>> because bh_lock_sock() does not honor lock_sock(). To address this issue,
>> I added the tx_pushing mechanism. However, with commit 6b88af839d20,
>> we now defer the transmission if the sock lock is held by the user.
>> Therefore, there should no longer be a race condition. Nevertheless, if
>> we remove the tx_pushing mechanism, we must always remember not to call
>> smc_tx_sndbuf_nonempty() in the soft IRQ while the user holds the sock lock.
>>
>> Thanks
>> Dust
>
> ok, I understand.
> So whoever is willing to give it a try and simplify smc_tx_sndbuf_nonempty()
> should remember to document that requirement/precondition.
> Maybe in a Function context section of a kernel-doc function description?
> (as described in https://docs.kernel.org/doc-guide/kernel-doc.html)
> Although smc_tx_sndbuf_nonempty() is not exported, this format is helpful.

I double-checked this and realized that I may have missed something previously.

The original goal of introducing tx_pushing was to maximize the amount of data
that could be corked, in order to achieve the best performance.
__smc_tx_sndbuf_nonempty() is thread and context safe, meaning that it can be
called simultaneously in both user context and softirq without causing any
bugs, just some wasted CPU.

Although I think we should remove all the atomics & locks in the data path and
only use the sock lock in the long term, I will collaborate with Li RongQing
on a new version that eliminates tx_pushing.

Best regards,
Dust
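If that long-term direction works out, the function could collapse to something like the following sketch. This is not a posted patch; it is valid only under the documented precondition that every caller holds the socket lock, so there is no concurrent pusher and no repeat loop to drive.

```c
/* Sketch of a tx_pushing-free variant -- not a posted patch. Correct
 * only if all callers hold the socket lock and the softirq path keeps
 * deferring while the lock is owned by user space.
 */
int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
{
	return __smc_tx_sndbuf_nonempty(conn);
}
```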
diff v3: improvements in the commit body and comments
diff v2: fix a typo in commit body and add net-next subject-prefix

 net/smc/smc_tx.c | 7 ++++---
 1 file changed, 4 insertions(+), 3 deletions(-)

diff --git a/net/smc/smc_tx.c b/net/smc/smc_tx.c
index 3b0ff3b..2c2933f 100644
--- a/net/smc/smc_tx.c
+++ b/net/smc/smc_tx.c
@@ -667,8 +667,6 @@ int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
 		return 0;
 
 again:
-	atomic_set(&conn->tx_pushing, 1);
-	smp_wmb(); /* Make sure tx_pushing is 1 before real send */
 	rc = __smc_tx_sndbuf_nonempty(conn);
 
 	/* We need to check whether someone else have added some data into
@@ -677,8 +675,11 @@ int smc_tx_sndbuf_nonempty(struct smc_connection *conn)
 	 * If so, we need to push again to prevent those data hang in the send
 	 * queue.
 	 */
-	if (unlikely(!atomic_dec_and_test(&conn->tx_pushing)))
+	if (unlikely(!atomic_dec_and_test(&conn->tx_pushing))) {
+		atomic_set(&conn->tx_pushing, 1);
+		smp_wmb(); /* Make sure tx_pushing is 1 before send again */
 		goto again;
+	}
 
 	return rc;
 }