Message ID | 20240417062721.45652-3-kerneljasonxing@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Series | locklessly protect left members in struct rps_dev_flow |
On Wed, Apr 17, 2024 at 8:27 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> From: Jason Xing <kernelxing@tencent.com>
>
> As we can see, rflow->filter can be written/read concurrently, so
> lockless access is needed.
>
> Signed-off-by: Jason Xing <kernelxing@tencent.com>
> ---
> I'm not very sure if the READ_ONCE in set_rps_cpu() is useful. I
> scanned/checked the code and found no lock can prevent multiple
> threads from calling set_rps_cpu() and handling the same flow
> simultaneously. The same question still exists in patch [3/3].
> ---
>  net/core/dev.c | 6 +++---
>  1 file changed, 3 insertions(+), 3 deletions(-)
>
> diff --git a/net/core/dev.c b/net/core/dev.c
> index 2003b9a61e40..40a535158e45 100644
> --- a/net/core/dev.c
> +++ b/net/core/dev.c
> @@ -4524,8 +4524,8 @@ set_rps_cpu(struct net_device *dev, struct sk_buff *skb,
>                 goto out;
>         old_rflow = rflow;
>         rflow = &flow_table->flows[flow_id];
> -       rflow->filter = rc;
> -       if (old_rflow->filter == rflow->filter)
> +       WRITE_ONCE(rflow->filter, rc);
> +       if (old_rflow->filter == READ_ONCE(rflow->filter))

You missed the obvious opportunity to use

        if (old_rflow->filter == rc)

Here your code is going to force the compiler to read the memory right
after a prior write, adding a stall on some arches.

>                 old_rflow->filter = RPS_NO_FILTER;
>  out:
>  #endif
> @@ -4666,7 +4666,7 @@ bool rps_may_expire_flow(struct net_device *dev, u16 rxq_index,
>         if (flow_table && flow_id <= flow_table->mask) {
>                 rflow = &flow_table->flows[flow_id];
>                 cpu = READ_ONCE(rflow->cpu);
> -               if (rflow->filter == filter_id && cpu < nr_cpu_ids &&
> +               if (READ_ONCE(rflow->filter) == filter_id && cpu < nr_cpu_ids &&
>                     ((int)(READ_ONCE(per_cpu(softnet_data, cpu).input_queue_head) -
>                            READ_ONCE(rflow->last_qtail)) <
>                     (int)(10 * flow_table->mask)))
> --
> 2.37.3
>
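For reference, a minimal sketch of the pattern Eric is suggesting, with the surrounding context abbreviated from set_rps_cpu() in net/core/dev.c (illustrative, not the literal committed change):

```c
/* Compare against the local variable 'rc' instead of re-reading
 * rflow->filter, which the previous statement just wrote: 'rc' is
 * already in a register, so the compiler is not forced to issue a
 * load immediately after the store. */
WRITE_ONCE(rflow->filter, rc);
if (old_rflow->filter == rc)
        old_rflow->filter = RPS_NO_FILTER;
```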
On Wed, Apr 17, 2024 at 6:04 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, Apr 17, 2024 at 8:27 AM Jason Xing <kerneljasonxing@gmail.com> wrote:
> >
> > From: Jason Xing <kernelxing@tencent.com>
> >
> > As we can see, rflow->filter can be written/read concurrently, so
> > lockless access is needed.
> >
> > Signed-off-by: Jason Xing <kernelxing@tencent.com>
[...]
> > @@ -4524,8 +4524,8 @@ set_rps_cpu(struct net_device *dev, struct sk_buff *skb,
> >                 goto out;
> >         old_rflow = rflow;
> >         rflow = &flow_table->flows[flow_id];
> > -       rflow->filter = rc;
> > -       if (old_rflow->filter == rflow->filter)
> > +       WRITE_ONCE(rflow->filter, rc);
> > +       if (old_rflow->filter == READ_ONCE(rflow->filter))
>
> You missed the obvious opportunity to use
>
>         if (old_rflow->filter == rc)
>
> Here your code is going to force the compiler to read the memory right
> after a prior write, adding a stall on some arches.

Thanks, I see. I will remove the READ_ONCE() and reuse 'rc'.

I would like to ask one related question: could multiple threads
access the same rflow in set_rps_cpu() concurrently? Because I was
wondering whether I should use READ_ONCE() here to prevent another
thread from accessing/modifying this value concurrently. The answer
is probably yes?

Thanks,
Jason

[...]
On Wed, Apr 17, 2024 at 1:03 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> On Wed, Apr 17, 2024 at 6:04 PM Eric Dumazet <edumazet@google.com> wrote:
> >
[...]
> > You missed the obvious opportunity to use
> >
> >         if (old_rflow->filter == rc)
> >
> > Here your code is going to force the compiler to read the memory right
> > after a prior write, adding a stall on some arches.
>
> Thanks, I see. I will remove the READ_ONCE() and reuse 'rc'.
>
> I would like to ask one related question: could multiple threads
> access the same rflow in set_rps_cpu() concurrently? Because I was
> wondering whether I should use READ_ONCE() here to prevent another
> thread from accessing/modifying this value concurrently.

READ_ONCE() would not prevent this.

> The answer is probably yes?

I think the answer is no.

rflow is located in rxqueue->rps_flow_table, so it is private to the
current thread. Only one cpu can service an RX queue at a time.

I think you can scrap the patch series. I will instead remove the
unneeded annotations:

diff --git a/include/net/rps.h b/include/net/rps.h
index a93401d23d66e45210acc73f0326087813b69d59..3f913464a2b321efe38a05dd107bf134fae6ad17 100644
--- a/include/net/rps.h
+++ b/include/net/rps.h
@@ -134,7 +134,7 @@ static inline u32 rps_input_queue_tail_incr(struct softnet_data *sd)
 static inline void rps_input_queue_tail_save(u32 *dest, u32 tail)
 {
 #ifdef CONFIG_RPS
-       WRITE_ONCE(*dest, tail);
+       *dest = tail;
 #endif
 }

diff --git a/net/core/dev.c b/net/core/dev.c
index 854a3a28a8d85b335a9158378ae0cca6dfbf8b36..d774e4009790c9af30d3c8f9a5eab83e9cf01bd8 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4613,7 +4613,7 @@ static int get_rps_cpu(struct net_device *dev, struct sk_buff *skb,
                if (unlikely(tcpu != next_cpu) &&
                    (tcpu >= nr_cpu_ids || !cpu_online(tcpu) ||
                     ((int)(READ_ONCE(per_cpu(softnet_data, tcpu).input_queue_head) -
-                     READ_ONCE(rflow->last_qtail))) >= 0)) {
+                     rflow->last_qtail)) >= 0)) {
                        tcpu = next_cpu;
                        rflow = set_rps_cpu(dev, skb, rflow, next_cpu);
                }
@@ -4668,7 +4668,7 @@ bool rps_may_expire_flow(struct net_device *dev, u16 rxq_index,
                cpu = READ_ONCE(rflow->cpu);
                if (rflow->filter == filter_id && cpu < nr_cpu_ids &&
                    ((int)(READ_ONCE(per_cpu(softnet_data, cpu).input_queue_head) -
-                          READ_ONCE(rflow->last_qtail)) <
+                          rflow->last_qtail) <
                    (int)(10 * flow_table->mask)))
                        expire = false;
        }
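To make the single-writer argument concrete, here is a rough sketch of the data structures involved; the field layout follows include/net/rps.h around the time of this thread, but treat it as illustrative rather than authoritative:

```c
/* Each RX queue owns one of these tables (rxqueue->rps_flow_table),
 * and NAPI services a given RX queue from only one CPU at a time, so
 * the enqueue path effectively has a single writer per entry. */
struct rps_dev_flow {
        u16 cpu;                 /* CPU this flow was last steered to */
        u16 filter;              /* aRFS hardware filter id */
        unsigned int last_qtail; /* input queue tail at last enqueue */
};

struct rps_dev_flow_table {
        unsigned int mask;           /* number of entries - 1 */
        struct rcu_head rcu;
        struct rps_dev_flow flows[]; /* indexed by hash & mask */
};
```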
On Wed, Apr 17, 2024 at 1:52 PM Eric Dumazet <edumazet@google.com> wrote:
>
> @@ -4668,7 +4668,7 @@ bool rps_may_expire_flow(struct net_device *dev, u16 rxq_index,
>                 cpu = READ_ONCE(rflow->cpu);
>                 if (rflow->filter == filter_id && cpu < nr_cpu_ids &&
>                     ((int)(READ_ONCE(per_cpu(softnet_data, cpu).input_queue_head) -
> -                          READ_ONCE(rflow->last_qtail)) <
> +                          rflow->last_qtail) <
>                     (int)(10 * flow_table->mask)))
>                         expire = false;
>         }

Oh well, rps_may_expire_flow() might be called from other contexts, so
only the READ_ONCE() from get_rps_cpu() is not really necessary.
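As an aside on the check quoted above, a short annotated sketch of why the comparison carries the (int) casts; this restates the existing logic rather than changing it:

```c
/* input_queue_head and last_qtail are free-running u32 counters, so
 * their difference must be interpreted as a signed distance to stay
 * correct across wraparound. The flow is kept alive if fewer than
 * 10 * (table size) packets were queued since its last enqueue. */
u32 head = READ_ONCE(per_cpu(softnet_data, cpu).input_queue_head);
u32 tail = READ_ONCE(rflow->last_qtail);

if ((int)(head - tail) < (int)(10 * flow_table->mask))
        expire = false; /* recently active: keep the hardware filter */
```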
On Wed, Apr 17, 2024 at 7:58 PM Eric Dumazet <edumazet@google.com> wrote:
>
> On Wed, Apr 17, 2024 at 1:52 PM Eric Dumazet <edumazet@google.com> wrote:
> >
> > @@ -4668,7 +4668,7 @@ bool rps_may_expire_flow(struct net_device *dev, u16 rxq_index,
> >                 cpu = READ_ONCE(rflow->cpu);
> >                 if (rflow->filter == filter_id && cpu < nr_cpu_ids &&
> >                     ((int)(READ_ONCE(per_cpu(softnet_data, cpu).input_queue_head) -
> > -                          READ_ONCE(rflow->last_qtail)) <
> > +                          rflow->last_qtail) <
> >                     (int)(10 * flow_table->mask)))
> >                         expire = false;
> >         }
>
> Oh well, rps_may_expire_flow() might be called from other contexts, so
> only the READ_ONCE() from get_rps_cpu() is not really necessary.

Thanks for explaining the access logic around qtail in the previous email.

Yes, I'm writing exactly what you're saying now :) I can keep
protecting rflow->cpu and rflow->filter locklessly.

I can remove the unneeded annotations around qtail as you suggested
with those two patches if I can, or you can submit it first. It's up
to you :)

Thanks,
Jason
On Wed, Apr 17, 2024 at 8:14 PM Jason Xing <kerneljasonxing@gmail.com> wrote:
>
> On Wed, Apr 17, 2024 at 7:58 PM Eric Dumazet <edumazet@google.com> wrote:
> >
[...]
> > Oh well, rps_may_expire_flow() might be called from other contexts, so
> > only the READ_ONCE() from get_rps_cpu() is not really necessary.
>
> Thanks for explaining the access logic around qtail in the previous email.
>
> Yes, I'm writing exactly what you're saying now :) I can keep
> protecting rflow->cpu and rflow->filter locklessly.

I can keep these three patches as they are now, only without that
READ_ONCE(). I have to correct my statement below:

[...]

> I can remove the unneeded annotations around qtail as you suggested
> with those two patches if I can, or you can submit it first. It's up
> to you :)

The 'qtail' also needs protection, so what I said above is not true.

>
> Thanks,
> Jason
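The writer side that makes this protection necessary is the helper Eric's earlier diff would have relaxed; a sketch of the annotated version, following the shape quoted from include/net/rps.h above:

```c
/* Saves the per-flow queue tail that rps_may_expire_flow() may read
 * concurrently from another context (e.g. an aRFS-capable driver),
 * so the store remains a WRITE_ONCE() to pair with the reader's
 * READ_ONCE(). */
static inline void rps_input_queue_tail_save(u32 *dest, u32 tail)
{
#ifdef CONFIG_RPS
        WRITE_ONCE(*dest, tail);
#endif
}
```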
```diff
diff --git a/net/core/dev.c b/net/core/dev.c
index 2003b9a61e40..40a535158e45 100644
--- a/net/core/dev.c
+++ b/net/core/dev.c
@@ -4524,8 +4524,8 @@ set_rps_cpu(struct net_device *dev, struct sk_buff *skb,
                goto out;
        old_rflow = rflow;
        rflow = &flow_table->flows[flow_id];
-       rflow->filter = rc;
-       if (old_rflow->filter == rflow->filter)
+       WRITE_ONCE(rflow->filter, rc);
+       if (old_rflow->filter == READ_ONCE(rflow->filter))
                old_rflow->filter = RPS_NO_FILTER;
 out:
 #endif
@@ -4666,7 +4666,7 @@ bool rps_may_expire_flow(struct net_device *dev, u16 rxq_index,
        if (flow_table && flow_id <= flow_table->mask) {
                rflow = &flow_table->flows[flow_id];
                cpu = READ_ONCE(rflow->cpu);
-               if (rflow->filter == filter_id && cpu < nr_cpu_ids &&
+               if (READ_ONCE(rflow->filter) == filter_id && cpu < nr_cpu_ids &&
                    ((int)(READ_ONCE(per_cpu(softnet_data, cpu).input_queue_head) -
                           READ_ONCE(rflow->last_qtail)) <
                    (int)(10 * flow_table->mask)))
```