Message ID | 20221117112050.3942407-1-qiang1.zhang@intel.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | srcu: Add detection of boot CPU srcu_data structure's->srcu_cblist | expand |
On Thu, Nov 17, 2022 at 07:20:50PM +0800, Zqiang wrote: > Before SRCU_SIZE_WAIT_CALL srcu_size_state is reached, the srcu > callback only insert to boot-CPU srcu_data structure's->srcu_cblist > and other CPU srcu_data structure's->srcu_cblist is always empty. so > before SRCU_SIZE_WAIT_CALL is reached, need to correctly check boot-CPU > pend cbs in srcu_might_be_idle(). > > Signed-off-by: Zqiang <qiang1.zhang@intel.com> > --- > kernel/rcu/srcutree.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c > index 6af031200580..6d9af9901765 100644 > --- a/kernel/rcu/srcutree.c > +++ b/kernel/rcu/srcutree.c > @@ -1098,8 +1098,11 @@ static bool srcu_might_be_idle(struct srcu_struct *ssp) > unsigned long tlast; > > check_init_srcu_struct(ssp); > - /* If the local srcu_data structure has callbacks, not idle. */ > - sdp = raw_cpu_ptr(ssp->sda); > + /* If the boot CPU or local srcu_data structure has callbacks, not idle. */ > + if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_CALL) > + sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id()); > + else > + sdp = raw_cpu_ptr(ssp->sda); While at it if someone is interested in documenting/commenting on the meaning of all these SRCU_SIZE_* things, it would be much appreciated! Thanks. > spin_lock_irqsave_rcu_node(sdp, flags); > if (rcu_segcblist_pend_cbs(&sdp->srcu_cblist)) { > spin_unlock_irqrestore_rcu_node(sdp, flags); > -- > 2.25.1 >
On Thu, Nov 17, 2022 at 07:20:50PM +0800, Zqiang wrote: > Before SRCU_SIZE_WAIT_CALL srcu_size_state is reached, the srcu > callback only insert to boot-CPU srcu_data structure's->srcu_cblist > and other CPU srcu_data structure's->srcu_cblist is always empty. so > before SRCU_SIZE_WAIT_CALL is reached, need to correctly check boot-CPU > pend cbs in srcu_might_be_idle(). > > Signed-off-by: Zqiang <qiang1.zhang@intel.com> > --- > kernel/rcu/srcutree.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c > index 6af031200580..6d9af9901765 100644 > --- a/kernel/rcu/srcutree.c > +++ b/kernel/rcu/srcutree.c > @@ -1098,8 +1098,11 @@ static bool srcu_might_be_idle(struct srcu_struct *ssp) > unsigned long tlast; > > check_init_srcu_struct(ssp); > - /* If the local srcu_data structure has callbacks, not idle. */ > - sdp = raw_cpu_ptr(ssp->sda); > + /* If the boot CPU or local srcu_data structure has callbacks, not idle. */ > + if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_CALL) > + sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id()); > + else > + sdp = raw_cpu_ptr(ssp->sda); > >While at it if someone is interested in documenting/commenting on the meaning of >all these SRCU_SIZE_* things, it would be much appreciated! In the initial stage(SRCU_SIZE_SMALL), there is no srcu_node structures, only initialized per-CPU srcu_data structures, at this time, we only use boot-cpu srcu_data structures's ->srcu_cblist to store srcu callbacks. if the contention of srcu_struct and srcu_data structure's->lock become busy, transition to SRCU_SIZE_ALLOC. allocated memory for srcu_node structure at end of the SRCU grace period. if allocated success, transition to SRCU_SIZE_WAIT_BARRIER, in this state, invoke srcu_barrier() will iterate over all possible CPUs, but we still only use boot-CPU srcu_cblist to store callbacks. the task calling call_srcu() may have access to a new srcu_node structure or may not, because call_srcu() is protected by SRCU, we may not transition to SRCU_SIZE_WAIT_CALL quickly, need to wait in this state for a SRCU grace period to end. After transition to SRCU_SIZE_WAIT_CALL, we enqueue srcu callbacks to own srcu_data structures's ->srcu_cblist. Does my description make my commit more detailed? Thanks Zqiang > >Thanks. > > spin_lock_irqsave_rcu_node(sdp, flags); > if (rcu_segcblist_pend_cbs(&sdp->srcu_cblist)) { > spin_unlock_irqrestore_rcu_node(sdp, flags); > -- > 2.25.1 >
>On Thu, Nov 17, 2022 at 07:20:50PM +0800, Zqiang wrote: > Before SRCU_SIZE_WAIT_CALL srcu_size_state is reached, the srcu > callback only insert to boot-CPU srcu_data structure's->srcu_cblist > and other CPU srcu_data structure's->srcu_cblist is always empty. so > before SRCU_SIZE_WAIT_CALL is reached, need to correctly check boot-CPU > pend cbs in srcu_might_be_idle(). > > Signed-off-by: Zqiang <qiang1.zhang@intel.com> > --- > kernel/rcu/srcutree.c | 7 +++++-- > 1 file changed, 5 insertions(+), 2 deletions(-) > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c > index 6af031200580..6d9af9901765 100644 > --- a/kernel/rcu/srcutree.c > +++ b/kernel/rcu/srcutree.c > @@ -1098,8 +1098,11 @@ static bool srcu_might_be_idle(struct srcu_struct *ssp) > unsigned long tlast; > > check_init_srcu_struct(ssp); > - /* If the local srcu_data structure has callbacks, not idle. */ > - sdp = raw_cpu_ptr(ssp->sda); > + /* If the boot CPU or local srcu_data structure has callbacks, not idle. */ > + if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_CALL) > + sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id()); > + else > + sdp = raw_cpu_ptr(ssp->sda); > Hi Paul, For the convert_to_big default configuration(SRCU_SIZING_AUTO), I found that the srcu is in SRCU_SIZE_SMALL state most of the time in the system, I think this change is meaningful. Thoughts ? Thanks Zqiang >While at it if someone is interested in documenting/commenting on the meaning of >all these SRCU_SIZE_* things, it would be much appreciated! > >In the initial stage(SRCU_SIZE_SMALL), there is no srcu_node structures, only initialized >per-CPU srcu_data structures, at this time, we only use boot-cpu srcu_data structures's ->srcu_cblist >to store srcu callbacks. >if the contention of srcu_struct and srcu_data structure's->lock become busy, >transition to SRCU_SIZE_ALLOC. allocated memory for srcu_node structure at end of the SRCU >grace period. >if allocated success, transition to SRCU_SIZE_WAIT_BARRIER, in this state, invoke >srcu_barrier() will iterate over all possible CPUs, but we still only use boot-CPU srcu_cblist to store callbacks. >the task calling call_srcu() may have access to a new srcu_node structure or may not, >because call_srcu() is protected by SRCU, we may not transition to SRCU_SIZE_WAIT_CALL quickly, >need to wait in this state for a SRCU grace period to end. >After transition to SRCU_SIZE_WAIT_CALL, we enqueue srcu callbacks to own srcu_data structures's ->srcu_cblist. > >Does my description make my commit more detailed? > >Thanks >Zqiang > > > >Thanks. > > spin_lock_irqsave_rcu_node(sdp, flags); > if (rcu_segcblist_pend_cbs(&sdp->srcu_cblist)) { > spin_unlock_irqrestore_rcu_node(sdp, flags); > -- > 2.25.1 >
On Wed, Nov 23, 2022 at 02:01:50AM +0000, Zhang, Qiang1 wrote: > > >On Thu, Nov 17, 2022 at 07:20:50PM +0800, Zqiang wrote: > > Before SRCU_SIZE_WAIT_CALL srcu_size_state is reached, the srcu > > callback only insert to boot-CPU srcu_data structure's->srcu_cblist > > and other CPU srcu_data structure's->srcu_cblist is always empty. so > > before SRCU_SIZE_WAIT_CALL is reached, need to correctly check boot-CPU > > pend cbs in srcu_might_be_idle(). > > > > Signed-off-by: Zqiang <qiang1.zhang@intel.com> > > --- > > kernel/rcu/srcutree.c | 7 +++++-- > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c > > index 6af031200580..6d9af9901765 100644 > > --- a/kernel/rcu/srcutree.c > > +++ b/kernel/rcu/srcutree.c > > @@ -1098,8 +1098,11 @@ static bool srcu_might_be_idle(struct srcu_struct *ssp) > > unsigned long tlast; > > > > check_init_srcu_struct(ssp); > > - /* If the local srcu_data structure has callbacks, not idle. */ > > - sdp = raw_cpu_ptr(ssp->sda); > > + /* If the boot CPU or local srcu_data structure has callbacks, not idle. */ > > + if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_CALL) > > + sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id()); > > + else > > + sdp = raw_cpu_ptr(ssp->sda); > > > > Hi Paul, > > For the convert_to_big default configuration(SRCU_SIZING_AUTO), I found that the srcu is in > SRCU_SIZE_SMALL state most of the time in the system, I think this change is meaningful. > > Thoughts ? You are right that this change might make srcu_might_be_idle() return a more accurate value in the common case. As things are now, some other CPU might just now have added a callback, but might not yet have started that new grace period. In that case, we might expedite an SRCU grace period when we would not have otherwise done so. However, this change would also increase contention on the get_boot_cpu_id() CPU's srcu_data structure's ->lock. So the result of that inaccurate return value is that the first two SRCU grace periods in a burst might be expedited instead of only the first one, and even then only if we hit a very narrow race window. Besides, this same thing can happen if two CPUs do synchronize_srcu() at the same time after a long-enough pause between grace periods. Both see no callbacks, so both ask for an expedited grace period. So again, the first two grace periods are expedited. As a result, I am having a very hard time justifying the increased lock contention. Am I missing something here? Thanx, Paul > Thanks > Zqiang > > > >While at it if someone is interested in documenting/commenting on the meaning of > >all these SRCU_SIZE_* things, it would be much appreciated! > > > >In the initial stage(SRCU_SIZE_SMALL), there is no srcu_node structures, only initialized > >per-CPU srcu_data structures, at this time, we only use boot-cpu srcu_data structures's ->srcu_cblist > >to store srcu callbacks. > >if the contention of srcu_struct and srcu_data structure's->lock become busy, > >transition to SRCU_SIZE_ALLOC. allocated memory for srcu_node structure at end of the SRCU > >grace period. > >if allocated success, transition to SRCU_SIZE_WAIT_BARRIER, in this state, invoke > >srcu_barrier() will iterate over all possible CPUs, but we still only use boot-CPU srcu_cblist to store callbacks. > >the task calling call_srcu() may have access to a new srcu_node structure or may not, > >because call_srcu() is protected by SRCU, we may not transition to SRCU_SIZE_WAIT_CALL quickly, > >need to wait in this state for a SRCU grace period to end. > >After transition to SRCU_SIZE_WAIT_CALL, we enqueue srcu callbacks to own srcu_data structures's ->srcu_cblist. > > > >Does my description make my commit more detailed? > > > >Thanks > >Zqiang > > > > > > > > > >Thanks. > > > > spin_lock_irqsave_rcu_node(sdp, flags); > > if (rcu_segcblist_pend_cbs(&sdp->srcu_cblist)) { > > spin_unlock_irqrestore_rcu_node(sdp, flags); > > -- > > 2.25.1 > >
On Wed, Nov 23, 2022 at 02:01:50AM +0000, Zhang, Qiang1 wrote: > > >On Thu, Nov 17, 2022 at 07:20:50PM +0800, Zqiang wrote: > > Before SRCU_SIZE_WAIT_CALL srcu_size_state is reached, the srcu > > callback only insert to boot-CPU srcu_data structure's->srcu_cblist > > and other CPU srcu_data structure's->srcu_cblist is always empty. so > > before SRCU_SIZE_WAIT_CALL is reached, need to correctly check boot-CPU > > pend cbs in srcu_might_be_idle(). > > > > Signed-off-by: Zqiang <qiang1.zhang@intel.com> > > --- > > kernel/rcu/srcutree.c | 7 +++++-- > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c > > index 6af031200580..6d9af9901765 100644 > > --- a/kernel/rcu/srcutree.c > > +++ b/kernel/rcu/srcutree.c > > @@ -1098,8 +1098,11 @@ static bool srcu_might_be_idle(struct srcu_struct *ssp) > > unsigned long tlast; > > > > check_init_srcu_struct(ssp); > > - /* If the local srcu_data structure has callbacks, not idle. */ > > - sdp = raw_cpu_ptr(ssp->sda); > > + /* If the boot CPU or local srcu_data structure has callbacks, not idle. */ > > + if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_CALL) > > + sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id()); > > + else > > + sdp = raw_cpu_ptr(ssp->sda); > > > > Hi Paul, > > For the convert_to_big default configuration(SRCU_SIZING_AUTO), I found that the srcu is in > SRCU_SIZE_SMALL state most of the time in the system, I think this change is meaningful. > > Thoughts ? >You are right that this change might make srcu_might_be_idle() return a >more accurate value in the common case. As things are now, some other >CPU might just now have added a callback, but might not yet have started >that new grace period. In that case, we might expedite an SRCU grace >period when we would not have otherwise done so. However, this change >would also increase contention on the get_boot_cpu_id() CPU's srcu_data >structure's ->lock. > >So the result of that inaccurate return value is that the first two SRCU >grace periods in a burst might be expedited instead of only the first one, >and even then only if we hit a very narrow race window. > >Besides, this same thing can happen if two CPUs do synchronize_srcu() >at the same time after a long-enough pause between grace periods. >Both see no callbacks, so both ask for an expedited grace period. >So again, the first two grace periods are expedited. > >As a result, I am having a very hard time justifying the increased >lock contention. Thanks reply, I have another question, Is this srcu_data structure's->lock necessary? the rcu_segcblist_pend_cbs() only check *tails[RCU_DONE_TAIL] is empty. or can use rcu_segcblist_get_seglen(RCU_WAIT_TAIL + RCU_NEXT_READY_TAIL + RCU_NEXT_TAIL) instead of rcu_segcblist_pend_cbs() to avoid locking? (although this is also inaccurate) Thanks Zqiang > >Am I missing something here? > > Thanx, Paul > > Thanks > Zqiang > > > >While at it if someone is interested in documenting/commenting on the meaning of > >all these SRCU_SIZE_* things, it would be much appreciated! > > > >In the initial stage(SRCU_SIZE_SMALL), there is no srcu_node structures, only initialized > >per-CPU srcu_data structures, at this time, we only use boot-cpu srcu_data structures's ->srcu_cblist > >to store srcu callbacks. > >if the contention of srcu_struct and srcu_data structure's->lock become busy, > >transition to SRCU_SIZE_ALLOC. allocated memory for srcu_node structure at end of the SRCU > >grace period. > >if allocated success, transition to SRCU_SIZE_WAIT_BARRIER, in this state, invoke > >srcu_barrier() will iterate over all possible CPUs, but we still only use boot-CPU srcu_cblist to store callbacks. > >the task calling call_srcu() may have access to a new srcu_node structure or may not, > >because call_srcu() is protected by SRCU, we may not transition to SRCU_SIZE_WAIT_CALL quickly, > >need to wait in this state for a SRCU grace period to end. > >After transition to SRCU_SIZE_WAIT_CALL, we enqueue srcu callbacks to own srcu_data structures's ->srcu_cblist. > > > >Does my description make my commit more detailed? > > > >Thanks > >Zqiang > > > > > > > > > >Thanks. > > > > spin_lock_irqsave_rcu_node(sdp, flags); > > if (rcu_segcblist_pend_cbs(&sdp->srcu_cblist)) { > > spin_unlock_irqrestore_rcu_node(sdp, flags); > > -- > > 2.25.1 > >
On Thu, Nov 24, 2022 at 01:43:42AM +0000, Zhang, Qiang1 wrote: > On Wed, Nov 23, 2022 at 02:01:50AM +0000, Zhang, Qiang1 wrote: > > > > >On Thu, Nov 17, 2022 at 07:20:50PM +0800, Zqiang wrote: > > > Before SRCU_SIZE_WAIT_CALL srcu_size_state is reached, the srcu > > > callback only insert to boot-CPU srcu_data structure's->srcu_cblist > > > and other CPU srcu_data structure's->srcu_cblist is always empty. so > > > before SRCU_SIZE_WAIT_CALL is reached, need to correctly check boot-CPU > > > pend cbs in srcu_might_be_idle(). > > > > > > Signed-off-by: Zqiang <qiang1.zhang@intel.com> > > > --- > > > kernel/rcu/srcutree.c | 7 +++++-- > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c > > > index 6af031200580..6d9af9901765 100644 > > > --- a/kernel/rcu/srcutree.c > > > +++ b/kernel/rcu/srcutree.c > > > @@ -1098,8 +1098,11 @@ static bool srcu_might_be_idle(struct srcu_struct *ssp) > > > unsigned long tlast; > > > > > > check_init_srcu_struct(ssp); > > > - /* If the local srcu_data structure has callbacks, not idle. */ > > > - sdp = raw_cpu_ptr(ssp->sda); > > > + /* If the boot CPU or local srcu_data structure has callbacks, not idle. */ > > > + if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_CALL) > > > + sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id()); > > > + else > > > + sdp = raw_cpu_ptr(ssp->sda); > > > > > > > Hi Paul, > > > > For the convert_to_big default configuration(SRCU_SIZING_AUTO), I found that the srcu is in > > SRCU_SIZE_SMALL state most of the time in the system, I think this change is meaningful. > > > > Thoughts ? > > >You are right that this change might make srcu_might_be_idle() return a > >more accurate value in the common case. As things are now, some other > >CPU might just now have added a callback, but might not yet have started > >that new grace period. In that case, we might expedite an SRCU grace > >period when we would not have otherwise done so. However, this change > >would also increase contention on the get_boot_cpu_id() CPU's srcu_data > >structure's ->lock. > > > >So the result of that inaccurate return value is that the first two SRCU > >grace periods in a burst might be expedited instead of only the first one, > >and even then only if we hit a very narrow race window. > > > >Besides, this same thing can happen if two CPUs do synchronize_srcu() > >at the same time after a long-enough pause between grace periods. > >Both see no callbacks, so both ask for an expedited grace period. > >So again, the first two grace periods are expedited. > > > >As a result, I am having a very hard time justifying the increased > >lock contention. > > Thanks reply, I have another question, Is this srcu_data structure's->lock necessary? > the rcu_segcblist_pend_cbs() only check *tails[RCU_DONE_TAIL] is empty. > or can use rcu_segcblist_get_seglen(RCU_WAIT_TAIL + RCU_NEXT_READY_TAIL + RCU_NEXT_TAIL) > instead of rcu_segcblist_pend_cbs() to avoid locking? (although this is also inaccurate) That extra "*" means that the lock must be acquired. Otherwise, the pointed-to callback might even be unmapped between the time we fetched the pointer and the time we dereferenced it. Thanx, Paul. > Thanks > Zqiang > > > > >Am I missing something here? > > > > Thanx, Paul > > > > Thanks > > Zqiang > > > > > > >While at it if someone is interested in documenting/commenting on the meaning of > > >all these SRCU_SIZE_* things, it would be much appreciated! > > > > > >In the initial stage(SRCU_SIZE_SMALL), there is no srcu_node structures, only initialized > > >per-CPU srcu_data structures, at this time, we only use boot-cpu srcu_data structures's ->srcu_cblist > > >to store srcu callbacks. > > >if the contention of srcu_struct and srcu_data structure's->lock become busy, > > >transition to SRCU_SIZE_ALLOC. allocated memory for srcu_node structure at end of the SRCU > > >grace period. > > >if allocated success, transition to SRCU_SIZE_WAIT_BARRIER, in this state, invoke > > >srcu_barrier() will iterate over all possible CPUs, but we still only use boot-CPU srcu_cblist to store callbacks. > > >the task calling call_srcu() may have access to a new srcu_node structure or may not, > > >because call_srcu() is protected by SRCU, we may not transition to SRCU_SIZE_WAIT_CALL quickly, > > >need to wait in this state for a SRCU grace period to end. > > >After transition to SRCU_SIZE_WAIT_CALL, we enqueue srcu callbacks to own srcu_data structures's ->srcu_cblist. > > > > > >Does my description make my commit more detailed? > > > > > >Thanks > > >Zqiang > > > > > > > > > > > > > > > >Thanks. > > > > > > spin_lock_irqsave_rcu_node(sdp, flags); > > > if (rcu_segcblist_pend_cbs(&sdp->srcu_cblist)) { > > > spin_unlock_irqrestore_rcu_node(sdp, flags); > > > -- > > > 2.25.1 > > >
On Thu, Nov 24, 2022 at 01:43:42AM +0000, Zhang, Qiang1 wrote: > On Wed, Nov 23, 2022 at 02:01:50AM +0000, Zhang, Qiang1 wrote: > > > > >On Thu, Nov 17, 2022 at 07:20:50PM +0800, Zqiang wrote: > > > Before SRCU_SIZE_WAIT_CALL srcu_size_state is reached, the srcu > > > callback only insert to boot-CPU srcu_data structure's->srcu_cblist > > > and other CPU srcu_data structure's->srcu_cblist is always empty. so > > > before SRCU_SIZE_WAIT_CALL is reached, need to correctly check boot-CPU > > > pend cbs in srcu_might_be_idle(). > > > > > > Signed-off-by: Zqiang <qiang1.zhang@intel.com> > > > --- > > > kernel/rcu/srcutree.c | 7 +++++-- > > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c > > > index 6af031200580..6d9af9901765 100644 > > > --- a/kernel/rcu/srcutree.c > > > +++ b/kernel/rcu/srcutree.c > > > @@ -1098,8 +1098,11 @@ static bool srcu_might_be_idle(struct srcu_struct *ssp) > > > unsigned long tlast; > > > > > > check_init_srcu_struct(ssp); > > > - /* If the local srcu_data structure has callbacks, not idle. */ > > > - sdp = raw_cpu_ptr(ssp->sda); > > > + /* If the boot CPU or local srcu_data structure has callbacks, not idle. */ > > > + if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_CALL) > > > + sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id()); > > > + else > > > + sdp = raw_cpu_ptr(ssp->sda); > > > > > > > Hi Paul, > > > > For the convert_to_big default configuration(SRCU_SIZING_AUTO), I found that the srcu is in > > SRCU_SIZE_SMALL state most of the time in the system, I think this change is meaningful. > > > > Thoughts ? > > >You are right that this change might make srcu_might_be_idle() return a > >more accurate value in the common case. As things are now, some other > >CPU might just now have added a callback, but might not yet have started > >that new grace period. In that case, we might expedite an SRCU grace > >period when we would not have otherwise done so. However, this change > >would also increase contention on the get_boot_cpu_id() CPU's srcu_data > >structure's ->lock. > > > >So the result of that inaccurate return value is that the first two SRCU > >grace periods in a burst might be expedited instead of only the first one, > >and even then only if we hit a very narrow race window. > > > >Besides, this same thing can happen if two CPUs do synchronize_srcu() > >at the same time after a long-enough pause between grace periods. > >Both see no callbacks, so both ask for an expedited grace period. > >So again, the first two grace periods are expedited. > > > >As a result, I am having a very hard time justifying the increased > >lock contention. > > Thanks reply, I have another question, Is this srcu_data structure's->lock necessary? > the rcu_segcblist_pend_cbs() only check *tails[RCU_DONE_TAIL] is empty. > or can use rcu_segcblist_get_seglen(RCU_WAIT_TAIL + RCU_NEXT_READY_TAIL + RCU_NEXT_TAIL) > instead of rcu_segcblist_pend_cbs() to avoid locking? (although this is also inaccurate) >That extra "*" means that the lock must be acquired. Otherwise, the >pointed-to callback might even be unmapped between the time we fetched >the pointer and the time we dereferenced it. Thanks for detailed explanation , learn more
On Thu, Nov 17, 2022 at 03:20:25PM +0100, Frederic Weisbecker wrote: > On Thu, Nov 17, 2022 at 07:20:50PM +0800, Zqiang wrote: > > Before SRCU_SIZE_WAIT_CALL srcu_size_state is reached, the srcu > > callback only insert to boot-CPU srcu_data structure's->srcu_cblist > > and other CPU srcu_data structure's->srcu_cblist is always empty. so > > before SRCU_SIZE_WAIT_CALL is reached, need to correctly check boot-CPU > > pend cbs in srcu_might_be_idle(). > > > > Signed-off-by: Zqiang <qiang1.zhang@intel.com> > > --- > > kernel/rcu/srcutree.c | 7 +++++-- > > 1 file changed, 5 insertions(+), 2 deletions(-) > > > > diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c > > index 6af031200580..6d9af9901765 100644 > > --- a/kernel/rcu/srcutree.c > > +++ b/kernel/rcu/srcutree.c > > @@ -1098,8 +1098,11 @@ static bool srcu_might_be_idle(struct srcu_struct *ssp) > > unsigned long tlast; > > > > check_init_srcu_struct(ssp); > > - /* If the local srcu_data structure has callbacks, not idle. */ > > - sdp = raw_cpu_ptr(ssp->sda); > > + /* If the boot CPU or local srcu_data structure has callbacks, not idle. */ > > + if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_CALL) > > + sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id()); > > + else > > + sdp = raw_cpu_ptr(ssp->sda); > > While at it if someone is interested in documenting/commenting on the meaning of > all these SRCU_SIZE_* things, it would be much appreciated! > Due to a conflict understanding to the code, once hesitate to jump in. But anyway, just bold to move ahead. I have send a series "[PATCH 0/3] srcu: shrink the num of srcu_size_state", which opens the discussion. (Have Cced you and Zqiang) Please have a look. Thanks, Pingfan > Thanks. > > > spin_lock_irqsave_rcu_node(sdp, flags); > > if (rcu_segcblist_pend_cbs(&sdp->srcu_cblist)) { > > spin_unlock_irqrestore_rcu_node(sdp, flags); > > -- > > 2.25.1 > >
diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c index 6af031200580..6d9af9901765 100644 --- a/kernel/rcu/srcutree.c +++ b/kernel/rcu/srcutree.c @@ -1098,8 +1098,11 @@ static bool srcu_might_be_idle(struct srcu_struct *ssp) unsigned long tlast; check_init_srcu_struct(ssp); - /* If the local srcu_data structure has callbacks, not idle. */ - sdp = raw_cpu_ptr(ssp->sda); + /* If the boot CPU or local srcu_data structure has callbacks, not idle. */ + if (smp_load_acquire(&ssp->srcu_size_state) < SRCU_SIZE_WAIT_CALL) + sdp = per_cpu_ptr(ssp->sda, get_boot_cpu_id()); + else + sdp = raw_cpu_ptr(ssp->sda); spin_lock_irqsave_rcu_node(sdp, flags); if (rcu_segcblist_pend_cbs(&sdp->srcu_cblist)) { spin_unlock_irqrestore_rcu_node(sdp, flags);
Before SRCU_SIZE_WAIT_CALL srcu_size_state is reached, the srcu callback only insert to boot-CPU srcu_data structure's->srcu_cblist and other CPU srcu_data structure's->srcu_cblist is always empty. so before SRCU_SIZE_WAIT_CALL is reached, need to correctly check boot-CPU pend cbs in srcu_might_be_idle(). Signed-off-by: Zqiang <qiang1.zhang@intel.com> --- kernel/rcu/srcutree.c | 7 +++++-- 1 file changed, 5 insertions(+), 2 deletions(-)