Message ID | 5131FB4C.7070408@cn.fujitsu.com (mailing list archive) |
---|---|
State | Not Applicable, archived |
Headers | show |
On 03/02/2013 06:44 PM, Lai Jiangshan wrote: > From 345a7a75c314ff567be48983e0892bc69c4452e7 Mon Sep 17 00:00:00 2001 > From: Lai Jiangshan <laijs@cn.fujitsu.com> > Date: Sat, 2 Mar 2013 20:33:14 +0800 > Subject: [PATCH] lglock: add read-preference local-global rwlock > > Current lglock is not read-preference, so it can't be used on some cases > which read-preference rwlock can do. Example, get_cpu_online_atomic(). > [...] > diff --git a/kernel/lglock.c b/kernel/lglock.c > index 6535a66..52e9b2c 100644 > --- a/kernel/lglock.c > +++ b/kernel/lglock.c > @@ -87,3 +87,71 @@ void lg_global_unlock(struct lglock *lg) > preempt_enable(); > } > EXPORT_SYMBOL(lg_global_unlock); > + > +#define FALLBACK_BASE (1UL << 30) > + > +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw) > +{ > + struct lglock *lg = &lgrw->lglock; > + > + preempt_disable(); > + if (likely(!__this_cpu_read(*lgrw->reader_refcnt))) { > + rwlock_acquire_read(&lg->lock_dep_map, 0, 0, _RET_IP_); > + if (unlikely(!arch_spin_trylock(this_cpu_ptr(lg->lock)))) { > + read_lock(&lgrw->fallback_rwlock); > + __this_cpu_write(*lgrw->reader_refcnt, FALLBACK_BASE); > + return; > + } > + } > + > + __this_cpu_inc(*lgrw->reader_refcnt); > +} > +EXPORT_SYMBOL(lg_rwlock_local_read_lock); > + > +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw) > +{ > + switch (__this_cpu_read(*lgrw->reader_refcnt)) { > + case 1: > + __this_cpu_write(*lgrw->reader_refcnt, 0); > + lg_local_unlock(&lgrw->lglock); > + return; This should be a break, instead of a return, right? Otherwise, there will be a preempt imbalance... > + case FALLBACK_BASE: > + __this_cpu_write(*lgrw->reader_refcnt, 0); > + read_unlock(&lgrw->fallback_rwlock); > + rwlock_release(&lg->lock_dep_map, 1, _RET_IP_); > + break; > + default: > + __this_cpu_dec(*lgrw->reader_refcnt); > + break; > + } > + > + preempt_enable(); > +} Regards, Srivatsa S. Bhat -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/02, Lai Jiangshan wrote: > > +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw) > +{ > + switch (__this_cpu_read(*lgrw->reader_refcnt)) { > + case 1: > + __this_cpu_write(*lgrw->reader_refcnt, 0); > + lg_local_unlock(&lgrw->lglock); > + return; > + case FALLBACK_BASE: > + __this_cpu_write(*lgrw->reader_refcnt, 0); > + read_unlock(&lgrw->fallback_rwlock); > + rwlock_release(&lg->lock_dep_map, 1, _RET_IP_); I guess "case 1:" should do rwlock_release() too. Otherwise, at first glance looks correct... However, I still think that FALLBACK_BASE only adds the unnecessary complications. But even if I am right this is subjective of course, please feel free to ignore. And btw, I am not sure about lg->lock_dep_map, perhaps we should use fallback_rwlock->dep_map ? We need rwlock_acquire_read() even in the fast-path, and this acquire_read should be paired with rwlock_acquire() in _write_lock(), but it does spin_acquire(lg->lock_dep_map). Yes, currently this is the same (afaics) but perhaps fallback_rwlock->dep_map would be more clean. Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/02, Oleg Nesterov wrote: > > On 03/02, Lai Jiangshan wrote: > > > > +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw) > > +{ > > + switch (__this_cpu_read(*lgrw->reader_refcnt)) { > > + case 1: > > + __this_cpu_write(*lgrw->reader_refcnt, 0); > > + lg_local_unlock(&lgrw->lglock); > > + return; > > + case FALLBACK_BASE: > > + __this_cpu_write(*lgrw->reader_refcnt, 0); > > + read_unlock(&lgrw->fallback_rwlock); > > + rwlock_release(&lg->lock_dep_map, 1, _RET_IP_); > > I guess "case 1:" should do rwlock_release() too. > > Otherwise, at first glance looks correct... > > However, I still think that FALLBACK_BASE only adds the unnecessary > complications. But even if I am right this is subjective of course, please > feel free to ignore. Yes, but... > And btw, I am not sure about lg->lock_dep_map, perhaps we should use > fallback_rwlock->dep_map ? > > We need rwlock_acquire_read() even in the fast-path, and this acquire_read > should be paired with rwlock_acquire() in _write_lock(), but it does > spin_acquire(lg->lock_dep_map). Yes, currently this is the same (afaics) > but perhaps fallback_rwlock->dep_map would be more clean. Please ignore this part. I missed that lg_rwlock_global_write_lock() relies on lg_global_lock(), and I just noticed that it does rwlock_acquire(lg->lock_dep_map). Hmm. But then I do not understand the lglock annotations. Obviously, rwlock_acquire_read() in lg_local_lock() can't even detect the simplest deadlock, say, lg_local_lock(LOCK) + lg_local_lock(LOCK). Not to mention spin_lock(X) + lg_local_lock(Y) vs lg_local_lock(Y) + spin_lock(X). OK, I understand that it is not easy to make these annotations correct... Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Sun, Mar 3, 2013 at 9:40 AM, Oleg Nesterov <oleg@redhat.com> wrote: >> However, I still think that FALLBACK_BASE only adds the unnecessary >> complications. But even if I am right this is subjective of course, please >> feel free to ignore. Would it help if I sent out that version (without FALLBACK_BASE) as a formal proposal ? > Hmm. But then I do not understand the lglock annotations. Obviously, > rwlock_acquire_read() in lg_local_lock() can't even detect the simplest > deadlock, say, lg_local_lock(LOCK) + lg_local_lock(LOCK). Not to mention > spin_lock(X) + lg_local_lock(Y) vs lg_local_lock(Y) + spin_lock(X). > > OK, I understand that it is not easy to make these annotations correct... I am going to send out a proposal to fix the existing lglock annotations and detect the two cases you noticed. It's actually not that hard :)
On 03/03/13 01:20, Oleg Nesterov wrote: > On 03/02, Lai Jiangshan wrote: >> >> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw) >> +{ >> + switch (__this_cpu_read(*lgrw->reader_refcnt)) { >> + case 1: >> + __this_cpu_write(*lgrw->reader_refcnt, 0); >> + lg_local_unlock(&lgrw->lglock); >> + return; >> + case FALLBACK_BASE: >> + __this_cpu_write(*lgrw->reader_refcnt, 0); >> + read_unlock(&lgrw->fallback_rwlock); >> + rwlock_release(&lg->lock_dep_map, 1, _RET_IP_); > > I guess "case 1:" should do rwlock_release() too. Already do it in "lg_local_unlock(&lgrw->lglock);" before it returns. (I like reuse old code) > > Otherwise, at first glance looks correct... > > However, I still think that FALLBACK_BASE only adds the unnecessary > complications. But even if I am right this is subjective of course, please > feel free to ignore. OK, I kill FALLBACK_BASE in later patch. > > And btw, I am not sure about lg->lock_dep_map, perhaps we should use > fallback_rwlock->dep_map ? Use either one is OK. > > We need rwlock_acquire_read() even in the fast-path, and this acquire_read > should be paired with rwlock_acquire() in _write_lock(), but it does > spin_acquire(lg->lock_dep_map). Yes, currently this is the same (afaics) > but perhaps fallback_rwlock->dep_map would be more clean. > I can't tell which one is better. I try to use fallback_rwlock->dep_map later. > Oleg. > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/03/13 01:11, Srivatsa S. Bhat wrote: > On 03/02/2013 06:44 PM, Lai Jiangshan wrote: >> From 345a7a75c314ff567be48983e0892bc69c4452e7 Mon Sep 17 00:00:00 2001 >> From: Lai Jiangshan <laijs@cn.fujitsu.com> >> Date: Sat, 2 Mar 2013 20:33:14 +0800 >> Subject: [PATCH] lglock: add read-preference local-global rwlock >> >> Current lglock is not read-preference, so it can't be used on some cases >> which read-preference rwlock can do. Example, get_cpu_online_atomic(). >> > [...] >> diff --git a/kernel/lglock.c b/kernel/lglock.c >> index 6535a66..52e9b2c 100644 >> --- a/kernel/lglock.c >> +++ b/kernel/lglock.c >> @@ -87,3 +87,71 @@ void lg_global_unlock(struct lglock *lg) >> preempt_enable(); >> } >> EXPORT_SYMBOL(lg_global_unlock); >> + >> +#define FALLBACK_BASE (1UL << 30) >> + >> +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw) >> +{ >> + struct lglock *lg = &lgrw->lglock; >> + >> + preempt_disable(); >> + if (likely(!__this_cpu_read(*lgrw->reader_refcnt))) { >> + rwlock_acquire_read(&lg->lock_dep_map, 0, 0, _RET_IP_); >> + if (unlikely(!arch_spin_trylock(this_cpu_ptr(lg->lock)))) { >> + read_lock(&lgrw->fallback_rwlock); >> + __this_cpu_write(*lgrw->reader_refcnt, FALLBACK_BASE); >> + return; >> + } >> + } >> + >> + __this_cpu_inc(*lgrw->reader_refcnt); >> +} >> +EXPORT_SYMBOL(lg_rwlock_local_read_lock); >> + >> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw) >> +{ >> + switch (__this_cpu_read(*lgrw->reader_refcnt)) { >> + case 1: >> + __this_cpu_write(*lgrw->reader_refcnt, 0); >> + lg_local_unlock(&lgrw->lglock); >> + return; > > This should be a break, instead of a return, right? > Otherwise, there will be a preempt imbalance... "lockdep" and "preempt" are handled in lg_local_unlock(&lgrw->lglock); Thanks, Lai > >> + case FALLBACK_BASE: >> + __this_cpu_write(*lgrw->reader_refcnt, 0); >> + read_unlock(&lgrw->fallback_rwlock); >> + rwlock_release(&lg->lock_dep_map, 1, _RET_IP_); >> + break; >> + default: >> + __this_cpu_dec(*lgrw->reader_refcnt); >> + break; >> + } >> + >> + preempt_enable(); >> +} > > > Regards, > Srivatsa S. Bhat > > -- > To unsubscribe from this list: send the line "unsubscribe linux-kernel" in > the body of a message to majordomo@vger.kernel.org > More majordomo info at http://vger.kernel.org/majordomo-info.html > Please read the FAQ at http://www.tux.org/lkml/ > -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/05, Lai Jiangshan wrote: > > On 03/03/13 01:20, Oleg Nesterov wrote: > > On 03/02, Lai Jiangshan wrote: > >> > >> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw) > >> +{ > >> + switch (__this_cpu_read(*lgrw->reader_refcnt)) { > >> + case 1: > >> + __this_cpu_write(*lgrw->reader_refcnt, 0); > >> + lg_local_unlock(&lgrw->lglock); > >> + return; > >> + case FALLBACK_BASE: > >> + __this_cpu_write(*lgrw->reader_refcnt, 0); > >> + read_unlock(&lgrw->fallback_rwlock); > >> + rwlock_release(&lg->lock_dep_map, 1, _RET_IP_); > > > > I guess "case 1:" should do rwlock_release() too. > > Already do it in "lg_local_unlock(&lgrw->lglock);" before it returns. > (I like reuse old code) Yes, I was wrong thanks. Another case when I didn't notice that you re-use the regular lg_ code... > > We need rwlock_acquire_read() even in the fast-path, and this acquire_read > > should be paired with rwlock_acquire() in _write_lock(), but it does > > spin_acquire(lg->lock_dep_map). Yes, currently this is the same (afaics) > > but perhaps fallback_rwlock->dep_map would be more clean. > > I can't tell which one is better. I try to use fallback_rwlock->dep_map later. I am not sure which one should be better too, please check. Again, I forgot that _write_lock/unlock use lg_global_*() code. Oleg. -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 03/05/2013 09:11 PM, Lai Jiangshan wrote: > On 03/03/13 01:11, Srivatsa S. Bhat wrote: >> On 03/02/2013 06:44 PM, Lai Jiangshan wrote: >>> From 345a7a75c314ff567be48983e0892bc69c4452e7 Mon Sep 17 00:00:00 2001 >>> From: Lai Jiangshan <laijs@cn.fujitsu.com> >>> Date: Sat, 2 Mar 2013 20:33:14 +0800 >>> Subject: [PATCH] lglock: add read-preference local-global rwlock >>> >>> Current lglock is not read-preference, so it can't be used on some cases >>> which read-preference rwlock can do. Example, get_cpu_online_atomic(). >>> >> [...] >>> diff --git a/kernel/lglock.c b/kernel/lglock.c >>> index 6535a66..52e9b2c 100644 >>> --- a/kernel/lglock.c >>> +++ b/kernel/lglock.c >>> @@ -87,3 +87,71 @@ void lg_global_unlock(struct lglock *lg) >>> preempt_enable(); >>> } >>> EXPORT_SYMBOL(lg_global_unlock); >>> + >>> +#define FALLBACK_BASE (1UL << 30) >>> + >>> +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw) >>> +{ >>> + struct lglock *lg = &lgrw->lglock; >>> + >>> + preempt_disable(); >>> + if (likely(!__this_cpu_read(*lgrw->reader_refcnt))) { >>> + rwlock_acquire_read(&lg->lock_dep_map, 0, 0, _RET_IP_); >>> + if (unlikely(!arch_spin_trylock(this_cpu_ptr(lg->lock)))) { >>> + read_lock(&lgrw->fallback_rwlock); >>> + __this_cpu_write(*lgrw->reader_refcnt, FALLBACK_BASE); >>> + return; >>> + } >>> + } >>> + >>> + __this_cpu_inc(*lgrw->reader_refcnt); >>> +} >>> +EXPORT_SYMBOL(lg_rwlock_local_read_lock); >>> + >>> +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw) >>> +{ >>> + switch (__this_cpu_read(*lgrw->reader_refcnt)) { >>> + case 1: >>> + __this_cpu_write(*lgrw->reader_refcnt, 0); >>> + lg_local_unlock(&lgrw->lglock); >>> + return; >> >> This should be a break, instead of a return, right? >> Otherwise, there will be a preempt imbalance... > > > "lockdep" and "preempt" are handled in lg_local_unlock(&lgrw->lglock); > Ah, ok.. I had missed that. Regards, Srivatsa S. Bhat -- To unsubscribe from this list: send the line "unsubscribe linux-pm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/include/linux/lglock.h b/include/linux/lglock.h index 0d24e93..90bfe79 100644 --- a/include/linux/lglock.h +++ b/include/linux/lglock.h @@ -67,4 +67,42 @@ void lg_local_unlock_cpu(struct lglock *lg, int cpu); void lg_global_lock(struct lglock *lg); void lg_global_unlock(struct lglock *lg); +/* read-preference read-write-lock like rwlock but provides local-read-lock */ +struct lgrwlock { + unsigned long __percpu *reader_refcnt; + struct lglock lglock; + rwlock_t fallback_rwlock; +}; + +#define __DEFINE_LGRWLOCK_PERCPU_DATA(name) \ + static DEFINE_PER_CPU(arch_spinlock_t, name ## _lock) \ + = __ARCH_SPIN_LOCK_UNLOCKED; \ + static DEFINE_PER_CPU(unsigned long, name ## _refcnt); + +#define __LGRWLOCK_INIT(name) \ + { \ + .reader_refcnt = &name ## _refcnt, \ + .lglock = { .lock = &name ## _lock }, \ + .fallback_rwlock = __RW_LOCK_UNLOCKED(name.fallback_rwlock)\ + } + +#define DEFINE_LGRWLOCK(name) \ + __DEFINE_LGRWLOCK_PERCPU_DATA(name) \ + struct lgrwlock name = __LGRWLOCK_INIT(name) + +#define DEFINE_STATIC_LGRWLOCK(name) \ + __DEFINE_LGRWLOCK_PERCPU_DATA(name) \ + static struct lgrwlock name = __LGRWLOCK_INIT(name) + +static inline void lg_rwlock_init(struct lgrwlock *lgrw, char *name) +{ + lg_lock_init(&lgrw->lglock, name); +} + +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw); +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw); +void lg_rwlock_global_write_lock(struct lgrwlock *lgrw); +void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw); +void __lg_rwlock_global_read_write_lock(struct lgrwlock *lgrw); +void __lg_rwlock_global_read_write_unlock(struct lgrwlock *lgrw); #endif diff --git a/kernel/lglock.c b/kernel/lglock.c index 6535a66..52e9b2c 100644 --- a/kernel/lglock.c +++ b/kernel/lglock.c @@ -87,3 +87,71 @@ void lg_global_unlock(struct lglock *lg) preempt_enable(); } EXPORT_SYMBOL(lg_global_unlock); + +#define FALLBACK_BASE (1UL << 30) + +void lg_rwlock_local_read_lock(struct lgrwlock *lgrw) +{ + struct lglock *lg = &lgrw->lglock; + + preempt_disable(); + if (likely(!__this_cpu_read(*lgrw->reader_refcnt))) { + rwlock_acquire_read(&lg->lock_dep_map, 0, 0, _RET_IP_); + if (unlikely(!arch_spin_trylock(this_cpu_ptr(lg->lock)))) { + read_lock(&lgrw->fallback_rwlock); + __this_cpu_write(*lgrw->reader_refcnt, FALLBACK_BASE); + return; + } + } + + __this_cpu_inc(*lgrw->reader_refcnt); +} +EXPORT_SYMBOL(lg_rwlock_local_read_lock); + +void lg_rwlock_local_read_unlock(struct lgrwlock *lgrw) +{ + switch (__this_cpu_read(*lgrw->reader_refcnt)) { + case 1: + __this_cpu_write(*lgrw->reader_refcnt, 0); + lg_local_unlock(&lgrw->lglock); + return; + case FALLBACK_BASE: + __this_cpu_write(*lgrw->reader_refcnt, 0); + read_unlock(&lgrw->fallback_rwlock); + rwlock_release(&lg->lock_dep_map, 1, _RET_IP_); + break; + default: + __this_cpu_dec(*lgrw->reader_refcnt); + break; + } + + preempt_enable(); +} +EXPORT_SYMBOL(lg_rwlock_local_read_unlock); + +void lg_rwlock_global_write_lock(struct lgrwlock *lgrw) +{ + lg_global_lock(&lgrw->lglock); + write_lock(&lgrw->fallback_rwlock); +} +EXPORT_SYMBOL(lg_rwlock_global_write_lock); + +void lg_rwlock_global_write_unlock(struct lgrwlock *lgrw) +{ + write_unlock(&lgrw->fallback_rwlock); + lg_global_unlock(&lgrw->lglock); +} +EXPORT_SYMBOL(lg_rwlock_global_write_unlock); + +/* special write-site APIs allolw nested reader in such write-site */ +void __lg_rwlock_global_read_write_lock(struct lgrwlock *lgrw) +{ + lg_rwlock_global_write_lock(lgrw); + __this_cpu_write(*lgrw->reader_refcnt, 1); +} + +void __lg_rwlock_global_read_write_unlock(struct lgrwlock *lgrw) +{ + __this_cpu_write(*lgrw->reader_refcnt, 0); + lg_rwlock_global_write_unlock(lgrw); +}