Message ID | 1424769899-14158-1-git-send-email-raghavendra.kt@linux.vnet.ibm.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On Tue, Feb 24, 2015 at 02:54:59PM +0530, Raghavendra K T wrote: > Paravirt spinlock clears slowpath flag after doing unlock. > As explained by Linus currently it does: > prev = *lock; > add_smp(&lock->tickets.head, TICKET_LOCK_INC); > > /* add_smp() is a full mb() */ > > if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG)) > __ticket_unlock_slowpath(lock, prev); > > which is *exactly* the kind of things you cannot do with spinlocks, > because after you've done the "add_smp()" and released the spinlock > for the fast-path, you can't access the spinlock any more. Exactly > because a fast-path lock might come in, and release the whole data > structure. > > Linus suggested that we should not do any writes to lock after unlock(), > and we can move slowpath clearing to fastpath lock. > > So this patch implements the fix with: > 1. Moving slowpath flag to head (Oleg): > Unlocked locks don't care about the slowpath flag; therefore we can keep > it set after the last unlock, and clear it again on the first (try)lock. > -- this removes the write after unlock. note that keeping slowpath flag would > result in unnecessary kicks. > By moving the slowpath flag from the tail to the head ticket we also avoid > the need to access both the head and tail tickets on unlock. > > 2. use xadd to avoid read/write after unlock that checks the need for > unlock_kick (Linus): > We further avoid the need for a read-after-release by using xadd; > the prev head value will include the slowpath flag and indicate if we > need to do PV kicking of suspended spinners -- on modern chips xadd > isn't (much) more expensive than an add + load. > > Result: > setup: 16core (32 cpu +ht sandy bridge 8GB 16vcpu guest) > benchmark overcommit %improve > kernbench 1x -0.13 > kernbench 2x 0.02 > dbench 1x -1.77 > dbench 2x -0.63 > > [Jeremy: hinted missing TICKET_LOCK_INC for kick] > [Oleg: Moving slowpath flag to head, ticket_equals idea] > [PeterZ: Detailed changelog] > > Reported-by: Sasha Levin <sasha.levin@oracle.com> > Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> > Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> > Reviewed-by: Oleg Nesterov <oleg@redhat.com> > Acked-by: David Vrabel <david.vrabel@citrix.com> > --- > arch/x86/include/asm/spinlock.h | 94 ++++++++++++++++++++--------------------- > arch/x86/kernel/kvm.c | 7 ++- > arch/x86/xen/spinlock.c | 7 ++- > 3 files changed, 58 insertions(+), 50 deletions(-) > > Changes for stable: > - Don't replace the ACCESS_ONCE to READ_ONCE which would cause horraneous > Compiler warnings (Linus, David Vbriel, PeterZ, Ingo) What is the git commit id of this in Linus's tree? What stable tree(s) do you want this applied to? thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
* Greg KH <gregkh@linuxfoundation.org> wrote: > On Tue, Feb 24, 2015 at 02:54:59PM +0530, Raghavendra K T wrote: > > Paravirt spinlock clears slowpath flag after doing unlock. > > As explained by Linus currently it does: > > prev = *lock; > > add_smp(&lock->tickets.head, TICKET_LOCK_INC); > > > > /* add_smp() is a full mb() */ > > > > if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG)) > > __ticket_unlock_slowpath(lock, prev); > > > > which is *exactly* the kind of things you cannot do with spinlocks, > > because after you've done the "add_smp()" and released the spinlock > > for the fast-path, you can't access the spinlock any more. Exactly > > because a fast-path lock might come in, and release the whole data > > structure. > > > > Linus suggested that we should not do any writes to lock after unlock(), > > and we can move slowpath clearing to fastpath lock. > > > > So this patch implements the fix with: > > 1. Moving slowpath flag to head (Oleg): > > Unlocked locks don't care about the slowpath flag; therefore we can keep > > it set after the last unlock, and clear it again on the first (try)lock. > > -- this removes the write after unlock. note that keeping slowpath flag would > > result in unnecessary kicks. > > By moving the slowpath flag from the tail to the head ticket we also avoid > > the need to access both the head and tail tickets on unlock. > > > > 2. use xadd to avoid read/write after unlock that checks the need for > > unlock_kick (Linus): > > We further avoid the need for a read-after-release by using xadd; > > the prev head value will include the slowpath flag and indicate if we > > need to do PV kicking of suspended spinners -- on modern chips xadd > > isn't (much) more expensive than an add + load. > > > > Result: > > setup: 16core (32 cpu +ht sandy bridge 8GB 16vcpu guest) > > benchmark overcommit %improve > > kernbench 1x -0.13 > > kernbench 2x 0.02 > > dbench 1x -1.77 > > dbench 2x -0.63 > > > > [Jeremy: hinted missing TICKET_LOCK_INC for kick] > > [Oleg: Moving slowpath flag to head, ticket_equals idea] > > [PeterZ: Detailed changelog] > > > > Reported-by: Sasha Levin <sasha.levin@oracle.com> > > Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> > > Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> > > Reviewed-by: Oleg Nesterov <oleg@redhat.com> > > Acked-by: David Vrabel <david.vrabel@citrix.com> > > --- > > arch/x86/include/asm/spinlock.h | 94 ++++++++++++++++++++--------------------- > > arch/x86/kernel/kvm.c | 7 ++- > > arch/x86/xen/spinlock.c | 7 ++- > > 3 files changed, 58 insertions(+), 50 deletions(-) > > > > Changes for stable: > > - Don't replace the ACCESS_ONCE to READ_ONCE which would cause horraneous > > Compiler warnings (Linus, David Vbriel, PeterZ, Ingo) > > What is the git commit id of this in Linus's tree? What > stable tree(s) do you want this applied to? It's: d6abfdb20223 x86/spinlocks/paravirt: Fix memory corruption on unlock You'll also need this fix from Linus to avoid (harmless) build warnings: dd36929720f4 kernel: make READ_ONCE() valid on const arguments Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Feb 24, 2015 at 03:47:37PM +0100, Ingo Molnar wrote: > > * Greg KH <gregkh@linuxfoundation.org> wrote: > > > On Tue, Feb 24, 2015 at 02:54:59PM +0530, Raghavendra K T wrote: > > > Paravirt spinlock clears slowpath flag after doing unlock. > > > As explained by Linus currently it does: > > > prev = *lock; > > > add_smp(&lock->tickets.head, TICKET_LOCK_INC); > > > > > > /* add_smp() is a full mb() */ > > > > > > if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG)) > > > __ticket_unlock_slowpath(lock, prev); > > > > > > which is *exactly* the kind of things you cannot do with spinlocks, > > > because after you've done the "add_smp()" and released the spinlock > > > for the fast-path, you can't access the spinlock any more. Exactly > > > because a fast-path lock might come in, and release the whole data > > > structure. > > > > > > Linus suggested that we should not do any writes to lock after unlock(), > > > and we can move slowpath clearing to fastpath lock. > > > > > > So this patch implements the fix with: > > > 1. Moving slowpath flag to head (Oleg): > > > Unlocked locks don't care about the slowpath flag; therefore we can keep > > > it set after the last unlock, and clear it again on the first (try)lock. > > > -- this removes the write after unlock. note that keeping slowpath flag would > > > result in unnecessary kicks. > > > By moving the slowpath flag from the tail to the head ticket we also avoid > > > the need to access both the head and tail tickets on unlock. > > > > > > 2. use xadd to avoid read/write after unlock that checks the need for > > > unlock_kick (Linus): > > > We further avoid the need for a read-after-release by using xadd; > > > the prev head value will include the slowpath flag and indicate if we > > > need to do PV kicking of suspended spinners -- on modern chips xadd > > > isn't (much) more expensive than an add + load. > > > > > > Result: > > > setup: 16core (32 cpu +ht sandy bridge 8GB 16vcpu guest) > > > benchmark overcommit %improve > > > kernbench 1x -0.13 > > > kernbench 2x 0.02 > > > dbench 1x -1.77 > > > dbench 2x -0.63 > > > > > > [Jeremy: hinted missing TICKET_LOCK_INC for kick] > > > [Oleg: Moving slowpath flag to head, ticket_equals idea] > > > [PeterZ: Detailed changelog] > > > > > > Reported-by: Sasha Levin <sasha.levin@oracle.com> > > > Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> > > > Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> > > > Reviewed-by: Oleg Nesterov <oleg@redhat.com> > > > Acked-by: David Vrabel <david.vrabel@citrix.com> > > > --- > > > arch/x86/include/asm/spinlock.h | 94 ++++++++++++++++++++--------------------- > > > arch/x86/kernel/kvm.c | 7 ++- > > > arch/x86/xen/spinlock.c | 7 ++- > > > 3 files changed, 58 insertions(+), 50 deletions(-) > > > > > > Changes for stable: > > > - Don't replace the ACCESS_ONCE to READ_ONCE which would cause horraneous > > > Compiler warnings (Linus, David Vbriel, PeterZ, Ingo) > > > > What is the git commit id of this in Linus's tree? What > > stable tree(s) do you want this applied to? > > It's: > > d6abfdb20223 x86/spinlocks/paravirt: Fix memory corruption on unlock > > You'll also need this fix from Linus to avoid (harmless) > build warnings: > > dd36929720f4 kernel: make READ_ONCE() valid on const arguments Great. But what stable kernel trees should it be backported to? Just 3.19? Or anything older? thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 02/24/2015 08:17 PM, Ingo Molnar wrote: > > * Greg KH <gregkh@linuxfoundation.org> wrote: > >> On Tue, Feb 24, 2015 at 02:54:59PM +0530, Raghavendra K T wrote: >>> Paravirt spinlock clears slowpath flag after doing unlock. >>> As explained by Linus currently it does: >>> prev = *lock; >>> add_smp(&lock->tickets.head, TICKET_LOCK_INC); >>> >>> /* add_smp() is a full mb() */ >>> >>> if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG)) >>> __ticket_unlock_slowpath(lock, prev); >>> >>> which is *exactly* the kind of things you cannot do with spinlocks, >>> because after you've done the "add_smp()" and released the spinlock >>> for the fast-path, you can't access the spinlock any more. Exactly >>> because a fast-path lock might come in, and release the whole data >>> structure. >>> >>> Linus suggested that we should not do any writes to lock after unlock(), >>> and we can move slowpath clearing to fastpath lock. >>> >>> So this patch implements the fix with: >>> 1. Moving slowpath flag to head (Oleg): >>> Unlocked locks don't care about the slowpath flag; therefore we can keep >>> it set after the last unlock, and clear it again on the first (try)lock. >>> -- this removes the write after unlock. note that keeping slowpath flag would >>> result in unnecessary kicks. >>> By moving the slowpath flag from the tail to the head ticket we also avoid >>> the need to access both the head and tail tickets on unlock. >>> >>> 2. use xadd to avoid read/write after unlock that checks the need for >>> unlock_kick (Linus): >>> We further avoid the need for a read-after-release by using xadd; >>> the prev head value will include the slowpath flag and indicate if we >>> need to do PV kicking of suspended spinners -- on modern chips xadd >>> isn't (much) more expensive than an add + load. >>> >>> Result: >>> setup: 16core (32 cpu +ht sandy bridge 8GB 16vcpu guest) >>> benchmark overcommit %improve >>> kernbench 1x -0.13 >>> kernbench 2x 0.02 >>> dbench 1x -1.77 >>> dbench 2x -0.63 >>> >>> [Jeremy: hinted missing TICKET_LOCK_INC for kick] >>> [Oleg: Moving slowpath flag to head, ticket_equals idea] >>> [PeterZ: Detailed changelog] >>> >>> Reported-by: Sasha Levin <sasha.levin@oracle.com> >>> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> >>> Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> >>> Reviewed-by: Oleg Nesterov <oleg@redhat.com> >>> Acked-by: David Vrabel <david.vrabel@citrix.com> >>> --- >>> arch/x86/include/asm/spinlock.h | 94 ++++++++++++++++++++--------------------- >>> arch/x86/kernel/kvm.c | 7 ++- >>> arch/x86/xen/spinlock.c | 7 ++- >>> 3 files changed, 58 insertions(+), 50 deletions(-) >>> >>> Changes for stable: >>> - Don't replace the ACCESS_ONCE to READ_ONCE which would cause horraneous >>> Compiler warnings (Linus, David Vbriel, PeterZ, Ingo) >> >> What is the git commit id of this in Linus's tree? What >> stable tree(s) do you want this applied to? > > It's: > > d6abfdb20223 x86/spinlocks/paravirt: Fix memory corruption on unlock Yes, This is the original patch. Please note I have taken out the READ_ONCE changes from the original patch to avoid build warnings mentioned below. (Those READ_ONCE changes were cosmetic and was not present in the previous versions) > > You'll also need this fix from Linus to avoid (harmless) > build warnings: > > dd36929720f4 kernel: make READ_ONCE() valid on const arguments So this may not be absolutely necessary with the current patch. -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On 02/24/2015 08:50 PM, Greg KH wrote: > On Tue, Feb 24, 2015 at 03:47:37PM +0100, Ingo Molnar wrote: >> >> * Greg KH <gregkh@linuxfoundation.org> wrote: >> >>> On Tue, Feb 24, 2015 at 02:54:59PM +0530, Raghavendra K T wrote: >>>> Paravirt spinlock clears slowpath flag after doing unlock. >>>> As explained by Linus currently it does: >>>> prev = *lock; >>>> add_smp(&lock->tickets.head, TICKET_LOCK_INC); >>>> >>>> /* add_smp() is a full mb() */ >>>> >>>> if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG)) >>>> __ticket_unlock_slowpath(lock, prev); >>>> >>>> which is *exactly* the kind of things you cannot do with spinlocks, >>>> because after you've done the "add_smp()" and released the spinlock >>>> for the fast-path, you can't access the spinlock any more. Exactly >>>> because a fast-path lock might come in, and release the whole data >>>> structure. >>>> >>>> Linus suggested that we should not do any writes to lock after unlock(), >>>> and we can move slowpath clearing to fastpath lock. >>>> >>>> So this patch implements the fix with: >>>> 1. Moving slowpath flag to head (Oleg): >>>> Unlocked locks don't care about the slowpath flag; therefore we can keep >>>> it set after the last unlock, and clear it again on the first (try)lock. >>>> -- this removes the write after unlock. note that keeping slowpath flag would >>>> result in unnecessary kicks. >>>> By moving the slowpath flag from the tail to the head ticket we also avoid >>>> the need to access both the head and tail tickets on unlock. >>>> >>>> 2. use xadd to avoid read/write after unlock that checks the need for >>>> unlock_kick (Linus): >>>> We further avoid the need for a read-after-release by using xadd; >>>> the prev head value will include the slowpath flag and indicate if we >>>> need to do PV kicking of suspended spinners -- on modern chips xadd >>>> isn't (much) more expensive than an add + load. >>>> >>>> Result: >>>> setup: 16core (32 cpu +ht sandy bridge 8GB 16vcpu guest) >>>> benchmark overcommit %improve >>>> kernbench 1x -0.13 >>>> kernbench 2x 0.02 >>>> dbench 1x -1.77 >>>> dbench 2x -0.63 >>>> >>>> [Jeremy: hinted missing TICKET_LOCK_INC for kick] >>>> [Oleg: Moving slowpath flag to head, ticket_equals idea] >>>> [PeterZ: Detailed changelog] >>>> >>>> Reported-by: Sasha Levin <sasha.levin@oracle.com> >>>> Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> >>>> Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> >>>> Reviewed-by: Oleg Nesterov <oleg@redhat.com> >>>> Acked-by: David Vrabel <david.vrabel@citrix.com> >>>> --- >>>> arch/x86/include/asm/spinlock.h | 94 ++++++++++++++++++++--------------------- >>>> arch/x86/kernel/kvm.c | 7 ++- >>>> arch/x86/xen/spinlock.c | 7 ++- >>>> 3 files changed, 58 insertions(+), 50 deletions(-) >>>> >>>> Changes for stable: >>>> - Don't replace the ACCESS_ONCE to READ_ONCE which would cause horraneous >>>> Compiler warnings (Linus, David Vbriel, PeterZ, Ingo) >>> >>> What is the git commit id of this in Linus's tree? What >>> stable tree(s) do you want this applied to? >> >> It's: >> >> d6abfdb20223 x86/spinlocks/paravirt: Fix memory corruption on unlock >> >> You'll also need this fix from Linus to avoid (harmless) >> build warnings: >> >> dd36929720f4 kernel: make READ_ONCE() valid on const arguments > > Great. But what stable kernel trees should it be backported to? Just > 3.19? Or anything older? My patch was intended only for 3.19. Though paravirt changes have gone in 3.12, the problem manifested clearly after some of the completion related changes. but I leave that decision to experts here. (I 'll send necessary changes if patch is needed for older versions because it may not apply cleanly). -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
On Tue, Feb 24, 2015 at 11:49:13PM +0530, Raghavendra K T wrote: > On 02/24/2015 08:17 PM, Ingo Molnar wrote: > > > >* Greg KH <gregkh@linuxfoundation.org> wrote: > > > >>On Tue, Feb 24, 2015 at 02:54:59PM +0530, Raghavendra K T wrote: > >>>Paravirt spinlock clears slowpath flag after doing unlock. > >>>As explained by Linus currently it does: > >>> prev = *lock; > >>> add_smp(&lock->tickets.head, TICKET_LOCK_INC); > >>> > >>> /* add_smp() is a full mb() */ > >>> > >>> if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG)) > >>> __ticket_unlock_slowpath(lock, prev); > >>> > >>>which is *exactly* the kind of things you cannot do with spinlocks, > >>>because after you've done the "add_smp()" and released the spinlock > >>>for the fast-path, you can't access the spinlock any more. Exactly > >>>because a fast-path lock might come in, and release the whole data > >>>structure. > >>> > >>>Linus suggested that we should not do any writes to lock after unlock(), > >>>and we can move slowpath clearing to fastpath lock. > >>> > >>>So this patch implements the fix with: > >>>1. Moving slowpath flag to head (Oleg): > >>>Unlocked locks don't care about the slowpath flag; therefore we can keep > >>>it set after the last unlock, and clear it again on the first (try)lock. > >>>-- this removes the write after unlock. note that keeping slowpath flag would > >>>result in unnecessary kicks. > >>>By moving the slowpath flag from the tail to the head ticket we also avoid > >>>the need to access both the head and tail tickets on unlock. > >>> > >>>2. use xadd to avoid read/write after unlock that checks the need for > >>>unlock_kick (Linus): > >>>We further avoid the need for a read-after-release by using xadd; > >>>the prev head value will include the slowpath flag and indicate if we > >>>need to do PV kicking of suspended spinners -- on modern chips xadd > >>>isn't (much) more expensive than an add + load. > >>> > >>>Result: > >>> setup: 16core (32 cpu +ht sandy bridge 8GB 16vcpu guest) > >>> benchmark overcommit %improve > >>> kernbench 1x -0.13 > >>> kernbench 2x 0.02 > >>> dbench 1x -1.77 > >>> dbench 2x -0.63 > >>> > >>>[Jeremy: hinted missing TICKET_LOCK_INC for kick] > >>>[Oleg: Moving slowpath flag to head, ticket_equals idea] > >>>[PeterZ: Detailed changelog] > >>> > >>>Reported-by: Sasha Levin <sasha.levin@oracle.com> > >>>Suggested-by: Linus Torvalds <torvalds@linux-foundation.org> > >>>Signed-off-by: Raghavendra K T <raghavendra.kt@linux.vnet.ibm.com> > >>>Reviewed-by: Oleg Nesterov <oleg@redhat.com> > >>>Acked-by: David Vrabel <david.vrabel@citrix.com> > >>>--- > >>> arch/x86/include/asm/spinlock.h | 94 ++++++++++++++++++++--------------------- > >>> arch/x86/kernel/kvm.c | 7 ++- > >>> arch/x86/xen/spinlock.c | 7 ++- > >>> 3 files changed, 58 insertions(+), 50 deletions(-) > >>> > >>>Changes for stable: > >>> - Don't replace the ACCESS_ONCE to READ_ONCE which would cause horraneous > >>> Compiler warnings (Linus, David Vbriel, PeterZ, Ingo) > >> > >>What is the git commit id of this in Linus's tree? What > >>stable tree(s) do you want this applied to? > > > >It's: > > > > d6abfdb20223 x86/spinlocks/paravirt: Fix memory corruption on unlock > > Yes, This is the original patch. Please note I have taken out the > READ_ONCE changes from the original patch to avoid build warnings > mentioned below. > (Those READ_ONCE changes were cosmetic and was not present in the > previous versions) > > > > >You'll also need this fix from Linus to avoid (harmless) > >build warnings: > > > > dd36929720f4 kernel: make READ_ONCE() valid on const arguments > > So this may not be absolutely necessary with the current patch. I'd prefer to be as close as possible to the upstream patch. So if applying both of these patches will work, I'd much rather do that. Changing patches when backporting them to stable for no good reason than to clean things up, just confuses everyone involved. Let's keep our messy history :) thanks, greg k-h -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
* Greg KH <gregkh@linuxfoundation.org> wrote: > > >It's: > > > > > > d6abfdb20223 x86/spinlocks/paravirt: Fix memory corruption on unlock > > > > Yes, This is the original patch. Please note I have taken out the > > READ_ONCE changes from the original patch to avoid build warnings > > mentioned below. > > (Those READ_ONCE changes were cosmetic and was not present in the > > previous versions) > > > > > > > >You'll also need this fix from Linus to avoid (harmless) > > >build warnings: > > > > > > dd36929720f4 kernel: make READ_ONCE() valid on const arguments > > > > So this may not be absolutely necessary with the current patch. > > I'd prefer to be as close as possible to the upstream > patch. So if applying both of these patches will work, > I'd much rather do that. Changing patches when > backporting them to stable for no good reason than to > clean things up, just confuses everyone involved. > > Let's keep our messy history :) By all means! You'll first need to cherry-pick these commits: 927609d622a3 kernel: tighten rules for ACCESS ONCE c5b19946eb76 kernel: Fix sparse warning for ACCESS_ONCE dd36929720f4 kernel: make READ_ONCE() valid on const arguments That's the minimum set you will need for backporting, due to overlapping changes to the ACCESS_ONCE() definition. and then apply this commit: d6abfdb20223 x86/spinlocks/paravirt: Fix memory corruption on unlock I've double checked that these commits will cherry-pick fine on top of v3.19, in that order, and that an x86-64 defconfig+kvmconfig+PARAVIRT_SPINLOCK=y kernel builds fine without warnings. I've not boot tested the changes, so if anything breaks it's all your fault - while if it works just fine then I'll be glad to take credit for that. Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
Am 25.02.2015 um 11:08 schrieb Ingo Molnar: > > * Greg KH <gregkh@linuxfoundation.org> wrote: > >>>> It's: >>>> >>>> d6abfdb20223 x86/spinlocks/paravirt: Fix memory corruption on unlock >>> >>> Yes, This is the original patch. Please note I have taken out the >>> READ_ONCE changes from the original patch to avoid build warnings >>> mentioned below. >>> (Those READ_ONCE changes were cosmetic and was not present in the >>> previous versions) >>> >>>> >>>> You'll also need this fix from Linus to avoid (harmless) >>>> build warnings: >>>> >>>> dd36929720f4 kernel: make READ_ONCE() valid on const arguments >>> >>> So this may not be absolutely necessary with the current patch. >> >> I'd prefer to be as close as possible to the upstream >> patch. So if applying both of these patches will work, >> I'd much rather do that. Changing patches when >> backporting them to stable for no good reason than to >> clean things up, just confuses everyone involved. >> >> Let's keep our messy history :) > > By all means! > > You'll first need to cherry-pick these commits: > > 927609d622a3 kernel: tighten rules for ACCESS ONCE > c5b19946eb76 kernel: Fix sparse warning for ACCESS_ONCE > dd36929720f4 kernel: make READ_ONCE() valid on const arguments If you go before 3.19, you will also need 230fa253df63 kernel: Provide READ_ONCE and ASSIGN_ONCE 43239cbe79fc kernel: Change ASSIGN_ONCE(val, x) to WRITE_ONCE(x, val) > > That's the minimum set you will need for backporting, due > to overlapping changes to the ACCESS_ONCE() definition. > > and then apply this commit: > > d6abfdb20223 x86/spinlocks/paravirt: Fix memory corruption on unlock the alternative might be to replace READ_ONCE with ACCESS_ONCE when doing the backport. This depends on how important you consider backporting the ACCESS_ONCE fixes. Christian -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
* Christian Borntraeger <borntraeger@de.ibm.com> wrote: > > By all means! > > > > You'll first need to cherry-pick these commits: > > > 927609d622a3 kernel: tighten rules for ACCESS ONCE > > c5b19946eb76 kernel: Fix sparse warning for ACCESS_ONCE > > dd36929720f4 kernel: make READ_ONCE() valid on const arguments > > If you go before 3.19, you will also need > > 230fa253df63 kernel: Provide READ_ONCE and ASSIGN_ONCE > 43239cbe79fc kernel: Change ASSIGN_ONCE(val, x) to WRITE_ONCE(x, val) The affected spinlock code went over several iterations post v3.18, which I think makes the spinlock change too risky and complex to backport so far back. So it's not necessay to backport these READ_ONCE() changes. > > That's the minimum set you will need for backporting, > > due to overlapping changes to the ACCESS_ONCE() > > definition. > > > > and then apply this commit: > > > > d6abfdb20223 x86/spinlocks/paravirt: Fix memory corruption on unlock > > the alternative might be to replace READ_ONCE with > ACCESS_ONCE when doing the backport. Doing changes to patches when doing a backport is a big no-no IMHO. Either there is a clean sequence of upstream commit IDs to cherry-pick, or it should not be backported in most cases. Thanks, Ingo -- To unsubscribe from this list: send the line "unsubscribe kvm" in the body of a message to majordomo@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
diff --git a/arch/x86/include/asm/spinlock.h b/arch/x86/include/asm/spinlock.h index 625660f..cf87de3 100644 --- a/arch/x86/include/asm/spinlock.h +++ b/arch/x86/include/asm/spinlock.h @@ -46,7 +46,7 @@ static __always_inline bool static_key_false(struct static_key *key); static inline void __ticket_enter_slowpath(arch_spinlock_t *lock) { - set_bit(0, (volatile unsigned long *)&lock->tickets.tail); + set_bit(0, (volatile unsigned long *)&lock->tickets.head); } #else /* !CONFIG_PARAVIRT_SPINLOCKS */ @@ -60,10 +60,30 @@ static inline void __ticket_unlock_kick(arch_spinlock_t *lock, } #endif /* CONFIG_PARAVIRT_SPINLOCKS */ +static inline int __tickets_equal(__ticket_t one, __ticket_t two) +{ + return !((one ^ two) & ~TICKET_SLOWPATH_FLAG); +} + +static inline void __ticket_check_and_clear_slowpath(arch_spinlock_t *lock, + __ticket_t head) +{ + if (head & TICKET_SLOWPATH_FLAG) { + arch_spinlock_t old, new; + + old.tickets.head = head; + new.tickets.head = head & ~TICKET_SLOWPATH_FLAG; + old.tickets.tail = new.tickets.head + TICKET_LOCK_INC; + new.tickets.tail = old.tickets.tail; + + /* try to clear slowpath flag when there are no contenders */ + cmpxchg(&lock->head_tail, old.head_tail, new.head_tail); + } +} static __always_inline int arch_spin_value_unlocked(arch_spinlock_t lock) { - return lock.tickets.head == lock.tickets.tail; + return __tickets_equal(lock.tickets.head, lock.tickets.tail); } /* @@ -87,18 +107,21 @@ static __always_inline void arch_spin_lock(arch_spinlock_t *lock) if (likely(inc.head == inc.tail)) goto out; - inc.tail &= ~TICKET_SLOWPATH_FLAG; for (;;) { unsigned count = SPIN_THRESHOLD; do { - if (READ_ONCE(lock->tickets.head) == inc.tail) - goto out; + inc.head = READ_ONCE(lock->tickets.head); + if (__tickets_equal(inc.head, inc.tail)) + goto clear_slowpath; cpu_relax(); } while (--count); __ticket_lock_spinning(lock, inc.tail); } -out: barrier(); /* make sure nothing creeps before the lock is taken */ +clear_slowpath: + __ticket_check_and_clear_slowpath(lock, inc.head); +out: + barrier(); /* make sure nothing creeps before the lock is taken */ } static __always_inline int arch_spin_trylock(arch_spinlock_t *lock) @@ -106,56 +129,30 @@ static __always_inline int arch_spin_trylock(arch_spinlock_t *lock) arch_spinlock_t old, new; old.tickets = READ_ONCE(lock->tickets); - if (old.tickets.head != (old.tickets.tail & ~TICKET_SLOWPATH_FLAG)) + if (!__tickets_equal(old.tickets.head, old.tickets.tail)) return 0; new.head_tail = old.head_tail + (TICKET_LOCK_INC << TICKET_SHIFT); + new.head_tail &= ~TICKET_SLOWPATH_FLAG; /* cmpxchg is a full barrier, so nothing can move before it */ return cmpxchg(&lock->head_tail, old.head_tail, new.head_tail) == old.head_tail; } -static inline void __ticket_unlock_slowpath(arch_spinlock_t *lock, - arch_spinlock_t old) -{ - arch_spinlock_t new; - - BUILD_BUG_ON(((__ticket_t)NR_CPUS) != NR_CPUS); - - /* Perform the unlock on the "before" copy */ - old.tickets.head += TICKET_LOCK_INC; - - /* Clear the slowpath flag */ - new.head_tail = old.head_tail & ~(TICKET_SLOWPATH_FLAG << TICKET_SHIFT); - - /* - * If the lock is uncontended, clear the flag - use cmpxchg in - * case it changes behind our back though. - */ - if (new.tickets.head != new.tickets.tail || - cmpxchg(&lock->head_tail, old.head_tail, - new.head_tail) != old.head_tail) { - /* - * Lock still has someone queued for it, so wake up an - * appropriate waiter. - */ - __ticket_unlock_kick(lock, old.tickets.head); - } -} - static __always_inline void arch_spin_unlock(arch_spinlock_t *lock) { if (TICKET_SLOWPATH_FLAG && - static_key_false(¶virt_ticketlocks_enabled)) { - arch_spinlock_t prev; + static_key_false(¶virt_ticketlocks_enabled)) { + __ticket_t head; - prev = *lock; - add_smp(&lock->tickets.head, TICKET_LOCK_INC); + BUILD_BUG_ON(((__ticket_t)NR_CPUS) != NR_CPUS); - /* add_smp() is a full mb() */ + head = xadd(&lock->tickets.head, TICKET_LOCK_INC); - if (unlikely(lock->tickets.tail & TICKET_SLOWPATH_FLAG)) - __ticket_unlock_slowpath(lock, prev); + if (unlikely(head & TICKET_SLOWPATH_FLAG)) { + head &= ~TICKET_SLOWPATH_FLAG; + __ticket_unlock_kick(lock, (head + TICKET_LOCK_INC)); + } } else __add(&lock->tickets.head, TICKET_LOCK_INC, UNLOCK_LOCK_PREFIX); } @@ -164,14 +161,15 @@ static inline int arch_spin_is_locked(arch_spinlock_t *lock) { struct __raw_tickets tmp = READ_ONCE(lock->tickets); - return tmp.tail != tmp.head; + return !__tickets_equal(tmp.tail, tmp.head); } static inline int arch_spin_is_contended(arch_spinlock_t *lock) { struct __raw_tickets tmp = READ_ONCE(lock->tickets); - return (__ticket_t)(tmp.tail - tmp.head) > TICKET_LOCK_INC; + tmp.head &= ~TICKET_SLOWPATH_FLAG; + return (tmp.tail - tmp.head) > TICKET_LOCK_INC; } #define arch_spin_is_contended arch_spin_is_contended @@ -183,16 +181,16 @@ static __always_inline void arch_spin_lock_flags(arch_spinlock_t *lock, static inline void arch_spin_unlock_wait(arch_spinlock_t *lock) { - __ticket_t head = ACCESS_ONCE(lock->tickets.head); + __ticket_t head = READ_ONCE(lock->tickets.head); for (;;) { - struct __raw_tickets tmp = ACCESS_ONCE(lock->tickets); + struct __raw_tickets tmp = READ_ONCE(lock->tickets); /* * We need to check "unlocked" in a loop, tmp.head == head * can be false positive because of overflow. */ - if (tmp.head == (tmp.tail & ~TICKET_SLOWPATH_FLAG) || - tmp.head != head) + if (__tickets_equal(tmp.head, tmp.tail) || + !__tickets_equal(tmp.head, head)) break; cpu_relax(); diff --git a/arch/x86/kernel/kvm.c b/arch/x86/kernel/kvm.c index 94f6434..c9ccfcf 100644 --- a/arch/x86/kernel/kvm.c +++ b/arch/x86/kernel/kvm.c @@ -727,6 +727,7 @@ __visible void kvm_lock_spinning(struct arch_spinlock *lock, __ticket_t want) int cpu; u64 start; unsigned long flags; + __ticket_t head; if (in_nmi()) return; @@ -768,11 +769,15 @@ __visible void kvm_lock_spinning(struct arch_spinlock *lock, __ticket_t want) */ __ticket_enter_slowpath(lock); + /* make sure enter_slowpath, which is atomic does not cross the read */ + smp_mb__after_atomic(); + /* * check again make sure it didn't become free while * we weren't looking. */ - if (ACCESS_ONCE(lock->tickets.head) == want) { + head = READ_ONCE(lock->tickets.head); + if (__tickets_equal(head, want)) { add_stats(TAKEN_SLOW_PICKUP, 1); goto out; } diff --git a/arch/x86/xen/spinlock.c b/arch/x86/xen/spinlock.c index 23b45eb..4ab725c 100644 --- a/arch/x86/xen/spinlock.c +++ b/arch/x86/xen/spinlock.c @@ -112,6 +112,7 @@ __visible void xen_lock_spinning(struct arch_spinlock *lock, __ticket_t want) struct xen_lock_waiting *w = this_cpu_ptr(&lock_waiting); int cpu = smp_processor_id(); u64 start; + __ticket_t head; unsigned long flags; /* If kicker interrupts not initialized yet, just spin */ @@ -159,11 +160,15 @@ __visible void xen_lock_spinning(struct arch_spinlock *lock, __ticket_t want) */ __ticket_enter_slowpath(lock); + /* make sure enter_slowpath, which is atomic does not cross the read */ + smp_mb__after_atomic(); + /* * check again make sure it didn't become free while * we weren't looking */ - if (ACCESS_ONCE(lock->tickets.head) == want) { + head = READ_ONCE(lock->tickets.head); + if (__tickets_equal(head, want)) { add_stats(TAKEN_SLOW_PICKUP, 1); goto out; }