diff mbox

[v2,1/5] KVM: eventfd: Fix lock order inversion.

Message ID 1395079899-29239-2-git-send-email-cornelia.huck@de.ibm.com (mailing list archive)
State New, archived
Headers show

Commit Message

Cornelia Huck March 17, 2014, 6:11 p.m. UTC
When registering a new irqfd, we call its ->poll method to collect any
event that might have previously been pending so that we can trigger it.
This is done under the kvm->irqfds.lock, which means the eventfd's ctx
lock is taken under it.

However, if we get a POLLHUP in irqfd_wakeup, we will be called with the
ctx lock held before getting the irqfds.lock to deactivate the irqfd,
causing lockdep to complain.

Calling the ->poll method does not really need the irqfds.lock, so let's
just move it after we've given up the irqfds.lock in kvm_irqfd_assign().

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
---
 virt/kvm/eventfd.c |    8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

Comments

Christian Borntraeger March 17, 2014, 9:55 p.m. UTC | #1
On 17/03/14 19:11, Cornelia Huck wrote:
> When registering a new irqfd, we call its ->poll method to collect any
> event that might have previously been pending so that we can trigger it.
> This is done under the kvm->irqfds.lock, which means the eventfd's ctx
> lock is taken under it.
> 
> However, if we get a POLLHUP in irqfd_wakeup, we will be called with the
> ctx lock held before getting the irqfds.lock to deactivate the irqfd,
> causing lockdep to complain.
> 
> Calling the ->poll method does not really need the irqfds.lock, so let's
> just move it after we've given up the irqfds.lock in kvm_irqfd_assign().
> 
> Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Do you still have the lockdep message somewhere? 
Looking at the patch and the description this makes sense. Even without
irqfd for s390:
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>

Paolo, maybe this patch can go in independently from s390?

Christian

> ---
>  virt/kvm/eventfd.c |    8 ++++----
>  1 file changed, 4 insertions(+), 4 deletions(-)
> 
> diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
> index abe4d60..29c2a04 100644
> --- a/virt/kvm/eventfd.c
> +++ b/virt/kvm/eventfd.c
> @@ -391,19 +391,19 @@ kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
>  					   lockdep_is_held(&kvm->irqfds.lock));
>  	irqfd_update(kvm, irqfd, irq_rt);
> 
> -	events = f.file->f_op->poll(f.file, &irqfd->pt);
> -
>  	list_add_tail(&irqfd->list, &kvm->irqfds.items);
> 
> +	spin_unlock_irq(&kvm->irqfds.lock);
> +
>  	/*
>  	 * Check if there was an event already pending on the eventfd
>  	 * before we registered, and trigger it as if we didn't miss it.
>  	 */
> +	events = f.file->f_op->poll(f.file, &irqfd->pt);
> +
>  	if (events & POLLIN)
>  		schedule_work(&irqfd->inject);
> 
> -	spin_unlock_irq(&kvm->irqfds.lock);
> -
>  	/*
>  	 * do not drop the file until the irqfd is fully initialized, otherwise
>  	 * we might race against the POLLHUP
> 

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Cornelia Huck March 18, 2014, 9:18 a.m. UTC | #2
On Mon, 17 Mar 2014 22:55:49 +0100
Christian Borntraeger <borntraeger@de.ibm.com> wrote:

> On 17/03/14 19:11, Cornelia Huck wrote:
> > When registering a new irqfd, we call its ->poll method to collect any
> > event that might have previously been pending so that we can trigger it.
> > This is done under the kvm->irqfds.lock, which means the eventfd's ctx
> > lock is taken under it.
> > 
> > However, if we get a POLLHUP in irqfd_wakeup, we will be called with the
> > ctx lock held before getting the irqfds.lock to deactivate the irqfd,
> > causing lockdep to complain.
> > 
> > Calling the ->poll method does not really need the irqfds.lock, so let's
> > just move it after we've given up the irqfds.lock in kvm_irqfd_assign().
> > 
> > Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
> Do you still have the lockdep message somewhere? 

Unfortunately not, and it does not seem to be readily triggerable.

> Looking at the patch and the description this makes sense. Even without
> irqfd for s390:
> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
> 
> Paolo, maybe this patch can go in independently from s390?
> 
> Christian

--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Paolo Bonzini March 18, 2014, 4:05 p.m. UTC | #3
Il 17/03/2014 22:55, Christian Borntraeger ha scritto:
>> > Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
> Do you still have the lockdep message somewhere?
> Looking at the patch and the description this makes sense. Even without
> irqfd for s390:
> Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
>
> Paolo, maybe this patch can go in independently from s390?

Yes, that's fine.  I'm applying it  to kvm/queue.

Paolo
--
To unsubscribe from this list: send the line "unsubscribe kvm" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
diff mbox

Patch

diff --git a/virt/kvm/eventfd.c b/virt/kvm/eventfd.c
index abe4d60..29c2a04 100644
--- a/virt/kvm/eventfd.c
+++ b/virt/kvm/eventfd.c
@@ -391,19 +391,19 @@  kvm_irqfd_assign(struct kvm *kvm, struct kvm_irqfd *args)
 					   lockdep_is_held(&kvm->irqfds.lock));
 	irqfd_update(kvm, irqfd, irq_rt);
 
-	events = f.file->f_op->poll(f.file, &irqfd->pt);
-
 	list_add_tail(&irqfd->list, &kvm->irqfds.items);
 
+	spin_unlock_irq(&kvm->irqfds.lock);
+
 	/*
 	 * Check if there was an event already pending on the eventfd
 	 * before we registered, and trigger it as if we didn't miss it.
 	 */
+	events = f.file->f_op->poll(f.file, &irqfd->pt);
+
 	if (events & POLLIN)
 		schedule_work(&irqfd->inject);
 
-	spin_unlock_irq(&kvm->irqfds.lock);
-
 	/*
 	 * do not drop the file until the irqfd is fully initialized, otherwise
 	 * we might race against the POLLHUP