
irqchip/gic-v4: Fix ordering between vmapp and vpe locks

Message ID 20240723175203.3193882-1-maz@kernel.org
State New, archived
Series irqchip/gic-v4: Fix ordering between vmapp and vpe locks

Commit Message

Marc Zyngier July 23, 2024, 5:52 p.m. UTC
The (recently established) lock ordering mandates that the per-VM
vmapp_lock is acquired before we take the per-VPE lock.

As it turns out, its_vpe_set_affinity() takes the VPE lock, and
then calls into its_send_vmovp(), which itself takes the vmapp
lock. Obviously, this isn't what we want.

As its_send_vmovp() is only called from its_vpe_set_affinity(),
hoist the vmapp locking from the former into the latter, restoring
the expected order.

Fixes: f0eb154c39471 ("irqchip/gic-v4: Substitute vmovp_lock for a per-VM lock")
Reported-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 drivers/irqchip/irq-gic-v3-its.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)
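
For readers less familiar with the driver, below is a minimal sketch of the
inversion being fixed. The types and function names are simplified stand-ins
invented for illustration (demo_vm, demo_vpe, ordering_before/after are not
the driver's actual definitions); only the lock order matters here.

#include <linux/spinlock.h>

struct demo_vm  { raw_spinlock_t vmapp_lock; };
struct demo_vpe { struct demo_vm *vm; raw_spinlock_t vpe_lock; };

/*
 * Before the fix: its_vpe_set_affinity() took the per-VPE lock, then
 * its_send_vmovp() took the per-VM lock, inverting the mandated
 * vmapp_lock -> vpe_lock order.
 */
static void ordering_before(struct demo_vpe *vpe)
{
	raw_spin_lock(&vpe->vpe_lock);		/* per-VPE lock first */
	raw_spin_lock(&vpe->vm->vmapp_lock);	/* then per-VM lock: inverted */
	/* VMOVP emission happens here */
	raw_spin_unlock(&vpe->vm->vmapp_lock);
	raw_spin_unlock(&vpe->vpe_lock);
}

/*
 * After the fix: the vmapp locking is hoisted into the caller, so the
 * per-VM lock is taken before the per-VPE lock. (The actual patch only
 * takes the per-VM lock when an ITS list is in use, as discussed in the
 * thread below.)
 */
static void ordering_after(struct demo_vpe *vpe)
{
	raw_spin_lock(&vpe->vm->vmapp_lock);	/* per-VM lock first */
	raw_spin_lock(&vpe->vpe_lock);		/* then per-VPE lock */
	/* VMOVP emission happens here */
	raw_spin_unlock(&vpe->vpe_lock);
	raw_spin_unlock(&vpe->vm->vmapp_lock);
}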

Comments

Zhou Wang July 24, 2024, 1:26 a.m. UTC | #1
On 2024/7/24 1:52, Marc Zyngier wrote:
> The (recently established) lock ordering mandates that the per-VM
> vmapp_lock is acquired before we take the per-VPE lock.
> 
> As it turns out, its_vpe_set_affinity() takes the VPE lock, and
> then calls into its_send_vmovp(), which itself takes the vmapp
> lock. Obviously, this isn't what we want.
> 
> As its_send_vmovp() is only called from its_vpe_set_affinity(),
> hoist the vmapp locking from the former into the latter, restoring
> the expected order.
> 
> Fixes: f0eb154c39471 ("irqchip/gic-v4: Substitute vmovp_lock for a per-VM lock")
> Reported-by: Zhou Wang <wangzhou1@hisilicon.com>

Hi Marc,

Tested-by: Zhou Wang <wangzhou1@hisilicon.com>

Thanks,
Zhou

> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
> 
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 951ec140bcea2..b88c6011c8771 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -1328,12 +1328,6 @@ static void its_send_vmovp(struct its_vpe *vpe)
>  		return;
>  	}
>  
> -	/*
> -	 * Protect against concurrent updates of the mapping state on
> -	 * individual VMs.
> -	 */
> -	guard(raw_spinlock_irqsave)(&vpe->its_vm->vmapp_lock);
> -
>  	/*
>  	 * Yet another marvel of the architecture. If using the
>  	 * its_list "feature", we need to make sure that all ITSs
> @@ -3808,7 +3802,7 @@ static int its_vpe_set_affinity(struct irq_data *d,
>  	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
>  	unsigned int from, cpu = nr_cpu_ids;
>  	struct cpumask *table_mask;
> -	unsigned long flags;
> +	unsigned long flags, vmapp_flags;
>  
>  	/*
>  	 * Changing affinity is mega expensive, so let's be as lazy as
> @@ -3822,7 +3816,14 @@ static int its_vpe_set_affinity(struct irq_data *d,
>  	 * protect us, and that we must ensure nobody samples vpe->col_idx
>  	 * during the update, hence the lock below which must also be
>  	 * taken on any vLPI handling path that evaluates vpe->col_idx.
> +	 *
> +	 * Finally, we must protect ourselves against concurrent
> +	 * updates of the mapping state on this VM should the ITS list
> +	 * be in use.
>  	 */
> +	if (its_list_map)
> +		raw_spin_lock_irqsave(&vpe->its_vm->vmapp_lock, vmapp_flags);
> +
>  	from = vpe_to_cpuid_lock(vpe, &flags);
>  	table_mask = gic_data_rdist_cpu(from)->vpe_table_mask;
>  
> @@ -3852,6 +3853,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
>  	irq_data_update_effective_affinity(d, cpumask_of(cpu));
>  	vpe_to_cpuid_unlock(vpe, flags);
>  
> +	if (its_list_map)
> +		raw_spin_unlock_irqrestore(&vpe->its_vm->vmapp_lock, vmapp_flags);
> +
>  	return IRQ_SET_MASK_OK_DONE;
>  }
>
Thomas Gleixner July 26, 2024, 8:52 p.m. UTC | #2
On Tue, Jul 23 2024 at 18:52, Marc Zyngier wrote:
> @@ -3808,7 +3802,7 @@ static int its_vpe_set_affinity(struct irq_data *d,
>  	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
>  	unsigned int from, cpu = nr_cpu_ids;
>  	struct cpumask *table_mask;
> -	unsigned long flags;
> +	unsigned long flags, vmapp_flags;

What's this flags business for? its_vpe_set_affinity() is called with
interrupts disabled, no?
  
>  	/*
>  	 * Changing affinity is mega expensive, so let's be as lazy as
> @@ -3822,7 +3816,14 @@ static int its_vpe_set_affinity(struct irq_data *d,
>  	 * protect us, and that we must ensure nobody samples vpe->col_idx
>  	 * during the update, hence the lock below which must also be
>  	 * taken on any vLPI handling path that evaluates vpe->col_idx.
> +	 *
> +	 * Finally, we must protect ourselves against concurrent
> +	 * updates of the mapping state on this VM should the ITS list
> +	 * be in use.
>  	 */
> +	if (its_list_map)
> +		raw_spin_lock_irqsave(&vpe->its_vm->vmapp_lock, vmapp_flags);

Confused. This changes the locking from unconditional to
conditional. What's the rationale here?

Thanks,

        tglx
Marc Zyngier July 28, 2024, 9:42 a.m. UTC | #3
On Fri, 26 Jul 2024 21:52:40 +0100,
Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> On Tue, Jul 23 2024 at 18:52, Marc Zyngier wrote:
> > @@ -3808,7 +3802,7 @@ static int its_vpe_set_affinity(struct irq_data *d,
> >  	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
> >  	unsigned int from, cpu = nr_cpu_ids;
> >  	struct cpumask *table_mask;
> > -	unsigned long flags;
> > +	unsigned long flags, vmapp_flags;
> 
> What's this flags business for? its_vpe_set_affinity() is called with
> interrupts disabled, no?

Duh. Of course. Cargo-culted braindead logic. I'll fix that.

>   
> >  	/*
> >  	 * Changing affinity is mega expensive, so let's be as lazy as
> > @@ -3822,7 +3816,14 @@ static int its_vpe_set_affinity(struct irq_data *d,
> >  	 * protect us, and that we must ensure nobody samples vpe->col_idx
> >  	 * during the update, hence the lock below which must also be
> >  	 * taken on any vLPI handling path that evaluates vpe->col_idx.
> > +	 *
> > +	 * Finally, we must protect ourselves against concurrent
> > +	 * updates of the mapping state on this VM should the ITS list
> > +	 * be in use.
> >  	 */
> > +	if (its_list_map)
> > +		raw_spin_lock_irqsave(&vpe->its_vm->vmapp_lock, vmapp_flags);
> 
> Confused. This changes the locking from unconditional to
> conditional. What's the rationale here?

I think I'm confused too. I've written this as a mix of the VMOVP lock
(which must be conditional) and the new VMAPP lock, which must be
taken to avoid racing against a new vcpu coming up. And of course,
this makes zero sense.

I'll get some sleep first, and then fix this correctly. Thanks for
spotting it.

	M.
Marc Zyngier July 29, 2024, 7:25 a.m. UTC | #4
On Fri, 26 Jul 2024 21:52:40 +0100,
Thomas Gleixner <tglx@linutronix.de> wrote:
> 
> On Tue, Jul 23 2024 at 18:52, Marc Zyngier wrote:
> > @@ -3808,7 +3802,7 @@ static int its_vpe_set_affinity(struct irq_data *d,
> >  	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
> >  	unsigned int from, cpu = nr_cpu_ids;
> >  	struct cpumask *table_mask;
> > -	unsigned long flags;
> > +	unsigned long flags, vmapp_flags;
> 
> What's this flags business for? its_vpe_set_affinity() is called with
> interrupts disabled, no?
>   
> >  	/*
> >  	 * Changing affinity is mega expensive, so let's be as lazy as
> > @@ -3822,7 +3816,14 @@ static int its_vpe_set_affinity(struct irq_data *d,
> >  	 * protect us, and that we must ensure nobody samples vpe->col_idx
> >  	 * during the update, hence the lock below which must also be
> >  	 * taken on any vLPI handling path that evaluates vpe->col_idx.
> > +	 *
> > +	 * Finally, we must protect ourselves against concurrent
> > +	 * updates of the mapping state on this VM should the ITS list
> > +	 * be in use.
> >  	 */
> > +	if (its_list_map)
> > +		raw_spin_lock_irqsave(&vpe->its_vm->vmapp_lock, vmapp_flags);
> 
> Confused. This changes the locking from unconditional to
> conditional. What's the rationale here?

Haven't managed to sleep much, but came to the conclusion that I
wasn't that stupid in my initial patch. Let's look at the full
picture, starting with its_send_vmovp():

        if (!its_list_map) {
                its = list_first_entry(&its_nodes, struct its_node, entry);
                desc.its_vmovp_cmd.col = &its->collections[col_id];
                its_send_single_vcommand(its, its_build_vmovp_cmd, &desc);
                return;
        }

        /*
         * Protect against concurrent updates of the mapping state on
         * individual VMs.
         */
        guard(raw_spinlock_irqsave)(&vpe->its_vm->vmapp_lock);

The vmapp locking *is* conditional. Which makes a lot of sense, as the
presence of an ITS list is the only thing that prevents the VPEs from
being mapped eagerly at VM startup time (although this is a
performance reason, and not a correctness issue).

So there is no point in taking that lock if there is no ITS list,
given that the VPEs are mapped before we can do anything else. This
has the massive benefit of allowing concurrent VPE affinity changes on
modern HW.

This means that on GICv4.0 without ITSList or GICv4.1, the only lock
we need to acquire is the VPE lock itself on VPE affinity change.

I'll respin the patch shortly.

Thanks,

	M.
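
A minimal sketch of the scheme described above, again with simplified
stand-ins rather than the driver's actual code (the demo_* names are
invented for illustration): the per-VM vmapp lock is taken only when an
ITS list is in use, always before the per-VPE lock, and plain
raw_spin_lock() is used since, as noted earlier in the thread,
set_affinity callbacks run with interrupts already disabled.

#include <linux/spinlock.h>
#include <linux/types.h>

struct demo_vm  { raw_spinlock_t vmapp_lock; };
struct demo_vpe { struct demo_vm *vm; raw_spinlock_t vpe_lock; int col_idx; };

static bool demo_its_list_map;	/* stands in for its_list_map */

static void demo_vpe_set_affinity(struct demo_vpe *vpe, int new_col)
{
	/*
	 * Only with an ITS list can VPEs be mapped lazily, so only then
	 * can a concurrent VMAPP race with the move. Without an ITS list,
	 * skipping the per-VM lock allows concurrent VPE affinity changes.
	 */
	if (demo_its_list_map)
		raw_spin_lock(&vpe->vm->vmapp_lock);

	/* The per-VPE lock is always taken, and always after the VM lock. */
	raw_spin_lock(&vpe->vpe_lock);

	vpe->col_idx = new_col;	/* stands in for the col_idx update + VMOVP */

	raw_spin_unlock(&vpe->vpe_lock);

	if (demo_its_list_map)
		raw_spin_unlock(&vpe->vm->vmapp_lock);
}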
Thomas Gleixner July 29, 2024, 9:48 a.m. UTC | #5
On Mon, Jul 29 2024 at 08:25, Marc Zyngier wrote:
> On Fri, 26 Jul 2024 21:52:40 +0100,
> Thomas Gleixner <tglx@linutronix.de> wrote:
>> Confused. This changes the locking from unconditional to
>> conditional. What's the rationale here?
>
> Haven't managed to sleep much, but came to the conclusion that I
> wasn't that stupid in my initial patch. Let's look at the full
> picture, starting with its_send_vmovp():
>
>         if (!its_list_map) {
>                 its = list_first_entry(&its_nodes, struct its_node, entry);
>                 desc.its_vmovp_cmd.col = &its->collections[col_id];
>                 its_send_single_vcommand(its, its_build_vmovp_cmd, &desc);
>                 return;
>         }
>
>         /*
>          * Protect against concurrent updates of the mapping state on
>          * individual VMs.
>          */
>         guard(raw_spinlock_irqsave)(&vpe->its_vm->vmapp_lock);
>
> The vmapp locking *is* conditional. Which makes a lot of sense as the

Misread the patch ...

Patch

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 951ec140bcea2..b88c6011c8771 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1328,12 +1328,6 @@ static void its_send_vmovp(struct its_vpe *vpe)
 		return;
 	}
 
-	/*
-	 * Protect against concurrent updates of the mapping state on
-	 * individual VMs.
-	 */
-	guard(raw_spinlock_irqsave)(&vpe->its_vm->vmapp_lock);
-
 	/*
 	 * Yet another marvel of the architecture. If using the
 	 * its_list "feature", we need to make sure that all ITSs
@@ -3808,7 +3802,7 @@ static int its_vpe_set_affinity(struct irq_data *d,
 	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
 	unsigned int from, cpu = nr_cpu_ids;
 	struct cpumask *table_mask;
-	unsigned long flags;
+	unsigned long flags, vmapp_flags;
 
 	/*
 	 * Changing affinity is mega expensive, so let's be as lazy as
@@ -3822,7 +3816,14 @@ static int its_vpe_set_affinity(struct irq_data *d,
 	 * protect us, and that we must ensure nobody samples vpe->col_idx
 	 * during the update, hence the lock below which must also be
 	 * taken on any vLPI handling path that evaluates vpe->col_idx.
+	 *
+	 * Finally, we must protect ourselves against concurrent
+	 * updates of the mapping state on this VM should the ITS list
+	 * be in use.
 	 */
+	if (its_list_map)
+		raw_spin_lock_irqsave(&vpe->its_vm->vmapp_lock, vmapp_flags);
+
 	from = vpe_to_cpuid_lock(vpe, &flags);
 	table_mask = gic_data_rdist_cpu(from)->vpe_table_mask;
 
@@ -3852,6 +3853,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
 	irq_data_update_effective_affinity(d, cpumask_of(cpu));
 	vpe_to_cpuid_unlock(vpe, flags);
 
+	if (its_list_map)
+		raw_spin_unlock_irqrestore(&vpe->its_vm->vmapp_lock, vmapp_flags);
+
 	return IRQ_SET_MASK_OK_DONE;
 }