Message ID | 20240723175203.3193882-1-maz@kernel.org (mailing list archive) |
---|---|
State | New, archived |
Series | irqchip/gic-v4: Fix ordering between vmapp and vpe locks |
On 2024/7/24 1:52, Marc Zyngier wrote:
> The (recently established) lock ordering mandates that the per-VM
> vmapp_lock is acquired before we take the per-VPE lock.
>
> As it turns out, its_vpe_set_affinity() takes the VPE lock, and
> then calls into its_send_vmovp(), which itself takes the vmapp
> lock. Obviously, this isn't what we want.
>
> As its_send_vmovp() is only called from its_vpe_set_affinity(),
> hoist the vmapp locking from the former into the latter, restoring
> the expected order.
>
> Fixes: f0eb154c39471 ("irqchip/gic-v4: Substitute vmovp_lock for a per-VM lock")
> Reported-by: Zhou Wang <wangzhou1@hisilicon.com>

Hi Marc,

Tested-by: Zhou Wang <wangzhou1@hisilicon.com>

Thanks,
Zhou

> Signed-off-by: Marc Zyngier <maz@kernel.org>
> ---
>  drivers/irqchip/irq-gic-v3-its.c | 18 +++++++++++-------
>  1 file changed, 11 insertions(+), 7 deletions(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
> index 951ec140bcea2..b88c6011c8771 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -1328,12 +1328,6 @@ static void its_send_vmovp(struct its_vpe *vpe)
>  		return;
>  	}
>
> -	/*
> -	 * Protect against concurrent updates of the mapping state on
> -	 * individual VMs.
> -	 */
> -	guard(raw_spinlock_irqsave)(&vpe->its_vm->vmapp_lock);
> -
>  	/*
>  	 * Yet another marvel of the architecture. If using the
>  	 * its_list "feature", we need to make sure that all ITSs
> @@ -3808,7 +3802,7 @@ static int its_vpe_set_affinity(struct irq_data *d,
>  	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
>  	unsigned int from, cpu = nr_cpu_ids;
>  	struct cpumask *table_mask;
> -	unsigned long flags;
> +	unsigned long flags, vmapp_flags;
>
>  	/*
>  	 * Changing affinity is mega expensive, so let's be as lazy as
> @@ -3822,7 +3816,14 @@ static int its_vpe_set_affinity(struct irq_data *d,
>  	 * protect us, and that we must ensure nobody samples vpe->col_idx
>  	 * during the update, hence the lock below which must also be
>  	 * taken on any vLPI handling path that evaluates vpe->col_idx.
> +	 *
> +	 * Finally, we must protect ourselves against concurrent
> +	 * updates of the mapping state on this VM should the ITS list
> +	 * be in use.
>  	 */
> +	if (its_list_map)
> +		raw_spin_lock_irqsave(&vpe->its_vm->vmapp_lock, vmapp_flags);
> +
>  	from = vpe_to_cpuid_lock(vpe, &flags);
>  	table_mask = gic_data_rdist_cpu(from)->vpe_table_mask;
>
> @@ -3852,6 +3853,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
>  	irq_data_update_effective_affinity(d, cpumask_of(cpu));
>  	vpe_to_cpuid_unlock(vpe, flags);
>
> +	if (its_list_map)
> +		raw_spin_unlock_irqrestore(&vpe->its_vm->vmapp_lock, vmapp_flags);
> +
>  	return IRQ_SET_MASK_OK_DONE;
>  }
>
On Tue, Jul 23 2024 at 18:52, Marc Zyngier wrote:
> @@ -3808,7 +3802,7 @@ static int its_vpe_set_affinity(struct irq_data *d,
>  	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
>  	unsigned int from, cpu = nr_cpu_ids;
>  	struct cpumask *table_mask;
> -	unsigned long flags;
> +	unsigned long flags, vmapp_flags;

What's this flags business for? its_vpe_set_affinity() is called with
interrupts disabled, no?

>  	/*
>  	 * Changing affinity is mega expensive, so let's be as lazy as
> @@ -3822,7 +3816,14 @@ static int its_vpe_set_affinity(struct irq_data *d,
>  	 * protect us, and that we must ensure nobody samples vpe->col_idx
>  	 * during the update, hence the lock below which must also be
>  	 * taken on any vLPI handling path that evaluates vpe->col_idx.
> +	 *
> +	 * Finally, we must protect ourselves against concurrent
> +	 * updates of the mapping state on this VM should the ITS list
> +	 * be in use.
>  	 */
> +	if (its_list_map)
> +		raw_spin_lock_irqsave(&vpe->its_vm->vmapp_lock, vmapp_flags);

Confused. This changes the locking from unconditional to
conditional. What's the rationale here?

Thanks,

        tglx
On Fri, 26 Jul 2024 21:52:40 +0100,
Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Tue, Jul 23 2024 at 18:52, Marc Zyngier wrote:
> > @@ -3808,7 +3802,7 @@ static int its_vpe_set_affinity(struct irq_data *d,
> >  	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
> >  	unsigned int from, cpu = nr_cpu_ids;
> >  	struct cpumask *table_mask;
> > -	unsigned long flags;
> > +	unsigned long flags, vmapp_flags;
>
> What's this flags business for? its_vpe_set_affinity() is called with
> interrupts disabled, no?

Duh. Of course. Cargo-culted braindead logic. I'll fix that.

>
> >  	/*
> >  	 * Changing affinity is mega expensive, so let's be as lazy as
> > @@ -3822,7 +3816,14 @@ static int its_vpe_set_affinity(struct irq_data *d,
> >  	 * protect us, and that we must ensure nobody samples vpe->col_idx
> >  	 * during the update, hence the lock below which must also be
> >  	 * taken on any vLPI handling path that evaluates vpe->col_idx.
> > +	 *
> > +	 * Finally, we must protect ourselves against concurrent
> > +	 * updates of the mapping state on this VM should the ITS list
> > +	 * be in use.
> >  	 */
> > +	if (its_list_map)
> > +		raw_spin_lock_irqsave(&vpe->its_vm->vmapp_lock, vmapp_flags);
>
> Confused. This changes the locking from unconditional to
> conditional. What's the rationale here?

I think I'm confused too. I've written this as a mix of the VMOVP lock
(which must be conditional) and the new VMAPP lock, which must be taken
to avoid racing against a new vcpu coming up.

And of course, this makes zero sense.

I'll get some sleep first, and then fix this correctly. Thanks for
spotting it.

	M.
On Fri, 26 Jul 2024 21:52:40 +0100,
Thomas Gleixner <tglx@linutronix.de> wrote:
>
> On Tue, Jul 23 2024 at 18:52, Marc Zyngier wrote:
> > @@ -3808,7 +3802,7 @@ static int its_vpe_set_affinity(struct irq_data *d,
> >  	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
> >  	unsigned int from, cpu = nr_cpu_ids;
> >  	struct cpumask *table_mask;
> > -	unsigned long flags;
> > +	unsigned long flags, vmapp_flags;
>
> What's this flags business for? its_vpe_set_affinity() is called with
> interrupts disabled, no?
>
> >  	/*
> >  	 * Changing affinity is mega expensive, so let's be as lazy as
> > @@ -3822,7 +3816,14 @@ static int its_vpe_set_affinity(struct irq_data *d,
> >  	 * protect us, and that we must ensure nobody samples vpe->col_idx
> >  	 * during the update, hence the lock below which must also be
> >  	 * taken on any vLPI handling path that evaluates vpe->col_idx.
> > +	 *
> > +	 * Finally, we must protect ourselves against concurrent
> > +	 * updates of the mapping state on this VM should the ITS list
> > +	 * be in use.
> >  	 */
> > +	if (its_list_map)
> > +		raw_spin_lock_irqsave(&vpe->its_vm->vmapp_lock, vmapp_flags);
>
> Confused. This changes the locking from unconditional to
> conditional. What's the rationale here?

Haven't managed to sleep much, but came to the conclusion that I wasn't
that stupid in my initial patch. Let's look at the full picture,
starting with its_send_vmovp():

	if (!its_list_map) {
		its = list_first_entry(&its_nodes, struct its_node, entry);
		desc.its_vmovp_cmd.col = &its->collections[col_id];
		its_send_single_vcommand(its, its_build_vmovp_cmd, &desc);
		return;
	}

	/*
	 * Protect against concurrent updates of the mapping state on
	 * individual VMs.
	 */
	guard(raw_spinlock_irqsave)(&vpe->its_vm->vmapp_lock);

The vmapp locking *is* conditional. Which makes a lot of sense as the
presence of ITS list is the only thing that prevents the VPEs from
being mapped eagerly at VM startup time (although this is a performance
reason, and not a correctness issue).

So there is no point in taking that lock if there is no ITS list, given
that the VPEs are mapped before we can do anything else. This has the
massive benefit of allowing concurrent VPE affinity changes on modern
HW.

This means that on GICv4.0 without ITSList or GICv4.1, the only lock we
need to acquire is the VPE lock itself on VPE affinity change.

I'll respin the patch shortly.

Thanks,

	M.
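For reference, a sketch of the shape this reasoning points to for its_vpe_set_affinity(): take the per-VM vmapp_lock only when an ITS list is in use, and always before the per-VPE lock. This is an illustration of the locking order described above, not the actual respin; the driver identifiers (its_list_map, vmapp_lock, vpe_to_cpuid_lock()) are real, but the simplified, elided body is hypothetical.

```c
	/*
	 * Sketch only: the per-VM vmapp_lock matters only when an ITS list
	 * is in use (otherwise VPEs are mapped eagerly at VM startup), and
	 * it must be taken before the per-VPE lock. Interrupts are already
	 * disabled on the set_affinity path, so no _irqsave variant needed.
	 */
	if (its_list_map)
		raw_spin_lock(&vpe->its_vm->vmapp_lock);

	from = vpe_to_cpuid_lock(vpe, &flags);	/* per-VPE lock, taken second */

	/* ... pick the target CPU, issue VMOVP, update effective affinity ... */

	vpe_to_cpuid_unlock(vpe, flags);

	if (its_list_map)
		raw_spin_unlock(&vpe->its_vm->vmapp_lock);

	return IRQ_SET_MASK_OK_DONE;
```

On GICv4.1, or GICv4.0 without an ITS list, only the VPE lock is taken, so affinity changes for different VPEs of the same VM can proceed concurrently, as Marc notes above.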
On Mon, Jul 29 2024 at 08:25, Marc Zyngier wrote:
> On Fri, 26 Jul 2024 21:52:40 +0100,
> Thomas Gleixner <tglx@linutronix.de> wrote:
>> Confused. This changes the locking from unconditional to
>> conditional. What's the rationale here?
>
> Haven't managed to sleep much, but came to the conclusion that I wasn't
> that stupid in my initial patch. Let's look at the full picture,
> starting with its_send_vmovp():
>
> 	if (!its_list_map) {
> 		its = list_first_entry(&its_nodes, struct its_node, entry);
> 		desc.its_vmovp_cmd.col = &its->collections[col_id];
> 		its_send_single_vcommand(its, its_build_vmovp_cmd, &desc);
> 		return;
> 	}
>
> 	/*
> 	 * Protect against concurrent updates of the mapping state on
> 	 * individual VMs.
> 	 */
> 	guard(raw_spinlock_irqsave)(&vpe->its_vm->vmapp_lock);
>
> The vmapp locking *is* conditional. Which makes a lot of sense as the

Misread the patch ...
diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index 951ec140bcea2..b88c6011c8771 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -1328,12 +1328,6 @@ static void its_send_vmovp(struct its_vpe *vpe)
 		return;
 	}
 
-	/*
-	 * Protect against concurrent updates of the mapping state on
-	 * individual VMs.
-	 */
-	guard(raw_spinlock_irqsave)(&vpe->its_vm->vmapp_lock);
-
 	/*
 	 * Yet another marvel of the architecture. If using the
 	 * its_list "feature", we need to make sure that all ITSs
@@ -3808,7 +3802,7 @@ static int its_vpe_set_affinity(struct irq_data *d,
 	struct its_vpe *vpe = irq_data_get_irq_chip_data(d);
 	unsigned int from, cpu = nr_cpu_ids;
 	struct cpumask *table_mask;
-	unsigned long flags;
+	unsigned long flags, vmapp_flags;
 
 	/*
 	 * Changing affinity is mega expensive, so let's be as lazy as
@@ -3822,7 +3816,14 @@ static int its_vpe_set_affinity(struct irq_data *d,
 	 * protect us, and that we must ensure nobody samples vpe->col_idx
 	 * during the update, hence the lock below which must also be
 	 * taken on any vLPI handling path that evaluates vpe->col_idx.
+	 *
+	 * Finally, we must protect ourselves against concurrent
+	 * updates of the mapping state on this VM should the ITS list
+	 * be in use.
 	 */
+	if (its_list_map)
+		raw_spin_lock_irqsave(&vpe->its_vm->vmapp_lock, vmapp_flags);
+
 	from = vpe_to_cpuid_lock(vpe, &flags);
 	table_mask = gic_data_rdist_cpu(from)->vpe_table_mask;
 
@@ -3852,6 +3853,9 @@ static int its_vpe_set_affinity(struct irq_data *d,
 	irq_data_update_effective_affinity(d, cpumask_of(cpu));
 	vpe_to_cpuid_unlock(vpe, flags);
 
+	if (its_list_map)
+		raw_spin_unlock_irqrestore(&vpe->its_vm->vmapp_lock, vmapp_flags);
+
 	return IRQ_SET_MASK_OK_DONE;
 }
The (recently established) lock ordering mandates that the per-VM
vmapp_lock is acquired before we take the per-VPE lock.

As it turns out, its_vpe_set_affinity() takes the VPE lock, and
then calls into its_send_vmovp(), which itself takes the vmapp
lock. Obviously, this isn't what we want.

As its_send_vmovp() is only called from its_vpe_set_affinity(),
hoist the vmapp locking from the former into the latter, restoring
the expected order.

Fixes: f0eb154c39471 ("irqchip/gic-v4: Substitute vmovp_lock for a per-VM lock")
Reported-by: Zhou Wang <wangzhou1@hisilicon.com>
Signed-off-by: Marc Zyngier <maz@kernel.org>
---
 drivers/irqchip/irq-gic-v3-its.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)
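To make the ordering hazard in the changelog concrete, here is a small userspace illustration (not driver code; the locks and the two thread functions are stand-ins I made up): one thread follows the documented vmapp-then-VPE order, while the other mirrors the pre-patch affinity path, which only reached the vmapp lock after taking the VPE lock via its_send_vmovp().

```c
/*
 * ABBA illustration with plain pthreads. lock_a stands in for the per-VM
 * vmapp_lock and lock_b for the per-VPE lock; both are hypothetical here.
 * Build with: cc -pthread abba.c
 */
#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock_a = PTHREAD_MUTEX_INITIALIZER; /* "vmapp_lock" */
static pthread_mutex_t lock_b = PTHREAD_MUTEX_INITIALIZER; /* "vpe lock"   */

/* Follows the documented order: vmapp first, then VPE. */
static void *mapping_path(void *unused)
{
	pthread_mutex_lock(&lock_a);
	pthread_mutex_lock(&lock_b);
	pthread_mutex_unlock(&lock_b);
	pthread_mutex_unlock(&lock_a);
	return NULL;
}

/* Mirrors the pre-patch affinity path: VPE first, then vmapp. */
static void *affinity_path(void *unused)
{
	pthread_mutex_lock(&lock_b);
	pthread_mutex_lock(&lock_a);
	pthread_mutex_unlock(&lock_a);
	pthread_mutex_unlock(&lock_b);
	return NULL;
}

int main(void)
{
	pthread_t t1, t2;

	/*
	 * Run concurrently, each thread can end up holding one lock while
	 * waiting forever for the other: the classic ABBA deadlock. Hoisting
	 * the vmapp locking into the caller gives both paths the same order.
	 */
	pthread_create(&t1, NULL, mapping_path, NULL);
	pthread_create(&t2, NULL, affinity_path, NULL);
	pthread_join(t1, NULL);
	pthread_join(t2, NULL);
	printf("lucky run: no deadlock this time\n");
	return 0;
}
```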