diff mbox series

irqchip/gic-v3-its: Use the exact ITSList for VMOVP

Message ID 1571742366-21008-1-git-send-email-yuzenghui@huawei.com (mailing list archive)
State New, archived
Headers show
Series irqchip/gic-v3-its: Use the exact ITSList for VMOVP | expand

Commit Message

Zenghui Yu Oct. 22, 2019, 11:06 a.m. UTC
On a system without Single VMOVP support (say GITS_TYPER.VMOVP == 0),
we will map vPEs only on ITSs that will actually control interrupts
for the given VM.  And when moving a vPE, the VMOVP command will be
issued only for those ITSs.

But when issuing VMOVPs we seemed fail to present the exact ITSList
to ITSs who are actually included in the synchronization operation.
The its_list_map we're currently using includes all ITSs in the system,
even though some of them don't have the corrsponding vPE mapping at all.

Introduce get_its_list() to get the per-VM its_list_map, to indicate
which ITSs have vPE mappings for the given VM, and use this map as
the expected ITSList when building VMOVP.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
---

I haven't seen any broken with the current its_list_map, but behavior
might differ between implementations.  Let's do what the spec expects
us to do and try not to confuse the implementation.  Or is there any
points I've missed?

 drivers/irqchip/irq-gic-v3-its.c | 15 ++++++++++++++-
 1 file changed, 14 insertions(+), 1 deletion(-)

Comments

Marc Zyngier Oct. 22, 2019, 12:44 p.m. UTC | #1
Hi Zenghui,

On 2019-10-22 12:06, Zenghui Yu wrote:
> On a system without Single VMOVP support (say GITS_TYPER.VMOVP == 0),
> we will map vPEs only on ITSs that will actually control interrupts
> for the given VM.  And when moving a vPE, the VMOVP command will be
> issued only for those ITSs.
>
> But when issuing VMOVPs we seemed fail to present the exact ITSList
> to ITSs who are actually included in the synchronization operation.
> The its_list_map we're currently using includes all ITSs in the 
> system,
> even though some of them don't have the corrsponding vPE mapping at 
> all.
>
> Introduce get_its_list() to get the per-VM its_list_map, to indicate
> which ITSs have vPE mappings for the given VM, and use this map as
> the expected ITSList when building VMOVP.
>
> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
> ---
>
> I haven't seen any broken with the current its_list_map, but behavior
> might differ between implementations.  Let's do what the spec expects
> us to do and try not to confuse the implementation.  Or is there any
> points I've missed?

No, I think you're right, and this is just an oversight on my part:
for example, we seem to do the right thing when handling a guest 
invall,
where we scan the ITS nodes and only emit a vinvall on an ITS that
has VLPI mapped in.

I don't think this is likely to cause any harm (after all, a DISCARD 
and
a VMOVP could race at any time), but it is certainly a performance gain
not to throw extra commands at unsuspecting ITSs. So thanks for 
spotting this.

A couple of comments below:

>
>  drivers/irqchip/irq-gic-v3-its.c | 15 ++++++++++++++-
>  1 file changed, 14 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/irqchip/irq-gic-v3-its.c
> b/drivers/irqchip/irq-gic-v3-its.c
> index c81da27044bf..eebee588ea30 100644
> --- a/drivers/irqchip/irq-gic-v3-its.c
> +++ b/drivers/irqchip/irq-gic-v3-its.c
> @@ -175,6 +175,19 @@ static DEFINE_IDA(its_vpeid_ida);
>  #define gic_data_rdist_rd_base()	(gic_data_rdist()->rd_base)
>  #define gic_data_rdist_vlpi_base()	(gic_data_rdist_rd_base() + 
> SZ_128K)
>
> +static inline u16 get_its_list(struct its_vm *vm)

Please drop the inline, the compiler will do it for you.

> +{
> +	struct its_node *its;
> +	unsigned long its_list;
> +
> +	list_for_each_entry(its, &its_nodes, entry) {

You probably want to skip non v4 ITSs, as they don't have a list_nr 
associated
to them (and you'd probably end-up hitting ITS #0 for no good reason).

> +		if (vm->vlpi_count[its->list_nr])
> +			set_bit(its->list_nr, &its_list);

We don't need to be atomic here, so __set_bit would be just as fine.

> +	}
> +
> +	return (u16)its_list;
> +}
> +
>  static struct its_collection *dev_event_to_col(struct its_device 
> *its_dev,
>  					       u32 event)
>  {
> @@ -982,7 +995,6 @@ static void its_send_vmovp(struct its_vpe *vpe)
>  	int col_id = vpe->col_idx;
>
>  	desc.its_vmovp_cmd.vpe = vpe;
> -	desc.its_vmovp_cmd.its_list = (u16)its_list_map;

Careful here, you're leaving the its_list field uninitialized, and it
really should be 0 when GITS_TYPER.VMOVP == 1 (i.e. when its_list_map
is zero). Just initialize the whole descriptor to zero.
>
>  	if (!its_list_map) {
>  		its = list_first_entry(&its_nodes, struct its_node, entry);
> @@ -1003,6 +1015,7 @@ static void its_send_vmovp(struct its_vpe *vpe)
>  	raw_spin_lock_irqsave(&vmovp_lock, flags);
>
>  	desc.its_vmovp_cmd.seq_num = vmovp_seq_num++;
> +	desc.its_vmovp_cmd.its_list = get_its_list(vpe->its_vm);
>
>  	/* Emit VMOVPs */
>  	list_for_each_entry(its, &its_nodes, entry) {

Thanks,

         M.
Zenghui Yu Oct. 23, 2019, 2:08 a.m. UTC | #2
Hi Marc,

On 2019/10/22 20:44, Marc Zyngier wrote:
> Hi Zenghui,
> 
> On 2019-10-22 12:06, Zenghui Yu wrote:
>> On a system without Single VMOVP support (say GITS_TYPER.VMOVP == 0),
>> we will map vPEs only on ITSs that will actually control interrupts
>> for the given VM.  And when moving a vPE, the VMOVP command will be
>> issued only for those ITSs.
>>
>> But when issuing VMOVPs we seemed fail to present the exact ITSList
>> to ITSs who are actually included in the synchronization operation.
>> The its_list_map we're currently using includes all ITSs in the system,
>> even though some of them don't have the corrsponding vPE mapping at all.
>>
>> Introduce get_its_list() to get the per-VM its_list_map, to indicate
>> which ITSs have vPE mappings for the given VM, and use this map as
>> the expected ITSList when building VMOVP.
>>
>> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
>> ---
>>
>> I haven't seen any broken with the current its_list_map, but behavior
>> might differ between implementations.  Let's do what the spec expects
>> us to do and try not to confuse the implementation.  Or is there any
>> points I've missed?
> 
> No, I think you're right, and this is just an oversight on my part:
> for example, we seem to do the right thing when handling a guest invall,
> where we scan the ITS nodes and only emit a vinvall on an ITS that
> has VLPI mapped in.
> 
> I don't think this is likely to cause any harm (after all, a DISCARD and
> a VMOVP could race at any time), but it is certainly a performance gain
> not to throw extra commands at unsuspecting ITSs. So thanks for spotting 
> this.

I agree that it will be a performance gain with this patch - since after
receiving VMOVP, there might be a synchronization process among all ITSs
included in the ITSList, but waiting for those unsuspecting one is
totally pointless.

> 
> A couple of comments below:

Thanks for the comments. I will do a respin and send a v2 shortly, with
(hopefully) all of them addressed.

> 
>>
>>  drivers/irqchip/irq-gic-v3-its.c | 15 ++++++++++++++-
>>  1 file changed, 14 insertions(+), 1 deletion(-)
>>
>> diff --git a/drivers/irqchip/irq-gic-v3-its.c
>> b/drivers/irqchip/irq-gic-v3-its.c
>> index c81da27044bf..eebee588ea30 100644
>> --- a/drivers/irqchip/irq-gic-v3-its.c
>> +++ b/drivers/irqchip/irq-gic-v3-its.c
>> @@ -175,6 +175,19 @@ static DEFINE_IDA(its_vpeid_ida);
>>  #define gic_data_rdist_rd_base()    (gic_data_rdist()->rd_base)
>>  #define gic_data_rdist_vlpi_base()    (gic_data_rdist_rd_base() + 
>> SZ_128K)
>>
>> +static inline u16 get_its_list(struct its_vm *vm)
> 
> Please drop the inline, the compiler will do it for you.
> 
>> +{
>> +    struct its_node *its;
>> +    unsigned long its_list;
>> +
>> +    list_for_each_entry(its, &its_nodes, entry) {
> 
> You probably want to skip non v4 ITSs, as they don't have a list_nr 
> associated
> to them (and you'd probably end-up hitting ITS #0 for no good reason).
> 
>> +        if (vm->vlpi_count[its->list_nr])
>> +            set_bit(its->list_nr, &its_list);
> 
> We don't need to be atomic here, so __set_bit would be just as fine.
> 
>> +    }
>> +
>> +    return (u16)its_list;
>> +}
>> +
>>  static struct its_collection *dev_event_to_col(struct its_device 
>> *its_dev,
>>                             u32 event)
>>  {
>> @@ -982,7 +995,6 @@ static void its_send_vmovp(struct its_vpe *vpe)
>>      int col_id = vpe->col_idx;
>>
>>      desc.its_vmovp_cmd.vpe = vpe;
>> -    desc.its_vmovp_cmd.its_list = (u16)its_list_map;
> 
> Careful here, you're leaving the its_list field uninitialized, and it
> really should be 0 when GITS_TYPER.VMOVP == 1 (i.e. when its_list_map
> is zero). Just initialize the whole descriptor to zero.

Yes, this needs to be fixed, as well as initialize "its_list" to zero
in get_its_list().


Thanks,
zenghui
diff mbox series

Patch

diff --git a/drivers/irqchip/irq-gic-v3-its.c b/drivers/irqchip/irq-gic-v3-its.c
index c81da27044bf..eebee588ea30 100644
--- a/drivers/irqchip/irq-gic-v3-its.c
+++ b/drivers/irqchip/irq-gic-v3-its.c
@@ -175,6 +175,19 @@  static DEFINE_IDA(its_vpeid_ida);
 #define gic_data_rdist_rd_base()	(gic_data_rdist()->rd_base)
 #define gic_data_rdist_vlpi_base()	(gic_data_rdist_rd_base() + SZ_128K)
 
+static inline u16 get_its_list(struct its_vm *vm)
+{
+	struct its_node *its;
+	unsigned long its_list;
+
+	list_for_each_entry(its, &its_nodes, entry) {
+		if (vm->vlpi_count[its->list_nr])
+			set_bit(its->list_nr, &its_list);
+	}
+
+	return (u16)its_list;
+}
+
 static struct its_collection *dev_event_to_col(struct its_device *its_dev,
 					       u32 event)
 {
@@ -982,7 +995,6 @@  static void its_send_vmovp(struct its_vpe *vpe)
 	int col_id = vpe->col_idx;
 
 	desc.its_vmovp_cmd.vpe = vpe;
-	desc.its_vmovp_cmd.its_list = (u16)its_list_map;
 
 	if (!its_list_map) {
 		its = list_first_entry(&its_nodes, struct its_node, entry);
@@ -1003,6 +1015,7 @@  static void its_send_vmovp(struct its_vpe *vpe)
 	raw_spin_lock_irqsave(&vmovp_lock, flags);
 
 	desc.its_vmovp_cmd.seq_num = vmovp_seq_num++;
+	desc.its_vmovp_cmd.its_list = get_its_list(vpe->its_vm);
 
 	/* Emit VMOVPs */
 	list_for_each_entry(its, &its_nodes, entry) {