[v2] KVM: Implement support for the RH bit

Message ID 4E60C7EB.9060401@siemens.com (mailing list archive)
State New, archived

Commit Message

Jan Kiszka Sept. 2, 2011, 12:11 p.m. UTC
On 2011-09-02 13:36, Jan Kiszka wrote:
> On 2011-09-02 13:27, Jan Kiszka wrote:
>> On 2011-09-02 09:48, Sasha Levin wrote:
>>> The RH bit exists in the message address register (lower 32 bits of
>>> the address).
>>>
>>> The bit indicates whether the message should go to the processor which was
>>> indicated in the destination ID bits, or whether it should go to the
>>> processor running at the lowest priority.
>>>
>>> Cc: Avi Kivity <avi@redhat.com>
>>> Cc: Marcelo Tosatti <mtosatti@redhat.com>
>>> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
>>> ---
>>>  virt/kvm/irq_comm.c |   17 ++++++++++++++++-
>>>  1 files changed, 16 insertions(+), 1 deletions(-)
>>>
>>> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
>>> index 9f614b4..0ba3a3d 100644
>>> --- a/virt/kvm/irq_comm.c
>>> +++ b/virt/kvm/irq_comm.c
>>> @@ -134,7 +134,22 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
>>>  	irq.level = 1;
>>>  	irq.shorthand = 0;
>>>  
>>> -	/* TODO Deal with RH bit of MSI message address */
>>> +	/*
>>> +	 * If the RH bit is set, we'll deliver to the processor running
>>> +	 * at the lowest priority.
>>> +	 */
>>> +	if (e->msi.address_lo & MSI_ADDR_REDIRECTION_LOWPRI) {
>>> +		irq.delivery_mode = MSI_DATA_DELIVERY_LOWPRI;
>>> +	} else {
>>> +		/*
>>> +		 * If the RH bit is not set, we'll deliver to the specific
>>> +		 * processor mentioned in destination ID, and ignore the DM
>>> +		 * bit.
>>> +		 */
>>> +		irq.dest_mode = MSI_ADDR_DEST_MODE_PHYSICAL;
>>> +		irq.delivery_mode = MSI_DATA_DELIVERY_FIXED;
>>> +	}
>>> +
>>>  	return kvm_irq_delivery_to_apic(kvm, NULL, &irq);
>>>  }
>>>  
>>
>> Do you happen to have a kvm unit test for this? Or how did you validate the
>> change? It doesn't look incorrect to me, I'd just like to check it in QEMU
>> as well, which apparently already has the logic above but also some
>> contradictory comment.
> 
> Err, no, QEMU does not have this logic, it also ignores RH.
> 
> But the above bits make "irq.delivery_mode = e->msi.data & 0x700"
> pointless. And that strongly suggests something is still wrong.

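I tend to believe that this is what the spec tries to tell us:

diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index 9f614b4..b72f77a 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -128,7 +128,8 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
 			MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
 	irq.vector = (e->msi.data &
 			MSI_DATA_VECTOR_MASK) >> MSI_DATA_VECTOR_SHIFT;
-	irq.dest_mode = (1 << MSI_ADDR_DEST_MODE_SHIFT) & e->msi.address_lo;
+	irq.dest_mode = ((e->msi.address_lo & MSI_ADDR_DEST_MODE_LOGICAL) &&
+		(e->msi.address_lo & MSI_ADDR_REDIRECTION_LOWPRI));
 	irq.trig_mode = (1 << MSI_DATA_TRIGGER_SHIFT) & e->msi.data;
 	irq.delivery_mode = e->msi.data & 0x700;
 	irq.level = 1;

i.e. the DM flag is only relevant if RH is set, and RH==0 is equivalent
to RH==1 && DM==0.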

BTW, irq_comm.c is surely the wrong place for all this IA32-specific
interpretation of MSI address and data. And we have yet another
guest-triggerable printk in kvm_irq_delivery_to_apic (messages to
physical ID 0xff).

Jan
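
As a reading aid for the bit tests discussed in this message, here is a
minimal sketch of where RH, DM, the destination ID, the vector and the
delivery mode sit in the x86 MSI address and data words. The bit positions
follow the Intel SDM message format. The macro names, struct msi_fields and
msi_decode() are assumptions made up for illustration; they are not the
kernel's definitions.

```c
#include <stdint.h>

/* Illustrative names; only the bit positions come from the MSI format. */
#define MSI_ADDR_DM_BIT      (1u << 2)    /* destination mode: 0 physical, 1 logical */
#define MSI_ADDR_RH_BIT      (1u << 3)    /* redirection hint */
#define MSI_ADDR_DEST_MASK   0x000ff000u  /* destination ID, address bits 19:12 */
#define MSI_ADDR_DEST_SHIFT  12
#define MSI_DATA_VEC_MASK    0x000000ffu  /* vector, data bits 7:0 */
#define MSI_DATA_DELIV_MASK  0x00000700u  /* delivery mode, data bits 10:8 */
#define MSI_DATA_DELIV_SHIFT 8
#define MSI_DATA_TRIG_BIT    (1u << 15)   /* trigger mode: 0 edge, 1 level */

/* Hypothetical container for the decoded fields. */
struct msi_fields {
	uint8_t dest_id;
	uint8_t vector;
	uint8_t delivery_mode;  /* 0 = fixed, 1 = lowest priority */
	int rh;
	int dm;
	int level_triggered;
};

/* Split the low address word and the data word into their fields. */
static struct msi_fields msi_decode(uint32_t address_lo, uint32_t data)
{
	struct msi_fields f;

	f.dest_id = (address_lo & MSI_ADDR_DEST_MASK) >> MSI_ADDR_DEST_SHIFT;
	f.rh = (address_lo & MSI_ADDR_RH_BIT) != 0;
	f.dm = (address_lo & MSI_ADDR_DM_BIT) != 0;
	f.vector = data & MSI_DATA_VEC_MASK;
	f.delivery_mode = (data & MSI_DATA_DELIV_MASK) >> MSI_DATA_DELIV_SHIFT;
	f.level_triggered = (data & MSI_DATA_TRIG_BIT) != 0;
	return f;
}
```

For example, msi_decode(0xfee0100c, 0x0141) gives destination ID 1, RH=1,
DM=1, vector 0x41 and delivery mode 1 (lowest priority).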

Comments

Sasha Levin Sept. 2, 2011, 1:13 p.m. UTC | #1
On Fri, 2011-09-02 at 14:11 +0200, Jan Kiszka wrote:
> I tend to believe that this is what the spec tries to tell us:
> 
> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> index 9f614b4..b72f77a 100644
> --- a/virt/kvm/irq_comm.c
> +++ b/virt/kvm/irq_comm.c
> @@ -128,7 +128,8 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
>  			MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
>  	irq.vector = (e->msi.data &
>  			MSI_DATA_VECTOR_MASK) >> MSI_DATA_VECTOR_SHIFT;
> -	irq.dest_mode = (1 << MSI_ADDR_DEST_MODE_SHIFT) & e->msi.address_lo;
> +	irq.dest_mode = ((e->msi.address_lo & MSI_ADDR_DEST_MODE_LOGICAL) &&
> +		(e->msi.address_lo & MSI_ADDR_REDIRECTION_LOWPRI));
>  	irq.trig_mode = (1 << MSI_DATA_TRIGGER_SHIFT) & e->msi.data;
>  	irq.delivery_mode = e->msi.data & 0x700;
>  	irq.level = 1;
> 
> i.e. the DM flag is only relevant if RH is set, and RH==0 is equivalent
> to RH==1 && DM==0.

Thing is, the spec specifically states that RH==1 should deliver to
lowest priority - even though it doesn't state what the relationship
between the delivery mode and the RH bit is.

Maybe we should set irq.delivery_mode only if RH==1?

> 
> BTW, irq_comm.c is surely the wrong place for all this IA32-specific
> interpretation of MSI address and data. And we have yet another
> guest-triggerable printk in kvm_irq_delivery_to_apic (messages to
> physical ID 0xff).
> 
> Jan
>
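
To make the difference between the two proposals in this thread easier to
see, the following is a small model of both, not kernel code. It assumes the
v2 patch leaves dest_mode as the DM bit when RH is set, and it normalises
dest_mode to 0/1 and delivery_mode to the 3-bit value from data bits 10:8.
The names struct routing, route_v2_patch() and route_rh_gated() are
hypothetical.

```c
#include <stdint.h>

#define RH_BIT (1u << 3)          /* redirection hint in address_lo */
#define DM_BIT (1u << 2)          /* destination mode in address_lo */
#define DELIVERY_FIXED  0u
#define DELIVERY_LOWPRI 1u

/* Hypothetical summary of the two fields the proposals disagree on. */
struct routing {
	int logical;              /* 1 = logical destination mode */
	unsigned delivery_mode;   /* 3-bit delivery mode */
};

/* Model of the v2 patch: RH alone decides the delivery mode. */
static struct routing route_v2_patch(uint32_t address_lo, uint32_t data)
{
	struct routing r;

	(void)data;  /* the delivery mode in the data word ends up unused */
	if (address_lo & RH_BIT) {
		r.logical = (address_lo & DM_BIT) != 0;
		r.delivery_mode = DELIVERY_LOWPRI;
	} else {
		r.logical = 0;
		r.delivery_mode = DELIVERY_FIXED;
	}
	return r;
}

/* Model of the counter-proposal: DM counts only if RH is set,
 * and the delivery mode is still taken from the data word. */
static struct routing route_rh_gated(uint32_t address_lo, uint32_t data)
{
	struct routing r;

	r.logical = (address_lo & DM_BIT) && (address_lo & RH_BIT);
	r.delivery_mode = (data >> 8) & 0x7;
	return r;
}
```

In this model both variants agree on the destination mode for all four
RH/DM combinations; they differ only in whether the delivery mode comes from
RH or from the data word.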
Jan Kiszka Sept. 2, 2011, 2 p.m. UTC | #2
On 2011-09-02 15:13, Sasha Levin wrote:
> Thing is, the spec specifically states that RH==1 should deliver to
> lowest priority - even though it doesn't state what the relationship
> between the delivery mode and the RH bit is.

The spec says "When RH is 1 and the physical destination mode is used
[DM=0], the Destination ID field must not be set to 0xFF; it must point
to a processor that is present and enabled to receive the interrupt."

As far as I understand, there is no "lowest prio" in the setup RH=1/DM=0.

I've cc'ed Kevin who just worked on the APIC model. Maybe he can provide
some authoritative answer or dig it up @Intel.

Jan
Sasha Levin Sept. 2, 2011, 2:11 p.m. UTC | #3
On Fri, 2011-09-02 at 16:00 +0200, Jan Kiszka wrote:
> The spec says "When RH is 1 and the physical destination mode is used
> [DM=0], the Destination ID field must not be set to 0xFF; it must point
> to a processor that is present and enabled to receive the interrupt."
> 

When RH=1 and DM=0 yes, but what happens when RH=1 and DM=1?

From what I understand, we need to select the processor running at the
lowest priority among the processors that are identified by the
destination ID:

"If RH is 1 and DM is 1, the Destination ID Field is interpreted as in
logical destination mode and the redirection is limited to only those
processors that are part of the logical group of processors based on the
processor's logical APIC ID and the Destination ID field in the
message."

> As far as I understand, there is no "lowest prio" in the setup RH=1/DM=0.
> 
> I've cc'ed Kevin who just worked on the APIC model. Maybe he can provide
> some authoritative answer or dig it up @Intel.
> 
> Jan
>
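
The RH=1/DM=1 case quoted from the spec in this message can be pictured
roughly as below. This is a simplified sketch under stated assumptions: it
uses the flat logical model (one bit per CPU, a CPU matches when its logical
ID intersects the Destination ID) and a single per-CPU priority number
instead of real APIC arbitration. struct vcpu_model and pick_lowest_prio()
are hypothetical and do not correspond to KVM functions.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical, heavily simplified view of a virtual CPU's APIC state. */
struct vcpu_model {
	uint8_t logical_id;  /* flat-model logical APIC ID, one bit per CPU */
	uint8_t priority;    /* lower value = lower interrupt priority */
};

/*
 * Among the CPUs whose logical ID intersects dest_id, return the index of
 * the one with the lowest priority, or -1 if no CPU is addressed.
 * Ties go to the first matching CPU.
 */
static int pick_lowest_prio(const struct vcpu_model *v, size_t n,
			    uint8_t dest_id)
{
	int best = -1;

	for (size_t i = 0; i < n; i++) {
		if (!(v[i].logical_id & dest_id))
			continue;  /* not part of the addressed logical group */
		if (best < 0 || v[i].priority < v[best].priority)
			best = (int)i;
	}
	return best;
}
```

The redirection is limited to the addressed group: a CPU with an even lower
priority is never chosen if its logical ID is not covered by the
Destination ID.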
Jan Kiszka Sept. 2, 2011, 2:25 p.m. UTC | #4
On 2011-09-02 16:11, Sasha Levin wrote:
> When RH=1 and DM=0 yes, but what happens when RH=1 and DM=1?

irq.dest_mode becomes non-zero, and kvm_apic_match_dest uses
kvm_apic_match_logical_addr for filtering out possible target CPUs.

Mmh, a remaining question is if kvm_irq_delivery_to_apic is then already
doing the right thing, even for delivery_mode != APIC_DM_LOWEST.

Again my question to you: Did you observe unexpected behaviour with some
real guests, or is this just based on code and spec study so far? If we
had a test case, that could also provide valuable hints.

Jan
Sasha Levin Sept. 2, 2011, 2:30 p.m. UTC | #5
On Fri, 2011-09-02 at 16:25 +0200, Jan Kiszka wrote:
> irq.dest_mode becomes non-zero, and kvm_apic_match_dest uses
> kvm_apic_match_logical_addr for filtering out possible target CPUs.
> 
> Mmh, a remaining question is if kvm_irq_delivery_to_apic is then already
> doing the right thing, even for delivery_mode != APIC_DM_LOWEST.
> 

The missing part is that when RH=1 we must look for the lowest priority:

"Redirection hint indication (RH) - This bit indicates whether the
message should be directed to the processor with the lowest interrupt
priority among processors that can receive the interrupt."

So it's not enough to set dest_mode, we must also make sure that
delivery_mode is set to low prio when RH=1.

> Again my question to you: Did you observe unexpected behaviour with some
> real guests, or is this just based on code and spec study so far? If we
> had a test case, that could also provide valuable hints.

Sorry, no test case.

I've stumbled on the 'TODO' comment when I was digging into the MSI
implementation in KVM and decided to implement it based on specs.

> 
> Jan
>
Jan Kiszka Sept. 2, 2011, 2:36 p.m. UTC | #6
On 2011-09-02 16:30, Sasha Levin wrote:
> The missing part is that when RH=1 we must look for the lowest priority:
> 
> "Redirection hint indication (RH) - This bit indicates whether the
> message should be directed to the processor with the lowest interrupt
> priority among processors that can receive the interrupt."
> 
> So it's not enough to set dest_mode, we must also make sure that
> delivery_mode is set to low prio when RH=1.

That's debatable. delivery_mode == APIC_DM_LOWEST includes this target
selection, but also more. I have a bad feeling about simply overwriting
delivery_mode as defined by the MSI data field instead of only patching
kvm_irq_delivery_to_apic or kvm_is_dm_lowest_prio, if required.

> 
>> Again my question to you: Did you observe unexpected behaviour with some
>> real guests, or is this just based on code and spec study so far? If we
>> had a test case, that could also provide valuable hints.
> 
> Sorry, no test case.
> 
> I've stumbled on the 'TODO' comment when I was digging into the MSI
> implementation in KVM and decided to implement it based on specs.

Then we definitely need some blessing by Intel to avoid subtle regressions.

Jan
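
One way to picture the alternative mentioned above (leave delivery_mode
alone and adjust only the lowest-priority check) is the sketch below. It is
an assumption-laden illustration, not a proposed patch: struct irq_model,
its redir_hint field, irq_from_msi() and wants_lowest_prio() are all
hypothetical, and whether RH should trigger arbitration for every delivery
mode is exactly the question this thread leaves open.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSI_ADDR_RH     (1u << 3)  /* redirection hint in address_lo */
#define DELIVERY_LOWPRI 1u         /* 3-bit delivery mode value */

/* Hypothetical stand-in for the interrupt description built from an MSI. */
struct irq_model {
	unsigned delivery_mode;  /* copied unchanged from data bits 10:8 */
	bool redir_hint;         /* hypothetical extra field carrying RH */
};

/* Keep the guest-programmed delivery mode and record RH separately. */
static struct irq_model irq_from_msi(uint32_t address_lo, uint32_t data)
{
	struct irq_model irq;

	irq.delivery_mode = (data >> 8) & 0x7;
	irq.redir_hint = (address_lo & MSI_ADDR_RH) != 0;
	return irq;
}

/* Hypothetical predicate: arbitrate if either source asks for it. */
static bool wants_lowest_prio(const struct irq_model *irq)
{
	return irq->delivery_mode == DELIVERY_LOWPRI || irq->redir_hint;
}
```

With this shape the value programmed in the MSI data field stays visible to
the rest of the delivery path, and the RH handling is confined to one
predicate.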
Gleb Natapov Sept. 2, 2011, 2:44 p.m. UTC | #7
On Fri, Sep 02, 2011 at 04:36:46PM +0200, Jan Kiszka wrote:
> On 2011-09-02 16:30, Sasha Levin wrote:
> > On Fri, 2011-09-02 at 16:25 +0200, Jan Kiszka wrote:
> >> On 2011-09-02 16:11, Sasha Levin wrote:
> >>> On Fri, 2011-09-02 at 16:00 +0200, Jan Kiszka wrote:
> >>>> On 2011-09-02 15:13, Sasha Levin wrote:
> >>>>> On Fri, 2011-09-02 at 14:11 +0200, Jan Kiszka wrote:
> >>>>>> On 2011-09-02 13:36, Jan Kiszka wrote:
> >>>>>>> On 2011-09-02 13:27, Jan Kiszka wrote:
> >>>>>>>> On 2011-09-02 09:48, Sasha Levin wrote:
> >>>>>>>>> The RH bit exists in the message address register (lower 32 bits of
> >>>>>>>>> the address).
> >>>>>>>>>
> >>>>>>>>> The bit indicates whether the message should go to the processor which was
> >>>>>>>>> indicated in the destination ID bits, or whether it should go to the
> >>>>>>>>> processor running at the lowest priority.
> >>>>>>>>>
> >>>>>>>>> Cc: Avi Kivity <avi@redhat.com>
> >>>>>>>>> Cc: Marcelo Tosatti <mtosatti@redhat.com>
> >>>>>>>>> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
> >>>>>>>>> ---
> >>>>>>>>>  virt/kvm/irq_comm.c |   17 ++++++++++++++++-
> >>>>>>>>>  1 files changed, 16 insertions(+), 1 deletions(-)
> >>>>>>>>>
> >>>>>>>>> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> >>>>>>>>> index 9f614b4..0ba3a3d 100644
> >>>>>>>>> --- a/virt/kvm/irq_comm.c
> >>>>>>>>> +++ b/virt/kvm/irq_comm.c
> >>>>>>>>> @@ -134,7 +134,22 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
> >>>>>>>>>  	irq.level = 1;
> >>>>>>>>>  	irq.shorthand = 0;
> >>>>>>>>>  
> >>>>>>>>> -	/* TODO Deal with RH bit of MSI message address */
> >>>>>>>>> +	/*
> >>>>>>>>> +	 * If the RH bit is set, we'll deliver to the processor running
> >>>>>>>>> +	 * at the lowest priority.
> >>>>>>>>> +	 */
> >>>>>>>>> +	if (e->msi.address_lo & MSI_ADDR_REDIRECTION_LOWPRI) {
> >>>>>>>>> +		irq.delivery_mode = MSI_DATA_DELIVERY_LOWPRI;
> >>>>>>>>> +	} else {
> >>>>>>>>> +		/*
> >>>>>>>>> +		 * If the RH bit is not set, we'll deliver to the specific
> >>>>>>>>> +		 * processor mentioned in destination ID, and ignore the DM
> >>>>>>>>> +		 * bit.
> >>>>>>>>> +		 */
> >>>>>>>>> +		irq.dest_mode = MSI_ADDR_DEST_MODE_PHYSICAL;
> >>>>>>>>> +		irq.delivery_mode = MSI_DATA_DELIVERY_FIXED;
> >>>>>>>>> +	}
> >>>>>>>>> +
> >>>>>>>>>  	return kvm_irq_delivery_to_apic(kvm, NULL, &irq);
> >>>>>>>>>  }
> >>>>>>>>>  
> >>>>>>>>
> >>>>>>>> Do you happen to have a kvm unit test for this? Or how did you validate the
> >>>>>>>> change? It doesn't look incorrect to me, I'd just like to check it in QEMU
> >>>>>>>> as well which apparently already has the logic above but also some
> >>>>>>>> contradictory comment.
> >>>>>>>
> >>>>>>> Err, no, QEMU does not have this logic, it also ignores RH.
> >>>>>>>
> >>>>>>> But the above bits make "irq.delivery_mode = e->msi.data & 0x700"
> >>>>>>> pointless. And that strongly suggests something is still wrong.
> >>>>>>
> >>>>>> I tend to believe that this is what the spec tries to tell us:
> >>>>>>
> >>>>>> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
> >>>>>> index 9f614b4..b72f77a 100644
> >>>>>> --- a/virt/kvm/irq_comm.c
> >>>>>> +++ b/virt/kvm/irq_comm.c
> >>>>>> @@ -128,7 +128,8 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
> >>>>>>  			MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
> >>>>>>  	irq.vector = (e->msi.data &
> >>>>>>  			MSI_DATA_VECTOR_MASK) >> MSI_DATA_VECTOR_SHIFT;
> >>>>>> -	irq.dest_mode = (1 << MSI_ADDR_DEST_MODE_SHIFT) & e->msi.address_lo;
> >>>>>> +	irq.dest_mode = ((e->msi.address_lo & MSI_ADDR_DEST_MODE_LOGICAL) &&
> >>>>>> +		(e->msi.address_lo & MSI_ADDR_REDIRECTION_LOWPRI));
> >>>>>>  	irq.trig_mode = (1 << MSI_DATA_TRIGGER_SHIFT) & e->msi.data;
> >>>>>>  	irq.delivery_mode = e->msi.data & 0x700;
> >>>>>>  	irq.level = 1;
> >>>>>>
> >>>>>> ie. the DM flag is only relevant if RH is set, and RH==0 is equivalent
> >>>>>> to RH==1 && DM==0.
> >>>>>
> >>>>> Thing is, the spec specifically states that RH==1 should deliver to
> >>>>> lowest priority - even though it doesn't state what the relationship
> >>>>> between delivery mode and the RH bit is.
> >>>>
> >>>> The spec says "When RH is 1 and the physical destination mode is used
> >>>> [DM=0], the Destination ID field must not be set to 0xFF; it must point
> >>>> to a processor that is present and enabled to receive the interrupt."
> >>>>
> >>>
> >>> When RH=1 and DM=0 yes, but what happens when RH=1 and DM=1?
> >>
> >> irq.dest_mode becomes non-zero, and kvm_apic_match_dest uses
> >> kvm_apic_match_logical_addr for filtering out possible target CPUs.
> >>
> >> Mmh, a remaining question is if kvm_irq_delivery_to_apic is then already
> >> doing the right thing, even for delivery_mode != APIC_DM_LOWEST.
> >>
> > 
> > The missing part is that when RH=1 we must look for the lowest priority:
> > 
> > "Redirection hint indication (RH) - This bit indicates whether the
> > message should be directed to the processor with the lowest interrupt
> > priority among processors that can receive the interrupt."
> > 
> > So it's not enough to set dest_mode, we must also make sure that
> > delivery_mode is set to low prio when RH=1.
> 
> That's debatable. delivery_mode == APIC_DM_LOWEST includes this target
> selection, but also more. I have a bad feeling when we just overwrite
> delivery_mode as defined by the MSI data field instead of only patching
> kvm_irq_delivery_to_apic or kvm_is_dm_lowest_prio - if required.
> 
Patching them how? To behave exactly like delivery_mode == APIC_DM_LOWEST in
case the RH bit is set? Then setting delivery_mode to APIC_DM_LOWEST will
achieve the same goal.

> > 
> >> Again my question to you: Did you observe unexpected behaviour with some
> >> real guests, or is this just based on code and spec study so far? If we
> >> had a test case, that could also provide valuable hints.
> > 
> > Sorry, no test case.
> > 
> > I've stumbled on the 'TODO' comment when I was digging into the MSI
> > implementation in KVM and decided to implement it based on specs.
> 
> Then we definitely need some blessing by Intel to avoid subtle regressions.
> 
Yes, if we are going to pursue that we need Intel to clarify what the SDM means.

--
			Gleb.
Jan Kiszka Sept. 2, 2011, 2:52 p.m. UTC | #8
On 2011-09-02 16:44, Gleb Natapov wrote:
> On Fri, Sep 02, 2011 at 04:36:46PM +0200, Jan Kiszka wrote:
>> On 2011-09-02 16:30, Sasha Levin wrote:
>>> On Fri, 2011-09-02 at 16:25 +0200, Jan Kiszka wrote:
>>>> On 2011-09-02 16:11, Sasha Levin wrote:
>>>>> On Fri, 2011-09-02 at 16:00 +0200, Jan Kiszka wrote:
>>>>>> On 2011-09-02 15:13, Sasha Levin wrote:
>>>>>>> On Fri, 2011-09-02 at 14:11 +0200, Jan Kiszka wrote:
>>>>>>>> On 2011-09-02 13:36, Jan Kiszka wrote:
>>>>>>>>> On 2011-09-02 13:27, Jan Kiszka wrote:
>>>>>>>>>> On 2011-09-02 09:48, Sasha Levin wrote:
>>>>>>>>>>> The RH bit exists in the message address register (lower 32 bits of
>>>>>>>>>>> the address).
>>>>>>>>>>>
>>>>>>>>>>> The bit indicates whether the message should go to the processor which was
>>>>>>>>>>> indicated in the destination ID bits, or whether it should go to the
>>>>>>>>>>> processor running at the lowest priority.
>>>>>>>>>>>
>>>>>>>>>>> Cc: Avi Kivity <avi@redhat.com>
>>>>>>>>>>> Cc: Marcelo Tosatti <mtosatti@redhat.com>
>>>>>>>>>>> Signed-off-by: Sasha Levin <levinsasha928@gmail.com>
>>>>>>>>>>> ---
>>>>>>>>>>>  virt/kvm/irq_comm.c |   17 ++++++++++++++++-
>>>>>>>>>>>  1 files changed, 16 insertions(+), 1 deletions(-)
>>>>>>>>>>>
>>>>>>>>>>> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
>>>>>>>>>>> index 9f614b4..0ba3a3d 100644
>>>>>>>>>>> --- a/virt/kvm/irq_comm.c
>>>>>>>>>>> +++ b/virt/kvm/irq_comm.c
>>>>>>>>>>> @@ -134,7 +134,22 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
>>>>>>>>>>>  	irq.level = 1;
>>>>>>>>>>>  	irq.shorthand = 0;
>>>>>>>>>>>  
>>>>>>>>>>> -	/* TODO Deal with RH bit of MSI message address */
>>>>>>>>>>> +	/*
>>>>>>>>>>> +	 * If the RH bit is set, we'll deliver to the processor running
>>>>>>>>>>> +	 * at the lowest priority.
>>>>>>>>>>> +	 */
>>>>>>>>>>> +	if (e->msi.address_lo & MSI_ADDR_REDIRECTION_LOWPRI) {
>>>>>>>>>>> +		irq.delivery_mode = MSI_DATA_DELIVERY_LOWPRI;
>>>>>>>>>>> +	} else {
>>>>>>>>>>> +		/*
>>>>>>>>>>> +		 * If the RH bit is not set, we'll deliver to the specific
>>>>>>>>>>> +		 * processor mentioned in destination ID, and ignore the DM
>>>>>>>>>>> +		 * bit.
>>>>>>>>>>> +		 */
>>>>>>>>>>> +		irq.dest_mode = MSI_ADDR_DEST_MODE_PHYSICAL;
>>>>>>>>>>> +		irq.delivery_mode = MSI_DATA_DELIVERY_FIXED;
>>>>>>>>>>> +	}
>>>>>>>>>>> +
>>>>>>>>>>>  	return kvm_irq_delivery_to_apic(kvm, NULL, &irq);
>>>>>>>>>>>  }
>>>>>>>>>>>  
>>>>>>>>>>
>>>>>>>>>> Do you happen to have a kvm unit test for this? Or how did you validate the
>>>>>>>>>> change? It doesn't look incorrect to me, I'd just like to check it in QEMU
>>>>>>>>>> as well which apparently already has the logic above but also some
>>>>>>>>>> contradictory comment.
>>>>>>>>>
>>>>>>>>> Err, no, QEMU does not have this logic, it also ignores RH.
>>>>>>>>>
>>>>>>>>> But the above bits make "irq.delivery_mode = e->msi.data & 0x700"
>>>>>>>>> pointless. And that strongly suggests something is still wrong.
>>>>>>>>
>>>>>>>> I tend to believe that this is what the spec tries to tell us:
>>>>>>>>
>>>>>>>> diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
>>>>>>>> index 9f614b4..b72f77a 100644
>>>>>>>> --- a/virt/kvm/irq_comm.c
>>>>>>>> +++ b/virt/kvm/irq_comm.c
>>>>>>>> @@ -128,7 +128,8 @@ int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
>>>>>>>>  			MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
>>>>>>>>  	irq.vector = (e->msi.data &
>>>>>>>>  			MSI_DATA_VECTOR_MASK) >> MSI_DATA_VECTOR_SHIFT;
>>>>>>>> -	irq.dest_mode = (1 << MSI_ADDR_DEST_MODE_SHIFT) & e->msi.address_lo;
>>>>>>>> +	irq.dest_mode = ((e->msi.address_lo & MSI_ADDR_DEST_MODE_LOGICAL) &&
>>>>>>>> +		(e->msi.address_lo & MSI_ADDR_REDIRECTION_LOWPRI));
>>>>>>>>  	irq.trig_mode = (1 << MSI_DATA_TRIGGER_SHIFT) & e->msi.data;
>>>>>>>>  	irq.delivery_mode = e->msi.data & 0x700;
>>>>>>>>  	irq.level = 1;
>>>>>>>>
>>>>>>>> ie. the DM flag is only relevant if RH is set, and RH==0 is equivalent
>>>>>>>> to RH==1 && DM==0.
>>>>>>>
>>>>>>> Thing is, the spec specifically states that RH==1 should deliver to
>>>>>>> lowest priority - even though it doesn't state what the relationship
>>>>>>> between delivery mode and the RH bit is.
>>>>>>
>>>>>> The spec says "When RH is 1 and the physical destination mode is used
>>>>>> [DM=0], the Destination ID field must not be set to 0xFF; it must point
>>>>>> to a processor that is present and enabled to receive the interrupt."
>>>>>>
>>>>>
>>>>> When RH=1 and DM=0 yes, but what happens when RH=1 and DM=1?
>>>>
>>>> irq.dest_mode becomes non-zero, and kvm_apic_match_dest uses
>>>> kvm_apic_match_logical_addr for filtering out possible target CPUs.
>>>>
>>>> Mmh, a remaining question is if kvm_irq_delivery_to_apic is then already
>>>> doing the right thing, even for delivery_mode != APIC_DM_LOWEST.
>>>>
>>>
>>> The missing part is that when RH=1 we must look for the lowest priority:
>>>
>>> "Redirection hint indication (RH) - This bit indicates whether the
>>> message should be directed to the processor with the lowest interrupt
>>> priority among processors that can receive the interrupt."
>>>
>>> So it's not enough to set dest_mode, we must also make sure that
>>> delivery_mode is set to low prio when RH=1.
>>
>> That's debatable. delivery_mode == APIC_DM_LOWEST includes this target
>> selection, but also more. I have a bad feeling when we just overwrite
>> delivery_mode as defined by the MSI data field instead of only patching
>> kvm_irq_delivery_to_apic or kvm_is_dm_lowest_prio - if required.
>>
> Patching them how? To behave exactly like delivery_mode == APIC_DM_LOWEST in
> case the RH bit is set? Then setting delivery_mode to APIC_DM_LOWEST will
> achieve the same goal.

Wrt selecting the target CPU, but not regarding the vector type
delivered to that CPU (think of obscure things like RH=1,
delivery_mode=APIC_DM_NMI). If RH=1 only meant hard-wiring delivery_mode
to a single value, then this would be redundant encoding. And that's
always suspicious (unless there is legacy involved).
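
To make the difference concrete, here is a rough standalone sketch of the
two readings. Nothing in it is KVM code: the sketch_* names, the struct and
the constants are made up for illustration, and the bit positions (RH in
address bit 3, delivery mode in data bits 10:8, NMI encoded as 0x400) are
assumed from the usual x86 MSI layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Illustrative constants only; the names are local to this sketch. */
#define SKETCH_ADDR_RH       (1u << 3)  /* redirection hint, address bit 3 */
#define SKETCH_DATA_DM_MASK  0x700u     /* delivery mode, data bits 10:8 */
#define SKETCH_DM_FIXED      0x000u
#define SKETCH_DM_LOWEST     0x100u
#define SKETCH_DM_NMI        0x400u

/* Hypothetical, reduced stand-in for the interrupt descriptor. */
struct sketch_irq {
	uint32_t delivery_mode;  /* what gets delivered */
	bool pick_lowest_prio;   /* how the target CPU is chosen */
};

/* Reading A: RH hard-wires delivery_mode, the data field is ignored. */
static struct sketch_irq sketch_decode_overwrite(uint32_t addr_lo, uint32_t data)
{
	struct sketch_irq irq;

	(void)data;
	irq.delivery_mode = (addr_lo & SKETCH_ADDR_RH) ? SKETCH_DM_LOWEST
						       : SKETCH_DM_FIXED;
	irq.pick_lowest_prio = (irq.delivery_mode == SKETCH_DM_LOWEST);
	return irq;
}

/* Reading B: RH only affects target selection, delivery_mode is kept. */
static struct sketch_irq sketch_decode_keep(uint32_t addr_lo, uint32_t data)
{
	struct sketch_irq irq;

	irq.delivery_mode = data & SKETCH_DATA_DM_MASK;
	irq.pick_lowest_prio = (addr_lo & SKETCH_ADDR_RH) ||
			       irq.delivery_mode == SKETCH_DM_LOWEST;
	return irq;
}

/* RH=1 with NMI in the data field: only reading B still delivers an NMI. */
static void sketch_selfcheck(void)
{
	struct sketch_irq a = sketch_decode_overwrite(SKETCH_ADDR_RH, SKETCH_DM_NMI);
	struct sketch_irq b = sketch_decode_keep(SKETCH_ADDR_RH, SKETCH_DM_NMI);

	assert(a.delivery_mode == SKETCH_DM_LOWEST && a.pick_lowest_prio);
	assert(b.delivery_mode == SKETCH_DM_NMI && b.pick_lowest_prio);
}
```

Both readings pick the lowest-priority CPU for RH=1; they only differ in
whether the delivery mode from the data field survives. Which one matches
real hardware is exactly the open question.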

Jan
Gleb Natapov Sept. 2, 2011, 3:03 p.m. UTC | #9
On Fri, Sep 02, 2011 at 04:52:51PM +0200, Jan Kiszka wrote:
> >> That's debatable. delivery_mode == APIC_DM_LOWEST includes this target
> >> selection, but also more. I have a bad feeling when we just overwrite
> >> delivery_mode as defined by the MSI data field instead of only patching
> >> kvm_irq_delivery_to_apic or kvm_is_dm_lowest_prio - if required.
> >>
> > Patching them how? To behave exactly like delivery_mode == APIC_DM_LOWEST in
> > case the RH bit is set? Then setting delivery_mode to APIC_DM_LOWEST will
> > achieve the same goal.
> 
> Wrt selecting the target CPU, but not regarding the vector type
> delivered to that CPU (think of obscure things like RH=1,
> delivery_mode=APIC_DM_NMI). If RH=1 only meant hard-wiring delivery_mode
> to a single value, then this would be redundant encoding. And that's
> always suspicious (unless there is legacy involved).
> 
Hmm, good point.

--
			Gleb.

Patch

diff --git a/virt/kvm/irq_comm.c b/virt/kvm/irq_comm.c
index 9f614b4..b72f77a 100644
--- a/virt/kvm/irq_comm.c
+++ b/virt/kvm/irq_comm.c
@@ -128,7 +128,8 @@  int kvm_set_msi(struct kvm_kernel_irq_routing_entry *e,
 			MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
 	irq.vector = (e->msi.data &
 			MSI_DATA_VECTOR_MASK) >> MSI_DATA_VECTOR_SHIFT;
-	irq.dest_mode = (1 << MSI_ADDR_DEST_MODE_SHIFT) & e->msi.address_lo;
+	irq.dest_mode = ((e->msi.address_lo & MSI_ADDR_DEST_MODE_LOGICAL) &&
+		(e->msi.address_lo & MSI_ADDR_REDIRECTION_LOWPRI));
 	irq.trig_mode = (1 << MSI_DATA_TRIGGER_SHIFT) & e->msi.data;
 	irq.delivery_mode = e->msi.data & 0x700;
 	irq.level = 1;
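
The dest_mode expression in the patch above reduces to a small truth table
over the RH and DM bits. The following standalone sketch spells it out. It
is not part of the patch: the SK_* constants and sk_effective_logical_dest()
are hypothetical names, and the bit positions (DM in address bit 2, RH in
address bit 3) are assumed to match what the MSI_ADDR_* macros encode.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Assumed bit positions in the lower 32 bits of the MSI address. */
#define SK_MSI_ADDR_DEST_MODE_LOGICAL   (1u << 2)  /* DM */
#define SK_MSI_ADDR_REDIRECTION_LOWPRI  (1u << 3)  /* RH */

/*
 * Logical destination mode is only honoured when RH is set as well;
 * with RH clear the message is treated as physical regardless of DM.
 */
static bool sk_effective_logical_dest(uint32_t address_lo)
{
	return (address_lo & SK_MSI_ADDR_DEST_MODE_LOGICAL) &&
	       (address_lo & SK_MSI_ADDR_REDIRECTION_LOWPRI);
}

/* Walk all four RH/DM combinations. */
static void sk_truth_table_check(void)
{
	const uint32_t dm = SK_MSI_ADDR_DEST_MODE_LOGICAL;
	const uint32_t rh = SK_MSI_ADDR_REDIRECTION_LOWPRI;

	assert(!sk_effective_logical_dest(0));        /* RH=0, DM=0: physical */
	assert(!sk_effective_logical_dest(dm));       /* RH=0, DM=1: physical */
	assert(!sk_effective_logical_dest(rh));       /* RH=1, DM=0: physical */
	assert(sk_effective_logical_dest(rh | dm));   /* RH=1, DM=1: logical  */
}
```

Note that the result is a plain 0/1 value, whereas the line being removed
stored the masked bit itself (non-zero for logical mode).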