diff mbox series

[v6,10/13] vpci/header: reset the command register when adding devices

Message ID 20220204063459.680961-11-andr2000@gmail.com (mailing list archive)
State New, archived
Headers show
Series PCI devices passthrough on Arm, part 3 | expand

Commit Message

Oleksandr Andrushchenko Feb. 4, 2022, 6:34 a.m. UTC
From: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>

Reset the command register when assigning a PCI device to a guest:
according to the PCI spec the PCI_COMMAND register is typically all 0's
after reset.

Signed-off-by: Oleksandr Andrushchenko <oleksandr_andrushchenko@epam.com>
---
Since v5:
- updated commit message
Since v1:
 - do not write 0 to the command register, but respect host settings.
---
 xen/drivers/vpci/header.c | 15 ++++++++++++---
 1 file changed, 12 insertions(+), 3 deletions(-)

Comments

Jan Beulich Feb. 4, 2022, 2:30 p.m. UTC | #1
On 04.02.2022 07:34, Oleksandr Andrushchenko wrote:
> Reset the command register when assigning a PCI device to a guest:
> according to the PCI spec the PCI_COMMAND register is typically all 0's
> after reset.

It's not entirely clear to me whether setting the hardware register to
zero is okay. What wants to be zero is the value the guest observes
initially.

> --- a/xen/drivers/vpci/header.c
> +++ b/xen/drivers/vpci/header.c
> @@ -454,8 +454,7 @@ static void cmd_write(const struct pci_dev *pdev, unsigned int reg,
>          pci_conf_write16(pdev->sbdf, reg, cmd);
>  }
>  
> -static void guest_cmd_write(const struct pci_dev *pdev, unsigned int reg,
> -                            uint32_t cmd, void *data)
> +static uint32_t emulate_cmd_reg(const struct pci_dev *pdev, uint32_t cmd)

The command register is a 16-bit one, so parameter and return type should
either be plain unsigned int (preferred, see ./CODING_STYLE) or uint16_t
imo.

Jan
Oleksandr Andrushchenko Feb. 4, 2022, 2:37 p.m. UTC | #2
On 04.02.22 16:30, Jan Beulich wrote:
> On 04.02.2022 07:34, Oleksandr Andrushchenko wrote:
>> Reset the command register when assigning a PCI device to a guest:
>> according to the PCI spec the PCI_COMMAND register is typically all 0's
>> after reset.
> It's not entirely clear to me whether setting the hardware register to
> zero is okay. What wants to be zero is the value the guest observes
> initially.
"the PCI spec says the PCI_COMMAND register is typically all 0's after reset."
Why wouldn't it be ok? What is the exact concern here?
>> --- a/xen/drivers/vpci/header.c
>> +++ b/xen/drivers/vpci/header.c
>> @@ -454,8 +454,7 @@ static void cmd_write(const struct pci_dev *pdev, unsigned int reg,
>>           pci_conf_write16(pdev->sbdf, reg, cmd);
>>   }
>>   
>> -static void guest_cmd_write(const struct pci_dev *pdev, unsigned int reg,
>> -                            uint32_t cmd, void *data)
>> +static uint32_t emulate_cmd_reg(const struct pci_dev *pdev, uint32_t cmd)
> The command register is a 16-bit one, so parameter and return type should
> either be plain unsigned int (preferred, see ./CODING_STYLE) or uint16_t
> imo.
God catch, thank you
> Jan
>
Thank you,
Oleksandr
Jan Beulich Feb. 7, 2022, 7:29 a.m. UTC | #3
On 04.02.2022 15:37, Oleksandr Andrushchenko wrote:
> On 04.02.22 16:30, Jan Beulich wrote:
>> On 04.02.2022 07:34, Oleksandr Andrushchenko wrote:
>>> Reset the command register when assigning a PCI device to a guest:
>>> according to the PCI spec the PCI_COMMAND register is typically all 0's
>>> after reset.
>> It's not entirely clear to me whether setting the hardware register to
>> zero is okay. What wants to be zero is the value the guest observes
>> initially.
> "the PCI spec says the PCI_COMMAND register is typically all 0's after reset."
> Why wouldn't it be ok? What is the exact concern here?

The concern is - as voiced is similar ways before, perhaps in other
contexts - that you need to consider bit-by-bit whether overwriting
with 0 what is currently there is okay. Xen and/or Dom0 may have put
values there which they expect to remain unaltered. I guess
PCI_COMMAND_SERR is a good example: While the guest's view of this
will want to be zero initially, the host having set it to 1 may not
easily be overwritten with 0, or else you'd effectively imply giving
the guest control of the bit.

Jan
Oleksandr Andrushchenko Feb. 7, 2022, 11:27 a.m. UTC | #4
On 07.02.22 09:29, Jan Beulich wrote:
> On 04.02.2022 15:37, Oleksandr Andrushchenko wrote:
>> On 04.02.22 16:30, Jan Beulich wrote:
>>> On 04.02.2022 07:34, Oleksandr Andrushchenko wrote:
>>>> Reset the command register when assigning a PCI device to a guest:
>>>> according to the PCI spec the PCI_COMMAND register is typically all 0's
>>>> after reset.
>>> It's not entirely clear to me whether setting the hardware register to
>>> zero is okay. What wants to be zero is the value the guest observes
>>> initially.
>> "the PCI spec says the PCI_COMMAND register is typically all 0's after reset."
>> Why wouldn't it be ok? What is the exact concern here?
> The concern is - as voiced is similar ways before, perhaps in other
> contexts - that you need to consider bit-by-bit whether overwriting
> with 0 what is currently there is okay. Xen and/or Dom0 may have put
> values there which they expect to remain unaltered. I guess
> PCI_COMMAND_SERR is a good example: While the guest's view of this
> will want to be zero initially, the host having set it to 1 may not
> easily be overwritten with 0, or else you'd effectively imply giving
> the guest control of the bit.
We have already discussed in great detail PCI_COMMAND emulation [1].
At the end you wrote [1]:
"Well, in order for the whole thing to be security supported it needs to
be explained for every bit why it is safe to allow the guest to drive it.
Until you mean vPCI to reach that state, leaving TODO notes in the code
for anything not investigated may indeed be good enough.

Jan"

So, this is why I left a TODO in the PCI_COMMAND emulation for now and only
care about INTx which is honored with the code in this patch.
>
> Jan
>

Thank you,
Oleksandr

[1] https://patchwork.kernel.org/project/xen-devel/patch/20210903100831.177748-9-andr2000@gmail.com/
[2] https://lists.xenproject.org/archives/html/xen-devel/2021-09/msg00737.html
Jan Beulich Feb. 7, 2022, 12:38 p.m. UTC | #5
On 07.02.2022 12:27, Oleksandr Andrushchenko wrote:
> 
> 
> On 07.02.22 09:29, Jan Beulich wrote:
>> On 04.02.2022 15:37, Oleksandr Andrushchenko wrote:
>>> On 04.02.22 16:30, Jan Beulich wrote:
>>>> On 04.02.2022 07:34, Oleksandr Andrushchenko wrote:
>>>>> Reset the command register when assigning a PCI device to a guest:
>>>>> according to the PCI spec the PCI_COMMAND register is typically all 0's
>>>>> after reset.
>>>> It's not entirely clear to me whether setting the hardware register to
>>>> zero is okay. What wants to be zero is the value the guest observes
>>>> initially.
>>> "the PCI spec says the PCI_COMMAND register is typically all 0's after reset."
>>> Why wouldn't it be ok? What is the exact concern here?
>> The concern is - as voiced is similar ways before, perhaps in other
>> contexts - that you need to consider bit-by-bit whether overwriting
>> with 0 what is currently there is okay. Xen and/or Dom0 may have put
>> values there which they expect to remain unaltered. I guess
>> PCI_COMMAND_SERR is a good example: While the guest's view of this
>> will want to be zero initially, the host having set it to 1 may not
>> easily be overwritten with 0, or else you'd effectively imply giving
>> the guest control of the bit.
> We have already discussed in great detail PCI_COMMAND emulation [1].
> At the end you wrote [1]:
> "Well, in order for the whole thing to be security supported it needs to
> be explained for every bit why it is safe to allow the guest to drive it.
> Until you mean vPCI to reach that state, leaving TODO notes in the code
> for anything not investigated may indeed be good enough.
> 
> Jan"
> 
> So, this is why I left a TODO in the PCI_COMMAND emulation for now and only
> care about INTx which is honored with the code in this patch.

Right. The issue I see is that the description does not have any
mention of this, but instead talks about simply writing zero.

Jan
Oleksandr Andrushchenko Feb. 7, 2022, 12:51 p.m. UTC | #6
On 07.02.22 14:38, Jan Beulich wrote:
> On 07.02.2022 12:27, Oleksandr Andrushchenko wrote:
>>
>> On 07.02.22 09:29, Jan Beulich wrote:
>>> On 04.02.2022 15:37, Oleksandr Andrushchenko wrote:
>>>> On 04.02.22 16:30, Jan Beulich wrote:
>>>>> On 04.02.2022 07:34, Oleksandr Andrushchenko wrote:
>>>>>> Reset the command register when assigning a PCI device to a guest:
>>>>>> according to the PCI spec the PCI_COMMAND register is typically all 0's
>>>>>> after reset.
>>>>> It's not entirely clear to me whether setting the hardware register to
>>>>> zero is okay. What wants to be zero is the value the guest observes
>>>>> initially.
>>>> "the PCI spec says the PCI_COMMAND register is typically all 0's after reset."
>>>> Why wouldn't it be ok? What is the exact concern here?
>>> The concern is - as voiced is similar ways before, perhaps in other
>>> contexts - that you need to consider bit-by-bit whether overwriting
>>> with 0 what is currently there is okay. Xen and/or Dom0 may have put
>>> values there which they expect to remain unaltered. I guess
>>> PCI_COMMAND_SERR is a good example: While the guest's view of this
>>> will want to be zero initially, the host having set it to 1 may not
>>> easily be overwritten with 0, or else you'd effectively imply giving
>>> the guest control of the bit.
>> We have already discussed in great detail PCI_COMMAND emulation [1].
>> At the end you wrote [1]:
>> "Well, in order for the whole thing to be security supported it needs to
>> be explained for every bit why it is safe to allow the guest to drive it.
>> Until you mean vPCI to reach that state, leaving TODO notes in the code
>> for anything not investigated may indeed be good enough.
>>
>> Jan"
>>
>> So, this is why I left a TODO in the PCI_COMMAND emulation for now and only
>> care about INTx which is honored with the code in this patch.
> Right. The issue I see is that the description does not have any
> mention of this, but instead talks about simply writing zero.
How do you want that mentioned? Extended commit message or
just a link to the thread [1]?
With the above done, do you think that writing 0's is an acceptable
approach as of now?
> Jan
>
Thank you,
Oleksandr
Jan Beulich Feb. 7, 2022, 12:54 p.m. UTC | #7
On 07.02.2022 13:51, Oleksandr Andrushchenko wrote:
> 
> 
> On 07.02.22 14:38, Jan Beulich wrote:
>> On 07.02.2022 12:27, Oleksandr Andrushchenko wrote:
>>>
>>> On 07.02.22 09:29, Jan Beulich wrote:
>>>> On 04.02.2022 15:37, Oleksandr Andrushchenko wrote:
>>>>> On 04.02.22 16:30, Jan Beulich wrote:
>>>>>> On 04.02.2022 07:34, Oleksandr Andrushchenko wrote:
>>>>>>> Reset the command register when assigning a PCI device to a guest:
>>>>>>> according to the PCI spec the PCI_COMMAND register is typically all 0's
>>>>>>> after reset.
>>>>>> It's not entirely clear to me whether setting the hardware register to
>>>>>> zero is okay. What wants to be zero is the value the guest observes
>>>>>> initially.
>>>>> "the PCI spec says the PCI_COMMAND register is typically all 0's after reset."
>>>>> Why wouldn't it be ok? What is the exact concern here?
>>>> The concern is - as voiced is similar ways before, perhaps in other
>>>> contexts - that you need to consider bit-by-bit whether overwriting
>>>> with 0 what is currently there is okay. Xen and/or Dom0 may have put
>>>> values there which they expect to remain unaltered. I guess
>>>> PCI_COMMAND_SERR is a good example: While the guest's view of this
>>>> will want to be zero initially, the host having set it to 1 may not
>>>> easily be overwritten with 0, or else you'd effectively imply giving
>>>> the guest control of the bit.
>>> We have already discussed in great detail PCI_COMMAND emulation [1].
>>> At the end you wrote [1]:
>>> "Well, in order for the whole thing to be security supported it needs to
>>> be explained for every bit why it is safe to allow the guest to drive it.
>>> Until you mean vPCI to reach that state, leaving TODO notes in the code
>>> for anything not investigated may indeed be good enough.
>>>
>>> Jan"
>>>
>>> So, this is why I left a TODO in the PCI_COMMAND emulation for now and only
>>> care about INTx which is honored with the code in this patch.
>> Right. The issue I see is that the description does not have any
>> mention of this, but instead talks about simply writing zero.
> How do you want that mentioned? Extended commit message or
> just a link to the thread [1]?

What I'd like you to describe is what the change does without
fundamentally implying it'll end up being zero which gets written
to the register. Stating as a conclusion that for the time being
this means writing zero is certainly fine (and likely helpful if
made explicit).

> With the above done, do you think that writing 0's is an acceptable
> approach as of now?

Well, yes, provided we have a sufficiently similar understanding
of what "acceptable" here means.

Jan
Oleksandr Andrushchenko Feb. 7, 2022, 2:17 p.m. UTC | #8
On 07.02.22 14:54, Jan Beulich wrote:
> On 07.02.2022 13:51, Oleksandr Andrushchenko wrote:
>>
>> On 07.02.22 14:38, Jan Beulich wrote:
>>> On 07.02.2022 12:27, Oleksandr Andrushchenko wrote:
>>>> On 07.02.22 09:29, Jan Beulich wrote:
>>>>> On 04.02.2022 15:37, Oleksandr Andrushchenko wrote:
>>>>>> On 04.02.22 16:30, Jan Beulich wrote:
>>>>>>> On 04.02.2022 07:34, Oleksandr Andrushchenko wrote:
>>>>>>>> Reset the command register when assigning a PCI device to a guest:
>>>>>>>> according to the PCI spec the PCI_COMMAND register is typically all 0's
>>>>>>>> after reset.
>>>>>>> It's not entirely clear to me whether setting the hardware register to
>>>>>>> zero is okay. What wants to be zero is the value the guest observes
>>>>>>> initially.
>>>>>> "the PCI spec says the PCI_COMMAND register is typically all 0's after reset."
>>>>>> Why wouldn't it be ok? What is the exact concern here?
>>>>> The concern is - as voiced is similar ways before, perhaps in other
>>>>> contexts - that you need to consider bit-by-bit whether overwriting
>>>>> with 0 what is currently there is okay. Xen and/or Dom0 may have put
>>>>> values there which they expect to remain unaltered. I guess
>>>>> PCI_COMMAND_SERR is a good example: While the guest's view of this
>>>>> will want to be zero initially, the host having set it to 1 may not
>>>>> easily be overwritten with 0, or else you'd effectively imply giving
>>>>> the guest control of the bit.
>>>> We have already discussed in great detail PCI_COMMAND emulation [1].
>>>> At the end you wrote [1]:
>>>> "Well, in order for the whole thing to be security supported it needs to
>>>> be explained for every bit why it is safe to allow the guest to drive it.
>>>> Until you mean vPCI to reach that state, leaving TODO notes in the code
>>>> for anything not investigated may indeed be good enough.
>>>>
>>>> Jan"
>>>>
>>>> So, this is why I left a TODO in the PCI_COMMAND emulation for now and only
>>>> care about INTx which is honored with the code in this patch.
>>> Right. The issue I see is that the description does not have any
>>> mention of this, but instead talks about simply writing zero.
>> How do you want that mentioned? Extended commit message or
>> just a link to the thread [1]?
> What I'd like you to describe is what the change does without
> fundamentally implying it'll end up being zero which gets written
> to the register. Stating as a conclusion that for the time being
> this means writing zero is certainly fine (and likely helpful if
> made explicit).
Xen and/or Dom0 may have put values in PCI_COMMAND which they expect
to remain unaltered. PCI_COMMAND_SERR bit is a good example: while the
guest's view of this will want to be zero initially, the host having set
it to 1 may not easily be overwritten with 0, or else we'd effectively
imply giving the guest control of the bit. Thus, PCI_COMMAND register needs
proper emulation in order to honor host's settings.

There are examples of emulators [1], [2] which already deal with PCI_COMMAND
register emulation and it seems that at most they care about the only INTX
bit (besides IO/memory enable and bus muster which are write through).
It could be because in order to properly emulate the PCI_COMMAND register
we need to know about the whole PCI topology, e.g. if any setting in device's
command register is aligned with the upstream port etc.
This makes me think that because of this complexity others just ignore that.
Neither I think this can be easily done in Xen case.

According to "PCI LOCAL BUS SPECIFICATION, REV. 3.0", section "6.2.2
Device Control" says that the reset state of the command register is
typically 0, so reset the command register when assigning a PCI device
to a guest t all 0's and for now only make sure INTx bit is set according
to if MSI/MSI-X enabled.

[1] https://github.com/qemu/qemu/blob/master/hw/xen/xen_pt_config_init.c#L310
[2] https://github.com/projectacrn/acrn-hypervisor/blob/master/hypervisor/hw/pci.c#L336

Will the above description be enough?

It also seems to be a good move to squash the following patches:
[PATCH v6 09/13] vpci/header: emulate PCI_COMMAND register for guests
[PATCH v6 10/13] vpci/header: reset the command register when adding devices

as they implement a single piece of functionality now.
>> With the above done, do you think that writing 0's is an acceptable
>> approach as of now?
> Well, yes, provided we have a sufficiently similar understanding
> of what "acceptable" here means.
>
> Jan
>
Thank you,
Oleksandr
Jan Beulich Feb. 7, 2022, 2:31 p.m. UTC | #9
On 07.02.2022 15:17, Oleksandr Andrushchenko wrote:
> 
> 
> On 07.02.22 14:54, Jan Beulich wrote:
>> On 07.02.2022 13:51, Oleksandr Andrushchenko wrote:
>>>
>>> On 07.02.22 14:38, Jan Beulich wrote:
>>>> On 07.02.2022 12:27, Oleksandr Andrushchenko wrote:
>>>>> On 07.02.22 09:29, Jan Beulich wrote:
>>>>>> On 04.02.2022 15:37, Oleksandr Andrushchenko wrote:
>>>>>>> On 04.02.22 16:30, Jan Beulich wrote:
>>>>>>>> On 04.02.2022 07:34, Oleksandr Andrushchenko wrote:
>>>>>>>>> Reset the command register when assigning a PCI device to a guest:
>>>>>>>>> according to the PCI spec the PCI_COMMAND register is typically all 0's
>>>>>>>>> after reset.
>>>>>>>> It's not entirely clear to me whether setting the hardware register to
>>>>>>>> zero is okay. What wants to be zero is the value the guest observes
>>>>>>>> initially.
>>>>>>> "the PCI spec says the PCI_COMMAND register is typically all 0's after reset."
>>>>>>> Why wouldn't it be ok? What is the exact concern here?
>>>>>> The concern is - as voiced is similar ways before, perhaps in other
>>>>>> contexts - that you need to consider bit-by-bit whether overwriting
>>>>>> with 0 what is currently there is okay. Xen and/or Dom0 may have put
>>>>>> values there which they expect to remain unaltered. I guess
>>>>>> PCI_COMMAND_SERR is a good example: While the guest's view of this
>>>>>> will want to be zero initially, the host having set it to 1 may not
>>>>>> easily be overwritten with 0, or else you'd effectively imply giving
>>>>>> the guest control of the bit.
>>>>> We have already discussed in great detail PCI_COMMAND emulation [1].
>>>>> At the end you wrote [1]:
>>>>> "Well, in order for the whole thing to be security supported it needs to
>>>>> be explained for every bit why it is safe to allow the guest to drive it.
>>>>> Until you mean vPCI to reach that state, leaving TODO notes in the code
>>>>> for anything not investigated may indeed be good enough.
>>>>>
>>>>> Jan"
>>>>>
>>>>> So, this is why I left a TODO in the PCI_COMMAND emulation for now and only
>>>>> care about INTx which is honored with the code in this patch.
>>>> Right. The issue I see is that the description does not have any
>>>> mention of this, but instead talks about simply writing zero.
>>> How do you want that mentioned? Extended commit message or
>>> just a link to the thread [1]?
>> What I'd like you to describe is what the change does without
>> fundamentally implying it'll end up being zero which gets written
>> to the register. Stating as a conclusion that for the time being
>> this means writing zero is certainly fine (and likely helpful if
>> made explicit).
> Xen and/or Dom0 may have put values in PCI_COMMAND which they expect
> to remain unaltered. PCI_COMMAND_SERR bit is a good example: while the
> guest's view of this will want to be zero initially, the host having set
> it to 1 may not easily be overwritten with 0, or else we'd effectively
> imply giving the guest control of the bit. Thus, PCI_COMMAND register needs
> proper emulation in order to honor host's settings.
> 
> There are examples of emulators [1], [2] which already deal with PCI_COMMAND
> register emulation and it seems that at most they care about the only INTX
> bit (besides IO/memory enable and bus muster which are write through).
> It could be because in order to properly emulate the PCI_COMMAND register
> we need to know about the whole PCI topology, e.g. if any setting in device's
> command register is aligned with the upstream port etc.
> This makes me think that because of this complexity others just ignore that.
> Neither I think this can be easily done in Xen case.
> 
> According to "PCI LOCAL BUS SPECIFICATION, REV. 3.0", section "6.2.2
> Device Control" says that the reset state of the command register is
> typically 0, so reset the command register when assigning a PCI device
> to a guest t all 0's and for now only make sure INTx bit is set according
> to if MSI/MSI-X enabled.

"... is typically 0, so when assigning a PCI device reset the guest view of
 the command register to all 0's. For now our emulation only makes sure INTx
 is set according to host requirements, i.e. depending on MSI/MSI-X enabled
 state."

Maybe? (Obviously a fresh device given to a guest will have MSI/MSI-X 
disabled, so I'm not sure that aspect really needs mentioning.)

But: What's still missing here then is the separation of guest and host
views. When we set INTx behind the guest's back, it shouldn't observe the
bit set. Or is this meant to be another (big) TODO?

Jan
Oleksandr Andrushchenko Feb. 7, 2022, 2:46 p.m. UTC | #10
On 07.02.22 16:31, Jan Beulich wrote:
> On 07.02.2022 15:17, Oleksandr Andrushchenko wrote:
>>
>> On 07.02.22 14:54, Jan Beulich wrote:
>>> On 07.02.2022 13:51, Oleksandr Andrushchenko wrote:
>>>> On 07.02.22 14:38, Jan Beulich wrote:
>>>>> On 07.02.2022 12:27, Oleksandr Andrushchenko wrote:
>>>>>> On 07.02.22 09:29, Jan Beulich wrote:
>>>>>>> On 04.02.2022 15:37, Oleksandr Andrushchenko wrote:
>>>>>>>> On 04.02.22 16:30, Jan Beulich wrote:
>>>>>>>>> On 04.02.2022 07:34, Oleksandr Andrushchenko wrote:
>>>>>>>>>> Reset the command register when assigning a PCI device to a guest:
>>>>>>>>>> according to the PCI spec the PCI_COMMAND register is typically all 0's
>>>>>>>>>> after reset.
>>>>>>>>> It's not entirely clear to me whether setting the hardware register to
>>>>>>>>> zero is okay. What wants to be zero is the value the guest observes
>>>>>>>>> initially.
>>>>>>>> "the PCI spec says the PCI_COMMAND register is typically all 0's after reset."
>>>>>>>> Why wouldn't it be ok? What is the exact concern here?
>>>>>>> The concern is - as voiced is similar ways before, perhaps in other
>>>>>>> contexts - that you need to consider bit-by-bit whether overwriting
>>>>>>> with 0 what is currently there is okay. Xen and/or Dom0 may have put
>>>>>>> values there which they expect to remain unaltered. I guess
>>>>>>> PCI_COMMAND_SERR is a good example: While the guest's view of this
>>>>>>> will want to be zero initially, the host having set it to 1 may not
>>>>>>> easily be overwritten with 0, or else you'd effectively imply giving
>>>>>>> the guest control of the bit.
>>>>>> We have already discussed in great detail PCI_COMMAND emulation [1].
>>>>>> At the end you wrote [1]:
>>>>>> "Well, in order for the whole thing to be security supported it needs to
>>>>>> be explained for every bit why it is safe to allow the guest to drive it.
>>>>>> Until you mean vPCI to reach that state, leaving TODO notes in the code
>>>>>> for anything not investigated may indeed be good enough.
>>>>>>
>>>>>> Jan"
>>>>>>
>>>>>> So, this is why I left a TODO in the PCI_COMMAND emulation for now and only
>>>>>> care about INTx which is honored with the code in this patch.
>>>>> Right. The issue I see is that the description does not have any
>>>>> mention of this, but instead talks about simply writing zero.
>>>> How do you want that mentioned? Extended commit message or
>>>> just a link to the thread [1]?
>>> What I'd like you to describe is what the change does without
>>> fundamentally implying it'll end up being zero which gets written
>>> to the register. Stating as a conclusion that for the time being
>>> this means writing zero is certainly fine (and likely helpful if
>>> made explicit).
>> Xen and/or Dom0 may have put values in PCI_COMMAND which they expect
>> to remain unaltered. PCI_COMMAND_SERR bit is a good example: while the
>> guest's view of this will want to be zero initially, the host having set
>> it to 1 may not easily be overwritten with 0, or else we'd effectively
>> imply giving the guest control of the bit. Thus, PCI_COMMAND register needs
>> proper emulation in order to honor host's settings.
>>
>> There are examples of emulators [1], [2] which already deal with PCI_COMMAND
>> register emulation and it seems that at most they care about the only INTX
>> bit (besides IO/memory enable and bus muster which are write through).
>> It could be because in order to properly emulate the PCI_COMMAND register
>> we need to know about the whole PCI topology, e.g. if any setting in device's
>> command register is aligned with the upstream port etc.
>> This makes me think that because of this complexity others just ignore that.
>> Neither I think this can be easily done in Xen case.
>>
>> According to "PCI LOCAL BUS SPECIFICATION, REV. 3.0", section "6.2.2
>> Device Control" says that the reset state of the command register is
>> typically 0, so reset the command register when assigning a PCI device
>> to a guest t all 0's and for now only make sure INTx bit is set according
>> to if MSI/MSI-X enabled.
> "... is typically 0, so when assigning a PCI device reset the guest view of
>   the command register to all 0's. For now our emulation only makes sure INTx
>   is set according to host requirements, i.e. depending on MSI/MSI-X enabled
>   state."
This sounds good, I will use it. Thank you
>
> Maybe? (Obviously a fresh device given to a guest will have MSI/MSI-X
> disabled, so I'm not sure that aspect really needs mentioning.)
>
> But: What's still missing here then is the separation of guest and host
> views. When we set INTx behind the guest's back, it shouldn't observe the
> bit set. Or is this meant to be another (big) TODO?
But, patch [PATCH v6 09/13] vpci/header: emulate PCI_COMMAND register for guests
already takes care of it, I mean that it will set/reset INTx for the guest
according to MSI/MSI-X. So, if we squash these two patches the whole
picture will be seen at once.
>
> Jan
>
Thank you,
Oleksandr
Jan Beulich Feb. 7, 2022, 3:05 p.m. UTC | #11
On 07.02.2022 15:46, Oleksandr Andrushchenko wrote:
> On 07.02.22 16:31, Jan Beulich wrote:
>> But: What's still missing here then is the separation of guest and host
>> views. When we set INTx behind the guest's back, it shouldn't observe the
>> bit set. Or is this meant to be another (big) TODO?
> But, patch [PATCH v6 09/13] vpci/header: emulate PCI_COMMAND register for guests
> already takes care of it, I mean that it will set/reset INTx for the guest
> according to MSI/MSI-X. So, if we squash these two patches the whole
> picture will be seen at once.

Does it? I did get the impression that the guest would be able to observe
the bit set even after writing zero to it (while a reason exists that Xen
wants the bit set).

Jan
Oleksandr Andrushchenko Feb. 7, 2022, 3:14 p.m. UTC | #12
On 07.02.22 17:05, Jan Beulich wrote:
> On 07.02.2022 15:46, Oleksandr Andrushchenko wrote:
>> On 07.02.22 16:31, Jan Beulich wrote:
>>> But: What's still missing here then is the separation of guest and host
>>> views. When we set INTx behind the guest's back, it shouldn't observe the
>>> bit set. Or is this meant to be another (big) TODO?
>> But, patch [PATCH v6 09/13] vpci/header: emulate PCI_COMMAND register for guests
>> already takes care of it, I mean that it will set/reset INTx for the guest
>> according to MSI/MSI-X. So, if we squash these two patches the whole
>> picture will be seen at once.
> Does it? I did get the impression that the guest would be able to observe
> the bit set even after writing zero to it (while a reason exists that Xen
> wants the bit set).
Yes, you are correct: guest might not see what it wanted to set.
I meant that Xen won't allow resetting INTx if it is not possible
due to MSI/MSI-X

Anyways, I think squashing will be a good idea to have the relevant
functionality in a single change set. Will this work for you?
> Jan
>
Thank you,
Oleksandr
Jan Beulich Feb. 7, 2022, 3:28 p.m. UTC | #13
On 07.02.2022 16:14, Oleksandr Andrushchenko wrote:
> On 07.02.22 17:05, Jan Beulich wrote:
>> On 07.02.2022 15:46, Oleksandr Andrushchenko wrote:
>>> On 07.02.22 16:31, Jan Beulich wrote:
>>>> But: What's still missing here then is the separation of guest and host
>>>> views. When we set INTx behind the guest's back, it shouldn't observe the
>>>> bit set. Or is this meant to be another (big) TODO?
>>> But, patch [PATCH v6 09/13] vpci/header: emulate PCI_COMMAND register for guests
>>> already takes care of it, I mean that it will set/reset INTx for the guest
>>> according to MSI/MSI-X. So, if we squash these two patches the whole
>>> picture will be seen at once.
>> Does it? I did get the impression that the guest would be able to observe
>> the bit set even after writing zero to it (while a reason exists that Xen
>> wants the bit set).
> Yes, you are correct: guest might not see what it wanted to set.
> I meant that Xen won't allow resetting INTx if it is not possible
> due to MSI/MSI-X
> 
> Anyways, I think squashing will be a good idea to have the relevant
> functionality in a single change set. Will this work for you?

It might work, but I'd prefer things which can sensibly be separate to
remain separate.

Jan
Oleksandr Andrushchenko Feb. 7, 2022, 3:59 p.m. UTC | #14
On 07.02.22 17:28, Jan Beulich wrote:
> On 07.02.2022 16:14, Oleksandr Andrushchenko wrote:
>> On 07.02.22 17:05, Jan Beulich wrote:
>>> On 07.02.2022 15:46, Oleksandr Andrushchenko wrote:
>>>> On 07.02.22 16:31, Jan Beulich wrote:
>>>>> But: What's still missing here then is the separation of guest and host
>>>>> views. When we set INTx behind the guest's back, it shouldn't observe the
>>>>> bit set. Or is this meant to be another (big) TODO?
>>>> But, patch [PATCH v6 09/13] vpci/header: emulate PCI_COMMAND register for guests
>>>> already takes care of it, I mean that it will set/reset INTx for the guest
>>>> according to MSI/MSI-X. So, if we squash these two patches the whole
>>>> picture will be seen at once.
>>> Does it? I did get the impression that the guest would be able to observe
>>> the bit set even after writing zero to it (while a reason exists that Xen
>>> wants the bit set).
>> Yes, you are correct: guest might not see what it wanted to set.
>> I meant that Xen won't allow resetting INTx if it is not possible
>> due to MSI/MSI-X
>>
>> Anyways, I think squashing will be a good idea to have the relevant
>> functionality in a single change set. Will this work for you?
> It might work, but I'd prefer things which can sensibly be separate to
> remain separate.
Ok, two patches
> Jan
>
Oleksandr Andrushchenko Feb. 10, 2022, 12:54 p.m. UTC | #15
On 07.02.22 16:31, Jan Beulich wrote:
> On 07.02.2022 15:17, Oleksandr Andrushchenko wrote:
> But: What's still missing here then is the separation of guest and host
> views. When we set INTx behind the guest's back, it shouldn't observe the
> bit set. Or is this meant to be another (big) TODO?
Why not? This seems to be when a guest tries to both enable MSI/MSI-X
and INTx which is a wrong combination. Let's pretend to be a really
smart PCI device which partially rejects such PCI_COMMAND write,
so guest still sees the register consistent wrt INTx bit. Namely it remains
set.
>
> Jan
>
>
Oleksandr Andrushchenko Feb. 10, 2022, 12:59 p.m. UTC | #16
On 07.02.22 16:31, Jan Beulich wrote:
> On 07.02.2022 15:17, Oleksandr Andrushchenko wrote:
>>
>> On 07.02.22 14:54, Jan Beulich wrote:
>>> On 07.02.2022 13:51, Oleksandr Andrushchenko wrote:
>>>> On 07.02.22 14:38, Jan Beulich wrote:
>>>>> On 07.02.2022 12:27, Oleksandr Andrushchenko wrote:
>>>>>> On 07.02.22 09:29, Jan Beulich wrote:
>>>>>>> On 04.02.2022 15:37, Oleksandr Andrushchenko wrote:
>>>>>>>> On 04.02.22 16:30, Jan Beulich wrote:
>>>>>>>>> On 04.02.2022 07:34, Oleksandr Andrushchenko wrote:
>>>>>>>>>> Reset the command register when assigning a PCI device to a guest:
>>>>>>>>>> according to the PCI spec the PCI_COMMAND register is typically all 0's
>>>>>>>>>> after reset.
>>>>>>>>> It's not entirely clear to me whether setting the hardware register to
>>>>>>>>> zero is okay. What wants to be zero is the value the guest observes
>>>>>>>>> initially.
>>>>>>>> "the PCI spec says the PCI_COMMAND register is typically all 0's after reset."
>>>>>>>> Why wouldn't it be ok? What is the exact concern here?
>>>>>>> The concern is - as voiced is similar ways before, perhaps in other
>>>>>>> contexts - that you need to consider bit-by-bit whether overwriting
>>>>>>> with 0 what is currently there is okay. Xen and/or Dom0 may have put
>>>>>>> values there which they expect to remain unaltered. I guess
>>>>>>> PCI_COMMAND_SERR is a good example: While the guest's view of this
>>>>>>> will want to be zero initially, the host having set it to 1 may not
>>>>>>> easily be overwritten with 0, or else you'd effectively imply giving
>>>>>>> the guest control of the bit.
>>>>>> We have already discussed in great detail PCI_COMMAND emulation [1].
>>>>>> At the end you wrote [1]:
>>>>>> "Well, in order for the whole thing to be security supported it needs to
>>>>>> be explained for every bit why it is safe to allow the guest to drive it.
>>>>>> Until you mean vPCI to reach that state, leaving TODO notes in the code
>>>>>> for anything not investigated may indeed be good enough.
>>>>>>
>>>>>> Jan"
>>>>>>
>>>>>> So, this is why I left a TODO in the PCI_COMMAND emulation for now and only
>>>>>> care about INTx which is honored with the code in this patch.
>>>>> Right. The issue I see is that the description does not have any
>>>>> mention of this, but instead talks about simply writing zero.
>>>> How do you want that mentioned? Extended commit message or
>>>> just a link to the thread [1]?
>>> What I'd like you to describe is what the change does without
>>> fundamentally implying it'll end up being zero which gets written
>>> to the register. Stating as a conclusion that for the time being
>>> this means writing zero is certainly fine (and likely helpful if
>>> made explicit).
>> Xen and/or Dom0 may have put values in PCI_COMMAND which they expect
>> to remain unaltered. PCI_COMMAND_SERR bit is a good example: while the
>> guest's view of this will want to be zero initially, the host having set
>> it to 1 may not easily be overwritten with 0, or else we'd effectively
>> imply giving the guest control of the bit. Thus, PCI_COMMAND register needs
>> proper emulation in order to honor host's settings.
>>
>> There are examples of emulators [1], [2] which already deal with PCI_COMMAND
>> register emulation and it seems that at most they care about the only INTX
>> bit (besides IO/memory enable and bus muster which are write through).
>> It could be because in order to properly emulate the PCI_COMMAND register
>> we need to know about the whole PCI topology, e.g. if any setting in device's
>> command register is aligned with the upstream port etc.
>> This makes me think that because of this complexity others just ignore that.
>> Neither I think this can be easily done in Xen case.
>>
>> According to "PCI LOCAL BUS SPECIFICATION, REV. 3.0", section "6.2.2
>> Device Control" says that the reset state of the command register is
>> typically 0, so reset the command register when assigning a PCI device
>> to a guest t all 0's and for now only make sure INTx bit is set according
>> to if MSI/MSI-X enabled.
> "... is typically 0, so when assigning a PCI device reset the guest view of
>   the command register to all 0's. For now our emulation only makes sure INTx
>   is set according to host requirements, i.e. depending on MSI/MSI-X enabled
>   state."
I'll put this description into PCI_COMMAND emulation patch
>
> Maybe? (Obviously a fresh device given to a guest will have MSI/MSI-X
> disabled, so I'm not sure that aspect really needs mentioning.)
>
> But: What's still missing here then is the separation of guest and host
> views. When we set INTx behind the guest's back, it shouldn't observe the
> bit set. Or is this meant to be another (big) TODO?
>
> Jan
>
Thank you,
Oleksandr
Jan Beulich Feb. 10, 2022, 1:36 p.m. UTC | #17
On 10.02.2022 13:54, Oleksandr Andrushchenko wrote:
> On 07.02.22 16:31, Jan Beulich wrote:
>> On 07.02.2022 15:17, Oleksandr Andrushchenko wrote:
>> But: What's still missing here then is the separation of guest and host
>> views. When we set INTx behind the guest's back, it shouldn't observe the
>> bit set. Or is this meant to be another (big) TODO?
> Why not? This seems to be when a guest tries to both enable MSI/MSI-X
> and INTx which is a wrong combination. Let's pretend to be a really
> smart PCI device which partially rejects such PCI_COMMAND write,
> so guest still sees the register consistent wrt INTx bit. Namely it remains
> set.

I'm afraid this wouldn't be "smart", but "buggy". I'm not aware of
the spec leaving room for such behavior. And our emulation should
give the guest a spec-compliant view of the device.

Jan
Oleksandr Andrushchenko Feb. 10, 2022, 1:56 p.m. UTC | #18
On 10.02.22 15:36, Jan Beulich wrote:
> On 10.02.2022 13:54, Oleksandr Andrushchenko wrote:
>> On 07.02.22 16:31, Jan Beulich wrote:
>>> On 07.02.2022 15:17, Oleksandr Andrushchenko wrote:
>>> But: What's still missing here then is the separation of guest and host
>>> views. When we set INTx behind the guest's back, it shouldn't observe the
>>> bit set. Or is this meant to be another (big) TODO?
>> Why not? This seems to be when a guest tries to both enable MSI/MSI-X
>> and INTx which is a wrong combination. Let's pretend to be a really
>> smart PCI device which partially rejects such PCI_COMMAND write,
>> so guest still sees the register consistent wrt INTx bit. Namely it remains
>> set.
> I'm afraid this wouldn't be "smart", but "buggy". I'm not aware of
> the spec leaving room for such behavior. And our emulation should
> give the guest a spec-compliant view of the device.
This means we need to emulate PCI_COMMAND for guests in terms
we need to maintain their state just like we do for BARs (header->guest_reg)
So, we will need header->guest_cmd to hold the state
>
> Jan
>
>
Thank you,
Oleksandr
diff mbox series

Patch

diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c
index 33d8c15ae6e8..407fa2fc4749 100644
--- a/xen/drivers/vpci/header.c
+++ b/xen/drivers/vpci/header.c
@@ -454,8 +454,7 @@  static void cmd_write(const struct pci_dev *pdev, unsigned int reg,
         pci_conf_write16(pdev->sbdf, reg, cmd);
 }
 
-static void guest_cmd_write(const struct pci_dev *pdev, unsigned int reg,
-                            uint32_t cmd, void *data)
+static uint32_t emulate_cmd_reg(const struct pci_dev *pdev, uint32_t cmd)
 {
     /* TODO: Add proper emulation for all bits of the command register. */
 
@@ -467,7 +466,13 @@  static void guest_cmd_write(const struct pci_dev *pdev, unsigned int reg,
     }
 #endif
 
-    cmd_write(pdev, reg, cmd, data);
+    return cmd;
+}
+
+static void guest_cmd_write(const struct pci_dev *pdev, unsigned int reg,
+                            uint32_t cmd, void *data)
+{
+    cmd_write(pdev, reg, emulate_cmd_reg(pdev, cmd), data);
 }
 
 static void bar_write(const struct pci_dev *pdev, unsigned int reg,
@@ -676,6 +681,10 @@  static int init_bars(struct pci_dev *pdev)
         return -EOPNOTSUPP;
     }
 
+    /* Reset the command register for the guest. */
+    if ( !is_hwdom )
+        pci_conf_write16(pdev->sbdf, PCI_COMMAND, emulate_cmd_reg(pdev, 0));
+
     /* Setup a handler for the command register. */
     rc = vpci_add_register(pdev->vpci, vpci_hw_read16,
                            is_hwdom ? cmd_write : guest_cmd_write,