Message ID | 20230913143550.14565-2-stewart.hildebrand@amd.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
Series | vPCI capabilities filtering | expand |
On 13.09.2023 16:35, Stewart Hildebrand wrote: > Introduce a handler for the PCI status register, with ability to mask the > capabilities bit. The status register contains RsvdZ bits, read-only bits, and > write-1-to-clear bits, so introduce bitmasks to handle these in vPCI. If a bit > in the bitmask is set, then the special meaning applies: > > rsvdz_mask: read as zero, guest write ignore (write zero to hardware) > ro_mask: read normal, guest write ignore (preserve on write to hardware) > rw1c_mask: read normal, write 1 to clear > > The RsvdZ naming was borrowed from the PCI Express Base 4.0 specification. > > Xen preserves the value of read-only bits on write to hardware, discarding the > guests write value. > > The mask_cap_list flag will be set in a follow-on patch. > > Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> Reviewed-by: Jan Beulich <jbeulich@suse.com>
On Wed, Sep 13, 2023 at 10:35:46AM -0400, Stewart Hildebrand wrote: > Introduce a handler for the PCI status register, with ability to mask the > capabilities bit. The status register contains RsvdZ bits, read-only bits, and > write-1-to-clear bits, so introduce bitmasks to handle these in vPCI. If a bit > in the bitmask is set, then the special meaning applies: > > rsvdz_mask: read as zero, guest write ignore (write zero to hardware) > ro_mask: read normal, guest write ignore (preserve on write to hardware) > rw1c_mask: read normal, write 1 to clear > > The RsvdZ naming was borrowed from the PCI Express Base 4.0 specification. > > Xen preserves the value of read-only bits on write to hardware, discarding the > guests write value. > > The mask_cap_list flag will be set in a follow-on patch. > > Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> > --- > v6->v7: > * re-work args passed to vpci_add_register_mask() (called in init_bars()) > * also check for overlap of (rsvdz_mask & ro_mask) in add_register() > * slightly adjust masking operation in vpci_write_helper() > > v5->v6: > * remove duplicate PCI_STATUS_CAP_LIST in constant definition > * style fixup in constant definitions > * s/res_mask/rsvdz_mask/ > * combine a new masking operation into single line > * preserve r/o bits on write > * get rid of status_read. Instead, use rsvdz_mask for conditionally masking > PCI_STATUS_CAP_LIST bit > * add comment about PCI_STATUS_CAP_LIST and rsvdp behavior > * add sanity checks in add_register > * move mask_cap_list from struct vpci_header to local variable > > v4->v5: > * add support for res_mask > * add support for ro_mask (squash ro_mask patch) > * add constants for reserved, read-only, and rw1c masks > > v3->v4: > * move mask_cap_list setting to the capabilities patch > * single pci_conf_read16 in status_read > * align mask_cap_list bitfield in struct vpci_header > * change to rw1c bit mask instead of treating whole register as rw1c > * drop subsystem prefix on renamed add_register function > > v2->v3: > * new patch > --- > xen/drivers/vpci/header.c | 16 +++++++++++ > xen/drivers/vpci/vpci.c | 55 +++++++++++++++++++++++++++++--------- > xen/include/xen/pci_regs.h | 9 +++++++ > xen/include/xen/vpci.h | 8 ++++++ > 4 files changed, 76 insertions(+), 12 deletions(-) > > diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c > index 767c1ba718d7..af267b75ac31 100644 > --- a/xen/drivers/vpci/header.c > +++ b/xen/drivers/vpci/header.c > @@ -521,6 +521,7 @@ static int cf_check init_bars(struct pci_dev *pdev) > struct vpci_header *header = &pdev->vpci->header; > struct vpci_bar *bars = header->bars; > int rc; > + bool mask_cap_list = false; > > switch ( pci_conf_read8(pdev->sbdf, PCI_HEADER_TYPE) & 0x7f ) > { > @@ -544,6 +545,21 @@ static int cf_check init_bars(struct pci_dev *pdev) > if ( rc ) > return rc; > > + /* > + * Utilize rsvdz_mask to hide PCI_STATUS_CAP_LIST from the guest for now. If > + * support for rsvdp (reserved & preserved) is added in the future, the > + * rsvdp mask should be used instead. > + */ > + rc = vpci_add_register_mask(pdev->vpci, vpci_hw_read16, vpci_hw_write16, > + PCI_STATUS, 2, NULL, > + PCI_STATUS_RSVDZ_MASK | > + (mask_cap_list ? PCI_STATUS_CAP_LIST : 0), > + PCI_STATUS_RO_MASK & > + ~(mask_cap_list ? PCI_STATUS_CAP_LIST : 0), > + PCI_STATUS_RW1C_MASK); > + if ( rc ) > + return rc; > + > if ( pdev->ignore_bars ) > return 0; > > diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c > index 3bec9a4153da..b4cde7db1b3f 100644 > --- a/xen/drivers/vpci/vpci.c > +++ b/xen/drivers/vpci/vpci.c > @@ -29,6 +29,9 @@ struct vpci_register { > unsigned int offset; > void *private; > struct list_head node; > + uint32_t rsvdz_mask; > + uint32_t ro_mask; > + uint32_t rw1c_mask; I understand that we need the rw1c_mask in order to properly merge values when doing partial writes, but the other fields I'm not sure we do need them. RO bits don't care about what's written to them, and RsvdZ are always read as 0 and written as 0, so the mask shouldn't affect the merging. > }; > > #ifdef __XEN__ > @@ -145,9 +148,16 @@ uint32_t cf_check vpci_hw_read32( > return pci_conf_read32(pdev->sbdf, reg); > } > > -int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler, > - vpci_write_t *write_handler, unsigned int offset, > - unsigned int size, void *data) > +void cf_check vpci_hw_write16( > + const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data) > +{ > + pci_conf_write16(pdev->sbdf, reg, val); > +} > + > +static int add_register(struct vpci *vpci, vpci_read_t *read_handler, > + vpci_write_t *write_handler, unsigned int offset, > + unsigned int size, void *data, uint32_t rsvdz_mask, > + uint32_t ro_mask, uint32_t rw1c_mask) > { > struct list_head *prev; > struct vpci_register *r; > @@ -155,7 +165,8 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler, > /* Some sanity checks. */ > if ( (size != 1 && size != 2 && size != 4) || > offset >= PCI_CFG_SPACE_EXP_SIZE || (offset & (size - 1)) || > - (!read_handler && !write_handler) ) > + (!read_handler && !write_handler) || (rsvdz_mask & ro_mask) || > + (rsvdz_mask & rw1c_mask) || (ro_mask & rw1c_mask) ) > return -EINVAL; > > r = xmalloc(struct vpci_register); > @@ -167,6 +178,9 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler, > r->size = size; > r->offset = offset; > r->private = data; > + r->rsvdz_mask = rsvdz_mask & (0xffffffffU >> (32 - 8 * size)); > + r->ro_mask = ro_mask & (0xffffffffU >> (32 - 8 * size)); > + r->rw1c_mask = rw1c_mask & (0xffffffffU >> (32 - 8 * size)); > > spin_lock(&vpci->lock); > > @@ -193,6 +207,23 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler, > return 0; > } > > +int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler, > + vpci_write_t *write_handler, unsigned int offset, > + unsigned int size, void *data) > +{ > + return add_register(vpci, read_handler, write_handler, offset, size, data, > + 0, 0, 0); > +} > + > +int vpci_add_register_mask(struct vpci *vpci, vpci_read_t *read_handler, > + vpci_write_t *write_handler, unsigned int offset, > + unsigned int size, void *data, uint32_t rsvdz_mask, > + uint32_t ro_mask, uint32_t rw1c_mask) > +{ > + return add_register(vpci, read_handler, write_handler, offset, size, data, > + rsvdz_mask, ro_mask, rw1c_mask); > +} > + > int vpci_remove_register(struct vpci *vpci, unsigned int offset, > unsigned int size) > { > @@ -376,6 +407,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size) > } > > val = r->read(pdev, r->offset, r->private); > + val &= ~r->rsvdz_mask; > > /* Check if the read is in the middle of a register. */ > if ( r->offset < emu.offset ) > @@ -407,26 +439,25 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size) > > /* > * Perform a maybe partial write to a register. > - * > - * Note that this will only work for simple registers, if Xen needs to > - * trap accesses to rw1c registers (like the status PCI header register) > - * the logic in vpci_write will have to be expanded in order to correctly > - * deal with them. > */ > static void vpci_write_helper(const struct pci_dev *pdev, > const struct vpci_register *r, unsigned int size, > unsigned int offset, uint32_t data) > { > + uint32_t val = 0; > + > ASSERT(size <= r->size); > > - if ( size != r->size ) > + if ( (size != r->size) || r->ro_mask ) Hm, I'm not sure this specific handling for read-only bits is required, software is allowed to write either 0 or 1 to read-only bits, but the write will be ignored. > { > - uint32_t val; > - > val = r->read(pdev, r->offset, r->private); > + val &= ~r->rw1c_mask; > data = merge_result(val, data, size, offset); > } > > + data &= ~(r->rsvdz_mask | r->ro_mask); > + data |= val & r->ro_mask; You cannot apply the register masks directly into the final value, you need to offset and mask them as necessary, likewise for val, see what's done in merge_result(). Regardless of the offset issue, I think the usage of val with the ro_mask is bogus here, see above. > + > r->write(pdev, r->offset, data & (0xffffffffU >> (32 - 8 * r->size)), > r->private); > } > diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h > index 84b18736a85d..b72131729db6 100644 > --- a/xen/include/xen/pci_regs.h > +++ b/xen/include/xen/pci_regs.h > @@ -66,6 +66,15 @@ > #define PCI_STATUS_REC_MASTER_ABORT 0x2000 /* Set on master abort */ > #define PCI_STATUS_SIG_SYSTEM_ERROR 0x4000 /* Set when we drive SERR */ > #define PCI_STATUS_DETECTED_PARITY 0x8000 /* Set on parity error */ > +#define PCI_STATUS_RSVDZ_MASK 0x0006 In my copy of the PCIe 6 spec bit 6 is also RsvdZ, so the mask should be 0x46. But I'm unsure we really need this mask. Thanks, Roger.
On 17.11.2023 13:40, Roger Pau Monné wrote: > On Wed, Sep 13, 2023 at 10:35:46AM -0400, Stewart Hildebrand wrote: >> --- a/xen/drivers/vpci/vpci.c >> +++ b/xen/drivers/vpci/vpci.c >> @@ -29,6 +29,9 @@ struct vpci_register { >> unsigned int offset; >> void *private; >> struct list_head node; >> + uint32_t rsvdz_mask; >> + uint32_t ro_mask; >> + uint32_t rw1c_mask; > > I understand that we need the rw1c_mask in order to properly merge > values when doing partial writes, but the other fields I'm not sure we > do need them. RO bits don't care about what's written to them, and > RsvdZ are always read as 0 and written as 0, so the mask shouldn't > affect the merging. What some version of the spec says is r/o or reserved may be different in another. Also iirc devices may (wrongly?) implement r/o bits as r/w. When presenting a virtual view of devices to guests, in this regard I think we want (or even need) to enforce our view of the world. Jan
On Wed, Sep 13, 2023 at 10:35:46AM -0400, Stewart Hildebrand wrote: > +int vpci_add_register_mask(struct vpci *vpci, vpci_read_t *read_handler, > + vpci_write_t *write_handler, unsigned int offset, > + unsigned int size, void *data, uint32_t rsvdz_mask, > + uint32_t ro_mask, uint32_t rw1c_mask) Forgot to mention, it would be good if you can add some tests in tools/tests/vpci that ensure the masks work as expected. Thanks, Roger.
On Fri, Nov 17, 2023 at 02:23:42PM +0100, Jan Beulich wrote: > On 17.11.2023 13:40, Roger Pau Monné wrote: > > On Wed, Sep 13, 2023 at 10:35:46AM -0400, Stewart Hildebrand wrote: > >> --- a/xen/drivers/vpci/vpci.c > >> +++ b/xen/drivers/vpci/vpci.c > >> @@ -29,6 +29,9 @@ struct vpci_register { > >> unsigned int offset; > >> void *private; > >> struct list_head node; > >> + uint32_t rsvdz_mask; > >> + uint32_t ro_mask; > >> + uint32_t rw1c_mask; > > > > I understand that we need the rw1c_mask in order to properly merge > > values when doing partial writes, but the other fields I'm not sure we > > do need them. RO bits don't care about what's written to them, and > > RsvdZ are always read as 0 and written as 0, so the mask shouldn't > > affect the merging. > > What some version of the spec says is r/o or reserved may be different > in another. Also iirc devices may (wrongly?) implement r/o bits as r/w. > When presenting a virtual view of devices to guests, in this regard I > think we want (or even need) to enforce our view of the world. That needs to be part of the commit message then. Ideally we would also want to do a swept of all registers we currently implement, in order to check for ro or rsvdz bits and properly enforce them. Thanks, Roger.
On Fri, Nov 17, 2023 at 01:40:37PM +0100, Roger Pau Monné wrote: > On Wed, Sep 13, 2023 at 10:35:46AM -0400, Stewart Hildebrand wrote: > > { > > - uint32_t val; > > - > > val = r->read(pdev, r->offset, r->private); > > + val &= ~r->rw1c_mask; > > data = merge_result(val, data, size, offset); > > } > > > > + data &= ~(r->rsvdz_mask | r->ro_mask); > > + data |= val & r->ro_mask; > > You cannot apply the register masks directly into the final value, you > need to offset and mask them as necessary, likewise for val, see > what's done in merge_result(). Never mind, I was wrong, there's no need to offset anything here. Roger.
On Wed, Sep 13, 2023 at 10:35:46AM -0400, Stewart Hildebrand wrote: > @@ -407,26 +439,25 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size) > > /* > * Perform a maybe partial write to a register. > - * > - * Note that this will only work for simple registers, if Xen needs to > - * trap accesses to rw1c registers (like the status PCI header register) > - * the logic in vpci_write will have to be expanded in order to correctly > - * deal with them. > */ > static void vpci_write_helper(const struct pci_dev *pdev, > const struct vpci_register *r, unsigned int size, > unsigned int offset, uint32_t data) > { > + uint32_t val = 0; > + > ASSERT(size <= r->size); > > - if ( size != r->size ) > + if ( (size != r->size) || r->ro_mask ) > { > - uint32_t val; > - > val = r->read(pdev, r->offset, r->private); > + val &= ~r->rw1c_mask; > data = merge_result(val, data, size, offset); > } > > + data &= ~(r->rsvdz_mask | r->ro_mask); > + data |= val & r->ro_mask; I've been thinking about this, and the way the ro_mask is implemented (and the way we want to handle ro bits) is the same behavior as RsvdP. I would suggest to rename the ro_mask to rsvdp_mask and note that for resilience reasons we will handle RO bits as RsvdP. Thanks, Roger.
On 11/21/23 09:45, Roger Pau Monné wrote: > On Wed, Sep 13, 2023 at 10:35:46AM -0400, Stewart Hildebrand wrote: >> @@ -407,26 +439,25 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size) >> >> /* >> * Perform a maybe partial write to a register. >> - * >> - * Note that this will only work for simple registers, if Xen needs to >> - * trap accesses to rw1c registers (like the status PCI header register) >> - * the logic in vpci_write will have to be expanded in order to correctly >> - * deal with them. >> */ >> static void vpci_write_helper(const struct pci_dev *pdev, >> const struct vpci_register *r, unsigned int size, >> unsigned int offset, uint32_t data) >> { >> + uint32_t val = 0; >> + >> ASSERT(size <= r->size); >> >> - if ( size != r->size ) >> + if ( (size != r->size) || r->ro_mask ) >> { >> - uint32_t val; >> - >> val = r->read(pdev, r->offset, r->private); >> + val &= ~r->rw1c_mask; >> data = merge_result(val, data, size, offset); >> } >> >> + data &= ~(r->rsvdz_mask | r->ro_mask); >> + data |= val & r->ro_mask; > > I've been thinking about this, and the way the ro_mask is implemented > (and the way we want to handle ro bits) is the same behavior as RsvdP. > I would suggest to rename the ro_mask to rsvdp_mask and note > that for resilience reasons we will handle RO bits as RsvdP. But the reads behave differently. RO should return the value, RsvdP should return 0 when read (according to the PCIe Base Spec 4.0).
On Tue, Nov 21, 2023 at 10:03:01AM -0500, Stewart Hildebrand wrote: > On 11/21/23 09:45, Roger Pau Monné wrote: > > On Wed, Sep 13, 2023 at 10:35:46AM -0400, Stewart Hildebrand wrote: > >> @@ -407,26 +439,25 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size) > >> > >> /* > >> * Perform a maybe partial write to a register. > >> - * > >> - * Note that this will only work for simple registers, if Xen needs to > >> - * trap accesses to rw1c registers (like the status PCI header register) > >> - * the logic in vpci_write will have to be expanded in order to correctly > >> - * deal with them. > >> */ > >> static void vpci_write_helper(const struct pci_dev *pdev, > >> const struct vpci_register *r, unsigned int size, > >> unsigned int offset, uint32_t data) > >> { > >> + uint32_t val = 0; > >> + > >> ASSERT(size <= r->size); > >> > >> - if ( size != r->size ) > >> + if ( (size != r->size) || r->ro_mask ) > >> { > >> - uint32_t val; > >> - > >> val = r->read(pdev, r->offset, r->private); > >> + val &= ~r->rw1c_mask; > >> data = merge_result(val, data, size, offset); > >> } > >> > >> + data &= ~(r->rsvdz_mask | r->ro_mask); > >> + data |= val & r->ro_mask; > > > > I've been thinking about this, and the way the ro_mask is implemented > > (and the way we want to handle ro bits) is the same behavior as RsvdP. > > I would suggest to rename the ro_mask to rsvdp_mask and note > > that for resilience reasons we will handle RO bits as RsvdP. > > But the reads behave differently. RO should return the value, RsvdP should return 0 when read (according to the PCIe Base Spec 4.0). Hm, right, sorry for the wrong suggestion. We should force bits to 0 for guests reads, but make sure that for writes the value on the hardware is preserved. So we need the separate mask for RsvdP, which will be handled like ro_mask in vpci_write_helper() and like rsvdz_mask in vpci_read(). Roger.
On 11/21/23 10:18, Roger Pau Monné wrote: > On Tue, Nov 21, 2023 at 10:03:01AM -0500, Stewart Hildebrand wrote: >> On 11/21/23 09:45, Roger Pau Monné wrote: >>> On Wed, Sep 13, 2023 at 10:35:46AM -0400, Stewart Hildebrand wrote: >>>> @@ -407,26 +439,25 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size) >>>> >>>> /* >>>> * Perform a maybe partial write to a register. >>>> - * >>>> - * Note that this will only work for simple registers, if Xen needs to >>>> - * trap accesses to rw1c registers (like the status PCI header register) >>>> - * the logic in vpci_write will have to be expanded in order to correctly >>>> - * deal with them. >>>> */ >>>> static void vpci_write_helper(const struct pci_dev *pdev, >>>> const struct vpci_register *r, unsigned int size, >>>> unsigned int offset, uint32_t data) >>>> { >>>> + uint32_t val = 0; >>>> + >>>> ASSERT(size <= r->size); >>>> >>>> - if ( size != r->size ) >>>> + if ( (size != r->size) || r->ro_mask ) >>>> { >>>> - uint32_t val; >>>> - >>>> val = r->read(pdev, r->offset, r->private); >>>> + val &= ~r->rw1c_mask; >>>> data = merge_result(val, data, size, offset); >>>> } >>>> >>>> + data &= ~(r->rsvdz_mask | r->ro_mask); >>>> + data |= val & r->ro_mask; >>> >>> I've been thinking about this, and the way the ro_mask is implemented >>> (and the way we want to handle ro bits) is the same behavior as RsvdP. >>> I would suggest to rename the ro_mask to rsvdp_mask and note >>> that for resilience reasons we will handle RO bits as RsvdP. >> >> But the reads behave differently. RO should return the value, RsvdP should return 0 when read (according to the PCIe Base Spec 4.0). > > Hm, right, sorry for the wrong suggestion. We should force bits to 0 > for guests reads, but make sure that for writes the value on the > hardware is preserved. > > So we need the separate mask for RsvdP, which will be handled like > ro_mask in vpci_write_helper() and like rsvdz_mask in vpci_read(). Agreed. The only reason I didn't add RsvdP support in this patch was that it wasn't needed for the status register... Since RsvdP will be needed for the command register, I think I may as well implement it as part of this series, perhaps in a separate patch following this one. Stay tuned for v8.
On 11/21/23 11:27, Stewart Hildebrand wrote: > On 11/21/23 10:18, Roger Pau Monné wrote: >> On Tue, Nov 21, 2023 at 10:03:01AM -0500, Stewart Hildebrand wrote: >>> On 11/21/23 09:45, Roger Pau Monné wrote: >>>> On Wed, Sep 13, 2023 at 10:35:46AM -0400, Stewart Hildebrand wrote: >>>>> @@ -407,26 +439,25 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size) >>>>> >>>>> /* >>>>> * Perform a maybe partial write to a register. >>>>> - * >>>>> - * Note that this will only work for simple registers, if Xen needs to >>>>> - * trap accesses to rw1c registers (like the status PCI header register) >>>>> - * the logic in vpci_write will have to be expanded in order to correctly >>>>> - * deal with them. >>>>> */ >>>>> static void vpci_write_helper(const struct pci_dev *pdev, >>>>> const struct vpci_register *r, unsigned int size, >>>>> unsigned int offset, uint32_t data) >>>>> { >>>>> + uint32_t val = 0; >>>>> + >>>>> ASSERT(size <= r->size); >>>>> >>>>> - if ( size != r->size ) >>>>> + if ( (size != r->size) || r->ro_mask ) >>>>> { >>>>> - uint32_t val; >>>>> - >>>>> val = r->read(pdev, r->offset, r->private); >>>>> + val &= ~r->rw1c_mask; >>>>> data = merge_result(val, data, size, offset); >>>>> } >>>>> >>>>> + data &= ~(r->rsvdz_mask | r->ro_mask); >>>>> + data |= val & r->ro_mask; >>>> >>>> I've been thinking about this, and the way the ro_mask is implemented >>>> (and the way we want to handle ro bits) is the same behavior as RsvdP. >>>> I would suggest to rename the ro_mask to rsvdp_mask and note >>>> that for resilience reasons we will handle RO bits as RsvdP. >>> >>> But the reads behave differently. RO should return the value, RsvdP should return 0 when read (according to the PCIe Base Spec 4.0). >> >> Hm, right, sorry for the wrong suggestion. We should force bits to 0 >> for guests reads, but make sure that for writes the value on the >> hardware is preserved. >> >> So we need the separate mask for RsvdP, which will be handled like >> ro_mask in vpci_write_helper() and like rsvdz_mask in vpci_read(). > > Agreed. The only reason I didn't add RsvdP support in this patch was that it wasn't needed for the status register... Since RsvdP will be needed for the command register, I think I may as well implement it as part of this series, perhaps in a separate patch following this one. Stay tuned for v8. I just remembered that RsvdP would actually be useful for the PCI_STATUS_CAP_LIST bit in the status register. So I'll implement it directly in this patch afterall, not a separate one.
On 11/17/23 07:40, Roger Pau Monné wrote: > On Wed, Sep 13, 2023 at 10:35:46AM -0400, Stewart Hildebrand wrote: >> Introduce a handler for the PCI status register, with ability to mask the >> capabilities bit. The status register contains RsvdZ bits, read-only bits, and >> write-1-to-clear bits, so introduce bitmasks to handle these in vPCI. If a bit >> in the bitmask is set, then the special meaning applies: >> >> rsvdz_mask: read as zero, guest write ignore (write zero to hardware) >> ro_mask: read normal, guest write ignore (preserve on write to hardware) >> rw1c_mask: read normal, write 1 to clear >> >> The RsvdZ naming was borrowed from the PCI Express Base 4.0 specification. >> >> Xen preserves the value of read-only bits on write to hardware, discarding the >> guests write value. >> >> The mask_cap_list flag will be set in a follow-on patch. >> >> Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> >> --- >> v6->v7: >> * re-work args passed to vpci_add_register_mask() (called in init_bars()) >> * also check for overlap of (rsvdz_mask & ro_mask) in add_register() >> * slightly adjust masking operation in vpci_write_helper() >> >> v5->v6: >> * remove duplicate PCI_STATUS_CAP_LIST in constant definition >> * style fixup in constant definitions >> * s/res_mask/rsvdz_mask/ >> * combine a new masking operation into single line >> * preserve r/o bits on write >> * get rid of status_read. Instead, use rsvdz_mask for conditionally masking >> PCI_STATUS_CAP_LIST bit >> * add comment about PCI_STATUS_CAP_LIST and rsvdp behavior >> * add sanity checks in add_register >> * move mask_cap_list from struct vpci_header to local variable >> >> v4->v5: >> * add support for res_mask >> * add support for ro_mask (squash ro_mask patch) >> * add constants for reserved, read-only, and rw1c masks >> >> v3->v4: >> * move mask_cap_list setting to the capabilities patch >> * single pci_conf_read16 in status_read >> * align mask_cap_list bitfield in struct vpci_header >> * change to rw1c bit mask instead of treating whole register as rw1c >> * drop subsystem prefix on renamed add_register function >> >> v2->v3: >> * new patch >> --- >> xen/drivers/vpci/header.c | 16 +++++++++++ >> xen/drivers/vpci/vpci.c | 55 +++++++++++++++++++++++++++++--------- >> xen/include/xen/pci_regs.h | 9 +++++++ >> xen/include/xen/vpci.h | 8 ++++++ >> 4 files changed, 76 insertions(+), 12 deletions(-) >> >> diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c >> index 767c1ba718d7..af267b75ac31 100644 >> --- a/xen/drivers/vpci/header.c >> +++ b/xen/drivers/vpci/header.c >> @@ -521,6 +521,7 @@ static int cf_check init_bars(struct pci_dev *pdev) >> struct vpci_header *header = &pdev->vpci->header; >> struct vpci_bar *bars = header->bars; >> int rc; >> + bool mask_cap_list = false; >> >> switch ( pci_conf_read8(pdev->sbdf, PCI_HEADER_TYPE) & 0x7f ) >> { >> @@ -544,6 +545,21 @@ static int cf_check init_bars(struct pci_dev *pdev) >> if ( rc ) >> return rc; >> >> + /* >> + * Utilize rsvdz_mask to hide PCI_STATUS_CAP_LIST from the guest for now. If >> + * support for rsvdp (reserved & preserved) is added in the future, the >> + * rsvdp mask should be used instead. >> + */ >> + rc = vpci_add_register_mask(pdev->vpci, vpci_hw_read16, vpci_hw_write16, >> + PCI_STATUS, 2, NULL, >> + PCI_STATUS_RSVDZ_MASK | >> + (mask_cap_list ? PCI_STATUS_CAP_LIST : 0), >> + PCI_STATUS_RO_MASK & >> + ~(mask_cap_list ? PCI_STATUS_CAP_LIST : 0), >> + PCI_STATUS_RW1C_MASK); >> + if ( rc ) >> + return rc; >> + >> if ( pdev->ignore_bars ) >> return 0; >> >> diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c >> index 3bec9a4153da..b4cde7db1b3f 100644 >> --- a/xen/drivers/vpci/vpci.c >> +++ b/xen/drivers/vpci/vpci.c >> @@ -29,6 +29,9 @@ struct vpci_register { >> unsigned int offset; >> void *private; >> struct list_head node; >> + uint32_t rsvdz_mask; >> + uint32_t ro_mask; >> + uint32_t rw1c_mask; > > I understand that we need the rw1c_mask in order to properly merge > values when doing partial writes, but the other fields I'm not sure we > do need them. RO bits don't care about what's written to them, and > RsvdZ are always read as 0 and written as 0, so the mask shouldn't > affect the merging. See discussions at [1] and [2]. I'll keep ro_mask/rsvdz_mask, add rsvdp_mask, and expand the commit message. For another data point, qemu (hw/xen/xen_pt_config_init.c:xen_pt_emu_reg_header0) also has a similar way of masking bits. [1] https://lists.xenproject.org/archives/html/xen-devel/2023-11/msg01398.html [2] https://lists.xenproject.org/archives/html/xen-devel/2023-11/msg01693.html > >> }; >> >> #ifdef __XEN__ >> @@ -145,9 +148,16 @@ uint32_t cf_check vpci_hw_read32( >> return pci_conf_read32(pdev->sbdf, reg); >> } >> >> -int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler, >> - vpci_write_t *write_handler, unsigned int offset, >> - unsigned int size, void *data) >> +void cf_check vpci_hw_write16( >> + const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data) >> +{ >> + pci_conf_write16(pdev->sbdf, reg, val); >> +} >> + >> +static int add_register(struct vpci *vpci, vpci_read_t *read_handler, >> + vpci_write_t *write_handler, unsigned int offset, >> + unsigned int size, void *data, uint32_t rsvdz_mask, >> + uint32_t ro_mask, uint32_t rw1c_mask) >> { >> struct list_head *prev; >> struct vpci_register *r; >> @@ -155,7 +165,8 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler, >> /* Some sanity checks. */ >> if ( (size != 1 && size != 2 && size != 4) || >> offset >= PCI_CFG_SPACE_EXP_SIZE || (offset & (size - 1)) || >> - (!read_handler && !write_handler) ) >> + (!read_handler && !write_handler) || (rsvdz_mask & ro_mask) || >> + (rsvdz_mask & rw1c_mask) || (ro_mask & rw1c_mask) ) >> return -EINVAL; >> >> r = xmalloc(struct vpci_register); >> @@ -167,6 +178,9 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler, >> r->size = size; >> r->offset = offset; >> r->private = data; >> + r->rsvdz_mask = rsvdz_mask & (0xffffffffU >> (32 - 8 * size)); >> + r->ro_mask = ro_mask & (0xffffffffU >> (32 - 8 * size)); >> + r->rw1c_mask = rw1c_mask & (0xffffffffU >> (32 - 8 * size)); >> >> spin_lock(&vpci->lock); >> >> @@ -193,6 +207,23 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler, >> return 0; >> } >> >> +int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler, >> + vpci_write_t *write_handler, unsigned int offset, >> + unsigned int size, void *data) >> +{ >> + return add_register(vpci, read_handler, write_handler, offset, size, data, >> + 0, 0, 0); >> +} >> + >> +int vpci_add_register_mask(struct vpci *vpci, vpci_read_t *read_handler, >> + vpci_write_t *write_handler, unsigned int offset, >> + unsigned int size, void *data, uint32_t rsvdz_mask, >> + uint32_t ro_mask, uint32_t rw1c_mask) >> +{ >> + return add_register(vpci, read_handler, write_handler, offset, size, data, >> + rsvdz_mask, ro_mask, rw1c_mask); >> +} >> + >> int vpci_remove_register(struct vpci *vpci, unsigned int offset, >> unsigned int size) >> { >> @@ -376,6 +407,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size) >> } >> >> val = r->read(pdev, r->offset, r->private); >> + val &= ~r->rsvdz_mask; >> >> /* Check if the read is in the middle of a register. */ >> if ( r->offset < emu.offset ) >> @@ -407,26 +439,25 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size) >> >> /* >> * Perform a maybe partial write to a register. >> - * >> - * Note that this will only work for simple registers, if Xen needs to >> - * trap accesses to rw1c registers (like the status PCI header register) >> - * the logic in vpci_write will have to be expanded in order to correctly >> - * deal with them. >> */ >> static void vpci_write_helper(const struct pci_dev *pdev, >> const struct vpci_register *r, unsigned int size, >> unsigned int offset, uint32_t data) >> { >> + uint32_t val = 0; >> + >> ASSERT(size <= r->size); >> >> - if ( size != r->size ) >> + if ( (size != r->size) || r->ro_mask ) > > Hm, I'm not sure this specific handling for read-only bits is > required, software is allowed to write either 0 or 1 to read-only > bits, but the write will be ignored. See [1]. > >> { >> - uint32_t val; >> - >> val = r->read(pdev, r->offset, r->private); >> + val &= ~r->rw1c_mask; >> data = merge_result(val, data, size, offset); >> } >> >> + data &= ~(r->rsvdz_mask | r->ro_mask); >> + data |= val & r->ro_mask; > > You cannot apply the register masks directly into the final value, you > need to offset and mask them as necessary, likewise for val, see > what's done in merge_result(). See [3]. [3] https://lists.xenproject.org/archives/html/xen-devel/2023-11/msg01412.html > > Regardless of the offset issue, I think the usage of val with the > ro_mask is bogus here, see above. See [1]. > >> + >> r->write(pdev, r->offset, data & (0xffffffffU >> (32 - 8 * r->size)), >> r->private); >> } >> diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h >> index 84b18736a85d..b72131729db6 100644 >> --- a/xen/include/xen/pci_regs.h >> +++ b/xen/include/xen/pci_regs.h >> @@ -66,6 +66,15 @@ >> #define PCI_STATUS_REC_MASTER_ABORT 0x2000 /* Set on master abort */ >> #define PCI_STATUS_SIG_SYSTEM_ERROR 0x4000 /* Set when we drive SERR */ >> #define PCI_STATUS_DETECTED_PARITY 0x8000 /* Set on parity error */ >> +#define PCI_STATUS_RSVDZ_MASK 0x0006 > > In my copy of the PCIe 6 spec bit 6 is also RsvdZ, so the mask should > be 0x46. Right, mine too. It's probably safer to follow the newer version of the spec, so I'll make the change. For completeness / archaeology purposes, I just want to document some relevant data points here. In PCIe 4 spec, it says this about bit 6: "These bits were used in previous versions of the programming model. Careful consideration should be given to any attempt to repurpose them." Going further back, PCI (old school PCI, not Express) spec 3.0 says this about bit 6: "This bit is reserved. *" "* In Revision 2.1 of this specification, this bit was used to indicate whether or not a device supported User Definable Features." Just above in our pci_regs.h (and equally in Linux include/uapi/linux/pci_regs.h) we have this definition for bit 6: #define PCI_STATUS_UDF 0x40 /* Support User Definable Features [obsolete] */ Qemu hw/xen/xen_pt_config_init.c treats bit 6 as RO: .ro_mask = 0x06F8, > > But I'm unsure we really need this mask. > > Thanks, Roger.
On 11/17/23 08:33, Roger Pau Monné wrote: > On Wed, Sep 13, 2023 at 10:35:46AM -0400, Stewart Hildebrand wrote: >> +int vpci_add_register_mask(struct vpci *vpci, vpci_read_t *read_handler, >> + vpci_write_t *write_handler, unsigned int offset, >> + unsigned int size, void *data, uint32_t rsvdz_mask, >> + uint32_t ro_mask, uint32_t rw1c_mask) > > Forgot to mention, it would be good if you can add some tests in > tools/tests/vpci that ensure the masks work as expected. Will do > > Thanks, Roger.
On Wed, Nov 22, 2023 at 03:16:29PM -0500, Stewart Hildebrand wrote: > On 11/17/23 07:40, Roger Pau Monné wrote: > > On Wed, Sep 13, 2023 at 10:35:46AM -0400, Stewart Hildebrand wrote: > >> r->write(pdev, r->offset, data & (0xffffffffU >> (32 - 8 * r->size)), > >> r->private); > >> } > >> diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h > >> index 84b18736a85d..b72131729db6 100644 > >> --- a/xen/include/xen/pci_regs.h > >> +++ b/xen/include/xen/pci_regs.h > >> @@ -66,6 +66,15 @@ > >> #define PCI_STATUS_REC_MASTER_ABORT 0x2000 /* Set on master abort */ > >> #define PCI_STATUS_SIG_SYSTEM_ERROR 0x4000 /* Set when we drive SERR */ > >> #define PCI_STATUS_DETECTED_PARITY 0x8000 /* Set on parity error */ > >> +#define PCI_STATUS_RSVDZ_MASK 0x0006 > > > > In my copy of the PCIe 6 spec bit 6 is also RsvdZ, so the mask should > > be 0x46. > > Right, mine too. It's probably safer to follow the newer version of the spec, so I'll make the change. For completeness / archaeology purposes, I just want to document some relevant data points here. > > In PCIe 4 spec, it says this about bit 6: > "These bits were used in previous versions of the programming model. Careful consideration should be given to any attempt to repurpose them." > > Going further back, PCI (old school PCI, not Express) spec 3.0 says this about bit 6: > "This bit is reserved. *" > "* In Revision 2.1 of this specification, this bit was used to indicate whether or not a device supported User Definable Features." > > Just above in our pci_regs.h (and equally in Linux include/uapi/linux/pci_regs.h) we have this definition for bit 6: > #define PCI_STATUS_UDF 0x40 /* Support User Definable Features [obsolete] */ > > Qemu hw/xen/xen_pt_config_init.c treats bit 6 as RO: > .ro_mask = 0x06F8, Right, given the implementation of ro_mask that would likely be fine. Reading unconditionally as 0 while preserving the value on writes seems the safest option. Thanks, Roger.
On 11/23/23 03:14, Roger Pau Monné wrote: > On Wed, Nov 22, 2023 at 03:16:29PM -0500, Stewart Hildebrand wrote: >> On 11/17/23 07:40, Roger Pau Monné wrote: >>> On Wed, Sep 13, 2023 at 10:35:46AM -0400, Stewart Hildebrand wrote: >>>> r->write(pdev, r->offset, data & (0xffffffffU >> (32 - 8 * r->size)), >>>> r->private); >>>> } >>>> diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h >>>> index 84b18736a85d..b72131729db6 100644 >>>> --- a/xen/include/xen/pci_regs.h >>>> +++ b/xen/include/xen/pci_regs.h >>>> @@ -66,6 +66,15 @@ >>>> #define PCI_STATUS_REC_MASTER_ABORT 0x2000 /* Set on master abort */ >>>> #define PCI_STATUS_SIG_SYSTEM_ERROR 0x4000 /* Set when we drive SERR */ >>>> #define PCI_STATUS_DETECTED_PARITY 0x8000 /* Set on parity error */ >>>> +#define PCI_STATUS_RSVDZ_MASK 0x0006 >>> >>> In my copy of the PCIe 6 spec bit 6 is also RsvdZ, so the mask should >>> be 0x46. >> >> Right, mine too. It's probably safer to follow the newer version of the spec, so I'll make the change. For completeness / archaeology purposes, I just want to document some relevant data points here. >> >> In PCIe 4 spec, it says this about bit 6: >> "These bits were used in previous versions of the programming model. Careful consideration should be given to any attempt to repurpose them." >> >> Going further back, PCI (old school PCI, not Express) spec 3.0 says this about bit 6: >> "This bit is reserved. *" >> "* In Revision 2.1 of this specification, this bit was used to indicate whether or not a device supported User Definable Features." >> >> Just above in our pci_regs.h (and equally in Linux include/uapi/linux/pci_regs.h) we have this definition for bit 6: >> #define PCI_STATUS_UDF 0x40 /* Support User Definable Features [obsolete] */ >> >> Qemu hw/xen/xen_pt_config_init.c treats bit 6 as RO: >> .ro_mask = 0x06F8, > > Right, given the implementation of ro_mask that would likely be fine. > Reading unconditionally as 0 while preserving the value on writes > seems the safest option. That would mean treating bit 6 as RsvdP, even though the PCIe 6 spec says RsvdZ. I just want to confirm this is indeed the intent since we both said RsvdZ just a moment ago? If so, I would add a comment since it's deviating from spec. I would personally still vote in favor of following PCIe 6 spec (RsvdZ).
On Thu, Nov 23, 2023 at 07:57:21AM -0500, Stewart Hildebrand wrote: > On 11/23/23 03:14, Roger Pau Monné wrote: > > On Wed, Nov 22, 2023 at 03:16:29PM -0500, Stewart Hildebrand wrote: > >> On 11/17/23 07:40, Roger Pau Monné wrote: > >>> On Wed, Sep 13, 2023 at 10:35:46AM -0400, Stewart Hildebrand wrote: > >>>> r->write(pdev, r->offset, data & (0xffffffffU >> (32 - 8 * r->size)), > >>>> r->private); > >>>> } > >>>> diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h > >>>> index 84b18736a85d..b72131729db6 100644 > >>>> --- a/xen/include/xen/pci_regs.h > >>>> +++ b/xen/include/xen/pci_regs.h > >>>> @@ -66,6 +66,15 @@ > >>>> #define PCI_STATUS_REC_MASTER_ABORT 0x2000 /* Set on master abort */ > >>>> #define PCI_STATUS_SIG_SYSTEM_ERROR 0x4000 /* Set when we drive SERR */ > >>>> #define PCI_STATUS_DETECTED_PARITY 0x8000 /* Set on parity error */ > >>>> +#define PCI_STATUS_RSVDZ_MASK 0x0006 > >>> > >>> In my copy of the PCIe 6 spec bit 6 is also RsvdZ, so the mask should > >>> be 0x46. > >> > >> Right, mine too. It's probably safer to follow the newer version of the spec, so I'll make the change. For completeness / archaeology purposes, I just want to document some relevant data points here. > >> > >> In PCIe 4 spec, it says this about bit 6: > >> "These bits were used in previous versions of the programming model. Careful consideration should be given to any attempt to repurpose them." > >> > >> Going further back, PCI (old school PCI, not Express) spec 3.0 says this about bit 6: > >> "This bit is reserved. *" > >> "* In Revision 2.1 of this specification, this bit was used to indicate whether or not a device supported User Definable Features." > >> > >> Just above in our pci_regs.h (and equally in Linux include/uapi/linux/pci_regs.h) we have this definition for bit 6: > >> #define PCI_STATUS_UDF 0x40 /* Support User Definable Features [obsolete] */ > >> > >> Qemu hw/xen/xen_pt_config_init.c treats bit 6 as RO: > >> .ro_mask = 0x06F8, > > > > Right, given the implementation of ro_mask that would likely be fine. > > Reading unconditionally as 0 while preserving the value on writes > > seems the safest option. > > That would mean treating bit 6 as RsvdP, even though the PCIe 6 spec > says RsvdZ. I just want to confirm this is indeed the intent since > we both said RsvdZ just a moment ago? If so, I would add a comment > since it's deviating from spec. I would personally still vote in > favor of following PCIe 6 spec (RsvdZ). Hm, yes, lets go with the spec and use RsdvZ. I very much doubt we passthrough any device that actually uses the UDF bit. Thanks, Roger.
diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c index 767c1ba718d7..af267b75ac31 100644 --- a/xen/drivers/vpci/header.c +++ b/xen/drivers/vpci/header.c @@ -521,6 +521,7 @@ static int cf_check init_bars(struct pci_dev *pdev) struct vpci_header *header = &pdev->vpci->header; struct vpci_bar *bars = header->bars; int rc; + bool mask_cap_list = false; switch ( pci_conf_read8(pdev->sbdf, PCI_HEADER_TYPE) & 0x7f ) { @@ -544,6 +545,21 @@ static int cf_check init_bars(struct pci_dev *pdev) if ( rc ) return rc; + /* + * Utilize rsvdz_mask to hide PCI_STATUS_CAP_LIST from the guest for now. If + * support for rsvdp (reserved & preserved) is added in the future, the + * rsvdp mask should be used instead. + */ + rc = vpci_add_register_mask(pdev->vpci, vpci_hw_read16, vpci_hw_write16, + PCI_STATUS, 2, NULL, + PCI_STATUS_RSVDZ_MASK | + (mask_cap_list ? PCI_STATUS_CAP_LIST : 0), + PCI_STATUS_RO_MASK & + ~(mask_cap_list ? PCI_STATUS_CAP_LIST : 0), + PCI_STATUS_RW1C_MASK); + if ( rc ) + return rc; + if ( pdev->ignore_bars ) return 0; diff --git a/xen/drivers/vpci/vpci.c b/xen/drivers/vpci/vpci.c index 3bec9a4153da..b4cde7db1b3f 100644 --- a/xen/drivers/vpci/vpci.c +++ b/xen/drivers/vpci/vpci.c @@ -29,6 +29,9 @@ struct vpci_register { unsigned int offset; void *private; struct list_head node; + uint32_t rsvdz_mask; + uint32_t ro_mask; + uint32_t rw1c_mask; }; #ifdef __XEN__ @@ -145,9 +148,16 @@ uint32_t cf_check vpci_hw_read32( return pci_conf_read32(pdev->sbdf, reg); } -int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler, - vpci_write_t *write_handler, unsigned int offset, - unsigned int size, void *data) +void cf_check vpci_hw_write16( + const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data) +{ + pci_conf_write16(pdev->sbdf, reg, val); +} + +static int add_register(struct vpci *vpci, vpci_read_t *read_handler, + vpci_write_t *write_handler, unsigned int offset, + unsigned int size, void *data, uint32_t rsvdz_mask, + uint32_t ro_mask, uint32_t rw1c_mask) { struct list_head *prev; struct vpci_register *r; @@ -155,7 +165,8 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler, /* Some sanity checks. */ if ( (size != 1 && size != 2 && size != 4) || offset >= PCI_CFG_SPACE_EXP_SIZE || (offset & (size - 1)) || - (!read_handler && !write_handler) ) + (!read_handler && !write_handler) || (rsvdz_mask & ro_mask) || + (rsvdz_mask & rw1c_mask) || (ro_mask & rw1c_mask) ) return -EINVAL; r = xmalloc(struct vpci_register); @@ -167,6 +178,9 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler, r->size = size; r->offset = offset; r->private = data; + r->rsvdz_mask = rsvdz_mask & (0xffffffffU >> (32 - 8 * size)); + r->ro_mask = ro_mask & (0xffffffffU >> (32 - 8 * size)); + r->rw1c_mask = rw1c_mask & (0xffffffffU >> (32 - 8 * size)); spin_lock(&vpci->lock); @@ -193,6 +207,23 @@ int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler, return 0; } +int vpci_add_register(struct vpci *vpci, vpci_read_t *read_handler, + vpci_write_t *write_handler, unsigned int offset, + unsigned int size, void *data) +{ + return add_register(vpci, read_handler, write_handler, offset, size, data, + 0, 0, 0); +} + +int vpci_add_register_mask(struct vpci *vpci, vpci_read_t *read_handler, + vpci_write_t *write_handler, unsigned int offset, + unsigned int size, void *data, uint32_t rsvdz_mask, + uint32_t ro_mask, uint32_t rw1c_mask) +{ + return add_register(vpci, read_handler, write_handler, offset, size, data, + rsvdz_mask, ro_mask, rw1c_mask); +} + int vpci_remove_register(struct vpci *vpci, unsigned int offset, unsigned int size) { @@ -376,6 +407,7 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size) } val = r->read(pdev, r->offset, r->private); + val &= ~r->rsvdz_mask; /* Check if the read is in the middle of a register. */ if ( r->offset < emu.offset ) @@ -407,26 +439,25 @@ uint32_t vpci_read(pci_sbdf_t sbdf, unsigned int reg, unsigned int size) /* * Perform a maybe partial write to a register. - * - * Note that this will only work for simple registers, if Xen needs to - * trap accesses to rw1c registers (like the status PCI header register) - * the logic in vpci_write will have to be expanded in order to correctly - * deal with them. */ static void vpci_write_helper(const struct pci_dev *pdev, const struct vpci_register *r, unsigned int size, unsigned int offset, uint32_t data) { + uint32_t val = 0; + ASSERT(size <= r->size); - if ( size != r->size ) + if ( (size != r->size) || r->ro_mask ) { - uint32_t val; - val = r->read(pdev, r->offset, r->private); + val &= ~r->rw1c_mask; data = merge_result(val, data, size, offset); } + data &= ~(r->rsvdz_mask | r->ro_mask); + data |= val & r->ro_mask; + r->write(pdev, r->offset, data & (0xffffffffU >> (32 - 8 * r->size)), r->private); } diff --git a/xen/include/xen/pci_regs.h b/xen/include/xen/pci_regs.h index 84b18736a85d..b72131729db6 100644 --- a/xen/include/xen/pci_regs.h +++ b/xen/include/xen/pci_regs.h @@ -66,6 +66,15 @@ #define PCI_STATUS_REC_MASTER_ABORT 0x2000 /* Set on master abort */ #define PCI_STATUS_SIG_SYSTEM_ERROR 0x4000 /* Set when we drive SERR */ #define PCI_STATUS_DETECTED_PARITY 0x8000 /* Set on parity error */ +#define PCI_STATUS_RSVDZ_MASK 0x0006 + +#define PCI_STATUS_RO_MASK (PCI_STATUS_IMM_READY | PCI_STATUS_INTERRUPT | \ + PCI_STATUS_CAP_LIST | PCI_STATUS_66MHZ | PCI_STATUS_UDF | \ + PCI_STATUS_FAST_BACK | PCI_STATUS_DEVSEL_MASK) +#define PCI_STATUS_RW1C_MASK (PCI_STATUS_PARITY | \ + PCI_STATUS_SIG_TARGET_ABORT | PCI_STATUS_REC_TARGET_ABORT | \ + PCI_STATUS_REC_MASTER_ABORT | PCI_STATUS_SIG_SYSTEM_ERROR | \ + PCI_STATUS_DETECTED_PARITY) #define PCI_CLASS_REVISION 0x08 /* High 24 bits are class, low 8 revision */ #define PCI_REVISION_ID 0x08 /* Revision ID */ diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h index 0b8a2a3c745b..7a5cca29e54c 100644 --- a/xen/include/xen/vpci.h +++ b/xen/include/xen/vpci.h @@ -37,6 +37,12 @@ int __must_check vpci_add_register(struct vpci *vpci, vpci_write_t *write_handler, unsigned int offset, unsigned int size, void *data); +int __must_check vpci_add_register_mask(struct vpci *vpci, + vpci_read_t *read_handler, + vpci_write_t *write_handler, + unsigned int offset, unsigned int size, + void *data, uint32_t rsvdz_mask, + uint32_t ro_mask, uint32_t rw1c_mask); int __must_check vpci_remove_register(struct vpci *vpci, unsigned int offset, unsigned int size); @@ -50,6 +56,8 @@ uint32_t cf_check vpci_hw_read16( const struct pci_dev *pdev, unsigned int reg, void *data); uint32_t cf_check vpci_hw_read32( const struct pci_dev *pdev, unsigned int reg, void *data); +void cf_check vpci_hw_write16( + const struct pci_dev *pdev, unsigned int reg, uint32_t val, void *data); /* * Check for pending vPCI operations on this vcpu. Returns true if the vcpu
Introduce a handler for the PCI status register, with ability to mask the capabilities bit. The status register contains RsvdZ bits, read-only bits, and write-1-to-clear bits, so introduce bitmasks to handle these in vPCI. If a bit in the bitmask is set, then the special meaning applies: rsvdz_mask: read as zero, guest write ignore (write zero to hardware) ro_mask: read normal, guest write ignore (preserve on write to hardware) rw1c_mask: read normal, write 1 to clear The RsvdZ naming was borrowed from the PCI Express Base 4.0 specification. Xen preserves the value of read-only bits on write to hardware, discarding the guests write value. The mask_cap_list flag will be set in a follow-on patch. Signed-off-by: Stewart Hildebrand <stewart.hildebrand@amd.com> --- v6->v7: * re-work args passed to vpci_add_register_mask() (called in init_bars()) * also check for overlap of (rsvdz_mask & ro_mask) in add_register() * slightly adjust masking operation in vpci_write_helper() v5->v6: * remove duplicate PCI_STATUS_CAP_LIST in constant definition * style fixup in constant definitions * s/res_mask/rsvdz_mask/ * combine a new masking operation into single line * preserve r/o bits on write * get rid of status_read. Instead, use rsvdz_mask for conditionally masking PCI_STATUS_CAP_LIST bit * add comment about PCI_STATUS_CAP_LIST and rsvdp behavior * add sanity checks in add_register * move mask_cap_list from struct vpci_header to local variable v4->v5: * add support for res_mask * add support for ro_mask (squash ro_mask patch) * add constants for reserved, read-only, and rw1c masks v3->v4: * move mask_cap_list setting to the capabilities patch * single pci_conf_read16 in status_read * align mask_cap_list bitfield in struct vpci_header * change to rw1c bit mask instead of treating whole register as rw1c * drop subsystem prefix on renamed add_register function v2->v3: * new patch --- xen/drivers/vpci/header.c | 16 +++++++++++ xen/drivers/vpci/vpci.c | 55 +++++++++++++++++++++++++++++--------- xen/include/xen/pci_regs.h | 9 +++++++ xen/include/xen/vpci.h | 8 ++++++ 4 files changed, 76 insertions(+), 12 deletions(-)