Message ID | 20221025144418.66800-6-roger.pau@citrix.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | (v)pci: fixes related to memory decoding handling | expand |
On 25.10.2022 16:44, Roger Pau Monne wrote: > Writes to the BARs are ignored if memory decoding is enabled for the > device, and the same happen with ROM BARs if the write is an attempt > to change the position of the BAR without disabling it first. > > The reason of ignoring such writes is a limitation in Xen, as it would > need to unmap the BAR, change the address, and remap the BAR at the > new position, which the current logic doesn't support. > > Some devices however seem to (wrongly) have the memory decoding bit > hardcoded to enabled, and attempts to disable it don't get reflected > on the command register. > > This causes issues for well behaved guests that disable memory > decoding and then try to size the BARs, as vPCI will think memory > decoding is still enabled and ignore the write. Just to avoid misunderstandings: "guests" here includes Dom0? In such cases we typically prefer to say "domains". This then also affects the next (final) paragraph. > --- a/xen/drivers/vpci/header.c > +++ b/xen/drivers/vpci/header.c > @@ -128,7 +128,10 @@ static void modify_decoding(const struct pci_dev *pdev, uint16_t cmd, > } > > if ( !rom_only ) > + { > pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd); > + header->bars_mapped = map; > + } > else > ASSERT_UNREACHABLE(); > } > @@ -355,13 +358,13 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only) > static void cf_check cmd_write( > const struct pci_dev *pdev, unsigned int reg, uint32_t cmd, void *data) > { > - uint16_t current_cmd = pci_conf_read16(pdev->sbdf, reg); > + struct vpci_header *header = data; > > /* > * Let Dom0 play with all the bits directly except for the memory > * decoding one. > */ > - if ( (cmd ^ current_cmd) & PCI_COMMAND_MEMORY ) > + if ( header->bars_mapped != !!(cmd & PCI_COMMAND_MEMORY) ) > /* > * Ignore the error. No memory has been added or removed from the p2m > * (because the actual p2m changes are deferred in defer_map) and the > @@ -388,12 +391,12 @@ static void cf_check bar_write( > else > val &= PCI_BASE_ADDRESS_MEM_MASK; > > - if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY ) > + if ( bar->enabled ) In 3 of the 4 cases you use header->bars_mapped as replacement. Since it's not clear to me why you don't here, could you explain this to me? (I'm therefore undecided whether this is merely a cosmetic [consistency] issue.) > { > /* If the value written is the current one avoid printing a warning. */ > if ( val != (uint32_t)(bar->addr >> (hi ? 32 : 0)) ) > gprintk(XENLOG_WARNING, > - "%pp: ignored BAR %zu write with memory decoding enabled\n", > + "%pp: ignored BAR %zu write while mapped\n", > &pdev->sbdf, bar - pdev->vpci->header.bars + hi); > return; > } > @@ -422,13 +425,12 @@ static void cf_check rom_write( > { > struct vpci_header *header = &pdev->vpci->header; > struct vpci_bar *rom = data; > - uint16_t cmd = pci_conf_read16(pdev->sbdf, PCI_COMMAND); > bool new_enabled = val & PCI_ROM_ADDRESS_ENABLE; > > - if ( (cmd & PCI_COMMAND_MEMORY) && header->rom_enabled && new_enabled ) > + if ( header->bars_mapped && header->rom_enabled && new_enabled ) > { > gprintk(XENLOG_WARNING, > - "%pp: ignored ROM BAR write with memory decoding enabled\n", > + "%pp: ignored ROM BAR write while mapped\n", > &pdev->sbdf); > return; > } > @@ -440,7 +442,7 @@ static void cf_check rom_write( > */ > rom->addr = val & PCI_ROM_ADDRESS_MASK; > > - if ( !(cmd & PCI_COMMAND_MEMORY) || header->rom_enabled == new_enabled ) > + if ( !header->bars_mapped || header->rom_enabled == new_enabled ) > { > /* Just update the ROM BAR field. */ > header->rom_enabled = new_enabled;
On Wed, Oct 26, 2022 at 02:47:43PM +0200, Jan Beulich wrote: > On 25.10.2022 16:44, Roger Pau Monne wrote: > > Writes to the BARs are ignored if memory decoding is enabled for the > > device, and the same happen with ROM BARs if the write is an attempt > > to change the position of the BAR without disabling it first. > > > > The reason of ignoring such writes is a limitation in Xen, as it would > > need to unmap the BAR, change the address, and remap the BAR at the > > new position, which the current logic doesn't support. > > > > Some devices however seem to (wrongly) have the memory decoding bit > > hardcoded to enabled, and attempts to disable it don't get reflected > > on the command register. > > > > This causes issues for well behaved guests that disable memory > > decoding and then try to size the BARs, as vPCI will think memory > > decoding is still enabled and ignore the write. > > Just to avoid misunderstandings: "guests" here includes Dom0? In such > cases we typically prefer to say "domains". This then also affects > the next (final) paragraph. Right, will adjust. > > --- a/xen/drivers/vpci/header.c > > +++ b/xen/drivers/vpci/header.c > > @@ -128,7 +128,10 @@ static void modify_decoding(const struct pci_dev *pdev, uint16_t cmd, > > } > > > > if ( !rom_only ) > > + { > > pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd); > > + header->bars_mapped = map; > > + } > > else > > ASSERT_UNREACHABLE(); > > } > > @@ -355,13 +358,13 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only) > > static void cf_check cmd_write( > > const struct pci_dev *pdev, unsigned int reg, uint32_t cmd, void *data) > > { > > - uint16_t current_cmd = pci_conf_read16(pdev->sbdf, reg); > > + struct vpci_header *header = data; > > > > /* > > * Let Dom0 play with all the bits directly except for the memory > > * decoding one. > > */ > > - if ( (cmd ^ current_cmd) & PCI_COMMAND_MEMORY ) > > + if ( header->bars_mapped != !!(cmd & PCI_COMMAND_MEMORY) ) > > /* > > * Ignore the error. No memory has been added or removed from the p2m > > * (because the actual p2m changes are deferred in defer_map) and the > > @@ -388,12 +391,12 @@ static void cf_check bar_write( > > else > > val &= PCI_BASE_ADDRESS_MEM_MASK; > > > > - if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY ) > > + if ( bar->enabled ) > > In 3 of the 4 cases you use header->bars_mapped as replacement. Since it's > not clear to me why you don't here, could you explain this to me? (I'm > therefore undecided whether this is merely a cosmetic [consistency] issue.) No, it's intended to use bar->enabled here rather than header->bars_mapped. It's possible to have header->bars_mapped == true, but bar->enabled == false if memory decoding is enabled, but this BAR specifically has failed to be mapped in the guest p2m, which means dom0 is safe to move it for what Xen cares (ie: it won't mess with p2m mappings because there are none for the BAR). We could be more strict and use header->bars_mapped, but I don't think there's a need for it. What about adding a comment with: /* * Xen only cares whether the BAR is mapped into the p2m, so allow BAR * writes as long as the BAR is not mapped into the p2m. */ Otherwise I can switch to using header->bars_mapped if you think that's clearer. Thanks, Roger.
On 26.10.2022 15:58, Roger Pau Monné wrote: > On Wed, Oct 26, 2022 at 02:47:43PM +0200, Jan Beulich wrote: >> On 25.10.2022 16:44, Roger Pau Monne wrote: >>> @@ -388,12 +391,12 @@ static void cf_check bar_write( >>> else >>> val &= PCI_BASE_ADDRESS_MEM_MASK; >>> >>> - if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY ) >>> + if ( bar->enabled ) >> >> In 3 of the 4 cases you use header->bars_mapped as replacement. Since it's >> not clear to me why you don't here, could you explain this to me? (I'm >> therefore undecided whether this is merely a cosmetic [consistency] issue.) > > No, it's intended to use bar->enabled here rather than > header->bars_mapped. > > It's possible to have header->bars_mapped == true, but bar->enabled == > false if memory decoding is enabled, but this BAR specifically has > failed to be mapped in the guest p2m, which means dom0 is safe to move > it for what Xen cares (ie: it won't mess with p2m mappings because > there are none for the BAR). > > We could be more strict and use header->bars_mapped, but I don't think > there's a need for it. > > What about adding a comment with: > > /* > * Xen only cares whether the BAR is mapped into the p2m, so allow BAR > * writes as long as the BAR is not mapped into the p2m. > */ > > Otherwise I can switch to using header->bars_mapped if you think > that's clearer. It's not so much a matter of being clearer, but a matter of consistency: Why does the same consideration not apply in rom_write()? In fact both uses there are (already before the change) combined with further conditions (checking header->rom_enabled and new_enabled). If the inconsistency is on purpose (and perhaps necessary), I'd like to first understand why that is before deciding what to do about it. A comment like you suggest it _may_ be the route to go. Jan
On Wed, Oct 26, 2022 at 04:06:40PM +0200, Jan Beulich wrote: > On 26.10.2022 15:58, Roger Pau Monné wrote: > > On Wed, Oct 26, 2022 at 02:47:43PM +0200, Jan Beulich wrote: > >> On 25.10.2022 16:44, Roger Pau Monne wrote: > >>> @@ -388,12 +391,12 @@ static void cf_check bar_write( > >>> else > >>> val &= PCI_BASE_ADDRESS_MEM_MASK; > >>> > >>> - if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY ) > >>> + if ( bar->enabled ) > >> > >> In 3 of the 4 cases you use header->bars_mapped as replacement. Since it's > >> not clear to me why you don't here, could you explain this to me? (I'm > >> therefore undecided whether this is merely a cosmetic [consistency] issue.) > > > > No, it's intended to use bar->enabled here rather than > > header->bars_mapped. > > > > It's possible to have header->bars_mapped == true, but bar->enabled == > > false if memory decoding is enabled, but this BAR specifically has > > failed to be mapped in the guest p2m, which means dom0 is safe to move > > it for what Xen cares (ie: it won't mess with p2m mappings because > > there are none for the BAR). > > > > We could be more strict and use header->bars_mapped, but I don't think > > there's a need for it. > > > > What about adding a comment with: > > > > /* > > * Xen only cares whether the BAR is mapped into the p2m, so allow BAR > > * writes as long as the BAR is not mapped into the p2m. > > */ > > > > Otherwise I can switch to using header->bars_mapped if you think > > that's clearer. > > It's not so much a matter of being clearer, but a matter of consistency: > Why does the same consideration not apply in rom_write()? In fact both > uses there are (already before the change) combined with further > conditions (checking header->rom_enabled and new_enabled). If the > inconsistency is on purpose (and perhaps necessary), I'd like to first > understand why that is before deciding what to do about it. A comment > like you suggest it _may_ be the route to go. ROM register is more complex to handle, because the same register that's used to store the address also contains the bit that can trigger whether it's mapped into the guest p2m or not (PCI_ROM_ADDRESS_ENABLE). So ROM BAR writes with the ROM BAR mapped and the PCI_ROM_ADDRESS_ENABLE bit also set in the to be written value will be rejected, because we only allow to first disable the ROM and then change the address (which is likely to not play well with OSes, but so far I haven't been able to test ROM BAR register usage on PVH). I do think for consistency it would be better to use rom->enabled in the first conditional of rom_write() check, so it would be: if ( rom->enabled && new_enabled ) { gprintk(XENLOG_WARNING, "%pp: ignored ROM BAR write while mapped\n", &pdev->sbdf); return; } So that we also allow changing the address of ROM BARs even with memory decoding and PCI_ROM_ADDRESS_ENABLE as long as the ROM BAR is not mapped in the p2m. Would you be fine with the comment in the previous email added and rom_write() adjusted as suggested above? Thanks, Roger.
On 26.10.2022 18:01, Roger Pau Monné wrote: > On Wed, Oct 26, 2022 at 04:06:40PM +0200, Jan Beulich wrote: >> On 26.10.2022 15:58, Roger Pau Monné wrote: >>> On Wed, Oct 26, 2022 at 02:47:43PM +0200, Jan Beulich wrote: >>>> On 25.10.2022 16:44, Roger Pau Monne wrote: >>>>> @@ -388,12 +391,12 @@ static void cf_check bar_write( >>>>> else >>>>> val &= PCI_BASE_ADDRESS_MEM_MASK; >>>>> >>>>> - if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY ) >>>>> + if ( bar->enabled ) >>>> >>>> In 3 of the 4 cases you use header->bars_mapped as replacement. Since it's >>>> not clear to me why you don't here, could you explain this to me? (I'm >>>> therefore undecided whether this is merely a cosmetic [consistency] issue.) >>> >>> No, it's intended to use bar->enabled here rather than >>> header->bars_mapped. >>> >>> It's possible to have header->bars_mapped == true, but bar->enabled == >>> false if memory decoding is enabled, but this BAR specifically has >>> failed to be mapped in the guest p2m, which means dom0 is safe to move >>> it for what Xen cares (ie: it won't mess with p2m mappings because >>> there are none for the BAR). >>> >>> We could be more strict and use header->bars_mapped, but I don't think >>> there's a need for it. >>> >>> What about adding a comment with: >>> >>> /* >>> * Xen only cares whether the BAR is mapped into the p2m, so allow BAR >>> * writes as long as the BAR is not mapped into the p2m. >>> */ >>> >>> Otherwise I can switch to using header->bars_mapped if you think >>> that's clearer. >> >> It's not so much a matter of being clearer, but a matter of consistency: >> Why does the same consideration not apply in rom_write()? In fact both >> uses there are (already before the change) combined with further >> conditions (checking header->rom_enabled and new_enabled). If the >> inconsistency is on purpose (and perhaps necessary), I'd like to first >> understand why that is before deciding what to do about it. A comment >> like you suggest it _may_ be the route to go. > > ROM register is more complex to handle, because the same register > that's used to store the address also contains the bit that can > trigger whether it's mapped into the guest p2m or not > (PCI_ROM_ADDRESS_ENABLE). So ROM BAR writes with the ROM BAR mapped > and the PCI_ROM_ADDRESS_ENABLE bit also set in the to be written value > will be rejected, because we only allow to first disable the ROM and > then change the address (which is likely to not play well with OSes, > but so far I haven't been able to test ROM BAR register usage on PVH). > > I do think for consistency it would be better to use rom->enabled in > the first conditional of rom_write() check, so it would be: > > if ( rom->enabled && new_enabled ) > { > gprintk(XENLOG_WARNING, > "%pp: ignored ROM BAR write while mapped\n", > &pdev->sbdf); > return; > } > > So that we also allow changing the address of ROM BARs even with > memory decoding and PCI_ROM_ADDRESS_ENABLE as long as the ROM BAR is > not mapped in the p2m. > > Would you be fine with the comment in the previous email added and > rom_write() adjusted as suggested above? Yes, that would look better to me. The comment then probably also wants duplicating (or pointing at from) here. Jan
diff --git a/xen/drivers/vpci/header.c b/xen/drivers/vpci/header.c index 4d7f8f4a30..ecd95059b2 100644 --- a/xen/drivers/vpci/header.c +++ b/xen/drivers/vpci/header.c @@ -128,7 +128,10 @@ static void modify_decoding(const struct pci_dev *pdev, uint16_t cmd, } if ( !rom_only ) + { pci_conf_write16(pdev->sbdf, PCI_COMMAND, cmd); + header->bars_mapped = map; + } else ASSERT_UNREACHABLE(); } @@ -355,13 +358,13 @@ static int modify_bars(const struct pci_dev *pdev, uint16_t cmd, bool rom_only) static void cf_check cmd_write( const struct pci_dev *pdev, unsigned int reg, uint32_t cmd, void *data) { - uint16_t current_cmd = pci_conf_read16(pdev->sbdf, reg); + struct vpci_header *header = data; /* * Let Dom0 play with all the bits directly except for the memory * decoding one. */ - if ( (cmd ^ current_cmd) & PCI_COMMAND_MEMORY ) + if ( header->bars_mapped != !!(cmd & PCI_COMMAND_MEMORY) ) /* * Ignore the error. No memory has been added or removed from the p2m * (because the actual p2m changes are deferred in defer_map) and the @@ -388,12 +391,12 @@ static void cf_check bar_write( else val &= PCI_BASE_ADDRESS_MEM_MASK; - if ( pci_conf_read16(pdev->sbdf, PCI_COMMAND) & PCI_COMMAND_MEMORY ) + if ( bar->enabled ) { /* If the value written is the current one avoid printing a warning. */ if ( val != (uint32_t)(bar->addr >> (hi ? 32 : 0)) ) gprintk(XENLOG_WARNING, - "%pp: ignored BAR %zu write with memory decoding enabled\n", + "%pp: ignored BAR %zu write while mapped\n", &pdev->sbdf, bar - pdev->vpci->header.bars + hi); return; } @@ -422,13 +425,12 @@ static void cf_check rom_write( { struct vpci_header *header = &pdev->vpci->header; struct vpci_bar *rom = data; - uint16_t cmd = pci_conf_read16(pdev->sbdf, PCI_COMMAND); bool new_enabled = val & PCI_ROM_ADDRESS_ENABLE; - if ( (cmd & PCI_COMMAND_MEMORY) && header->rom_enabled && new_enabled ) + if ( header->bars_mapped && header->rom_enabled && new_enabled ) { gprintk(XENLOG_WARNING, - "%pp: ignored ROM BAR write with memory decoding enabled\n", + "%pp: ignored ROM BAR write while mapped\n", &pdev->sbdf); return; } @@ -440,7 +442,7 @@ static void cf_check rom_write( */ rom->addr = val & PCI_ROM_ADDRESS_MASK; - if ( !(cmd & PCI_COMMAND_MEMORY) || header->rom_enabled == new_enabled ) + if ( !header->bars_mapped || header->rom_enabled == new_enabled ) { /* Just update the ROM BAR field. */ header->rom_enabled = new_enabled; diff --git a/xen/include/xen/vpci.h b/xen/include/xen/vpci.h index 67c9a0c631..d8acfeba8a 100644 --- a/xen/include/xen/vpci.h +++ b/xen/include/xen/vpci.h @@ -88,6 +88,12 @@ struct vpci { * is mapped into guest p2m) if there's a ROM BAR on the device. */ bool rom_enabled : 1; + /* + * Cache whether memory decoding is enabled from our PoV. + * Some devices have a sticky memory decoding so that can't be relied + * upon to know whether BARs are mapped into the guest p2m. + */ + bool bars_mapped : 1; /* FIXME: currently there's no support for SR-IOV. */ } header;
Writes to the BARs are ignored if memory decoding is enabled for the device, and the same happen with ROM BARs if the write is an attempt to change the position of the BAR without disabling it first. The reason of ignoring such writes is a limitation in Xen, as it would need to unmap the BAR, change the address, and remap the BAR at the new position, which the current logic doesn't support. Some devices however seem to (wrongly) have the memory decoding bit hardcoded to enabled, and attempts to disable it don't get reflected on the command register. This causes issues for well behaved guests that disable memory decoding and then try to size the BARs, as vPCI will think memory decoding is still enabled and ignore the write. Since vPCI doesn't explicitly care about whether the memory decoding bit is disabled as long as the BAR is not mapped in the guest p2m use the information in the vpci_bar to check whether the BAR is mapped, and refuse writes only based on that information. This workarounds the issue, and allows guests to size and reposition the BARs properly. Signed-off-by: Roger Pau Monné <roger.pau@citrix.com> --- Changes since v1: - Cache setting of memory decoding in command register. - Reword some log messages. --- xen/drivers/vpci/header.c | 18 ++++++++++-------- xen/include/xen/vpci.h | 6 ++++++ 2 files changed, 16 insertions(+), 8 deletions(-)