diff mbox

PCI: Add ACS quirk for Intel 300 series chipset

Message ID 20180427104423.2203-1-mika.westerberg@linux.intel.com (mailing list archive)
State New, archived
Delegated to: Bjorn Helgaas
Headers show

Commit Message

Mika Westerberg April 27, 2018, 10:44 a.m. UTC
Intel 300 series chipset still has the same ACS issue than the previous
generations so extend the ACS quirk to cover it as well.

Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
---
 drivers/pci/quirks.c | 6 ++++++
 1 file changed, 6 insertions(+)

Comments

Bjorn Helgaas April 27, 2018, 6:10 p.m. UTC | #1
On Fri, Apr 27, 2018 at 01:44:23PM +0300, Mika Westerberg wrote:
> Intel 300 series chipset still has the same ACS issue than the previous
> generations so extend the ACS quirk to cover it as well.
> 
> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>

Applied to pci/virtualization for v4.18, thanks!

Stable tag here, too?

> ---
>  drivers/pci/quirks.c | 6 ++++++
>  1 file changed, 6 insertions(+)
> 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 2990ad1e7c99..7a0f41176369 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -4230,6 +4230,11 @@ static int pci_quirk_qcom_rp_acs(struct pci_dev *dev, u16 acs_flags)
>   * 0xa290-0xa29f PCI Express Root port #{0-16}
>   * 0xa2e7-0xa2ee PCI Express Root port #{17-24}
>   *
> + * The 300 series chipset suffers from the same bug so include those root
> + * ports here as well.
> + *
> + * 0xa32c-0xa343 PCI Express Root port #{0-24}
> + *
>   * [1] http://www.intel.com/content/www/us/en/chipsets/100-series-chipset-datasheet-vol-2.html
>   * [2] http://www.intel.com/content/www/us/en/chipsets/100-series-chipset-datasheet-vol-1.html
>   * [3] http://www.intel.com/content/www/us/en/chipsets/100-series-chipset-spec-update.html
> @@ -4244,6 +4249,7 @@ static bool pci_quirk_intel_spt_pch_acs_match(struct pci_dev *dev)
>  	switch (dev->device) {
>  	case 0xa110 ... 0xa11f: case 0xa167 ... 0xa16a: /* Sunrise Point */
>  	case 0xa290 ... 0xa29f: case 0xa2e7 ... 0xa2ee: /* Union Point */
> +	case 0xa32c ... 0xa343:				/* 300 series */
>  		return true;
>  	}
>  
> -- 
> 2.17.0
>
Mika Westerberg April 29, 2018, 7:03 a.m. UTC | #2
On Fri, Apr 27, 2018 at 01:10:24PM -0500, Bjorn Helgaas wrote:
> On Fri, Apr 27, 2018 at 01:44:23PM +0300, Mika Westerberg wrote:
> > Intel 300 series chipset still has the same ACS issue than the previous
> > generations so extend the ACS quirk to cover it as well.
> > 
> > Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> 
> Applied to pci/virtualization for v4.18, thanks!

Thanks!

> Stable tag here, too?

Fine by me.
Alex Williamson Aug. 15, 2018, 9:21 p.m. UTC | #3
On Fri, 27 Apr 2018 13:44:23 +0300
Mika Westerberg <mika.westerberg@linux.intel.com> wrote:

> Intel 300 series chipset still has the same ACS issue than the previous
> generations so extend the ACS quirk to cover it as well.
> 
> Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> ---
>  drivers/pci/quirks.c | 6 ++++++
>  1 file changed, 6 insertions(+)

There seems to be some suspicion whether this patch is correct, see
link below[1].  The user lists a kernel "4.14.36-1-lts" where lspci
shows the ACSCtl register with SV, RR, and CR enabled.  AIUI how
previous Intel root ports were defective in this regard, a dword
register was used for the ACS capability and control field rather than
the spec defined word register, therefore the upper half of each
mal-sized register was reserved.  I don't think lspci has gained any
insight into this quirk, so showing those control values enabled
suggests this hardware might actually not require the quirk.

The second listing for v4.17 (and I don't know if this is actually
v4.17.10 which would include the stable backport of this patch or
v4.17.0) shows the ACSCtl register as zero.  This is what we'd see if
the hardware does have this errata OR if we applied the quirk to
hardware that doesn't require it.  The existence of the first listing
kind of suggests the latter.

Is there a datasheet available to support this quirk?  Thanks,

Alex

[1]https://www.reddit.com/r/VFIO/comments/97da1s/intel_c246_suspect_300_series_acs_quirk_breaks/

 
> diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> index 2990ad1e7c99..7a0f41176369 100644
> --- a/drivers/pci/quirks.c
> +++ b/drivers/pci/quirks.c
> @@ -4230,6 +4230,11 @@ static int pci_quirk_qcom_rp_acs(struct pci_dev *dev, u16 acs_flags)
>   * 0xa290-0xa29f PCI Express Root port #{0-16}
>   * 0xa2e7-0xa2ee PCI Express Root port #{17-24}
>   *
> + * The 300 series chipset suffers from the same bug so include those root
> + * ports here as well.
> + *
> + * 0xa32c-0xa343 PCI Express Root port #{0-24}
> + *
>   * [1] http://www.intel.com/content/www/us/en/chipsets/100-series-chipset-datasheet-vol-2.html
>   * [2] http://www.intel.com/content/www/us/en/chipsets/100-series-chipset-datasheet-vol-1.html
>   * [3] http://www.intel.com/content/www/us/en/chipsets/100-series-chipset-spec-update.html
> @@ -4244,6 +4249,7 @@ static bool pci_quirk_intel_spt_pch_acs_match(struct pci_dev *dev)
>  	switch (dev->device) {
>  	case 0xa110 ... 0xa11f: case 0xa167 ... 0xa16a: /* Sunrise Point */
>  	case 0xa290 ... 0xa29f: case 0xa2e7 ... 0xa2ee: /* Union Point */
> +	case 0xa32c ... 0xa343:				/* 300 series */
>  		return true;
>  	}
>
Alex Williamson Aug. 15, 2018, 10:43 p.m. UTC | #4
On Wed, 15 Aug 2018 15:21:39 -0600
Alex Williamson <alex.williamson@redhat.com> wrote:

> On Fri, 27 Apr 2018 13:44:23 +0300
> Mika Westerberg <mika.westerberg@linux.intel.com> wrote:
> 
> > Intel 300 series chipset still has the same ACS issue than the previous
> > generations so extend the ACS quirk to cover it as well.
> > 
> > Signed-off-by: Mika Westerberg <mika.westerberg@linux.intel.com>
> > ---
> >  drivers/pci/quirks.c | 6 ++++++
> >  1 file changed, 6 insertions(+)  
> 
> There seems to be some suspicion whether this patch is correct, see
> link below[1].  The user lists a kernel "4.14.36-1-lts" where lspci
> shows the ACSCtl register with SV, RR, and CR enabled.  AIUI how
> previous Intel root ports were defective in this regard, a dword
> register was used for the ACS capability and control field rather than
> the spec defined word register, therefore the upper half of each
> mal-sized register was reserved.  I don't think lspci has gained any
> insight into this quirk, so showing those control values enabled
> suggests this hardware might actually not require the quirk.
> 
> The second listing for v4.17 (and I don't know if this is actually
> v4.17.10 which would include the stable backport of this patch or
> v4.17.0) shows the ACSCtl register as zero.  This is what we'd see if
> the hardware does have this errata OR if we applied the quirk to
> hardware that doesn't require it.  The existence of the first listing
> kind of suggests the latter.
> 
> Is there a datasheet available to support this quirk?  Thanks,
> 
> Alex
> 
> [1]https://www.reddit.com/r/VFIO/comments/97da1s/intel_c246_suspect_300_series_acs_quirk_breaks/

Sorry for the self follow-up, but another user pointed me to these:

https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/300-series-chipset-pch-datasheet-vol-1.pdf
https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/300-series-c240-series-chipset-pch-spec-update.pdf

And the device IDs match and the errata is #3 in the second link, yet:

00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:a338] (rev f0) (prog-if 00 [Normal decode])
00:1c.2 PCI bridge [0604]: Intel Corporation Device [8086:a33a] (rev f0) (prog-if 00 [Normal decode])
	Capabilities: [140 v1] Access Control Services
		ACSCap:	SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
		ACSCtl:	SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
00:1c.5 PCI bridge [0604]: Intel Corporation Device [8086:a33d] (rev f0) (prog-if 00 [Normal decode])
	Capabilities: [140 v1] Access Control Services
		ACSCap:	SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
		ACSCtl:	SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
00:1c.6 PCI bridge [0604]: Intel Corporation Device [8086:a33e] (rev f0) (prog-if 00 [Normal decode])
	Capabilities: [140 v1] Access Control Services
		ACSCap:	SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
		ACSCtl:	SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
00:1c.7 PCI bridge [0604]: Intel Corporation Device [8086:a33f] (rev f0) (prog-if 00 [Normal decode])
	Capabilities: [140 v1] Access Control Services
		ACSCap:	SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
		ACSCtl:	SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-

And the dump of config space is (1c.2):

140: 0d 00 01 15 0f 00 0d 00 00 00 00 00 00 00 00 00
                 ^^^^^ ^^^^^
                  Cap   Ctl

So it sure seems like the quirk shouldn't apply here.  Note that the
user calls this a C246 chipset, though this isn't a directly addressed
product SKU in either document.  The LPC device ID is a309, which also
is not directly addressed by the above datasheet.  Has Intel fixed this
in some SKUs, but not others, both using the same root port device
IDs?  Thanks,

Alex

> > diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
> > index 2990ad1e7c99..7a0f41176369 100644
> > --- a/drivers/pci/quirks.c
> > +++ b/drivers/pci/quirks.c
> > @@ -4230,6 +4230,11 @@ static int pci_quirk_qcom_rp_acs(struct pci_dev *dev, u16 acs_flags)
> >   * 0xa290-0xa29f PCI Express Root port #{0-16}
> >   * 0xa2e7-0xa2ee PCI Express Root port #{17-24}
> >   *
> > + * The 300 series chipset suffers from the same bug so include those root
> > + * ports here as well.
> > + *
> > + * 0xa32c-0xa343 PCI Express Root port #{0-24}
> > + *
> >   * [1] http://www.intel.com/content/www/us/en/chipsets/100-series-chipset-datasheet-vol-2.html
> >   * [2] http://www.intel.com/content/www/us/en/chipsets/100-series-chipset-datasheet-vol-1.html
> >   * [3] http://www.intel.com/content/www/us/en/chipsets/100-series-chipset-spec-update.html
> > @@ -4244,6 +4249,7 @@ static bool pci_quirk_intel_spt_pch_acs_match(struct pci_dev *dev)
> >  	switch (dev->device) {
> >  	case 0xa110 ... 0xa11f: case 0xa167 ... 0xa16a: /* Sunrise Point */
> >  	case 0xa290 ... 0xa29f: case 0xa2e7 ... 0xa2ee: /* Union Point */
> > +	case 0xa32c ... 0xa343:				/* 300 series */
> >  		return true;
> >  	}
> >    
>
Mika Westerberg Aug. 16, 2018, 6:10 a.m. UTC | #5
On Wed, Aug 15, 2018 at 04:43:08PM -0600, Alex Williamson wrote:
> Sorry for the self follow-up, but another user pointed me to these:
> 
> https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/300-series-chipset-pch-datasheet-vol-1.pdf
> https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/300-series-c240-series-chipset-pch-spec-update.pdf
> 
> And the device IDs match and the errata is #3 in the second link, yet:
> 
> 00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:a338] (rev f0) (prog-if 00 [Normal decode])
> 00:1c.2 PCI bridge [0604]: Intel Corporation Device [8086:a33a] (rev f0) (prog-if 00 [Normal decode])
> 	Capabilities: [140 v1] Access Control Services
> 		ACSCap:	SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
> 		ACSCtl:	SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
> 00:1c.5 PCI bridge [0604]: Intel Corporation Device [8086:a33d] (rev f0) (prog-if 00 [Normal decode])
> 	Capabilities: [140 v1] Access Control Services
> 		ACSCap:	SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
> 		ACSCtl:	SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
> 00:1c.6 PCI bridge [0604]: Intel Corporation Device [8086:a33e] (rev f0) (prog-if 00 [Normal decode])
> 	Capabilities: [140 v1] Access Control Services
> 		ACSCap:	SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
> 		ACSCtl:	SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
> 00:1c.7 PCI bridge [0604]: Intel Corporation Device [8086:a33f] (rev f0) (prog-if 00 [Normal decode])
> 	Capabilities: [140 v1] Access Control Services
> 		ACSCap:	SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
> 		ACSCtl:	SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
> 
> And the dump of config space is (1c.2):
> 
> 140: 0d 00 01 15 0f 00 0d 00 00 00 00 00 00 00 00 00
>                  ^^^^^ ^^^^^
>                   Cap   Ctl

It certainly looks like so.

> So it sure seems like the quirk shouldn't apply here.  Note that the
> user calls this a C246 chipset, though this isn't a directly addressed
> product SKU in either document.  The LPC device ID is a309, which also
> is not directly addressed by the above datasheet.  Has Intel fixed this
> in some SKUs, but not others, both using the same root port device
> IDs?  Thanks,

Let me ask around and see if someone here can explain this.
Alex Williamson Aug. 16, 2018, 3:13 p.m. UTC | #6
On Thu, 16 Aug 2018 09:10:15 +0300
Mika Westerberg <mika.westerberg@linux.intel.com> wrote:

> On Wed, Aug 15, 2018 at 04:43:08PM -0600, Alex Williamson wrote:
> > Sorry for the self follow-up, but another user pointed me to these:
> > 
> > https://www.intel.com/content/dam/www/public/us/en/documents/datasheets/300-series-chipset-pch-datasheet-vol-1.pdf
> > https://www.intel.com/content/dam/www/public/us/en/documents/specification-updates/300-series-c240-series-chipset-pch-spec-update.pdf
> > 
> > And the device IDs match and the errata is #3 in the second link, yet:
> > 
> > 00:1c.0 PCI bridge [0604]: Intel Corporation Device [8086:a338] (rev f0) (prog-if 00 [Normal decode])
> > 00:1c.2 PCI bridge [0604]: Intel Corporation Device [8086:a33a] (rev f0) (prog-if 00 [Normal decode])
> > 	Capabilities: [140 v1] Access Control Services
> > 		ACSCap:	SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
> > 		ACSCtl:	SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
> > 00:1c.5 PCI bridge [0604]: Intel Corporation Device [8086:a33d] (rev f0) (prog-if 00 [Normal decode])
> > 	Capabilities: [140 v1] Access Control Services
> > 		ACSCap:	SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
> > 		ACSCtl:	SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
> > 00:1c.6 PCI bridge [0604]: Intel Corporation Device [8086:a33e] (rev f0) (prog-if 00 [Normal decode])
> > 	Capabilities: [140 v1] Access Control Services
> > 		ACSCap:	SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
> > 		ACSCtl:	SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
> > 00:1c.7 PCI bridge [0604]: Intel Corporation Device [8086:a33f] (rev f0) (prog-if 00 [Normal decode])
> > 	Capabilities: [140 v1] Access Control Services
> > 		ACSCap:	SrcValid+ TransBlk+ ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
> > 		ACSCtl:	SrcValid+ TransBlk- ReqRedir+ CmpltRedir+ UpstreamFwd- EgressCtrl- DirectTrans-
> > 
> > And the dump of config space is (1c.2):
> > 
> > 140: 0d 00 01 15 0f 00 0d 00 00 00 00 00 00 00 00 00
> >                  ^^^^^ ^^^^^
> >                   Cap   Ctl  
> 
> It certainly looks like so.
> 
> > So it sure seems like the quirk shouldn't apply here.  Note that the
> > user calls this a C246 chipset, though this isn't a directly addressed
> > product SKU in either document.  The LPC device ID is a309, which also
> > is not directly addressed by the above datasheet.  Has Intel fixed this
> > in some SKUs, but not others, both using the same root port device
> > IDs?  Thanks,  
> 
> Let me ask around and see if someone here can explain this.

Untested, but I wonder if we need something like below to validate that
the standard control register is read-only, or in reality the upper half
of the dword register where all the bits are reserved on afflicted
devices.  Adding a test of the LPC device ID seems even messier.
Thanks,

Alex

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index e249676fbf04..b67d4789a4bb 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4352,6 +4352,7 @@ static bool pci_quirk_intel_spt_pch_acs_match(struct pci_dev *dev)
 static int pci_quirk_intel_spt_pch_acs(struct pci_dev *dev, u16 acs_flags)
 {
 	int pos;
+	u16 std_ctrl;
 	u32 cap, ctrl;
 
 	if (!pci_quirk_intel_spt_pch_acs_match(dev))
@@ -4361,6 +4362,11 @@ static int pci_quirk_intel_spt_pch_acs(struct pci_dev *dev, u16 acs_flags)
 	if (!pos)
 		return -ENOTTY;
 
+	/* If the standard control bits are set, this is not our dev */
+	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &std_ctrl);
+	if (std_ctrl)
+		return -ENOTTY;
+
 	/* see pci_acs_flags_enabled() */
 	pci_read_config_dword(dev, pos + PCI_ACS_CAP, &cap);
 	acs_flags &= (cap | PCI_ACS_EC);
@@ -4612,6 +4618,7 @@ static int pci_quirk_enable_intel_pch_acs(struct pci_dev *dev)
 static int pci_quirk_enable_intel_spt_pch_acs(struct pci_dev *dev)
 {
 	int pos;
+	u16 std_ctrl;
 	u32 cap, ctrl;
 
 	if (!pci_quirk_intel_spt_pch_acs_match(dev))
@@ -4621,6 +4628,18 @@ static int pci_quirk_enable_intel_spt_pch_acs(struct pci_dev *dev)
 	if (!pos)
 		return -ENOTTY;
 
+	/* If the std control word has bits set or is writable, do not quirk */
+	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &std_ctrl);
+	if (std_ctrl)
+		return -ENOTTY;
+
+	pci_write_config_word(dev, pos + PCI_ACS_CTRL, 0xff);
+	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &std_ctrl);
+	if (std_ctrl) {
+		pci_write_config_word(dev, pos + PCI_ACS_CTRL, 0);
+		return -ENOTTY;
+	}
+
 	pci_read_config_dword(dev, pos + PCI_ACS_CAP, &cap);
 	pci_read_config_dword(dev, pos + INTEL_SPT_ACS_CTRL, &ctrl);
Mika Westerberg Aug. 16, 2018, 7:28 p.m. UTC | #7
On Thu, Aug 16, 2018 at 09:13:03AM -0600, Alex Williamson wrote:
>  static int pci_quirk_enable_intel_spt_pch_acs(struct pci_dev *dev)
>  {
>  	int pos;
> +	u16 std_ctrl;
>  	u32 cap, ctrl;
>  
>  	if (!pci_quirk_intel_spt_pch_acs_match(dev))
> @@ -4621,6 +4628,18 @@ static int pci_quirk_enable_intel_spt_pch_acs(struct pci_dev *dev)
>  	if (!pos)
>  		return -ENOTTY;
>  
> +	/* If the std control word has bits set or is writable, do not quirk */
> +	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &std_ctrl);
> +	if (std_ctrl)
> +		return -ENOTTY;
> +
> +	pci_write_config_word(dev, pos + PCI_ACS_CTRL, 0xff);

I don't know ACS well but could the above have some unwanted
side-effects, even if we write back zeroes below?

> +	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &std_ctrl);
> +	if (std_ctrl) {
> +		pci_write_config_word(dev, pos + PCI_ACS_CTRL, 0);
> +		return -ENOTTY;
> +	}
> +
>  	pci_read_config_dword(dev, pos + PCI_ACS_CAP, &cap);
>  	pci_read_config_dword(dev, pos + INTEL_SPT_ACS_CTRL, &ctrl);
>
Alex Williamson Aug. 16, 2018, 8:25 p.m. UTC | #8
On Thu, 16 Aug 2018 22:28:05 +0300
Mika Westerberg <mika.westerberg@linux.intel.com> wrote:

> On Thu, Aug 16, 2018 at 09:13:03AM -0600, Alex Williamson wrote:
> >  static int pci_quirk_enable_intel_spt_pch_acs(struct pci_dev *dev)
> >  {
> >  	int pos;
> > +	u16 std_ctrl;
> >  	u32 cap, ctrl;
> >  
> >  	if (!pci_quirk_intel_spt_pch_acs_match(dev))
> > @@ -4621,6 +4628,18 @@ static int pci_quirk_enable_intel_spt_pch_acs(struct pci_dev *dev)
> >  	if (!pos)
> >  		return -ENOTTY;
> >  
> > +	/* If the std control word has bits set or is writable, do not quirk */
> > +	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &std_ctrl);
> > +	if (std_ctrl)
> > +		return -ENOTTY;
> > +
> > +	pci_write_config_word(dev, pos + PCI_ACS_CTRL, 0xff);  
> 
> I don't know ACS well but could the above have some unwanted
> side-effects, even if we write back zeroes below?

It can certainly influence in-flight packet routing, but at the point
we're performing this test, we're about to do that anyway.  This should
only be happening during discovery and we're limited to a set of root
ports for this test, so we shouldn't have any drivers attached
downstream from here.  For the majority of devices that would pass
through this quirk, we expect the register to behave as if it were
read-only so we can only potentially break the already broken C246 port
through here.

We could possibly refine this to fully replace pci_std_enable_acs(),
even for the matched Intel root ports that seem to be fixed by
attempting to set the requested flags at the standard location, test if
they stick, if so consider it done (exit success rather than -ENOTTY),
if not try the dword version.  Thanks,

Alex

> > +	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &std_ctrl);
> > +	if (std_ctrl) {
> > +		pci_write_config_word(dev, pos + PCI_ACS_CTRL, 0);
> > +		return -ENOTTY;
> > +	}
> > +
> >  	pci_read_config_dword(dev, pos + PCI_ACS_CAP, &cap);
> >  	pci_read_config_dword(dev, pos + INTEL_SPT_ACS_CTRL, &ctrl);
> >
Mika Westerberg Aug. 17, 2018, 9:37 a.m. UTC | #9
On Thu, Aug 16, 2018 at 02:25:04PM -0600, Alex Williamson wrote:
> On Thu, 16 Aug 2018 22:28:05 +0300
> Mika Westerberg <mika.westerberg@linux.intel.com> wrote:
> 
> > On Thu, Aug 16, 2018 at 09:13:03AM -0600, Alex Williamson wrote:
> > >  static int pci_quirk_enable_intel_spt_pch_acs(struct pci_dev *dev)
> > >  {
> > >  	int pos;
> > > +	u16 std_ctrl;
> > >  	u32 cap, ctrl;
> > >  
> > >  	if (!pci_quirk_intel_spt_pch_acs_match(dev))
> > > @@ -4621,6 +4628,18 @@ static int pci_quirk_enable_intel_spt_pch_acs(struct pci_dev *dev)
> > >  	if (!pos)
> > >  		return -ENOTTY;
> > >  
> > > +	/* If the std control word has bits set or is writable, do not quirk */
> > > +	pci_read_config_word(dev, pos + PCI_ACS_CTRL, &std_ctrl);
> > > +	if (std_ctrl)
> > > +		return -ENOTTY;
> > > +
> > > +	pci_write_config_word(dev, pos + PCI_ACS_CTRL, 0xff);  
> > 
> > I don't know ACS well but could the above have some unwanted
> > side-effects, even if we write back zeroes below?
> 
> It can certainly influence in-flight packet routing, but at the point
> we're performing this test, we're about to do that anyway.  This should
> only be happening during discovery and we're limited to a set of root
> ports for this test, so we shouldn't have any drivers attached
> downstream from here.  For the majority of devices that would pass
> through this quirk, we expect the register to behave as if it were
> read-only so we can only potentially break the already broken C246 port
> through here.

OK.

> We could possibly refine this to fully replace pci_std_enable_acs(),
> even for the matched Intel root ports that seem to be fixed by
> attempting to set the requested flags at the standard location, test if
> they stick, if so consider it done (exit success rather than -ENOTTY),
> if not try the dword version.  Thanks,

Before going there, I would like to get some definitive answer from our
PCIe people regarding this (currently waiting for their reply).
Mika Westerberg Sept. 5, 2018, 9:58 a.m. UTC | #10
On Fri, Aug 17, 2018 at 12:37:16PM +0300, Mika Westerberg wrote:
> Before going there, I would like to get some definitive answer from our
> PCIe people regarding this (currently waiting for their reply).

Sorry for the delay.

I finally got confirmation and indeed this issue is fixed in 300 series
chipset. The datasheet has outdated information but it will be fixed as
well.

I will send a revert soon.
diff mbox

Patch

diff --git a/drivers/pci/quirks.c b/drivers/pci/quirks.c
index 2990ad1e7c99..7a0f41176369 100644
--- a/drivers/pci/quirks.c
+++ b/drivers/pci/quirks.c
@@ -4230,6 +4230,11 @@  static int pci_quirk_qcom_rp_acs(struct pci_dev *dev, u16 acs_flags)
  * 0xa290-0xa29f PCI Express Root port #{0-16}
  * 0xa2e7-0xa2ee PCI Express Root port #{17-24}
  *
+ * The 300 series chipset suffers from the same bug so include those root
+ * ports here as well.
+ *
+ * 0xa32c-0xa343 PCI Express Root port #{0-24}
+ *
  * [1] http://www.intel.com/content/www/us/en/chipsets/100-series-chipset-datasheet-vol-2.html
  * [2] http://www.intel.com/content/www/us/en/chipsets/100-series-chipset-datasheet-vol-1.html
  * [3] http://www.intel.com/content/www/us/en/chipsets/100-series-chipset-spec-update.html
@@ -4244,6 +4249,7 @@  static bool pci_quirk_intel_spt_pch_acs_match(struct pci_dev *dev)
 	switch (dev->device) {
 	case 0xa110 ... 0xa11f: case 0xa167 ... 0xa16a: /* Sunrise Point */
 	case 0xa290 ... 0xa29f: case 0xa2e7 ... 0xa2ee: /* Union Point */
+	case 0xa32c ... 0xa343:				/* 300 series */
 		return true;
 	}