Message ID | 1485793552-12614-1-git-send-email-Vadim.Lomovtsev@caviumnetworks.com (mailing list archive) |
---|---|
State | New, archived |
Headers | show |
On 01/30/2017 08:25 AM, Vadim Lomovtsev wrote: > This patch is to address PEM initialization issue > which causes network issues. > > It is necessary to search for _HID:PNP0A08 while requesting > PEM resources via ACPI instead of "THRX0002". Since this is fixing new code, there should be no chance to break previously working systems. Bjorn: If possible, this should be merged for v4.10 > > Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com> Acked-by: David Daney <david.daney@cavium.com> > --- > drivers/pci/host/pci-thunder-pem.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/pci/host/pci-thunder-pem.c b/drivers/pci/host/pci-thunder-pem.c > index af722eb..aec30b8 100644 > --- a/drivers/pci/host/pci-thunder-pem.c > +++ b/drivers/pci/host/pci-thunder-pem.c > @@ -331,7 +331,7 @@ static int thunder_pem_acpi_init(struct pci_config_window *cfg) > if (!res_pem) > return -ENOMEM; > > - ret = acpi_get_rc_resources(dev, "THRX0002", root->segment, res_pem); > + ret = acpi_get_rc_resources(dev, "PNP0A08", root->segment, res_pem); > if (ret) { > dev_err(dev, "can't get rc base address\n"); > return ret; >
Hi Vadim, On Mon, Jan 30, 2017 at 08:25:52AM -0800, Vadim Lomovtsev wrote: > This patch is to address PEM initialization issue > which causes network issues. > > It is necessary to search for _HID:PNP0A08 while requesting > PEM resources via ACPI instead of "THRX0002". > > Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com> > --- > drivers/pci/host/pci-thunder-pem.c | 2 +- > 1 file changed, 1 insertion(+), 1 deletion(-) > > diff --git a/drivers/pci/host/pci-thunder-pem.c b/drivers/pci/host/pci-thunder-pem.c > index af722eb..aec30b8 100644 > --- a/drivers/pci/host/pci-thunder-pem.c > +++ b/drivers/pci/host/pci-thunder-pem.c > @@ -331,7 +331,7 @@ static int thunder_pem_acpi_init(struct pci_config_window *cfg) > if (!res_pem) > return -ENOMEM; > > - ret = acpi_get_rc_resources(dev, "THRX0002", root->segment, res_pem); > + ret = acpi_get_rc_resources(dev, "PNP0A08", root->segment, res_pem); This doesn't smell right: PNP0A08 is the generic ACPI ID. There's no guarantee that if we find a PNP0A08 device, it is a ThunderX device. I think the only way to call thunder_pem_acpi_init() is via an MCFG quirk that mentions thunder_pem_ecam_ops, which means we only call it if we find an MCFG with "CAVIUM" "THUNDERX" OEM and table IDs, so it's probably safe in that sense. But it's an abuse of the ACPI _HID model. If you match a device using PNP0A08, all you can assume about it is that it uses the generic PNP0A08 programming model, and I don't think that includes "the first memory resource in _CRS contains ECAM space and MSI-X tables." I expect this is a teething issue because you have firmware in the field that uses PNP0A08 and it's not feasible to update it. If that's the case, the changelog should have details about it and we should have a comment in the code, because I don't think this is the model we want to end up with in future releases. > if (ret) { > dev_err(dev, "can't get rc base address\n"); > return ret; > -- > 2.4.11 > > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
Hi Bjorn, On Mon, Jan 30, 2017 at 03:12:37PM -0600, Bjorn Helgaas wrote: > Hi Vadim, > > On Mon, Jan 30, 2017 at 08:25:52AM -0800, Vadim Lomovtsev wrote: > > This patch is to address PEM initialization issue > > which causes network issues. > > > > It is necessary to search for _HID:PNP0A08 while requesting > > PEM resources via ACPI instead of "THRX0002". > > > > Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com> > > --- > > drivers/pci/host/pci-thunder-pem.c | 2 +- > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > diff --git a/drivers/pci/host/pci-thunder-pem.c b/drivers/pci/host/pci-thunder-pem.c > > index af722eb..aec30b8 100644 > > --- a/drivers/pci/host/pci-thunder-pem.c > > +++ b/drivers/pci/host/pci-thunder-pem.c > > @@ -331,7 +331,7 @@ static int thunder_pem_acpi_init(struct pci_config_window *cfg) > > if (!res_pem) > > return -ENOMEM; > > > > - ret = acpi_get_rc_resources(dev, "THRX0002", root->segment, res_pem); > > + ret = acpi_get_rc_resources(dev, "PNP0A08", root->segment, res_pem); > > This doesn't smell right: PNP0A08 is the generic ACPI ID. There's no > guarantee that if we find a PNP0A08 device, it is a ThunderX device. > > I think the only way to call thunder_pem_acpi_init() is via an MCFG > quirk that mentions thunder_pem_ecam_ops, which means we only call it > if we find an MCFG with "CAVIUM" "THUNDERX" OEM and table IDs, so it's > probably safe in that sense. Agree, it is not the best solution. We will implement such approach and send for review. > > But it's an abuse of the ACPI _HID model. If you match a device using > PNP0A08, all you can assume about it is that it uses the generic > PNP0A08 programming model, and I don't think that includes "the first > memory resource in _CRS contains ECAM space and MSI-X tables." > > I expect this is a teething issue because you have firmware in the > field that uses PNP0A08 and it's not feasible to update it. If that's > the case, the changelog should have details about it and we should > have a comment in the code, because I don't think this is the model we > want to end up with in future releases. It could become so. However, for now I didn't get any reports on that, (may be I miss something) except some internal emailings. At my testing HW I was able to see some issues related to acpi-PEM stuff. Thanks for feed-back, we will prepare another patch or patchset implementing approach you've highlighted. > > if (ret) { > > dev_err(dev, "can't get rc base address\n"); > > return ret; > > -- > > 2.4.11 > > > > > > _______________________________________________ > > linux-arm-kernel mailing list > > linux-arm-kernel@lists.infradead.org > > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel
On Tue, Jan 31, 2017 at 02:28:30AM -0800, Vadim Lomovtsev wrote: > Hi Bjorn, > > On Mon, Jan 30, 2017 at 03:12:37PM -0600, Bjorn Helgaas wrote: > > Hi Vadim, > > > > On Mon, Jan 30, 2017 at 08:25:52AM -0800, Vadim Lomovtsev wrote: > > > This patch is to address PEM initialization issue > > > which causes network issues. > > > > > > It is necessary to search for _HID:PNP0A08 while requesting > > > PEM resources via ACPI instead of "THRX0002". > > > > > > Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com> > > > --- > > > drivers/pci/host/pci-thunder-pem.c | 2 +- > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > diff --git a/drivers/pci/host/pci-thunder-pem.c b/drivers/pci/host/pci-thunder-pem.c > > > index af722eb..aec30b8 100644 > > > --- a/drivers/pci/host/pci-thunder-pem.c > > > +++ b/drivers/pci/host/pci-thunder-pem.c > > > @@ -331,7 +331,7 @@ static int thunder_pem_acpi_init(struct pci_config_window *cfg) > > > if (!res_pem) > > > return -ENOMEM; > > > > > > - ret = acpi_get_rc_resources(dev, "THRX0002", root->segment, res_pem); > > > + ret = acpi_get_rc_resources(dev, "PNP0A08", root->segment, res_pem); > > > > This doesn't smell right: PNP0A08 is the generic ACPI ID. There's no > > guarantee that if we find a PNP0A08 device, it is a ThunderX device. > > > > I think the only way to call thunder_pem_acpi_init() is via an MCFG > > quirk that mentions thunder_pem_ecam_ops, which means we only call it > > if we find an MCFG with "CAVIUM" "THUNDERX" OEM and table IDs, so it's > > probably safe in that sense. > > Agree, it is not the best solution. > We will implement such approach and send for review. > > > > > But it's an abuse of the ACPI _HID model. If you match a device using > > PNP0A08, all you can assume about it is that it uses the generic > > PNP0A08 programming model, and I don't think that includes "the first > > memory resource in _CRS contains ECAM space and MSI-X tables." > > > > I expect this is a teething issue because you have firmware in the > > field that uses PNP0A08 and it's not feasible to update it. If that's > > the case, the changelog should have details about it and we should > > have a comment in the code, because I don't think this is the model we > > want to end up with in future releases. > > It could become so. However, for now I didn't get any reports on that, > (may be I miss something) except some internal emailings. > At my testing HW I was able to see some issues related to acpi-PEM stuff. > > Thanks for feed-back, we will prepare another patch or patchset > implementing approach you've highlighted. The approach I would like best is to search for THRX0002, because then you know you have a ThunderX PEM device, and you can assume device-specific details about its _CRS. But that's what the existing code does, and apparently that doesn't work. My guess is that you have firmware in the field where the host bridge has only PNP0A08 (and maybe PNP0A03) as device IDs. Per spec, the OS can only assume the generic PCI host bridge programming model in that case. It can't use any device-specific features, like the register space you're extracting here, since there's no way to tell what specific device you have. If the OS needs to use device-specific features, there must be a device-specific _HID, i.e., THRX0002. This is the important thing to get right in the future. If existing firmware in the field has no device-specific _HID, we might need something like this as a quirk to work around that firmware deficiency. If that's the case, all I'm really asking for is: - Some indication that you have a plan to change the firmware strategy so future releases don't require similar quirks, and - Some comments in the code pointing out that this is a workaround for a firmware deficiency. As I mentioned, I think this change should be safe because we only run this code if we find a CAVIUM THUNDERX MCFG table, and we search for a PNP0A08 device that matches the segment from MCFG. The only way we could accidentally match the wrong device would be if we had another bridge in the same segment. Maybe we should have made acpi_get_rc_resources() match on the bus range as well as the segment. But that feels like just a "pedantically correct" sort of thing and probably over-engineering for this situation; I don't think it would solve any current problem. I know you'd like to see this in v4.10, so we need to sort this out ASAP. Bjorn
On Tue, Jan 31, 2017 at 08:25:25AM -0600, Bjorn Helgaas wrote: > On Tue, Jan 31, 2017 at 02:28:30AM -0800, Vadim Lomovtsev wrote: > > Hi Bjorn, > > > > On Mon, Jan 30, 2017 at 03:12:37PM -0600, Bjorn Helgaas wrote: > > > Hi Vadim, > > > > > > On Mon, Jan 30, 2017 at 08:25:52AM -0800, Vadim Lomovtsev wrote: > > > > This patch is to address PEM initialization issue > > > > which causes network issues. > > > > > > > > It is necessary to search for _HID:PNP0A08 while requesting > > > > PEM resources via ACPI instead of "THRX0002". > > > > > > > > Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com> > > > > --- > > > > drivers/pci/host/pci-thunder-pem.c | 2 +- > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > diff --git a/drivers/pci/host/pci-thunder-pem.c b/drivers/pci/host/pci-thunder-pem.c > > > > index af722eb..aec30b8 100644 > > > > --- a/drivers/pci/host/pci-thunder-pem.c > > > > +++ b/drivers/pci/host/pci-thunder-pem.c > > > > @@ -331,7 +331,7 @@ static int thunder_pem_acpi_init(struct pci_config_window *cfg) > > > > if (!res_pem) > > > > return -ENOMEM; > > > > > > > > - ret = acpi_get_rc_resources(dev, "THRX0002", root->segment, res_pem); > > > > + ret = acpi_get_rc_resources(dev, "PNP0A08", root->segment, res_pem); > > > > > > This doesn't smell right: PNP0A08 is the generic ACPI ID. There's no > > > guarantee that if we find a PNP0A08 device, it is a ThunderX device. > > > > > > I think the only way to call thunder_pem_acpi_init() is via an MCFG > > > quirk that mentions thunder_pem_ecam_ops, which means we only call it > > > if we find an MCFG with "CAVIUM" "THUNDERX" OEM and table IDs, so it's > > > probably safe in that sense. > > > > Agree, it is not the best solution. > > We will implement such approach and send for review. > > > > > > > > But it's an abuse of the ACPI _HID model. If you match a device using > > > PNP0A08, all you can assume about it is that it uses the generic > > > PNP0A08 programming model, and I don't think that includes "the first > > > memory resource in _CRS contains ECAM space and MSI-X tables." > > > > > > I expect this is a teething issue because you have firmware in the > > > field that uses PNP0A08 and it's not feasible to update it. If that's > > > the case, the changelog should have details about it and we should > > > have a comment in the code, because I don't think this is the model we > > > want to end up with in future releases. > > > > It could become so. However, for now I didn't get any reports on that, > > (may be I miss something) except some internal emailings. > > At my testing HW I was able to see some issues related to acpi-PEM stuff. > > > > Thanks for feed-back, we will prepare another patch or patchset > > implementing approach you've highlighted. > > The approach I would like best is to search for THRX0002, because then > you know you have a ThunderX PEM device, and you can assume > device-specific details about its _CRS. > > But that's what the existing code does, and apparently that doesn't > work. My guess is that you have firmware in the field where the host > bridge has only PNP0A08 (and maybe PNP0A03) as device IDs. Because there is no such ACPI ID as "THRX0002" registered (http://www.uefi.org/acpi_id_list). From the other hand we may gather device resources through _CRS (we have acpi_device already set by this moment) but we need to be sure that we're running at Cavium ThunderX board then. To do that we may add _SUB value (accrodingly to spec) with exact ID string (a full PCI ID value "177DA22D" is suggested). This will eventually become the main method of finding out whether we run on a ThunderX PEM. However the current FW have no such string and we need to handle this case also. > Per spec, the OS can only assume the generic PCI host bridge > programming model in that case. It can't use any device-specific > features, like the register space you're extracting here, since > there's no way to tell what specific device you have. > > If the OS needs to use device-specific features, there must be a > device-specific _HID, i.e., THRX0002. This is the important thing to > get right in the future. > > If existing firmware in the field has no device-specific _HID, we I'm afraid not. > might need something like this as a quirk to work around that firmware > deficiency. If that's the case, all I'm really asking for is: > > - Some indication that you have a plan to change the firmware > strategy so future releases don't require similar quirks, and I believe we have some work at FW level to get this things done accordingly, but we have HW in the field and this could become critical. > - Some comments in the code pointing out that this is a workaround > for a firmware deficiency. > > As I mentioned, I think this change should be safe because we only run > this code if we find a CAVIUM THUNDERX MCFG table, and we search for a > PNP0A08 device that matches the segment from MCFG. However, the patch is not in good shape (was done in rush to be honest) so I prepare another one shortly, addressing your comments. > The only way we could accidentally match the wrong device would be if > we had another bridge in the same segment. Maybe we should have made > acpi_get_rc_resources() match on the bus range as well as the segment. > But that feels like just a "pedantically correct" sort of thing and > probably over-engineering for this situation; I don't think it would > solve any current problem. > > I know you'd like to see this in v4.10, so we need to sort this out > ASAP. > > Bjorn
On Tue, Jan 31, 2017 at 06:57:20AM -0800, Vadim Lomovtsev wrote: > On Tue, Jan 31, 2017 at 08:25:25AM -0600, Bjorn Helgaas wrote: > > On Tue, Jan 31, 2017 at 02:28:30AM -0800, Vadim Lomovtsev wrote: > > > Hi Bjorn, > > > > > > On Mon, Jan 30, 2017 at 03:12:37PM -0600, Bjorn Helgaas wrote: > > > > Hi Vadim, > > > > > > > > On Mon, Jan 30, 2017 at 08:25:52AM -0800, Vadim Lomovtsev wrote: > > > > > This patch is to address PEM initialization issue > > > > > which causes network issues. > > > > > > > > > > It is necessary to search for _HID:PNP0A08 while requesting > > > > > PEM resources via ACPI instead of "THRX0002". > > > > > > > > > > Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com> > > > > > --- > > > > > drivers/pci/host/pci-thunder-pem.c | 2 +- > > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > diff --git a/drivers/pci/host/pci-thunder-pem.c b/drivers/pci/host/pci-thunder-pem.c > > > > > index af722eb..aec30b8 100644 > > > > > --- a/drivers/pci/host/pci-thunder-pem.c > > > > > +++ b/drivers/pci/host/pci-thunder-pem.c > > > > > @@ -331,7 +331,7 @@ static int thunder_pem_acpi_init(struct pci_config_window *cfg) > > > > > if (!res_pem) > > > > > return -ENOMEM; > > > > > > > > > > - ret = acpi_get_rc_resources(dev, "THRX0002", root->segment, res_pem); > > > > > + ret = acpi_get_rc_resources(dev, "PNP0A08", root->segment, res_pem); > > > > > > > > This doesn't smell right: PNP0A08 is the generic ACPI ID. There's no > > > > guarantee that if we find a PNP0A08 device, it is a ThunderX device. > > > > > > > > I think the only way to call thunder_pem_acpi_init() is via an MCFG > > > > quirk that mentions thunder_pem_ecam_ops, which means we only call it > > > > if we find an MCFG with "CAVIUM" "THUNDERX" OEM and table IDs, so it's > > > > probably safe in that sense. > > > > > > Agree, it is not the best solution. > > > We will implement such approach and send for review. > > > > > > > > > > > But it's an abuse of the ACPI _HID model. If you match a device using > > > > PNP0A08, all you can assume about it is that it uses the generic > > > > PNP0A08 programming model, and I don't think that includes "the first > > > > memory resource in _CRS contains ECAM space and MSI-X tables." > > > > > > > > I expect this is a teething issue because you have firmware in the > > > > field that uses PNP0A08 and it's not feasible to update it. If that's > > > > the case, the changelog should have details about it and we should > > > > have a comment in the code, because I don't think this is the model we > > > > want to end up with in future releases. > > > > > > It could become so. However, for now I didn't get any reports on that, > > > (may be I miss something) except some internal emailings. > > > At my testing HW I was able to see some issues related to acpi-PEM stuff. > > > > > > Thanks for feed-back, we will prepare another patch or patchset > > > implementing approach you've highlighted. > > > > The approach I would like best is to search for THRX0002, because then > > you know you have a ThunderX PEM device, and you can assume > > device-specific details about its _CRS. > > > > But that's what the existing code does, and apparently that doesn't > > work. My guess is that you have firmware in the field where the host > > bridge has only PNP0A08 (and maybe PNP0A03) as device IDs. > > Because there is no such ACPI ID as "THRX0002" registered > (http://www.uefi.org/acpi_id_list). To be pedantically correct, I think you want "THRX" registered. Then you can manage the "0002" part internally without registering each individual device. > From the other hand we may gather device resources through > _CRS (we have acpi_device already set by this moment) but > we need to be sure that we're running at Cavium ThunderX board then. > > To do that we may add _SUB value (accrodingly to spec) with > exact ID string (a full PCI ID value "177DA22D" is suggested). > This will eventually become the main method of finding out > whether we run on a ThunderX PEM. I don't understand why you would use _SUB. I think the correct way to handle this is to use a _HID of "THRX0002" with a _CID of "PNP0A08". Is there some existing _SUB usage you're using as an example? Bjorn
On Tue, Jan 31, 2017 at 02:31:09PM -0600, Bjorn Helgaas wrote: > On Tue, Jan 31, 2017 at 06:57:20AM -0800, Vadim Lomovtsev wrote: > > On Tue, Jan 31, 2017 at 08:25:25AM -0600, Bjorn Helgaas wrote: > > > On Tue, Jan 31, 2017 at 02:28:30AM -0800, Vadim Lomovtsev wrote: > > > > Hi Bjorn, > > > > > > > > On Mon, Jan 30, 2017 at 03:12:37PM -0600, Bjorn Helgaas wrote: > > > > > Hi Vadim, > > > > > > > > > > On Mon, Jan 30, 2017 at 08:25:52AM -0800, Vadim Lomovtsev wrote: > > > > > > This patch is to address PEM initialization issue > > > > > > which causes network issues. > > > > > > > > > > > > It is necessary to search for _HID:PNP0A08 while requesting > > > > > > PEM resources via ACPI instead of "THRX0002". > > > > > > > > > > > > Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com> > > > > > > --- > > > > > > drivers/pci/host/pci-thunder-pem.c | 2 +- > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > > > diff --git a/drivers/pci/host/pci-thunder-pem.c b/drivers/pci/host/pci-thunder-pem.c > > > > > > index af722eb..aec30b8 100644 > > > > > > --- a/drivers/pci/host/pci-thunder-pem.c > > > > > > +++ b/drivers/pci/host/pci-thunder-pem.c > > > > > > @@ -331,7 +331,7 @@ static int thunder_pem_acpi_init(struct pci_config_window *cfg) > > > > > > if (!res_pem) > > > > > > return -ENOMEM; > > > > > > > > > > > > - ret = acpi_get_rc_resources(dev, "THRX0002", root->segment, res_pem); > > > > > > + ret = acpi_get_rc_resources(dev, "PNP0A08", root->segment, res_pem); > > > > > > > > > > This doesn't smell right: PNP0A08 is the generic ACPI ID. There's no > > > > > guarantee that if we find a PNP0A08 device, it is a ThunderX device. > > > > > > > > > > I think the only way to call thunder_pem_acpi_init() is via an MCFG > > > > > quirk that mentions thunder_pem_ecam_ops, which means we only call it > > > > > if we find an MCFG with "CAVIUM" "THUNDERX" OEM and table IDs, so it's > > > > > probably safe in that sense. > > > > > > > > Agree, it is not the best solution. > > > > We will implement such approach and send for review. > > > > > > > > > > > > > > But it's an abuse of the ACPI _HID model. If you match a device using > > > > > PNP0A08, all you can assume about it is that it uses the generic > > > > > PNP0A08 programming model, and I don't think that includes "the first > > > > > memory resource in _CRS contains ECAM space and MSI-X tables." > > > > > > > > > > I expect this is a teething issue because you have firmware in the > > > > > field that uses PNP0A08 and it's not feasible to update it. If that's > > > > > the case, the changelog should have details about it and we should > > > > > have a comment in the code, because I don't think this is the model we > > > > > want to end up with in future releases. > > > > > > > > It could become so. However, for now I didn't get any reports on that, > > > > (may be I miss something) except some internal emailings. > > > > At my testing HW I was able to see some issues related to acpi-PEM stuff. > > > > > > > > Thanks for feed-back, we will prepare another patch or patchset > > > > implementing approach you've highlighted. > > > > > > The approach I would like best is to search for THRX0002, because then > > > you know you have a ThunderX PEM device, and you can assume > > > device-specific details about its _CRS. > > > > > > But that's what the existing code does, and apparently that doesn't > > > work. My guess is that you have firmware in the field where the host > > > bridge has only PNP0A08 (and maybe PNP0A03) as device IDs. > > > > Because there is no such ACPI ID as "THRX0002" registered > > (http://www.uefi.org/acpi_id_list). > > To be pedantically correct, I think you want "THRX" registered. Then > you can manage the "0002" part internally without registering each > individual device. Not sure if it would be registered that way, because (AFAIK) it expected to be string constructed from Vendor ID (not the Product ID) plus four hex digit manged internaly. So we suggest to change it to 177DXXXX which corresponds to Cavium PCI ID https://pci-ids.ucw.cz/pci.ids. It's also possible to use the 3-digit PNP ID, "CAV", to construct these _HID/_CID/_SUB values (http://www.uefi.org/pnp_id_list). So the FW will be updated accordingly. For old FW we'll implement fallback scenario gathering resource info via _CRS object (comments will be provided in the code also). > > > From the other hand we may gather device resources through > > _CRS (we have acpi_device already set by this moment) but > > we need to be sure that we're running at Cavium ThunderX board then. > > > > To do that we may add _SUB value (accrodingly to spec) with > > exact ID string (a full PCI ID value "177DA22D" is suggested). > > This will eventually become the main method of finding out > > whether we run on a ThunderX PEM. > > I don't understand why you would use _SUB. I think the correct way to > handle this is to use a _HID of "THRX0002" with a _CID of "PNP0A08". > Is there some existing _SUB usage you're using as an example? > Here we were talking about extra quirk to match PEM and ECAM and thus make a decision. This object could also store some vendor-specific IDs. To be honest I didn't see usage in kernel (didn't search), but ACPI Spec provide an example as (p 6.1.9) : Name (_SUB, "MSFT3000")// Vendor-defined subsystem However this is still debatable and we probably stick to _HID while implementing additional match functions. > Bjorn
On Wed, Feb 01, 2017 at 04:53:25AM -0800, Vadim Lomovtsev wrote: > On Tue, Jan 31, 2017 at 02:31:09PM -0600, Bjorn Helgaas wrote: > > On Tue, Jan 31, 2017 at 06:57:20AM -0800, Vadim Lomovtsev wrote: > > > On Tue, Jan 31, 2017 at 08:25:25AM -0600, Bjorn Helgaas wrote: > > > > On Tue, Jan 31, 2017 at 02:28:30AM -0800, Vadim Lomovtsev wrote: > > > > > Hi Bjorn, > > > > > > > > > > On Mon, Jan 30, 2017 at 03:12:37PM -0600, Bjorn Helgaas wrote: > > > > > > Hi Vadim, > > > > > > > > > > > > On Mon, Jan 30, 2017 at 08:25:52AM -0800, Vadim Lomovtsev wrote: > > > > > > > This patch is to address PEM initialization issue > > > > > > > which causes network issues. > > > > > > > > > > > > > > It is necessary to search for _HID:PNP0A08 while requesting > > > > > > > PEM resources via ACPI instead of "THRX0002". > > > > > > > > > > > > > > Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com> > > > > > > > --- > > > > > > > drivers/pci/host/pci-thunder-pem.c | 2 +- > > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > > > > > diff --git a/drivers/pci/host/pci-thunder-pem.c b/drivers/pci/host/pci-thunder-pem.c > > > > > > > index af722eb..aec30b8 100644 > > > > > > > --- a/drivers/pci/host/pci-thunder-pem.c > > > > > > > +++ b/drivers/pci/host/pci-thunder-pem.c > > > > > > > @@ -331,7 +331,7 @@ static int thunder_pem_acpi_init(struct pci_config_window *cfg) > > > > > > > if (!res_pem) > > > > > > > return -ENOMEM; > > > > > > > > > > > > > > - ret = acpi_get_rc_resources(dev, "THRX0002", root->segment, res_pem); > > > > > > > + ret = acpi_get_rc_resources(dev, "PNP0A08", root->segment, res_pem); > > > > > > > > > > > > This doesn't smell right: PNP0A08 is the generic ACPI ID. There's no > > > > > > guarantee that if we find a PNP0A08 device, it is a ThunderX device. > > > > > > > > > > > > I think the only way to call thunder_pem_acpi_init() is via an MCFG > > > > > > quirk that mentions thunder_pem_ecam_ops, which means we only call it > > > > > > if we find an MCFG with "CAVIUM" "THUNDERX" OEM and table IDs, so it's > > > > > > probably safe in that sense. > > > > > > > > > > Agree, it is not the best solution. > > > > > We will implement such approach and send for review. > > > > > > > > > > > > > > > > > But it's an abuse of the ACPI _HID model. If you match a device using > > > > > > PNP0A08, all you can assume about it is that it uses the generic > > > > > > PNP0A08 programming model, and I don't think that includes "the first > > > > > > memory resource in _CRS contains ECAM space and MSI-X tables." > > > > > > > > > > > > I expect this is a teething issue because you have firmware in the > > > > > > field that uses PNP0A08 and it's not feasible to update it. If that's > > > > > > the case, the changelog should have details about it and we should > > > > > > have a comment in the code, because I don't think this is the model we > > > > > > want to end up with in future releases. > > > > > > > > > > It could become so. However, for now I didn't get any reports on that, > > > > > (may be I miss something) except some internal emailings. > > > > > At my testing HW I was able to see some issues related to acpi-PEM stuff. > > > > > > > > > > Thanks for feed-back, we will prepare another patch or patchset > > > > > implementing approach you've highlighted. > > > > > > > > The approach I would like best is to search for THRX0002, because then > > > > you know you have a ThunderX PEM device, and you can assume > > > > device-specific details about its _CRS. > > > > > > > > But that's what the existing code does, and apparently that doesn't > > > > work. My guess is that you have firmware in the field where the host > > > > bridge has only PNP0A08 (and maybe PNP0A03) as device IDs. > > > > > > Because there is no such ACPI ID as "THRX0002" registered > > > (http://www.uefi.org/acpi_id_list). > > > > To be pedantically correct, I think you want "THRX" registered. Then > > you can manage the "0002" part internally without registering each > > individual device. > > Not sure if it would be registered that way, because (AFAIK) > it expected to be string constructed from Vendor ID (not the Product ID) plus > four hex digit manged internaly. So we suggest to change it to 177DXXXX > which corresponds to Cavium PCI ID https://pci-ids.ucw.cz/pci.ids. > It's also possible to use the 3-digit PNP ID, "CAV", to construct these > _HID/_CID/_SUB values (http://www.uefi.org/pnp_id_list). My point was that you only need to register the prefix ("CAV" or "THRX") of the PNP or ACPI ID. Then you manage the suffixes internally. You as long as you register "CAV" or "THRX", you can assign and use "THRX0002" yourself without registering that specifically. > So the FW will be updated accordingly. > > For old FW we'll implement fallback scenario gathering resource info > via _CRS object (comments will be provided in the code also). > > > > > > From the other hand we may gather device resources through > > > _CRS (we have acpi_device already set by this moment) but > > > we need to be sure that we're running at Cavium ThunderX board then. > > > > > > To do that we may add _SUB value (accrodingly to spec) with > > > exact ID string (a full PCI ID value "177DA22D" is suggested). > > > This will eventually become the main method of finding out > > > whether we run on a ThunderX PEM. > > > > I don't understand why you would use _SUB. I think the correct way to > > handle this is to use a _HID of "THRX0002" with a _CID of "PNP0A08". > > Is there some existing _SUB usage you're using as an example? > > Here we were talking about extra quirk to match PEM and ECAM and thus make > a decision. This object could also store some vendor-specific IDs. > To be honest I didn't see usage in kernel (didn't search), but ACPI Spec > provide an example as (p 6.1.9) : > Name (_SUB, "MSFT3000")// Vendor-defined subsystem > > However this is still debatable and we probably stick to _HID while > implementing additional match functions. If you want a driver to bind to a specific device, e.g., the ThunderX PCI host bridge, the firmware should use a unique _HID in the ACPI namespace and the driver should claim that _HID. It's conceivable that you could use _SUB, but that's needlessly different from everything else. Bjorn
On Wed, Feb 01, 2017 at 09:18:07AM -0600, Bjorn Helgaas wrote: > On Wed, Feb 01, 2017 at 04:53:25AM -0800, Vadim Lomovtsev wrote: > > On Tue, Jan 31, 2017 at 02:31:09PM -0600, Bjorn Helgaas wrote: > > > On Tue, Jan 31, 2017 at 06:57:20AM -0800, Vadim Lomovtsev wrote: > > > > On Tue, Jan 31, 2017 at 08:25:25AM -0600, Bjorn Helgaas wrote: > > > > > On Tue, Jan 31, 2017 at 02:28:30AM -0800, Vadim Lomovtsev wrote: > > > > > > Hi Bjorn, > > > > > > > > > > > > On Mon, Jan 30, 2017 at 03:12:37PM -0600, Bjorn Helgaas wrote: > > > > > > > Hi Vadim, > > > > > > > > > > > > > > On Mon, Jan 30, 2017 at 08:25:52AM -0800, Vadim Lomovtsev wrote: > > > > > > > > This patch is to address PEM initialization issue > > > > > > > > which causes network issues. > > > > > > > > > > > > > > > > It is necessary to search for _HID:PNP0A08 while requesting > > > > > > > > PEM resources via ACPI instead of "THRX0002". > > > > > > > > > > > > > > > > Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com> > > > > > > > > --- > > > > > > > > drivers/pci/host/pci-thunder-pem.c | 2 +- > > > > > > > > 1 file changed, 1 insertion(+), 1 deletion(-) > > > > > > > > > > > > > > > > diff --git a/drivers/pci/host/pci-thunder-pem.c b/drivers/pci/host/pci-thunder-pem.c > > > > > > > > index af722eb..aec30b8 100644 > > > > > > > > --- a/drivers/pci/host/pci-thunder-pem.c > > > > > > > > +++ b/drivers/pci/host/pci-thunder-pem.c > > > > > > > > @@ -331,7 +331,7 @@ static int thunder_pem_acpi_init(struct pci_config_window *cfg) > > > > > > > > if (!res_pem) > > > > > > > > return -ENOMEM; > > > > > > > > > > > > > > > > - ret = acpi_get_rc_resources(dev, "THRX0002", root->segment, res_pem); > > > > > > > > + ret = acpi_get_rc_resources(dev, "PNP0A08", root->segment, res_pem); > > > > > > > > > > > > > > This doesn't smell right: PNP0A08 is the generic ACPI ID. There's no > > > > > > > guarantee that if we find a PNP0A08 device, it is a ThunderX device. > > > > > > > > > > > > > > I think the only way to call thunder_pem_acpi_init() is via an MCFG > > > > > > > quirk that mentions thunder_pem_ecam_ops, which means we only call it > > > > > > > if we find an MCFG with "CAVIUM" "THUNDERX" OEM and table IDs, so it's > > > > > > > probably safe in that sense. > > > > > > > > > > > > Agree, it is not the best solution. > > > > > > We will implement such approach and send for review. > > > > > > > > > > > > > > > > > > > > But it's an abuse of the ACPI _HID model. If you match a device using > > > > > > > PNP0A08, all you can assume about it is that it uses the generic > > > > > > > PNP0A08 programming model, and I don't think that includes "the first > > > > > > > memory resource in _CRS contains ECAM space and MSI-X tables." > > > > > > > > > > > > > > I expect this is a teething issue because you have firmware in the > > > > > > > field that uses PNP0A08 and it's not feasible to update it. If that's > > > > > > > the case, the changelog should have details about it and we should > > > > > > > have a comment in the code, because I don't think this is the model we > > > > > > > want to end up with in future releases. > > > > > > > > > > > > It could become so. However, for now I didn't get any reports on that, > > > > > > (may be I miss something) except some internal emailings. > > > > > > At my testing HW I was able to see some issues related to acpi-PEM stuff. > > > > > > > > > > > > Thanks for feed-back, we will prepare another patch or patchset > > > > > > implementing approach you've highlighted. > > > > > > > > > > The approach I would like best is to search for THRX0002, because then > > > > > you know you have a ThunderX PEM device, and you can assume > > > > > device-specific details about its _CRS. > > > > > > > > > > But that's what the existing code does, and apparently that doesn't > > > > > work. My guess is that you have firmware in the field where the host > > > > > bridge has only PNP0A08 (and maybe PNP0A03) as device IDs. > > > > > > > > Because there is no such ACPI ID as "THRX0002" registered > > > > (http://www.uefi.org/acpi_id_list). > > > > > > To be pedantically correct, I think you want "THRX" registered. Then > > > you can manage the "0002" part internally without registering each > > > individual device. > > > > Not sure if it would be registered that way, because (AFAIK) > > it expected to be string constructed from Vendor ID (not the Product ID) plus > > four hex digit manged internaly. So we suggest to change it to 177DXXXX > > which corresponds to Cavium PCI ID https://pci-ids.ucw.cz/pci.ids. > > It's also possible to use the 3-digit PNP ID, "CAV", to construct these > > _HID/_CID/_SUB values (http://www.uefi.org/pnp_id_list). > > My point was that you only need to register the prefix ("CAV" or > "THRX") of the PNP or ACPI ID. Then you manage the suffixes > internally. You as long as you register "CAV" or "THRX", you can > assign and use "THRX0002" yourself without registering that > specifically. Yes, exactly. And the "CAV" perfix is already registered. And I think will'll use it to keep things aligned to specs & rules. > > > So the FW will be updated accordingly. > > > > For old FW we'll implement fallback scenario gathering resource info > > via _CRS object (comments will be provided in the code also). > > > > > > > > > From the other hand we may gather device resources through > > > > _CRS (we have acpi_device already set by this moment) but > > > > we need to be sure that we're running at Cavium ThunderX board then. > > > > > > > > To do that we may add _SUB value (accrodingly to spec) with > > > > exact ID string (a full PCI ID value "177DA22D" is suggested). > > > > This will eventually become the main method of finding out > > > > whether we run on a ThunderX PEM. > > > > > > I don't understand why you would use _SUB. I think the correct way to > > > handle this is to use a _HID of "THRX0002" with a _CID of "PNP0A08". > > > Is there some existing _SUB usage you're using as an example? > > > > Here we were talking about extra quirk to match PEM and ECAM and thus make > > a decision. This object could also store some vendor-specific IDs. > > To be honest I didn't see usage in kernel (didn't search), but ACPI Spec > > provide an example as (p 6.1.9) : > > Name (_SUB, "MSFT3000")// Vendor-defined subsystem > > > > However this is still debatable and we probably stick to _HID while > > implementing additional match functions. > > If you want a driver to bind to a specific device, e.g., the ThunderX > PCI host bridge, the firmware should use a unique _HID in the ACPI > namespace and the driver should claim that _HID. > > It's conceivable that you could use _SUB, but that's needlessly > different from everything else. See your point, thank you for quick and clear feed-back. Just to let you know: We are in the middle of developing solution for fall-back scenario (when ACPI/PNP ID wasn't found) for PEM initalization. Hope to be able to get it into 4.10. > > Bjorn Vadim
Hi Bjorn, Vadim, Following up to this old thread... On 02/01/2017 10:18 AM, Bjorn Helgaas wrote: > On Wed, Feb 01, 2017 at 04:53:25AM -0800, Vadim Lomovtsev wrote: >>>> Because there is no such ACPI ID as "THRX0002" registered >>>> (http://www.uefi.org/acpi_id_list). There is still no "THRX" prefix registered with UEFI as of this morning. >>> To be pedantically correct, I think you want "THRX" registered. Then >>> you can manage the "0002" part internally without registering each >>> individual device. The upstream Linux kernel contains a quirk matching entry that looks for "THRX". Therefore, you have already agreed (as of at least January) that this is the prefix that you will use in any firmware updates to support the latest upstream Linux kernel. Please register this prefix promptly. >> Not sure if it would be registered that way, because (AFAIK) >> it expected to be string constructed from Vendor ID (not the Product ID) plus >> four hex digit manged internaly. So we suggest to change it to 177DXXXX >> which corresponds to Cavium PCI ID https://pci-ids.ucw.cz/pci.ids. >> It's also possible to use the 3-digit PNP ID, "CAV", to construct these >> _HID/_CID/_SUB values (http://www.uefi.org/pnp_id_list). > > My point was that you only need to register the prefix ("CAV" or > "THRX") of the PNP or ACPI ID. Then you manage the suffixes > internally. You as long as you register "CAV" or "THRX", you can > assign and use "THRX0002" yourself without registering that > specifically. > >> So the FW will be updated accordingly. Indeed. The version Bjorn merged looks for "THRX". This is the version that you will use, and you will promptly register that prefix with UEFI and provide fixes for existing firmware to correctly use the solution that is upstream. Thanks, Jon.
Hi Jon, On Wed, Mar 15, 2017 at 07:14:38AM -0400, Jon Masters wrote: > Hi Bjorn, Vadim, > > Following up to this old thread... > > On 02/01/2017 10:18 AM, Bjorn Helgaas wrote: > > On Wed, Feb 01, 2017 at 04:53:25AM -0800, Vadim Lomovtsev wrote: > > >>>> Because there is no such ACPI ID as "THRX0002" registered > >>>> (http://www.uefi.org/acpi_id_list). > > There is still no "THRX" prefix registered with UEFI as of this morning. > > >>> To be pedantically correct, I think you want "THRX" registered. Then > >>> you can manage the "0002" part internally without registering each > >>> individual device. > > The upstream Linux kernel contains a quirk matching entry that looks for > "THRX". Therefore, you have already agreed (as of at least January) that > this is the prefix that you will use in any firmware updates to support > the latest upstream Linux kernel. Please register this prefix promptly. And from what I know for now - we wont going to register this since we have already regsitered "CAV" prefix for that. And this was the part of our discussion also. We had a bit long review of proper implementation of legacy firmware support, so my apologise on that. Please take a look at link to the patchset posted by Tomasz. https://www.spinics.net/lists/arm-kernel/msg568741.html > > >> Not sure if it would be registered that way, because (AFAIK) > >> it expected to be string constructed from Vendor ID (not the Product ID) plus > >> four hex digit manged internaly. So we suggest to change it to 177DXXXX > >> which corresponds to Cavium PCI ID https://pci-ids.ucw.cz/pci.ids. > >> It's also possible to use the 3-digit PNP ID, "CAV", to construct these > >> _HID/_CID/_SUB values (http://www.uefi.org/pnp_id_list). > > > > My point was that you only need to register the prefix ("CAV" or > > "THRX") of the PNP or ACPI ID. Then you manage the suffixes > > internally. You as long as you register "CAV" or "THRX", you can > > assign and use "THRX0002" yourself without registering that > > specifically. > > And my reply here was : "Yes, exactly. And the "CAV" perfix is already registered. And I think will'll use it to keep things aligned to specs & rules." > >> So the FW will be updated accordingly. > > Indeed. Yes, it is now contains "CAVxxx" as _HID for device config object. > > The version Bjorn merged looks for "THRX". This is the version that you will > use, and you will promptly register that prefix with UEFI and provide fixes > for existing firmware to correctly use the solution that is upstream. Cavium FW is updated accordingly to use already registered prefix. For existent FW legacy support is posted by Tomasz. > > Thanks, > > Jon. > WBR, Vadim
Hi Vadim, Thanks for your followup and attention to this matter. More below. On 03/15/2017 07:33 AM, Vadim Lomovtsev wrote: >> The upstream Linux kernel contains a quirk matching entry that looks for >> "THRX". Therefore, you have already agreed (as of at least January) that >> this is the prefix that you will use in any firmware updates to support >> the latest upstream Linux kernel. Please register this prefix promptly. > > And from what I know for now - we wont going to register this > since we have already regsitered "CAV" prefix for that. And this was the part > of our discussion also. > > We had a bit long review of proper implementation of legacy firmware support, > so my apologise on that. > > Please take a look at link to the patchset posted by Tomasz. > https://www.spinics.net/lists/arm-kernel/msg568741.html I'll let others comment on the suitability of taking that for upstream. >>>> So the FW will be updated accordingly. >> >> Indeed. > > Yes, it is now contains "CAVxxx" as _HID for device config object. Which is different from the version that was merged into upstream. That should never have happened. It will never happen again. I have spent some time over the past few days ensuring folks understand that I will not allow a repeat of this to occur the next time around. We will have platforms that are bulletproof and supported by upstream with any errata fixes in a very carefully controlled manner. There will under no circumstances ever be a situation like this again. >> The version Bjorn merged looks for "THRX". This is the version that you will >> use, and you will promptly register that prefix with UEFI and provide fixes >> for existing firmware to correctly use the solution that is upstream. > > Cavium FW is updated accordingly to use already registered prefix. > For existent FW legacy support is posted by Tomasz. I'm watching this to ensure it's cleaned up properly. Jon.
On 03/16/2017 07:32 AM, Jon Masters wrote: > Hi Vadim, > > Thanks for your followup and attention to this matter. More below. > > On 03/15/2017 07:33 AM, Vadim Lomovtsev wrote: > >>> The upstream Linux kernel contains a quirk matching entry that looks for >>> "THRX". Therefore, you have already agreed (as of at least January) that >>> this is the prefix that you will use in any firmware updates to support >>> the latest upstream Linux kernel. Please register this prefix promptly. >> >> And from what I know for now - we wont going to register this >> since we have already regsitered "CAV" prefix for that. And this was the part >> of our discussion also. >> >> We had a bit long review of proper implementation of legacy firmware support, >> so my apologise on that. >> >> Please take a look at link to the patchset posted by Tomasz. >> https://www.spinics.net/lists/arm-kernel/msg568741.html > > I'll let others comment on the suitability of taking that for upstream. > >>>>> So the FW will be updated accordingly. >>> >>> Indeed. >> >> Yes, it is now contains "CAVxxx" as _HID for device config object. > > Which is different from the version that was merged into upstream. That > should never have happened. It will never happen again. I have spent some > time over the past few days ensuring folks understand that I will not > allow a repeat of this to occur the next time around. We will have > platforms that are bulletproof and supported by upstream with any > errata fixes in a very carefully controlled manner. There will > under no circumstances ever be a situation like this again. We are still evaluating the merits of registering the values that appeared in v4.10, and not changing them. We should know more in a couple of days. > >>> The version Bjorn merged looks for "THRX". This is the version that you will >>> use, and you will promptly register that prefix with UEFI and provide fixes >>> for existing firmware to correctly use the solution that is upstream. >> >> Cavium FW is updated accordingly to use already registered prefix. >> For existent FW legacy support is posted by Tomasz. > > I'm watching this to ensure it's cleaned up properly. > > Jon. > > > _______________________________________________ > linux-arm-kernel mailing list > linux-arm-kernel@lists.infradead.org > http://lists.infradead.org/mailman/listinfo/linux-arm-kernel >
On 03/16/2017 12:25 PM, David Daney wrote: > On 03/16/2017 07:32 AM, Jon Masters wrote: >>> Yes, it is now contains "CAVxxx" as _HID for device config object. >> >> Which is different from the version that was merged into upstream. That >> should never have happened. It will never happen again. I have spent some >> time over the past few days ensuring folks understand that I will not >> allow a repeat of this to occur the next time around. We will have >> platforms that are bulletproof and supported by upstream with any >> errata fixes in a very carefully controlled manner. There will >> under no circumstances ever be a situation like this again. > > We are still evaluating the merits of registering the values that appeared > in v4.10, and not changing them. We should know more in a couple of days. Thanks David. What was the verdict? (for the public record). If we need to get a change into upstream, let's get that teed up before 4.12 merge. And for other folks following along with this thread: I'm not just picking on Cavium here. I'll be doing the same with *every* ARM server SoC company as necessary over the coming months. We are going to have militantly compliant standards adherence in this industry and every ARM server SoC is going to "just work" with an upstream Linux kernel with an ACPI enabled platform. This will be so utterly clean and boring it'll be amazing. Jon.
On Tue, Mar 21, 2017 at 07:38:07AM -0400, Jon Masters wrote: > On 03/16/2017 12:25 PM, David Daney wrote: > > On 03/16/2017 07:32 AM, Jon Masters wrote: > > >>> Yes, it is now contains "CAVxxx" as _HID for device config object. > >> > >> Which is different from the version that was merged into upstream. That > >> should never have happened. It will never happen again. I have spent some > >> time over the past few days ensuring folks understand that I will not > >> allow a repeat of this to occur the next time around. We will have > >> platforms that are bulletproof and supported by upstream with any > >> errata fixes in a very carefully controlled manner. There will > >> under no circumstances ever be a situation like this again. > > > > We are still evaluating the merits of registering the values that appeared > > in v4.10, and not changing them. We should know more in a couple of days. > > Thanks David. What was the verdict? (for the public record). If we need to > get a change into upstream, let's get that teed up before 4.12 merge. > > And for other folks following along with this thread: I'm not just picking > on Cavium here. I'll be doing the same with *every* ARM server SoC company > as necessary over the coming months. We are going to have militantly > compliant standards adherence in this industry and every ARM server SoC is > going to "just work" with an upstream Linux kernel with an ACPI enabled > platform. This will be so utterly clean and boring it'll be amazing. Thanks for keeping on top of this, Jon. I agree, we should not be using unregistered vendor prefixes, e.g., the "THRX" added by 44f22bd91e88 ("PCI: Add MCFG quirks for Cavium ThunderX pass2.x host controller"). I'm sorry I merged that without doing the due diligence. I suspect the resolution will be to register "THRX". If that doesn't happen, I'll propose reverting 44f22bd91e88, not because I want to break things, but only because I'm not personally in a position to do anything smarter. So please propose a better solution that fits within the ACPI _HID/_CID model :) Bjorn
Hi Bjorn, On 21.03.2017 14:47, Bjorn Helgaas wrote: > On Tue, Mar 21, 2017 at 07:38:07AM -0400, Jon Masters wrote: >> On 03/16/2017 12:25 PM, David Daney wrote: >>> On 03/16/2017 07:32 AM, Jon Masters wrote: >> >>>>> Yes, it is now contains "CAVxxx" as _HID for device config object. >>>> >>>> Which is different from the version that was merged into upstream. That >>>> should never have happened. It will never happen again. I have spent some >>>> time over the past few days ensuring folks understand that I will not >>>> allow a repeat of this to occur the next time around. We will have >>>> platforms that are bulletproof and supported by upstream with any >>>> errata fixes in a very carefully controlled manner. There will >>>> under no circumstances ever be a situation like this again. >>> >>> We are still evaluating the merits of registering the values that appeared >>> in v4.10, and not changing them. We should know more in a couple of days. >> >> Thanks David. What was the verdict? (for the public record). If we need to >> get a change into upstream, let's get that teed up before 4.12 merge. >> >> And for other folks following along with this thread: I'm not just picking >> on Cavium here. I'll be doing the same with *every* ARM server SoC company >> as necessary over the coming months. We are going to have militantly >> compliant standards adherence in this industry and every ARM server SoC is >> going to "just work" with an upstream Linux kernel with an ACPI enabled >> platform. This will be so utterly clean and boring it'll be amazing. > > Thanks for keeping on top of this, Jon. I agree, we should not be > using unregistered vendor prefixes, e.g., the "THRX" added by > 44f22bd91e88 ("PCI: Add MCFG quirks for Cavium ThunderX pass2.x host > controller"). I'm sorry I merged that without doing the due > diligence. Honestly, it is me who is responsible for this since I submitted the patch. > > I suspect the resolution will be to register "THRX". If that doesn't > happen, I'll propose reverting 44f22bd91e88, not because I want to > break things, but only because I'm not personally in a position to do > anything smarter. So please propose a better solution that fits > within the ACPI _HID/_CID model :) I already submitted the patch to fix this. Please see: https://patchwork.ozlabs.org/patch/739042/ Thanks, Tomasz
On 03/21/2017 07:17 AM, Tomasz Nowicki wrote: > Hi Bjorn, > > On 21.03.2017 14:47, Bjorn Helgaas wrote: >> On Tue, Mar 21, 2017 at 07:38:07AM -0400, Jon Masters wrote: >>> On 03/16/2017 12:25 PM, David Daney wrote: >>>> On 03/16/2017 07:32 AM, Jon Masters wrote: >>> >>>>>> Yes, it is now contains "CAVxxx" as _HID for device config object. >>>>> >>>>> Which is different from the version that was merged into upstream. >>>>> That >>>>> should never have happened. It will never happen again. I have >>>>> spent some >>>>> time over the past few days ensuring folks understand that I will not >>>>> allow a repeat of this to occur the next time around. We will have >>>>> platforms that are bulletproof and supported by upstream with any >>>>> errata fixes in a very carefully controlled manner. There will >>>>> under no circumstances ever be a situation like this again. >>>> >>>> We are still evaluating the merits of registering the values that >>>> appeared >>>> in v4.10, and not changing them. We should know more in a couple of >>>> days. >>> >>> Thanks David. What was the verdict? (for the public record). If we >>> need to >>> get a change into upstream, let's get that teed up before 4.12 merge. >>> >>> And for other folks following along with this thread: I'm not just >>> picking >>> on Cavium here. I'll be doing the same with *every* ARM server SoC >>> company >>> as necessary over the coming months. We are going to have militantly >>> compliant standards adherence in this industry and every ARM server >>> SoC is >>> going to "just work" with an upstream Linux kernel with an ACPI enabled >>> platform. This will be so utterly clean and boring it'll be amazing. >> >> Thanks for keeping on top of this, Jon. I agree, we should not be >> using unregistered vendor prefixes, e.g., the "THRX" added by >> 44f22bd91e88 ("PCI: Add MCFG quirks for Cavium ThunderX pass2.x host >> controller"). I'm sorry I merged that without doing the due >> diligence. > > Honestly, it is me who is responsible for this since I submitted the patch. Yes. After all this back and forth, Cavium has decided to deploy firmware with "CAVxxx" as _HID. The deciding factor was that the prefix is already registered and there are probably fewer than 10 systems deployed with the experimental and erroneous "THRXxxx" value. Neither option (switching the kernel to "CAVxxx", or changing the firmware to use "THRXxxx") was without its drawbacks. There were valid arguments on either side. In the end internal momentum, among other factors, brought us to the conclusion that using the "CAVxxx" as _HID, and trying to get Tomasz' patches merged is what we will do. We fully realize that there may exists some combinations of the Linux kernel and Cavium firmware (both officially released, and experimental and unsupported) that don't result in functional PCIe. I, personally, am sorry that this screw up happened in the first place. Let's hope that everyone learned something from the experience. Thanks, David Daney > >> >> I suspect the resolution will be to register "THRX". If that doesn't >> happen, I'll propose reverting 44f22bd91e88, not because I want to >> break things, but only because I'm not personally in a position to do >> anything smarter. So please propose a better solution that fits >> within the ACPI _HID/_CID model :) > > I already submitted the patch to fix this. Please see: > https://patchwork.ozlabs.org/patch/739042/ > > Thanks, > Tomasz
On Tue, Mar 21, 2017 at 03:17:17PM +0100, Tomasz Nowicki wrote: > Hi Bjorn, > > On 21.03.2017 14:47, Bjorn Helgaas wrote: > >On Tue, Mar 21, 2017 at 07:38:07AM -0400, Jon Masters wrote: > >>On 03/16/2017 12:25 PM, David Daney wrote: > >>>On 03/16/2017 07:32 AM, Jon Masters wrote: > >> > >>>>>Yes, it is now contains "CAVxxx" as _HID for device config object. > >>>> > >>>>Which is different from the version that was merged into upstream. That > >>>>should never have happened. It will never happen again. I have spent some > >>>>time over the past few days ensuring folks understand that I will not > >>>>allow a repeat of this to occur the next time around. We will have > >>>>platforms that are bulletproof and supported by upstream with any > >>>>errata fixes in a very carefully controlled manner. There will > >>>>under no circumstances ever be a situation like this again. > >>> > >>>We are still evaluating the merits of registering the values that appeared > >>>in v4.10, and not changing them. We should know more in a couple of days. > >> > >>Thanks David. What was the verdict? (for the public record). If we need to > >>get a change into upstream, let's get that teed up before 4.12 merge. > >> > >>And for other folks following along with this thread: I'm not just picking > >>on Cavium here. I'll be doing the same with *every* ARM server SoC company > >>as necessary over the coming months. We are going to have militantly > >>compliant standards adherence in this industry and every ARM server SoC is > >>going to "just work" with an upstream Linux kernel with an ACPI enabled > >>platform. This will be so utterly clean and boring it'll be amazing. > > > >Thanks for keeping on top of this, Jon. I agree, we should not be > >using unregistered vendor prefixes, e.g., the "THRX" added by > >44f22bd91e88 ("PCI: Add MCFG quirks for Cavium ThunderX pass2.x host > >controller"). I'm sorry I merged that without doing the due > >diligence. > > Honestly, it is me who is responsible for this since I submitted the patch. > > > > >I suspect the resolution will be to register "THRX". If that doesn't > >happen, I'll propose reverting 44f22bd91e88, not because I want to > >break things, but only because I'm not personally in a position to do > >anything smarter. So please propose a better solution that fits > >within the ACPI _HID/_CID model :) > > I already submitted the patch to fix this. Please see: > https://patchwork.ozlabs.org/patch/739042/ Applied, thanks! Glad to have this cleared up! Bjorn
On 03/21/2017 10:56 AM, David Daney wrote: > On 03/21/2017 07:17 AM, Tomasz Nowicki wrote: >> On 21.03.2017 14:47, Bjorn Helgaas wrote: >>>> And for other folks following along with this thread: I'm not just >>>> picking on Cavium here. I'll be doing the same with *every* ARM server >>>> SoC company as necessary over the coming months. >>> Thanks for keeping on top of this, Jon. You're welcome. I'm pleased (in some sense) that we're starting to see enough systems shipping that unifying quirks and IDs such that ODMs can bend metal easily is a problem that we want to solve. I am saddened that there isn't an ARM swat team with black helicopters swooping in to ensure zoo avoidance (and I do actually request this every year in my own budget cycle), but I am very "happy" to serve that role for now. As I said, this isn't Cavium's fault. They're a victim of their market success. I'm super excited to see them shipping systems on which we want to run general purpose Operating Systems. At the same time, as with *every* other ARM vendor, I will keep my eye out for compliance concerns and I will act to ensure that these things are flagged. >>> I agree, we should not be >>> using unregistered vendor prefixes, e.g., the "THRX" added by >>> 44f22bd91e88 ("PCI: Add MCFG quirks for Cavium ThunderX pass2.x host >>> controller"). I'm sorry I merged that without doing the due >>> diligence. Oh, it's difficult for you to police everything without having every possible platform in front of you, with every firmware, and a lot of time that none of us have :) >> Honestly, it is me who is responsible for this since I submitted >> the patch. You're great Tomasz. You've done awesome stuff over the past few months. I want to be /very/ clear that none of my pushback is directed at you, David, or any specific individual. You're doing great. I'm going to make sure that alignment happens in this industry because I need to ship a "common core" single binary build OS that supports "ARM servers". That means every server, from every vendor. Not all are going to be "certified" to run RHEL, but all servers must be capable of booting and working with upstream kernels, and running *ANY* Linux distro, so that customers and users who try an "ARM server" from a random ODM don't get upset. There will be no zoo. There will only be "upstream first" driven development and the distros will learn to consume only from upstream. They won't produce hacked up nonsense with patches to support platforms that aren't upstream. > Yes. After all this back and forth, Cavium has decided to deploy > firmware with "CAVxxx" as _HID. Great. How about a stable backport for Greg K-H? I want to make sure that everyone running "upstream" has a chance of booting. > The deciding factor was that the prefix is already registered and there > are probably fewer than 10 systems deployed with the experimental and > erroneous "THRXxxx" value. Neither option (switching the kernel to "CAVxxx", > or changing the firmware to use "THRXxxx") was without its drawbacks. Agree. Let's pick a solution and learn for the future. I know you know this, but for everyone else (especially ARM vendors who follow): The dirty secret to server is that we have software we ship, and hardware that ships separately. The software lives for years. It's easier to change the hardware than software that has already shipped. This is where very rich and featurefull firmware comes in. The platform is defined by the firmware, which should provide a fully standard interface that is SET IN STONE. It's so utterly bulletproof that it's both forward and backward compatible. Going forward, all of the ARM vendors are going to have utterly bulletproof server platforms with an amazing level of joined up cohesion in terms of tracking changes on the software and hardware side in terms of the platform firmware gluing it together. It's a dirty secret that x86 teaches us, and we're going to play exactly the same model out again (this has been the evil plan for many many years). But to do it right requires that we are very very careful in connecting dots between the platform pieces. Thanks, Jon.
On Wed, Mar 22, 2017 at 10:28:27AM -0400, Jon Masters wrote: > On 03/21/2017 10:56 AM, David Daney wrote: > > Yes. After all this back and forth, Cavium has decided to deploy > > firmware with "CAVxxx" as _HID. > > Great. How about a stable backport for Greg K-H? I want to make sure > that everyone running "upstream" has a chance of booting. The first patch, fee4d813850c ("PCI: Use Cavium assigned hardware ID for ThunderX host controller"), has a stable tag already. The second, 1dc94a38af89 ("PCI: Add legacy firmware support for Cavium ThunderX host controller"), does not, but I can easily add it if needed. Bjorn
On 03/22/2017 10:48 AM, Bjorn Helgaas wrote: > On Wed, Mar 22, 2017 at 10:28:27AM -0400, Jon Masters wrote: >> On 03/21/2017 10:56 AM, David Daney wrote: > >>> Yes. After all this back and forth, Cavium has decided to deploy >>> firmware with "CAVxxx" as _HID. >> >> Great. How about a stable backport for Greg K-H? I want to make sure >> that everyone running "upstream" has a chance of booting. > > The first patch, fee4d813850c ("PCI: Use Cavium assigned hardware ID for > ThunderX host controller"), has a stable tag already. Thanks - I saw that after I mailed. > The second, 1dc94a38af89 ("PCI: Add legacy firmware support for Cavium > ThunderX host controller"), does not, but I can easily add it if needed. I think that would be ideal. There is firmware out in the wild that has neither identifier in it (for example, a bunch of folks in the office bought platforms recently that don't boot upstream kernels). My guys are used to just taking an upstream kernel and using that, for development, and I have no intention of having the broader RH org do any different from what they would do with an x86 box ;) Jon.
On Wed, Mar 22, 2017 at 12:25:39PM -0400, Jon Masters wrote: > On 03/22/2017 10:48 AM, Bjorn Helgaas wrote: > > On Wed, Mar 22, 2017 at 10:28:27AM -0400, Jon Masters wrote: > >> On 03/21/2017 10:56 AM, David Daney wrote: > > > >>> Yes. After all this back and forth, Cavium has decided to deploy > >>> firmware with "CAVxxx" as _HID. > >> > >> Great. How about a stable backport for Greg K-H? I want to make sure > >> that everyone running "upstream" has a chance of booting. > > > > The first patch, fee4d813850c ("PCI: Use Cavium assigned hardware ID for > > ThunderX host controller"), has a stable tag already. > > Thanks - I saw that after I mailed. > > > The second, 1dc94a38af89 ("PCI: Add legacy firmware support for Cavium > > ThunderX host controller"), does not, but I can easily add it if needed. > > I think that would be ideal. There is firmware out in the wild that > has neither identifier in it (for example, a bunch of folks in the > office bought platforms recently that don't boot upstream kernels). > My guys are used to just taking an upstream kernel and using that, > for development, and I have no intention of having the broader RH > org do any different from what they would do with an x86 box ;) I added the same stable tag (v4.10+) to the second patch.
On Wed, Mar 22, 2017 at 11:34:00AM -0500, Bjorn Helgaas wrote: > On Wed, Mar 22, 2017 at 12:25:39PM -0400, Jon Masters wrote: > > On 03/22/2017 10:48 AM, Bjorn Helgaas wrote: > > > On Wed, Mar 22, 2017 at 10:28:27AM -0400, Jon Masters wrote: > > >> On 03/21/2017 10:56 AM, David Daney wrote: > > > > > >>> Yes. After all this back and forth, Cavium has decided to deploy > > >>> firmware with "CAVxxx" as _HID. > > >> > > >> Great. How about a stable backport for Greg K-H? I want to make sure > > >> that everyone running "upstream" has a chance of booting. > > > > > > The first patch, fee4d813850c ("PCI: Use Cavium assigned hardware ID for > > > ThunderX host controller"), has a stable tag already. > > > > Thanks - I saw that after I mailed. > > > > > The second, 1dc94a38af89 ("PCI: Add legacy firmware support for Cavium > > > ThunderX host controller"), does not, but I can easily add it if needed. > > > > I think that would be ideal. There is firmware out in the wild that > > has neither identifier in it (for example, a bunch of folks in the > > office bought platforms recently that don't boot upstream kernels). > > My guys are used to just taking an upstream kernel and using that, > > for development, and I have no intention of having the broader RH > > org do any different from what they would do with an x86 box ;) > > I added the same stable tag (v4.10+) to the second patch. I moved these from pci/host-thunder (targeted for v4.12) to for-linus so we can get these in v4.11.
Thanks :) :) :)
diff --git a/drivers/pci/host/pci-thunder-pem.c b/drivers/pci/host/pci-thunder-pem.c index af722eb..aec30b8 100644 --- a/drivers/pci/host/pci-thunder-pem.c +++ b/drivers/pci/host/pci-thunder-pem.c @@ -331,7 +331,7 @@ static int thunder_pem_acpi_init(struct pci_config_window *cfg) if (!res_pem) return -ENOMEM; - ret = acpi_get_rc_resources(dev, "THRX0002", root->segment, res_pem); + ret = acpi_get_rc_resources(dev, "PNP0A08", root->segment, res_pem); if (ret) { dev_err(dev, "can't get rc base address\n"); return ret;
This patch is to address PEM initialization issue which causes network issues. It is necessary to search for _HID:PNP0A08 while requesting PEM resources via ACPI instead of "THRX0002". Signed-off-by: Vadim Lomovtsev <Vadim.Lomovtsev@caviumnetworks.com> --- drivers/pci/host/pci-thunder-pem.c | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-)