Message ID | 20190718182617.6964-1-marcel.p.bocu@gmail.com (mailing list archive) |
---|---|
State | Superseded |
Headers | show |
Series | [1/2] x86/amd_nb: Add PCI device IDs for family 17h, model 71h | expand |
On 18/07/2019 22:33, Guenter Roeck wrote: > On Thu, Jul 18, 2019 at 09:26:16PM +0300, Marcel Bocu wrote: >> The AMD Ryzen gen 3 processors came with a different PCI IDs for the >> function 3 & 4 which are used to access the SMN interface. The root >> PCI address however remained at the same address as the model 30h. >> >> Adding the F3/F4 PCI IDs respectively to the misc and link ids appear >> to be sufficient for k10temp, so let's add them and follow up on the >> patch if other functions need more tweaking. >> >> Signed-off-by: Marcel Bocu <marcel.p.bocu@gmail.com> >> Tested-by: Marcel Bocu <marcel.p.bocu@gmail.com> >> > > How is this version of the patch series different to the first version ? They seem pretty much identical except for the macro's name (71h vs 70h) and the fact that I already Cc:ed the x86 crew. I had checked this weekend if there were patches already for this, but I guess Vicki sent his patches right after I checked. Sorry for the noise! Could anyone add Vicki Pfau (I don't have his email address because I subscribed to this list after he sent his patches), and then we can ask him if he already submited his patches to x86 or not. In any case, whatever patchset gets selected for inclusion, I suggest we both sign off on the commit (I do not care about authorship). I will anyway try to follow up with other patches to access the chipset's temperature and the fan speed. Sorry again for the noise! Marcel
Hi Marcel, On 7/19/19 12:40 AM, Marcel Bocu wrote: > On 18/07/2019 22:33, Guenter Roeck wrote: >> On Thu, Jul 18, 2019 at 09:26:16PM +0300, Marcel Bocu wrote: >>> The AMD Ryzen gen 3 processors came with a different PCI IDs for the >>> function 3 & 4 which are used to access the SMN interface. The root >>> PCI address however remained at the same address as the model 30h. >>> >>> Adding the F3/F4 PCI IDs respectively to the misc and link ids appear >>> to be sufficient for k10temp, so let's add them and follow up on the >>> patch if other functions need more tweaking. >>> >>> Signed-off-by: Marcel Bocu <marcel.p.bocu@gmail.com> >>> Tested-by: Marcel Bocu <marcel.p.bocu@gmail.com> >>> >> >> How is this version of the patch series different to the first version ? > > They seem pretty much identical except for the macro's name (71h vs 70h) Normally (or at least so far) the device ID specifications are for a group of models, not just for one chip in the group. Traditionally, AMD uses the same PCI ID for model numbers 7xh, so 70h would probably be more appropriate. Of course, that is hard to say without documentation from AMD, and I don't see anything published for Ryzen 3000. > and the fact that I already Cc:ed the x86 crew. I had checked this > weekend if there were patches already for this, but I guess Vicki sent > his patches right after I checked. Sorry for the noise! > > Could anyone add Vicki Pfau (I don't have his email address because I > subscribed to this list after he sent his patches), and then we can ask > him if he already submited his patches to x86 or not. > > In any case, whatever patchset gets selected for inclusion, I suggest we > both sign off on the commit (I do not care about authorship). I will > anyway try to follow up with other patches to access the chipset's > temperature and the fan speed. > Wouldn't that require patches to the Super-IO chip on your board ? That seems orthogonal to this patch series (and to the R3000 CPU). > Sorry again for the noise! Sorry myself, I didn't realize that the patches were from different people since they looked similar. Guenter
On 19/07/2019 16:27, Guenter Roeck wrote: > Hi Marcel, > > On 7/19/19 12:40 AM, Marcel Bocu wrote: >> On 18/07/2019 22:33, Guenter Roeck wrote: >>> On Thu, Jul 18, 2019 at 09:26:16PM +0300, Marcel Bocu wrote: >>>> The AMD Ryzen gen 3 processors came with a different PCI IDs for the >>>> function 3 & 4 which are used to access the SMN interface. The root >>>> PCI address however remained at the same address as the model 30h. >>>> >>>> Adding the F3/F4 PCI IDs respectively to the misc and link ids appear >>>> to be sufficient for k10temp, so let's add them and follow up on the >>>> patch if other functions need more tweaking. >>>> >>>> Signed-off-by: Marcel Bocu <marcel.p.bocu@gmail.com> >>>> Tested-by: Marcel Bocu <marcel.p.bocu@gmail.com> >>>> >>> >>> How is this version of the patch series different to the first version ? >> >> They seem pretty much identical except for the macro's name (71h vs 70h) > > Normally (or at least so far) the device ID specifications are for a group > of models, not just for one chip in the group. Traditionally, AMD uses the > same PCI ID for model numbers 7xh, so 70h would probably be more appropriate. > Of course, that is hard to say without documentation from AMD, and I don't > see anything published for Ryzen 3000. Makes sense. Thanks for giving me some historical background! > >> and the fact that I already Cc:ed the x86 crew. I had checked this >> weekend if there were patches already for this, but I guess Vicki sent >> his patches right after I checked. Sorry for the noise! >> >> Could anyone add Vicki Pfau (I don't have his email address because I >> subscribed to this list after he sent his patches), and then we can ask >> him if he already submited his patches to x86 or not. >> >> In any case, whatever patchset gets selected for inclusion, I suggest we >> both sign off on the commit (I do not care about authorship). I will >> anyway try to follow up with other patches to access the chipset's >> temperature and the fan speed. >> > > Wouldn't that require patches to the Super-IO chip on your board ? > That seems orthogonal to this patch series (and to the R3000 CPU). Yes, but what I meant is that I don't mind not having authorship on these patches since most of my work will be on adding new features. Sorry for not being clear. > >> Sorry again for the noise! > > Sorry myself, I didn't realize that the patches were from different people > since they looked similar. I can't blame you for that! Could anyone in this thread add Vicki Pfau to figure out what's the best course of action? We could merge his patches (if he contacted the x86 crew), make a v2 of mine (71h -> 70h, add his Signed-off-by?) and merge, or some other option? Marcel > > Guenter >
On Fri, Jul 19, 2019 at 06:30:56PM +0300, Marcel Bocu wrote: > On 19/07/2019 16:27, Guenter Roeck wrote: > > Hi Marcel, > > > > On 7/19/19 12:40 AM, Marcel Bocu wrote: > >> On 18/07/2019 22:33, Guenter Roeck wrote: > >>> On Thu, Jul 18, 2019 at 09:26:16PM +0300, Marcel Bocu wrote: > >>>> The AMD Ryzen gen 3 processors came with a different PCI IDs for the > >>>> function 3 & 4 which are used to access the SMN interface. The root > >>>> PCI address however remained at the same address as the model 30h. > >>>> > >>>> Adding the F3/F4 PCI IDs respectively to the misc and link ids appear > >>>> to be sufficient for k10temp, so let's add them and follow up on the > >>>> patch if other functions need more tweaking. > >>>> > >>>> Signed-off-by: Marcel Bocu <marcel.p.bocu@gmail.com> > >>>> Tested-by: Marcel Bocu <marcel.p.bocu@gmail.com> > >>>> > >>> > >>> How is this version of the patch series different to the first version ? > >> > >> They seem pretty much identical except for the macro's name (71h vs 70h) > > > > Normally (or at least so far) the device ID specifications are for a group > > of models, not just for one chip in the group. Traditionally, AMD uses the > > same PCI ID for model numbers 7xh, so 70h would probably be more appropriate. > > Of course, that is hard to say without documentation from AMD, and I don't > > see anything published for Ryzen 3000. > > Makes sense. Thanks for giving me some historical background! > > > > >> and the fact that I already Cc:ed the x86 crew. I had checked this > >> weekend if there were patches already for this, but I guess Vicki sent > >> his patches right after I checked. Sorry for the noise! > >> > >> Could anyone add Vicki Pfau (I don't have his email address because I > >> subscribed to this list after he sent his patches), and then we can ask > >> him if he already submited his patches to x86 or not. > >> > >> In any case, whatever patchset gets selected for inclusion, I suggest we > >> both sign off on the commit (I do not care about authorship). I will > >> anyway try to follow up with other patches to access the chipset's > >> temperature and the fan speed. > >> > > > > Wouldn't that require patches to the Super-IO chip on your board ? > > That seems orthogonal to this patch series (and to the R3000 CPU). > > Yes, but what I meant is that I don't mind not having authorship on > these patches since most of my work will be on adding new features. > Sorry for not being clear. > > > > >> Sorry again for the noise! > > > > Sorry myself, I didn't realize that the patches were from different people > > since they looked similar. > > I can't blame you for that! Could anyone in this thread add Vicki Pfau > to figure out what's the best course of action? > Copied. Can the two of you sort this out ? Note that Vicki's patch series is in patchwork: https://patchwork.kernel.org/patch/11043277/ https://patchwork.kernel.org/patch/11043271/ > We could merge his patches (if he contacted the x86 crew), make a v2 of > mine (71h -> 70h, add his Signed-off-by?) and merge, or some other option? > Either case, we'll need feedback from x86 maintainers. They are not exactly happy if anyone pushes a patch into arch/x86 without their approval. Guenter > Marcel > > > > > Guenter > > >
On Fri, 19 Jul 2019, Guenter Roeck wrote: > > We could merge his patches (if he contacted the x86 crew), make a v2 of > > mine (71h -> 70h, add his Signed-off-by?) and merge, or some other option? > > > Either case, we'll need feedback from x86 maintainers. They are not exactly > happy if anyone pushes a patch into arch/x86 without their approval. Adding those PCI ids looks straight forward. So feel free to route them through hwmon with: Acked-by: Thomas Gleixner <tglx@linutronix.de>
On Mon, Jul 22, 2019 at 10:59:32AM +0200, Thomas Gleixner wrote: > On Fri, 19 Jul 2019, Guenter Roeck wrote: > > > We could merge his patches (if he contacted the x86 crew), make a v2 of > > > mine (71h -> 70h, add his Signed-off-by?) and merge, or some other option? > > > > > Either case, we'll need feedback from x86 maintainers. They are not exactly > > happy if anyone pushes a patch into arch/x86 without their approval. > > Adding those PCI ids looks straight forward. So feel free to route them > through hwmon with: ... but before you do that, let's wait for Brian to confirm the proper model range. Thx.
On Mon, Jul 22, 2019 at 11:12:45AM +0200, Borislav Petkov wrote: > On Mon, Jul 22, 2019 at 10:59:32AM +0200, Thomas Gleixner wrote: > > On Fri, 19 Jul 2019, Guenter Roeck wrote: > > > > We could merge his patches (if he contacted the x86 crew), make a v2 of > > > > mine (71h -> 70h, add his Signed-off-by?) and merge, or some other option? > > > > > > > Either case, we'll need feedback from x86 maintainers. They are not exactly > > > happy if anyone pushes a patch into arch/x86 without their approval. > > > > Adding those PCI ids looks straight forward. So feel free to route them > > through hwmon with: > > ... but before you do that, let's wait for Brian to confirm the proper > model range. > > Thx. > > -- > Regards/Gruss, > Boris. > > ECO tip #101: Trim your mails when you reply. > -- Generally we use X0-XF for a model of processors. Starting at 70h would be preferred. The PCI devices match our internal documentation for what it's worth. I was going to test these patches but the system I borrowed was requisitioned and hasn't been returned yet so I can't test them now, but I don't want to delay these anymore. Assuming the change from 71h -> 70h, feel free to add: Acked-by: Brian Woods <brian.woods@amd.com>
On Mon, Jul 22, 2019 at 04:04:24PM +0000, Woods, Brian wrote: > On Mon, Jul 22, 2019 at 11:12:45AM +0200, Borislav Petkov wrote: > > On Mon, Jul 22, 2019 at 10:59:32AM +0200, Thomas Gleixner wrote: > > > On Fri, 19 Jul 2019, Guenter Roeck wrote: > > > > > We could merge his patches (if he contacted the x86 crew), make a v2 of > > > > > mine (71h -> 70h, add his Signed-off-by?) and merge, or some other option? > > > > > > > > > Either case, we'll need feedback from x86 maintainers. They are not exactly > > > > happy if anyone pushes a patch into arch/x86 without their approval. > > > > > > Adding those PCI ids looks straight forward. So feel free to route them > > > through hwmon with: > > > > ... but before you do that, let's wait for Brian to confirm the proper > > model range. > > > > Thx. > > > > -- > > Regards/Gruss, > > Boris. > > > > ECO tip #101: Trim your mails when you reply. > > -- > > Generally we use X0-XF for a model of processors. Starting at 70h would > be preferred. > > The PCI devices match our internal documentation for what it's worth. I > was going to test these patches but the system I borrowed was > requisitioned and hasn't been returned yet so I can't test them now, but > I don't want to delay these anymore. With 3900X, and the series applied, I get: k10temp-pci-00c3 Adapter: PCI adapter Tdie: +45.1°C (high = +70.0°C) Tctl: +45.1°C which looks about right. Interesting, though. I thought there are two CPU dies on this chip. I guess the temperature sensor is on the IO block ? If so, are there additional sensors on the CPU dies ? Guenter > > Assuming the change from 71h -> 70h, feel free to add: > Acked-by: Brian Woods <brian.woods@amd.com> > > -- > Brian Woods
On Mon, Jul 22, 2019 at 09:51:05AM -0700, Guenter Roeck wrote: > > With 3900X, and the series applied, I get: > > k10temp-pci-00c3 > Adapter: PCI adapter > Tdie: +45.1°C (high = +70.0°C) > Tctl: +45.1°C > > which looks about right. > > Interesting, though. I thought there are two CPU dies on this chip. > I guess the temperature sensor is on the IO block ? If so, are there > additional sensors on the CPU dies ? > > Guenter > That's good to know. Thanks for testing it. What happens is the IOD takes the max temperature of the chiplets in the package and presents that as the temperature of the package. It works the same way as it does in Rome (server parts). For better or worse, you just get the max temperature of the chiplets rather than the temperatures of the individual chiplets.
On 22/07/2019 20:39, Woods, Brian wrote: > On Mon, Jul 22, 2019 at 09:51:05AM -0700, Guenter Roeck wrote: >> >> With 3900X, and the series applied, I get: >> >> k10temp-pci-00c3 >> Adapter: PCI adapter >> Tdie: +45.1°C (high = +70.0°C) >> Tctl: +45.1°C >> >> which looks about right. >> >> Interesting, though. I thought there are two CPU dies on this chip. >> I guess the temperature sensor is on the IO block ? If so, are there >> additional sensors on the CPU dies ? >> >> Guenter >> > > That's good to know. Thanks for testing it. > > What happens is the IOD takes the max temperature of the chiplets in the > package and presents that as the temperature of the package. It works > the same way as it does in Rome (server parts). For better or worse, > you just get the max temperature of the chiplets rather than the > temperatures of the individual chiplets. > Out of curiosity, is it the maximum temperature of all chiplets, or just the non-powergated ones? Because this might explain why I get so much variance in the idle temperature (40 <-> 50°C in a matter of a second with a mostly-idle processor). This variance is visible on linux, but not at all on the firmware's configuration interface. One other option would be the stock fan not being tight-enough... but apparently quite a few people have the issue. I'll try tightening it! Marcel
I'm getting similar variance. Compiling Linux seems to spike the temperature above 70, even with a new CPU cooler, so I'm wondering if there might be an offset I'm missing. It may just be the fan being too slow (going to be reconfiguring the BIOS settings today). The reason I haven't replied sooner is because I've been busy with a new job, and I wasn't sure if my phone would send a properly (un)formatted email. I'm fine with whosever patch gets in, so long as temperature reading works on my machine. Also, for what it's worth, I'm not a "he". Vicki is generally a female name. Vicki >> On Jul 22, 2019, at 11:04 AM, Marcel Bocu <marcel.p.bocu@gmail.com> wrote: >> >>> On 22/07/2019 20:39, Woods, Brian wrote: >>> On Mon, Jul 22, 2019 at 09:51:05AM -0700, Guenter Roeck wrote: >>> >>> With 3900X, and the series applied, I get: >>> >>> k10temp-pci-00c3 >>> Adapter: PCI adapter >>> Tdie: +45.1°C (high = +70.0°C) >>> Tctl: +45.1°C >>> >>> which looks about right. >>> >>> Interesting, though. I thought there are two CPU dies on this chip. >>> I guess the temperature sensor is on the IO block ? If so, are there >>> additional sensors on the CPU dies ? >>> >>> Guenter >> >> That's good to know. Thanks for testing it. >> >> What happens is the IOD takes the max temperature of the chiplets in the >> package and presents that as the temperature of the package. It works >> the same way as it does in Rome (server parts). For better or worse, >> you just get the max temperature of the chiplets rather than the >> temperatures of the individual chiplets. > > Out of curiosity, is it the maximum temperature of all chiplets, or just > the non-powergated ones? Because this might explain why I get so much > variance in the idle temperature (40 <-> 50°C in a matter of a second > with a mostly-idle processor). This variance is visible on linux, but > not at all on the firmware's configuration interface. > > One other option would be the stock fan not being tight-enough... but > apparently quite a few people have the issue. I'll try tightening it! > > Marcel
On 22/07/2019 21:11, Vicki Pfau wrote: > I'm getting similar variance. Compiling Linux seems to spike the temperature above 70, even with a new CPU cooler, so I'm wondering if there might be an offset I'm missing. It may just be the fan being too slow (going to be reconfiguring the BIOS settings today). Thanks for the information! Compiling the kernel gets me to 83°C with the stock fan (Ryzen 3700X), and I think I get thermally throttled at this point. > > The reason I haven't replied sooner is because I've been busy with a new job, and I wasn't sure if my phone would send a properly (un)formatted email. I see! No worries, and good luck with your new job! > I'm fine with whosever patch gets in, so long as temperature reading works on my machine. Given that our patches were looking exactly the same, it's safe to assume it will be fine. > > Also, for what it's worth, I'm not a "he". Vicki is generally a female name. I did not know this. Sorry for assuming your gender, and thanks for correcting me! Marcel > > Vicki > >>> On Jul 22, 2019, at 11:04 AM, Marcel Bocu <marcel.p.bocu@gmail.com> wrote: >>> >>>> On 22/07/2019 20:39, Woods, Brian wrote: >>>> On Mon, Jul 22, 2019 at 09:51:05AM -0700, Guenter Roeck wrote: >>>> >>>> With 3900X, and the series applied, I get: >>>> >>>> k10temp-pci-00c3 >>>> Adapter: PCI adapter >>>> Tdie: +45.1°C (high = +70.0°C) >>>> Tctl: +45.1°C >>>> >>>> which looks about right. >>>> >>>> Interesting, though. I thought there are two CPU dies on this chip. >>>> I guess the temperature sensor is on the IO block ? If so, are there >>>> additional sensors on the CPU dies ? >>>> >>>> Guenter >>> >>> That's good to know. Thanks for testing it. >>> >>> What happens is the IOD takes the max temperature of the chiplets in the >>> package and presents that as the temperature of the package. It works >>> the same way as it does in Rome (server parts). For better or worse, >>> you just get the max temperature of the chiplets rather than the >>> temperatures of the individual chiplets. >> >> Out of curiosity, is it the maximum temperature of all chiplets, or just >> the non-powergated ones? Because this might explain why I get so much >> variance in the idle temperature (40 <-> 50°C in a matter of a second >> with a mostly-idle processor). This variance is visible on linux, but >> not at all on the firmware's configuration interface. >> >> One other option would be the stock fan not being tight-enough... but >> apparently quite a few people have the issue. I'll try tightening it! >> >> Marcel >
On Tue, Jul 23, 2019 at 12:09:16AM +0300, Marcel Bocu wrote: > On 22/07/2019 21:11, Vicki Pfau wrote: > > I'm getting similar variance. Compiling Linux seems to spike the temperature above 70, even with a new CPU cooler, so I'm wondering if there might be an offset I'm missing. It may just be the fan being too slow (going to be reconfiguring the BIOS settings today). > > Thanks for the information! Compiling the kernel gets me to 83°C with > the stock fan (Ryzen 3700X), and I think I get thermally throttled at > this point. > Same here, with 3900X. The cores run at ~4GHz each, with stock cooler. Idle temperature seems to be around 35C. Guenter
diff --git a/arch/x86/kernel/amd_nb.c b/arch/x86/kernel/amd_nb.c index d63e63b7d1d9..297dc50da3bd 100644 --- a/arch/x86/kernel/amd_nb.c +++ b/arch/x86/kernel/amd_nb.c @@ -21,6 +21,7 @@ #define PCI_DEVICE_ID_AMD_17H_DF_F4 0x1464 #define PCI_DEVICE_ID_AMD_17H_M10H_DF_F4 0x15ec #define PCI_DEVICE_ID_AMD_17H_M30H_DF_F4 0x1494 +#define PCI_DEVICE_ID_AMD_17H_M71H_DF_F4 0x1444 /* Protect the PCI config register pairs used for SMN and DF indirect access. */ static DEFINE_MUTEX(smn_mutex); @@ -50,6 +51,7 @@ const struct pci_device_id amd_nb_misc_ids[] = { { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M10H_DF_F3) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M30H_DF_F3) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_CNB17H_F3) }, + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M71H_DF_F3) }, {} }; EXPORT_SYMBOL_GPL(amd_nb_misc_ids); @@ -63,6 +65,7 @@ static const struct pci_device_id amd_nb_link_ids[] = { { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_DF_F4) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M10H_DF_F4) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M30H_DF_F4) }, + { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_17H_M71H_DF_F4) }, { PCI_DEVICE(PCI_VENDOR_ID_AMD, PCI_DEVICE_ID_AMD_CNB17H_F4) }, {} }; diff --git a/include/linux/pci_ids.h b/include/linux/pci_ids.h index c842735a4f45..5f99cfac9c06 100644 --- a/include/linux/pci_ids.h +++ b/include/linux/pci_ids.h @@ -548,6 +548,7 @@ #define PCI_DEVICE_ID_AMD_17H_DF_F3 0x1463 #define PCI_DEVICE_ID_AMD_17H_M10H_DF_F3 0x15eb #define PCI_DEVICE_ID_AMD_17H_M30H_DF_F3 0x1493 +#define PCI_DEVICE_ID_AMD_17H_M71H_DF_F3 0x1443 #define PCI_DEVICE_ID_AMD_CNB17H_F3 0x1703 #define PCI_DEVICE_ID_AMD_LANCE 0x2000 #define PCI_DEVICE_ID_AMD_LANCE_HOME 0x2001