Message ID | 20190531121443.30694-1-suganath-prabu.subramani@broadcom.com (mailing list archive) |
---|---|
Series | mpt3sas: Aero/Sea HBA feature addition |

Suganath,

I applied this series to 5.3/scsi-queue.

However, I remain unconvinced of the merits of the config page putback. Why even bother if a controller reset causes the defaults to be loaded from NVRAM?

Also, triggering on X86 for selecting performance mode seems questionable. I would like to see a follow-on patch that comes up with a better heuristic.

> Suganath,
>
> I applied this series to 5.3/scsi-queue.
>
> However, I remain unconvinced of the merits of the config page putback. Why
> even bother if a controller reset causes the defaults to be loaded from
> NVRAM?
>
> Also, triggering on X86 for selecting performance mode seems questionable. I
> would like to see a follow-on patch that comes up with a better heuristic.

Martin,

AMD EPYC is not efficient w.r.t. QPI transactions. I tested performance on an AMD EPYC 7601 system, which has 128 logical CPUs in total. The Aero/Sea controller supports at most 128 MSI-X vectors. In the good case we have a 1:1 CPU-to-MSI-X mapping, and I get 2.4M IOPS.

To simulate the performance issue, I reduced the controller's MSI-X vector count to 64, which makes the CPU-to-MSI-X mapping 2:1. This indirectly forces some I/Os to be completed on a remote CPU (via call_function_single_interrupt). In this case I get 1.7M IOPS. The same test on Intel architecture gives a better result (negligible performance impact).

This patch set maps high iops queues (queues with interrupt coalescing turned on) to the local NUMA node. The high iops queue count is limited, and these queues depend on QPI for I/O completion. We have enabled this feature only for Intel architecture, where we have seen an improvement. Not having this feature is not bad, but enabling it may have a negative impact where QPI overhead is high (as on AMD).

Kashyap

> --
> Martin K. Petersen Oracle Linux Engineering
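
The vector arithmetic in this message can be sketched with a small model. This is only an illustration: `spread_cpus`, `cpus_per_vector`, and `relative_drop` are hypothetical helper names, and the contiguous, even grouping is an assumption made for the sketch, not the kernel's actual affinity-spreading algorithm. The IOPS figures are the ones reported above.

```python
from collections import Counter


def spread_cpus(num_cpus, num_vectors):
    """Map each CPU index to a vector index in contiguous, even groups.

    Simplified model (an assumption for illustration): CPUs are split
    into equally sized runs, one run per vector.
    """
    if num_cpus <= 0 or num_vectors <= 0:
        raise ValueError("num_cpus and num_vectors must be positive")
    vectors = min(num_vectors, num_cpus)  # never more vectors than CPUs
    return [cpu * vectors // num_cpus for cpu in range(num_cpus)]


def cpus_per_vector(mapping):
    """Count how many CPUs share each vector."""
    return Counter(mapping)


def relative_drop(baseline_iops, reduced_iops):
    """Fractional throughput loss relative to the baseline."""
    return (baseline_iops - reduced_iops) / baseline_iops


# 128 logical CPUs, 128 vectors: every CPU has its own vector (1:1).
one_to_one = spread_cpus(128, 128)

# 128 logical CPUs, 64 vectors: two CPUs share each vector (2:1), so in
# this model one CPU of each pair can receive its completion remotely.
two_to_one = spread_cpus(128, 64)

# Reported figures: 2.4M IOPS at 1:1 versus 1.7M IOPS at 2:1.
drop = relative_drop(2.4e6, 1.7e6)
print(f"max CPUs per vector at 2:1: {max(cpus_per_vector(two_to_one).values())}")
print(f"relative IOPS drop: {drop:.1%}")
```

Under these assumptions, the reported numbers correspond to a loss of roughly 29% when half of the CPUs no longer have a dedicated vector.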

Kashyap,

> AMD EPYC is not efficient w.r.t QPI transaction.
> [...]
> Same test on Intel architecture provides better result

Heuristics are always hard.

However, you are making assumptions based on observed performance of current Intel offerings vs. current AMD offerings. This results in what is inevitably going to be a short-lived heuristic in the kernel. Things could easily be reversed in next generation platforms from these vendors.

So while I appreciate that the logic works given the machines you are currently testing, I think CPU manufacturer is a horrible heuristic. You are stating "This will be the right choice for all future processors manufactured by Intel". That's a bit of a leap of faith.

Instead of predicting the future I prefer to make decisions based on things we know. Measured negative impact on current EPYC family, for instance. That's a fairly well-defined and narrow scope.

That said, I am still not a big fan of platform-specific tweaks in drivers. While I prefer the kernel to do the right thing out of the box, I think the module parameter is probably the better choice in this case.

On Fri, Jun 7, 2019 at 10:19 PM Martin K. Petersen <martin.petersen@oracle.com> wrote:

> Kashyap,
>
> > AMD EPYC is not efficient w.r.t QPI transaction.
> > [...]
> > Same test on Intel architecture provides better result
>
> Heuristics are always hard.
>
> However, you are making assumptions based on observed performance of
> current Intel offerings vs. current AMD offerings. This results in what
> is inevitably going to be a short-lived heuristic in the kernel. Things
> could easily be reversed in next generation platforms from these
> vendors.
>
> So while I appreciate that the logic works given the machines you are
> currently testing, I think CPU manufacturer is a horrible heuristic. You
> are stating "This will be the right choice for all future processors
> manufactured by Intel". That's a bit of a leap of faith.
>
> Instead of predicting the future I prefer to make decisions based on
> things we know. Measured negative impact on current EPYC family, for
> instance. That's a fairly well-defined and narrow scope.
>
> That said, I am still not a big fan of platform-specific tweaks in
> drivers. While I prefer the kernel to do the right thing out of the box,
> I think the module parameter is probably the better choice in this case.

Martin,

If we decide to remove the CPU arch check later, the default driver behavior will be unnecessarily complex to explain, since we may end up with two driver behaviors. We are going to remove the CPU architecture detection logic. It is good to have the module-parameter-based dependency from day one. We will send the relevant patch soon.

Kashyap

> --
> Martin K. Petersen Oracle Linux Engineering
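
The difference between the two approaches discussed in this thread can be shown with a toy model. Everything here is hypothetical: `Platform`, `high_iops_by_vendor`, `high_iops_by_parameter`, and the `enable_high_iops` knob (including its default of off) are names and choices invented for illustration, not the driver's code or its eventual module parameter.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Platform:
    """Minimal description of a host (hypothetical, for illustration)."""
    vendor: str
    online_cpus: int
    msix_vectors: int


def high_iops_by_vendor(platform):
    """The heuristic questioned in the thread: decide by CPU vendor."""
    return platform.vendor.lower() == "intel"


def high_iops_by_parameter(platform, enable_high_iops=False):
    """Decide only from an explicit, administrator-set knob.

    The platform argument is deliberately ignored, so the behavior is
    identical on every machine and easy to document. The default of
    False is an assumption made for this sketch.
    """
    return bool(enable_high_iops)


intel_host = Platform(vendor="Intel", online_cpus=128, msix_vectors=128)
epyc_host = Platform(vendor="AMD", online_cpus=128, msix_vectors=128)

for host in (intel_host, epyc_host):
    print(host.vendor,
          "vendor heuristic:", high_iops_by_vendor(host),
          "| parameter (default):", high_iops_by_parameter(host),
          "| parameter (enabled):", high_iops_by_parameter(host, True))
```

With the vendor check, two otherwise identical hosts behave differently. With the parameter, the driver has a single documented default and the administrator opts in where measurements justify it.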