diff mbox

kernel 4.15.7/64bit, C3600 is unstable during heavy I/O on PCI

Message ID 20180317113655.GA30572@ls3530.fritz.box (mailing list archive)
State Superseded, archived
Headers show

Commit Message

Helge Deller March 17, 2018, 11:36 a.m. UTC
* Carlo Pisani <carlojpisani@gmail.com>:
> I have created and applied the following patch
> testing the kernel with heavy I/O seems now stable
> 
> my C3600 is still under testing, moving chunks of 500Mbyte and compiling gcc-v6
> 
> 
> http://93.55.217.0//wonderland/chunk_of/user/ivelegacy/happa-dev/hppa2_0001_HPMC_fix_my_v1.patch


This one seems wrong.
I think you just didn't hit a HPMC with your first patch, and as such
this patch has no influence...

Helge
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html

Comments

Grant Grundler March 17, 2018, 4:12 p.m. UTC | #1
Hi Helge,
just a nit on PCI terminology...

On Sat, Mar 17, 2018 at 12:36 PM, Helge Deller <deller@gmx.de> wrote:
> * Carlo Pisani <carlojpisani@gmail.com>:
>> I have created and applied the following patch
>> testing the kernel with heavy I/O seems now stable
>>
>> my C3600 is still under testing, moving chunks of 500Mbyte and compiling gcc-v6
>>
>>
>> http://93.55.217.0//wonderland/chunk_of/user/ivelegacy/happa-dev/hppa2_0001_HPMC_fix_my_v1.patch
>
> --- drivers/parisc/lba_pci.c    2018-01-28 22:20:33.000000000 +0100
> +++ drivers/parisc/lba_pci.c    2018-03-15 12:26:44.839894952 +0100
> @@ -1405,7 +1405,7 @@
>
>         /* Set HF mode as the default (vs. -1 mode). */
>          stat = READ_REG32(d->hba.base_addr + LBA_STAT_CTL);
> -       WRITE_REG32(stat | HF_ENABLE, d->hba.base_addr + LBA_STAT_CTL);
> +       WRITE_REG32(stat & ~HF_ENABLE, d->hba.base_addr + LBA_STAT_CTL);
>
>         /*
>         ** Writing a zero to STAT_CTL.rf (bit 0) will clear reset signal
>
> That's the patch from Kyle:
> https://www.spinics.net/lists/linux-parisc/msg01027.html
>
> which comes out of this mail thread:
> https://www.spinics.net/lists/linux-parisc/msg01024.html
> specifically with those notes:
> https://www.spinics.net/lists/linux-parisc/msg01026.html
>
> Citing here:
> "bus timeout" usually means we tried to read an address that doesn't
> respond. that is, nothing on the bus accepted the transaction for it,
> so it timed out and HPMC'd the box.

HF= Hard Fail on PCI "Master Abort".  "Master Abort" means the MMIO
transaction timed out - usually due to the device not responding to an
MMIO read.   We normally want HF to be enabled as you noted below in
order to get state of the CPU and which IO device it was trying to
access.

It's possible the "~0L" returned in SoftFail mode is being handled by
the driver OR the particular read that fails just doesn't matter.
Would have to see dmesg output if the driver ever complains about
invalid MMIO read data (~0L).

cheers,
grant

> what you really need is the IIR, and the address it tried to access
> (both the kernel vaddr which will be in the register, and the "system
> requester address" from the hpmc dump which will be the physical address
> mapped.
>
> not sure why the hpmc handler is getting skipped, that's a little weird.
>
> you can try hacking elroy to set softfail mode on that bus, which will
> result in a timeout on the pci bus to return -1 (like what x86 and most
> other architectures do) rather than hang the box, but it really likely
> means a driver bug.
>
>
>
> So, you change LBA to return -1 instead of faulting via HPMC which is
> of course one work-around to avoid the HPMC.
>
> But could you try to check the driver instead?
>
> You run this SATA controller:
> 01:05.0 RAID bus controller: VIA Technologies, Inc. VT6421 IDE RAID
>
> Can you maybe try to localize where in the drivers/ata/sata_via.c driver
> triggers the HPMC ?
>
>
> --- arch/parisc/kernel/hpmc.S   2018-01-28 22:20:33.000000000 +0100
> +++ arch/parisc/kernel/hpmc.S   2018-03-15 14:13:46.611969815 +0100
> @@ -308,4 +290,5 @@
>         .align 4
>         .export os_hpmc_size
>  os_hpmc_size:
> -       .word .os_hpmc_end-.os_hpmc
> +       /* .word .os_hpmc_end-.os_hpmc */
> +       .word (.os_hpmc_end - .os_hpmc) * 4 /* sizeof(u32) */
>
> This one seems wrong.
> I think you just didn't hit a HPMC with your first patch, and as such
> this patch has no influence...
>
> Helge
>
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Grant Grundler March 17, 2018, 5:05 p.m. UTC | #2
[adding lists back to CC since this is a public discussion]

On Sat, Mar 17, 2018 at 5:36 PM, Carlo Pisani <carlojpisani@gmail.com> wrote:
>> Would have to see dmesg output if the driver ever complains about
>> invalid MMIO read data (~0L).
>
> I repeat AGAIN
>
> I have tested different sATA controllers
> - via6421
> - SIL24
> - Adaptex 2410

Ah sorry - I missed that.

> none of them has never complained in dmesg ( i have the console
> redirected the serial port )

Thanks for confirming Carlo!

Having worked on mv7042 and SIL3124 (IIRC) driver support for almost 4
years, I can tell you most SATA drivers suck. Helge is most likely
correct that this is due to a SATA driver bug. Since Intel x86 (32 or
64bit) systems ONLY support "SoftFail" mode, many driver issues are
never exposed until something gets corrupted or HW totally wedges.

> and all of them showed the same behavior under heavy I/O

While the symptom (HPMC) looks the same, the details are likely
different for each card. It's been over 10 years but ISTR that "SER
PIM" command when entered at boot prompt will dump those details. Ah,
yes, that's correct:
   https://parisc.wiki.kernel.org/index.php/How_to_report_a_parisc-linux_kernel_problem#HPMC


BTW, HPMCs can also be due to devices DMAing to an invalid address
since the IOMMU is "strict" and requires all DMA be "mapped". IIRC,
these DMA failures are "imprecise events" meaning SER PIM is the only
way to determine what the target DMA address was. This is non-trivial
stuff to debug since it requires pretty deep understanding of the SATA
controller operations and how the driver works.

> the C3600 machine stops to work, and I see errors in the LCD
> without a line from the kernel on the serial console

Yes, HPMC due to IO errors (including DMA failures and PCI Master
Abort) will lock up all other IO including the serial console. SER PIM
is the only output that will lead to diagnosing the underlying driver
bug.

But in SoftFail mode, I was hoping one of the drivers might complain
about "invalid value" - but that is likely way too optimistic on my
part. The only drivers that look for ~0L on MMIO read are the ones
that support PCI hotplug (e.g. PCMCIA devices and some higher end PCI
IO devices which target data center/server market).

cheers,
grant

ps. I haven't looked at mvsata or sil3124 SATA controller code since
2010. But if you could find a MV7042 PCI (or PCI-X?) SATA controller,
I have more faith in that driver/HW combination partly because I
tested it pretty thoroughly at the time.
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John David Anglin March 17, 2018, 5:47 p.m. UTC | #3
Hi Grant,

On 2018-03-17 12:12 PM, Grant Grundler wrote:
> "Master Abort" means the MMIO
> transaction timed out - usually due to the device not responding to an
> MMIO read.
In lba_pci.c and sba_iommu.c, it says "BE WARNED: register writes are 
posted" and need
to be followed by a read.  It seems there are a some routines in these 
modules that have
writes that aren't followed by a read.  One is lba_wr_cfg(). Another 
might be the macro
LBA_CFG_RESTORE().  Are these okay?

It seems probable that the problem that Carlo is having is a conflict 
between devices.

Dave
Grant Grundler March 19, 2018, 7:30 p.m. UTC | #4
Hi John!

On Sat, Mar 17, 2018 at 6:47 PM, John David Anglin <dave.anglin@bell.net> wrote:
> Hi Grant,
>
> On 2018-03-17 12:12 PM, Grant Grundler wrote:
>>
>> "Master Abort" means the MMIO
>> transaction timed out - usually due to the device not responding to an
>> MMIO read.
>
> In lba_pci.c and sba_iommu.c, it says "BE WARNED: register writes are
> posted" and need to be followed by a read.  It seems there are a some
> routines in these modules that have writes that aren't followed by a read.
> One is lba_wr_cfg(). Another might be the macro
> LBA_CFG_RESTORE().  Are these okay?

I looked through the two examples you point out and I *think* both are
ok.   lba_wr_cfg() issues an mmio write and immediately after calls
LBA_CFG_MASTER_ABORT_CHECK() which performs an MMIO read from the same
base address.

The LBA_CFG_RESTORE() is "lazy" - the next MMIO read will flush those
three writes and (I believe) any following MMIO writes will still be
issued in order.

Typically, the problem with posted MMIO writes is DMA or other events
don't start until the MMIO write is "seen" by the device. This is
important when specific timing between MMIO transactions is required
OR some magic (e.g. device reset, updates Frame Buffer, etc) happens.

> It seems probable that the problem that Carlo is having is a conflict
> between devices.

Hrm. I don't know. I haven't yet looked at the latest dump that Carlo
helpfully provided as I'm still traveling. Why do you suspect this?

I'm skeptical about "conflict between devices" (due to lba_wr_cfg())
for two reasons:
1) configuration space accesses are usually not part of normal IO
device transaction processing.
2) I've nearly always found that PCI Master Aborts (on MMIO reads) are
usually just a symptom of something else going wrong and not the root
cause.

Typically, the issues I recall running into are around the drivers
hitting a corner case where the device is still performing DMA to an
address that gets unmapped by the driver.  This will wedge the IOMMU
(sba) and then following MMIO reads will generate an HPMC.

The hard part is to determine what the corner case is based on a DMA
address (as reported in SER PIM output). It requires deeper
understanding of the DMA programming for the given SATA controller
(driver directing HW what to do), how transaction completions are
reported (SATA controller HW) and handled (driver operation).

In the past, I've sorted several of these issues out for tg3 and tulip
NIC drivers and I can with confidence say that some issues still
remain in the tulip driver shutdown path. But I gave up on trying to
fix those and lost interest later.

cheers,
grant
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Grant Grundler March 19, 2018, 7:37 p.m. UTC | #5
[adding public lists back to CC since I can't sort this by myself any more]

On Sat, Mar 17, 2018 at 6:45 PM, Carlo Pisani <carlojpisani@gmail.com> wrote:
>> While the symptom (HPMC) looks the same, the details are likely
>> different for each card. It's been over 10 years but ISTR that "SER
>> PIM" command when entered at boot prompt will dump those details.
>
> this is an old log
> http://93.55.217.0/wonderland/chunk_of/user/ivelegacy/hppa-dev/ser-prim-2018-03-12--sata-crash

Awesome - thank you!

Can you tell me which SATA controller was installed when this crash was taken?

> my C3600 is now busy, compiling 4.16.*, a couple of branches with two
> different configurations, so I can't force a crash to have a fresh log
> at the moment, but I will with several SATA card!

No worries - start with one. :)

> what about SAS kernel modules? are them better than SATA?

In general, I expect they are better supported but I never worked with
SAS controllers.

> and what about SCSI (adaptec 29320, PCI-X)?

Sorry: I don't have any experience with adaptec - only the LSI SCSI
controllers, Marvel 7042 SATA controller, and Silicon Image 3124
(IIRC). The SI3124 was a PoS. MV7042 worked well and was reliable "at
scale".

> I am waiting for an array of disks, I can test the SCSI card under
> heavy I/O to see
> - performance
> - stability
>
> it will be delivered here for the end of the next week

Ok! Sounds like fun! :D

cheers,
grant
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Grant Grundler March 19, 2018, 7:41 p.m. UTC | #6
On Mon, Mar 19, 2018 at 8:37 PM, Grant Grundler <grantgrundler@gmail.com> wrote:
...
>> this is an old log
>> http://93.55.217.0/wonderland/chunk_of/user/ivelegacy/hppa-dev/ser-prim-2018-03-12--sata-crash

BTW, I should point out one needs to give "CLEARPIM" command at PDC
command line in order to record a new HPMC.

So once you reproduce the crash, collect HPMC dump with "SER PIM",
then "CLEAR PIM" again. :)

cheers,
grant
--
To unsubscribe from this list: send the line "unsubscribe linux-parisc" in
the body of a message to majordomo@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html
John David Anglin March 19, 2018, 8:46 p.m. UTC | #7
This is info about PCI devices that Carlo sent me.  It would help if we 
lnew where 0x4046b840
was in the kernel. From the IIR, 0xf3c001c, it seems the timeout 
occurred on a byte operation.

Dave

On 2018-03-17 2:58 PM, Carlo Pisani wrote:
>> Probably, it would be useful to have verbose output from lspci posted to
>> list as well.
>
> c3600 ~ $ lspci -v
> 00:0c.0 Ethernet controller: Digital Equipment Corporation DECchip
> 21142/43 (rev 41)
>          Subsystem: Hewlett-Packard Company DECchip 21142/43
>          Flags: bus master, medium devsel, latency 255, IRQ 17
>          I/O ports at 1000 [size=128]
>          Memory at fffffffff4008000 (32-bit, non-prefetchable) [size=1K]
>          Expansion ROM at fffffffff4040000 [disabled] [size=256K]
>          Kernel driver in use: tulip
> lspci: Unable to load libkmod resources: error -12
>
> 00:0d.0 Multimedia audio controller: Analog Devices Device 1889
>          Subsystem: Analog Devices Device 1889
>          Flags: bus master, medium devsel, latency 255, IRQ 18
>          Memory at fffffffff400c000 (32-bit, prefetchable) [size=512]
>          Memory at fffffffff400b000 (32-bit, prefetchable) [size=16]
>          Memory at fffffffff400a000 (32-bit, prefetchable) [size=16]
>          Memory at fffffffff4009000 (32-bit, prefetchable) [size=16]
>          Capabilities: [dc] Power Management version 1
>          Kernel driver in use: snd_ad1889
>
> 00:0e.0 IDE interface: National Semiconductor Corporation 87415/87560
> IDE (rev 03) (prog-if 8f [Master SecP SecO PriP PriO])
>          Flags: bus master, medium devsel, latency 255, IRQ 7
>          I/O ports at 0f00 [size=8]
>          I/O ports at 0e00 [size=4]
>          I/O ports at 0d00 [size=8]
>          I/O ports at 0b00 [size=4]
>          I/O ports at 0a00 [size=16]
>          Kernel driver in use: NS87415_IDE
>
> 00:0e.1 Bridge: National Semiconductor Corporation 87560 Legacy I/O (rev 01)
>          Flags: bus master, medium devsel, latency 255, IRQ 19
>          Kernel driver in use: SuperIO
>
> 00:0e.2 USB controller: National Semiconductor Corporation USB
> Controller (rev 02) (prog-if 10 [OHCI])
>          Flags: bus master, medium devsel, latency 240, IRQ 1
>          Memory at fffffffff4007000 (32-bit, non-prefetchable) [size=4K]
>          Memory at fffffffff4006000 (32-bit, non-prefetchable) [size=4K]
>          Kernel driver in use: ohci-pci
>
> 00:0f.0 SCSI storage controller: LSI Logic / Symbios Logic 53C896/897 (rev 07)
>          Subsystem: LSI Logic / Symbios Logic LSI53C896/7 PCI to Dual
> Channel Ultra2 SCSI Multifunction Controller
>          Flags: bus master, medium devsel, latency 255, IRQ 20
>          I/O ports at 0900 [size=256]
>          Memory at fffffffff4005000 (64-bit, non-prefetchable) [size=1K]
>          Memory at fffffffff4002000 (64-bit, non-prefetchable) [size=8K]
>          Capabilities: [40] Power Management version 2
>          Kernel driver in use: sym53c8xx
>
> 00:0f.1 SCSI storage controller: LSI Logic / Symbios Logic 53C896/897 (rev 07)
>          Subsystem: LSI Logic / Symbios Logic LSI53C896/7 PCI to Dual
> Channel Ultra2 SCSI Multifunction Controller
>          Flags: bus master, medium devsel, latency 255, IRQ 20
>          I/O ports at 0800 [size=256]
>          Memory at fffffffff4004000 (64-bit, non-prefetchable) [size=1K]
>          Memory at fffffffff4000000 (64-bit, non-prefetchable) [size=8K]
>          Capabilities: [40] Power Management version 2
>          Kernel driver in use: sym53c8xx
>
> 01:05.0 RAID bus controller: VIA Technologies, Inc. VT6421 IDE/SATA
> Controller (rev 50)
>          Subsystem: VIA Technologies, Inc. VT6421 IDE/SATA Controller
>          Flags: bus master, medium devsel, latency 240, IRQ 21
>          I/O ports at 12500 [size=16]
>          I/O ports at 12400 [size=16]
>          I/O ports at 12300 [size=16]
>          I/O ports at 12200 [size=16]
>          I/O ports at 12100 [size=32]
>          I/O ports at 12000 [size=256]
>          Expansion ROM at fffffffff4800000 [disabled] [size=64K]
>          Capabilities: [e0] Power Management version 2
>          Kernel driver in use: sata_via
>

The requesting device was 10/1/5/0.
diff mbox

Patch

--- drivers/parisc/lba_pci.c	2018-01-28 22:20:33.000000000 +0100
+++ drivers/parisc/lba_pci.c	2018-03-15 12:26:44.839894952 +0100
@@ -1405,7 +1405,7 @@ 
 
 	/* Set HF mode as the default (vs. -1 mode). */
         stat = READ_REG32(d->hba.base_addr + LBA_STAT_CTL);
-	WRITE_REG32(stat | HF_ENABLE, d->hba.base_addr + LBA_STAT_CTL);
+	WRITE_REG32(stat & ~HF_ENABLE, d->hba.base_addr + LBA_STAT_CTL);
 
 	/*
 	** Writing a zero to STAT_CTL.rf (bit 0) will clear reset signal

That's the patch from Kyle:
https://www.spinics.net/lists/linux-parisc/msg01027.html

which comes out of this mail thread:
https://www.spinics.net/lists/linux-parisc/msg01024.html
specifically with those notes:
https://www.spinics.net/lists/linux-parisc/msg01026.html

Citing here:
"bus timeout" usually means we tried to read an address that doesn't
respond. that is, nothing on the bus accepted the transaction for it,
so it timed out and HPMC'd the box.

what you really need is the IIR, and the address it tried to access
(both the kernel vaddr which will be in the register, and the "system
requester address" from the hpmc dump which will be the physical address
mapped.

not sure why the hpmc handler is getting skipped, that's a little weird.

you can try hacking elroy to set softfail mode on that bus, which will
result in a timeout on the pci bus to return -1 (like what x86 and most
other architectures do) rather than hang the box, but it really likely
means a driver bug.



So, you change LBA to return -1 instead of faulting via HPMC which is
of course one work-around to avoid the HPMC.

But could you try to check the driver instead?

You run this SATA controller:
01:05.0 RAID bus controller: VIA Technologies, Inc. VT6421 IDE RAID

Can you maybe try to localize where in the drivers/ata/sata_via.c driver
triggers the HPMC ?


--- arch/parisc/kernel/hpmc.S	2018-01-28 22:20:33.000000000 +0100
+++ arch/parisc/kernel/hpmc.S	2018-03-15 14:13:46.611969815 +0100
@@ -308,4 +290,5 @@ 
 	.align 4
 	.export os_hpmc_size
 os_hpmc_size:
-	.word .os_hpmc_end-.os_hpmc
+	/* .word .os_hpmc_end-.os_hpmc */
+	.word (.os_hpmc_end - .os_hpmc) * 4 /* sizeof(u32) */