diff mbox series

PCI: mvebu: Mark driver as BROKEN

Message ID 20230114164125.1298-1-pali@kernel.org (mailing list archive)
State Accepted
Delegated to: Krzysztof Wilczyński
Headers show
Series PCI: mvebu: Mark driver as BROKEN | expand

Commit Message

Pali Rohár Jan. 14, 2023, 4:41 p.m. UTC
People are reporting that pci-mvebu.c driver does not work with recent
mainline kernel. There are more bugs which prevents its for daily usage.
So lets mark it as broken for now, until somebody would be able to fix it
in mainline kernel.

Signed-off-by: Pali Rohár <pali@kernel.org>

---
Bjorn: I would really appreciate if you take this change and send it in
pull request for v6.2 release. There is no reason to wait more longer.


I'm periodically receiving emails that driver does not work correctly
anymore, PCIe cards are not detected or that they crashes during boot.

Some of the issues are handled in patches which are waiting on the list for
a long time and nobody cares for them. Some others needs investigation.

I'm really tired in replying to those user emails as I cannot do more in
this area. I have asked more people for help but either there were only
promises without any action for more than year or simple no direction how
to move forward or what to do with it.

So mark this driver as broken. Users would see the real current state
and hopefully will stop reporting me old or new bugs.
---
 drivers/pci/controller/Kconfig | 1 +
 1 file changed, 1 insertion(+)

Comments

Luís Mendes Jan. 15, 2023, 10:02 p.m. UTC | #1
I don't want to interfere, however I would like to contribute my
opinion as an end user of PCI_MVEBU.
I have been using Linux on a custom built hardware with an Armada
A388, for several years, and to me the most critical component for
getting PCIe working right in Linux is the u-boot version. Although
I've stopped trying the official u-boot releases as well as the SoM
vendor u-boot versions, because I don't have much time available to
tinker with those, however the ones I have tried always failed to
bring a working PCIe environment under Linux. So I am stuck with a
u-boot based on the Marvel release 2013.01.
It was key to getting PCIe working in Linux, other than that, all
Linux kernels have worked fine and I have been using an AMD RX550 PCIe
dGPU and a DVB-T/DVB-S PCIe tuner card simultaneously, each in its own
slot, without issues for all these years.

Thanks,
Luís

On Sat, Jan 14, 2023 at 4:51 PM Pali Rohár <pali@kernel.org> wrote:
>
> People are reporting that pci-mvebu.c driver does not work with recent
> mainline kernel. There are more bugs which prevents its for daily usage.
> So lets mark it as broken for now, until somebody would be able to fix it
> in mainline kernel.
>
> Signed-off-by: Pali Rohár <pali@kernel.org>
>
> ---
> Bjorn: I would really appreciate if you take this change and send it in
> pull request for v6.2 release. There is no reason to wait more longer.
>
>
> I'm periodically receiving emails that driver does not work correctly
> anymore, PCIe cards are not detected or that they crashes during boot.
>
> Some of the issues are handled in patches which are waiting on the list for
> a long time and nobody cares for them. Some others needs investigation.
>
> I'm really tired in replying to those user emails as I cannot do more in
> this area. I have asked more people for help but either there were only
> promises without any action for more than year or simple no direction how
> to move forward or what to do with it.
>
> So mark this driver as broken. Users would see the real current state
> and hopefully will stop reporting me old or new bugs.
> ---
>  drivers/pci/controller/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
>
> diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
> index 1569d9a3ada0..b4a4d84a358b 100644
> --- a/drivers/pci/controller/Kconfig
> +++ b/drivers/pci/controller/Kconfig
> @@ -9,6 +9,7 @@ config PCI_MVEBU
>         depends on MVEBU_MBUS
>         depends on ARM
>         depends on OF
> +       depends on BROKEN
>         select PCI_BRIDGE_EMUL
>         help
>          Add support for Marvell EBU PCIe controller. This PCIe controller
> --
> 2.20.1
>
Pali Rohár Feb. 6, 2023, 10:45 p.m. UTC | #2
On Saturday 14 January 2023 17:41:25 Pali Rohár wrote:
> People are reporting that pci-mvebu.c driver does not work with recent
> mainline kernel. There are more bugs which prevents its for daily usage.
> So lets mark it as broken for now, until somebody would be able to fix it
> in mainline kernel.
> 
> Signed-off-by: Pali Rohár <pali@kernel.org>
> 
> ---
> Bjorn: I would really appreciate if you take this change and send it in
> pull request for v6.2 release. There is no reason to wait more longer.

Ping? Any comments?

> 
> I'm periodically receiving emails that driver does not work correctly
> anymore, PCIe cards are not detected or that they crashes during boot.
> 
> Some of the issues are handled in patches which are waiting on the list for
> a long time and nobody cares for them. Some others needs investigation.
> 
> I'm really tired in replying to those user emails as I cannot do more in
> this area. I have asked more people for help but either there were only
> promises without any action for more than year or simple no direction how
> to move forward or what to do with it.
> 
> So mark this driver as broken. Users would see the real current state
> and hopefully will stop reporting me old or new bugs.
> ---
>  drivers/pci/controller/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
> index 1569d9a3ada0..b4a4d84a358b 100644
> --- a/drivers/pci/controller/Kconfig
> +++ b/drivers/pci/controller/Kconfig
> @@ -9,6 +9,7 @@ config PCI_MVEBU
>  	depends on MVEBU_MBUS
>  	depends on ARM
>  	depends on OF
> +	depends on BROKEN
>  	select PCI_BRIDGE_EMUL
>  	help
>  	 Add support for Marvell EBU PCIe controller. This PCIe controller
> -- 
> 2.20.1
>
Lorenzo Pieralisi Feb. 9, 2023, 9:02 a.m. UTC | #3
On Sat, 14 Jan 2023 17:41:25 +0100, Pali Rohár wrote:
> People are reporting that pci-mvebu.c driver does not work with recent
> mainline kernel. There are more bugs which prevents its for daily usage.
> So lets mark it as broken for now, until somebody would be able to fix it
> in mainline kernel.
> 
> 

Applied to pci/controller/mvebu, thanks!

[1/1] PCI: mvebu: Mark driver as BROKEN
      https://git.kernel.org/pci/pci/c/b3574f579ece

Thanks,
Lorenzo
Russell King (Oracle) Aug. 4, 2023, 11:35 a.m. UTC | #4
Hi,

So it seems this patch got applied, but it wasn't Cc'd to
linux-arm-kernel or anyone else, so those of us with platforms never
had a chance to comment on it.

*** This change causes a regression to working setups. ***

It appears that the *only* reason this patch was proposed is to stop a
kernel developer receiving problem reports from a set of users, but
completely ignores that there is another group of users where this works
fine - and thus the addition of this patch causes working setups to
regress.

Because one is being bothered with problem reports is not a reason to
mark a driver broken - and especially not doing so in a way that those
who may be affected don't get an opportunity to comment on the patch!
Also, there is _zero_ information provided on what the reported problems
actually are, so no one else can guess what these issues are.

However, given that there are working setups and this change causes
those to regress, it needs to be reverted.

For example, I have an Atheros PCIe WiFi card in an Armada 388 Clearfog
platform, and this works fine.

Uwe has a SATA controller for a bunch of disks in an Armada 370 based
NAS platform that is connected to PCIe, and removing PCIe support
effectively makes his platform utterly useless.

Please revert this patch.

Thanks.

On Sat, Jan 14, 2023 at 05:41:25PM +0100, Pali Rohár wrote:
> People are reporting that pci-mvebu.c driver does not work with recent
> mainline kernel. There are more bugs which prevents its for daily usage.
> So lets mark it as broken for now, until somebody would be able to fix it
> in mainline kernel.
> 
> Signed-off-by: Pali Rohár <pali@kernel.org>
> 
> ---
> Bjorn: I would really appreciate if you take this change and send it in
> pull request for v6.2 release. There is no reason to wait more longer.
> 
> 
> I'm periodically receiving emails that driver does not work correctly
> anymore, PCIe cards are not detected or that they crashes during boot.
> 
> Some of the issues are handled in patches which are waiting on the list for
> a long time and nobody cares for them. Some others needs investigation.
> 
> I'm really tired in replying to those user emails as I cannot do more in
> this area. I have asked more people for help but either there were only
> promises without any action for more than year or simple no direction how
> to move forward or what to do with it.
> 
> So mark this driver as broken. Users would see the real current state
> and hopefully will stop reporting me old or new bugs.
> ---
>  drivers/pci/controller/Kconfig | 1 +
>  1 file changed, 1 insertion(+)
> 
> diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
> index 1569d9a3ada0..b4a4d84a358b 100644
> --- a/drivers/pci/controller/Kconfig
> +++ b/drivers/pci/controller/Kconfig
> @@ -9,6 +9,7 @@ config PCI_MVEBU
>  	depends on MVEBU_MBUS
>  	depends on ARM
>  	depends on OF
> +	depends on BROKEN
>  	select PCI_BRIDGE_EMUL
>  	help
>  	 Add support for Marvell EBU PCIe controller. This PCIe controller
> -- 
> 2.20.1
>
Uwe Kleine-König Aug. 4, 2023, 1:46 p.m. UTC | #5
[Cc += linux-arm-kernel list]

On Fri, Aug 04, 2023 at 12:35:13PM +0100, Russell King (Oracle) wrote:
> So it seems this patch got applied, but it wasn't Cc'd to
> linux-arm-kernel or anyone else, so those of us with platforms never
> had a chance to comment on it.
> 
> *** This change causes a regression to working setups. ***
> 
> It appears that the *only* reason this patch was proposed is to stop a
> kernel developer receiving problem reports from a set of users, but
> completely ignores that there is another group of users where this works
> fine - and thus the addition of this patch causes working setups to
> regress.
> 
> Because one is being bothered with problem reports is not a reason to
> mark a driver broken - and especially not doing so in a way that those
> who may be affected don't get an opportunity to comment on the patch!
> Also, there is _zero_ information provided on what the reported problems
> actually are, so no one else can guess what these issues are.
> 
> However, given that there are working setups and this change causes
> those to regress, it needs to be reverted.
> 
> For example, I have an Atheros PCIe WiFi card in an Armada 388 Clearfog
> platform, and this works fine.
> 
> Uwe has a SATA controller for a bunch of disks in an Armada 370 based
> NAS platform that is connected to PCIe, and removing PCIe support
> effectively makes his platform utterly useless.

While this is true there is really a problem on my platform with
accessing the hard disks via that pci controller and a 88SE9215 SATA
controller. While it seems to work in principle, it's incredible slow.

I intend to bisect that, 6.1.x is still fine. Don't know when I find the
time though, as there are a few things that are more important
currently.

+1 on some information about what is already known about the breakage.

> Please revert this patch.
> 
> Thanks.
> 
> On Sat, Jan 14, 2023 at 05:41:25PM +0100, Pali Rohár wrote:
> > People are reporting that pci-mvebu.c driver does not work with recent
> > mainline kernel. There are more bugs which prevents its for daily usage.
> > So lets mark it as broken for now, until somebody would be able to fix it
> > in mainline kernel.
> > 
> > Signed-off-by: Pali Rohár <pali@kernel.org>
> > 
> > ---
> > Bjorn: I would really appreciate if you take this change and send it in
> > pull request for v6.2 release. There is no reason to wait more longer.
> > 
> > 
> > I'm periodically receiving emails that driver does not work correctly
> > anymore, PCIe cards are not detected or that they crashes during boot.
> > 
> > Some of the issues are handled in patches which are waiting on the list for
> > a long time and nobody cares for them. Some others needs investigation.
> > 
> > I'm really tired in replying to those user emails as I cannot do more in
> > this area. I have asked more people for help but either there were only
> > promises without any action for more than year or simple no direction how
> > to move forward or what to do with it.
> > 
> > So mark this driver as broken. Users would see the real current state
> > and hopefully will stop reporting me old or new bugs.
> > ---
> >  drivers/pci/controller/Kconfig | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
> > index 1569d9a3ada0..b4a4d84a358b 100644
> > --- a/drivers/pci/controller/Kconfig
> > +++ b/drivers/pci/controller/Kconfig
> > @@ -9,6 +9,7 @@ config PCI_MVEBU
> >  	depends on MVEBU_MBUS
> >  	depends on ARM
> >  	depends on OF
> > +	depends on BROKEN
> >  	select PCI_BRIDGE_EMUL
> >  	help
> >  	 Add support for Marvell EBU PCIe controller. This PCIe controller
> > -- 
> > 2.20.1
> > 
> 
> -- 
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
>
Russell King (Oracle) Aug. 4, 2023, 2:54 p.m. UTC | #6
On Fri, Aug 04, 2023 at 12:35:13PM +0100, Russell King (Oracle) wrote:
> Hi,
> 
> So it seems this patch got applied, but it wasn't Cc'd to
> linux-arm-kernel or anyone else, so those of us with platforms never
> had a chance to comment on it.
> 
> *** This change causes a regression to working setups. ***
> 
> It appears that the *only* reason this patch was proposed is to stop a
> kernel developer receiving problem reports from a set of users, but
> completely ignores that there is another group of users where this works
> fine - and thus the addition of this patch causes working setups to
> regress.
> 
> Because one is being bothered with problem reports is not a reason to
> mark a driver broken - and especially not doing so in a way that those
> who may be affected don't get an opportunity to comment on the patch!
> Also, there is _zero_ information provided on what the reported problems
> actually are, so no one else can guess what these issues are.
> 
> However, given that there are working setups and this change causes
> those to regress, it needs to be reverted.
> 
> For example, I have an Atheros PCIe WiFi card in an Armada 388 Clearfog
> platform, and this works fine.

Further testing - same platform with a mini-PCIe SATA card:

01:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller (rev 01)

with a WD10JPVX-60JC3T0 2.5" drive with hdparm -t:

 Timing buffered disk reads: 344 MB in  3.01 seconds = 114.16 MB/sec

which is about what is expected for spinny-rust 2.5" drives.

This was tested with ASPM and AER disabled. AER isn't supported anyway
as pcie_init_service_irqs() fails with -ENODEV.

For further info, the Atheros WiFi card was:

02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter
Uwe Kleine-König Aug. 4, 2023, 5 p.m. UTC | #7
Hello,

On Fri, Aug 04, 2023 at 03:46:22PM +0200, Uwe Kleine-König wrote:
> On Fri, Aug 04, 2023 at 12:35:13PM +0100, Russell King (Oracle) wrote:
> > So it seems this patch got applied, but it wasn't Cc'd to
> > linux-arm-kernel or anyone else, so those of us with platforms never
> > had a chance to comment on it.
> > 
> > *** This change causes a regression to working setups. ***
> > 
> > It appears that the *only* reason this patch was proposed is to stop a
> > kernel developer receiving problem reports from a set of users, but
> > completely ignores that there is another group of users where this works
> > fine - and thus the addition of this patch causes working setups to
> > regress.
> > 
> > Because one is being bothered with problem reports is not a reason to
> > mark a driver broken - and especially not doing so in a way that those
> > who may be affected don't get an opportunity to comment on the patch!
> > Also, there is _zero_ information provided on what the reported problems
> > actually are, so no one else can guess what these issues are.
> > 
> > However, given that there are working setups and this change causes
> > those to regress, it needs to be reverted.
> > 
> > For example, I have an Atheros PCIe WiFi card in an Armada 388 Clearfog
> > platform, and this works fine.
> > 
> > Uwe has a SATA controller for a bunch of disks in an Armada 370 based
> > NAS platform that is connected to PCIe, and removing PCIe support
> > effectively makes his platform utterly useless.
> 
> While this is true there is really a problem on my platform with
> accessing the hard disks via that pci controller and a 88SE9215 SATA
> controller. While it seems to work in principle, it's incredible slow.
> 
> I intend to bisect that, 6.1.x is still fine. Don't know when I find the
> time though, as there are a few things that are more important
> currently.

I did that and found next-20230803 to be bad but next-20230804 is good.
I didn't debug that further and didn't spot anything obvious in

	git log --oneline --no-merges --left-right next-20230803...next-20230804

. I will just assume the problem is gone for good.

So now I'm in the position to say: For me PCI_MVEBU works fine and so I
support Russell's request to revert
b3574f579ece24439c90e9a179742c61205fbcfa.

Thanks
Uwe
Bjorn Helgaas Aug. 4, 2023, 5:06 p.m. UTC | #8
[+cc Krzysztof]

On Fri, Aug 04, 2023 at 12:35:13PM +0100, Russell King (Oracle) wrote:
> So it seems this patch got applied, but it wasn't Cc'd to
> linux-arm-kernel or anyone else, so those of us with platforms never
> had a chance to comment on it.
> 
> *** This change causes a regression to working setups. ***
> 
> It appears that the *only* reason this patch was proposed is to stop a
> kernel developer receiving problem reports from a set of users, but
> completely ignores that there is another group of users where this works
> fine - and thus the addition of this patch causes working setups to
> regress.
> 
> Because one is being bothered with problem reports is not a reason to
> mark a driver broken - and especially not doing so in a way that those
> who may be affected don't get an opportunity to comment on the patch!
> Also, there is _zero_ information provided on what the reported problems
> actually are, so no one else can guess what these issues are.
> 
> However, given that there are working setups and this change causes
> those to regress, it needs to be reverted.
> 
> For example, I have an Atheros PCIe WiFi card in an Armada 388 Clearfog
> platform, and this works fine.
> 
> Uwe has a SATA controller for a bunch of disks in an Armada 370 based
> NAS platform that is connected to PCIe, and removing PCIe support
> effectively makes his platform utterly useless.
> 
> Please revert this patch.

Sorry for the inconvenience.

I was under the mistaken impression that making the driver depend on
CONFIG_BROKEN would keep the driver available but only if the user
explicitly requested it, similar to how 
CONFIG_COMPILE_TEST works.  But obviously that's not the case, so
we'll revert the change.

I queued up the revert below, including a note in the Kconfig help
text about the known issues.

commit 814b6bb15367 ("Revert "PCI: mvebu: Mark driver as BROKEN"")
Author: Bjorn Helgaas <bhelgaas@google.com>
Date:   Fri Aug 4 11:54:43 2023 -0500

    Revert "PCI: mvebu: Mark driver as BROKEN"
    
    b3574f579ece ("PCI: mvebu: Mark driver as BROKEN") made it impossible to
    enable the pci-mvebu driver.  The driver does have known problems, but as
    Russell and Uwe reported, it does work in some configurations, so removing
    it broke some working setups.
    
    Revert b3574f579ece so pci-mvebu is available.  Mention the known problems
    in the Kconfig help text.
    
    Reported-by: Russell King (Oracle) <linux@armlinux.org.uk>
    Link: https://lore.kernel.org/r/ZMzicVQEyHyZzBOc@shell.armlinux.org.uk
    Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
    Link: https://lore.kernel.org/r/20230804134622.pmbymxtzxj2yfhri@pengutronix.de
    Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>


diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
index 8d49bad7f847..478f158b2dfb 100644
--- a/drivers/pci/controller/Kconfig
+++ b/drivers/pci/controller/Kconfig
@@ -179,13 +179,15 @@ config PCI_MVEBU
 	depends on MVEBU_MBUS
 	depends on ARM
 	depends on OF
-	depends on BROKEN
 	select PCI_BRIDGE_EMUL
 	help
 	 Add support for Marvell EBU PCIe controller. This PCIe controller
 	 is used on 32-bit Marvell ARM SoCs: Dove, Kirkwood, Armada 370,
 	 Armada XP, Armada 375, Armada 38x and Armada 39x.
 
+	 This driver has known problems that may cause crashes during boot
+	 and failure to detect PCIe devices in some cases.
+
 config PCIE_MEDIATEK
 	tristate "MediaTek PCIe controller"
 	depends on ARCH_AIROHA || ARCH_MEDIATEK || COMPILE_TEST
Russell King (Oracle) Aug. 4, 2023, 7:44 p.m. UTC | #9
On Fri, Aug 04, 2023 at 12:06:55PM -0500, Bjorn Helgaas wrote:
> [+cc Krzysztof]
> 
> On Fri, Aug 04, 2023 at 12:35:13PM +0100, Russell King (Oracle) wrote:
> > So it seems this patch got applied, but it wasn't Cc'd to
> > linux-arm-kernel or anyone else, so those of us with platforms never
> > had a chance to comment on it.
> > 
> > *** This change causes a regression to working setups. ***
> > 
> > It appears that the *only* reason this patch was proposed is to stop a
> > kernel developer receiving problem reports from a set of users, but
> > completely ignores that there is another group of users where this works
> > fine - and thus the addition of this patch causes working setups to
> > regress.
> > 
> > Because one is being bothered with problem reports is not a reason to
> > mark a driver broken - and especially not doing so in a way that those
> > who may be affected don't get an opportunity to comment on the patch!
> > Also, there is _zero_ information provided on what the reported problems
> > actually are, so no one else can guess what these issues are.
> > 
> > However, given that there are working setups and this change causes
> > those to regress, it needs to be reverted.
> > 
> > For example, I have an Atheros PCIe WiFi card in an Armada 388 Clearfog
> > platform, and this works fine.
> > 
> > Uwe has a SATA controller for a bunch of disks in an Armada 370 based
> > NAS platform that is connected to PCIe, and removing PCIe support
> > effectively makes his platform utterly useless.
> > 
> > Please revert this patch.
> 
> Sorry for the inconvenience.
> 
> I was under the mistaken impression that making the driver depend on
> CONFIG_BROKEN would keep the driver available but only if the user
> explicitly requested it, similar to how 
> CONFIG_COMPILE_TEST works.  But obviously that's not the case, so
> we'll revert the change.

CONFIG_BROKEN isn't something that can be enabled without editing
init/Kconfig, since the option does not have a prompt:

config BROKEN
        bool

Marking something as "depends on BROKEN" prevents it appearing in
Kconfig, and thus makes it unselectable. It's effectively one step
away from deleting the code.

As we don't know what the problems were, I would not suggest updating
the Kconfig text - we don't know whether they were boot time crashes,
boot hangs, or failures for the link to come up with the device making
some PCIe devices undetectable.

Given that it does work fine for Uwe and myself, it suggests it's the
latter - and that could be down to the PCIe devices themselves, or it
could be a platform hardware issue.
Luís Mendes Aug. 6, 2023, 10:06 p.m. UTC | #10
It also works for me. I have a home built Armada Duo motherboard that
I designed with two PCIe slots, and I have it working with an AMD GPU
card and the amdgpu driver on one slot and I also use a second PCIe
card for a TV/SAT tuner card or others like SATA controllers, or
Ethernet controller cards.

The video is in Portuguese, but illustrates the system working:
https://www.youtube.com/watch?v=98GuzKp1U6E

Kind Regards,
Luís Mendes
Pali Rohár Aug. 8, 2023, 7:26 a.m. UTC | #11
On Friday 04 August 2023 12:35:13 Russell King (Oracle) wrote:
> Hi,
> 
> So it seems this patch got applied, but it wasn't Cc'd to
> linux-arm-kernel or anyone else, so those of us with platforms never
> had a chance to comment on it.

You have received more changes and fixes for last 2 years for these
issues and you have done **nothing**. You even not said anything.
So you are the last one who can complain here.

And I'm stopped communicating with people who do not want to communicate
with me. This is pretty normal situation and you should have think about
it. No?

> *** This change causes a regression to working setups. ***
> 
> It appears that the *only* reason this patch was proposed is to stop a
> kernel developer receiving problem reports from a set of users, but
> completely ignores that there is another group of users where this works
> fine - and thus the addition of this patch causes working setups to
> regress.

You should have come up and start solving issues. And not complaining
now.

> Because one is being bothered with problem reports is not a reason to
> mark a driver broken - and especially not doing so in a way that those
> who may be affected don't get an opportunity to comment on the patch!

You had oportunity for last two years, so stop saying untruth
information here.

> Also, there is _zero_ information provided on what the reported problems
> actually are, so no one else can guess what these issues are.
> 
> However, given that there are working setups and this change causes
> those to regress, it needs to be reverted.
> 
> For example, I have an Atheros PCIe WiFi card in an Armada 388 Clearfog
> platform, and this works fine.
> 
> Uwe has a SATA controller for a bunch of disks in an Armada 370 based
> NAS platform that is connected to PCIe, and removing PCIe support
> effectively makes his platform utterly useless.
> 
> Please revert this patch.

Please do not revert it, instead start fixing problems.

I'm really fed up with people who ignores all previous emails, did
absolutely nothing in related area and start complaining that something
does not work for them.

> Thanks.
> 
> On Sat, Jan 14, 2023 at 05:41:25PM +0100, Pali Rohár wrote:
> > People are reporting that pci-mvebu.c driver does not work with recent
> > mainline kernel. There are more bugs which prevents its for daily usage.
> > So lets mark it as broken for now, until somebody would be able to fix it
> > in mainline kernel.
> > 
> > Signed-off-by: Pali Rohár <pali@kernel.org>
> > 
> > ---
> > Bjorn: I would really appreciate if you take this change and send it in
> > pull request for v6.2 release. There is no reason to wait more longer.
> > 
> > 
> > I'm periodically receiving emails that driver does not work correctly
> > anymore, PCIe cards are not detected or that they crashes during boot.
> > 
> > Some of the issues are handled in patches which are waiting on the list for
> > a long time and nobody cares for them. Some others needs investigation.
> > 
> > I'm really tired in replying to those user emails as I cannot do more in
> > this area. I have asked more people for help but either there were only
> > promises without any action for more than year or simple no direction how
> > to move forward or what to do with it.
> > 
> > So mark this driver as broken. Users would see the real current state
> > and hopefully will stop reporting me old or new bugs.
> > ---
> >  drivers/pci/controller/Kconfig | 1 +
> >  1 file changed, 1 insertion(+)
> > 
> > diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
> > index 1569d9a3ada0..b4a4d84a358b 100644
> > --- a/drivers/pci/controller/Kconfig
> > +++ b/drivers/pci/controller/Kconfig
> > @@ -9,6 +9,7 @@ config PCI_MVEBU
> >  	depends on MVEBU_MBUS
> >  	depends on ARM
> >  	depends on OF
> > +	depends on BROKEN
> >  	select PCI_BRIDGE_EMUL
> >  	help
> >  	 Add support for Marvell EBU PCIe controller. This PCIe controller
> > -- 
> > 2.20.1
> > 
> 
> -- 
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Pali Rohár Aug. 8, 2023, 7:27 a.m. UTC | #12
On Friday 04 August 2023 15:46:22 Uwe Kleine-König wrote:
> [Cc += linux-arm-kernel list]
> 
> On Fri, Aug 04, 2023 at 12:35:13PM +0100, Russell King (Oracle) wrote:
> > So it seems this patch got applied, but it wasn't Cc'd to
> > linux-arm-kernel or anyone else, so those of us with platforms never
> > had a chance to comment on it.
> > 
> > *** This change causes a regression to working setups. ***
> > 
> > It appears that the *only* reason this patch was proposed is to stop a
> > kernel developer receiving problem reports from a set of users, but
> > completely ignores that there is another group of users where this works
> > fine - and thus the addition of this patch causes working setups to
> > regress.
> > 
> > Because one is being bothered with problem reports is not a reason to
> > mark a driver broken - and especially not doing so in a way that those
> > who may be affected don't get an opportunity to comment on the patch!
> > Also, there is _zero_ information provided on what the reported problems
> > actually are, so no one else can guess what these issues are.
> > 
> > However, given that there are working setups and this change causes
> > those to regress, it needs to be reverted.
> > 
> > For example, I have an Atheros PCIe WiFi card in an Armada 388 Clearfog
> > platform, and this works fine.
> > 
> > Uwe has a SATA controller for a bunch of disks in an Armada 370 based
> > NAS platform that is connected to PCIe, and removing PCIe support
> > effectively makes his platform utterly useless.
> 
> While this is true there is really a problem on my platform with
> accessing the hard disks via that pci controller and a 88SE9215 SATA
> controller. While it seems to work in principle, it's incredible slow.

Exactly those are things which randomly does not work.

> I intend to bisect that, 6.1.x is still fine. Don't know when I find the
> time though, as there are a few things that are more important
> currently.
> 
> +1 on some information about what is already known about the breakage.
> 
> > Please revert this patch.
> > 
> > Thanks.
> > 
> > On Sat, Jan 14, 2023 at 05:41:25PM +0100, Pali Rohár wrote:
> > > People are reporting that pci-mvebu.c driver does not work with recent
> > > mainline kernel. There are more bugs which prevents its for daily usage.
> > > So lets mark it as broken for now, until somebody would be able to fix it
> > > in mainline kernel.
> > > 
> > > Signed-off-by: Pali Rohár <pali@kernel.org>
> > > 
> > > ---
> > > Bjorn: I would really appreciate if you take this change and send it in
> > > pull request for v6.2 release. There is no reason to wait more longer.
> > > 
> > > 
> > > I'm periodically receiving emails that driver does not work correctly
> > > anymore, PCIe cards are not detected or that they crashes during boot.
> > > 
> > > Some of the issues are handled in patches which are waiting on the list for
> > > a long time and nobody cares for them. Some others needs investigation.
> > > 
> > > I'm really tired in replying to those user emails as I cannot do more in
> > > this area. I have asked more people for help but either there were only
> > > promises without any action for more than year or simple no direction how
> > > to move forward or what to do with it.
> > > 
> > > So mark this driver as broken. Users would see the real current state
> > > and hopefully will stop reporting me old or new bugs.
> > > ---
> > >  drivers/pci/controller/Kconfig | 1 +
> > >  1 file changed, 1 insertion(+)
> > > 
> > > diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
> > > index 1569d9a3ada0..b4a4d84a358b 100644
> > > --- a/drivers/pci/controller/Kconfig
> > > +++ b/drivers/pci/controller/Kconfig
> > > @@ -9,6 +9,7 @@ config PCI_MVEBU
> > >  	depends on MVEBU_MBUS
> > >  	depends on ARM
> > >  	depends on OF
> > > +	depends on BROKEN
> > >  	select PCI_BRIDGE_EMUL
> > >  	help
> > >  	 Add support for Marvell EBU PCIe controller. This PCIe controller
> > > -- 
> > > 2.20.1
> > > 
> > 
> > -- 
> > RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> > FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
> > 
> 
> -- 
> Pengutronix e.K.                           | Uwe Kleine-König            |
> Industrial Linux Solutions                 | https://www.pengutronix.de/ |
Pali Rohár Aug. 8, 2023, 7:27 a.m. UTC | #13
On Friday 04 August 2023 15:54:26 Russell King (Oracle) wrote:
> On Fri, Aug 04, 2023 at 12:35:13PM +0100, Russell King (Oracle) wrote:
> > Hi,
> > 
> > So it seems this patch got applied, but it wasn't Cc'd to
> > linux-arm-kernel or anyone else, so those of us with platforms never
> > had a chance to comment on it.
> > 
> > *** This change causes a regression to working setups. ***
> > 
> > It appears that the *only* reason this patch was proposed is to stop a
> > kernel developer receiving problem reports from a set of users, but
> > completely ignores that there is another group of users where this works
> > fine - and thus the addition of this patch causes working setups to
> > regress.
> > 
> > Because one is being bothered with problem reports is not a reason to
> > mark a driver broken - and especially not doing so in a way that those
> > who may be affected don't get an opportunity to comment on the patch!
> > Also, there is _zero_ information provided on what the reported problems
> > actually are, so no one else can guess what these issues are.
> > 
> > However, given that there are working setups and this change causes
> > those to regress, it needs to be reverted.
> > 
> > For example, I have an Atheros PCIe WiFi card in an Armada 388 Clearfog
> > platform, and this works fine.
> 
> Further testing - same platform with a mini-PCIe SATA card:
> 
> 01:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller (rev 01)
> 
> with a WD10JPVX-60JC3T0 2.5" drive with hdparm -t:
> 
>  Timing buffered disk reads: 344 MB in  3.01 seconds = 114.16 MB/sec
> 
> which is about what is expected for spinny-rust 2.5" drives.
> 
> This was tested with ASPM and AER disabled. AER isn't supported anyway
> as pcie_init_service_irqs() fails with -ENODEV.

So another thing which is broken. Perfect!

> For further info, the Atheros WiFi card was:
> 
> 02:00.0 Network controller: Qualcomm Atheros QCA986x/988x 802.11ac Wireless Network Adapter
> 
> -- 
> RMK's Patch system: https://www.armlinux.org.uk/developer/patches/
> FTTP is here! 80Mbps down 10Mbps up. Decent connectivity at last!
Pali Rohár Aug. 8, 2023, 7:31 a.m. UTC | #14
On Friday 04 August 2023 12:06:55 Bjorn Helgaas wrote:
> [+cc Krzysztof]
> 
> On Fri, Aug 04, 2023 at 12:35:13PM +0100, Russell King (Oracle) wrote:
> > So it seems this patch got applied, but it wasn't Cc'd to
> > linux-arm-kernel or anyone else, so those of us with platforms never
> > had a chance to comment on it.
> > 
> > *** This change causes a regression to working setups. ***
> > 
> > It appears that the *only* reason this patch was proposed is to stop a
> > kernel developer receiving problem reports from a set of users, but
> > completely ignores that there is another group of users where this works
> > fine - and thus the addition of this patch causes working setups to
> > regress.
> > 
> > Because one is being bothered with problem reports is not a reason to
> > mark a driver broken - and especially not doing so in a way that those
> > who may be affected don't get an opportunity to comment on the patch!
> > Also, there is _zero_ information provided on what the reported problems
> > actually are, so no one else can guess what these issues are.
> > 
> > However, given that there are working setups and this change causes
> > those to regress, it needs to be reverted.
> > 
> > For example, I have an Atheros PCIe WiFi card in an Armada 388 Clearfog
> > platform, and this works fine.
> > 
> > Uwe has a SATA controller for a bunch of disks in an Armada 370 based
> > NAS platform that is connected to PCIe, and removing PCIe support
> > effectively makes his platform utterly useless.
> > 
> > Please revert this patch.
> 
> Sorry for the inconvenience.
> 
> I was under the mistaken impression that making the driver depend on
> CONFIG_BROKEN would keep the driver available but only if the user
> explicitly requested it, similar to how 
> CONFIG_COMPILE_TEST works.  But obviously that's not the case, so
> we'll revert the change.
> 
> I queued up the revert below, including a note in the Kconfig help
> text about the known issues.
> 
> commit 814b6bb15367 ("Revert "PCI: mvebu: Mark driver as BROKEN"")
> Author: Bjorn Helgaas <bhelgaas@google.com>
> Date:   Fri Aug 4 11:54:43 2023 -0500
> 
>     Revert "PCI: mvebu: Mark driver as BROKEN"
>     
>     b3574f579ece ("PCI: mvebu: Mark driver as BROKEN") made it impossible to
>     enable the pci-mvebu driver.  The driver does have known problems, but as
>     Russell and Uwe reported, it does work in some configurations, so removing
>     it broke some working setups.
>     
>     Revert b3574f579ece so pci-mvebu is available.  Mention the known problems
>     in the Kconfig help text.
>     
>     Reported-by: Russell King (Oracle) <linux@armlinux.org.uk>
>     Link: https://lore.kernel.org/r/ZMzicVQEyHyZzBOc@shell.armlinux.org.uk
>     Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
>     Link: https://lore.kernel.org/r/20230804134622.pmbymxtzxj2yfhri@pengutronix.de
>     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> 

What you are trying to achieve with this patch now? Do you think that it
is really correct to show that everything is working for everybody
correctly? Use a common sense here.

Or this is a way how kernel people are fixing bugs?

Now I'm starting understand why majority of HW industry say to not use
"unsupported mainline kernel" and instead use our prepared patched
kernels...

> diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
> index 8d49bad7f847..478f158b2dfb 100644
> --- a/drivers/pci/controller/Kconfig
> +++ b/drivers/pci/controller/Kconfig
> @@ -179,13 +179,15 @@ config PCI_MVEBU
>  	depends on MVEBU_MBUS
>  	depends on ARM
>  	depends on OF
> -	depends on BROKEN
>  	select PCI_BRIDGE_EMUL
>  	help
>  	 Add support for Marvell EBU PCIe controller. This PCIe controller
>  	 is used on 32-bit Marvell ARM SoCs: Dove, Kirkwood, Armada 370,
>  	 Armada XP, Armada 375, Armada 38x and Armada 39x.
>  
> +	 This driver has known problems that may cause crashes during boot
> +	 and failure to detect PCIe devices in some cases.
> +
>  config PCIE_MEDIATEK
>  	tristate "MediaTek PCIe controller"
>  	depends on ARCH_AIROHA || ARCH_MEDIATEK || COMPILE_TEST
Uwe Kleine-König Aug. 8, 2023, 7:38 a.m. UTC | #15
Hello Pali,

On Tue, Aug 08, 2023 at 09:27:01AM +0200, Pali Rohár wrote:
> On Friday 04 August 2023 15:46:22 Uwe Kleine-König wrote:
> > On Fri, Aug 04, 2023 at 12:35:13PM +0100, Russell King (Oracle) wrote:
> > > Uwe has a SATA controller for a bunch of disks in an Armada 370 based
> > > NAS platform that is connected to PCIe, and removing PCIe support
> > > effectively makes his platform utterly useless.
> > 
> > While this is true there is really a problem on my platform with
> > accessing the hard disks via that pci controller and a 88SE9215 SATA
> > controller. While it seems to work in principle, it's incredible slow.
> 
> Exactly those are things which randomly does not work.

I had this slow behaviour consistently on next-20230803 and
next-20230804 was fine. I thought that meant that there was something
fixed between these two trees. Do you suggest this is worth to
investigate as it might just be some butterfly effect that made the
problem go away?

Best regards
Uwe
Pali Rohár Aug. 8, 2023, 7:56 a.m. UTC | #16
On Tuesday 08 August 2023 09:38:22 Uwe Kleine-König wrote:
> Hello Pali,
> 
> On Tue, Aug 08, 2023 at 09:27:01AM +0200, Pali Rohár wrote:
> > On Friday 04 August 2023 15:46:22 Uwe Kleine-König wrote:
> > > On Fri, Aug 04, 2023 at 12:35:13PM +0100, Russell King (Oracle) wrote:
> > > > Uwe has a SATA controller for a bunch of disks in an Armada 370 based
> > > > NAS platform that is connected to PCIe, and removing PCIe support
> > > > effectively makes his platform utterly useless.
> > > 
> > > While this is true there is really a problem on my platform with
> > > accessing the hard disks via that pci controller and a 88SE9215 SATA
> > > controller. While it seems to work in principle, it's incredible slow.
> > 
> > Exactly those are things which randomly does not work.
> 
> I had this slow behaviour consistently on next-20230803 and
> next-20230804 was fine. I thought that meant that there was something
> fixed between these two trees. Do you suggest this is worth to
> investigate as it might just be some butterfly effect that made the
> problem go away?

These issues are there for a longer time, it started appearing after
5.15 lts version. And by your description it means that they were not
fixed yet.

> Best regards
> Uwe
> 
> -- 
> Pengutronix e.K.                           | Uwe Kleine-König            |
> Industrial Linux Solutions                 | https://www.pengutronix.de/ |
Uwe Kleine-König Aug. 8, 2023, 8:01 a.m. UTC | #17
Hello Pali,

On Tue, Aug 08, 2023 at 09:31:54AM +0200, Pali Rohár wrote:
> On Friday 04 August 2023 12:06:55 Bjorn Helgaas wrote:
> > I queued up the revert below, including a note in the Kconfig help
> > text about the known issues.
> > 
> > commit 814b6bb15367 ("Revert "PCI: mvebu: Mark driver as BROKEN"")
> > Author: Bjorn Helgaas <bhelgaas@google.com>
> > Date:   Fri Aug 4 11:54:43 2023 -0500
> > 
> >     Revert "PCI: mvebu: Mark driver as BROKEN"
> >     
> >     b3574f579ece ("PCI: mvebu: Mark driver as BROKEN") made it impossible to
> >     enable the pci-mvebu driver.  The driver does have known problems, but as
> >     Russell and Uwe reported, it does work in some configurations, so removing
> >     it broke some working setups.
> >     
> >     Revert b3574f579ece so pci-mvebu is available.  Mention the known problems
> >     in the Kconfig help text.
> >     
> >     Reported-by: Russell King (Oracle) <linux@armlinux.org.uk>
> >     Link: https://lore.kernel.org/r/ZMzicVQEyHyZzBOc@shell.armlinux.org.uk
> >     Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> >     Link: https://lore.kernel.org/r/20230804134622.pmbymxtzxj2yfhri@pengutronix.de
> >     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> > 
> 
> What you are trying to achieve with this patch now?

We all agree that there are problems with the pci-mvebu driver. But to
my knowledge it works in some configurations. So I guess the thing to
achieve here is to go from "broken for everyone" (with the driver
depending on BROKEN) to "broken for only some people". So in my view
it's an improvement.

> Do you think that it is really correct to show that everything is
> working for everybody correctly?

What makes you think Bjorn considers everything to work fine? Both the
commit log and the added help text suggest he's aware of problems.

> Now I'm starting understand why majority of HW industry say to not use
> "unsupported mainline kernel" and instead use our prepared patched
> kernels...

Yes, for a given company it's the easiest and cheapest option to just
support a single kernel version. In this regard the patch adding the
dependency on BROKEN is even worse than the commit that removes it
again. In an ideal world Marvell would care to work on the issues of
their hardware's drivers or at least provide enough documentation to the
folks who care about these drivers. I guess we both know we're not in
that ideal world.

I wonder what's your objective problem with this revert. You can just
keep PCI_MVEBU disabled, so it doesn't affect the machines you care
about.  You might get some mail with complaints that the driver is
broken in some configuration. But without this revert you might get
(more?) complaints that the driver cannot be enabled. Is that any
better?

Best regards
Uwe
Russell King (Oracle) Aug. 8, 2023, 8:28 a.m. UTC | #18
On Tue, Aug 08, 2023 at 09:26:05AM +0200, Pali Rohár wrote:
> On Friday 04 August 2023 12:35:13 Russell King (Oracle) wrote:
> > Hi,
> > 
> > So it seems this patch got applied, but it wasn't Cc'd to
> > linux-arm-kernel or anyone else, so those of us with platforms never
> > had a chance to comment on it.
> 
> You have received more changes and fixes for last 2 years for these
> issues and you have done **nothing**. You even not said anything.
> So you are the last one who can complain here.

That's because I can't help - what I have *works*. I have *zero*
issues with the PCI interfaces on Armada 388.

> You should have come up and start solving issues. And not complaining
> now.

How can one solve issues when they're probably hardware related and
one doesn't experience them?

Sorry, but no. If you feel as strongly as you do, walk away.
Russell King (Oracle) Aug. 8, 2023, 8:31 a.m. UTC | #19
On Tue, Aug 08, 2023 at 09:27:40AM +0200, Pali Rohár wrote:
> On Friday 04 August 2023 15:54:26 Russell King (Oracle) wrote:
> > On Fri, Aug 04, 2023 at 12:35:13PM +0100, Russell King (Oracle) wrote:
> > > Hi,
> > > 
> > > So it seems this patch got applied, but it wasn't Cc'd to
> > > linux-arm-kernel or anyone else, so those of us with platforms never
> > > had a chance to comment on it.
> > > 
> > > *** This change causes a regression to working setups. ***
> > > 
> > > It appears that the *only* reason this patch was proposed is to stop a
> > > kernel developer receiving problem reports from a set of users, but
> > > completely ignores that there is another group of users where this works
> > > fine - and thus the addition of this patch causes working setups to
> > > regress.
> > > 
> > > Because one is being bothered with problem reports is not a reason to
> > > mark a driver broken - and especially not doing so in a way that those
> > > who may be affected don't get an opportunity to comment on the patch!
> > > Also, there is _zero_ information provided on what the reported problems
> > > actually are, so no one else can guess what these issues are.
> > > 
> > > However, given that there are working setups and this change causes
> > > those to regress, it needs to be reverted.
> > > 
> > > For example, I have an Atheros PCIe WiFi card in an Armada 388 Clearfog
> > > platform, and this works fine.
> > 
> > Further testing - same platform with a mini-PCIe SATA card:
> > 
> > 01:00.0 SATA controller: ASMedia Technology Inc. ASM1062 Serial ATA Controller (rev 01)
> > 
> > with a WD10JPVX-60JC3T0 2.5" drive with hdparm -t:
> > 
> >  Timing buffered disk reads: 344 MB in  3.01 seconds = 114.16 MB/sec
> > 
> > which is about what is expected for spinny-rust 2.5" drives.
> > 
> > This was tested with ASPM and AER disabled. AER isn't supported anyway
> > as pcie_init_service_irqs() fails with -ENODEV.
> 
> So another thing which is broken. Perfect!

No, not broken. There has never been AER because (a) historically the
PCI config reg emulation didn't allow access, and (b) there are _no_
interrupts for this stuff. The reason pcie_init_service_irqs() fails
is because of that.

This is not something that needs fixing. As far as I can see, we do
*not* have the information to support AER.

That does not mean we need to fix it. If we don't have the information,
the feature is not supported. And it's fine that the kernel fails to
initialise the feature.
Russell King (Oracle) Aug. 8, 2023, 8:32 a.m. UTC | #20
On Tue, Aug 08, 2023 at 09:31:54AM +0200, Pali Rohár wrote:
> On Friday 04 August 2023 12:06:55 Bjorn Helgaas wrote:
> > [+cc Krzysztof]
> > 
> > On Fri, Aug 04, 2023 at 12:35:13PM +0100, Russell King (Oracle) wrote:
> > > So it seems this patch got applied, but it wasn't Cc'd to
> > > linux-arm-kernel or anyone else, so those of us with platforms never
> > > had a chance to comment on it.
> > > 
> > > *** This change causes a regression to working setups. ***
> > > 
> > > It appears that the *only* reason this patch was proposed is to stop a
> > > kernel developer receiving problem reports from a set of users, but
> > > completely ignores that there is another group of users where this works
> > > fine - and thus the addition of this patch causes working setups to
> > > regress.
> > > 
> > > Because one is being bothered with problem reports is not a reason to
> > > mark a driver broken - and especially not doing so in a way that those
> > > who may be affected don't get an opportunity to comment on the patch!
> > > Also, there is _zero_ information provided on what the reported problems
> > > actually are, so no one else can guess what these issues are.
> > > 
> > > However, given that there are working setups and this change causes
> > > those to regress, it needs to be reverted.
> > > 
> > > For example, I have an Atheros PCIe WiFi card in an Armada 388 Clearfog
> > > platform, and this works fine.
> > > 
> > > Uwe has a SATA controller for a bunch of disks in an Armada 370 based
> > > NAS platform that is connected to PCIe, and removing PCIe support
> > > effectively makes his platform utterly useless.
> > > 
> > > Please revert this patch.
> > 
> > Sorry for the inconvenience.
> > 
> > I was under the mistaken impression that making the driver depend on
> > CONFIG_BROKEN would keep the driver available but only if the user
> > explicitly requested it, similar to how 
> > CONFIG_COMPILE_TEST works.  But obviously that's not the case, so
> > we'll revert the change.
> > 
> > I queued up the revert below, including a note in the Kconfig help
> > text about the known issues.
> > 
> > commit 814b6bb15367 ("Revert "PCI: mvebu: Mark driver as BROKEN"")
> > Author: Bjorn Helgaas <bhelgaas@google.com>
> > Date:   Fri Aug 4 11:54:43 2023 -0500
> > 
> >     Revert "PCI: mvebu: Mark driver as BROKEN"
> >     
> >     b3574f579ece ("PCI: mvebu: Mark driver as BROKEN") made it impossible to
> >     enable the pci-mvebu driver.  The driver does have known problems, but as
> >     Russell and Uwe reported, it does work in some configurations, so removing
> >     it broke some working setups.
> >     
> >     Revert b3574f579ece so pci-mvebu is available.  Mention the known problems
> >     in the Kconfig help text.
> >     
> >     Reported-by: Russell King (Oracle) <linux@armlinux.org.uk>
> >     Link: https://lore.kernel.org/r/ZMzicVQEyHyZzBOc@shell.armlinux.org.uk
> >     Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> >     Link: https://lore.kernel.org/r/20230804134622.pmbymxtzxj2yfhri@pengutronix.de
> >     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> > 
> 
> What you are trying to achieve with this patch now? Do you think that it
> is really correct to show that everything is working for everybody
> correctly? Use a common sense here.

Common sense is not to break people's working setups. You seem to lack
that appreciation.
Russell King (Oracle) Aug. 8, 2023, 8:42 a.m. UTC | #21
On Tue, Aug 08, 2023 at 09:28:28AM +0100, Russell King (Oracle) wrote:
> On Tue, Aug 08, 2023 at 09:26:05AM +0200, Pali Rohár wrote:
> > On Friday 04 August 2023 12:35:13 Russell King (Oracle) wrote:
> > > Hi,
> > > 
> > > So it seems this patch got applied, but it wasn't Cc'd to
> > > linux-arm-kernel or anyone else, so those of us with platforms never
> > > had a chance to comment on it.
> > 
> > You have received more changes and fixes for last 2 years for these
> > issues and you have done **nothing**. You even not said anything.
> > So you are the last one who can complain here.
> 
> That's because I can't help - what I have *works*. I have *zero*
> issues with the PCI interfaces on Armada 388.
> 
> > You should have come up and start solving issues. And not complaining
> > now.
> 
> How can one solve issues when they're probably hardware related and
> one doesn't experience them?
> 
> Sorry, but no. If you feel as strongly as you do, walk away.

... and how dare you tell me what I should work on - you are *not* my
boss.
Bjorn Helgaas Aug. 8, 2023, 4:26 p.m. UTC | #22
[+cc linux-arm-kernel, beginning of thread:
https://lore.kernel.org/r/20230114164125.1298-1-pali@kernel.org]

On Tue, Aug 08, 2023 at 09:26:05AM +0200, Pali Rohár wrote:
> On Friday 04 August 2023 12:35:13 Russell King (Oracle) wrote:
> ...

> > For example, I have an Atheros PCIe WiFi card in an Armada 388 Clearfog
> > platform, and this works fine.
> > 
> > Uwe has a SATA controller for a bunch of disks in an Armada 370 based
> > NAS platform that is connected to PCIe, and removing PCIe support
> > effectively makes his platform utterly useless.
> > 
> > Please revert this patch.
> 
> Please do not revert it, instead start fixing problems.

We know that like all the other drivers, the mvebu driver isn't
perfect.  But I don't think effectively removing the driver completely
helps anybody.  If people try to use it and notice problems, we have a
chance to try to fix them.

Or maybe I'm missing your point.  I think you're suggesting that we
keep pci-mvebu in the tree but unselectable because it depends on
CONFIG_BROKEN.  What would be the advantage of doing that?

Bjorn
Pali Rohár Aug. 8, 2023, 7:01 p.m. UTC | #23
On Tuesday 08 August 2023 10:01:59 Uwe Kleine-König wrote:
> Hello Pali,
> 
> On Tue, Aug 08, 2023 at 09:31:54AM +0200, Pali Rohár wrote:
> > On Friday 04 August 2023 12:06:55 Bjorn Helgaas wrote:
> > > I queued up the revert below, including a note in the Kconfig help
> > > text about the known issues.
> > > 
> > > commit 814b6bb15367 ("Revert "PCI: mvebu: Mark driver as BROKEN"")
> > > Author: Bjorn Helgaas <bhelgaas@google.com>
> > > Date:   Fri Aug 4 11:54:43 2023 -0500
> > > 
> > >     Revert "PCI: mvebu: Mark driver as BROKEN"
> > >     
> > >     b3574f579ece ("PCI: mvebu: Mark driver as BROKEN") made it impossible to
> > >     enable the pci-mvebu driver.  The driver does have known problems, but as
> > >     Russell and Uwe reported, it does work in some configurations, so removing
> > >     it broke some working setups.
> > >     
> > >     Revert b3574f579ece so pci-mvebu is available.  Mention the known problems
> > >     in the Kconfig help text.
> > >     
> > >     Reported-by: Russell King (Oracle) <linux@armlinux.org.uk>
> > >     Link: https://lore.kernel.org/r/ZMzicVQEyHyZzBOc@shell.armlinux.org.uk
> > >     Reported-by: Uwe Kleine-König <u.kleine-koenig@pengutronix.de>
> > >     Link: https://lore.kernel.org/r/20230804134622.pmbymxtzxj2yfhri@pengutronix.de
> > >     Signed-off-by: Bjorn Helgaas <bhelgaas@google.com>
> > > 
> > 
> > What you are trying to achieve with this patch now?
> 
> We all agree that there are problems with the pci-mvebu driver. But to
> my knowledge it works in some configurations. So I guess the thing to
> achieve here is to go from "broken for everyone" (with the driver
> depending on BROKEN) to "broken for only some people". So in my view
> it's an improvement.

There is no improvement. I do not see any patch, suggestion or other
contribution or discussion how to at least start doing anything in this
area.

> > Do you think that it is really correct to show that everything is
> > working for everybody correctly?
> 
> What makes you think Bjorn considers everything to work fine? Both the
> commit log and the added help text suggest he's aware of problems.

Where it is written? There is nothing which instruct majority of
distributions that there are problems about which they should be aware.

Simple revert without any indication is not a solution.

> > Now I'm starting understand why majority of HW industry say to not use
> > "unsupported mainline kernel" and instead use our prepared patched
> > kernels...
> 
> Yes, for a given company it's the easiest and cheapest option to just
> support a single kernel version. In this regard the patch adding the
> dependency on BROKEN is even worse than the commit that removes it
> again. In an ideal world Marvell would care to work on the issues of
> their hardware's drivers or at least provide enough documentation to the
> folks who care about these drivers. I guess we both know we're not in
> that ideal world.

Now I perfectly understand Marvell, why they do not want to do it.
Now after 3 years I see how the whole development in PCI works.

> I wonder what's your objective problem with this revert. You can just
> keep PCI_MVEBU disabled, so it doesn't affect the machines you care
> about.  You might get some mail with complaints that the driver is
> broken in some configuration. But without this revert you might get
> (more?) complaints that the driver cannot be enabled. Is that any
> better?

Surprisingly I got less emails about problems since then.

> Best regards
> Uwe
> 
> -- 
> Pengutronix e.K.                           | Uwe Kleine-König            |
> Industrial Linux Solutions                 | https://www.pengutronix.de/ |
Pali Rohár Aug. 8, 2023, 7:06 p.m. UTC | #24
On Tuesday 08 August 2023 09:28:28 Russell King (Oracle) wrote:
> On Tue, Aug 08, 2023 at 09:26:05AM +0200, Pali Rohár wrote:
> > On Friday 04 August 2023 12:35:13 Russell King (Oracle) wrote:
> > > Hi,
> > > 
> > > So it seems this patch got applied, but it wasn't Cc'd to
> > > linux-arm-kernel or anyone else, so those of us with platforms never
> > > had a chance to comment on it.
> > 
> > You have received more changes and fixes for last 2 years for these
> > issues and you have done **nothing**. You even not said anything.
> > So you are the last one who can complain here.
> 
> That's because I can't help - what I have *works*. I have *zero*
> issues with the PCI interfaces on Armada 388.

Perfect, apply your patch on your kernel setup and stop complaining
here. Nobody is interested for one Russel's user setup.

> > You should have come up and start solving issues. And not complaining
> > now.
> 
> How can one solve issues when they're probably hardware related and
> one doesn't experience them?
> 
> Sorry, but no.

Ok, if you do not want, that we have nothing to discuss here, and your
patch should be rejected.

> If you feel as strongly as you do, walk away.

What? You should go away and never return. Nobody is interested for one
nobody Russel user. Thanks.

And if you not agree with me, then go ahead and send a patch to remove
my maintainer line from this driver and explain to all people that you
are best one user on the world and that everybody should do what you
wrote.
Pali Rohár Aug. 8, 2023, 7:07 p.m. UTC | #25
On Tuesday 08 August 2023 09:42:19 Russell King (Oracle) wrote:
> On Tue, Aug 08, 2023 at 09:28:28AM +0100, Russell King (Oracle) wrote:
> > On Tue, Aug 08, 2023 at 09:26:05AM +0200, Pali Rohár wrote:
> > > On Friday 04 August 2023 12:35:13 Russell King (Oracle) wrote:
> > > > Hi,
> > > > 
> > > > So it seems this patch got applied, but it wasn't Cc'd to
> > > > linux-arm-kernel or anyone else, so those of us with platforms never
> > > > had a chance to comment on it.
> > > 
> > > You have received more changes and fixes for last 2 years for these
> > > issues and you have done **nothing**. You even not said anything.
> > > So you are the last one who can complain here.
> > 
> > That's because I can't help - what I have *works*. I have *zero*
> > issues with the PCI interfaces on Armada 388.
> > 
> > > You should have come up and start solving issues. And not complaining
> > > now.
> > 
> > How can one solve issues when they're probably hardware related and
> > one doesn't experience them?
> > 
> > Sorry, but no. If you feel as strongly as you do, walk away.
> 
> ... and how dare you tell me what I should work on - you are *not* my
> boss.

I'm not your boss, but I'm still marked as maintainer of this code and
you should get my review for any changes here. Or was there some change?
Pali Rohár Aug. 8, 2023, 7:20 p.m. UTC | #26
On Tuesday 08 August 2023 11:26:27 Bjorn Helgaas wrote:
> [+cc linux-arm-kernel, beginning of thread:
> https://lore.kernel.org/r/20230114164125.1298-1-pali@kernel.org]
> 
> On Tue, Aug 08, 2023 at 09:26:05AM +0200, Pali Rohár wrote:
> > On Friday 04 August 2023 12:35:13 Russell King (Oracle) wrote:
> > ...
> 
> > > For example, I have an Atheros PCIe WiFi card in an Armada 388 Clearfog
> > > platform, and this works fine.
> > > 
> > > Uwe has a SATA controller for a bunch of disks in an Armada 370 based
> > > NAS platform that is connected to PCIe, and removing PCIe support
> > > effectively makes his platform utterly useless.
> > > 
> > > Please revert this patch.
> > 
> > Please do not revert it, instead start fixing problems.
> 
> We know that like all the other drivers, the mvebu driver isn't
> perfect.  But I don't think effectively removing the driver completely
> helps anybody.  If people try to use it and notice problems, we have a
> chance to try to fix them.

I do not want to remove it. I was trying to find somebody who can start
caring about issues. In last year I was resending patches, some smaller
which could improve situation, but most of them were ignored or rejected.

So I'm here and waiting for alternatives, and I'm prepared to review
changes and patches for mvebu, which can improve driver support.

But I do not see anything. The only one who wrote something useful was
Uwe as he wanted to do some git bisect (which normally indicates issues
or also fixup/patch).

Also some times ago Greg wrote something like that (mainline) kernel is
place for unsupported and broken drivers. But mvebu is going in this
direction.

How can I otherwise point out to start doing something in this area?

Or are you unhappy with the fact that there is at least somebody (me)
who is willing to do patch review for this marvell stuff? You should
have said it to me earlier.

But as I'm reading now, that I should go away, maybe you should have to
find also new reviewer for driver. Good luck here as there was nobody
who even wanted to do anything in this area.

> Or maybe I'm missing your point.  I think you're suggesting that we
> keep pci-mvebu in the tree but unselectable because it depends on
> CONFIG_BROKEN.  What would be the advantage of doing that?
> 
> Bjorn

Well, all knows here that driver is in bad state. In past there were
regressions and no accepted fixes for it. (At that time I prepared fix,
but you did not like it and nobody else comes with other alternative
patch).

There area other options which can be done now, if there are only people
like Russel who are complaining but refusing to do absolutely nothing.
For example mark driver as experimental (there is some Kconfig symbol
for it). Or add a new menuconfig selectable symbol which appropriately
warn all distributions about problems and would be dependency for mvebu.
(if you do not like broken symbol).
Russell King (Oracle) Aug. 8, 2023, 7:29 p.m. UTC | #27
On Tue, Aug 08, 2023 at 09:06:09PM +0200, Pali Rohár wrote:
> On Tuesday 08 August 2023 09:28:28 Russell King (Oracle) wrote:
> > On Tue, Aug 08, 2023 at 09:26:05AM +0200, Pali Rohár wrote:
> > > On Friday 04 August 2023 12:35:13 Russell King (Oracle) wrote:
> > > > Hi,
> > > > 
> > > > So it seems this patch got applied, but it wasn't Cc'd to
> > > > linux-arm-kernel or anyone else, so those of us with platforms never
> > > > had a chance to comment on it.
> > > 
> > > You have received more changes and fixes for last 2 years for these
> > > issues and you have done **nothing**. You even not said anything.
> > > So you are the last one who can complain here.
> > 
> > That's because I can't help - what I have *works*. I have *zero*
> > issues with the PCI interfaces on Armada 388.
> 
> Perfect, apply your patch on your kernel setup and stop complaining
> here. Nobody is interested for one Russel's user setup.

What the fuck?

What patch? I'm not proposing a patch. What I am asking for is the
damage that you have caused (by making the driver depend on BROKEN
because you don't want to support it anymore) be reverted so that
those of us for whom IT IS WORKING can continue to use the driver.

> > > You should have come up and start solving issues. And not complaining
> > > now.
> > 
> > How can one solve issues when they're probably hardware related and
> > one doesn't experience them?
> > 
> > Sorry, but no.
> 
> Ok, if you do not want, that we have nothing to discuss here, and your
> patch should be rejected.

What patch? You are making this crap up.
Bjorn Helgaas Aug. 8, 2023, 7:41 p.m. UTC | #28
On Tue, Aug 08, 2023 at 09:06:09PM +0200, Pali Rohár wrote:

> ... then go ahead and send a patch to remove
> my maintainer line from this driver ...

Oh, I forgot that you're listed in MAINTAINERS for this driver.  I
don't want to do it for you, but if you'd prefer not to get problem
reports about pci-mvebu, you could send a patch to update that
MAINTAINERS entry.

Bjorn
Russell King (Oracle) Aug. 8, 2023, 7:54 p.m. UTC | #29
On Tue, Aug 08, 2023 at 09:20:26PM +0200, Pali Rohár wrote:
> There area other options which can be done now, if there are only people
> like Russel who are complaining but refusing to do absolutely nothing.

Fucking hell, here we go with the accusations again.

And you can't even be bothered to spell my name correctly.

Let's start over at your first accusation in this thread, because this
says everything about what the problem here is:

"You have received more changes and fixes for last 2 years for these
issues and you have done **nothing**. You even not said anything.
So you are the last one who can complain here.

And I'm stopped communicating with people who do not want to communicate
with me. This is pretty normal situation and you should have think about
it. No?"

Let's go through my points one by one, maybe you'll then understand,
because right now you seem to be totally immune to any appreciation
of anyone's situation other than your own.

1. pci-mvebu works 100% fine for me.

2. I do not see any problems with the hardware I have. If I have no
   problems, then by definition it works fine. How can I test - for
   example, failure to bring up the PCIe link (which I believe some
   of your patches were trying to address) when I HAVE NO PROBLEM WITH
   THE PCIe LINK NOT COMING UP? FFS, take a moment to think about that.

3. I have not asked you to work on it.

4. You do not have the right to demand that I do anything with it.

5. You decided to pick up some patches that I had in my tree and
   merge them into the kernel.

6. You objected when I rebased my branches on top of what you had
   merged (which were modified versions of my patches). I wanted to
   keep my changes intact.

   (Again, you have no damn right in this whole damned world to
   complain about what I do in my own git tree.)

7. You decided on your own back to mark the driver BROKEN - and you
   did that in such a way to *ensure* that no one who would have
   been using it would notice until after the patch was merged. That
   is underhand tactics. You effectively admit to that in my above
   quotation, which effectively states that you _actively_ decided
   to exclude me from that because I didn't interact with you.

> For example mark driver as experimental (there is some Kconfig symbol
> for it). Or add a new menuconfig selectable symbol which appropriately
> warn all distributions about problems and would be dependency for mvebu.
> (if you do not like broken symbol).

So distributions end up crippling systems that they've stated that they
support? That is the side effect of your patch?

I welcome your resignation as maintainer of this driver - and at this
point I think that is the best course of action given everything that
has happened. Yes - you have now driven me to actively seek your
resignation.
diff mbox series

Patch

diff --git a/drivers/pci/controller/Kconfig b/drivers/pci/controller/Kconfig
index 1569d9a3ada0..b4a4d84a358b 100644
--- a/drivers/pci/controller/Kconfig
+++ b/drivers/pci/controller/Kconfig
@@ -9,6 +9,7 @@  config PCI_MVEBU
 	depends on MVEBU_MBUS
 	depends on ARM
 	depends on OF
+	depends on BROKEN
 	select PCI_BRIDGE_EMUL
 	help
 	 Add support for Marvell EBU PCIe controller. This PCIe controller