diff mbox

net: phy: Fix phy_modify() semantic difference fallout

Message ID 1515496281-10988-1-git-send-email-geert+renesas@glider.be (mailing list archive)
State Superseded
Delegated to: Geert Uytterhoeven
Headers show

Commit Message

Geert Uytterhoeven Jan. 9, 2018, 11:11 a.m. UTC
In case of success, the return values of (__)phy_write() and
(__)phy_modify() are not compatible: (__)phy_write() returns 0, while
(__)phy_modify() returns the old PHY register value.

Apparently this change was catered for in drivers/net/phy/marvell.c, but
not in other source files.

Hence genphy_restart_aneg() now returns 4416 instead zero, which is
considered an error:

    ravb e6800000.ethernet eth0: failed to connect PHY
    IP-Config: Failed to open eth0
    IP-Config: No network devices available

Fix this by converting positive values to zero in all callers of
phy_modify().

Fixes: fea23fb591cce995 ("net: phy: convert read-modify-write to phy_modify()")
Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
---
Alternatively, __phy_modify() could be changed to follow __phy_write()
semantics?
---
 drivers/net/phy/at803x.c     |  4 +++-
 drivers/net/phy/phy_device.c | 29 +++++++++++++++++++++--------
 2 files changed, 24 insertions(+), 9 deletions(-)

Comments

Andrew Lunn Jan. 9, 2018, 2:10 p.m. UTC | #1
On Tue, Jan 09, 2018 at 12:11:21PM +0100, Geert Uytterhoeven wrote:
> In case of success, the return values of (__)phy_write() and
> (__)phy_modify() are not compatible: (__)phy_write() returns 0, while
> (__)phy_modify() returns the old PHY register value.
> 
> Apparently this change was catered for in drivers/net/phy/marvell.c, but
> not in other source files.
> 
> Hence genphy_restart_aneg() now returns 4416 instead zero, which is
> considered an error:
> 
>     ravb e6800000.ethernet eth0: failed to connect PHY
>     IP-Config: Failed to open eth0
>     IP-Config: No network devices available
> 
> Fix this by converting positive values to zero in all callers of
> phy_modify().
> 
> Fixes: fea23fb591cce995 ("net: phy: convert read-modify-write to phy_modify()")
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> Alternatively, __phy_modify() could be changed to follow __phy_write()
> semantics?

Hi Geert, Russell

I took a quick look at the uses of phy_modify(). I don't see any uses
of the return code other than as an error indicator. So having it
return 0 on success seems like a better fix.

       Andrew
Russell King (Oracle) Jan. 9, 2018, 2:22 p.m. UTC | #2
On Tue, Jan 09, 2018 at 03:10:08PM +0100, Andrew Lunn wrote:
> On Tue, Jan 09, 2018 at 12:11:21PM +0100, Geert Uytterhoeven wrote:
> > In case of success, the return values of (__)phy_write() and
> > (__)phy_modify() are not compatible: (__)phy_write() returns 0, while
> > (__)phy_modify() returns the old PHY register value.
> > 
> > Apparently this change was catered for in drivers/net/phy/marvell.c, but
> > not in other source files.
> > 
> > Hence genphy_restart_aneg() now returns 4416 instead zero, which is
> > considered an error:
> > 
> >     ravb e6800000.ethernet eth0: failed to connect PHY
> >     IP-Config: Failed to open eth0
> >     IP-Config: No network devices available
> > 
> > Fix this by converting positive values to zero in all callers of
> > phy_modify().
> > 
> > Fixes: fea23fb591cce995 ("net: phy: convert read-modify-write to phy_modify()")
> > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> > ---
> > Alternatively, __phy_modify() could be changed to follow __phy_write()
> > semantics?
> 
> Hi Geert, Russell
> 
> I took a quick look at the uses of phy_modify(). I don't see any uses
> of the return code other than as an error indicator. So having it
> return 0 on success seems like a better fix.

I'd like to avoid that, because I don't want to have yet another
accessor that needs to be used for advertisment modification (where
we need to know if we changed any bits.)

That's why this accessor returns the old value.
Niklas Cassel Jan. 9, 2018, 2:35 p.m. UTC | #3
On 09/01/18 12:11, Geert Uytterhoeven wrote:
> In case of success, the return values of (__)phy_write() and
> (__)phy_modify() are not compatible: (__)phy_write() returns 0, while
> (__)phy_modify() returns the old PHY register value.
> 
> Apparently this change was catered for in drivers/net/phy/marvell.c, but
> not in other source files.
> 
> Hence genphy_restart_aneg() now returns 4416 instead zero, which is
> considered an error:
> 
>     ravb e6800000.ethernet eth0: failed to connect PHY
>     IP-Config: Failed to open eth0
>     IP-Config: No network devices available
> 
> Fix this by converting positive values to zero in all callers of
> phy_modify().
> 
> Fixes: fea23fb591cce995 ("net: phy: convert read-modify-write to phy_modify()")
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> Alternatively, __phy_modify() could be changed to follow __phy_write()
> semantics?
> ---
>  drivers/net/phy/at803x.c     |  4 +++-
>  drivers/net/phy/phy_device.c | 29 +++++++++++++++++++++--------
>  2 files changed, 24 insertions(+), 9 deletions(-)
> 
> diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
> index 411cf1072bae5796..6b6b9cef517f1bc3 100644
> --- a/drivers/net/phy/at803x.c
> +++ b/drivers/net/phy/at803x.c
> @@ -230,7 +230,9 @@ static int at803x_suspend(struct phy_device *phydev)
>  
>  static int at803x_resume(struct phy_device *phydev)
>  {
> -	return phy_modify(phydev, MII_BMCR, BMCR_PDOWN | BMCR_ISOLATE, 0);
> +	int ret = phy_modify(phydev, MII_BMCR, BMCR_PDOWN | BMCR_ISOLATE, 0);
> +
> +	return ret < 0 ? ret : 0;
>  }
>  
>  static int at803x_probe(struct phy_device *phydev)
> diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
> index 6bd11a070ec8147b..a132e845e4aec3d5 100644
> --- a/drivers/net/phy/phy_device.c
> +++ b/drivers/net/phy/phy_device.c
> @@ -1369,6 +1369,7 @@ static int genphy_config_eee_advert(struct phy_device *phydev)
>  int genphy_setup_forced(struct phy_device *phydev)
>  {
>  	u16 ctl = 0;
> +	int ret;
>  
>  	phydev->pause = 0;
>  	phydev->asym_pause = 0;
> @@ -1381,8 +1382,12 @@ int genphy_setup_forced(struct phy_device *phydev)
>  	if (DUPLEX_FULL == phydev->duplex)
>  		ctl |= BMCR_FULLDPLX;
>  
> -	return phy_modify(phydev, MII_BMCR,
> -			  BMCR_LOOPBACK | BMCR_ISOLATE | BMCR_PDOWN, ctl);
> +	ret = phy_modify(phydev, MII_BMCR,
> +			 BMCR_LOOPBACK | BMCR_ISOLATE | BMCR_PDOWN, ctl);
> +	if (ret < 0)
> +		return ret;
> +
> +	return 0;
>  }
>  EXPORT_SYMBOL(genphy_setup_forced);
>  
> @@ -1393,8 +1398,10 @@ EXPORT_SYMBOL(genphy_setup_forced);
>  int genphy_restart_aneg(struct phy_device *phydev)
>  {
>  	/* Don't isolate the PHY if we're negotiating */
> -	return phy_modify(phydev, MII_BMCR, BMCR_ISOLATE,
> -			  BMCR_ANENABLE | BMCR_ANRESTART);
> +	int ret = phy_modify(phydev, MII_BMCR, BMCR_ISOLATE,
> +			     BMCR_ANENABLE | BMCR_ANRESTART);
> +
> +	return ret < 0 ? ret : 0;
>  }
>  EXPORT_SYMBOL(genphy_restart_aneg);
>  
> @@ -1660,20 +1667,26 @@ EXPORT_SYMBOL(genphy_config_init);
>  
>  int genphy_suspend(struct phy_device *phydev)
>  {
> -	return phy_modify(phydev, MII_BMCR, 0, BMCR_PDOWN);
> +	int ret = phy_modify(phydev, MII_BMCR, 0, BMCR_PDOWN);
> +
> +	return ret < 0 ? ret : 0;
>  }
>  EXPORT_SYMBOL(genphy_suspend);
>  
>  int genphy_resume(struct phy_device *phydev)
>  {
> -	return phy_modify(phydev, MII_BMCR, BMCR_PDOWN, 0);
> +	int ret = phy_modify(phydev, MII_BMCR, BMCR_PDOWN, 0);
> +
> +	return ret < 0 ? ret : 0;
>  }
>  EXPORT_SYMBOL(genphy_resume);
>  
>  int genphy_loopback(struct phy_device *phydev, bool enable)
>  {
> -	return phy_modify(phydev, MII_BMCR, BMCR_LOOPBACK,
> -			  enable ? BMCR_LOOPBACK : 0);
> +	int ret = phy_modify(phydev, MII_BMCR, BMCR_LOOPBACK,
> +			     enable ? BMCR_LOOPBACK : 0);
> +
> +	return ret < 0 ? ret : 0;
>  }
>  EXPORT_SYMBOL(genphy_loopback);
>  
> 

This patch solves the following problem on ARTPEC-6:

[    2.562607] dwc-eth-dwmac f8010000.ethernet eth0: device MAC address 00:aa:bb:cc:13:36
[    2.670029] dwc-eth-dwmac f8010000.ethernet eth0: Could not attach to PHY
[    2.676826] dwc-eth-dwmac f8010000.ethernet eth0: stmmac_open: Cannot attach to PHY (error: -19)

When using linux-next: next-20180109

next-20180109 contains:
fea23fb591cc ("net: phy: convert read-modify-write to phy_modify()")
and
f102852f980e ("net: phy: fix wrong masks to phy_modify()")

Tested-by: Niklas Cassel <niklas.cassel@axis.com>
Geert Uytterhoeven Jan. 9, 2018, 2:35 p.m. UTC | #4
Hi Russell,

On Tue, Jan 9, 2018 at 3:22 PM, Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:
> On Tue, Jan 09, 2018 at 03:10:08PM +0100, Andrew Lunn wrote:
>> On Tue, Jan 09, 2018 at 12:11:21PM +0100, Geert Uytterhoeven wrote:
>> > In case of success, the return values of (__)phy_write() and
>> > (__)phy_modify() are not compatible: (__)phy_write() returns 0, while
>> > (__)phy_modify() returns the old PHY register value.
>> >
>> > Apparently this change was catered for in drivers/net/phy/marvell.c, but
>> > not in other source files.
>> >
>> > Hence genphy_restart_aneg() now returns 4416 instead zero, which is
>> > considered an error:
>> >
>> >     ravb e6800000.ethernet eth0: failed to connect PHY
>> >     IP-Config: Failed to open eth0
>> >     IP-Config: No network devices available
>> >
>> > Fix this by converting positive values to zero in all callers of
>> > phy_modify().
>> >
>> > Fixes: fea23fb591cce995 ("net: phy: convert read-modify-write to phy_modify()")
>> > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>> > ---
>> > Alternatively, __phy_modify() could be changed to follow __phy_write()
>> > semantics?
>>
>> Hi Geert, Russell
>>
>> I took a quick look at the uses of phy_modify(). I don't see any uses
>> of the return code other than as an error indicator. So having it
>> return 0 on success seems like a better fix.
>
> I'd like to avoid that, because I don't want to have yet another
> accessor that needs to be used for advertisment modification (where
> we need to know if we changed any bits.)
>
> That's why this accessor returns the old value.

Can I consider that to be an Acked-by for my patch? ;-)

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Andrew Lunn Jan. 9, 2018, 2:48 p.m. UTC | #5
> > I took a quick look at the uses of phy_modify(). I don't see any uses
> > of the return code other than as an error indicator. So having it
> > return 0 on success seems like a better fix.
> 
> I'd like to avoid that, because I don't want to have yet another
> accessor that needs to be used for advertisment modification (where
> we need to know if we changed any bits.)
> 
> That's why this accessor returns the old value.

Hi Russell

where exactly is this use case? I've not found it yet.

I can understand your argument. But how long it is going to take us to
find all the breakage because the return value has changed meaning?

The trade off is adding yet another accessor vs debugging and fixing
the repercussions.

I think i prefer not breaking existing code.

  Andrew
Russell King (Oracle) Jan. 9, 2018, 2:50 p.m. UTC | #6
On Tue, Jan 09, 2018 at 03:48:13PM +0100, Andrew Lunn wrote:
> > > I took a quick look at the uses of phy_modify(). I don't see any uses
> > > of the return code other than as an error indicator. So having it
> > > return 0 on success seems like a better fix.
> > 
> > I'd like to avoid that, because I don't want to have yet another
> > accessor that needs to be used for advertisment modification (where
> > we need to know if we changed any bits.)
> > 
> > That's why this accessor returns the old value.
> 
> Hi Russell
> 
> where exactly is this use case? I've not found it yet.
> 
> I can understand your argument. But how long it is going to take us to
> find all the breakage because the return value has changed meaning?
> 
> The trade off is adding yet another accessor vs debugging and fixing
> the repercussions.
> 
> I think i prefer not breaking existing code.

Please introduce a new accessor then.
Geert Uytterhoeven Jan. 9, 2018, 6:25 p.m. UTC | #7
Hi Russell,

On Tue, Jan 9, 2018 at 3:22 PM, Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:
> On Tue, Jan 09, 2018 at 03:10:08PM +0100, Andrew Lunn wrote:
>> On Tue, Jan 09, 2018 at 12:11:21PM +0100, Geert Uytterhoeven wrote:
>> > In case of success, the return values of (__)phy_write() and
>> > (__)phy_modify() are not compatible: (__)phy_write() returns 0, while
>> > (__)phy_modify() returns the old PHY register value.
>> >
>> > Apparently this change was catered for in drivers/net/phy/marvell.c, but
>> > not in other source files.
>> >
>> > Hence genphy_restart_aneg() now returns 4416 instead zero, which is
>> > considered an error:
>> >
>> >     ravb e6800000.ethernet eth0: failed to connect PHY
>> >     IP-Config: Failed to open eth0
>> >     IP-Config: No network devices available
>> >
>> > Fix this by converting positive values to zero in all callers of
>> > phy_modify().
>> >
>> > Fixes: fea23fb591cce995 ("net: phy: convert read-modify-write to phy_modify()")
>> > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>> > ---
>> > Alternatively, __phy_modify() could be changed to follow __phy_write()
>> > semantics?
>>
>> Hi Geert, Russell
>>
>> I took a quick look at the uses of phy_modify(). I don't see any uses
>> of the return code other than as an error indicator. So having it
>> return 0 on success seems like a better fix.
>
> I'd like to avoid that, because I don't want to have yet another
> accessor that needs to be used for advertisment modification (where
> we need to know if we changed any bits.)
>
> That's why this accessor returns the old value.

But this is documented nowhere!

I believe there are no current users of (__)phy_modify() that rely on this
behavior.  Except perhaps phy_restore_page(), which I don't understand at all.

BTW, I think phy_restore_page() may return a strict positive value as well,
thus breaking m88e1318_set_wol(), which is not supposed to return strict
positive values.

So changing __phy_modify() to return zero on success seems like the way
forward...

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Russell King (Oracle) Jan. 9, 2018, 6:31 p.m. UTC | #8
On Tue, Jan 09, 2018 at 07:25:40PM +0100, Geert Uytterhoeven wrote:
> Hi Russell,
> 
> On Tue, Jan 9, 2018 at 3:22 PM, Russell King - ARM Linux
> <linux@armlinux.org.uk> wrote:
> > On Tue, Jan 09, 2018 at 03:10:08PM +0100, Andrew Lunn wrote:
> >> On Tue, Jan 09, 2018 at 12:11:21PM +0100, Geert Uytterhoeven wrote:
> >> > In case of success, the return values of (__)phy_write() and
> >> > (__)phy_modify() are not compatible: (__)phy_write() returns 0, while
> >> > (__)phy_modify() returns the old PHY register value.
> >> >
> >> > Apparently this change was catered for in drivers/net/phy/marvell.c, but
> >> > not in other source files.
> >> >
> >> > Hence genphy_restart_aneg() now returns 4416 instead zero, which is
> >> > considered an error:
> >> >
> >> >     ravb e6800000.ethernet eth0: failed to connect PHY
> >> >     IP-Config: Failed to open eth0
> >> >     IP-Config: No network devices available
> >> >
> >> > Fix this by converting positive values to zero in all callers of
> >> > phy_modify().
> >> >
> >> > Fixes: fea23fb591cce995 ("net: phy: convert read-modify-write to phy_modify()")
> >> > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> >> > ---
> >> > Alternatively, __phy_modify() could be changed to follow __phy_write()
> >> > semantics?
> >>
> >> Hi Geert, Russell
> >>
> >> I took a quick look at the uses of phy_modify(). I don't see any uses
> >> of the return code other than as an error indicator. So having it
> >> return 0 on success seems like a better fix.
> >
> > I'd like to avoid that, because I don't want to have yet another
> > accessor that needs to be used for advertisment modification (where
> > we need to know if we changed any bits.)
> >
> > That's why this accessor returns the old value.
> 
> But this is documented nowhere!
> 
> I believe there are no current users of (__)phy_modify() that rely on this
> behavior.  Except perhaps phy_restore_page(), which I don't understand at all.
> 
> BTW, I think phy_restore_page() may return a strict positive value as well,
> thus breaking m88e1318_set_wol(), which is not supposed to return strict
> positive values.

Correct, and it has to for temperature reading in marvell.c to work.

> So changing __phy_modify() to return zero on success seems like the way
> forward...

So what do we call an accessor that returns the original value?

__phy_modify_return_old_value()

bit long-winded isn't it?
Geert Uytterhoeven Jan. 9, 2018, 6:36 p.m. UTC | #9
Hi Russell,

On Tue, Jan 9, 2018 at 7:31 PM, Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:
> On Tue, Jan 09, 2018 at 07:25:40PM +0100, Geert Uytterhoeven wrote:
>> On Tue, Jan 9, 2018 at 3:22 PM, Russell King - ARM Linux
>> <linux@armlinux.org.uk> wrote:
>> > On Tue, Jan 09, 2018 at 03:10:08PM +0100, Andrew Lunn wrote:
>> >> On Tue, Jan 09, 2018 at 12:11:21PM +0100, Geert Uytterhoeven wrote:
>> >> > In case of success, the return values of (__)phy_write() and
>> >> > (__)phy_modify() are not compatible: (__)phy_write() returns 0, while
>> >> > (__)phy_modify() returns the old PHY register value.
>> >> >
>> >> > Apparently this change was catered for in drivers/net/phy/marvell.c, but
>> >> > not in other source files.
>> >> >
>> >> > Hence genphy_restart_aneg() now returns 4416 instead zero, which is
>> >> > considered an error:
>> >> >
>> >> >     ravb e6800000.ethernet eth0: failed to connect PHY
>> >> >     IP-Config: Failed to open eth0
>> >> >     IP-Config: No network devices available
>> >> >
>> >> > Fix this by converting positive values to zero in all callers of
>> >> > phy_modify().
>> >> >
>> >> > Fixes: fea23fb591cce995 ("net: phy: convert read-modify-write to phy_modify()")
>> >> > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>> >> > ---
>> >> > Alternatively, __phy_modify() could be changed to follow __phy_write()
>> >> > semantics?
>> >>
>> >> Hi Geert, Russell
>> >>
>> >> I took a quick look at the uses of phy_modify(). I don't see any uses
>> >> of the return code other than as an error indicator. So having it
>> >> return 0 on success seems like a better fix.
>> >
>> > I'd like to avoid that, because I don't want to have yet another
>> > accessor that needs to be used for advertisment modification (where
>> > we need to know if we changed any bits.)
>> >
>> > That's why this accessor returns the old value.
>>
>> But this is documented nowhere!
>>
>> I believe there are no current users of (__)phy_modify() that rely on this
>> behavior.  Except perhaps phy_restore_page(), which I don't understand at all.
>>
>> BTW, I think phy_restore_page() may return a strict positive value as well,
>> thus breaking m88e1318_set_wol(), which is not supposed to return strict
>> positive values.
>
> Correct, and it has to for temperature reading in marvell.c to work.

For phy_restore_page()?
Not for breaking m88e1318_set_wol(), I guess?

>> So changing __phy_modify() to return zero on success seems like the way
>> forward...
>
> So what do we call an accessor that returns the original value?
>
> __phy_modify_return_old_value()

__phy_modify_ret()?

Or __phy_modify(...., u16 *oldval) (where oldval can be NULL)?

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
David Miller Jan. 11, 2018, 3:48 p.m. UTC | #10
From: Geert Uytterhoeven <geert+renesas@glider.be>
Date: Tue,  9 Jan 2018 12:11:21 +0100

> In case of success, the return values of (__)phy_write() and
> (__)phy_modify() are not compatible: (__)phy_write() returns 0, while
> (__)phy_modify() returns the old PHY register value.
> 
> Apparently this change was catered for in drivers/net/phy/marvell.c, but
> not in other source files.
> 
> Hence genphy_restart_aneg() now returns 4416 instead zero, which is
> considered an error:
> 
>     ravb e6800000.ethernet eth0: failed to connect PHY
>     IP-Config: Failed to open eth0
>     IP-Config: No network devices available
> 
> Fix this by converting positive values to zero in all callers of
> phy_modify().
> 
> Fixes: fea23fb591cce995 ("net: phy: convert read-modify-write to phy_modify()")
> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> ---
> Alternatively, __phy_modify() could be changed to follow __phy_write()
> semantics?

I really want a resolution to this quickly, this broke lots of stuff
for people.

__phy_modify() wants to return multiple values, so it should be coded
up to do so explicitly rather than trying to encode two values from
overlapping value spaces in one return value.

That means the original value should be returned by-reference.  And
this will make the error/no-error return value unambiguous.

int __phy_modify(struct phy_device *phydev, u32 regnum, u16 mask, u16 set,
		 u16 *orig_val);

Thank you.
Russell King (Oracle) Jan. 11, 2018, 3:53 p.m. UTC | #11
On Thu, Jan 11, 2018 at 10:48:35AM -0500, David Miller wrote:
> From: Geert Uytterhoeven <geert+renesas@glider.be>
> Date: Tue,  9 Jan 2018 12:11:21 +0100
> 
> > In case of success, the return values of (__)phy_write() and
> > (__)phy_modify() are not compatible: (__)phy_write() returns 0, while
> > (__)phy_modify() returns the old PHY register value.
> > 
> > Apparently this change was catered for in drivers/net/phy/marvell.c, but
> > not in other source files.
> > 
> > Hence genphy_restart_aneg() now returns 4416 instead zero, which is
> > considered an error:
> > 
> >     ravb e6800000.ethernet eth0: failed to connect PHY
> >     IP-Config: Failed to open eth0
> >     IP-Config: No network devices available
> > 
> > Fix this by converting positive values to zero in all callers of
> > phy_modify().
> > 
> > Fixes: fea23fb591cce995 ("net: phy: convert read-modify-write to phy_modify()")
> > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> > ---
> > Alternatively, __phy_modify() could be changed to follow __phy_write()
> > semantics?
> 
> I really want a resolution to this quickly, this broke lots of stuff
> for people.
> 
> __phy_modify() wants to return multiple values, so it should be coded
> up to do so explicitly rather than trying to encode two values from
> overlapping value spaces in one return value.
> 
> That means the original value should be returned by-reference.  And
> this will make the error/no-error return value unambiguous.
> 
> int __phy_modify(struct phy_device *phydev, u32 regnum, u16 mask, u16 set,
> 		 u16 *orig_val);

I'm sorry I have no time to work on this right now due to the meltdown
and spectre stuff that hit last week.  If you need to do something,
please revert both the mvneta series and the series containing this
patch.

Thanks.
Geert Uytterhoeven Jan. 11, 2018, 3:54 p.m. UTC | #12
On Thu, Jan 11, 2018 at 4:53 PM, Russell King - ARM Linux
<linux@armlinux.org.uk> wrote:
> On Thu, Jan 11, 2018 at 10:48:35AM -0500, David Miller wrote:
>> From: Geert Uytterhoeven <geert+renesas@glider.be>
>> Date: Tue,  9 Jan 2018 12:11:21 +0100
>>
>> > In case of success, the return values of (__)phy_write() and
>> > (__)phy_modify() are not compatible: (__)phy_write() returns 0, while
>> > (__)phy_modify() returns the old PHY register value.
>> >
>> > Apparently this change was catered for in drivers/net/phy/marvell.c, but
>> > not in other source files.
>> >
>> > Hence genphy_restart_aneg() now returns 4416 instead zero, which is
>> > considered an error:
>> >
>> >     ravb e6800000.ethernet eth0: failed to connect PHY
>> >     IP-Config: Failed to open eth0
>> >     IP-Config: No network devices available
>> >
>> > Fix this by converting positive values to zero in all callers of
>> > phy_modify().
>> >
>> > Fixes: fea23fb591cce995 ("net: phy: convert read-modify-write to phy_modify()")
>> > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>> > ---
>> > Alternatively, __phy_modify() could be changed to follow __phy_write()
>> > semantics?
>>
>> I really want a resolution to this quickly, this broke lots of stuff
>> for people.
>>
>> __phy_modify() wants to return multiple values, so it should be coded
>> up to do so explicitly rather than trying to encode two values from
>> overlapping value spaces in one return value.
>>
>> That means the original value should be returned by-reference.  And
>> this will make the error/no-error return value unambiguous.
>>
>> int __phy_modify(struct phy_device *phydev, u32 regnum, u16 mask, u16 set,
>>                u16 *orig_val);
>
> I'm sorry I have no time to work on this right now due to the meltdown
> and spectre stuff that hit last week.  If you need to do something,
> please revert both the mvneta series and the series containing this
> patch.

I'll have a look into it...

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Geert Uytterhoeven Jan. 11, 2018, 4 p.m. UTC | #13
On Thu, Jan 11, 2018 at 4:54 PM, Geert Uytterhoeven
<geert@linux-m68k.org> wrote:
> On Thu, Jan 11, 2018 at 4:53 PM, Russell King - ARM Linux
> <linux@armlinux.org.uk> wrote:
>> On Thu, Jan 11, 2018 at 10:48:35AM -0500, David Miller wrote:
>>> From: Geert Uytterhoeven <geert+renesas@glider.be>
>>> Date: Tue,  9 Jan 2018 12:11:21 +0100
>>>
>>> > In case of success, the return values of (__)phy_write() and
>>> > (__)phy_modify() are not compatible: (__)phy_write() returns 0, while
>>> > (__)phy_modify() returns the old PHY register value.
>>> >
>>> > Apparently this change was catered for in drivers/net/phy/marvell.c, but
>>> > not in other source files.
>>> >
>>> > Hence genphy_restart_aneg() now returns 4416 instead zero, which is
>>> > considered an error:
>>> >
>>> >     ravb e6800000.ethernet eth0: failed to connect PHY
>>> >     IP-Config: Failed to open eth0
>>> >     IP-Config: No network devices available
>>> >
>>> > Fix this by converting positive values to zero in all callers of
>>> > phy_modify().
>>> >
>>> > Fixes: fea23fb591cce995 ("net: phy: convert read-modify-write to phy_modify()")
>>> > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>>> > ---
>>> > Alternatively, __phy_modify() could be changed to follow __phy_write()
>>> > semantics?
>>>
>>> I really want a resolution to this quickly, this broke lots of stuff
>>> for people.
>>>
>>> __phy_modify() wants to return multiple values, so it should be coded
>>> up to do so explicitly rather than trying to encode two values from
>>> overlapping value spaces in one return value.
>>>
>>> That means the original value should be returned by-reference.  And
>>> this will make the error/no-error return value unambiguous.
>>>
>>> int __phy_modify(struct phy_device *phydev, u32 regnum, u16 mask, u16 set,
>>>                u16 *orig_val);
>>
>> I'm sorry I have no time to work on this right now due to the meltdown
>> and spectre stuff that hit last week.  If you need to do something,
>> please revert both the mvneta series and the series containing this
>> patch.
>
> I'll have a look into it...

Sorry, the phy_restore_page() semantics are driving me crazy.
Let's revert.

Gr{oetje,eeting}s,

                        Geert

--
Geert Uytterhoeven -- There's lots of Linux beyond ia32 -- geert@linux-m68k.org

In personal conversations with technical people, I call myself a hacker. But
when I'm talking to journalists I just say "programmer" or something like that.
                                -- Linus Torvalds
Russell King (Oracle) Jan. 11, 2018, 4:05 p.m. UTC | #14
On Thu, Jan 11, 2018 at 05:00:03PM +0100, Geert Uytterhoeven wrote:
> On Thu, Jan 11, 2018 at 4:54 PM, Geert Uytterhoeven
> <geert@linux-m68k.org> wrote:
> > On Thu, Jan 11, 2018 at 4:53 PM, Russell King - ARM Linux
> > <linux@armlinux.org.uk> wrote:
> >> On Thu, Jan 11, 2018 at 10:48:35AM -0500, David Miller wrote:
> >>> From: Geert Uytterhoeven <geert+renesas@glider.be>
> >>> Date: Tue,  9 Jan 2018 12:11:21 +0100
> >>>
> >>> > In case of success, the return values of (__)phy_write() and
> >>> > (__)phy_modify() are not compatible: (__)phy_write() returns 0, while
> >>> > (__)phy_modify() returns the old PHY register value.
> >>> >
> >>> > Apparently this change was catered for in drivers/net/phy/marvell.c, but
> >>> > not in other source files.
> >>> >
> >>> > Hence genphy_restart_aneg() now returns 4416 instead zero, which is
> >>> > considered an error:
> >>> >
> >>> >     ravb e6800000.ethernet eth0: failed to connect PHY
> >>> >     IP-Config: Failed to open eth0
> >>> >     IP-Config: No network devices available
> >>> >
> >>> > Fix this by converting positive values to zero in all callers of
> >>> > phy_modify().
> >>> >
> >>> > Fixes: fea23fb591cce995 ("net: phy: convert read-modify-write to phy_modify()")
> >>> > Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
> >>> > ---
> >>> > Alternatively, __phy_modify() could be changed to follow __phy_write()
> >>> > semantics?
> >>>
> >>> I really want a resolution to this quickly, this broke lots of stuff
> >>> for people.
> >>>
> >>> __phy_modify() wants to return multiple values, so it should be coded
> >>> up to do so explicitly rather than trying to encode two values from
> >>> overlapping value spaces in one return value.
> >>>
> >>> That means the original value should be returned by-reference.  And
> >>> this will make the error/no-error return value unambiguous.
> >>>
> >>> int __phy_modify(struct phy_device *phydev, u32 regnum, u16 mask, u16 set,
> >>>                u16 *orig_val);
> >>
> >> I'm sorry I have no time to work on this right now due to the meltdown
> >> and spectre stuff that hit last week.  If you need to do something,
> >> please revert both the mvneta series and the series containing this
> >> patch.
> >
> > I'll have a look into it...
> 
> Sorry, the phy_restore_page() semantics are driving me crazy.
> Let's revert.

I don't have any other solution for the phy_restore_page() stuff other
than open coding those exact same semantics _everywhere_ it's used.

Unfortunately, not having those race fixes in place blocks converting
any network driver to use phylink and SFP.  If people find
phy_restore_page() distasteful, I have no way to progress phylink and
SFP any further.

I guess phylink and SFP stuff will also need to be reverted unless
someone can find a solution to this.
Andrew Lunn Jan. 11, 2018, 5:04 p.m. UTC | #15
> Sorry, the phy_restore_page() semantics are driving me crazy.
> Let's revert.

I will try to take a look today.

  Andrew
Florian Fainelli Jan. 11, 2018, 8:28 p.m. UTC | #16
On 01/11/2018 07:48 AM, David Miller wrote:
> From: Geert Uytterhoeven <geert+renesas@glider.be>
> Date: Tue,  9 Jan 2018 12:11:21 +0100
> 
>> In case of success, the return values of (__)phy_write() and
>> (__)phy_modify() are not compatible: (__)phy_write() returns 0, while
>> (__)phy_modify() returns the old PHY register value.
>>
>> Apparently this change was catered for in drivers/net/phy/marvell.c, but
>> not in other source files.
>>
>> Hence genphy_restart_aneg() now returns 4416 instead zero, which is
>> considered an error:
>>
>>     ravb e6800000.ethernet eth0: failed to connect PHY
>>     IP-Config: Failed to open eth0
>>     IP-Config: No network devices available
>>
>> Fix this by converting positive values to zero in all callers of
>> phy_modify().
>>
>> Fixes: fea23fb591cce995 ("net: phy: convert read-modify-write to phy_modify()")
>> Signed-off-by: Geert Uytterhoeven <geert+renesas@glider.be>
>> ---
>> Alternatively, __phy_modify() could be changed to follow __phy_write()
>> semantics?
> 
> I really want a resolution to this quickly, this broke lots of stuff
> for people.
> 
> __phy_modify() wants to return multiple values, so it should be coded
> up to do so explicitly rather than trying to encode two values from
> overlapping value spaces in one return value.
> 
> That means the original value should be returned by-reference.  And
> this will make the error/no-error return value unambiguous.
> 
> int __phy_modify(struct phy_device *phydev, u32 regnum, u16 mask, u16 set,
> 		 u16 *orig_val);

I am fine with that approach, there should only be a handful of
locations where we care about the old value that __phy_modify() returns
so we should be able to wrap these accessors in a way that is not
disruptive and requires less code auditing that the patch currently
submitted.

Thanks!
diff mbox

Patch

diff --git a/drivers/net/phy/at803x.c b/drivers/net/phy/at803x.c
index 411cf1072bae5796..6b6b9cef517f1bc3 100644
--- a/drivers/net/phy/at803x.c
+++ b/drivers/net/phy/at803x.c
@@ -230,7 +230,9 @@  static int at803x_suspend(struct phy_device *phydev)
 
 static int at803x_resume(struct phy_device *phydev)
 {
-	return phy_modify(phydev, MII_BMCR, BMCR_PDOWN | BMCR_ISOLATE, 0);
+	int ret = phy_modify(phydev, MII_BMCR, BMCR_PDOWN | BMCR_ISOLATE, 0);
+
+	return ret < 0 ? ret : 0;
 }
 
 static int at803x_probe(struct phy_device *phydev)
diff --git a/drivers/net/phy/phy_device.c b/drivers/net/phy/phy_device.c
index 6bd11a070ec8147b..a132e845e4aec3d5 100644
--- a/drivers/net/phy/phy_device.c
+++ b/drivers/net/phy/phy_device.c
@@ -1369,6 +1369,7 @@  static int genphy_config_eee_advert(struct phy_device *phydev)
 int genphy_setup_forced(struct phy_device *phydev)
 {
 	u16 ctl = 0;
+	int ret;
 
 	phydev->pause = 0;
 	phydev->asym_pause = 0;
@@ -1381,8 +1382,12 @@  int genphy_setup_forced(struct phy_device *phydev)
 	if (DUPLEX_FULL == phydev->duplex)
 		ctl |= BMCR_FULLDPLX;
 
-	return phy_modify(phydev, MII_BMCR,
-			  BMCR_LOOPBACK | BMCR_ISOLATE | BMCR_PDOWN, ctl);
+	ret = phy_modify(phydev, MII_BMCR,
+			 BMCR_LOOPBACK | BMCR_ISOLATE | BMCR_PDOWN, ctl);
+	if (ret < 0)
+		return ret;
+
+	return 0;
 }
 EXPORT_SYMBOL(genphy_setup_forced);
 
@@ -1393,8 +1398,10 @@  EXPORT_SYMBOL(genphy_setup_forced);
 int genphy_restart_aneg(struct phy_device *phydev)
 {
 	/* Don't isolate the PHY if we're negotiating */
-	return phy_modify(phydev, MII_BMCR, BMCR_ISOLATE,
-			  BMCR_ANENABLE | BMCR_ANRESTART);
+	int ret = phy_modify(phydev, MII_BMCR, BMCR_ISOLATE,
+			     BMCR_ANENABLE | BMCR_ANRESTART);
+
+	return ret < 0 ? ret : 0;
 }
 EXPORT_SYMBOL(genphy_restart_aneg);
 
@@ -1660,20 +1667,26 @@  EXPORT_SYMBOL(genphy_config_init);
 
 int genphy_suspend(struct phy_device *phydev)
 {
-	return phy_modify(phydev, MII_BMCR, 0, BMCR_PDOWN);
+	int ret = phy_modify(phydev, MII_BMCR, 0, BMCR_PDOWN);
+
+	return ret < 0 ? ret : 0;
 }
 EXPORT_SYMBOL(genphy_suspend);
 
 int genphy_resume(struct phy_device *phydev)
 {
-	return phy_modify(phydev, MII_BMCR, BMCR_PDOWN, 0);
+	int ret = phy_modify(phydev, MII_BMCR, BMCR_PDOWN, 0);
+
+	return ret < 0 ? ret : 0;
 }
 EXPORT_SYMBOL(genphy_resume);
 
 int genphy_loopback(struct phy_device *phydev, bool enable)
 {
-	return phy_modify(phydev, MII_BMCR, BMCR_LOOPBACK,
-			  enable ? BMCR_LOOPBACK : 0);
+	int ret = phy_modify(phydev, MII_BMCR, BMCR_LOOPBACK,
+			     enable ? BMCR_LOOPBACK : 0);
+
+	return ret < 0 ? ret : 0;
 }
 EXPORT_SYMBOL(genphy_loopback);