Message ID | E1syBPE-006Unh-TL@rmk-PC.armlinux.org.uk (mailing list archive) |
---|---|
State | Superseded |
Delegated to: | Netdev Maintainers |
Headers | show |
Series | Removing more phylink cruft | expand |
On Tue, Oct 08, 2024 at 03:41:44PM +0100, Russell King (Oracle) wrote: > With DSA's implementation of the mac_select_pcs() method removed, we > can now remove the detection of mac_select_pcs() implementation. > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > --- > drivers/net/phy/phylink.c | 14 +++----------- > 1 file changed, 3 insertions(+), 11 deletions(-) > > diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c > index 4309317de3d1..8f86599d3d78 100644 > --- a/drivers/net/phy/phylink.c > +++ b/drivers/net/phy/phylink.c > @@ -79,7 +79,6 @@ struct phylink { > unsigned int pcs_state; > > bool mac_link_dropped; > - bool using_mac_select_pcs; > > struct sfp_bus *sfp_bus; > bool sfp_may_have_phy; > @@ -661,12 +660,12 @@ static int phylink_validate_mac_and_pcs(struct phylink *pl, > int ret; > > /* Get the PCS for this interface mode */ > - if (pl->using_mac_select_pcs) { > + if (pl->mac_ops->mac_select_pcs) { > pcs = pl->mac_ops->mac_select_pcs(pl->config, state->interface); > if (IS_ERR(pcs)) > return PTR_ERR(pcs); > } else { > - pcs = pl->pcs; > + pcs = NULL; The assignment from the "else" branch could have been folded into the variable initialization. Also, maybe a word in the commit message would be good about why the "pcs = pl->pcs" line became "pcs = NULL". I get the impression that these are 2 logical changes in one patch. This second aspect I'm highlighting seems to be cleaning up the last remnants of phylink_set_pcs(). Since all phylink users have been converted to mac_select_pcs(), there's no other possible value for "pl->pcs" than NULL if "using_mac_select_pcs" is true. I'm not 100% sure that this is the case, but cross-checking with the git history, it seems to be the case. Commit 1054457006d4 ("net: phy: phylink: fix DSA mac_select_pcs() introduction") was merged on Feb 21,2022, and commit a5081bad2eac ("net: phylink: remove phylink_set_pcs()") on Feb 26. So it seems plausible that this fixup could have been made as soon as Feb 26, 2022. Please confirm. > } > > if (pcs) {
On Wed, Oct 09, 2024 at 03:29:38PM +0300, Vladimir Oltean wrote: > there's no other possible value for "pl->pcs" than NULL if > "using_mac_select_pcs" is true. I mean if it is false, sorry.
On 10/9/24 14:29, Vladimir Oltean wrote: > On Tue, Oct 08, 2024 at 03:41:44PM +0100, Russell King (Oracle) wrote: >> With DSA's implementation of the mac_select_pcs() method removed, we >> can now remove the detection of mac_select_pcs() implementation. >> >> Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> >> --- >> drivers/net/phy/phylink.c | 14 +++----------- >> 1 file changed, 3 insertions(+), 11 deletions(-) >> >> diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c >> index 4309317de3d1..8f86599d3d78 100644 >> --- a/drivers/net/phy/phylink.c >> +++ b/drivers/net/phy/phylink.c >> @@ -79,7 +79,6 @@ struct phylink { >> unsigned int pcs_state; >> >> bool mac_link_dropped; >> - bool using_mac_select_pcs; >> >> struct sfp_bus *sfp_bus; >> bool sfp_may_have_phy; >> @@ -661,12 +660,12 @@ static int phylink_validate_mac_and_pcs(struct phylink *pl, >> int ret; >> >> /* Get the PCS for this interface mode */ >> - if (pl->using_mac_select_pcs) { >> + if (pl->mac_ops->mac_select_pcs) { >> pcs = pl->mac_ops->mac_select_pcs(pl->config, state->interface); >> if (IS_ERR(pcs)) >> return PTR_ERR(pcs); >> } else { >> - pcs = pl->pcs; >> + pcs = NULL; > > The assignment from the "else" branch could have been folded into the > variable initialization. > > Also, maybe a word in the commit message would be good about why the > "pcs = pl->pcs" line became "pcs = NULL". I get the impression that > these are 2 logical changes in one patch. This second aspect I'm > highlighting seems to be cleaning up the last remnants of phylink_set_pcs(). > Since all phylink users have been converted to mac_select_pcs(), there's > no other possible value for "pl->pcs" than NULL if "using_mac_select_pcs" > is true. > > I'm not 100% sure that this is the case, but cross-checking with the git > history, it seems to be the case. Commit 1054457006d4 ("net: phy: > phylink: fix DSA mac_select_pcs() introduction") was merged on Feb 21,2022, > and commit a5081bad2eac ("net: phylink: remove phylink_set_pcs()") on > Feb 26. So it seems plausible that this fixup could have been made as > soon as Feb 26, 2022. Please confirm. I agree with Vladimir, I think at least expanding the reasoning in the commit message would be useful. Thanks, Paolo
On Wed, Oct 09, 2024 at 03:29:38PM +0300, Vladimir Oltean wrote: > On Tue, Oct 08, 2024 at 03:41:44PM +0100, Russell King (Oracle) wrote: > > With DSA's implementation of the mac_select_pcs() method removed, we > > can now remove the detection of mac_select_pcs() implementation. > > > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > > --- > > drivers/net/phy/phylink.c | 14 +++----------- > > 1 file changed, 3 insertions(+), 11 deletions(-) > > > > diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c > > index 4309317de3d1..8f86599d3d78 100644 > > --- a/drivers/net/phy/phylink.c > > +++ b/drivers/net/phy/phylink.c > > @@ -79,7 +79,6 @@ struct phylink { > > unsigned int pcs_state; > > > > bool mac_link_dropped; > > - bool using_mac_select_pcs; > > > > struct sfp_bus *sfp_bus; > > bool sfp_may_have_phy; > > @@ -661,12 +660,12 @@ static int phylink_validate_mac_and_pcs(struct phylink *pl, > > int ret; > > > > /* Get the PCS for this interface mode */ > > - if (pl->using_mac_select_pcs) { > > + if (pl->mac_ops->mac_select_pcs) { > > pcs = pl->mac_ops->mac_select_pcs(pl->config, state->interface); > > if (IS_ERR(pcs)) > > return PTR_ERR(pcs); > > } else { > > - pcs = pl->pcs; > > + pcs = NULL; > > The assignment from the "else" branch could have been folded into the > variable initialization. > > Also, maybe a word in the commit message would be good about why the > "pcs = pl->pcs" line became "pcs = NULL". I get the impression that > these are 2 logical changes in one patch. This second aspect I'm > highlighting seems to be cleaning up the last remnants of phylink_set_pcs(). > Since all phylink users have been converted to mac_select_pcs(), there's > no other possible value for "pl->pcs" than NULL if "using_mac_select_pcs" > is true. Hmm. Looking at this again, we're getting into quite a mess because of one of your previous review comments from a number of years back. You stated that you didn't see the need to support a transition from having-a-PCS to having-no-PCS. I don't have a link to that discussion. However, it is why we've ended up with phylink_major_config() having the extra complexity here, effectively preventing mac_select_pcs() from being able to remove a PCS that was previously added: pcs_changed = pcs && pl->pcs != pcs; because if mac_select_pcs() returns NULL, it was decided that any in-use PCS would not be removed. It seems (at least to me) to be a silly decision now. However, if mac_select_pcs() in phylink_major_config() returns NULL, we don't do any validation of the PCS. So this, today, before these patches, is already an inconsistent mess. To fix this, I think: struct phylink_pcs *pcs = NULL; ... if (pl->mac_ops->mac_select_pcs) { pcs = pl->mac_ops->mac_select_pcs(pl->config, state->interface); if (IS_ERR(pcs)) return PTR_ERR(pcs); } if (!pcs) pcs = pl->pcs; is needed to give consistent behaviour. Alternatively, we could allow mac_select_pcs() to return NULL, which would then allow the PCS to be removed. Let me know if you've changed your mind on what behaviour we should have, because this affects what I do to sort this out. However, it is true that if mac_select_pcs() is NULL, since commit a5081bad2eac ("net: phylink: remove phylink_set_pcs()") there is no way pl->pcs can be non-NULL, since there is no other way for it to be set.
On Thu, Oct 10, 2024 at 12:21:43PM +0100, Russell King (Oracle) wrote: > On Wed, Oct 09, 2024 at 03:29:38PM +0300, Vladimir Oltean wrote: > > On Tue, Oct 08, 2024 at 03:41:44PM +0100, Russell King (Oracle) wrote: > > > With DSA's implementation of the mac_select_pcs() method removed, we > > > can now remove the detection of mac_select_pcs() implementation. > > > > > > Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> > > > --- > > > drivers/net/phy/phylink.c | 14 +++----------- > > > 1 file changed, 3 insertions(+), 11 deletions(-) > > > > > > diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c > > > index 4309317de3d1..8f86599d3d78 100644 > > > --- a/drivers/net/phy/phylink.c > > > +++ b/drivers/net/phy/phylink.c > > > @@ -79,7 +79,6 @@ struct phylink { > > > unsigned int pcs_state; > > > > > > bool mac_link_dropped; > > > - bool using_mac_select_pcs; > > > > > > struct sfp_bus *sfp_bus; > > > bool sfp_may_have_phy; > > > @@ -661,12 +660,12 @@ static int phylink_validate_mac_and_pcs(struct phylink *pl, > > > int ret; > > > > > > /* Get the PCS for this interface mode */ > > > - if (pl->using_mac_select_pcs) { > > > + if (pl->mac_ops->mac_select_pcs) { > > > pcs = pl->mac_ops->mac_select_pcs(pl->config, state->interface); > > > if (IS_ERR(pcs)) > > > return PTR_ERR(pcs); > > > } else { > > > - pcs = pl->pcs; > > > + pcs = NULL; > > > > The assignment from the "else" branch could have been folded into the > > variable initialization. > > > > Also, maybe a word in the commit message would be good about why the > > "pcs = pl->pcs" line became "pcs = NULL". I get the impression that > > these are 2 logical changes in one patch. This second aspect I'm > > highlighting seems to be cleaning up the last remnants of phylink_set_pcs(). > > Since all phylink users have been converted to mac_select_pcs(), there's > > no other possible value for "pl->pcs" than NULL if "using_mac_select_pcs" > > is true. > > Hmm. Looking at this again, we're getting into quite a mess because of > one of your previous review comments from a number of years back. > > You stated that you didn't see the need to support a transition from > having-a-PCS to having-no-PCS. I don't have a link to that discussion. > However, it is why we've ended up with phylink_major_config() having > the extra complexity here, effectively preventing mac_select_pcs() > from being able to remove a PCS that was previously added: > > pcs_changed = pcs && pl->pcs != pcs; > > because if mac_select_pcs() returns NULL, it was decided that any > in-use PCS would not be removed. It seems (at least to me) to be a > silly decision now. > > However, if mac_select_pcs() in phylink_major_config() returns NULL, > we don't do any validation of the PCS. > > So this, today, before these patches, is already an inconsistent mess. > > To fix this, I think: > > struct phylink_pcs *pcs = NULL; > ... > if (pl->mac_ops->mac_select_pcs) { > pcs = pl->mac_ops->mac_select_pcs(pl->config, state->interface); > if (IS_ERR(pcs)) > return PTR_ERR(pcs); > } > > if (!pcs) > pcs = pl->pcs; > > is needed to give consistent behaviour. > > Alternatively, we could allow mac_select_pcs() to return NULL, which > would then allow the PCS to be removed. > > Let me know if you've changed your mind on what behaviour we should > have, because this affects what I do to sort this out. Here's a link to the original discussion from November 2021: https://lore.kernel.org/all/E1mpSba-00BXp6-9e@rmk-PC.armlinux.org.uk/ Google uselessly refused to find it, so I searched my own mailboxes to find the message ID.
On Thu, Oct 10, 2024 at 02:00:32PM +0100, Russell King (Oracle) wrote: > On Thu, Oct 10, 2024 at 12:21:43PM +0100, Russell King (Oracle) wrote: > > Hmm. Looking at this again, we're getting into quite a mess because of > > one of your previous review comments from a number of years back. > > > > You stated that you didn't see the need to support a transition from > > having-a-PCS to having-no-PCS. I don't have a link to that discussion. > > However, it is why we've ended up with phylink_major_config() having > > the extra complexity here, effectively preventing mac_select_pcs() > > from being able to remove a PCS that was previously added: > > > > pcs_changed = pcs && pl->pcs != pcs; > > > > because if mac_select_pcs() returns NULL, it was decided that any > > in-use PCS would not be removed. It seems (at least to me) to be a > > silly decision now. > > > > However, if mac_select_pcs() in phylink_major_config() returns NULL, > > we don't do any validation of the PCS. > > > > So this, today, before these patches, is already an inconsistent mess. > > > > To fix this, I think: > > > > struct phylink_pcs *pcs = NULL; > > ... > > if (pl->mac_ops->mac_select_pcs) { > > pcs = pl->mac_ops->mac_select_pcs(pl->config, state->interface); > > if (IS_ERR(pcs)) > > return PTR_ERR(pcs); > > } > > > > if (!pcs) > > pcs = pl->pcs; > > > > is needed to give consistent behaviour. > > > > Alternatively, we could allow mac_select_pcs() to return NULL, which > > would then allow the PCS to be removed. > > > > Let me know if you've changed your mind on what behaviour we should > > have, because this affects what I do to sort this out. > > Here's a link to the original discussion from November 2021: > > https://lore.kernel.org/all/E1mpSba-00BXp6-9e@rmk-PC.armlinux.org.uk/ > > Google uselessly refused to find it, so I searched my own mailboxes > to find the message ID. Important note: I cannot find any discussion on any mailing list which fills the gap between me asking what is the real world applicability of mac_select_pcs() returning NULL after it has returned non-NULL, and the current phylink behavior, as described above by you. That behavior was first posted here: https://lore.kernel.org/netdev/Ybiue1TPCwsdHmV4@shell.armlinux.org.uk/ in patches 1/7 and 2/7. I did not state that phylink should keep the old PCS around, and I do not take responsibility for that. Keeping in mind that I don't know whether anything has changed since 2021 which would make this condition any less theoretical than it was back then, I guess if I were maintaining the code involved, I'd choose between 2 options (whichever is easiest): - Imagine a purely theoretical scenario where phylink transitions between a state->interface requiring a phylink_pcs, and one not requiring a phylink_pcs. I'm not even saying a serial PCS hardware block isn't present, just that it isn't modeled as a phylink_pcs (for reasons which may be valid or not). Probably the most logical thing to do in this scenario is allow the old phylink_pcs to be removed, and its ops never to be used for the new state->interface. - Validate, possibly at phylink_validate_phy() time, that for all phy->possible_interfaces, mac_select_pcs() either returns NULL for all of them, or non-NULL for all of them. The idea would be to leave room for the use case to define itself (and the restriction to be lifted whenever necessary), instead of giving a predefined behavior for the transition when in reality we have no idea of the use case behind it. I don't know whether checking phy->possible_interfaces would be sufficient in ensuring that such a transition cannot occur. I find no contradiction between my replies (mostly questions, actually) in Nov 2021 and my current agreement that phylink's behavior of keeping the old PCS and using it for the new state->interface doesn't make much sense.
On Fri, Oct 11, 2024 at 01:39:12PM +0300, Vladimir Oltean wrote: > On Thu, Oct 10, 2024 at 02:00:32PM +0100, Russell King (Oracle) wrote: > > On Thu, Oct 10, 2024 at 12:21:43PM +0100, Russell King (Oracle) wrote: > > > Hmm. Looking at this again, we're getting into quite a mess because of > > > one of your previous review comments from a number of years back. > > > > > > You stated that you didn't see the need to support a transition from > > > having-a-PCS to having-no-PCS. I don't have a link to that discussion. > > > However, it is why we've ended up with phylink_major_config() having > > > the extra complexity here, effectively preventing mac_select_pcs() > > > from being able to remove a PCS that was previously added: > > > > > > pcs_changed = pcs && pl->pcs != pcs; > > > > > > because if mac_select_pcs() returns NULL, it was decided that any > > > in-use PCS would not be removed. It seems (at least to me) to be a > > > silly decision now. > > > > > > However, if mac_select_pcs() in phylink_major_config() returns NULL, > > > we don't do any validation of the PCS. > > > > > > So this, today, before these patches, is already an inconsistent mess. > > > > > > To fix this, I think: > > > > > > struct phylink_pcs *pcs = NULL; > > > ... > > > if (pl->mac_ops->mac_select_pcs) { > > > pcs = pl->mac_ops->mac_select_pcs(pl->config, state->interface); > > > if (IS_ERR(pcs)) > > > return PTR_ERR(pcs); > > > } > > > > > > if (!pcs) > > > pcs = pl->pcs; > > > > > > is needed to give consistent behaviour. > > > > > > Alternatively, we could allow mac_select_pcs() to return NULL, which > > > would then allow the PCS to be removed. > > > > > > Let me know if you've changed your mind on what behaviour we should > > > have, because this affects what I do to sort this out. > > > > Here's a link to the original discussion from November 2021: > > > > https://lore.kernel.org/all/E1mpSba-00BXp6-9e@rmk-PC.armlinux.org.uk/ > > > > Google uselessly refused to find it, so I searched my own mailboxes > > to find the message ID. > > Important note: I cannot find any discussion on any mailing list which > fills the gap between me asking what is the real world applicability of > mac_select_pcs() returning NULL after it has returned non-NULL, and the > current phylink behavior, as described above by you. That behavior was > first posted here: > https://lore.kernel.org/netdev/Ybiue1TPCwsdHmV4@shell.armlinux.org.uk/ > in patches 1/7 and 2/7. I did not state that phylink should keep the old > PCS around, and I do not take responsibility for that. I wanted to add support for phylink_set_pcs() to remove the current PCS and submitted a patch for it. You didn't see a use case and objected to the patch, which wasn't merged. I've kept that behaviour ever since on the grounds of your objection - as per the link that I posted. This has been carried forward into the mac_select_pcs() implementation where it explicitly does not allow a PCS to be removed. Pointing to the introduction of mac_select_pcs() is later than your objection. Let me restate it. As a *direct* result of your comments on patch 8/8 which starts here: https://lore.kernel.org/netdev/E1mpSba-00BXp6-9e@rmk-PC.armlinux.org.uk/ I took your comments as meaning that you saw no reason why we should allow a PCS to ever be removed. phylink_set_pcs() needed to be modified to allow that to happen. Given your objection, that behaviour has been carried forward by having explicit additional code to ensure that a PCS can't be removed from phylink without replacing it with a different PCS. In other words, mac_select_pcs() returning NULL when it has previously returned a PCS does *nothing* to remove the previous PCS. Maybe this was not your intention when reviewing patch 8/8, but that's how your comments came over, and lead me to the conclusion that we need to enforce that - and that is enforced by: pcs_changed = pcs && pl->pcs != pcs; so pcs_change will always be false if pcs == NULL, thus preventing the replacement of the pcs: if (pcs_changed) { phylink_pcs_disable(pl->pcs); if (pl->pcs) pl->pcs->phylink = NULL; pcs->phylink = pl; pl->pcs = pcs; } I wouldn't have coded it this way had you not objected to patch 8/8 which lead me to believe we should not allow a PCS to be removed. Review comments do have implications for future patches... maybe it wasn't want you intended, but this is a great example of cause and (possibly unintended) effect.
On Fri, Oct 11, 2024 at 11:58:07AM +0100, Russell King (Oracle) wrote: > On Fri, Oct 11, 2024 at 01:39:12PM +0300, Vladimir Oltean wrote: > > On Thu, Oct 10, 2024 at 02:00:32PM +0100, Russell King (Oracle) wrote: > > > On Thu, Oct 10, 2024 at 12:21:43PM +0100, Russell King (Oracle) wrote: > > > > Hmm. Looking at this again, we're getting into quite a mess because of > > > > one of your previous review comments from a number of years back. > > > > > > > > You stated that you didn't see the need to support a transition from > > > > having-a-PCS to having-no-PCS. I don't have a link to that discussion. > > > > However, it is why we've ended up with phylink_major_config() having > > > > the extra complexity here, effectively preventing mac_select_pcs() > > > > from being able to remove a PCS that was previously added: > > > > > > > > pcs_changed = pcs && pl->pcs != pcs; > > > > > > > > because if mac_select_pcs() returns NULL, it was decided that any > > > > in-use PCS would not be removed. It seems (at least to me) to be a > > > > silly decision now. > > > > > > > > However, if mac_select_pcs() in phylink_major_config() returns NULL, > > > > we don't do any validation of the PCS. > > > > > > > > So this, today, before these patches, is already an inconsistent mess. > > > > > > > > To fix this, I think: > > > > > > > > struct phylink_pcs *pcs = NULL; > > > > ... > > > > if (pl->mac_ops->mac_select_pcs) { > > > > pcs = pl->mac_ops->mac_select_pcs(pl->config, state->interface); > > > > if (IS_ERR(pcs)) > > > > return PTR_ERR(pcs); > > > > } > > > > > > > > if (!pcs) > > > > pcs = pl->pcs; > > > > > > > > is needed to give consistent behaviour. > > > > > > > > Alternatively, we could allow mac_select_pcs() to return NULL, which > > > > would then allow the PCS to be removed. > > > > > > > > Let me know if you've changed your mind on what behaviour we should > > > > have, because this affects what I do to sort this out. > > > > > > Here's a link to the original discussion from November 2021: > > > > > > https://lore.kernel.org/all/E1mpSba-00BXp6-9e@rmk-PC.armlinux.org.uk/ > > > > > > Google uselessly refused to find it, so I searched my own mailboxes > > > to find the message ID. > > > > Important note: I cannot find any discussion on any mailing list which > > fills the gap between me asking what is the real world applicability of > > mac_select_pcs() returning NULL after it has returned non-NULL, and the > > current phylink behavior, as described above by you. That behavior was > > first posted here: > > https://lore.kernel.org/netdev/Ybiue1TPCwsdHmV4@shell.armlinux.org.uk/ > > in patches 1/7 and 2/7. I did not state that phylink should keep the old > > PCS around, and I do not take responsibility for that. > > I wanted to add support for phylink_set_pcs() to remove the current > PCS and submitted a patch for it. You didn't see a use case and objected > to the patch, which wasn't merged. It was an RFC, it wasn't a candidate for merging anyway. > I've kept that behaviour ever since on the grounds of your objection - > as per the link that I posted. This has been carried forward into the > mac_select_pcs() implementation where it explicitly does not allow a > PCS to be removed. Pointing to the introduction of mac_select_pcs() is > later than your objection. This does not invalidate my previous point in any way. The phylink_set_pcs() API at that time implied a voluntary call from the driver. "pcs" was not allowed to be NULL, and your patch was the one introducing phylink_set_pcs(NULL) as a valid call. That's what I objected to as not seeing the purpose of. Whereas the new mac_select_pcs() has "return NULL" already baked into it from day one (the one-to-one equivalent of it being: don't call phylink_set_pcs()). It becomes inevitable, in this new model, to handle somehow the "what if" scenario of returning NULL after non-NULL, whereas that was not necessary before. It's a different API, which automatically implies a new set of rules. My point is that it was impossible for me to consciously contribute to the definition of the mac_select_pcs() API rules. Saying that the introduction of mac_select_pcs() came later than my comment proves exactly that. I couldn't have possibly known that my review comment will be used for a different API than phylink_set_pcs(), because the new API hadn't been posted. Whereas what you are saying is that the mac_select_pcs() posting took place after my comment => of course you took my comment into consideration. You seem to be omitting that I did not have all information. > Let me restate it. As a *direct* result of your comments on patch 8/8 > which starts here: > https://lore.kernel.org/netdev/E1mpSba-00BXp6-9e@rmk-PC.armlinux.org.uk/ > I took your comments as meaning that you saw no reason why we should > allow a PCS to ever be removed. I still stand by that statement, in that context. You took it out of context. > phylink_set_pcs() needed to be modified to allow that to happen. Given > your objection, that behaviour has been carried forward by having > explicit additional code to ensure that a PCS can't be removed from > phylink without replacing it with a different PCS. In other words, > mac_select_pcs() returning NULL when it has previously returned a PCS > does *nothing* to remove the previous PCS. The carrying over of old behavior from one API to another API is merely one of the design choices that can be made, and far from the only option. In general, you introduce new API exactly _because_ you want to change the behavior in some conditions. > Maybe this was not your intention when reviewing patch 8/8, but that's > how your comments came over, and lead me to the conclusion that we > need to enforce that - and that is enforced by: > > pcs_changed = pcs && pl->pcs != pcs; > > so pcs_change will always be false if pcs == NULL, thus preventing the > replacement of the pcs: > > if (pcs_changed) { > phylink_pcs_disable(pl->pcs); > > if (pl->pcs) > pl->pcs->phylink = NULL; > > pcs->phylink = pl; > > pl->pcs = pcs; > } > > I wouldn't have coded it this way had you not objected to patch 8/8 > which lead me to believe we should not allow a PCS to be removed. > > Review comments do have implications for future patches... maybe it > wasn't want you intended, but this is a great example of cause and > (possibly unintended) effect. This restatement was not necessary, as I believe I got the point from your previous message. I still fundamentally reject any responsibility you wish to attribute to me here. For one, my review feedback was in a different context. But let's even assume it was directly related, and I knew mac_select_pcs() would come. If I were maintaining a piece of code and received some review feedback, I would not translate it into code that carries exclusively _my_ sign-off, unless _I_ agree with it and can justify it. I would not seek to transfer responsibility to somebody else. In fact, if I were to be held accountable for patches signed off by others, I wouldn't even be reviewing anything. So I think it's a very fair collaboration rule, and I'm only asking you to apply it to me as well.
On Fri, Oct 11, 2024 at 03:54:21PM +0300, Vladimir Oltean wrote: > On Fri, Oct 11, 2024 at 11:58:07AM +0100, Russell King (Oracle) wrote: > > On Fri, Oct 11, 2024 at 01:39:12PM +0300, Vladimir Oltean wrote: > > > On Thu, Oct 10, 2024 at 02:00:32PM +0100, Russell King (Oracle) wrote: > > > > On Thu, Oct 10, 2024 at 12:21:43PM +0100, Russell King (Oracle) wrote: > > > > > Hmm. Looking at this again, we're getting into quite a mess because of > > > > > one of your previous review comments from a number of years back. > > > > > > > > > > You stated that you didn't see the need to support a transition from > > > > > having-a-PCS to having-no-PCS. I don't have a link to that discussion. > > > > > However, it is why we've ended up with phylink_major_config() having > > > > > the extra complexity here, effectively preventing mac_select_pcs() > > > > > from being able to remove a PCS that was previously added: > > > > > > > > > > pcs_changed = pcs && pl->pcs != pcs; > > > > > > > > > > because if mac_select_pcs() returns NULL, it was decided that any > > > > > in-use PCS would not be removed. It seems (at least to me) to be a > > > > > silly decision now. > > > > > > > > > > However, if mac_select_pcs() in phylink_major_config() returns NULL, > > > > > we don't do any validation of the PCS. > > > > > > > > > > So this, today, before these patches, is already an inconsistent mess. > > > > > > > > > > To fix this, I think: > > > > > > > > > > struct phylink_pcs *pcs = NULL; > > > > > ... > > > > > if (pl->mac_ops->mac_select_pcs) { > > > > > pcs = pl->mac_ops->mac_select_pcs(pl->config, state->interface); > > > > > if (IS_ERR(pcs)) > > > > > return PTR_ERR(pcs); > > > > > } > > > > > > > > > > if (!pcs) > > > > > pcs = pl->pcs; > > > > > > > > > > is needed to give consistent behaviour. > > > > > > > > > > Alternatively, we could allow mac_select_pcs() to return NULL, which > > > > > would then allow the PCS to be removed. > > > > > > > > > > Let me know if you've changed your mind on what behaviour we should > > > > > have, because this affects what I do to sort this out. > > > > > > > > Here's a link to the original discussion from November 2021: > > > > > > > > https://lore.kernel.org/all/E1mpSba-00BXp6-9e@rmk-PC.armlinux.org.uk/ > > > > > > > > Google uselessly refused to find it, so I searched my own mailboxes > > > > to find the message ID. > > > > > > Important note: I cannot find any discussion on any mailing list which > > > fills the gap between me asking what is the real world applicability of > > > mac_select_pcs() returning NULL after it has returned non-NULL, and the > > > current phylink behavior, as described above by you. That behavior was > > > first posted here: > > > https://lore.kernel.org/netdev/Ybiue1TPCwsdHmV4@shell.armlinux.org.uk/ > > > in patches 1/7 and 2/7. I did not state that phylink should keep the old > > > PCS around, and I do not take responsibility for that. > > > > I wanted to add support for phylink_set_pcs() to remove the current > > PCS and submitted a patch for it. You didn't see a use case and objected > > to the patch, which wasn't merged. > > It was an RFC, it wasn't a candidate for merging anyway. What does that have to do with it???????????? An idea is put forward (the idea of allowing PCS to be removed.) It's put forward as a RFC. It gets shot down. Author then goes away believing that there is no desire to allow PCS to be removed. That idea gets carried forward into future patches. _That_ is what exactly happened. I'm not attributing blame for it, merely explaining how we got to where we are with this, and how we've ended up in the mess we have with PCS able to be used outside of its validated set. You want me to provide more explanation on the patch, but I've identified a fundamental error here caused as an effect of a previous review comment. I'm now wondering what to do about it and how to solve this in a way that won't cause us to go around another long confrontational discussion but it seems that's not possible. So, do I ignore your review comments and just do what I think is the right thing, or do I attempt to discuss it with you? I think, given _this_ debacle, I ignore you. I would much rather involve you but it seems that's a mistake.
On Fri, Oct 11, 2024 at 06:51:51PM +0100, Russell King (Oracle) wrote: > On Fri, Oct 11, 2024 at 03:54:21PM +0300, Vladimir Oltean wrote: > > On Fri, Oct 11, 2024 at 11:58:07AM +0100, Russell King (Oracle) wrote: > > > I wanted to add support for phylink_set_pcs() to remove the current > > > PCS and submitted a patch for it. You didn't see a use case and objected > > > to the patch, which wasn't merged. > > > > It was an RFC, it wasn't a candidate for merging anyway. > > What does that have to do with it???????????? > > An idea is put forward (the idea of allowing PCS to be removed.) It's > put forward as a RFC. It gets shot down. Author then goes away believing > that there is no desire to allow PCS to be removed. That idea gets > carried forward into future patches. > > _That_ is what exactly happened. I'm not attributing blame for it, > merely explaining how we got to where we are with this, and how we've > ended up in the mess we have with PCS able to be used outside of its > validated set. > > You want me to provide more explanation on the patch, but I've > identified a fundamental error here caused as an effect of a previous > review comment. > > I'm now wondering what to do about it and how to solve this in a way > that won't cause us to go around another long confrontational discussion > but it seems that's not possible. > > So, do I ignore your review comments and just do what I think is the > right thing, or do I attempt to discuss it with you? I think, given > _this_ debacle, I ignore you. I would much rather involve you but it > seems that's a mistake. My technical answer was already provided 2 replies ago: | Keeping in mind that I don't know whether anything has changed since | 2021 which would make this condition any less theoretical than it was | back then, I guess if I were maintaining the code involved, I'd choose | between 2 options (whichever is easiest): | | - Imagine a purely theoretical scenario where phylink transitions | between a state->interface requiring a phylink_pcs, and one not | requiring a phylink_pcs. I'm not even saying a serial PCS hardware | block isn't present, just that it isn't modeled as a phylink_pcs | (for reasons which may be valid or not). Probably the most logical | thing to do in this scenario is allow the old phylink_pcs to be | removed, and its ops never to be used for the new state->interface. | | - Validate, possibly at phylink_validate_phy() time, that for all | phy->possible_interfaces, mac_select_pcs() either returns NULL for | all of them, or non-NULL for all of them. The idea would be to leave | room for the use case to define itself (and the restriction to be | lifted whenever necessary), instead of giving a predefined behavior | for the transition when in reality we have no idea of the use case | behind it. I don't know whether checking phy->possible_interfaces | would be sufficient in ensuring that such a transition cannot occur. I have nothing more to add to this discussion.
diff --git a/drivers/net/phy/phylink.c b/drivers/net/phy/phylink.c index 4309317de3d1..8f86599d3d78 100644 --- a/drivers/net/phy/phylink.c +++ b/drivers/net/phy/phylink.c @@ -79,7 +79,6 @@ struct phylink { unsigned int pcs_state; bool mac_link_dropped; - bool using_mac_select_pcs; struct sfp_bus *sfp_bus; bool sfp_may_have_phy; @@ -661,12 +660,12 @@ static int phylink_validate_mac_and_pcs(struct phylink *pl, int ret; /* Get the PCS for this interface mode */ - if (pl->using_mac_select_pcs) { + if (pl->mac_ops->mac_select_pcs) { pcs = pl->mac_ops->mac_select_pcs(pl->config, state->interface); if (IS_ERR(pcs)) return PTR_ERR(pcs); } else { - pcs = pl->pcs; + pcs = NULL; } if (pcs) { @@ -1182,7 +1181,7 @@ static void phylink_major_config(struct phylink *pl, bool restart, state->interface, state->advertising); - if (pl->using_mac_select_pcs) { + if (pl->mac_ops->mac_select_pcs) { pcs = pl->mac_ops->mac_select_pcs(pl->config, state->interface); if (IS_ERR(pcs)) { phylink_err(pl, @@ -1698,7 +1697,6 @@ struct phylink *phylink_create(struct phylink_config *config, phy_interface_t iface, const struct phylink_mac_ops *mac_ops) { - bool using_mac_select_pcs = false; struct phylink *pl; int ret; @@ -1709,11 +1707,6 @@ struct phylink *phylink_create(struct phylink_config *config, return ERR_PTR(-EINVAL); } - if (mac_ops->mac_select_pcs && - mac_ops->mac_select_pcs(config, PHY_INTERFACE_MODE_NA) != - ERR_PTR(-EOPNOTSUPP)) - using_mac_select_pcs = true; - pl = kzalloc(sizeof(*pl), GFP_KERNEL); if (!pl) return ERR_PTR(-ENOMEM); @@ -1732,7 +1725,6 @@ struct phylink *phylink_create(struct phylink_config *config, return ERR_PTR(-EINVAL); } - pl->using_mac_select_pcs = using_mac_select_pcs; pl->phy_state.interface = iface; pl->link_interface = iface; if (iface == PHY_INTERFACE_MODE_MOCA)
With DSA's implementation of the mac_select_pcs() method removed, we can now remove the detection of mac_select_pcs() implementation. Signed-off-by: Russell King (Oracle) <rmk+kernel@armlinux.org.uk> --- drivers/net/phy/phylink.c | 14 +++----------- 1 file changed, 3 insertions(+), 11 deletions(-)