mbox series

[v3,00/18] Add suspend to ram support for PCIe on J7200

Message ID 20240102-j7200-pcie-s2r-v3-0-5c2e4a3fac1f@bootlin.com (mailing list archive)
Headers show
Series Add suspend to ram support for PCIe on J7200 | expand

Message

Thomas Richard Feb. 15, 2024, 3:17 p.m. UTC
This add suspend to ram support for the PCIe (RC mode) on J7200 platform.

In RC mode, the reset pin for endpoints is managed by a gpio expander on a
i2c bus. This pin shall be managed in suspend_noirq() and resume_noirq().
The suspend/resume has been moved to suspend_noirq()/resume_noirq() for
pca953x (expander) and pinctrl-single.

To do i2c accesses during suspend_noirq/resume_noirq, we need to force the
wakeup of the i2c controller (which is autosuspended) during suspend
callback. 
It's the only way to wakeup the controller if it's autosuspended, as
runtime pm is disabled in suspend_noirq and resume_noirq.

The main change in this v3 is the removal of the probe boolean for the
functions wiz_clock_probe() and cdns_pcie_host_setup().
Their contents were split in multiple functions which are called in the
resume_noirq() callbacks.

Signed-off-by: Thomas Richard <thomas.richard@bootlin.com>
---
Changes in v3:
- pinctrl-single: split patch in two parts, a first patch to remove the
  dead code, a second to move suspend()/resume() callbacks to noirq.
- i2c-omap: add a comments above the suspend_noirq() callback.
- mux: now mux_chip_resume() try to restores all muxes, then return 0 if
  all is ok or the first failure.
- mmio: fix commit message.
- phy-j721e-wiz: add a patch to use dev_err_probe() instead of dev_err() in
  the wiz_clock_init() function.
- phy-j721e-wiz: remove probe boolean for the wiz_clock_init(), rename the
  function to wiz_clock_probe(), extract hardware configuration part in a
  new function wiz_clock_init().
- phy-cadence-torrent: use dev_err_probe() and fix commit messages
- pcie-cadence-host: remove probe boolean for the cdns_pcie_host_setup()
  function and extract the link setup part in a new function
  cdns_pcie_host_link_setup().
- pcie-cadence-host: make cdns_pcie_host_init() global.
- pci-j721e: use the cdns_pcie_host_link_setup() cdns_pcie_host_init()
  functions in the resume_noirq() callback.
- Link to v2: https://lore.kernel.org/r/20240102-j7200-pcie-s2r-v2-0-8e4f7d228ec2@bootlin.com

Changes in v2:
- all: fix commits messages.
- all: use DEFINE_NOIRQ_DEV_PM_OPS and pm_sleep_ptr macros.
- all: remove useless #ifdef CONFIG_PM.
- pinctrl-single: drop dead code
- mux: add mux_chip_resume() function in mux core.
- mmio: resume sequence is now a call to mux_chip_resume().
- phy-cadence-torrent: fix typo in resume sequence (reset_control_assert()
  instead of reset_control_put()).
- phy-cadence-torrent: use PHY instead of phy.
- pci-j721e: do not shadow cdns_pcie_host_setup return code in resume
  sequence.
- pci-j721e: drop dead code.
- Link to v1: https://lore.kernel.org/r/20240102-j7200-pcie-s2r-v1-0-84e55da52400@bootlin.com

---
Thomas Richard (15):
      gpio: pca953x: move suspend()/resume() to suspend_noirq()/resume_noirq()
      pinctrl: pinctrl-single: remove dead code in suspend() and resume() callbacks
      pinctrl: pinctrl-single: move suspend()/resume() callbacks to noirq
      i2c: omap: wakeup the controller during suspend() callback
      mux: add mux_chip_resume() function
      phy: ti: phy-j721e-wiz: use dev_err_probe() instead of dev_err()
      phy: ti: phy-j721e-wiz: split wiz_clock_init() function
      phy: ti: phy-j721e-wiz: add resume support
      phy: cadence-torrent: extract calls to clk_get from cdns_torrent_clk
      phy: cadence-torrent: register resets even if the phy is already configured
      phy: cadence-torrent: add already_configured to struct cdns_torrent_phy
      phy: cadence-torrent: remove noop_ops phy operations
      phy: cadence-torrent: add suspend and resume support
      PCI: cadence: extract link setup sequence from cdns_pcie_host_setup()
      PCI: cadence: set cdns_pcie_host_init() global

Théo Lebrun (3):
      mux: mmio: add resume support
      PCI: j721e: add reset GPIO to struct j721e_pcie
      PCI: j721e: add suspend and resume support

 drivers/gpio/gpio-pca953x.c                        |   7 +-
 drivers/i2c/busses/i2c-omap.c                      |  22 ++++
 drivers/mux/core.c                                 |  30 +++++
 drivers/mux/mmio.c                                 |  12 ++
 drivers/pci/controller/cadence/pci-j721e.c         | 102 ++++++++++++++++-
 drivers/pci/controller/cadence/pcie-cadence-host.c |  44 +++++---
 drivers/pci/controller/cadence/pcie-cadence.h      |  12 ++
 drivers/phy/cadence/phy-cadence-torrent.c          | 121 +++++++++++++++------
 drivers/phy/ti/phy-j721e-wiz.c                     | 119 +++++++++++++-------
 drivers/pinctrl/pinctrl-single.c                   |  28 ++---
 include/linux/mux/driver.h                         |   1 +
 11 files changed, 380 insertions(+), 118 deletions(-)
---
base-commit: 00ff0f9ce40db8e64fe16c424a965fd7ab769c42
change-id: 20240102-j7200-pcie-s2r-ecb1a979e357

Best regards,

Comments

Andy Shevchenko Feb. 15, 2024, 3:29 p.m. UTC | #1
On Thu, Feb 15, 2024 at 04:17:50PM +0100, Thomas Richard wrote:
> The mux_chip_resume() function restores a mux_chip using the cached state
> of each mux.

...

> +int mux_chip_resume(struct mux_chip *mux_chip)
> +{
> +	int global_ret = 0;
> +	int ret, i;
> +
> +	for (i = 0; i < mux_chip->controllers; ++i) {
> +		struct mux_control *mux = &mux_chip->mux[i];
> +
> +		if (mux->cached_state == MUX_CACHE_UNKNOWN)
> +			continue;
> +
> +		ret = mux_control_set(mux, mux->cached_state);
> +		if (ret < 0) {
> +			dev_err(&mux_chip->dev, "unable to restore state\n");
> +			if (!global_ret)
> +				global_ret = ret;

Hmm... This will record the first error and continue.

> +		}
> +	}
> +	return global_ret;

So here, we actually will get stale data in case there are > 1 failures.

> +}
Philipp Stanner Feb. 15, 2024, 3:40 p.m. UTC | #2
On Thu, 2024-02-15 at 16:17 +0100, Thomas Richard wrote:
> This add suspend to ram support for the PCIe (RC mode) on J7200
> platform.
> 
> In RC mode, the reset pin for endpoints is managed by a gpio expander
> on a
> i2c bus. This pin shall be managed in suspend_noirq() and
> resume_noirq().
> The suspend/resume has been moved to suspend_noirq()/resume_noirq()
> for
> pca953x (expander) and pinctrl-single.
> 
> To do i2c accesses during suspend_noirq/resume_noirq, we need to
> force the
> wakeup of the i2c controller (which is autosuspended) during suspend
> callback. 
> It's the only way to wakeup the controller if it's autosuspended, as
> runtime pm is disabled in suspend_noirq and resume_noirq.
> 
> The main change in this v3 is the removal of the probe boolean for
> the
> functions wiz_clock_probe() and cdns_pcie_host_setup().
> Their contents were split in multiple functions which are called in
> the
> resume_noirq() callbacks.
> 
> Signed-off-by: Thomas Richard <thomas.richard@bootlin.com>
> ---
> Changes in v3:
> - pinctrl-single: split patch in two parts, a first patch to remove
> the
>   dead code, a second to move suspend()/resume() callbacks to noirq.
> - i2c-omap: add a comments above the suspend_noirq() callback.
> - mux: now mux_chip_resume() try to restores all muxes, then return 0
> if
>   all is ok or the first failure.
> - mmio: fix commit message.
> - phy-j721e-wiz: add a patch to use dev_err_probe() instead of
> dev_err() in
>   the wiz_clock_init() function.
> - phy-j721e-wiz: remove probe boolean for the wiz_clock_init(),
> rename the
>   function to wiz_clock_probe(), extract hardware configuration part
> in a
>   new function wiz_clock_init().
> - phy-cadence-torrent: use dev_err_probe() and fix commit messages
> - pcie-cadence-host: remove probe boolean for the
> cdns_pcie_host_setup()
>   function and extract the link setup part in a new function
>   cdns_pcie_host_link_setup().
> - pcie-cadence-host: make cdns_pcie_host_init() global.
> - pci-j721e: use the cdns_pcie_host_link_setup()
> cdns_pcie_host_init()
>   functions in the resume_noirq() callback.
> - Link to v2:
> https://lore.kernel.org/r/20240102-j7200-pcie-s2r-v2-0-8e4f7d228ec2@bootlin.com
> 
> Changes in v2:
> - all: fix commits messages.
> - all: use DEFINE_NOIRQ_DEV_PM_OPS and pm_sleep_ptr macros.
> - all: remove useless #ifdef CONFIG_PM.
> - pinctrl-single: drop dead code
> - mux: add mux_chip_resume() function in mux core.
> - mmio: resume sequence is now a call to mux_chip_resume().
> - phy-cadence-torrent: fix typo in resume sequence
> (reset_control_assert()
>   instead of reset_control_put()).
> - phy-cadence-torrent: use PHY instead of phy.
> - pci-j721e: do not shadow cdns_pcie_host_setup return code in resume
>   sequence.
> - pci-j721e: drop dead code.
> - Link to v1:
> https://lore.kernel.org/r/20240102-j7200-pcie-s2r-v1-0-84e55da52400@bootlin.com
> 
> ---
> Thomas Richard (15):
>       gpio: pca953x: move suspend()/resume() to
> suspend_noirq()/resume_noirq()
>       pinctrl: pinctrl-single: remove dead code in suspend() and
> resume() callbacks
>       pinctrl: pinctrl-single: move suspend()/resume() callbacks to
> noirq
>       i2c: omap: wakeup the controller during suspend() callback
>       mux: add mux_chip_resume() function
>       phy: ti: phy-j721e-wiz: use dev_err_probe() instead of
> dev_err()
>       phy: ti: phy-j721e-wiz: split wiz_clock_init() function
>       phy: ti: phy-j721e-wiz: add resume support
>       phy: cadence-torrent: extract calls to clk_get from
> cdns_torrent_clk
>       phy: cadence-torrent: register resets even if the phy is
> already configured
>       phy: cadence-torrent: add already_configured to struct
> cdns_torrent_phy
>       phy: cadence-torrent: remove noop_ops phy operations
>       phy: cadence-torrent: add suspend and resume support
>       PCI: cadence: extract link setup sequence from
> cdns_pcie_host_setup()
>       PCI: cadence: set cdns_pcie_host_init() global
> 
> Théo Lebrun (3):
>       mux: mmio: add resume support
>       PCI: j721e: add reset GPIO to struct j721e_pcie
>       PCI: j721e: add suspend and resume support


For the PCI patches Bjorn is most likely going to ask you to adjust
them to PCI's common commit style; see here [1]

In particular, PCI (afaik) has no convention for naming subcomponents
such as j721e and the info following the : is written beginning with an
upper case, e.g.

PCI: Add suspend and resume support for j721e


Regards,
P.

[1] https://lore.kernel.org/linux-pci/20171026223701.GA25649@bhelgaas-glaptop.roam.corp.google.com/


> 
>  drivers/gpio/gpio-pca953x.c                        |   7 +-
>  drivers/i2c/busses/i2c-omap.c                      |  22 ++++
>  drivers/mux/core.c                                 |  30 +++++
>  drivers/mux/mmio.c                                 |  12 ++
>  drivers/pci/controller/cadence/pci-j721e.c         | 102
> ++++++++++++++++-
>  drivers/pci/controller/cadence/pcie-cadence-host.c |  44 +++++---
>  drivers/pci/controller/cadence/pcie-cadence.h      |  12 ++
>  drivers/phy/cadence/phy-cadence-torrent.c          | 121
> +++++++++++++++------
>  drivers/phy/ti/phy-j721e-wiz.c                     | 119
> +++++++++++++-------
>  drivers/pinctrl/pinctrl-single.c                   |  28 ++---
>  include/linux/mux/driver.h                         |   1 +
>  11 files changed, 380 insertions(+), 118 deletions(-)
> ---
> base-commit: 00ff0f9ce40db8e64fe16c424a965fd7ab769c42
> change-id: 20240102-j7200-pcie-s2r-ecb1a979e357
> 
> Best regards,
Andy Shevchenko Feb. 15, 2024, 3:43 p.m. UTC | #3
On Thu, Feb 15, 2024 at 04:17:53PM +0100, Thomas Richard wrote:
> The wiz_clock_init() function mixes probe and hardware configuration.
> Rename the wiz_clock_init() to wiz_clock_probe() and move the hardware
> configuration part in a new function named wiz_clock_init().
> 
> This hardware configuration sequence must be called during the resume
> stage of the driver.

...

(Side note, as this can be done later)

>  	if (rate >= 100000000)

> +		if (rate >= 100000000)

> +	if (rate >= 100000000)

I would make local definition and use it, we may get the global one as there
are users.

#define HZ_PER_GHZ	1000000000UL
Andy Shevchenko Feb. 15, 2024, 3:45 p.m. UTC | #4
On Thu, Feb 15, 2024 at 04:17:58PM +0100, Thomas Richard wrote:
> Even if a PHY is already configured, the PHY operations are needed during
> resume stage, as the PHY is in reset state.
> The noop_ops PHY operations is removed to always have PHY operations.
> The already_configured flag is checked at the begening of init, configure
> and poweron operations to keep the already_configured behaviour.

...

> +	if (cdns_phy->already_configured) {
> +		/* Give 5ms to 10ms delay for the PIPE clock to be stable */
> +		usleep_range(5000, 10000);

fsleep() ?
(Yes, I see this is the original code, perhaps later in a separate change)

> +		return 0;
> +	}
Andy Shevchenko Feb. 15, 2024, 3:46 p.m. UTC | #5
On Thu, Feb 15, 2024 at 04:17:59PM +0100, Thomas Richard wrote:
> Add suspend and resume support.
> 
> The already_configured flag is cleared during the suspend stage to force
> the PHY initialization during the resume stage.

> Based on the work of Théo Lebrun <theo.lebrun@bootlin.com>

SoB/Co-developed-by ?

...

> +static int cdns_torrent_phy_suspend_noirq(struct device *dev)
> +{
> +	struct cdns_torrent_phy *cdns_phy = dev_get_drvdata(dev);
> +	int i;

Why signed?

> +	reset_control_assert(cdns_phy->phy_rst);
> +	reset_control_assert(cdns_phy->apb_rst);
> +	for (i = 0; i < cdns_phy->nsubnodes; i++)
> +		reset_control_assert(cdns_phy->phys[i].lnk_rst);
> +
> +	if (cdns_phy->already_configured)
> +		cdns_phy->already_configured = 0;
> +	else
> +		clk_disable_unprepare(cdns_phy->clk);
> +
> +	return 0;
> +}
> +
> +static int cdns_torrent_phy_resume_noirq(struct device *dev)
> +{
> +	struct cdns_torrent_phy *cdns_phy = dev_get_drvdata(dev);
> +	int node = cdns_phy->nsubnodes;
> +	int ret, i;

Ditto.

> +	ret = cdns_torrent_clk(cdns_phy);
> +	if (ret)
> +		goto clk_cleanup;
> +
> +	/* Enable APB */
> +	reset_control_deassert(cdns_phy->apb_rst);
> +
> +	if (cdns_phy->nsubnodes > 1) {
> +		ret = cdns_torrent_phy_configure_multilink(cdns_phy);
> +		if (ret)
> +			goto put_lnk_rst;
> +	}
> +
> +	return 0;
> +
> +put_lnk_rst:
> +	for (i = 0; i < node; i++)
> +		reset_control_assert(cdns_phy->phys[i].lnk_rst);
> +	reset_control_assert(cdns_phy->apb_rst);
> +	clk_disable_unprepare(cdns_phy->clk);
> +clk_cleanup:
> +	cdns_torrent_clk_cleanup(cdns_phy);
> +	return ret;
> +}
Andy Shevchenko Feb. 15, 2024, 4:04 p.m. UTC | #6
On Thu, Feb 15, 2024 at 04:18:02PM +0100, Thomas Richard wrote:
> From: Théo Lebrun <theo.lebrun@bootlin.com>
> 
> Add reset GPIO to struct j721e_pcie, so it can be used at suspend and
> resume stages.

...

>  	case PCI_MODE_RC:
> -		gpiod = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_LOW);
> -		if (IS_ERR(gpiod)) {
> -			ret = PTR_ERR(gpiod);
> +		pcie->reset_gpio = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_LOW);
> +		if (IS_ERR(pcie->reset_gpio)) {
> +			ret = PTR_ERR(pcie->reset_gpio);
>  			if (ret != -EPROBE_DEFER)
>  				dev_err(dev, "Failed to get reset GPIO\n");
>  			goto err_get_sync;
> @@ -504,9 +504,9 @@ static int j721e_pcie_probe(struct platform_device *pdev)
>  		 * mode is selected while enabling the PHY. So deassert PERST#
>  		 * after 100 us.
>  		 */
> -		if (gpiod) {
> +		if (pcie->reset_gpio) {
>  			usleep_range(100, 200);
> -			gpiod_set_value_cansleep(gpiod, 1);
> +			gpiod_set_value_cansleep(pcie->reset_gpio, 1);
>  		}

Instead of all this, just add one line assignment. Moreover, before or after
this patch refactor the code to use ret = dev_err_probe(...); pattern that
eliminates those deferral probe checks.
Bjorn Helgaas Feb. 15, 2024, 5:06 p.m. UTC | #7
On Thu, Feb 15, 2024 at 04:17:45PM +0100, Thomas Richard wrote:
> This add suspend to ram support for the PCIe (RC mode) on J7200 platform.

>       PCI: cadence: extract link setup sequence from cdns_pcie_host_setup()
>       PCI: cadence: set cdns_pcie_host_init() global
>       PCI: j721e: add reset GPIO to struct j721e_pcie
>       PCI: j721e: add suspend and resume support

The drivers/pci/ subject line pattern is:

  PCI: <driver>: <Capitalized verb>

e.g.,

  PCI: cadence: Extract link setup sequence from cdns_pcie_host_setup()
Vinod Koul Feb. 16, 2024, 6:02 a.m. UTC | #8
On 15-02-24, 17:43, Andy Shevchenko wrote:
> On Thu, Feb 15, 2024 at 04:17:53PM +0100, Thomas Richard wrote:
> > The wiz_clock_init() function mixes probe and hardware configuration.
> > Rename the wiz_clock_init() to wiz_clock_probe() and move the hardware
> > configuration part in a new function named wiz_clock_init().
> > 
> > This hardware configuration sequence must be called during the resume
> > stage of the driver.
> 
> ...
> 
> (Side note, as this can be done later)
> 
> >  	if (rate >= 100000000)
> 
> > +		if (rate >= 100000000)
> 
> > +	if (rate >= 100000000)
> 
> I would make local definition and use it, we may get the global one as there
> are users.
> 
> #define HZ_PER_GHZ	1000000000UL

Better to define as:
#define HZ_PER_GHZ 1 * GIGA
Thomas Richard Feb. 16, 2024, 7:36 a.m. UTC | #9
On 2/15/24 16:46, Andy Shevchenko wrote:
>> +static int cdns_torrent_phy_suspend_noirq(struct device *dev)
>> +{
>> +	struct cdns_torrent_phy *cdns_phy = dev_get_drvdata(dev);
>> +	int i;
> 
> Why signed?

In the for loop below, the i variable is compared to
cdns_phy->nsubnodes, which is an int.

https://elixir.bootlin.com/linux/latest/source/drivers/phy/cadence/phy-cadence-torrent.c#L360

> 
>> +	reset_control_assert(cdns_phy->phy_rst);
>> +	reset_control_assert(cdns_phy->apb_rst);
>> +	for (i = 0; i < cdns_phy->nsubnodes; i++)
>> +		reset_control_assert(cdns_phy->phys[i].lnk_rst);
>> +
>> +	if (cdns_phy->already_configured)
>> +		cdns_phy->already_configured = 0;
>> +	else
>> +		clk_disable_unprepare(cdns_phy->clk);
>> +
>> +	return 0;
>> +}
>> +
>> +static int cdns_torrent_phy_resume_noirq(struct device *dev)
>> +{
>> +	struct cdns_torrent_phy *cdns_phy = dev_get_drvdata(dev);
>> +	int node = cdns_phy->nsubnodes;
>> +	int ret, i;
> 
> Ditto>

Same reason

>> +	ret = cdns_torrent_clk(cdns_phy);
>> +	if (ret)
>> +		goto clk_cleanup;
>> +
>> +	/* Enable APB */
>> +	reset_control_deassert(cdns_phy->apb_rst);
>> +
>> +	if (cdns_phy->nsubnodes > 1) {
>> +		ret = cdns_torrent_phy_configure_multilink(cdns_phy);
>> +		if (ret)
>> +			goto put_lnk_rst;
>> +	}
>> +
>> +	return 0;
>> +
>> +put_lnk_rst:
>> +	for (i = 0; i < node; i++)
>> +		reset_control_assert(cdns_phy->phys[i].lnk_rst);
>> +	reset_control_assert(cdns_phy->apb_rst);
>> +	clk_disable_unprepare(cdns_phy->clk);
>> +clk_cleanup:
>> +	cdns_torrent_clk_cleanup(cdns_phy);
>> +	return ret;
>> +}
>
Thomas Richard Feb. 16, 2024, 7:52 a.m. UTC | #10
On 2/15/24 16:29, Andy Shevchenko wrote:
> On Thu, Feb 15, 2024 at 04:17:50PM +0100, Thomas Richard wrote:
>> The mux_chip_resume() function restores a mux_chip using the cached state
>> of each mux.
> 
> ...
> 
>> +int mux_chip_resume(struct mux_chip *mux_chip)
>> +{
>> +	int global_ret = 0;
>> +	int ret, i;
>> +
>> +	for (i = 0; i < mux_chip->controllers; ++i) {
>> +		struct mux_control *mux = &mux_chip->mux[i];
>> +
>> +		if (mux->cached_state == MUX_CACHE_UNKNOWN)
>> +			continue;
>> +
>> +		ret = mux_control_set(mux, mux->cached_state);
>> +		if (ret < 0) {
>> +			dev_err(&mux_chip->dev, "unable to restore state\n");
>> +			if (!global_ret)
>> +				global_ret = ret;
> 
> Hmm... This will record the first error and continue.

In the v2 we talked about this with Peter Rosin.

In fact, in the v1 (mux_chip_resume() didn't exists yet, everything was
done in the mmio driver) I had the same behavior: try to restore all
muxes and in case of error restore the first one.

I don't know what is the right solution. I just restored the behavior I
had in v1.

> 
>> +		}
>> +	}
>> +	return global_ret;
> 
> So here, we actually will get stale data in case there are > 1 failures.

Yes, indeed. But we will have an error message for each failure.

> 
>> +}
>
Siddharth Vadapalli Feb. 16, 2024, 9:04 a.m. UTC | #11
On 24/02/16 11:32AM, Vinod Koul wrote:
> On 15-02-24, 17:43, Andy Shevchenko wrote:
> > On Thu, Feb 15, 2024 at 04:17:53PM +0100, Thomas Richard wrote:
> > > The wiz_clock_init() function mixes probe and hardware configuration.
> > > Rename the wiz_clock_init() to wiz_clock_probe() and move the hardware
> > > configuration part in a new function named wiz_clock_init().
> > > 
> > > This hardware configuration sequence must be called during the resume
> > > stage of the driver.
> > 
> > ...
> > 
> > (Side note, as this can be done later)
> > 
> > >  	if (rate >= 100000000)
> > 
> > > +		if (rate >= 100000000)
> > 
> > > +	if (rate >= 100000000)
> > 
> > I would make local definition and use it, we may get the global one as there
> > are users.
> > 
> > #define HZ_PER_GHZ	1000000000UL
> 
> Better to define as:
> #define HZ_PER_GHZ 1 * GIGA

The variable "rate" is being compared against 100 MHz and not 1 GHz.
The driver already has the following macros defined:
#define REF_CLK_19_2MHZ         19200000
#define REF_CLK_25MHZ           25000000
#define REF_CLK_100MHZ          100000000
#define REF_CLK_156_25MHZ       156250000

So would it be acceptable to change it to:
	if (rate >= REF_CLK_100MHZ)
instead?

Regards,
Siddharth.
Andy Shevchenko Feb. 16, 2024, 2:56 p.m. UTC | #12
On Fri, Feb 16, 2024 at 11:32:31AM +0530, Vinod Koul wrote:
> On 15-02-24, 17:43, Andy Shevchenko wrote:
> > On Thu, Feb 15, 2024 at 04:17:53PM +0100, Thomas Richard wrote:

...

> > (Side note, as this can be done later)
> > 
> > >  	if (rate >= 100000000)
> > 
> > > +		if (rate >= 100000000)
> > 
> > > +	if (rate >= 100000000)
> > 
> > I would make local definition and use it, we may get the global one as there
> > are users.
> > 
> > #define HZ_PER_GHZ	1000000000UL
> 
> Better to define as:
> #define HZ_PER_GHZ 1 * GIGA

(with parentheses)

Maybe here, but when it appears in units.h it will be defined as I wrote
to be aligned with the rest of definitions.
Andy Shevchenko Feb. 16, 2024, 3:03 p.m. UTC | #13
On Fri, Feb 16, 2024 at 02:34:39PM +0530, Siddharth Vadapalli wrote:
> On 24/02/16 11:32AM, Vinod Koul wrote:
> > On 15-02-24, 17:43, Andy Shevchenko wrote:
> > > On Thu, Feb 15, 2024 at 04:17:53PM +0100, Thomas Richard wrote:

...

> > > (Side note, as this can be done later)
> > > 
> > > >  	if (rate >= 100000000)
> > > 
> > > > +		if (rate >= 100000000)
> > > 
> > > > +	if (rate >= 100000000)
> > > 
> > > I would make local definition and use it, we may get the global one as there
> > > are users.
> > > 
> > > #define HZ_PER_GHZ	1000000000UL
> > 
> > Better to define as:
> > #define HZ_PER_GHZ 1 * GIGA
> 
> The variable "rate" is being compared against 100 MHz and not 1 GHz.

Extremely good point why constant definitions are better (to avoid missing
or extra 0, etc)!

> The driver already has the following macros defined:
> #define REF_CLK_19_2MHZ         19200000
> #define REF_CLK_25MHZ           25000000
> #define REF_CLK_100MHZ          100000000
> #define REF_CLK_156_25MHZ       156250000
> 
> So would it be acceptable to change it to:
> 	if (rate >= REF_CLK_100MHZ)
> instead?

Sounds like a good idea to me.
Andy Shevchenko Feb. 16, 2024, 3:07 p.m. UTC | #14
On Fri, Feb 16, 2024 at 08:52:17AM +0100, Thomas Richard wrote:
> On 2/15/24 16:29, Andy Shevchenko wrote:
> > On Thu, Feb 15, 2024 at 04:17:50PM +0100, Thomas Richard wrote:

...

> >> +int mux_chip_resume(struct mux_chip *mux_chip)
> >> +{
> >> +	int global_ret = 0;
> >> +	int ret, i;
> >> +
> >> +	for (i = 0; i < mux_chip->controllers; ++i) {
> >> +		struct mux_control *mux = &mux_chip->mux[i];
> >> +
> >> +		if (mux->cached_state == MUX_CACHE_UNKNOWN)
> >> +			continue;
> >> +
> >> +		ret = mux_control_set(mux, mux->cached_state);
> >> +		if (ret < 0) {
> >> +			dev_err(&mux_chip->dev, "unable to restore state\n");
> >> +			if (!global_ret)
> >> +				global_ret = ret;
> > 
> > Hmm... This will record the first error and continue.
> 
> In the v2 we talked about this with Peter Rosin.
> 
> In fact, in the v1 (mux_chip_resume() didn't exists yet, everything was
> done in the mmio driver) I had the same behavior: try to restore all
> muxes and in case of error restore the first one.
> 
> I don't know what is the right solution. I just restored the behavior I
> had in v1.

Okay, I believe you know what you are doing, folks. But to me this approach
sounds at bare minimum "unusual". Because the failures here are not fatal
and recording the first one may or may not make sense and it's so fragile
as it completely implementation-dependent.

> >> +		}
> >> +	}
> >> +	return global_ret;
> > 
> > So here, we actually will get stale data in case there are > 1 failures.
> 
> Yes, indeed. But we will have an error message for each failure.

Which is also problematic. PM calls may easily spam the logs and outshadow
really important messages (like oopses).
Thomas Richard Feb. 21, 2024, 10:59 a.m. UTC | #15
On 2/16/24 16:07, Andy Shevchenko wrote:
> On Fri, Feb 16, 2024 at 08:52:17AM +0100, Thomas Richard wrote:
>> On 2/15/24 16:29, Andy Shevchenko wrote:
>>> On Thu, Feb 15, 2024 at 04:17:50PM +0100, Thomas Richard wrote:
> 
> ...
> 
>>>> +int mux_chip_resume(struct mux_chip *mux_chip)
>>>> +{
>>>> +	int global_ret = 0;
>>>> +	int ret, i;
>>>> +
>>>> +	for (i = 0; i < mux_chip->controllers; ++i) {
>>>> +		struct mux_control *mux = &mux_chip->mux[i];
>>>> +
>>>> +		if (mux->cached_state == MUX_CACHE_UNKNOWN)
>>>> +			continue;
>>>> +
>>>> +		ret = mux_control_set(mux, mux->cached_state);
>>>> +		if (ret < 0) {
>>>> +			dev_err(&mux_chip->dev, "unable to restore state\n");
>>>> +			if (!global_ret)
>>>> +				global_ret = ret;
>>>
>>> Hmm... This will record the first error and continue.
>>
>> In the v2 we talked about this with Peter Rosin.
>>
>> In fact, in the v1 (mux_chip_resume() didn't exists yet, everything was
>> done in the mmio driver) I had the same behavior: try to restore all
>> muxes and in case of error restore the first one.
>>
>> I don't know what is the right solution. I just restored the behavior I
>> had in v1.
> 
> Okay, I believe you know what you are doing, folks. But to me this approach
> sounds at bare minimum "unusual". Because the failures here are not fatal
> and recording the first one may or may not make sense and it's so fragile
> as it completely implementation-dependent.

I guess if there is an error, the resume is completely dead so no need
to continue.
If it's okay for Peter I can return on first failure.

Regards,
Thomas Richard Feb. 26, 2024, 5:05 p.m. UTC | #16
On 2/15/24 17:04, Andy Shevchenko wrote:
> On Thu, Feb 15, 2024 at 04:18:02PM +0100, Thomas Richard wrote:
>> From: Théo Lebrun <theo.lebrun@bootlin.com>
>>
>> Add reset GPIO to struct j721e_pcie, so it can be used at suspend and
>> resume stages.
> 
> ...
> 
>>  	case PCI_MODE_RC:
>> -		gpiod = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_LOW);
>> -		if (IS_ERR(gpiod)) {
>> -			ret = PTR_ERR(gpiod);
>> +		pcie->reset_gpio = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_LOW);
>> +		if (IS_ERR(pcie->reset_gpio)) {
>> +			ret = PTR_ERR(pcie->reset_gpio);
>>  			if (ret != -EPROBE_DEFER)
>>  				dev_err(dev, "Failed to get reset GPIO\n");
>>  			goto err_get_sync;
>> @@ -504,9 +504,9 @@ static int j721e_pcie_probe(struct platform_device *pdev)
>>  		 * mode is selected while enabling the PHY. So deassert PERST#
>>  		 * after 100 us.
>>  		 */
>> -		if (gpiod) {
>> +		if (pcie->reset_gpio) {
>>  			usleep_range(100, 200);
>> -			gpiod_set_value_cansleep(gpiod, 1);
>> +			gpiod_set_value_cansleep(pcie->reset_gpio, 1);
>>  		}
> 
> Instead of all this, just add one line assignment. Moreover, before or after
> this patch refactor the code to use ret = dev_err_probe(...); pattern that
> eliminates those deferral probe checks.

Hi Andy,

I'm not sure what you exactly want when you write "just add one line
assignment".
For the dev_err_probe() it's okay, it will be fixed in the next iteration.

Regards,
Andy Shevchenko Feb. 26, 2024, 5:54 p.m. UTC | #17
On Mon, Feb 26, 2024 at 06:05:16PM +0100, Thomas Richard wrote:
> On 2/15/24 17:04, Andy Shevchenko wrote:
> > On Thu, Feb 15, 2024 at 04:18:02PM +0100, Thomas Richard wrote:
> >> From: Théo Lebrun <theo.lebrun@bootlin.com>

...

> >>  	case PCI_MODE_RC:
> >> -		gpiod = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_LOW);
> >> -		if (IS_ERR(gpiod)) {
> >> -			ret = PTR_ERR(gpiod);
> >> +		pcie->reset_gpio = devm_gpiod_get_optional(dev, "reset", GPIOD_OUT_LOW);
> >> +		if (IS_ERR(pcie->reset_gpio)) {
> >> +			ret = PTR_ERR(pcie->reset_gpio);
> >>  			if (ret != -EPROBE_DEFER)
> >>  				dev_err(dev, "Failed to get reset GPIO\n");
> >>  			goto err_get_sync;
> >> @@ -504,9 +504,9 @@ static int j721e_pcie_probe(struct platform_device *pdev)
> >>  		 * mode is selected while enabling the PHY. So deassert PERST#
> >>  		 * after 100 us.
> >>  		 */
> >> -		if (gpiod) {
> >> +		if (pcie->reset_gpio) {
> >>  			usleep_range(100, 200);
> >> -			gpiod_set_value_cansleep(gpiod, 1);
> >> +			gpiod_set_value_cansleep(pcie->reset_gpio, 1);
> >>  		}
> > 
> > Instead of all this, just add one line assignment. Moreover, before or after
> > this patch refactor the code to use ret = dev_err_probe(...); pattern that
> > eliminates those deferral probe checks.
> 
> Hi Andy,
> 
> I'm not sure what you exactly want when you write "just add one line
> assignment".

		pcie->reset_gpio = gpiod;

That's it. No additional code changes are needed (WRT the above context,
of course you want to have a new structure member).