diff mbox series

[2/5] ASoC: Intel: bdw-rt5677: fix module load/unload issues

Message ID 20200622154241.29053-3-pierre-louis.bossart@linux.intel.com (mailing list archive)
State Accepted
Commit bcb43fdae1c0d08b772b792cf46f323ad0d17968
Headers show
Series ASoC: add dailink .exit() callback | expand

Commit Message

Pierre-Louis Bossart June 22, 2020, 3:42 p.m. UTC
The mainline code currently prevents modules from being removed.

The BE dailink .init() function calls devm_gpiod_get() using the codec
component device as argument. When the machine driver is removed, the
references to the gpiod are not released, and it's not possible to
remove the codec driver module - which is the only entity which could
free the gpiod.

This conceptual deadlock can be avoided by invoking gpiod_get() in the
.init() callback, and calling gpiod_put() in the exit() callback.

Tested on SAMUS Chromebook with SOF driver.

Suggested-by: Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Reviewed-by: Guennadi Liakhovetski <guennadi.liakhovetski@linux.intel.com>
Reviewed-by: Curtis Malainey <curtis@malainey.com>
Signed-off-by: Pierre-Louis Bossart <pierre-louis.bossart@linux.intel.com>
---
 sound/soc/intel/boards/bdw-rt5677.c | 18 ++++++++++++++++--
 1 file changed, 16 insertions(+), 2 deletions(-)

Comments

Andy Shevchenko June 22, 2020, 6:18 p.m. UTC | #1
On Mon, Jun 22, 2020 at 10:42:38AM -0500, Pierre-Louis Bossart wrote:
> The mainline code currently prevents modules from being removed.
> 
> The BE dailink .init() function calls devm_gpiod_get() using the codec
> component device as argument. When the machine driver is removed, the
> references to the gpiod are not released, and it's not possible to
> remove the codec driver module - which is the only entity which could
> free the gpiod.
> 
> This conceptual deadlock can be avoided by invoking gpiod_get() in the
> .init() callback, and calling gpiod_put() in the exit() callback.
> 
> Tested on SAMUS Chromebook with SOF driver.

> +static void bdw_rt5677_exit(struct snd_soc_pcm_runtime *rtd)
> +{
> +	struct bdw_rt5677_priv *bdw_rt5677 =
> +			snd_soc_card_get_drvdata(rtd->card);
> +
> +	/*
> +	 * The .exit() can be reached without going through the .init()
> +	 * so explicitly test if the gpiod is valid
> +	 */

> +	if (!IS_ERR_OR_NULL(bdw_rt5677->gpio_hp_en))

_OR_NULL is redundant. gpiod_put() is explicitly NULL-aware.

IS_ERR() I suppose can be avoided by using gpiod_get_optional(), if it suits the case.
Otherwise it would be questionable why we got error pointer value on ->exit().

> +		gpiod_put(bdw_rt5677->gpio_hp_en);
> +}
Pierre-Louis Bossart June 22, 2020, 6:23 p.m. UTC | #2
On 6/22/20 1:18 PM, Andy Shevchenko wrote:
> On Mon, Jun 22, 2020 at 10:42:38AM -0500, Pierre-Louis Bossart wrote:
>> The mainline code currently prevents modules from being removed.
>>
>> The BE dailink .init() function calls devm_gpiod_get() using the codec
>> component device as argument. When the machine driver is removed, the
>> references to the gpiod are not released, and it's not possible to
>> remove the codec driver module - which is the only entity which could
>> free the gpiod.
>>
>> This conceptual deadlock can be avoided by invoking gpiod_get() in the
>> .init() callback, and calling gpiod_put() in the exit() callback.
>>
>> Tested on SAMUS Chromebook with SOF driver.
> 
>> +static void bdw_rt5677_exit(struct snd_soc_pcm_runtime *rtd)
>> +{
>> +	struct bdw_rt5677_priv *bdw_rt5677 =
>> +			snd_soc_card_get_drvdata(rtd->card);
>> +
>> +	/*
>> +	 * The .exit() can be reached without going through the .init()
>> +	 * so explicitly test if the gpiod is valid
>> +	 */
> 
>> +	if (!IS_ERR_OR_NULL(bdw_rt5677->gpio_hp_en))
> 
> _OR_NULL is redundant. gpiod_put() is explicitly NULL-aware.
> 
> IS_ERR() I suppose can be avoided by using gpiod_get_optional(), if it suits the case.
> Otherwise it would be questionable why we got error pointer value on ->exit().

As I explained in the cover letter we can reach this function even if 
the init was not called or was not successful. There are tons of cases 
which reach the same probe_end label in the ASoC core.

So I explicitly added a test for all possible cases. I don't mind 
removing the _OR_NULL if indeed it's safe, but showing this redundancy 
also shows an intent to deal with such cases.

> 
>> +		gpiod_put(bdw_rt5677->gpio_hp_en);
>> +}
>
Mark Brown June 22, 2020, 6:26 p.m. UTC | #3
On Mon, Jun 22, 2020 at 01:23:04PM -0500, Pierre-Louis Bossart wrote:

> So I explicitly added a test for all possible cases. I don't mind removing
> the _OR_NULL if indeed it's safe, but showing this redundancy also shows an
> intent to deal with such cases.

Yeah, I think that's fine - it's perhaps redundant but we're not in a
hot path and as well as the intentionality it saves the reader from
having to know if gpiod_put() copes with NULL or not.
Andy Shevchenko June 23, 2020, 8:40 a.m. UTC | #4
On Mon, Jun 22, 2020 at 07:26:51PM +0100, Mark Brown wrote:
> On Mon, Jun 22, 2020 at 01:23:04PM -0500, Pierre-Louis Bossart wrote:
> 
> > So I explicitly added a test for all possible cases. I don't mind removing
> > the _OR_NULL if indeed it's safe, but showing this redundancy also shows an
> > intent to deal with such cases.
> 
> Yeah, I think that's fine - it's perhaps redundant but we're not in a
> hot path and as well as the intentionality it saves the reader from
> having to know if gpiod_put() copes with NULL or not.

What the point?
We can document this instead of being a dead code, right?

IS_ERR() may happen there only if we assign such pointer to be error. Is it possible case here?
Andy Shevchenko June 23, 2020, 8:43 a.m. UTC | #5
On Mon, Jun 22, 2020 at 01:23:04PM -0500, Pierre-Louis Bossart wrote:
> On 6/22/20 1:18 PM, Andy Shevchenko wrote:
> > On Mon, Jun 22, 2020 at 10:42:38AM -0500, Pierre-Louis Bossart wrote:
> > > The mainline code currently prevents modules from being removed.
> > > 
> > > The BE dailink .init() function calls devm_gpiod_get() using the codec
> > > component device as argument. When the machine driver is removed, the
> > > references to the gpiod are not released, and it's not possible to
> > > remove the codec driver module - which is the only entity which could
> > > free the gpiod.
> > > 
> > > This conceptual deadlock can be avoided by invoking gpiod_get() in the
> > > .init() callback, and calling gpiod_put() in the exit() callback.
> > > 
> > > Tested on SAMUS Chromebook with SOF driver.
> > 
> > > +static void bdw_rt5677_exit(struct snd_soc_pcm_runtime *rtd)
> > > +{
> > > +	struct bdw_rt5677_priv *bdw_rt5677 =
> > > +			snd_soc_card_get_drvdata(rtd->card);
> > > +
> > > +	/*
> > > +	 * The .exit() can be reached without going through the .init()
> > > +	 * so explicitly test if the gpiod is valid
> > > +	 */
> > 
> > > +	if (!IS_ERR_OR_NULL(bdw_rt5677->gpio_hp_en))
> > 
> > _OR_NULL is redundant. gpiod_put() is explicitly NULL-aware.
> > 
> > IS_ERR() I suppose can be avoided by using gpiod_get_optional(), if it suits the case.
> > Otherwise it would be questionable why we got error pointer value on ->exit().
> 
> As I explained in the cover letter we can reach this function even if the
> init was not called or was not successful. There are tons of cases which
> reach the same probe_end label in the ASoC core.


> So I explicitly added a test for all possible cases. I don't mind removing
> the _OR_NULL if indeed it's safe, but showing this redundancy also shows an
> intent to deal with such cases.

Thanks for an explanation. I think it's an established practice to have
NULL-aware resource release-like functions.

Do you put always something like below in your code? No.

	if (foo)
		kfree(foo);
Mark Brown June 23, 2020, 9:37 a.m. UTC | #6
On Tue, Jun 23, 2020 at 11:40:11AM +0300, Andy Shevchenko wrote:
> On Mon, Jun 22, 2020 at 07:26:51PM +0100, Mark Brown wrote:
> > On Mon, Jun 22, 2020 at 01:23:04PM -0500, Pierre-Louis Bossart wrote:

> > > So I explicitly added a test for all possible cases. I don't mind removing
> > > the _OR_NULL if indeed it's safe, but showing this redundancy also shows an
> > > intent to deal with such cases.

> > Yeah, I think that's fine - it's perhaps redundant but we're not in a
> > hot path and as well as the intentionality it saves the reader from
> > having to know if gpiod_put() copes with NULL or not.

> What the point?
> We can document this instead of being a dead code, right?

To a certain extent the _OR_NULL is documentation - the effect is the
same with or without it.

> IS_ERR() may happen there only if we assign such pointer to be error. Is it possible case here?

That one is a bit more real.
Cezary Rojewski June 24, 2020, 7:14 p.m. UTC | #7
On 2020-06-22 5:42 PM, Pierre-Louis Bossart wrote:
> The mainline code currently prevents modules from being removed.
> 
> The BE dailink .init() function calls devm_gpiod_get() using the codec
> component device as argument. When the machine driver is removed, the
> references to the gpiod are not released, and it's not possible to
> remove the codec driver module - which is the only entity which could
> free the gpiod.
> 
> This conceptual deadlock can be avoided by invoking gpiod_get() in the
> .init() callback, and calling gpiod_put() in the exit() callback.
> 
> Tested on SAMUS Chromebook with SOF driver.
> 

As /intel/haswell is the go-to driver for BDW platforms, please test and 
confirm with legacy driver first. SOF is optional and thus non-blocking.

Czarek
Pierre-Louis Bossart June 24, 2020, 8:06 p.m. UTC | #8
On 6/24/20 2:14 PM, Cezary Rojewski wrote:
> On 2020-06-22 5:42 PM, Pierre-Louis Bossart wrote:
>> The mainline code currently prevents modules from being removed.
>>
>> The BE dailink .init() function calls devm_gpiod_get() using the codec
>> component device as argument. When the machine driver is removed, the
>> references to the gpiod are not released, and it's not possible to
>> remove the codec driver module - which is the only entity which could
>> free the gpiod.
>>
>> This conceptual deadlock can be avoided by invoking gpiod_get() in the
>> .init() callback, and calling gpiod_put() in the exit() callback.
>>
>> Tested on SAMUS Chromebook with SOF driver.
>>
> 
> As /intel/haswell is the go-to driver for BDW platforms, please test and 
> confirm with legacy driver first. SOF is optional and thus non-blocking.

I'll retest when you've fixed the go-to legacy driver, I am not even 
going to try module load/unload tests when the platform code has known 
issues requiring reverts.
Cezary Rojewski June 24, 2020, 8:26 p.m. UTC | #9
On 2020-06-24 10:06 PM, Pierre-Louis Bossart wrote:
> On 6/24/20 2:14 PM, Cezary Rojewski wrote:
>> On 2020-06-22 5:42 PM, Pierre-Louis Bossart wrote:
>>> The mainline code currently prevents modules from being removed.
>>>
>>> The BE dailink .init() function calls devm_gpiod_get() using the codec
>>> component device as argument. When the machine driver is removed, the
>>> references to the gpiod are not released, and it's not possible to
>>> remove the codec driver module - which is the only entity which could
>>> free the gpiod.
>>>
>>> This conceptual deadlock can be avoided by invoking gpiod_get() in the
>>> .init() callback, and calling gpiod_put() in the exit() callback.
>>>
>>> Tested on SAMUS Chromebook with SOF driver.
>>>
>>
>> As /intel/haswell is the go-to driver for BDW platforms, please test 
>> and confirm with legacy driver first. SOF is optional and thus 
>> non-blocking.
> 
> I'll retest when you've fixed the go-to legacy driver, I am not even 
> going to try module load/unload tests when the platform code has known 
> issues requiring reverts.

??

Judging by recent comments from the revert thread, you already had build 
with patch-reverted ready. Shouldn't be a problem to retest with that one.

Power transition update is unavoidable if that's what you're asking. 
Without it, hardware *may* achieve a power state, but that's certainly 
not a D3 one.

Czarek
diff mbox series

Patch

diff --git a/sound/soc/intel/boards/bdw-rt5677.c b/sound/soc/intel/boards/bdw-rt5677.c
index 34a3abb5991f..c9da91147770 100644
--- a/sound/soc/intel/boards/bdw-rt5677.c
+++ b/sound/soc/intel/boards/bdw-rt5677.c
@@ -272,8 +272,8 @@  static int bdw_rt5677_init(struct snd_soc_pcm_runtime *rtd)
 			RT5677_CLK_SEL_SYS2);
 
 	/* Request rt5677 GPIO for headphone amp control */
-	bdw_rt5677->gpio_hp_en = devm_gpiod_get(component->dev, "headphone-enable",
-						GPIOD_OUT_LOW);
+	bdw_rt5677->gpio_hp_en = gpiod_get(component->dev, "headphone-enable",
+					   GPIOD_OUT_LOW);
 	if (IS_ERR(bdw_rt5677->gpio_hp_en)) {
 		dev_err(component->dev, "Can't find HP_AMP_SHDN_L gpio\n");
 		return PTR_ERR(bdw_rt5677->gpio_hp_en);
@@ -307,6 +307,19 @@  static int bdw_rt5677_init(struct snd_soc_pcm_runtime *rtd)
 	return 0;
 }
 
+static void bdw_rt5677_exit(struct snd_soc_pcm_runtime *rtd)
+{
+	struct bdw_rt5677_priv *bdw_rt5677 =
+			snd_soc_card_get_drvdata(rtd->card);
+
+	/*
+	 * The .exit() can be reached without going through the .init()
+	 * so explicitly test if the gpiod is valid
+	 */
+	if (!IS_ERR_OR_NULL(bdw_rt5677->gpio_hp_en))
+		gpiod_put(bdw_rt5677->gpio_hp_en);
+}
+
 /* broadwell digital audio interface glue - connects codec <--> CPU */
 SND_SOC_DAILINK_DEF(dummy,
 	DAILINK_COMP_ARRAY(COMP_DUMMY()));
@@ -372,6 +385,7 @@  static struct snd_soc_dai_link bdw_rt5677_dais[] = {
 		.dpcm_playback = 1,
 		.dpcm_capture = 1,
 		.init = bdw_rt5677_init,
+		.exit = bdw_rt5677_exit,
 #if !IS_ENABLED(CONFIG_SND_SOC_SOF_BROADWELL)
 		SND_SOC_DAILINK_REG(dummy, be, dummy),
 #else